class: center, middle, inverse, title-slide

# Instrumentos de Análisis Urbanos II
## Maestría en Economía Urbana
### Universidad Torcuato Di Tella
### 08/08/2023

---
layout: true

<div class="my-footer"><span>Instrumentos de Análisis Urbanos II - <a href="https://tuqmano.github.io/geo_utdt/">https://tuqmano.github.io/geo_utdt/</a></span></div>

---
class: middle, center, inverse

# Data Viz I
## The _Grammar of Graphics_

---
class: inverse, middle

# Grammar of Graphics

>**A formal set of rules for producing statistical graphics**

--

> **It is built on the definition of layers**

--

**- _Leland Wilkinson_**

--

* Statistics and Computer Science **+**

--

* Viz expert (_SPSS, Tableau_) **+**

--

**->** [_**Grammar of Graphics**_ (1999)](https://www.springer.com/gp/book/9780387245447)

---

## References (I)

Three relevant books:

- [`ggplot2`: _**Elegant Graphics for Data Analysis**_](https://ggplot2-book.org/) (**H. Wickham**) is the main reference, with definitions of how the _grammar of graphics_ is applied in `R`.

--

- [_**Data Visualization: A Practical Introduction**_](https://socviz.co/index.html#preface) (**K. Healy**) discusses principles of data visualization and gives practical advice on applying them, with `R` code to reproduce the examples.
--

- [_**Fundamentals of Data Visualization**_](https://clauswilke.com/dataviz/) (**Claus Wilke**) presents arguments and advice for producing professional visualizations that represent the data correctly.

---

## References (II)

1. _Visualización de Datos (Intro)_, in "[R para Ciencia de Datos](https://es.r4ds.hadley.nz/visualizaci%C3%B3n-de-datos.html#introducci%C3%B3n-1)" (Wickham and Grolemund).

2. [_ModernDive_](https://moderndive.com/2-viz.html)

3. [(a) Urdinez and Cruz](https://arcruz0.github.io/libroadp/dataviz.html); [(b) Montané](https://martinmontane.github.io/CienciaDeDatosBook/visualizaciones-de-datos-en-r.html); and [(c) Vázquez Brust](https://bitsandbricks.github.io/ciencia_de_datos_gente_sociable/visualizaci%C3%B3n.html).
---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 15%
background-size: 10%

# Data Viz
## The **g**rammar of **g**raphics in `R`
#### _Drawing in layers_ with [`{ggplot2}`](https://ggplot2.tidyverse.org/index.html)

<img src="https://github.com/TuQmano/geo_utdt2022/blob/main/fig/ggplot_layers.png?raw=true" width="35%" />

--

**Video:** [_Plotting Anything with `ggplot2`_](https://www.youtube.com/watch?v=h29g21z0a68) - Thomas Lin Pedersen.

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 15%
background-size: 10%
class: middle, inverse

# Data Viz
## BASIC RECIPE

```r
ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
```

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 10%
background-size: 10%

# Data Viz
### `millas` from the [`{datos}` package](https://cienciadedatos.github.io/datos/)

```r
library(tidyverse)
library(datos)

dplyr::as_tibble(millas)
## # A tibble: 234 x 11
##    fabricante modelo     cilindrada  anio cilindros transmision traccion ciudad
##    <chr>      <chr>           <dbl> <int>     <int> <chr>       <chr>     <int>
##  1 audi       a4                1.8  1999         4 auto(l5)    d            18
##  2 audi       a4                1.8  1999         4 manual(m5)  d            21
##  3 audi       a4                2    2008         4 manual(m6)  d            20
##  4 audi       a4                2    2008         4 auto(av)    d            21
##  5 audi       a4                2.8  1999         6 auto(l5)    d            16
##  6 audi       a4                2.8  1999         6 manual(m5)  d            18
##  7 audi       a4                3.1  2008         6 auto(av)    d            18
##  8 audi       a4 quattro        1.8  1999         4 manual(m5)  4            18
##  9 audi       a4 quattro        1.8  1999         4 auto(l5)    4            16
## 10 audi       a4 quattro        2    2008         4 manual(m6)  4            20
## # i 224 more rows
## # i 3 more variables: autopista <int>, combustible <chr>, clase <chr>
```

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 15%
background-size: 10%

# Data Viz: _ggplot2_
### Basic recipe

```r
ggplot(data = millas) + # DATA
  geom_point(mapping = aes(x = cilindrada, y = autopista)) # AESTHETICS
```

<img src="sesion7_files/figure-html/millas-1.png" width="40%" />

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 15%
background-size: 10%

# _Drawing in layers_
### mapping and aesthetics

<img src="https://github.com/TuQmano/geo_utdt2022/blob/main/fig/ggplot_layers.png?raw=true" width="35%" />

---

## An extra aesthetic layer with **_aes()_**

```r
ggplot(data = millas) +
  # 'Aesthetics'
  geom_point(mapping = aes(x = cilindrada, y = autopista,
*                          colour = clase))
```

<img src="sesion7_files/figure-html/unnamed-chunk-5-1.png" width="50%" />

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 5%
background-size: 10%

## _mapping_ VS _setting_

```r
*# What happens here?
ggplot(data = millas) +
  geom_point(mapping = aes(x = cilindrada, y = autopista,
*                          colour = "blue"))
```

<img src="sesion7_files/figure-html/unnamed-chunk-6-1.png" width="40%" />

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 1%
background-size: 10%

## Data Viz: _ggplot2_

```r
ggplot(data = millas) +
  geom_point(mapping = aes(x = cilindrada, y = autopista),
*            colour = "blue")
```

<img src="sesion7_files/figure-html/unnamed-chunk-7-1.png" width="40%" />

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 1%
background-size: 10%

## Data Viz: _ggplot2_

Each **_geometry_ has specific parameters** that can be adjusted within the _aesthetic_ layer. **Colour** is one of them.
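
The two preceding plots differ only in where `colour` is written. A minimal sketch of that contrast, assuming the `datos` package is installed; the object names `p_mapped` and `p_set` are illustrative and do not come from the course material:

```r
library(ggplot2)
library(datos)  # provides the `millas` data set used throughout the deck

# Mapping: inside aes(), "blue" is read as data (a constant with a single
# value), so ggplot2 takes the colour from its default scale and adds a legend.
p_mapped <- ggplot(data = millas) +
  geom_point(mapping = aes(x = cilindrada, y = autopista, colour = "blue"))

# Setting: outside aes(), "blue" is a literal colour applied to every point,
# and no legend is drawn.
p_set <- ggplot(data = millas) +
  geom_point(mapping = aes(x = cilindrada, y = autopista), colour = "blue")

p_mapped
p_set
```

As a rule of thumb, anything that should vary with a column goes inside `aes()`, and anything fixed for the whole layer goes outside it.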
--

With `geom_point` we can, for example, assign a particular shape according to the values of some variable:

<img src="https://d33wubrfki0l68.cloudfront.net/2f8f27c472d7df78486e248c40931019b286361b/10d08/visualize_files/figure-html/unnamed-chunk-7-1.png" width="30%" />

* `shape`

--

* `size`

--

* `alpha`

--

* `fill`

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 15%
background-size: 10%

## _Drawing in layers_: **_facets_**

<img src="https://github.com/TuQmano/geo_utdt2022/blob/main/fig/ggplot_layers.png?raw=true" width="35%" />

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 1%
background-size: 10%

## Data Viz: _ggplot2_: **facets**
##### _small multiples_

```r
ggplot(data = millas) +
  geom_point(mapping = aes(x = cilindrada, y = autopista)) +
* facet_wrap(~ clase, nrow = 2)
```

<img src="sesion7_files/figure-html/unnamed-chunk-10-1.png" width="45%" />

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 15%
background-size: 10%

## _Drawing in layers_: **_geometry_**

<img src="https://github.com/TuQmano/geo_utdt2022/blob/main/fig/ggplot_layers.png?raw=true" width="35%" />

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 1%
background-size: 10%

## Data Viz: _ggplot2_: a variety of possible **geometries**

<img src="sesion7_files/figure-html/unnamed-chunk-12-1.png" width="35%" /><img src="sesion7_files/figure-html/unnamed-chunk-12-2.png" width="35%" />

--

```r
# left
ggplot(data = millas) +
* geom_point(mapping = aes(x = cilindrada, y = autopista))

# right
ggplot(data = millas) +
* geom_smooth(mapping = aes(x = cilindrada, y = autopista))
```

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 1%
background-size: 10%

## Data Viz: _ggplot2_

> We combine the two `geom_` calls as layers of a single plot

```r
ggplot(data = millas) +
  geom_point(mapping = aes(x = cilindrada, y = autopista)) +
  geom_smooth(mapping = aes(x = cilindrada, y = autopista))
```

<img src="sesion7_files/figure-html/unnamed-chunk-14-1.png" width="35%" />

--

#### **What do you notice in the code?**

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 1%
background-size: 10%

## Data Viz: _ggplot2_

**We can set _global_ parameters for the whole plot (which later layers can override)**

```r
ggplot(data = millas, aes(x = cilindrada, y = autopista)) +
  geom_point() +
  geom_smooth()
```

<img src="sesion7_files/figure-html/unnamed-chunk-15-1.png" width="35%" />

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 1%
background-size: 10%

## Data Viz: _ggplot2_

```r
ggplot(millas, aes(cilindrada, autopista)) +
  geom_point() +
  geom_smooth()
```

<img src="sesion7_files/figure-html/unnamed-chunk-16-1.png" width="35%" />

> **Parameter names can be omitted**

--

More detail in this [Intro to ggplot](https://es.r4ds.hadley.nz/visualizaci%C3%B3n-de-datos.html)

---
background-image: url(https://github.com/rstudio/hex-stickers/raw/master/PNG/ggplot2.png)
background-position: 95% 15%
background-size: 10%

## _Drawing in layers_: **_theme_**

<img src="https://github.com/TuQmano/geo_utdt2022/blob/main/fig/ggplot_layers.png?raw=true" width="35%" />

---

## Data Viz: _ggplot2_: **_theme()_**

```r
ggplot(millas, aes(cilindrada, autopista)) +
  geom_point() +
  geom_point(data = millas %>% filter(fabricante == "audi"),
             color = "blue", size = 3) +
  geom_smooth(se = FALSE) +
  labs(title = "Performance de los AUDI",
       subtitle = "Un gráfico del TuQmano",
       y = "Etiqueta Y",
       x = "Etiqueta X",
       caption = "FUENTE: {datos} 'R Para Ciencia de Datos'") +
* ggthemes::theme_wsj()
```

<img src="sesion7_files/figure-html/unnamed-chunk-18-1.png" width="35%" />

---
class: inverse, middle, center

# Plotting Properati

[`sesiones/scripts/primer_ggplot.R`](https://github.com/TuQmano/geo_utdt/blob/master/sesiones/scripts/primer_ggplot.R)
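
---

## Data Viz: _ggplot2_: global and per-layer aesthetics

The slide on _global_ parameters notes that later layers can override them, but does not show a case. A possible sketch, assuming the `datos` package is installed; the object name `p` is illustrative and does not come from the course script:

```r
library(ggplot2)
library(datos)  # provides the `millas` data set

# x and y are declared once in ggplot() and inherited by every layer.
p <- ggplot(millas, aes(x = cilindrada, y = autopista)) +
  # colour is mapped in this layer only, so the points are coloured by class
  geom_point(aes(colour = clase)) +
  # this layer inherits only x and y, so a single curve is fitted to all rows
  geom_smooth(se = FALSE)

p
```

Moving `colour = clase` into the `ggplot()` call would instead make `geom_smooth()` inherit it and fit one curve per class.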