Thinking Big with Maps in R

Tips on Wrangling Large Vector Data into Interactive Maps


Silvia Canelón, PhD

CANSSI Ontario Statistical Software Conference

Nov. 10, 2022

Silvia Canelón

Data Analyst @ Penn Urban Health Lab

University of Pennsylvania, Philadelphia, PA, US

smiling woman with a tan complexion, dark eyes, and dark long wavy hair styled to one side

Link silviacanelon.com
Mastodon fosstodon.org/@spcanelon
Twitter @spcanelon
GitHub @spcanelon

Static map

Plotting with geom_sf()

  • A lot of lines
  • Some large polygons?
  • Seems like the entire county was covered by tree canopy
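A static map of this kind can be sketched as below. This is only a minimal illustration, not the code behind the slide: it uses the `nc` sample shapefile that ships with `sf` as a stand-in for the tree canopy layer, and the object name `static_map` is an assumption.

```r
library(sf)
library(ggplot2)

# Stand-in data: the North Carolina counties shapefile bundled with sf
# (the tree canopy layer from the talk is not reproduced here)
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# geom_sf() draws each feature's outline by default; with hundreds of
# thousands of small polygons those outlines can dominate the picture
static_map <- ggplot(nc) +
  geom_sf()

static_map
```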

I needed an interactive map to take a closer look at the data

Which package to use?

  • leaflet is best for datasets with <50,000 features…my tree canopy dataset had 193,418 multipolygons
  • leafgl is recommended in concert with leaflet (spoiler: it didn’t work for me)
  • settled on mapview
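A quick way to see where a layer falls relative to the 50,000-feature guideline is to count its rows. The sketch below again uses the `nc` sample data as a stand-in; `feature_limit` is a hypothetical name for the threshold mentioned above.

```r
library(sf)

# Stand-in data; replace with the layer you actually want to map
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Each row of an sf object is one feature
n_features <- nrow(nc)

# Guideline from the slide: leaflet is best below roughly 50,000 features
feature_limit <- 50000
fits_leaflet_guideline <- n_features < feature_limit

n_features
fits_leaflet_guideline
```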

Plotting with mapview

Map of Philadelphia overlaid with tree canopy data, some of which is missing in distinct vertical bands

  • Vertical strips of data missing
  • Non-functional interactivity
  • RStudio would crash
  • VSCode was hit-or-miss
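For reference, a basic mapview call looks like the sketch below. It uses the small `nc` sample layer, so it should render without the problems listed above; those appeared with the much larger tree canopy dataset.

```r
library(sf)
library(mapview)

# Stand-in data: 100 county polygons, far smaller than the canopy layer
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Build the interactive map; printing the object opens it in the viewer
m <- mapview(nc)
m
```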

Tweet with text: #RSpatial friends, do you have any tips for working with datasets that have 200k+ features (e.g. multipolygons)? I tried using the #mapview package and it struggled. It couldn't render the full map 👇and I wasn't able to use the zoom controls. What am I missing? #RStats. Tweet fig alt: Map of Philadelphia overlaid with tree canopy data, some of which is missing in distinct vertical bands

Suggestion

Open in QGIS


Tweet text: In such situations I usually export the layer, and explore it in QGIS
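Exporting a layer so it can be opened in QGIS might look like the following sketch. The GeoPackage format and the file name `layer_for_qgis.gpkg` are assumptions chosen for illustration; the tweet does not say which format was used.

```r
library(sf)

# Stand-in data; replace with the layer you want to inspect in QGIS
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Write to a GeoPackage in a temporary folder (hypothetical file name)
out_file <- file.path(tempdir(), "layer_for_qgis.gpkg")
st_write(nc, out_file, delete_dsn = TRUE, quiet = TRUE)

# The resulting file can be added to a QGIS project as a vector layer
out_file
```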




QGIS rendering in action

Tweet text: recently i've counted the number of points in a feature (esp super complicated multipolygons) and simplified the geometry based on that (next tweet!). Next tweet text: e.g. pts <- mapview::npts(feature) if (pts > 10000) {rmapshaper::ms_simplify(feature, keep = 0.1, keep_shapes = TRUE)} else if (pts > 5000) {ms_simplify(feature, keep = 0.3, keep_shapes = TRUE)} else {feature}

Suggestion

Simplify geometry with rmapshaper

  • super complicated multipolygons
  • simplifying does not have to mean eliminating features
    • but it could!
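The snippet from the tweet can be wrapped into a runnable sketch like the one below. The thresholds and `keep` values come from the tweet; the function name `simplify_by_vertex_count` and the `nc` sample data are assumptions added for illustration. Depending on the installed version, rmapshaper may also need the V8 package or a system install of mapshaper.

```r
library(sf)
library(rmapshaper)

# Hypothetical helper: simplify more aggressively the more vertices
# the input contains (thresholds taken from the tweet)
simplify_by_vertex_count <- function(feature) {
  pts <- mapview::npts(feature)  # total number of vertices
  if (pts > 10000) {
    ms_simplify(feature, keep = 0.1, keep_shapes = TRUE)
  } else if (pts > 5000) {
    ms_simplify(feature, keep = 0.3, keep_shapes = TRUE)
  } else {
    feature  # small enough already, return unchanged
  }
}

# Stand-in data; replace with the layer you want to simplify
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc_simple <- simplify_by_vertex_count(nc)
```

Setting `keep_shapes = TRUE` asks mapshaper not to drop small polygons entirely while it removes vertices.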

Tweet text: I use the package tmap, it works great. You have 2 viewing options the static one and the one you can zoom in and out.

Suggestion

Try the tmap package
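The two viewing options mentioned in the tweet correspond to tmap's "plot" and "view" modes. A minimal sketch, assuming the `nc` sample data as a stand-in and `canopy_map` as a hypothetical object name:

```r
library(sf)
library(tmap)

# Stand-in data; replace with the (simplified) layer you want to map
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# The same map definition works in both modes
canopy_map <- tm_shape(nc) +
  tm_polygons()

tmap_mode("plot")  # static map
canopy_map

tmap_mode("view")  # interactive map with zoom and pan
canopy_map
```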

Tweet text: Try {rdeck}. Handles datasets of that size and orders of magnitude larger without hassles. Can also use vector tiles if you need to [go] larger.

Suggestion (WIP)

Try the rdeck package

Thank you #RStats!

Tweet quoting the original tweet asking for help. Text reads: Update: I’ve got a functional map! Many thanks to all the lovely people that contributed to this thread ✨ I simplified the geometries from 193k features to 149k using #rmapshaper and created the map using #tmap #RStats #RSpatial

Bonus slide!

This slide was added after the presentation to include feedback from talk attendees

  • Rendering issues might come from maxing out the RAM
  • To reduce the size of the dataset, one could lower the coordinate precision, because coordinates sometimes carry more decimal digits than you need or can meaningfully rely on
  • The mapdeck package might be another one to try. It’s powerful and sits on Mapbox which lets you pull in nice basemaps as well
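One way to try the precision suggestion with `sf` is sketched below. It was not shown in the talk, and the precision value `1e4` (about four decimal places) is an arbitrary assumption. The precision set with `st_set_precision()` is only applied when geometries are serialized, so the sketch round-trips them through well-known binary. The saving mainly shows up once the data is written out, for example as GeoJSON for a web map.

```r
library(sf)

# Stand-in data; replace with the layer whose coordinates you want to round
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
geom <- st_geometry(nc)

# Tag the geometries with a precision (assumed value: 4 decimal places),
# then serialize and parse them so the rounding actually takes effect
geom_tagged <- st_set_precision(geom, 1e4)
geom_rounded <- st_as_sfc(st_as_binary(geom_tagged), crs = st_crs(geom))

# Compare a few coordinates before and after rounding
head(st_coordinates(geom)[, 1:2], 3)
head(st_coordinates(geom_rounded)[, 1:2], 3)
```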
