Intro to R as GIS

For the past two years I have served as one of two student representatives on the US-IALE executive committee. In addition to providing a student voice on the ExComm, one of the major things we do is organize a students-only half-day workshop at our annual meeting. This is offered at no cost to students attending the conference, and we try to cover topics relevant to the field. In 2015 our chapter hosted the IALE World Congress in Portland, Oregon, and with an eye on software that many students are learning to use we (ambitiously) put together an introductory workshop on manipulating and analyzing spatial data using R. We were able to recruit three other people to help develop and deliver the workshop, and managed to cram the whole thing into 4 hours.


Karl providing some guidance during the workshop – equally possible that he’s saying, “I haven’t seen that error before…”

Given that we unleashed a barely controlled firehose of R on the attendees, I think that overall it went okay. Given the material I think it would work better as a 6-8 hour workshop with the option for attendees to bring/use their own data. Maybe this is the way it should be set up from the start, i.e. here is a dataset that I know it works with, now try and do it with your own. I haven’t organized or been a part of delivering many workshops, but I learned a lot and really enjoyed the experience.

If you want to check them out, the workshop materials are freely available here on GitHub.


Edge Effects and Connectivity in Landscape Ecology

The way the landscape is seen from your perspective or mine is likely similar, yet not quite the same, and still our interactions with this landscape are completely different from those of a wolf or a bird or a plant or a microbe. This is infinitely fascinating to me.

This semester we have been having paper discussions during our lab meetings, each led by a different member (grad students and postdocs). The first few were tilted toward the human dimension side of our lab, so I was excited to mix things up and lead a discussion about some traditional landscape ecology research. Thinking about the incredible variety of landscapes, how they are connected and divided, how those patterns of connection and division change depending on your perspective is my version of “going back to the bench.” It is one of the major inspirations to me as a scientist. So this week we talked about some ideas that are at the foundation of landscape ecology, particularly edge effects and connectivity.

What are “edge effects” and “connectivity” anyway? The people in our lab group come from a variety of backgrounds, personally and academically. I asked people to provide a definition of “edge effects” from their perspective, and this produced two responses. Everyone has at least a little experience with GIS, so one type of “edge effect” brought up was technological: if you are doing a calculation over a gridded surface, the values at the edges of the map end up biased because fewer input cells are available to calculate the values for these cells. The other definition was ecological: an “edge effect” is due to an abrupt transition between environments or landscape characteristics that creates relatively distinct habitat boundaries. This type of edge effect influences the local climate and the species that are likely to occur in or occupy the space on either side.
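The technological version of the edge effect is easy to see with a small example. This is only a minimal sketch in base R, assuming a 3 x 3 moving window over a 5 x 5 grid (both sizes are made up for illustration). It counts how many cells are actually available to the window at each position.

```r
# Count how many grid cells fall inside a 3 x 3 moving window centred on
# each cell of a small grid. Grid and window sizes are illustrative only.
n_row <- 5
n_col <- 5
window_count <- matrix(0L, n_row, n_col)

for (i in seq_len(n_row)) {
  for (j in seq_len(n_col)) {
    # Clip the window to the extent of the grid
    rows <- max(1, i - 1):min(n_row, i + 1)
    cols <- max(1, j - 1):min(n_col, j + 1)
    window_count[i, j] <- length(rows) * length(cols)
  }
}

print(window_count)
```

Interior cells draw on all nine cells of the window, while cells along the border have fewer to work with, so any summary calculated there rests on less information.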

Connectivity typically falls into one of two categories, structural or functional, though these are not necessarily mutually exclusive. Structural connectivity is probably the most familiar type to many people. One example is wildlife corridors, which provide a pathway for animals to travel but are not exactly the type of habitat where they would linger. For me, functional connectivity is more easily characterized by thinking about passively dispersed organisms such as wind-dispersed pathogens (I study one of these so I might be a little biased). In this example, the pathogen depends on hosts occurring in sufficient frequency and density in order for it to traverse the landscape, and establish and reproduce in a new location. So, a corridor connecting two larger areas may be structural or functional or both in terms of connectivity.

In the paper that we discussed, the authors designed a landscape-scale experiment to test the effects of connectivity, fragmentation, and edges on the development and spread of a plant disease. The landscape-scale experiment itself is admirable because replication at a scale larger than a laboratory or greenhouse is challenging. It is just so big.

The pathogen they were investigating was southern corn leaf blight on sweet corn. They tested whether a structural corridor affected the spread and development of this wind-dispersed pathogen across the landscape. In addition they tested whether there were edge effects on disease development by placing infected plants at varying distances from the edge of the “habitat” patch. The habitat in this case was “regenerating longleaf pine forest” that had been cut into patches with various configurations (I believe for other purposes, but useful for this experiment). They found that connectivity did not have a detectable effect on disease spread or development, but did detect edge effects that were dependent on the configuration of the patch.

While this landscape was supremely useful for doing experiments with this disease system, a substantial drawback was the realism. The immediate question that came to my mind was: if there had been functional connectivity in addition to the structural connectivity, would they have detected an effect, especially since this is a passively dispersing pathogen? This is an additional experiment that I and others thought would have really improved the study, but that does not take away from the insights that they did gain. And I think this is how science works, in bits and pieces, fits and starts, and eventually we are hopefully able to say at least one thing about a system or process with substantial confidence.

Temporary files pile-up while using the `raster` package in R

Update: I’m not sure that this method has ever actually worked for me. I would love to hear about successes or failures from others. Restarting the computer seems to always free things up.

The raster package in R is an incredibly useful and powerful free and open source solution for geospatial analysis, especially if you are familiar with R but don’t regularly work with other GIS software. It is also very useful even if you do; after all, you may not always be working somewhere that can afford licenses for commercial desktop GIS software (ahem, ArcGIS). Though I and my fellow students here at NC State have ready access to commercial software at no cost to ourselves, we really like learning to use and integrate R into our work, because the work can then be reproduced and we can collaborate more easily. The quoted information in this post can be found here on Inside-R, a super-helpful reference site. The raster package was developed for:

“Reading, writing, manipulating, analyzing and modeling of gridded spatial data. The package implements basic and high-level functions. Processing of very large files is supported.”

It’s that last part that leads to the pile-up of temporary files.

Another student and I are working on a project (it’s his thesis, so really he is doing the work) that compares the outcomes of species distribution models using predictor variables at different resolutions across the entire extent of Oregon and California. This means processing a lot of mapped surfaces during the fitting and prediction phases. The raster package is the only way to handle this, because it can write results to disk when they do not fit in memory. As the documentation puts it:

“Functions in the raster package create temporary files if the values of an output RasterLayer cannot be stored in memory (RAM). This can happen when no filename is provided to a function and in functions where you cannot provide a filename (e.g. when using ‘raster algebra’).”
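To make the “no filename is provided” part concrete, here is a minimal sketch of supplying one, assuming the raster package is installed. The tiny raster and the output path are made up for illustration, and this only applies to functions that accept a filename argument (calc is one of them); raster algebra does not.

```r
library(raster)

# Small example raster; real inputs would be far larger.
r <- raster(nrows = 10, ncols = 10)
values(r) <- 1:100

# Hypothetical output path chosen for this example.
out_file <- file.path(tempdir(), "doubled.grd")

# Supplying `filename` sends the result to a file I chose, rather than
# leaving raster to decide whether and where to write it.
doubled <- calc(r, fun = function(x) x * 2,
                filename = out_file, overwrite = TRUE)

filename(doubled)  # path of the file backing the result
```

The appeal is that I know where the output lives and can delete it myself when I am done with it.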

Now, in a “normal” R session (using the command line or GUI that comes with R installs) these temporary files are automatically removed at the start of each session. However, it seems that if you are using RStudio under certain settings (maybe the defaults, not positive on that part), then the temporary files may be retained even when you start a new session. So, if you find that your hard drive is filling up and you are wondering “Where, why, and how do I fix it?”, the solution is right there in the raster package: the removeTmpFiles function removes all temporary files older than a minimum age, which is set in hours by the argument h.

# Remove all temporary files that are more than 24 hours old:
raster::removeTmpFiles(h = 24)
# Remove all temporary files currently in existence:
raster::removeTmpFiles(h = 0)
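
If removeTmpFiles does not seem to free up any space (see the update at the top of this post), the same age-based cleanup can be sketched in base R. This is not how raster does it internally. The helper remove_old_files is a hypothetical function of my own, and the demo runs on a throwaway directory so nothing real gets deleted. To point it at the actual temporary files I would first check where they live, for example with raster's tmpDir().

```r
# Hypothetical helper (not part of raster): delete files in `dir` whose
# last-modified time is more than `h` hours ago. Returns the paths removed.
remove_old_files <- function(dir, h = 24) {
  files <- list.files(dir, full.names = TRUE)
  if (length(files) == 0) return(character(0))
  age_hours <- as.numeric(difftime(Sys.time(), file.mtime(files),
                                   units = "hours"))
  old <- files[!is.na(age_hours) & age_hours > h]
  unlink(old)
  old
}

# Demo on a throwaway directory with one "old" and one "new" file.
demo_dir <- file.path(tempdir(), "tmpfile_demo")
dir.create(demo_dir, showWarnings = FALSE)
old_file <- file.path(demo_dir, "old.grd")
new_file <- file.path(demo_dir, "new.grd")
file.create(old_file, new_file)
Sys.setFileTime(old_file, Sys.time() - 48 * 3600)  # pretend it is 2 days old

removed <- remove_old_files(demo_dir, h = 24)
remaining <- basename(list.files(demo_dir))
```

Returning the removed paths makes it easy to check what was deleted before trusting the function with a directory that matters.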