How we use Excel for geospatial data analysis and visualisation

I’ve been giving a number of online talks over the past few months, talking about both the general nature of our work as well as about how we accomplish specific technical tasks. During one session of the latter type, I mentioned how a lot of our GIS work actually happens in MS Excel and one of the audience members wanted more information on what that actually entailed. I attempted an answer then, but a GIS task this morning really highlighted how much I need and love MS Excel. I’m going to use this blogpost to document exactly what I did, so if I’m ever asked this question again, I’ll have some documented proof to support my response!

In brief, I was provided with an Indian Government gazette notification detailing the locations of a set of offshore mining blocks off the coast of India. This was in PDF format and while it comprised both text and tables, the relevant data was in tables alone. There seemed to be only 60 or so rows of data, with about 9 columns. In the distant past, I would have just hand-coded the entire thing using the num-pad on my keyboard, but this time I used one of my new favourite OCR tools (Convertio) to rip the relevant pages of the PDF into a Word document. The tables were then copy/pasted into Excel, and that’s where the real work began.

The original data columns in the gazette notification PDF

The original data columns in the gazette notification PDF

As can be seen from the image above, the spatial information is in an odd format, not really ingestible by most GIS software which needs, at a minimum, coordinate pairs to represent points. In addition, I prefer working with decimal degrees rather than the degree-minute-second format as I’ve had some traumatic experiences with apostrophes and quotes in GIS software. So, now working only in Excel, I first did a quick check/fix to catch the few errors from the OCR process ( 8’s read as 3’s, 7’s read as 1’s), I created four new columns (Lat_E, Lat_W, Long_S, Long_N). Each deg-min column pair was converted into a decimal degree format in one of the new columns using the standard conversion formula [ Decimal Degree = Degree + (Minutes/60) + (Seconds/3600)]. I also created a unique ID (UID) for each row by combining the grid number with the initials of the area; for example, Block 12 in the Arabian Sea has a UID of 12_AS.

Now, for the more complex part; GIS software can ingest text files and visualise spatial data as points, lines or polygons. My desired output for this task was a demarcated text file which had all the information required to visualise each offshore mining block. One method would involve creating 3 new rows for each block record, where each row would eventually contain the coordinates for the NE, NW, SE and SW points. This could be done manually, using cell-handling commands in Excel but would be a lot of very boring manual labour, and I would then need to combine the points together in a GIS package to obtain my polygons. Alternatively, I could write a short piece of Python code which would convert the Excel sheet into a GeoJSON file with the appropriate geometry attributes. This is the most powerful of all available options, but felt like overkill for this task.

Instead, I created a new column and put together a quick cell-handling formula that converted the spatial information into a Well Known Text (WKT) polygon format that GIS software such as QGIS is capable of reading. The formula itself looked like this

=CONCATENATE("Polygon ( (",O2," ",M2,", ",P2," ",M2,", ",P2," ",N2,", ",O2," ",N2,") )")

where the cell reference numbers refer to the (Lat_E, Lat_W, Long_S, Long_N) columns in the correct pair-grouping to obtain the NE, NW, SW and SE corners of each offshore mining block.

The converted lat-long data in four columns, as well as the WKT-friendly polygon information

The converted lat-long data in four columns, as well as the WKT-friendly polygon information

The file was then saved as a Comma-Separated Value (CSV) file. I used QGIS to read the file, selecting the WKT format as the input option, and the polygons appeared! After a quick geometry repair process, I configured the labels, exported the file in KML format for visualisation in Google Earth Pro and also created a rough static map, and the task was complete.

This is just one among the many ways in which we’ve used Excel to do the initial heavy lifting for spatial analysis and visualisation tasks, reserving the more specialised GIS tools for when they’re really needed.

A rough map depicting the final polygons

A rough map depicting the final polygons

26 projects approved by the Indian National Board of Wildlife on April 7th 2020

On 7th April 2020, when India was under a national lockdown due to the COVID-19 pandemic, the Standing Committee of the National Board of Wildlife (NBWL) held a virtual meeting and discussed 31 proposals regarding projects inside or within 10km of protected area. 26 of these projects were approved, of which 9 are for limestone mining near Mukundra Tiger Reserve; we’ve bundled these up together to prepare the 18 stories in the story map below.

Since most citizens are unlikely to read through the dry and detailed minutes of official meetings (PDF here), this story map is meant to be representative and informative of the projects approved in that meeting which, barring a few, we would be unlikely to hear of otherwise.

Using drones to assess above-ground carbon stock

3D models of mangrove islands created by applying photogrammetric techniques on UAV imagery

3D models of mangrove islands created by applying photogrammetric techniques on UAV imagery

In our previous posts about assessing ecosystem carbon stocks through remote sensing, we discussed carbon estimation methods which involved the use of vegetation indices applied to satellite imagery, as well as those that utilized volumetric assessments of biomass derived from aerial surveys using LiDAR and RGB sensors. Several papers discussed the potential of using Unmanned Aerial Vehicles (UAVs) to study forest health and carbon estimation, which we take a closer look at in this post.

 

Methods

1.      Carbon estimation through Above Ground Biomass (AGB) derived from Canopy Height Model (CHM)

This method uses RGB data collected from UAVs to create 3D models of tree clusters. From this information, a Digital Terrain Model (DTM) and Digital Surface Model (DSM) are computed, and a Canopy Height Model (CHM) is created. This CHM can be used to deduce median values for height and Diameter at Breast Height (DBH), from which AGB and subsequently carbon stock can be calculated.

Based on existing literature (Ota et al. 2015; González-Jaramillo, Fries, & Bendix 2019), it appears that using CHMs and hence AGB values derived completely from RGB data is not very accurate, primarily because imagery from UAVs cannot be used to build accurate DTMs. Since forest canopies are opaque to RGB imagery, getting good information on the varying height of the terrain at ground level is difficult. González-Jaramillo et al. (2019) found that the correlation between a CHM derived from RGB data and a CHM derived from LiDAR data was very low. Both these studies thus proceeded to use a DTM derived through LiDAR surveys of the area along with RGB based DSMs to derive CHMs. Values for AGB derived from this combination of sensors showed high correlation with those derived only by LiDAR, as well as with those derived through more traditional allometric methods.

It’s interesting to note that González-Jaramillo et al. (2019), did not conduct LiDAR based data collection for this study. They compared the accuracy of UAV based methods against results from a previous study in the same area done using LiDAR. Having access to this data allowed them to create a CHM model that combined LiDAR and RGB imagery. Due to the high costs and computational power required for LiDAR, they recommend that having a single accurate LiDAR-based DTM for an area would be sufficient to do subsequent studies using only UAVs. This is especially useful for forests that require repeated volume estimations, such as for REDD+ or carbon credit programs. Based on these studies, this method seems like a useful application to derive UAV based carbon estimates, provided that a high resolution DTM is available for the area.

 

2.      REDD+ Protocol, as developed by Goslee et al. (2016).

 

This method is based on field work and allometric equations. Sample circular plots are identified in the area for which carbon is to be estimated. The area of these plots is calculated and used to derive a scaling factor. Field work is then conducted in accordance with established protocols to estimate the AGB of sample trees using allometric equations. The scaling factor is used to extrapolate to the plot and entire area of concern. It may be possible to replace some of the tedious fieldwork with data obtained from UAV imagery.

The basic formula is:
𝐶𝑝 = 𝐷𝑀 ∗ 𝐶𝐹 
where
𝐶𝑝 = carbon stock in plot 
DM = dry biomass in plot
CF = carbon fraction



Based on this formula, it would be interesting to test how dry biomass values derived from UAV imagery would compare to results based on field work. RGB images from a UAV could be used to generate a 3D model of the relevant forest plot, which could then be used to derive volume and hence mass in the plot. Estimating dry biomass from this could be done based on the wet vs dry ratio of the tree species. Standardised carbon fractions are publicly available for most tree species.

 

Since this method specifies which field work protocols are to be used, replacing variables by deriving them through UAV-based RGB data would need to be tested adequately. Results from both methods on the same plots would need to be correlated to check accuracy and determine whether this is truly an option. The scale of the area used in this method varies. Especially for larger areas, using UAVs could potentially increase  both the speed and accuracy of the surveys, since it would reduce the time spent on field work as well as provide estimates on the volume of entire plots, rather than having to sample individual trees per plot for extrapolation.

 

3.      Biomass Expansion Factor (BEF) Method:

In this method, above-ground biomass density is derived from the volume of biomass, wood density and BEF (Brown 1997).  This method is generally based on data from National Forest Inventories (NFI). These are records of forest attributes such as species, DBH, age, class, etc generally collected by governments which is used to estimate volume of the biomass of the given area. This is then combined with the species-specific wood density (standardised and available for different tree species) as well as BEF to estimate Aboveground Biomass and then carbon. BEF is the ratio of aboveground oven-dry biomass of trees to oven-dry biomass of inventoried volume. Standardised constants of BEF have been developed by the IPCC (2014) for use across various forest types.

Volumetric assessments derived through NFI data have been cited as one of the shortcomings of this method (Shi & Liu 2017) since the resolution and relevance of the NFI data varies greatly across the world. We see merit in experimenting to see whether using UAVs could fill this gap. Deriving volume data of forests by creating 3D models with UAVs could be useful in situations that require updated and accurate carbon estimations of relatively smaller forested areas that can be mapped by UAVs. For such areas, this could be an efficient way to estimate carbon depending on the accuracy of the results.

 

4. Normalized Difference Vegetation Index (NDVI) Method:

Carbon estimation through satellite-derived vegetation indices are well documented. The same methods can be applied using multispectral UAVs for higher resolution and accuracy. Tian et al. (2017) tested the correlation of AGB derived by various vegetation indices using a multispectral UAV and found that NDVI showed the highest correlation with AGB. However, even their best model only showed a moderate correlation with the AGB (r^2=0.67). Following this, Gonzalez et al (2019) compared results of AGB derived through a UAV based NDVI with that derived from LiDAR for the same area, and showed no correlation. According to them, this was because the UAV-based sensors they were using were saturated, and the resulting NDVI wasn’t suitable to provide forest structure information. Based on this, it seems possible to use UAVs to derive carbon estimates through vegetation indices in contexts where the NDVI varies significantly within the plot.

 

Experiment

Recently, we’ve been experimenting with collecting data on above-ground biomass from UAV based volumetric assessments over mangrove forests, and have some preliminary results. On the mangrove islands indicated below, we’ve calculated that there’s approximately 57,000 cubic metres of above-ground biomass over an area of 3 acres.

Three acres of mangrove forest across two islands, sequestering ~ 27,000 tons of carbon (with huge caveats!).

Three acres of mangrove forest across two islands, sequestering ~ 27,000 tons of carbon (with huge caveats!).

Using the REDD+ protocol, and standard values for the carbon fraction, we estimate that there’s approximately 27,000 tons of above-ground carbon in this plot (calculated value of 8,837 tons/acre). This is definitely an overestimation, possibly by an order of magnitude; the two main factors being due to all the empty space being accounted for in the canopy, as well as because these equations use dry biomass values. With more field work and research, we’ll be able to obtain site-specific scaling factors that we can use to estimate a stronger relationship between the volume of a mangrove forest and the carbon sequestered within it.

In conclusion, with UAV technology more accessible, it seems like the right time to test what is possible to better understand forest health using this platform. Using UAV data in combination with other data sources and testing various methods may produce interesting results and take us a few steps closer to efficient and accessible monitoring of forest carbon and health through remote sensing. If you have any comments or suggestions regarding the suitability of these methods, or could recommend additional methods, please let us know!

 

References

 

·         Ota, T., Ogawa, M., Shimizu, K., Kajisa, T., Mizoue, N., Yoshida, S., … Ket, N. (2015). Aboveground Biomass Estimation Using Structure  from Motion Approach with Aerial Photographs in a  Seasonal Tropical Forest. Forests6(12), 3882–3898. https://doi.org/10.3390/f6113882

·         González-Jaramillo, V., Fries, A., & Bendix, J. (2019). AGB Estimation in a Tropical Mountain Forest (TMF) by Means of RGB and Multispectral Images Using an Unmanned Aerial Vehicle (UAV). Remote Sensing11(12), 1413. https://doi.org/10.3390/rs11121413

·         Lei Shi and Shirong Liu (February 22nd 2017). Methods of Estimating Forest Biomass: A Review, Biomass Volume Estimation and Valorization for Energy, Jaya Shankar Tumuluru, IntechOpen, DOI: 10.5772/65733.

 

·         Kauppi, P. E., Ausubel, J. H., Fang, J., Mather, A. S., Sedjo, R. A., & Waggoner, P. E. (2006). Returning forests analyzed with the forest identity. Proceedings of the National Academy of Sciences103(46), 17574–17579. https://doi.org/10.1073/pnas.0608343103

 

·         Goslee, K., Walker, S. M., Grais, A., Murray, L., Casarim, F., & Brown, S. (2016). Module C-CS: calculations for estimating carbon stocks. Leaf technical guidance series for the development of a forest carbon monitoring system for REDD+. Winrock International.          https://www.leafasia.org/sites/default/files/tools/Winrock_LEAF_REDD_TechSeries_C-CS_0.pdf

 

·         Inter Governmental Panel on Climate Change. (2014). IPCC Good Practice Guidance for LULUCF (Annex 3A.1 Biomass Default Tables for Section 3.2 Forest Land). Retrieved from https://www.ipcc-nggip.iges.or.jp/public/gpglulucf/gpglulucf_files/Chp3/Anx_3A_1_Data_Tables.pdf

 

·         Brown, S. (1997). Estimating biomass and biomass change of tropical forests: a primer (Vol. 134). Food & Agriculture Org. https://books.google.com/books?hl=en&lr=&id=uv-ISezvitwC&oi=fnd&pg=PA1&dq=Estimating+Biomass+and+Biomass+Change+of+Tropical+Forests:+a+Primer.+(FAO+Forestry+Paper+-+134)&ots=OCwbWs7WzH&sig=xaDzjevMu6Yg5jwbBuciglfkeIQ

 

·         Brown, Sandra., & Lugo, A. E. (1992). Aboveground biomass estimates for tropical moist forests of the Brazilian Amazon. Interciencia. Caracas17(1), 8-18.

 

·         Tian, J., Wang, L., Li, X., Gong, H., Shi, C., Zhong, R., & Liu, X. (2017). Comparison of UAV and WorldView-2 imagery for mapping leaf area index of mangrove forest. International Journal of Applied Earth Observation and Geoinformation61, 22–31. https://doi.org/10.1016/j.jag.2017.05.002