This notebook continues the evaluation of drought indicators introduced in notebook 1. The H SAF ASCAT sensors onboard the MetOp series of polar-orbiting satellites supply near real-time surface soil moisture (SSM) estimates, with data extending back to 2007. This extensive record allows us to compute temporal anomalies from the SSM data, using the long-term mean as a baseline. These anomalies are then standardized as deviations, or Z scores.
As noted in notebook 1, ASCAT SSM is measured in degrees of saturation, making spatial patterns reliant on soil characteristics such as porosity. However, the temporal changes used for drought anomaly detection with Z scores can provide insights into spatial patterns of drought conditions without requiring auxiliary soil data.
import cartopy.crs as ccrsimport datashader as dsimport holoviews as hvimport hvplot.pandas # noqaimport numpy as npimport pandas as pdfrom envrs.download_path import make_url
ERROR 1: PROJ: proj_create_from_database: Open of /home/runner/work/eo-datascience/eo-datascience/.conda_envs/environmental-remote-sensing/share/proj failed
12.3 Standardized Precipitation-Evapotranspiration Index
Historically, many studies on drought monitoring have used hydro-meteorological indicators such as the Standardized Precipitation Index (SPI) and the more comprehensive Standardized Precipitation-Evapotranspiration Index (SPEI). SPEI is seen as the more robust metric for drought monitoring as it uses the difference between precipitation and potential evapotranspiration (\(P – ETo\)), rather than precipitation (\(P\)) as an input alone, and, where the potential evapotranspiration can be calculated from weather data. However, satellite-based soil moisture estimates, and linked drought anomaly indices (e.g., Z scores) might be more suitable for agricultural drought monitoring as they focus more directly on plant water requirements by directly measuring the soil’s available water content.
Both SPEI and standardized soil moisture drought anomalies, such as Z scores, allow for direct comparisons across different regions and time scales.
To summarize:
SPEI is a hydro-meteorological indicator for drought monitoring based on anomalies in precipitation and evapotranspiration
Z scores are based on standardized anomalies derived from soil moisture; e.g., spaceborne microwave-retrieved surface soil moisture
The purpose of this notebook is to compare both drought indicators, particularly focusing on the spatial extent of drought-affected areas. We will relate these indicators to recorded drought events in Mozambique between 2007 and 2021 (The International Disaster Database).
Year/Period
Affected Provinces
Estimated Affected People
2008 (Early)
Maputo, Gaza, Inhambane, Manica, Sofala, Tete
~1.2 million
2010
Maputo, Gaza, Inhambane
~460,000
2015–2016 (El Niño)
Maputo, Gaza, Inhambane, Sofala, Tete
~2.3 million
2021
Cabo Delgado, Tete (south), Manica (north)
~1.56 million
We first load monthly aggregated SPEI data in the following code cell.
Now let’s also load monthly aggregated H SAF ASCAT SSM Z scores which we already prepared for notebook 1. For this exercise we will filter the data to have a comparable time range (up to the end of 2021) as we have for the SPEI data.
Now we will have to combine both datasets. Fortunately, both datasets are already projected on the same grid, so this task does not involve any re-projection and resampling of the data. The join operation is made easy here as we have already defined the same indexes in both datasets when loading the data. We can, in this case, use the join method, which joins the data based on the index values of the left and right datasets.
df_wide = df_spei.join(df_ascat)df_wide
latitude
longitude
spei
zscore
time
location_id
2007-01-01
10234428
-26.855516
32.357162
-1.563430
-0.371901
10235038
-26.849566
32.621094
-1.418305
-0.687805
10235648
-26.843615
32.885020
NaN
-0.721146
10236025
-26.839937
32.457973
-1.502107
-0.183039
10236635
-26.833986
32.721905
-1.427557
-0.285288
...
...
...
...
...
...
2021-12-01
11992443
-10.611925
40.540737
NaN
-1.488362
11993430
-10.603155
40.377617
-1.170215
-1.797294
11995027
-10.588965
40.478430
NaN
-1.587852
11997611
-10.566007
40.416126
-1.141002
-1.628807
12001792
-10.528863
40.454630
NaN
-1.572855
3548880 rows × 4 columns
This operation produced a “record” or “wide” format dataset, typically there is one row for each subject, in this case the indicators “spei” and “zscores” as well as the coordinates.
12.5 Simplifying Drought Severity with Data Binning
We will convert the numeric data of the drought indicators "spei" and "zscore" into discrete categories using the pandas cut method. In pandas, binning data—also known as discretization or quantization—involves dividing continuous numerical data into discrete bins or intervals. This process is beneficial for sharing quantitative data with other users who can easily interpret it, but also it manages outliers, helps creating histograms, and prepares data for machine learning algorithms that require categorical input. Additionally, we will label the binned data, thereby transforming the columns into pandas categorical data types. Pandas categorical data types are used to represent data that takes on a limited and usually fixed number of possible values (categories or classes). This type is particularly useful for categorical variables such as gender, days of the week, or survey responses. It offers efficient storage and operations for categorical data, including handling category ordering and missing values.
The process of binning and labeling drought data based on intensity is somewhat subjective, as the bin thresholds are often arbitrarily assigned and subject to debate. Here, we adhere to the guidelines provided by the World Meteorological Organization and the definitions by McKee et al. (1993)2 for standardized soil moisture based drought indices, where a “moderate” drought is defined as starting at -1 unit of standard deviations.
Now we can use the labels and thresholds to bin the columns of the drought indicators. We make a copy of the original data so that we preserve the original dataset.
The simplified labeled drought indicators now enable us to take the first step in assessing the spatial extent.
To perform a sanity check on our results, we will recreate the plot from notebook 1, but this time using categorical data types for the drought indicators. However, before we can plot this data, we need to reshape it into a “long” or “stacked” format. This format will allow us to use the groupby parameter of the hvplot extension for pandas. We will reshape the data using the pandas melt method, keeping the indexes and coordinates unchanged while stacking the indicator values. The variable column will identify the drought indicator.
Now that we have classified drought intensity into categories, let’s calculate the spatial extent of droughts over time. This will enable us to compare how the two standardized drought indices relate to each other. Conveniently, we can use the pandas value_count method on the new categorical columns "spei" and "zscore", and also employ the normalize parameter to get relative frequencies. This provides an understanding of how many data points fall into each drought class relative to the total number of observations for a certain month.
As a final step we will plot the timeseries of the relative spatial extend of the upper three drought clases (“moderate”, “severe”, and “extreme”). We compare these results to recorded drought events that had a severe impact on food security and destabelized Mozambique.
We observe significant differences between the two drought indicators. The SPEI drought information shows much higher variability compared to the Z-scores derived from H SAF ASCAT SSM data. We find that the spatial extent of the most severe drought classes, as indicated by the Z-scores, aligns well with recorded drought disasters.
The higher variability in SPEI might be due to the more variable precipitation input, while soil moisture has a “memory”, i.e. soil moisture represents water storage. Precipitation does not account for infiltration and overland flow. In SPEI the type of precipitation is not accounted for, and thus a very intense rainfall event, where a lot of rainfall falls in a short period, will not lead to a sustainable increase in soil water content, and plant available water, as most of the water will not infiltrate in the soil. Therefore, deviations from the norm are likely more subdued in soil moisture data than in SPEI but are more representative for plant available water and is intricately linked to agricultural practices that are crucial for food security in Mozambique.
12.7 A Comprehensive Comparison of Drought Indicators
To obtain a more comprehensive and accurate overview of H SAF ASCAT SSM in detecting agriculturally relevant drought events, we will compare it with an additional set of drought indicators. These indicators are monthly Z-scores calculated from different soil moisture estimates. The estimates are derived from in-situ observations, remotely sensed retrievals, or hydro-meteorological models
Dataset
Description
Key Features
Applications
ERA5
ERA5 is a reanalysis dataset produced by ECMWF that provides comprehensive global climate and weather data, including soil moisture.
- High spatial and temporal resolution - Derived from model simulations and observational inputs - Global coverage
- Climate research - Hydrological modeling - Agricultural applications - Drought monitoring - Water resource management
GLDAS
The Global Land Data Assimilation System (GLDAS) soil moisture dataset provides global estimates of soil moisture based on satellite observations, ground measurements, and model simulations.
- High-resolution and temporally consistent - Integrates various data sources - Developed by NASA and NOAA
The ESA CCI (European Space Agency Climate Change Initiative) Soil Moisture project aims to provide long-term, consistent datasets of global soil moisture derived from satellite observations.
- Long-term and consistent datasets - Derived from satellite observations - Supports scientific research and policy-making
We have already prepared a dataset that includes all these drought indicators, along with the previously introduced SPEI and ASCAT data (previously labeled "zscore").
To simplify the calculation of relative drought spatial extent, we will create two functions that loops over the different columns to perform the same analysis as before: 1) set the categorical drought type and 2) calculate relative drought extent.
def set_drought_cat_type(df: pd.DataFrame) -> pd.DataFrame:"""Drought Severity Levels. Parameters ---------- df: Pandas.DataFrame Data frame with continuos drought indicators. Returns ------- Pandas.DataFrame: Data frame with discrete drought indicators """ col_names = df.drop(columns=["longitude", "latitude"]).columnsfor name in col_names: min_border = df[name].min() max_border = df[name].max() thresholds = np.array( [ min_border if min_border <-2else-2.1, # noqa: PLR2004-2,-1.5,-1,0, max_border if max_border >0else0.1, ] ) df[name] = pd.cut(df[name], thresholds, labels=drought_labels)return dfdef calc_drought_areal_extend(df: pd.DataFrame) -> pd.DataFrame:"""Areal Extent of Drought Severity Levels. Parameters ---------- df: Pandas.DataFrame Data frame with discrete drought indicators. Returns ------- Pandas.DataFrame: Data frame with normalize counts of drought levels """ col_names = df.drop(columns=["longitude", "latitude"]).columnsreturn pd.concat( [ df.groupby(level=0)[col].value_counts(normalize=True).unstack() # noqa: PD010for col in col_names ], keys=pd.Index(col_names, name="indicator"), )df_drought_indices_cat = set_drought_cat_type(df_drought_indices.copy())df_drought_extend = calc_drought_areal_extend(df_drought_indices_cat)df_drought_extend.hvplot.area( x="time", y=drought_labels[::-1][2:], groupby="indicator", hover=False, frame_width=800, padding=((0.1, 0.1), (0, 0.9)),) * labels * points
The additional drought indicators share the characteristic of directly measuring or estimating the soil’s water content by modeling soil geophysical conditions. This comparison validates the earlier observation that the SPEI-inferred drought extent exhibits greater variability compared to indicators based on soil moisture. However, with the exception of the H SAF ASCAT SSM-derived Z scores, all indicators show significant discrepancies in their estimates of drought extent for the five major drought events.
12.8 Close-up of the 2015–2016 (El Niño) Drought Event
Let’s now zoom in on one of the most significant drought disasters of recent history in Mozambique. The 2015-2016 drought event which has been related to El Niño significantly impacted agriculture, food security, and water resources. This drought was characterized by reduced rainfall and elevated temperatures, exacerbating conditions for farming and livestock. Estimated numbers for people affected by this events are ~2.3 million.
We filter this event from the loaded dataset and plot as before with hvplot.
A visual inspection of the map displaying various drought indicators reveals broad similarities, with most indicators highlighting drought conditions in the southern provinces of Maputo and Gaza, as well as in Sofala, Tete, and Zambezia. However, the spatial patterns also exhibit significant differences in their finer details. These discrepancies are likely influenced by the inherently coarser resolutions of ERA5 (9 km), ESA CCI, and GLDAS (~25 km).
S. M. Vicente-Serrano, S. Begueria, and J. I.López-Moreno,A Multiscalar Drought Index Sensitive to Global Warming: The Standardized Precipitation Evapotranspiration Index, Journal of Climate, vol. 23, no. 7, pp. 1696-1718, Apr. 2010, doi: 10.1175/2009JCLI2909.1.↩︎
T. B. McKee, N. J. Doesken, and J. Kleist, The Relationship Of Drought Frequency and Duration to the Timescale. Eighth Conference on Applied Climatology, 17-22 January 1993, Anaheim, California↩︎