Automated image processing workflow
Image processing can be performed with many techniques and software applications, proprietary or open-source. To ensure consistent processing across regions and between image products, the STARS project developed a fully automated satellite data processing workflow that performs the common pre- and post-processing steps. The idea was to set up a system that requires no human intervention and processes a delivered satellite image archive to derive the required information.
The STARS image processing workflow initially focused on deriving image products useful to the teams working on the regional case studies in West and East Africa and Bangladesh. We had two information goals in mind: (1) develop a fully corrected and orthorectified image time series over the area of work, to allow studies of crop development during the season at the level of the farm field or an even smaller spatial unit, and (2) combine these image time series with farm field geometries of known crop type and derive field- and crop-specific statistics from the images, to feed a library of spectral, textural and temporal crop characteristics for future use by research teams. While much of our image data was commercially acquired, and thus cannot itself be made fully public, we have agreement to publish at least one such image time series, and we are obliged to publish the crop characteristics library as well. The latter will be made available as a global public good, and for the foreseeable future we are committed to seeing it used and expanded.
Figure 7.1 Schematic representation of the automated image processing workflow
The automated system performs the six processing operations listed below. The first five are processes required to prepare raw delivered data prior to information retrieval. The sixth is a post-processing step that is performed subsequently.
2. Atmospheric correction
3. Geometric correction
4. Image co-registration
6. Extraction of image-derived statistics for FMUs under study
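As an illustration of the last operation in the list, extracting statistics per FMU amounts to grouping pixel values by the field they fall in and summarising each group. The sketch below is written in Python rather than the project's R code, and it assumes a hypothetical label grid (`fmu_ids`) that has already been rasterized onto the same pixel grid as a single image band. The function name, the inputs and the sample values are illustrative only and are not taken from the STARS code.

```python
from collections import defaultdict
from statistics import mean, pstdev


def fmu_statistics(band, fmu_ids, nodata=0):
    """Group pixel values by FMU label and summarise each group.

    `band` and `fmu_ids` are equally sized 2-D lists (hypothetical inputs).
    `fmu_ids` holds one integer field label per pixel; `nodata` marks
    pixels that fall outside every field.
    """
    pixels = defaultdict(list)
    for band_row, id_row in zip(band, fmu_ids):
        for value, fmu in zip(band_row, id_row):
            if fmu != nodata:
                pixels[fmu].append(value)
    # One summary record per field: pixel count, mean, population std. dev.
    return {
        fmu: {"n": len(vals), "mean": mean(vals), "sd": pstdev(vals)}
        for fmu, vals in pixels.items()
    }


# Made-up 3 x 3 band with two fields (labels 1 and 2) and a background row.
band = [[0.2, 0.4, 0.9],
        [0.2, 0.4, 0.7],
        [0.1, 0.1, 0.1]]
fmu_ids = [[1, 1, 2],
           [1, 1, 2],
           [0, 0, 0]]
stats = fmu_statistics(band, fmu_ids)
```

Repeating such a summary for every band and every acquisition date would yield one record per field, band and date, which is the kind of table a crop characteristics library could be built from.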
Although proprietary software such as Erdas Imagine, ENVI and ArcGIS has traditionally been used to implement these steps, license costs are often a limiting factor for scientists and organizations. The increased availability and capability of open-source software for processing remote sensing data presents an opportunity to develop a satellite data processing workflow that can be used at virtually no cost and adapted to purpose when the need arises.
In this regard, the automated image processing workflow developed by the STARS project is based solely on free and open-source software. It draws on the strengths of several open-source applications to implement the different processing steps. The major applications used are listed below; most of the implementation was written in the R statistical programming language:
- R statistical programming language
- Fortran-based 6S radiative transfer model
- Debian Linux operating system
Accessing and implementing the automated satellite image workflow
The image processing workflow described above has been developed with R as the main programming environment. The code is open-source and freely available. The base operating system requirement is Debian Linux. As satellite image data is often voluminous, some minimum hardware requirements must be met to run the software. For STARS, we used an Intel Xeon 2.6 GHz dual-core processor with 32 GB RDIMM main memory per core, equipped with two 400 GB SSD drives. Storage capacity was provided by a 120 TB NAS server operating in RAID-5 mode.
Once the supporting software environment, the workflow code, and the required directory structure have been installed and set up on the computer, processing is initiated by copying the satellite data that the user wishes to process into a designated folder, in the format in which it was delivered. In addition, the geometries of the study area and of the farm management units (FMUs) of interest must be available from a spatial database, and these geometric targets need to be identified to the workflow. This is the only intervention required from the user. The remainder of the processing is performed automatically in the sequence described above, and the results are placed in designated folders.
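The folder-driven pattern described above can be sketched as follows. This is a minimal Python illustration, not the project's R implementation: the folder names, the `run_workflow` function and the `placeholder_step` stand-in are all hypothetical, and each real processing operation would replace the placeholder.

```python
from pathlib import Path
import tempfile


def run_workflow(incoming, outputs, steps):
    """Apply each step, in order, to every delivery found in `incoming`.

    `steps` is a list of (name, function) pairs. Each function receives
    the path produced by the previous step plus a target folder, and
    returns the path of its own result.
    """
    results = {}
    for delivery in sorted(Path(incoming).iterdir()):
        current = delivery
        for name, step in steps:
            target = Path(outputs) / name  # one designated folder per step
            target.mkdir(parents=True, exist_ok=True)
            current = step(current, target)
        results[delivery.name] = current
    return results


def placeholder_step(source, target):
    # Stand-in for a real processing operation: copies the bytes unchanged.
    result = target / source.name
    result.write_bytes(source.read_bytes())
    return result


with tempfile.TemporaryDirectory() as root:
    incoming = Path(root) / "incoming"
    incoming.mkdir()
    (incoming / "scene_a.tif").write_bytes(b"raw")  # simulated delivery
    steps = [("step_one", placeholder_step), ("step_two", placeholder_step)]
    done = run_workflow(incoming, Path(root) / "processed", steps)
    # Record which step folder holds the final product of each delivery.
    final_names = {name: path.parent.name for name, path in done.items()}
```

Keeping the steps as an ordered list of named functions means the user's only action is placing data in the incoming folder. The order of operations and the output locations are fixed by the configuration rather than by manual handling.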
In the following sections, the rationale for each of the processing steps and their implementation within the context of the automated workflow are described.