What Is GDAL?
GDAL, the Geospatial Data Abstraction Library, is an open-source toolkit for reading, writing, and transforming raster and vector geospatial data formats. Developed as a project of the Open Source Geospatial Foundation (OSGeo), GDAL underpins much of the geospatial software you'll encounter: QGIS, ArcGIS, PostGIS, Google Earth Engine, and many others use GDAL in some form under the hood.
The library ships with a suite of command-line utilities that let you manipulate spatial data directly in your terminal — no GUI required. Once you're comfortable with GDAL, you'll be able to convert formats, reproject data, clip rasters, merge files, and build automated workflows with ease.
Installing GDAL
GDAL is available on all major platforms:
- Linux (Ubuntu/Debian): sudo apt install gdal-bin python3-gdal
- macOS (Homebrew): brew install gdal
- Windows: Install via OSGeo4W or conda (conda install -c conda-forge gdal)
- Python (all platforms): pip install GDAL (requires system GDAL to be installed first)
Verify installation by running gdalinfo --version in your terminal.
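If you'd rather run this check from a script, it can be sketched in Python. This is a minimal sketch, assuming gdalinfo is expected on your PATH; the helper name gdal_version is made up for illustration and is not part of GDAL.

```python
import shutil
import subprocess


def gdal_version(executable="gdalinfo"):
    """Return the text printed by `<executable> --version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(gdal_version() or "gdalinfo not found on PATH")
```

A None result usually means the command-line tools are not installed or the shell's PATH does not include them.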
Essential GDAL Utilities
GDAL includes two families of tools: GDAL utilities (for rasters) and OGR utilities (for vectors).
1. gdalinfo — Inspect Raster Files
Before working with any raster, use gdalinfo to understand its properties:
gdalinfo myfile.tif
This outputs the coordinate reference system, corner coordinates, pixel size, band count, data type, and metadata. It is essential for diagnosing projection mismatches or understanding unfamiliar datasets.
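gdalinfo can also emit its report as JSON with the -json flag, which is easier to consume from a script than the text output. The sketch below pulls out a handful of fields. The helper names summarize_gdalinfo and inspect_raster are hypothetical, and the sample dictionary in the usage section is invented for illustration rather than taken from a real file.

```python
import json
import subprocess


def summarize_gdalinfo(info):
    """Pick a few commonly needed fields out of a parsed `gdalinfo -json` document."""
    width, height = info["size"]
    gt = info.get("geoTransform")  # [origin_x, pixel_w, rot, origin_y, rot, pixel_h]
    bands = info.get("bands", [])
    return {
        "driver": info.get("driverShortName"),
        "width": width,
        "height": height,
        "band_count": len(bands),
        "data_types": [band.get("type") for band in bands],
        # Pixel height is negative for north-up rasters, so report its magnitude.
        "pixel_size": (gt[1], abs(gt[5])) if gt else None,
    }


def inspect_raster(path):
    """Run `gdalinfo -json` on a file (requires GDAL to be installed) and summarize it."""
    out = subprocess.run(
        ["gdalinfo", "-json", path], capture_output=True, text=True, check=True
    ).stdout
    return summarize_gdalinfo(json.loads(out))


if __name__ == "__main__":
    # Invented sample resembling the structure of gdalinfo's JSON output.
    sample = {
        "driverShortName": "GTiff",
        "size": [512, 256],
        "geoTransform": [440720.0, 60.0, 0.0, 3751320.0, 0.0, -60.0],
        "bands": [{"band": 1, "type": "Byte"}],
    }
    print(summarize_gdalinfo(sample))
```

Keeping the parsing separate from the subprocess call means the summary logic can be exercised without GDAL installed.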
2. gdal_translate — Convert & Reformat Rasters
Convert between raster formats or subset a file:
gdal_translate -of GTiff input.img output.tif
Common output formats, by driver name: GTiff (GeoTIFF), PNG, netCDF, HFA (Erdas Imagine). You can also scale pixel values (-scale), extract specific bands (-b), or compress the output with a creation option such as -co COMPRESS=DEFLATE.
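When the same conversion has to run with different options, it can help to assemble the gdal_translate arguments in code. The following is a rough sketch, not an official API: translate_command is a hypothetical helper, and it only covers the -of, -b, -scale, and -co COMPRESS options.

```python
import shlex


def translate_command(src, dst, fmt="GTiff", bands=(), scale=None, compress=None):
    """Build an argument list for gdal_translate (hypothetical helper)."""
    cmd = ["gdal_translate", "-of", fmt]
    for band in bands:
        cmd += ["-b", str(band)]  # one -b flag per band to extract
    if scale is not None:
        src_min, src_max, dst_min, dst_max = scale
        cmd += ["-scale", str(src_min), str(src_max), str(dst_min), str(dst_max)]
    if compress is not None:
        cmd += ["-co", f"COMPRESS={compress}"]  # GeoTIFF creation option
    cmd += [src, dst]
    return cmd


if __name__ == "__main__":
    cmd = translate_command("input.img", "output.tif", bands=[1], compress="DEFLATE")
    print(shlex.join(cmd))
    # To execute it: subprocess.run(cmd, check=True)
```

Passing the list straight to subprocess.run avoids shell quoting problems with unusual file names.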
3. gdalwarp — Reproject & Clip Rasters
gdalwarp is one of the most powerful GDAL tools. Reproject a raster to a new CRS:
gdalwarp -t_srs EPSG:3857 input.tif reprojected.tif
Clip a raster to a bounding box (coordinates are given in the output CRS):
gdalwarp -te xmin ymin xmax ymax input.tif clipped.tif
Clip to a vector boundary (e.g., a country border):
gdalwarp -cutline boundary.gpkg -crop_to_cutline input.tif clipped.tif
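The xmin ymin xmax ymax values for -te often come from a set of coordinates rather than being typed by hand. Here is a small sketch of deriving them and assembling the gdalwarp call. The function names and the sample points are illustrative assumptions, and the points are assumed to already be in the output CRS.

```python
import shlex


def bounding_box(points):
    """Return (xmin, ymin, xmax, ymax) for an iterable of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def warp_clip_command(src, dst, points, t_srs=None):
    """Build a gdalwarp argument list that clips to the points' bounding box."""
    xmin, ymin, xmax, ymax = bounding_box(points)
    cmd = ["gdalwarp"]
    if t_srs:
        cmd += ["-t_srs", t_srs]  # optional reprojection in the same pass
    cmd += ["-te", str(xmin), str(ymin), str(xmax), str(ymax), src, dst]
    return cmd


if __name__ == "__main__":
    # Made-up study-area corners in longitude/latitude.
    corners = [(10.0, 45.0), (12.5, 47.0), (11.0, 44.5)]
    print(shlex.join(warp_clip_command("input.tif", "clipped.tif", corners)))
```

If the points are in a different CRS than the output, gdalwarp's -te_srs option can declare the CRS of the extent.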
4. gdal_merge.py — Mosaic Multiple Rasters
Combine multiple tiles into a single raster mosaic:
gdal_merge.py -o merged.tif tile1.tif tile2.tif tile3.tif
This is invaluable when working with tiled satellite imagery datasets like Landsat or Sentinel scenes.
5. ogrinfo — Inspect Vector Files
The OGR equivalent of gdalinfo for vector data:
ogrinfo -al -so myfile.gpkg
Reports layer names, geometry type, feature count, bounding box, and field schema.
6. ogr2ogr — Convert & Transform Vector Data
ogr2ogr is arguably the most versatile geospatial command-line tool that exists. Note that the output file comes before the input file. Convert a Shapefile to GeoJSON:
ogr2ogr -f GeoJSON output.geojson input.shp
Reproject to a different CRS:
ogr2ogr -t_srs EPSG:4326 -f GeoJSON output.geojson input.gpkg
Filter features with an SQL WHERE-style attribute filter:
ogr2ogr -where "population > 100000" -f GeoJSON cities_large.geojson cities.gpkg
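The reprojection and filtering options can be combined in one ogr2ogr call. The sketch below builds such a call as an argument list; ogr2ogr_command is a hypothetical helper name, and only the -f, -t_srs, and -where options shown above are covered.

```python
import shlex


def ogr2ogr_command(src, dst, fmt="GeoJSON", t_srs=None, where=None):
    """Build an ogr2ogr argument list; note the destination precedes the source."""
    cmd = ["ogr2ogr", "-f", fmt]
    if t_srs:
        cmd += ["-t_srs", t_srs]
    if where:
        cmd += ["-where", where]  # kept as one list item, so no manual quoting is needed
    cmd += [dst, src]
    return cmd


if __name__ == "__main__":
    cmd = ogr2ogr_command(
        "cities.gpkg",
        "cities_large.geojson",
        t_srs="EPSG:4326",
        where="population > 100000",
    )
    print(shlex.join(cmd))
    # To execute it: subprocess.run(cmd, check=True)
```

Because the filter travels as a single list element, the spaces and the > character in it never pass through a shell.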
Building Automated Workflows
The real power of GDAL emerges when you combine these utilities in shell scripts or Python pipelines. A typical satellite imagery processing workflow might:
- Download raw .TIFF tiles
- Reproject all tiles to a common CRS with gdalwarp
- Merge them into a single mosaic with gdal_merge.py
- Clip to a study area boundary
- Calculate a vegetation index using gdal_calc.py
- Export as a compressed Cloud-Optimized GeoTIFF (COG)
Each processing step maps to a single command, which makes the entire pipeline scriptable, repeatable, and version-controllable.
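One way to sketch such a pipeline in Python is to build every command up front and run them in order. Everything specific here is an assumption for illustration: the intermediate file names, the band numbers used for the vegetation index (they depend on the sensor), the NDVI expression, and the DEFLATE compression choice. The download step is left out, and the COG output driver requires GDAL 3.1 or newer.

```python
import shlex
import subprocess


def build_pipeline(tiles, boundary, crs="EPSG:3857", red_band=3, nir_band=4):
    """Return the list of commands for reproject -> merge -> clip -> index -> COG."""
    warped = [f"warped_{i}.tif" for i, _ in enumerate(tiles)]
    steps = [["gdalwarp", "-t_srs", crs, tile, out] for tile, out in zip(tiles, warped)]
    steps.append(["gdal_merge.py", "-o", "mosaic.tif", *warped])
    steps.append(
        ["gdalwarp", "-cutline", boundary, "-crop_to_cutline", "mosaic.tif", "clipped.tif"]
    )
    steps.append([
        "gdal_calc.py",
        "-A", "clipped.tif", f"--A_band={nir_band}",
        "-B", "clipped.tif", f"--B_band={red_band}",
        "--outfile=ndvi.tif",
        "--calc=(A.astype(float)-B)/(A.astype(float)+B)",  # NDVI = (NIR - red) / (NIR + red)
        "--type=Float32",
    ])
    steps.append(
        ["gdal_translate", "-of", "COG", "-co", "COMPRESS=DEFLATE", "ndvi.tif", "ndvi_cog.tif"]
    )
    return steps


def run_pipeline(steps, dry_run=True):
    """Print each command; execute it as well unless dry_run is set."""
    for cmd in steps:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


if __name__ == "__main__":
    run_pipeline(build_pipeline(["tile1.tif", "tile2.tif"], "boundary.gpkg"))
```

The dry-run default prints the commands without touching any files, which makes it easy to review the pipeline before running it for real. On some installations gdal_merge.py and gdal_calc.py may need to be invoked through the Python interpreter instead of directly.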
Conclusion
GDAL is an indispensable tool for anyone working seriously with geospatial data. Its breadth of format support, reprojection capabilities, and scriptability make it the Swiss Army knife of the GIS world. Invest time learning the half-dozen commands covered here and you'll dramatically accelerate your spatial data workflows.