[gdal-dev] Questions about Raster Attribute Tables

Mikael Rittri Thu, 04 Oct 2018 03:48:01 -0700

Hello,

I am adding support for Raster Attribute Tables for one of our products, and 
there are some things I wonder about.


I have noticed that a TIFF file, Filename.tif, may have Raster Attribute Tables 
stored in a sidecar file of at least three types:


1. Filename.tif.vat.dbf
2. Filename.tif.aux.xml
3. Filename.aux

Are there other conventions for storing Raster Attribute Tables for TIFF files 
that I ought to know about?
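
As a minimal sketch of the three naming patterns listed above (the function names are hypothetical and not part of GDAL), the candidate sidecar paths for a given TIFF could be built like this in Python:

```python
from pathlib import Path


def rat_sidecar_candidates(raster_path):
    """Return the three sidecar paths named in the list above, in that order."""
    raster = Path(raster_path)
    return [
        raster.with_name(raster.name + ".vat.dbf"),  # Filename.tif.vat.dbf
        raster.with_name(raster.name + ".aux.xml"),  # Filename.tif.aux.xml
        raster.with_suffix(".aux"),                  # Filename.aux
    ]


def existing_rat_sidecars(raster_path):
    """Keep only the candidates that actually exist on disk."""
    return [p for p in rat_sidecar_candidates(raster_path) if p.exists()]
```

Note that only the third pattern replaces the .tif extension; the first two append to the full file name.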

I understand that .vat.dbf is an Esri convention using the old dBase format, 
not (yet) supported by GDAL Library for Raster Attribute Tables. Fair enough.

It would seem that the .aux.xml format was designed by GDAL developers, 
although I haven't found any documentation on the history of the format. Esri 
documentation says: "ArcGIS 9.2 introduced the AUX.XML file for certain file 
formats", which can be understood as saying the format was invented by Esri, 
but I suspect they really mean something like "ArcGIS 9.2 adopted the AUX.XML 
format that was introduced by GDAL Library...".  Any comments on that?

The files named Filename.aux are a mystery to me. Esri documentation says: "The 
information stored in an AUX file is only accessible using a product from Esri, 
ERDAS, or a third-party product derived from the RDO/ERaster library." But I 
have found examples of datasets for which GDAL Library can retrieve a Raster 
Attribute Table from a binary Filename.aux. So, maybe some developer of GDAL 
Library has reverse-engineered the Esri Filename.aux format, but I haven't 
found any documentation of that. Or I guess maybe there are two distinct file 
formats that use the ".aux" file extension - the name "aux" isn't very creative 
and maybe Esri never bothered to trademark it.  Does anyone know?

Finally, I have found several land cover datasets with Raster Attribute 
Tables, both Australian 
(https://data.gov.au/dataset/1556b944-731c-4b7f-a03e-14577c7e68db/resource/1f8174f8-573e-43f2-b110-3d1a13c380e8)
and American (https://datagateway.nrcs.usda.gov/GDGHome_DirectDownLoad.aspx, 
see "National Land Cover Dataset by State"), whose .aux.xml files 
don't seem to be GDAL-compliant. Maybe that's just the way things are, but 
I haven't seen these issues discussed, so I'll mention them here (I am not sure 
whether they merit filing a GDAL ticket). I have been using GDAL version 2.2.3, 
by the way.

A) GDAL Library always assumes that fields tagged as RED, GREEN and BLUE 
contain integers.

A common problem, for example in the files

DLCDv1_Class1_CEN.tif.aux.xml
and
DLCDv1_Class_alb94.tif.aux.xml,

in the Australian dataset, is that there are field definitions

      <FieldDefn index="3">
        <Name>RED</Name>
        <Type>1</Type>
        <Usage>6</Usage>
      </FieldDefn>
      <FieldDefn index="4">
        <Name>GREEN</Name>
        <Type>1</Type>
        <Usage>7</Usage>
      </FieldDefn>
      <FieldDefn index="5">
        <Name>BLUE</Name>
        <Type>1</Type>
        <Usage>8</Usage>
      </FieldDefn>

The Type = 1 indicates that the fields hold floating-point values (GFT_Real), 
which is correct: the values are numbers in the range 0.0 to 1.0. The 
Usage = 6, 7 and 8 marks these fields as the special Red, Green and Blue fields 
(GFU_Red, GFU_Green, GFU_Blue), which is also correct, semantically. (The 
integers giving Type and Usage seem to be zero-based indexes into the 
enumeration types GDALRATFieldType and GDALRATFieldUsage, see 
https://www.gdal.org/gdal_8h.html#a810154ac91149d1a63c42717258fe16e )
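
To make the integer codes easier to read, here is a small Python sketch that decodes a FieldDefn element into enumeration names. It assumes that the codes are zero-based positions in the declaration order of GDALRATFieldType and GDALRATFieldUsage; the usage list is cut off after GFU_Alpha, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Names in declaration order, so the list index equals the integer in the XML.
FIELD_TYPES = ["GFT_Integer", "GFT_Real", "GFT_String"]
FIELD_USAGES = [
    "GFU_Generic", "GFU_PixelCount", "GFU_Name", "GFU_Min", "GFU_Max",
    "GFU_MinMax", "GFU_Red", "GFU_Green", "GFU_Blue", "GFU_Alpha",
]  # deliberately truncated: later usages are reported as unknown(N)


def enum_name(names, code):
    """Translate an integer code into its name, or flag it as unknown."""
    return names[code] if 0 <= code < len(names) else f"unknown({code})"


def describe_field_defn(xml_text):
    """Return (name, type name, usage name) for one <FieldDefn> element."""
    elem = ET.fromstring(xml_text)
    return (
        elem.findtext("Name"),
        enum_name(FIELD_TYPES, int(elem.findtext("Type"))),
        enum_name(FIELD_USAGES, int(elem.findtext("Usage"))),
    )


RED_DEFN = """<FieldDefn index="3">
  <Name>RED</Name>
  <Type>1</Type>
  <Usage>6</Usage>
</FieldDefn>"""

print(describe_field_defn(RED_DEFN))  # ('RED', 'GFT_Real', 'GFU_Red')
```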

Unfortunately, GDAL Library assumes or demands that a field with Usage tagged 
as Red/Green/Blue always contains integers. When I read the files via GDAL 
Library, what happens is that the Usage = 6, 7 and 8 forces the columns to be 
erroneously interpreted as integers (in the expected range 0 to 255), 
overriding the correct Type = 1 meaning Float. All values are then retrieved as 
0 or 1, it seems. This can be seen from the code lines

    // color columns should be int 0..255
    if( ( eFieldUsage == GFU_Red ) || ( eFieldUsage == GFU_Green ) ||
        ( eFieldUsage == GFU_Blue ) || ( eFieldUsage == GFU_Alpha ) )
    {
        eFieldType = GFT_Integer;
    }

in the method GDALDefaultRasterAttributeTable::CreateColumn() in 
https://github.com/OSGeo/gdal/blob/master/gdal/gcore/gdal_rat.cpp

Although the behaviour agrees with the documentation of GDALRATFieldUsage that 
says that the special color fields must contain integers, I feel that GDAL 
Library is too rigid here. The Type, available in the Field Definition, tells 
us whether a color field contains integers or floats, so why the restriction? I 
guess I could make GDAL read these field contents as floats if I changed the 
Usage to 0, meaning a general-purpose field, but that would remove the 
information that these fields really are the special color fields.
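
One possible workaround, sketched here in Python with only the standard library, is to read the color columns straight from the .aux.xml and rescale them to 0..255, leaving the file and its Usage tags untouched. This is an assumption-laden sketch: the GDALRasterAttributeTable element name and the Row/F layout are not shown in this message and are assumed here, and read_float_colors is a hypothetical helper, not a GDAL function.

```python
import xml.etree.ElementTree as ET

# Usage codes for the color fields, as quoted in the field definitions above.
COLOR_USAGES = {6: "red", 7: "green", 8: "blue"}


def read_float_colors(aux_xml_text):
    """Map each row index to an (r, g, b) tuple scaled from 0.0..1.0 to 0..255."""
    root = ET.fromstring(aux_xml_text)
    # Assumed element name; iter() also matches the root element itself.
    rat = next(root.iter("GDALRasterAttributeTable"))

    # Find the column position of each channel by Usage, not by Name.
    columns = {}
    for defn in rat.findall("FieldDefn"):
        usage = int(defn.findtext("Usage"))
        if usage in COLOR_USAGES:
            columns[COLOR_USAGES[usage]] = int(defn.get("index"))

    colors = {}
    for row in rat.findall("Row"):  # assumed layout: <Row index="N"><F>..</F></Row>
        cells = [f.text for f in row.findall("F")]
        colors[int(row.get("index"))] = tuple(
            round(float(cells[columns[channel]]) * 255)
            for channel in ("red", "green", "blue")
        )
    return colors


SAMPLE = """<GDALRasterAttributeTable>
  <FieldDefn index="0"><Name>RED</Name><Type>1</Type><Usage>6</Usage></FieldDefn>
  <FieldDefn index="1"><Name>GREEN</Name><Type>1</Type><Usage>7</Usage></FieldDefn>
  <FieldDefn index="2"><Name>BLUE</Name><Type>1</Type><Usage>8</Usage></FieldDefn>
  <Row index="0"><F>0.0</F><F>0.2</F><F>1.0</F></Row>
</GDALRasterAttributeTable>"""

print(read_float_colors(SAMPLE))  # {0: (0, 51, 255)}
```

The sample table is made up for illustration and is not taken from the datasets mentioned above.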

B) Wrong Usage for the VALUE column.

In this case, I feel certain that the data producers are doing things wrong, 
but so many of them do it that it cannot simply be ignored.
The special Value field is declared as

      <FieldDefn index="1">
        <Name>VALUE</Name>
        <Type>0</Type>
        <Usage>0</Usage>
      </FieldDefn>

Here, the Usage = 0 indicates just a general purpose field. But they should 
have declared

        <Usage>5</Usage>

which indicates the usage GFU_MinMax, saying this column contains the "class 
value".  Without that special usage indicated, GDAL Library has no way of 
knowing that VALUE is the special Value column, so it uses the row number as 
the value instead. Unfortunately, the row numbering starts from 0, whereas the 
contents of the VALUE field are often numbers starting from 1, and sometimes 
the VALUE contents are not consecutive integers anyway.

I am not sure whether it makes sense to make GDAL Library more tolerant of this 
kind of error. I guess it could use a rule of thumb: if nothing else in the 
.aux.xml identifies the special Value field, and the first or second field is 
of type integer and is named "VALUE" or "Value" or "value", then treat that 
field as the special Value field. But maybe such guesswork is just ugly.
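
The rule of thumb could look roughly like the following Python sketch. It is only an illustration of the heuristic described in this message, not existing GDAL behaviour; find_value_column is a hypothetical name, and "nothing else identifies the Value field" is simplified to "no field has Usage 5 (GFU_MinMax)".

```python
GFT_INTEGER = 0   # Type code for integer fields
GFU_MINMAX = 5    # Usage code for the class value column
VALUE_NAMES = {"VALUE", "Value", "value"}


def find_value_column(fields):
    """Return the index of the Value column, or None if it cannot be determined.

    fields is a list of (name, type_code, usage_code) tuples in column order.
    """
    # An explicit GFU_MinMax usage always wins.
    for index, (_name, _type, usage) in enumerate(fields):
        if usage == GFU_MINMAX:
            return index
    # Fallback guess: an integer field named VALUE in the first or second column.
    for index, (name, field_type, _usage) in enumerate(fields[:2]):
        if field_type == GFT_INTEGER and name in VALUE_NAMES:
            return index
    return None


# Layout resembling the problem files: VALUE is second and tagged as generic.
print(find_value_column([("OID", 0, 0), ("VALUE", 0, 0), ("COUNT", 0, 1)]))  # 1
```

Returning None when the guess fails lets the caller fall back to the row number, which is what happens today.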

I'd appreciate any comments and advice!
Regards,

Mikael Rittri
Carmenta Geospatial Technologies
http://www.carmenta.com/
