Hello, I am adding support for Raster Attribute Tables for one of our products, and there are some things I wonder about.
I have noticed that a TIFF file, Filename.tif, may have Raster Attribute Tables stored in a sidecar file of at least three types:

  1. Filename.tif.vat.dbf
  2. Filename.tif.aux.xml
  3. Filename.aux

Are there other conventions for storing Raster Attribute Tables for TIFF files that I ought to know about?

I understand that .vat.dbf is an Esri convention using the old dBase format, not (yet) supported by GDAL Library for Raster Attribute Tables. Fair enough.

It would seem that the .aux.xml format was designed by GDAL developers, although I haven't found any documentation on the history of the format. Esri documentation says: "ArcGIS 9.2 introduced the AUX.XML file for certain file formats", which can be understood as saying the format was invented by Esri, but I suspect they really mean something like "ArcGIS 9.2 adopted the AUX.XML format that was introduced by GDAL Library...". Any comments on that?

The files named Filename.aux are a mystery to me. Esri documentation says: "The information stored in an AUX file is only accessible using a product from Esri, ERDAS, or a third-party product derived from the RDO/ERaster library." But I have found examples of datasets for which GDAL Library can retrieve a Raster Attribute Table from a binary Filename.aux. So, maybe some developer of GDAL Library has reverse-engineered the Esri Filename.aux format, but I haven't found any documentation of that. Or I guess maybe there are two distinct file formats that use the ".aux" file extension - the name "aux" isn't very creative and maybe Esri never bothered to trademark it. Does anyone know?

Finally, I have found several datasets of land cover with Raster Attribute Tables, both Australian (https://data.gov.au/dataset/1556b944-731c-4b7f-a03e-14577c7e68db/resource/1f8174f8-573e-43f2-b110-3d1a13c380e8) and American (https://datagateway.nrcs.usda.gov/GDGHome_DirectDownLoad.aspx, see "National Land Cover Dataset by State"), where there are .aux.xml files that don't seem to be GDAL-compliant.
Maybe that's just the way things are, but I haven't seen these issues discussed, so I'll mention them here (I am not sure if they merit filing a GDAL ticket). I have been using GDAL version 2.2.3, by the way.

A) GDAL Library always assumes that fields tagged as RED, GREEN and BLUE contain integers.

A common problem, for example in the files DLCDv1_Class1_CEN.tif.aux.xml and DLCDv1_Class_alb94.tif.aux.xml, in the Australian dataset, is that there are field definitions

    <FieldDefn index="3">
      <Name>RED</Name>
      <Type>1</Type>
      <Usage>6</Usage>
    </FieldDefn>
    <FieldDefn index="4">
      <Name>GREEN</Name>
      <Type>1</Type>
      <Usage>7</Usage>
    </FieldDefn>
    <FieldDefn index="5">
      <Name>BLUE</Name>
      <Type>1</Type>
      <Usage>8</Usage>
    </FieldDefn>

The Type = 1 indicates that the fields are Float values, which is correct - the values are Float numbers in the range 0.0 to 1.0 - and the Usage = 6, 7 and 8 indicates that these fields are special Red, Green and Blue fields, which is also correct, semantically. (The integers giving Type and Usage seem to be zero-based indexes into the enumeration types GDALRATFieldType and GDALRATFieldUsage, see https://www.gdal.org/gdal_8h.html#a810154ac91149d1a63c42717258fe16e )

Unfortunately, GDAL Library assumes or demands that a field with Usage tagged as Red/Green/Blue always contains integers. When I read the files via GDAL Library, what happens is that the Usage = 6, 7 and 8 forces the columns to be erroneously interpreted as integers (in the expected range 0 to 255), overriding the correct Type = 1 meaning Float. All values are then retrieved as 0 or 1, it seems.
This can be seen from the code lines

    // color columns should be int 0..255
    if( ( eFieldUsage == GFU_Red ) || ( eFieldUsage == GFU_Green ) ||
        ( eFieldUsage == GFU_Blue ) || ( eFieldUsage == GFU_Alpha ) )
    {
        eFieldType = GFT_Integer;
    }

in the method GDALDefaultRasterAttributeTable::CreateColumn() in https://github.com/OSGeo/gdal/blob/master/gdal/gcore/gdal_rat.cpp

Although the behaviour agrees with the documentation of GDALRATFieldUsage, which says that the special color fields must contain integers, I feel that GDAL Library is too rigid here. The Type, available in the Field Definition, tells us whether a color field contains integers or floats, so why the restriction? I guess I could make GDAL read these field contents as floats if I changed the Usage to 0, meaning a general-purpose field, but that would remove the information that these fields really are the special color fields.

B) Wrong Usage for the VALUE column.

In this case, I feel certain that the data producers are doing things wrong, but there are so many of them... The special Value field is declared as

    <FieldDefn index="1">
      <Name>VALUE</Name>
      <Type>0</Type>
      <Usage>0</Usage>
    </FieldDefn>

Here, the Usage = 0 indicates just a general-purpose field. But they should have declared

    <Usage>5</Usage>

which indicates the usage GFU_MinMax, saying this column contains the "class value". Without that special usage indicated, GDAL Library has no way of knowing that VALUE is the special Value column, so it uses the row number as the value instead. Unfortunately, the row numbering starts from 0, whereas the contents of the VALUE field are often numbers starting from 1, and sometimes the VALUE contents are not consecutive integers anyway.

I am not sure if it makes sense to make GDAL Library more tolerant of this kind of error.
I guess it could use a rule of thumb, saying that if nothing else in the .aux.xml tells what the special Value field is, and if the first or second field is of type integer and is named "VALUE" or "Value" or "value", then that is the special Value field. But maybe such guesswork is just ugly.

I'd appreciate any comments and advice!

Regards,
Mikael Rittri
Carmenta Geospatial Technologies
www.carmenta.com
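
P.S. In case it helps the discussion, here is a minimal Python sketch of points A and B. It uses only the standard library, not GDAL itself. The sample fragment is assembled from the field definitions quoted above; wrapping them in a <GDALRasterAttributeTable> element is my assumption about the surrounding file layout, and the function names (read_field_defns, float_color_fields, guess_value_field) are made up for illustration. The rule of thumb is only the guess described above, not something GDAL does.

```python
import xml.etree.ElementTree as ET

# Zero-based positions in GDALRATFieldType / GDALRATFieldUsage,
# as used by the <Type> and <Usage> elements discussed above.
GFT_INTEGER, GFT_REAL = 0, 1
GFU_MIN, GFU_MAX, GFU_MINMAX = 3, 4, 5
GFU_RED, GFU_GREEN, GFU_BLUE, GFU_ALPHA = 6, 7, 8, 9

# Illustrative fragment built from the field definitions quoted in this
# message. The wrapper element is an assumption about the file layout.
SAMPLE = """
<GDALRasterAttributeTable>
  <FieldDefn index="1"><Name>VALUE</Name><Type>0</Type><Usage>0</Usage></FieldDefn>
  <FieldDefn index="3"><Name>RED</Name><Type>1</Type><Usage>6</Usage></FieldDefn>
  <FieldDefn index="4"><Name>GREEN</Name><Type>1</Type><Usage>7</Usage></FieldDefn>
  <FieldDefn index="5"><Name>BLUE</Name><Type>1</Type><Usage>8</Usage></FieldDefn>
</GDALRasterAttributeTable>
"""


def read_field_defns(xml_text):
    """Return the FieldDefn entries as dicts, in document order."""
    root = ET.fromstring(xml_text)
    return [
        {
            "index": int(defn.get("index")),
            "name": defn.findtext("Name"),
            "type": int(defn.findtext("Type")),
            "usage": int(defn.findtext("Usage")),
        }
        for defn in root.iter("FieldDefn")
    ]


def float_color_fields(fields):
    """Issue A: names of color fields that are declared as Float."""
    color_usages = {GFU_RED, GFU_GREEN, GFU_BLUE, GFU_ALPHA}
    return [
        f["name"]
        for f in fields
        if f["usage"] in color_usages and f["type"] == GFT_REAL
    ]


def guess_value_field(fields):
    """Issue B: index of the likely Value field, or None.

    Returns None when the file already declares a Min/Max/MinMax field
    (no guessing needed) or when no field matches the rule of thumb.
    """
    if any(f["usage"] in (GFU_MIN, GFU_MAX, GFU_MINMAX) for f in fields):
        return None
    for f in fields:
        is_first_or_second = f["index"] <= 1
        is_named_value = f["name"] in ("VALUE", "Value", "value")
        if is_first_or_second and f["type"] == GFT_INTEGER and is_named_value:
            return f["index"]
    return None


if __name__ == "__main__":
    fields = read_field_defns(SAMPLE)
    print("Float color fields:", float_color_fields(fields))
    print("Guessed Value field index:", guess_value_field(fields))
```

A check like float_color_fields() could be run over a directory of .aux.xml files to see how widespread the Float color columns are before deciding whether a GDAL ticket is worthwhile.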
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev