Hi devs,
in a recent QGIS issue report at https://github.com/qgis/QGIS/issues/53058 , a user complains about an ESRI Shapefile layer that was corrupted after an attribute value was changed and the edit was saved. QGIS opens the corrupted layer without reporting any error or warning, but it shows only a subset of the original feature geometries: many records now have a null geometry, so they cannot be displayed.

After some investigation, although I don't know why or how the layer was corrupted, it seems to me that the issue is mostly due to corruption of the .shx index file: for various records it contains incorrect values for the offset and length of the record. As a result, such a record and the following ones are read incorrectly, until the offsets in the .shx index file and the data in the .shp file line up again.
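
To make this kind of check concrete, here is a minimal Python sketch of how the index entries could be compared with the record headers in the .shp file. It is not taken from QGIS or GDAL; the function names are only for illustration, and it assumes the standard layout (100-byte header, then 8-byte index entries holding a big-endian offset and content length, both in 16-bit words).

```python
import struct

HEADER_SIZE = 100  # bytes, same for the .shp and the .shx file


def read_shx_entries(shx: bytes):
    """Return (offset_bytes, content_length_bytes) for every index entry."""
    entries = []
    for pos in range(HEADER_SIZE, len(shx) - 7, 8):
        # Both fields are big-endian int32 values expressed in 16-bit words.
        offset_words, length_words = struct.unpack(">ii", shx[pos:pos + 8])
        entries.append((offset_words * 2, length_words * 2))
    return entries


def find_mismatches(shp: bytes, shx: bytes):
    """List 1-based record numbers whose index entry disagrees with the .shp."""
    bad = []
    for number, (offset, length) in enumerate(read_shx_entries(shx), start=1):
        if offset < HEADER_SIZE or offset + 8 > len(shp):
            bad.append(number)  # offset points outside the record area
            continue
        # Record header in the .shp: record number and content length.
        rec_number, rec_length_words = struct.unpack(">ii", shp[offset:offset + 8])
        if rec_number != number or rec_length_words * 2 != length:
            bad.append(number)
    return bad
```

Calling find_mismatches() with the raw bytes of the two files would list the records whose index entry does not point at a matching record header.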

Running the QGIS "Repair Shapefile" processing algorithm on this layer fails: the .shx index file is actually rewritten, but the layer becomes completely invalid and can no longer be loaded in QGIS. The same happens when using ogrinfo directly after deleting the .shx index file and setting the SHAPE_RESTORE_SHX configuration option to YES: the index file is recreated, but the layer becomes unreadable by both QGIS and ogrinfo.

Inspecting the .shx index file created by ogrinfo with SHAPE_RESTORE_SHX=YES (which is identical to the one created by the QGIS "Repair Shapefile" tool), it seems to me that OGR fails to create it properly: in the index file header it stores the total length, in 16-bit words, of the .shp file instead of the total length, in 16-bit words, of the index file itself.
In this particular case, it stores the incorrect value 00 29 2A C2 = 2697922 16-bit words = 5395844 bytes instead of the correct value 00 02 1D 26 = 138534 16-bit words = 277068 bytes.
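
For reference, the arithmetic and the manual fix can be sketched in a few lines of Python. This is only an illustration under the assumption that the file length is stored at bytes 24-27 of the header as a big-endian 32-bit integer counting 16-bit words; the helper names are mine and do not come from GDAL.

```python
import struct

HEADER_SIZE = 100
LENGTH_FIELD = slice(24, 28)  # big-endian int32, file length in 16-bit words


def header_length_words(data: bytes) -> int:
    """Read the file length (in 16-bit words) stored in a .shp/.shx header."""
    return struct.unpack(">i", data[LENGTH_FIELD])[0]


def expected_shx_length_words(record_count: int) -> int:
    """Length an index file should declare: header plus 8 bytes per record."""
    return (HEADER_SIZE + 8 * record_count) // 2


def patch_shx_length(shx: bytes) -> bytes:
    """Return a copy of the index whose header declares its own real size."""
    fixed = bytearray(shx)
    fixed[LENGTH_FIELD] = struct.pack(">i", len(shx) // 2)
    return bytes(fixed)
```

With these helpers, a 277068-byte index file would be patched to declare 138534 words, which is what 34621 records of 8 bytes plus the 100-byte header add up to.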

Changing this incorrect value to the correct one in the repaired .shx index file makes the layer valid again, and the previously missing feature geometries are displayed again (with only some glitches and a missing record).

This behaviour seems weird to me, as I remember that the Repair Shapefile tool and the SHAPE_RESTORE_SHX=YES setting worked well in the past to repair Shapefiles with a corrupted index.

Maybe some problem in this particular Shapefile prevents OGR from correctly repairing the index? For comparison, the old "Shape Checker utility" succeeds in repairing the .shx index file: it creates the same file as the one created by OGR, apart from the total file length value, which is correct.
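
As a rough idea of what such a repair involves, the following Python sketch rebuilds an index by walking the record headers of the .shp file. It is a simplified illustration of the general technique, not the actual code of OGR or of the Shape Checker utility, and rebuild_shx is a hypothetical name.

```python
import struct

HEADER_SIZE = 100


def rebuild_shx(shp: bytes) -> bytes:
    """Build index bytes by scanning the record headers of a .shp file."""
    # The index header is a copy of the .shp header, except for the length.
    index = bytearray(shp[:HEADER_SIZE])
    pos = HEADER_SIZE
    while pos + 8 <= len(shp):
        _number, length_words = struct.unpack(">ii", shp[pos:pos + 8])
        if length_words < 0:
            break  # corrupted record header, stop scanning
        # Index entry: record offset and content length, in 16-bit words.
        index += struct.pack(">ii", pos // 2, length_words)
        pos += 8 + length_words * 2
    # The header must declare the length of the index itself, not of the .shp.
    index[24:28] = struct.pack(">i", len(index) // 2)
    return bytes(index)
```

The last assignment is the step that matters here: the length written into the header is computed from the index being built, not copied from the .shp header.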

Any clue as to what may have gone wrong during the layer editing in QGIS that eventually corrupted the layer?


Best regards.

Andrea Giudiceandrea