Hi Ari,
Looking at the code, I see the driver does read all row groups, whereas
it could potentially be improved to use row-group-level statistics to
skip all of them but the matching one. That said, you can probably work
around the issue by using SetAttributeFilter("fid = <the-fid>") instead,
or by querying the ID column directly if that's your ultimate objective.
More generally, Parquet shines at bulk loading and requesting
significant amounts of data rather than at extracting a single feature,
where you'll get better performance from a regular database with proper
indices built.
Even
On 22/01/2026 at 12:49, Ari Jolma wrote:
Thanks for the replies. I'm progressing, but now I've hit something I
don't understand.
I have a large GPKG file which I converted into a Parquet file. If I
now do a simple layer.GetFeature(fid) with a random fid on the layer,
the feature is retrieved from the GPKG really fast (also when the file
is on S3), but from Parquet it is slow (~20 s) even on a local
filesystem. On both files layer.GetFIDColumn() reports "fid". There is
a native "ID" column in the GPKG, but fid <> ID.
I used ogr2ogr to create the Parquet file. I had -lco COMPRESSION=None.
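For reference, the conversion step described above looks roughly like this (paths and the row-group size are hypothetical; ROW_GROUP_SIZE is the Parquet driver's layer creation option, and smaller row groups make row-group statistics finer-grained at some storage overhead):

```shell
# Hypothetical paths; adjust layer creation options to taste.
ogr2ogr -f Parquet out.parquet in.gpkg \
    -lco COMPRESSION=NONE \
    -lco ROW_GROUP_SIZE=65536
```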
Ari
Michael Smith wrote on 18.1.2026 at 18.09:
I combine attribute and spatial filters a lot on large Parquet files,
using SetSpatialFilter() and SetAttributeFilter() together before
querying. I've only had some issues with partition elimination, which
have now been fixed. Sometimes an ADBC connection can be faster to
query, but opening the file with gdal.OpenEx() is slower, and ADBC
takes more memory. I find the GDAL query method generally better.
Having access to the SQL functions of DuckDB is the only reason I ever
use ADBC.
Mike
--
http://www.spatialys.com
My software is free, but my time generally not.
_______________________________________________
gdal-dev mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/gdal-dev