Thank you both, that's exactly what I was looking for. On Thu, Sep 25, 2025 at 3:03 AM Rahkonen Jukka < [email protected]> wrote:
> Hi, > > R*Tree is standard SQLite module, so maybe it is best to look at the > SQLite documentation https://www.sqlite.org/rtree.html. > Finding Spatialite documents from the website is always difficult (but it > is Open Source and anybody can offer their help). > Spatialite documents contain some background information and examples and > they are worth reading. You could start from a historic > https://www.gaia-gis.it/gaia-sins/spatialite-tutorial-2.3.1.html#t8. > The current documentation about using spatial index with Spatialite is > https://www.gaia-gis.it/fossil/libspatialite/wiki?name=SpatialIndex, but > it describes the usage of the alternative way to access R*Tree tables with > VirtualSpatialIndex > https://www.gaia-gis.it/gaia-sins/SpatialIndex-Update.pdf that works only > with Spatialite databases. With GeoPackage R*Tree tables must be queried > directly. > > -Jukka Rahkonen- > > > ________________________________________ > Lähettäjä: gdal-dev <[email protected]> käyttäjän Even > Rouault via gdal-dev <[email protected]> puolesta > Lähetetty: Torstai 25. syyskuuta 2025 3.01 > Vastaanottaja: Tim Harris <[email protected]>; gdal dev < > [email protected]> > Aihe: Re: [gdal-dev] Large GPKG files in S3 > > > Hi Tim, > > yes you're running into a well known s/limitation/feature/ of spatial > SQLite (I'm pretty sure that the Spatialite website has one page about > that, but can't find it right now). The use of the ST_ functions doesn't > trigger the RTree automatically, and you need to query it explicitly by > joining with the rtree_{table_name}_{geom_column_name} virtual table. > > For example the following completes quite quickly: > > $ ogrinfo -ro -al -q > /vsicurl/ > https://storage.googleapis.com/open-geodata/linz-examples/nz-building-outlines.gpkg > -sql > <https://storage.googleapis.com/open-geodata/linz-examples/nz-building-outlines.gpkg-sql> > "SELECT * FROM nz_building_outlines p JOIN > rtree_nz_building_outlines_geom rt ON p.rowid = rt.id WHERE rt.minx < > 1771150 and rt.maxx > 1771140 and rt.miny < 5894877 and rt.maxy > > 5894870 and ST_Intersects(p.geom, ST_GeomFromText('POINT(1771141 > 5894876)')) = 1" > > Layer name: SELECT > OGRFeature(SELECT):11471 > building_id (Integer) = 4199821 > name (String) = > use (String) = Unknown > suburb_locality (String) = Karaka > town_city (String) = > territorial_authority (String) = Auckland > capture_method (String) = Feature Extraction > capture_source_group (String) = NZ Aerial Imagery > capture_source_id (Integer) = 1014 > capture_source_name (String) = Auckland 0.075m Urban Aerial Photos > (2017) > capture_source_from (Date) = 2017/03/15 > capture_source_to (Date) = 2017/05/06 > last_modified (Date) = 2019/04/05 > id (Integer64) = 11471 > minx (Real) = 1771139.875 > maxx (Real) = 1771152 > miny (Real) = 5894865.5 > maxy (Real) = 5894883 > MULTIPOLYGON (((1771139.91348946 5894880.21926274,1771146.68318411 > 5894882.80032223,1771151.97657734 5894868.91662014,1771145.02123927 > 5894866.26478195,1771143.49962591 5894870.255725,1771142.39909482 > 5894869.83612922,1771140.01981817 5894876.07658206,1771141.30599197 > 5894876.56695731,1771139.91348946 5894880.21926274))) > > > Even > > Le 25/09/2025 à 01:38, Tim Harris via gdal-dev a écrit : > > Hi, trying to do some queries of fairly large GPKG files sitting in S3 > > in the hopes that spatial indexes and S3 range requests make it fast > > and efficient. But it seems that's not the case and I'm wondering if > > I'm doing something wrong. > > > > The examples below are just looking for a geometry at a point. The > > filename is fake but the geoms.gpkg contains a layer "geoms" with a > > bunch of polygons. It's about 1.7 GB in size. > > > > Something like this runs relatively fast, it makes about 17 HTTP > > requests and outputs the feature: > > > > --- > > ogrinfo -ro -json -features /vsis3/bucket/geoms.gpkg geoms -spat -105 > > 40 -105 40 > > --- > > > > But this, using -sql instead of -spat, sits forever without finishing. > > With CPL_CURL_VERBOSE it sort of looks like it's just scanning through > > the file a range request at a time: > > > > --- > > ogrinfo -ro -json -features /vsis3/bucket/geoms.gpkg geoms -sql > > "select * from geoms where ST_Intersects(geom, > > ST_GeomFromText('POINT(-105 40)', 4326)) = 1" > > --- > > > > I also tried replacing the ST_GeomFromText with other things > > like MakePoint(-105, 40, 4326) and ST_EnvIntersects(geom, -105, 40, > > -105, 40), just in case. It seems like -spat uses the spatial index > > efficiently but -sql doesn't. I also tried using them both, but having > > the -sql there seems to make it slow. > > > > I also tried to add "explain query plan" to the sql statement: > > > > --- > > ogrinfo -ro > > /vsis3/prod-tilerepo-va2/metadata/20250920T041443/geoms.gpkg geoms > > -sql "EXPLAIN QUERY PLAN select * from geoms where ST_Intersects(geom, > > ST_GeomFromText('POINT(-105 40)', 4326)) = 1" > > --- > > > > That only sort of confirms the scan, and reports "SCAN geoms" in the > > output. This page, https://sqlite.org/eqp.html, explains the "explain > > query plan" output and I was hoping to see a "SEARCH geoms USING INDEX > > ___" of some sort. > > > > Is there something about the use of -sql that precludes it from using > > the spatial index? Maybe we need to do an ogr2ogr with just -spat to > > narrow it down, then apply a -sql to that result if we need additional > > filtering? > > > > Thanks > > > > _______________________________________________ > > gdal-dev mailing list > > [email protected] > > https://lists.osgeo.org/mailman/listinfo/gdal-dev > > -- > http://www.spatialys.com/ > My software is free, but my time generally not. > > _______________________________________________ > gdal-dev mailing list > [email protected] > https://lists.osgeo.org/mailman/listinfo/gdal-dev > > >
_______________________________________________ gdal-dev mailing list [email protected] https://lists.osgeo.org/mailman/listinfo/gdal-dev
