[gdal-dev] Gdalinfo slow with big Rasterlite tables
Hi,

I converted 153600 x 249600 pixels of raster data into a Rasterlite table. After creating overviews the result behaves well in Quantum GIS once the table is opened. Unfortunately, opening the table with QGIS installed on a mini PC takes ages. I do not know what QGIS is really doing, but I guessed that perhaps it wants some metadata and makes a gdalinfo request or something equivalent. I therefore ran a test on a somewhat more powerful computer, and gdalinfo took about 3 minutes. That feels rather slow. Is gdalinfo perhaps walking through every single tile in the Rasterlite table to gather the image layer info? Could there be a more effective way to do it on the GDAL side?

I have a feeling that this may not really be a GDAL problem. A plain SELECT count(*) FROM test_rasters with spatialite-gui is also very slow and took about the same 3 minutes in my quick test. That makes me think that, for more speed, Rasterlite should store more information about the tiles in the raster_metadata tables to provide instant access. Perhaps this is something to discuss with Alessandro, who is developing a new Rasterlite 2 that should be ready by the end of this year.

When it comes to GDAL, could it make sense to cache gdalinfo results for Rasterlite layers? Three minutes is rather a long time, and my 153600 x 249600 pixel layer with 780270 rows/tiles at 5 meter resolution is not exceptionally big. If the time increases with the tile count, it would mean 12 minutes for getting gdalinfo from a 2.5 meter resolution layer and 48 minutes from a 1.25 meter one... A test with gdal_translate suggests that it also reads the layer info first and spends about 3 minutes on that before starting the real work. The timing is very uncertain, though; I could only measure the time before the progress bar appears on the screen.

-Jukka Rahkonen-
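Roughly, the tests can be reproduced from the command line along these lines (the database and table names here are just illustrative for my setup; RASTERLITE:<db>,table=<table> is the connection string syntax of the GDAL Rasterlite driver):

$ time sqlite3 rasters.sqlite "SELECT count(*) FROM test_rasters;"
$ time gdalinfo RASTERLITE:rasters.sqlite,table=test_rasters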
Re: [gdal-dev] Gdalinfo slow with big Rasterlite tables
Jukka,

> Is gdalinfo perhaps walking through every single tile in the Rasterlite table
> to gather the image layer info? Could there be a more effective way to do it
> on the GDAL side?

The Rasterlite driver needs to fetch the extent of the _metadata layers to establish the extent of the raster, which can take a long time when there are a lot of tiles.

> When it comes to GDAL, could it make sense to cache gdalinfo results for
> Rasterlite layers? Three minutes is rather a long time, and my 153600 x 249600
> pixel layer with 780270 rows/tiles at 5 meter resolution is not exceptionally
> big. If the time increases with the tile count, it would mean 12 minutes for
> getting gdalinfo from a 2.5 meter resolution layer and 48 minutes from a
> 1.25 meter one...

Funny, because independently of the issue you raise here, I was working on improving the performance of GetFeatureCount() and GetExtent() on Spatialite DBs. In Spatialite 3.0 there is a SQL function, triggered by SELECT UpdateLayerStatistics(), that creates a layer_statistics table caching both the row count and the extent. I have just pushed an improvement (r24800) in which the SQLite driver can use those cached values if they are up to date. The up-to-dateness is determined by checking that the timestamp of the last 'UpdateLayerStatistics' event recorded in the 'spatialite_history' table matches the timestamp of the file.

When creating a new Spatialite DB or updating it with the OGR API, the SQLite driver makes sure that the statistics are kept up to date automatically. However, if a third-party tool edits the DB, it is then necessary to run:

ogrinfo the.db -sql "SELECT UpdateLayerStatistics()"

(The driver plays it safe and will not use stale statistics, to avoid returning false results.)

I have also made marginal changes (r24801) in the Rasterlite driver so that the above caching mechanism works automatically in the simple gdal_translate and gdaladdo scenarios. I would expect it to solve your performance problem, although I have not checked that. You can verify that the statistics are used by issuing:

$ gdalinfo byte.sqlite --debug on
SQLITE: SpatiaLite-style SQLite DB found !
SQLITE: File timestamp matches layer statistics timestamp. Loading statistics for byte_metadata
SQLite: Layer byte_metadata feature count : 2
SQLite: Layer byte_metadata extent : 440720.0,3750120.0,441920.0,3751320.0
OGR: OGROpen(byte.sqlite/0x1ad6390) succeeded as SQLite.
[...]

Best regards,

Even
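To make the workflow concrete, here is a minimal sketch of refreshing and then inspecting the cached statistics by hand after a third-party tool has edited the DB (the file name rasters.sqlite is a placeholder; UpdateLayerStatistics() and the layer_statistics table are the Spatialite 3.0 mechanisms described above):

$ ogrinfo rasters.sqlite -sql "SELECT UpdateLayerStatistics()"
# Inspect the cached row counts and extents that the SQLite driver can reuse:
$ sqlite3 rasters.sqlite "SELECT * FROM layer_statistics;"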
[gdal-dev] [Proposed new feature] A 'SQLite' SQL dialect for OGR
Hi folks,

I've attached, at http://trac.osgeo.org/gdal/attachment/ticket/4782/sqlite_dialect.patch , a patch that adds an alternate 'SQLite' SQL dialect that can be used with any OGR datasource (only available if GDAL/OGR is configured with SQLite support, of course).

To recall the concept of SQL dialects: for non-RDBMS OGR drivers, OGR uses its own SQL engine, called the OGRSQL dialect, whereas for RDBMS OGR drivers their own SQL engine is used, unless otherwise specified. This patch adds the capability to use a SQLite dialect (through the -dialect option of ogrinfo or ogr2ogr, for example). When doing so, the SQLite SQL engine is used, and when Spatialite is available, all the Spatialite functions (see http://www.gaia-gis.it/gaia-sins/spatialite-sql-3.0.0.html ) can also be used.

Technically, this works thanks to a temporary in-memory SQLite DB and a module (ogr/ogrsf_frmts/sqlite/ogrsqlitevirtualogr.cpp) that exposes OGR layers to SQLite through its Virtual Table mechanism ( http://www.sqlite.org/vtab.html ). When the datasource you operate on is opened in update mode and the corresponding OGR driver supports CreateFeature()/SetFeature()/DeleteFeature() operations, SQL INSERT/UPDATE/DELETE statements will work too.

What do you think about it?

A few non-exhaustive examples of things you can do:

# Initial CSV file:
$ cat my.csv
id,foo,bar,long,lat
1,foo,bar,2,49
1,foo,bar,3,50
2,foo2,bar2,-2,49
2,foo2,bar2,-3,51

# Create a geometry field from each (long,lat) tuple:
$ ogrinfo my.csv -dialect sqlite -sql "SELECT *, MakePoint(CAST(long AS FLOAT), CAST(lat AS FLOAT)) AS geometry FROM my" -q
Layer name: SELECT
OGRFeature(SELECT):0
  id (String) = 1
  foo (String) = foo
  bar (String) = bar
  long (String) = 2
  lat (String) = 49
  POINT (2 49)
OGRFeature(SELECT):1
  id (String) = 1
  foo (String) = foo
  bar (String) = bar
  long (String) = 3
  lat (String) = 50
  POINT (3 50)
OGRFeature(SELECT):2
  id (String) = 2
  foo (String) = foo2
  bar (String) = bar2
  long (String) = -2
  lat (String) = 49
  POINT (-2 49)
OGRFeature(SELECT):3
  id (String) = 2
  foo (String) = foo2
  bar (String) = bar2
  long (String) = -3
  lat (String) = 51
  POINT (-3 51)

# Merge all the points that have the same id into a line:
$ ogrinfo my.csv -dialect sqlite -sql "SELECT id, foo, bar, MakeLine(MakePoint(CAST(long AS FLOAT), CAST(lat AS FLOAT))) FROM my GROUP BY id" -q
Layer name: SELECT
OGRFeature(SELECT):0
  id (String) = 1
  foo (String) = foo
  bar (String) = bar
  LINESTRING (2 49,3 50)
OGRFeature(SELECT):1
  id (String) = 2
  foo (String) = foo2
  bar (String) = bar2
  LINESTRING (-2 49,-3 51)

# Compute the geodesic length of each line:
$ ogrinfo my.csv -dialect sqlite -sql "SELECT id, GeodesicLength(SetSRID(MakeLine(MakePoint(CAST(long AS FLOAT), CAST(lat AS FLOAT))),4326)) AS total_length FROM my GROUP BY id" -q
Layer name: SELECT
OGRFeature(SELECT):0
  id (String) = 1
  total_length (Real) = 132725.477910869
OGRFeature(SELECT):1
  id (String) = 2
  total_length (Real) = 233720.037020965

# Append a new entry to the CSV:
$ ogrinfo my.csv -dialect sqlite -sql "INSERT INTO my (id,foo,bar,long,lat) VALUES (3,'foo3','bar3',2.5,49.5)"
$ cat my.csv
id,foo,bar,long,lat
1,foo,bar,2,49
1,foo,bar,3,50
2,foo2,bar2,-2,49
2,foo2,bar2,-3,51
3,foo3,bar3,2.5,49.5

# Reproject from EPSG:32631 to EPSG:4326:
$ ogrinfo poly.shp -dialect sqlite -sql "SELECT ST_Transform(SetSRID(GEOMETRY,32631),4326) AS GEOMETRY, * FROM poly WHERE EAS_ID = 170"
INFO: Open of `poly.shp' using driver `ESRI Shapefile' successful.
Layer name: SELECT
Geometry: Unknown (any)
Feature Count: 1
Extent: (2.750069, 43.03) - (2.751428, 43.035184)
Layer SRS WKT:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9122"]],
    AUTHORITY["EPSG","4326"]]
Geometry Column = GEOMETRY
AREA: Real (0.0)
EAS_ID: Real (0.0)
PRFEDEA: String (0.0)
OGRFeature(SELECT):0
  AREA (Real) = 5268.813
  EAS_ID (Real) = 170
  PRFEDEA (String) = 35043413
  POLYGON ((2.751427625469495 43.034734578878634,2.750298298604006 43.034443959553869,2.75006933958772 43.03490271631064,2.75124435992688 43.035184432169061,2.751427625469495 43.034734578878634))

Note: a similar capability was already available in OGR 1.9 for Shapefiles, through the use of the VirtualShape module that is embedded in Spatialite. See the http://gdal.org/ogr/drv_sqlite.html page.

Best regards,

Even
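As a small additional sketch, the same dialect should also work through ogr2ogr to materialize a query result into a new datasource (the output name points.shp is just an example, and this of course assumes a build with the patch applied):

$ ogr2ogr -f "ESRI Shapefile" points.shp my.csv -dialect sqlite \
    -sql "SELECT id, MakePoint(CAST(long AS FLOAT), CAST(lat AS FLOAT)) AS geometry FROM my"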
Re: [gdal-dev] [Proposed new feature] A 'SQLite' SQL dialect for OGR
This is wicked cool.

Mike

--
Michael Smith
US Army Corps
Remote Sensing GIS/Center