Re: [gdal-dev] Add a new performance hint for Spatialite

2014-03-28 Thread Even Rouault
Jukka,

I'm very surprised that you need to do that explicitely, since the driver should
already do that by default. This is something we discovered last year or the
year before when you were experimenting OSM - Spatialite conversions. Since OGR
1.10, the spatial index is created when the datasource is closed.

Look at this debug trace:

$ ogr2ogr -f sqlite /vsimem/out.sqlite ../autotest/ogr/data/poly.shp -dsco
spatialite=yes --debug on
OGR: OGROpen(../autotest/ogr/data/poly.shp/0x69a3d0) succeeded as ESRI
Shapefile.
SQLITE: SpatiaLite v4 DB found !
OGR_SQLITE: exec(CREATE TABLE 'poly' (   OGC_FID INTEGER PRIMARY KEY))
OGR_SQLITE: exec(DELETE FROM geometry_columns WHERE f_table_name = 'poly')
OGR_SQLITE: exec(SELECT AddGeometryColumn('poly', 'GEOMETRY', 325834, 'POLYGON',
2))
OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'area' FLOAT)
OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'eas_id' FLOAT)
OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'prfedea' VARCHAR(16))
OGR_SQLITE: BEGIN Transaction
OGR_SQLITE: prepare(INSERT INTO 'poly' (GEOMETRY,area,eas_id,prfedea)
VALUES (?,?,?,?))
OGR_SQLITE: COMMIT Transaction
OGR2OGR: 10 features written in layer 'poly'
OGR_SQLITE: exec(SELECT CreateSpatialIndex('poly', 'GEOMETRY'))
SQLITE: Error no such table: layer_statistics
OGR: Unloading VirtualOGR module
Shape: 10 features read on layer 'poly'.

Could try adding --debug on to your ogr2ogr command line and see when
CreateSpatialIndex() is created ?

Even

 Hi,

 I took timings about adding 115 GML files (548 MB together, 3.2 million
 linestrings) into a Spatialite table. With default settings the table gets
 initialized with spatial index which makes following inserts slower. Another
 alternative is to create the table without spatial index, append all the
 data first and as a last step create spatial index for the ready made table.

 With spatial index on:
 Append + index: 71 minutes

 With spatial index off:
 Append: 9 minutes
 Create spatial index: 6 minutes
 Total: 15 minutes

 I was rather happy with the initial conversion speed until I made this test
 which revealed that creating spatial index as a final step made the whole
 process more than four times faster! This way both data table and spatial
 index are probably in contiguous chunks in the SQLite datafile and there is
 no need for post process VACUUM. Vacuuming in SQLite is rather slow and for
 this 1.3 GB database it takes more than 4 minutes to run.

 Suggestion: Add a new performance hint on page
 http://www.gdal.org/ogr/drv_sqlite.html

 If many source files will be collected into the same Spatialite table it
 can be much faster to initialize the table without a spatial index by using
 -lco SPATIAL_INDEX=NO and to create spatial index with a separate command
 after all the data are appended. Spatial index can be created with ogrinfo
 command
 ogrinfo db.sqlite -sql SELECT
 CreateSpatialIndex('table_name','geometry_column_name')

 Perhaps it could also be mentioned as a performance hint that VACUUUM can
 also be done from orginfo as
 ogrinfo db.sqlite -sql VACUUM

 -Jukka Rahkonen-



 ___
 gdal-dev mailing list
 gdal-dev@lists.osgeo.org
 http://lists.osgeo.org/mailman/listinfo/gdal-dev



___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev


Re: [gdal-dev] Add a new performance hint for Spatialite

2014-03-28 Thread Rahkonen Jukka (Tike)
Hi,

Sorry, I must have not emphasized enough that hint is only valid for appending 
data with subsequent ogr2ogr commands

ogr2ogr -f sqlite -dsco spatialite=yes mtk_tos.sqlite -dim 2 -lco 
spatial_index=no /vsizip/e:\mtk_tos\etrs89\gml\K32.zip tieviiva  -gt 65536
ogr2ogr -f sqlite -append mtk_tos.sqlite -dim 2 
/vsizip/e:\mtk_tos\etrs89\gml\K34.zip tieviiva -gt 65536
... repeat the latter command for 113 more GML files...

I am not sure if this is a common use case everywhere. However, our National 
Land Survey delivers vector data divided by map sheets which means that each 
country wide dataset contains hundreds or thousands of GML files.
 
Even Rouault wrote:
 
 Jukka,
 
 I'm very surprised that you need to do that explicitely, since the driver 
 should
 already do that by default. This is something we discovered last year or the 
 year
 before when you were experimenting OSM - Spatialite conversions. Since OGR
 1.10, the spatial index is created when the datasource is closed.
 
 Look at this debug trace:
 
 $ ogr2ogr -f sqlite /vsimem/out.sqlite ../autotest/ogr/data/poly.shp -dsco
 spatialite=yes --debug on
 OGR: OGROpen(../autotest/ogr/data/poly.shp/0x69a3d0) succeeded as ESRI
 Shapefile.
 SQLITE: SpatiaLite v4 DB found !
 OGR_SQLITE: exec(CREATE TABLE 'poly' (   OGC_FID INTEGER PRIMARY KEY))
 OGR_SQLITE: exec(DELETE FROM geometry_columns WHERE f_table_name =
 'poly')
 OGR_SQLITE: exec(SELECT AddGeometryColumn('poly', 'GEOMETRY', 325834,
 'POLYGON',
 2))
 OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'area' FLOAT)
 OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'eas_id' FLOAT)
 OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'prfedea' VARCHAR(16))
 OGR_SQLITE: BEGIN Transaction
 OGR_SQLITE: prepare(INSERT INTO 'poly'
 (GEOMETRY,area,eas_id,prfedea)
 VALUES (?,?,?,?))
 OGR_SQLITE: COMMIT Transaction
 OGR2OGR: 10 features written in layer 'poly'
 OGR_SQLITE: exec(SELECT CreateSpatialIndex('poly', 'GEOMETRY'))
 SQLITE: Error no such table: layer_statistics
 OGR: Unloading VirtualOGR module
 Shape: 10 features read on layer 'poly'.
 
 Could try adding --debug on to your ogr2ogr command line and see when
 CreateSpatialIndex() is created ?
 
 Even
 
  Hi,
 
  I took timings about adding 115 GML files (548 MB together, 3.2
  million
  linestrings) into a Spatialite table. With default settings the table
  gets initialized with spatial index which makes following inserts
  slower. Another alternative is to create the table without spatial
  index, append all the data first and as a last step create spatial index 
  for the
 ready made table.
 
  With spatial index on:
  Append + index: 71 minutes
 
  With spatial index off:
  Append: 9 minutes
  Create spatial index: 6 minutes
  Total: 15 minutes
 
  I was rather happy with the initial conversion speed until I made this
  test which revealed that creating spatial index as a final step made
  the whole process more than four times faster! This way both data
  table and spatial index are probably in contiguous chunks in the
  SQLite datafile and there is no need for post process VACUUM.
  Vacuuming in SQLite is rather slow and for this 1.3 GB database it takes 
  more
 than 4 minutes to run.
 
  Suggestion: Add a new performance hint on page
  http://www.gdal.org/ogr/drv_sqlite.html
 
  If many source files will be collected into the same Spatialite table
  it can be much faster to initialize the table without a spatial index
  by using -lco SPATIAL_INDEX=NO and to create spatial index with a
  separate command after all the data are appended. Spatial index can be
  created with ogrinfo command ogrinfo db.sqlite -sql SELECT
  CreateSpatialIndex('table_name','geometry_column_name')
 
  Perhaps it could also be mentioned as a performance hint that VACUUUM
  can also be done from orginfo as ogrinfo db.sqlite -sql VACUUM
 
  -Jukka Rahkonen-
 
 
 
  ___
  gdal-dev mailing list
  gdal-dev@lists.osgeo.org
  http://lists.osgeo.org/mailman/listinfo/gdal-dev
 
 

___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev


Re: [gdal-dev] Add a new performance hint for Spatialite

2014-03-28 Thread Even Rouault
Ah, got it. I read your message too fast. I've added your suggestions with a
simplified version of your below example in
http://trac.osgeo.org/gdal/changeset/27104

Another solution for your use case would be to make a OGR VRT with a
OGRVRTUnionLayer that reference every GML file and convert the VRT in a single
step in spatialite. But we don't have yet a ogrbuildvrt utility to create that
VRT in a user friendly way.

 Hi,

 Sorry, I must have not emphasized enough that hint is only valid for
 appending data with subsequent ogr2ogr commands

 ogr2ogr -f sqlite -dsco spatialite=yes mtk_tos.sqlite -dim 2 -lco
 spatial_index=no /vsizip/e:\mtk_tos\etrs89\gml\K32.zip tieviiva  -gt 65536
 ogr2ogr -f sqlite -append mtk_tos.sqlite -dim 2
 /vsizip/e:\mtk_tos\etrs89\gml\K34.zip tieviiva -gt 65536
 ... repeat the latter command for 113 more GML files...

 I am not sure if this is a common use case everywhere. However, our National
 Land Survey delivers vector data divided by map sheets which means that each
 country wide dataset contains hundreds or thousands of GML files.

 Even Rouault wrote:

  Jukka,
 
  I'm very surprised that you need to do that explicitely, since the driver
 should
  already do that by default. This is something we discovered last year or
 the year
  before when you were experimenting OSM - Spatialite conversions. Since OGR
  1.10, the spatial index is created when the datasource is closed.
 
  Look at this debug trace:
 
  $ ogr2ogr -f sqlite /vsimem/out.sqlite ../autotest/ogr/data/poly.shp -dsco
  spatialite=yes --debug on
  OGR: OGROpen(../autotest/ogr/data/poly.shp/0x69a3d0) succeeded as ESRI
  Shapefile.
  SQLITE: SpatiaLite v4 DB found !
  OGR_SQLITE: exec(CREATE TABLE 'poly' (   OGC_FID INTEGER PRIMARY KEY))
  OGR_SQLITE: exec(DELETE FROM geometry_columns WHERE f_table_name =
  'poly')
  OGR_SQLITE: exec(SELECT AddGeometryColumn('poly', 'GEOMETRY', 325834,
  'POLYGON',
  2))
  OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'area' FLOAT)
  OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'eas_id' FLOAT)
  OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'prfedea' VARCHAR(16))
  OGR_SQLITE: BEGIN Transaction
  OGR_SQLITE: prepare(INSERT INTO 'poly'
  (GEOMETRY,area,eas_id,prfedea)
  VALUES (?,?,?,?))
  OGR_SQLITE: COMMIT Transaction
  OGR2OGR: 10 features written in layer 'poly'
  OGR_SQLITE: exec(SELECT CreateSpatialIndex('poly', 'GEOMETRY'))
  SQLITE: Error no such table: layer_statistics
  OGR: Unloading VirtualOGR module
  Shape: 10 features read on layer 'poly'.
 
  Could try adding --debug on to your ogr2ogr command line and see when
  CreateSpatialIndex() is created ?
 
  Even
 
   Hi,
  
   I took timings about adding 115 GML files (548 MB together, 3.2
   million
   linestrings) into a Spatialite table. With default settings the table
   gets initialized with spatial index which makes following inserts
   slower. Another alternative is to create the table without spatial
   index, append all the data first and as a last step create spatial index
 for the
  ready made table.
  
   With spatial index on:
   Append + index: 71 minutes
  
   With spatial index off:
   Append: 9 minutes
   Create spatial index: 6 minutes
   Total: 15 minutes
  
   I was rather happy with the initial conversion speed until I made this
   test which revealed that creating spatial index as a final step made
   the whole process more than four times faster! This way both data
   table and spatial index are probably in contiguous chunks in the
   SQLite datafile and there is no need for post process VACUUM.
   Vacuuming in SQLite is rather slow and for this 1.3 GB database it takes
 more
  than 4 minutes to run.
  
   Suggestion: Add a new performance hint on page
   http://www.gdal.org/ogr/drv_sqlite.html
  
   If many source files will be collected into the same Spatialite table
   it can be much faster to initialize the table without a spatial index
   by using -lco SPATIAL_INDEX=NO and to create spatial index with a
   separate command after all the data are appended. Spatial index can be
   created with ogrinfo command ogrinfo db.sqlite -sql SELECT
   CreateSpatialIndex('table_name','geometry_column_name')
  
   Perhaps it could also be mentioned as a performance hint that VACUUUM
   can also be done from orginfo as ogrinfo db.sqlite -sql VACUUM
  
   -Jukka Rahkonen-
  
  
  
   ___
   gdal-dev mailing list
   gdal-dev@lists.osgeo.org
   http://lists.osgeo.org/mailman/listinfo/gdal-dev
  
 

 ___
 gdal-dev mailing list
 gdal-dev@lists.osgeo.org
 http://lists.osgeo.org/mailman/listinfo/gdal-dev



___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev