Re: [gdal-dev] Simple schema support for GeoJSON

2014-11-21 Thread Rahkonen Jukka (Tike)
Hi,

I have no use for this feature myself, but from reading various mailing lists
and forums I have learned that many people consider it always a good idea to
read data, for example from WFS services, as GeoJSON instead of GML. I can easily
imagine that the guess-by-data method will cause trouble if they make
subsequent requests to the service. For example, strings which are all
digits but which may contain leading zeroes get saved either as integers or as
strings, if the leading zeroes are interpreted correctly at all. The same goes
for floats which do not always contain decimals, and for list attributes which
sometimes have only zero or one member.

An embedded schema feels optimal because it would always travel together with
the data, and we have all probably lost .tfw or .prj files at some point.

-Jukka-

Even Rouault wrote:

 Jukka,
 
 Data type guessing implemented in the OGR GeoJSON driver is hopefully quite
 natural.
 A whole scan of the GeoJSON file is made and the following rules are applied:
 - if an attribute has integer-only content --> Integer
 - if an attribute has an array of integer-only content --> IntegerList
 - if an attribute has integer or floating point content --> Real
 - if an attribute has an array of integer or floating point content --> RealList
 - if an attribute has an array of anything else content --> StringList
 - otherwise --> String
 
 With RFC 50 and other pending improvements in the driver:
 - if an attribute has boolean-only content --> Integer(Boolean)
 - if an attribute has an array of boolean-only content --> IntegerList(Boolean)
 - if an attribute has date-only content --> Date
 - if an attribute has time-only content --> Time
 - if an attribute has datetime or date content --> DateTime
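
The first six rules above can be sketched in plain Python. This is only an illustration of the rules as listed in this mail, not the driver's actual code: the function names are made up, the RFC 50 additions (booleans, dates) are left out, and ignoring null values is an assumption. The two sample collections show how two requests to the same service can end up with different guessed types for the same attribute.

```python
import json


def _kind(value):
    """Classify one JSON scalar as 'int', 'real' or 'other'."""
    if isinstance(value, bool):      # bool is a subclass of int in Python
        return "other"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "real"
    return "other"


def guess_field_type(values):
    """Guess a field type from every value seen for one attribute."""
    values = [v for v in values if v is not None]   # assumption: nulls are ignored
    if not values:
        return "String"
    if all(isinstance(v, list) for v in values):
        kinds = {_kind(item) for v in values for item in v}
        suffix = "List"
    elif any(isinstance(v, list) for v in values):
        return "String"              # mix of arrays and scalars
    else:
        kinds = {_kind(v) for v in values}
        suffix = ""
    if kinds == {"int"}:
        return "Integer" + suffix
    if kinds and kinds <= {"int", "real"}:
        return "Real" + suffix
    return "String" + suffix


def guess_schema(geojson_text):
    """Scan the whole collection, then guess one type per attribute name."""
    columns = {}
    for feature in json.loads(geojson_text)["features"]:
        for name, value in feature["properties"].items():
            columns.setdefault(name, []).append(value)
    return {name: guess_field_type(vals) for name, vals in columns.items()}


# Two hypothetical responses from the same service.
first = '{"features": [{"properties": {"LAND_KM": 10, "CODE": "007"}}]}'
second = '{"features": [{"properties": {"LAND_KM": 10.5, "CODE": "007"}}]}'
print(guess_schema(first))
print(guess_schema(second))
```

With these inputs LAND_KM is guessed as Integer in the first response and as Real in the second, which is the kind of inconsistency between subsequent requests that Jukka describes.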
 
 I'm not sure we want to invent a .jsont format, but if you download
 http://svn.osgeo.org/gdal/trunk/gdal/swig/python/samples/ogr2vrt.py
 
 and run:
 
 python ogr2vrt.py "http://demo.opengeo.org/geoserver/wfs?service=wfs&version=1.0.0&request=getfeature&typename=topp:states&outputformat=json" test.vrt
 
 this will create a VRT with the default schema, which you can easily edit.
 Note: as with OGR SQL CAST, this is post-processing. So if the guess made by
 the GeoJSON driver leads to a loss of information, you cannot recover it.
 Hopefully the implemented rules will not lead to information loss.
 
 A better approach would be to have the schema embedded in a JSON way in the
 GeoJSON file itself.
 That could be an evolution of the format, but I'm not sure this would be
 really popular, given JSON/GeoJSON is heavily used by NoSQL approaches...
 
 Hum, doing a quick search, I just found http://json-schema.org/ that appears
 to be an IETF draft.
 It doesn't look like the schema is embedded in the data file itself.
 
 There's also GeoJSON-LD that might be a bit related :
 https://github.com/geojson/geojson-ld
 
 CC'ing Sean in case he has thoughts on this.
 
 Even
 
  Hi,
 
  I wonder if GDAL could have some simple and relatively user-friendly
  way of defining a schema for GeoJSON data. The GeoJSON driver seems
  to guess the data types of attributes in some undocumented way, but
  users could have better knowledge about the desired schema.
 
  I know I can control the data type by using OGR SQL and CAST as in
  ogrinfo -sql "select cast(EMPLOYED as float) from OGRGeojson"
  states.json -so
 
  However, perhaps GeoJSON is popular enough to deserve an easier way
  of writing a schema. First I thought that it would be enough to copy
  the csvt text file mechanism from the GDAL CSV driver
  http://www.gdal.org/drv_csv.html. However, the csvt file is a plain
  list of types which will be applied to the attributes in the same
  order as they appear in the text file
  Integer(5),Real(10.7),String(15)
 
  For GeoJSON it would feel more user-friendly to include the attribute
  names in the list, somehow like
  population;Integer(5),area;Real(10.7),name;String(15).
 
  This would make it easier for users to write a valid jsont file. A
  list with attribute names could perhaps also help GDAL, because
  the features in a GeoJSON file do not necessarily have the same attributes.
 
  As an example, this is the right schema for a WFS feature type, as
  captured from
  http://demo.opengeo.org/geoserver/wfs?service=wfs&version=1.0.0&request=describefeaturetype&typename=topp:states
 
 
  <xsd:element name="the_geom" type="gml:MultiPolygonPropertyType"/>
  <xsd:element name="STATE_NAME" type="xsd:string"/>
  <xsd:element name="STATE_FIPS" type="xsd:string"/>
  <xsd:element name="SUB_REGION" type="xsd:string"/>
  <xsd:element name="STATE_ABBR" type="xsd:string"/>
  <xsd:element name="LAND_KM" type="xsd:double"/>
  <xsd:element name="WATER_KM" type="xsd:double"/>
  <xsd:element name="PERSONS" type="xsd:double"/>
  <xsd:element name="FAMILIES" type="xsd:double"/>
  <xsd:element name="HOUSHOLD" type="xsd:double"/>
  <xsd:element name="MALE" type="xsd:double"/>
  <xsd:element name="FEMALE" type="xsd:double"/>
  <xsd:element name="WORKERS" type="xsd:double"/>
  <xsd:element name="DRVALONE" type="xsd:double"/>
  <xsd:element name="CARPOOL" type="xsd:double"/>
  <xsd:element name="PUBTRANS" type="xsd:double"/>
  <xsd:element name="EMPLOYED" type="xsd:double"/>
  <xsd:element name="UNEMPLOY" type="xsd:double"/>
  

Re: [gdal-dev] Simple schema support for GeoJSON

2014-11-21 Thread Rahkonen Jukka (Tike)
Even Rouault 

 On Friday 21 November 2014 15:35:43, Rahkonen Jukka (Tike) wrote:
  Hi,
 
  I have no use for this feature myself but by reading various mailing
  lists and forums I have learned that many people consider it is always
  a good idea to read data for example from WFS services as GeoJSON instead
  of GML.
 
 Because it consumes less bandwidth ?


I suppose rather that they generalize their good experiences with GeoJSON in
browsers to mean that GML is poor for everything. I found an interesting site:
http://jsperf.com/openlayers-format-reading-speed/2

I believe I can indeed see a meaningful speed difference there.

-Jukka-

___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev


Re: [gdal-dev] Simple schema support for GeoJSON

2014-11-21 Thread Rahkonen Jukka (Tike)
Hi,


As I wrote, my motivation for the first mail was that I have seen people
quite often using GeoJSON for delivering geospatial data as data, to
be saved on disk and used like shapefiles, GML etc. As a result you get stuff
like this:

http://demo.opengeo.org/geoserver/wfs?service=wfs&version=1.0.0&request=getfeature&typename=topp:states&outputformat=application/json


You wrote, and I agree, that XML and JSON have very different strengths
and use cases. However, people do what they want, and I do feel that GeoJSON
will be used for use cases where XML could be stronger, for example as the only
supported format in some download services.


About the nonsensical 4-field schema: it is a little bit violent, but it is just
what nearly everybody who uses OpenStreetMap data is doing all the time. OSM
features are pushed into the traditional simple feature model and a set of tags
is converted to attributes in a fixed schema. There are lots of null fields in
the data, and even if that is in a way nonsensical, it is also practical because
it makes it possible to use osm2pgsql, PostGIS and Mapnik for rendering.


I am so fixated on consuming data that I was not thinking at all about how to
write GeoJSON with GDAL. I was just thinking that if some data are only
available as GeoJSON, how could users convert them to PostGIS etc. so that the
data types of the attributes will be the same as in the original data.


Because GeoJSON will not carry the data types as a payload, I suppose that the
current guess-the-datatype approach is the best starting point. The workaround
of using VRT, as Even suggested, is good for fine tuning, and cast with SQL works
as well. The correct datatypes may still be somewhat uncertain, but perhaps those
who maintain such services will announce the structure of their data on their
web pages if they feel that it is important and if they, for example, expect
data updates from users. When it comes to WFS, it seems to be an easy case
because the XML schema can be reused as the GeoJSON schema.


-Jukka Rahkonen-




Sean Gillies <s...@mapbox.com> wrote:

 Hi Even, Jukka,

 While the OGC service architecture is heavily dependent on schemas, OGR type 
 schemas are not *generally* useful for GeoJSON. Consider the following 
 abbreviated feature collection:

  "features": [
    {"properties": {"a": 0, "b": "lol"}, ...},
    {"properties": {"c": "2014-11-21", "d": "wut"}, ...}
  ]

 It has two features and they are distinctly different types. A schema that 
 says these features have 4 fields would be nonsensical.
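
A small Python sketch of what such a 4-field schema would look like for the two features above. The helper names are made up for illustration, and the elided "..." members of the features are simply left out.

```python
features = [
    {"properties": {"a": 0, "b": "lol"}},
    {"properties": {"c": "2014-11-21", "d": "wut"}},
]


def union_field_names(features):
    """Collect every property name seen in any feature, in first-seen order."""
    names = []
    for feature in features:
        for name in feature["properties"]:
            if name not in names:
                names.append(name)
    return names


def flatten(features):
    """Force every feature into the union schema, padding with None."""
    names = union_field_names(features)
    return [{n: f["properties"].get(n) for n in names} for f in features]


rows = flatten(features)
null_cells = sum(value is None for row in rows for value in row.values())
print(union_field_names(features))
print(null_cells, "null cells out of", len(rows) * len(rows[0]))
```

Half of the cells in the flattened table are nulls, because each feature only fills the two fields that belong to its own type.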

 There are a bunch of different JSON schema approaches and none of them seem 
 to have any traction. https://github.com/json-schema/json-schema for example 
 looks to be stalled. I think the lack of traction reflects some deeper 
 reality: that XML and JSON have very different strengths and use cases and 
 that attempts to XML-ize JSON by adding schemas will always eventually run 
 out of steam.

 For OGR to write schemas into GeoJSON would be a mistake. They could be 
 misleading and because there will never (as far as I can tell) be consensus 
 in the JSON community on the right form of schema, anything OGR implemented 
 would end up being a loser.


Re: [gdal-dev] LZW Compression on geotiffs

2014-10-01 Thread Rahkonen Jukka (Tike)
Even Rouault wrote:
 
 On Thursday 02 October 2014 01:06:57, David Strip wrote:
  On 10/1/2014 12:02 PM, Jukka Rahkonen wrote:
 
  For comparison:
  Tiff as zipped   347 MB
  Tiff into png    263 MB
  If I have understood right, both zip and png use the deflate
  algorithm, so there might be some room for improving deflate compression
  in GDAL.
 
   I was curious how png could achieve so much better compression if it is
  using the same deflate algorithm. I wouldn't think different
  implementations would account for so much improvement. It turns out
  the png compression uses a filtering step ahead of compression. This
  is explained here. The filter is similar to a differential pulse code
  modulation, in which the pixel is represented as the difference from
  the pixels to the left, left upper diagonal, and above. This typically
  reduces the magnitude of the value to something close to zero, making
  the encoding more efficient.
 
 True, a way to improve things might be to specify -co PREDICTOR=2. It should
 apply to both LZW and DEFLATE.
 This is one of the filters that PNG may use, except that PNG has several
 different filters, so it will eventually beat TIFF deflate.
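
The effect of differencing before deflate can be sketched with Python's zlib module. This is only an illustration of the idea, not what libtiff or libpng actually do: the test image is a synthetic random-walk raster (an assumption standing in for smoothly varying elevation or imagery), and only simple left-neighbour differencing on 8-bit samples is shown.

```python
import random
import zlib

WIDTH, HEIGHT = 512, 128


def make_rows(seed=0):
    """Build a synthetic 8-bit raster whose neighbouring pixels differ by at most 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(HEIGHT):
        value, row = 128, bytearray()
        for _ in range(WIDTH):
            value = (value + rng.choice((-1, 0, 1))) % 256
            row.append(value)
        rows.append(bytes(row))
    return rows


def horizontal_difference(row):
    """Keep the first sample, store every other sample as a difference from its left neighbour."""
    out = bytearray(row[:1])
    for i in range(1, len(row)):
        out.append((row[i] - row[i - 1]) % 256)
    return bytes(out)


def undo_horizontal_difference(row):
    """Inverse of horizontal_difference()."""
    out = bytearray(row[:1])
    for i in range(1, len(row)):
        out.append((out[i - 1] + row[i]) % 256)
    return bytes(out)


rows = make_rows()
raw = b"".join(rows)
filtered = b"".join(horizontal_difference(r) for r in rows)

raw_size = len(zlib.compress(raw, 9))
filtered_size = len(zlib.compress(filtered, 9))
print("deflate only:          ", raw_size, "bytes")
print("difference + deflate:  ", filtered_size, "bytes")
```

After differencing, the data consists almost entirely of the three byte values 0, 1 and 255, which deflate can encode with very short codes. The transform is lossless because it can be undone sample by sample.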

Not a bad suggestion.

Original                      424 MB
DEFLATE without predictor     380 MB
DEFLATE with -co predictor=2  280 MB

-Jukka Rahkonen-

Re: [gdal-dev] Binary Predicates in SQLite SQL dialect

2014-09-30 Thread Rahkonen Jukka (Tike)
Even Rouault wrote:

 On Tuesday 30 September 2014 23:20:12, Jukka Rahkonen wrote:
 Even Rouault <even.rouault at spatialys.com> writes:
  On Tuesday 30 September 2014 20:20:14, Andre Vautour wrote:
 ...

   I would build SpatiaLite with GEOS support, but unfortunately its LGPL
   licensing is too restrictive for our application. So, would it make
   sense to change the logic in OGRSQLiteRegisterSQLFunctions to account
   for the case of SpatiaLite being built without GEOS?
 
  Andre,
 
  That makes sense. I guess that can only be checked at runtime though,
  probably by issuing a ST_Intersects() and checking the error code.
 
  I'd note that the spatial predicates in OGR geometry are also based on
  GEOS. Except OGRGeometry::Intersects() that has a simplified
  implementation based on bounding box intersection when GEOS is not
  available.
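
The runtime check suggested above can be sketched with Python's sqlite3 module. This is a hedged illustration, not OGR code: the helper name is made up, and it assumes that a missing function shows up as an SQL error when the statement is prepared, which is how plain SQLite reports an unknown function. How a SpatiaLite build without GEOS reports ST_Intersects() is not confirmed in this thread.

```python
import sqlite3


def sql_function_works(conn, call):
    """Return True if the SQL expression `call` can be evaluated on this connection."""
    try:
        conn.execute("SELECT " + call).fetchone()
    except sqlite3.OperationalError:
        return False
    return True


conn = sqlite3.connect(":memory:")
probe = "ST_Intersects(NULL, NULL)"

# Plain SQLite has no ST_Intersects, so the probe statement fails.
before = sql_function_works(conn, probe)

# Stand-in registration, only to show the probe result changing.
conn.create_function("ST_Intersects", 2, lambda a, b: None)
after = sql_function_works(conn, probe)

print("before registration:", before)
print("after registration: ", after)
```

A caller could use the result of such a probe to decide whether to install its own fallback implementation of the predicate.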

 Hi,

 From the end user's point of view this example feels unpleasant. If
 Intersects gives different results with OGR and SQLite dialects because
 of different implementations, a somewhat clever user can handle it now by
 selecting the dialect.

 Actually the OGR SQL dialect doesn't directly support spatial predicates. That
 can only be done through the SetSpatialFilter() API that is typically
 triggered with the -spat option of ogrinfo/ogr2ogr.
 
 What was discussed here is SQLite user functions installed by OGR. Currently,
 even if Spatialite isn't available, there's an implementation of the most
 common spatial predicates that goes through the OGRGeometry methods. A kind of
 poor man's Spatialite.

 But if selecting SQLite dialect might still lead to
 use of  OGRGeometry::Intersects() depending on how Spatialite has been
 built I would say that an end user can't manage the situation.

 How about renaming it into OGR_Intersects in such case?

 I'm not sure that would help for consistency, because
 OGRGeometry::Intersects() can be equivalent to Spatialite ST_Intersects() when
 GDAL is compiled with GEOS support. So OGR_Intersects would have different
 behaviour.

 Perhaps that should rather be an OGR_Envelope_Intersects() that is guaranteed
 to always do bounding box intersection.
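
A minimal sketch of what a pure bounding box test does, in plain Python. The function names and the tuple layout (minx, miny, maxx, maxy) are illustrative assumptions and not OGR API. The two sample lines are parallel, so they never intersect, yet their envelopes overlap.

```python
def envelope(points):
    """Return (minx, miny, maxx, maxy) for a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def envelopes_intersect(a, b):
    """True if two (minx, miny, maxx, maxy) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


line_a = [(0, 0), (10, 10)]   # the diagonal y = x
line_b = [(6, 0), (10, 4)]    # the parallel line y = x - 6

overlap = envelopes_intersect(envelope(line_a), envelope(line_b))
print("envelopes intersect:", overlap)
```

This is why a bounding-box-only Intersects can return true where a GEOS-based one returns false, which is the inconsistency discussed in this thread.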

 I don't know if we must worry too much about that, as most users should have
 GDAL and Spatialite binaries with GEOS support. I'd say that only people that
 know what they do will build them without GEOS support, and in which case they
 must be aware of the implications.

You are right and I feel rather relaxed. What I was worried about was that
people who know what they are doing when they build GDAL can package the result
into applications which they deliver to poor people who have no idea about any
of that. You read the QGIS users mailing lists and have surely noticed how users
ask why QGIS sometimes, on some computers, behaves differently than it used to
because of differences in the GDAL used for the builds. You know, different
SQLite and Spatialite versions as a typical example, or the variety of JPEG2000
drivers.

 -Jukka-


Re: [gdal-dev] Wrong statistics from gdalinfo

2014-08-27 Thread Rahkonen Jukka (Tike)
 Jukka,
 
 I suspect the file has been updated after the first statistics were computed,
 or perhaps they were computed in -approx_stats mode.
 
 gdalinfo -stats doesn't recompute statistics if they are already stored in
 the file.

All right, the statistics are what they are because the big GeoTIFF was created
from this VRT file:
http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt

The wrong statistics seem to be stored in the VRT. I do not know how the
statistics got into the VRT. The VRT was created with gdalbuildvrt.

-Jukka-


Re: [gdal-dev] [OSGeo-Standards] Idea: GeoTIFF box in JPEG to add georeferencing

2014-05-21 Thread Rahkonen Jukka (Tike)
Even Rouault wrote:

 A fundamental question that has not been much discussed in this thread is: do
 people really see interest in capturing georeferencing in popular image
 formats like PNG or JPG? My thinking is that it could be potentially useful in
 a few situations:
 * when a WMS server returns a PNG or JPG.
 * when generating tile caches for WMTS / TMS.

If an image is a map or an aerial image, it is worth more with georeferencing
than without, or of equal value for some users, but not less for anybody. WMS is
a good example (we have GeoTIFF outputformat as an option just because of that),
but there are lots of static map services on the web which deliver just images
of maps but which could, with minimal effort, attach georeferencing to the
maps so that they could be copied to a GPS device to act as a moving map
background, also for offline usage.

I think that the use cases for PNG and JPG are mostly very simple. However, a
world file is not enough; the projection code should be advertised as well. And
support for georeferencing with ground control points would suit scanned maps
and digital photos of maps very well. Inserting a minimal tiff file
inside an image in another format may be a dirty trick, but at least it works
pretty well with JPEG2000.

-Jukka Rahkonen-



Re: [gdal-dev] Idea: GeoTIFF box in JPEG to add georeferencing

2014-05-12 Thread Rahkonen Jukka (Tike)
Hi,

This is just a small detail and I may be wrong, but I fear that the axisLabels
element does not universally remove the need to go to the database, because
there are systems which use "X" and "Y" as axis names while the meaning of "X"
and "Y" differs. For example, for EPSG:3857 "X" means Easting, but for the old
Finnish KKJ it means Northing. I may also be wrong in saying that "X" is an axis
name, because in the following EPSG reports the names seem to be "Easting" and
"Northing", and "X" and "Y" are abbreviations. However, looking at the
EPSG:3857 example from the package
https://portal.opengeospatial.org/files/?artifact_id=50118, just the ambiguous
abbreviations are used as axisLabels:
<gml:axisLabels>x y</gml:axisLabels>

Am I right or am I wrong?

http://epsg-registry.org/report.htm?type=selection&entity=urn:ogc:def:crs:EPSG::2393&reportDetail=short&style=urn:uuid:report-style:default-with-code&style_name=OGP%20Default%20With%20Code&title=Finnish%20KKJ

http://epsg-registry.org/report.htm?type=selection&entity=urn:ogc:def:crs:EPSG::3857&reportDetail=short&style=urn:uuid:report-style:default-with-code&style_name=OGP%20Default%20With%20Code&title=EPSG:3857

Regards,

-Jukka Rahkonen-

Peter Baumann wrote:

 Hi all,
 
 just saw this, thought I'd chime in being the editor of GMLCOV :) Any
 questions, I'll gladly try to answer, I'm just on the road currently, so
 expect a delay of a day or so.
 
 Trying to respond to what has been raised below:
 
 On 05/12/2014 11:18 PM, Even Rouault wrote:
  On Monday 12 May 2014 23:05:21, Jukka Rahkonen wrote:
  Even Rouault <even.rouault at mines-paris.org> writes:
 
  ...
 
  In light of this, it may be better to use an xml or textual
  representation and embed it inside an XMP block, which is supported
  for many formats[1]. Also it would allow for easier human-reading.
  Yes, that's one possibility. Which comes back to GMLJP2 unless there
  are other standards...
 
  I'd want to build on something that is a real standard or a
  de-facto standard, but not reinvent everything from scratch.
  What GMLJP2 gives as a bonus is the axis order trouble
 
 actually, no. GMLJP2 relies on GMLCOV which defines axis order. Each coverage
 has a Native CRS which is defined via an OGC URI (which, in the case of EPSG,
 refers to the OGP maintained database). In the CRS definition the axis is
 given unambiguously.
 
 If you don't want to go to that database, use the axisLabels attribute in the
 Envelope, it lists axes in their proper sequence.
 
 
  and not totally
  clear interpretation if origin is in the centre (probably it is) or in the
  corner of a pixel,
 
 naively, GMLCOV follows the pixel-in-center interpretation. However, encodings
 may override this, such as GeoTIFF (see the pertaining adopted spec [1]).
 
 [1] https://portal.opengeospatial.org/files/?artifact_id=50118
 
 
  and rectified grid does not support ground control
  points.
 
 yes, because it is rectified. If you want more degrees of freedom do not use
 RectifiedGridCoverage but ReferenceableGridCoverage. In conjunction with
 GML 3.3 (a compatible enhancement of GML 3.2.1) this gives you irregular and
 warped grids. We're not completely done, though, and would be glad about your
 support in some advanced issues. Note that this work is all voluntary and
 relies on some project financing to be done. Up to now, EC and ESA have been
 generous, and so we are going ahead step by step.
 
 
  Thus GMLJP2 is at least not better in everything, it has also
  drawbacks.
  Hi Jukka,
 
  Yes, for all your above reasons, I would prefer to avoid it.
 
 so which ones are remaining, following the clarification you mention below?
 
 cheers,
 Peter
 
  Although the axis order trouble (the one of EPSG) and interpretation of
  origin (pixel center) should be mostly clarified with the revised version.
  As far as ground control points are concerned, I've not really looked at
  the capabilities of GMLCov, so perhaps there's something in it for that.
  Otherwise, that would be indeed a drawback.
 
  -Jukka Rahkonen-
 
 
 --
 Dr. Peter Baumann
   - Professor of Computer Science, Jacobs University Bremen
 www.faculty.jacobs-university.de/pbaumann
 mail: p.baum...@jacobs-university.de
 tel: +49-421-200-3178, fax: +49-421-200-493178
   - Executive Director, rasdaman GmbH Bremen (HRB 26793)
 www.rasdaman.com, mail: baum...@rasdaman.com
 tel: 0800-rasdaman, fax: 0800-rasdafax, mobile: +49-173-5837882
 Si forte in alienas manus oberraverit hec peregrina epistola incertis ventis
 dimissa, sed Deo commendata, precamur ut ei reddatur cui soli destinata, nec
 preripiat quisquam non sibi parata. (mail disclaimer, AD 1083)
 



Re: [gdal-dev] Add a new performance hint for Spatialite

2014-03-28 Thread Rahkonen Jukka (Tike)
Hi,

Sorry, I must not have emphasized enough that the hint is only valid when
appending data with subsequent ogr2ogr commands:

ogr2ogr -f sqlite -dsco spatialite=yes mtk_tos.sqlite -dim 2 -lco spatial_index=no /vsizip/e:\mtk_tos\etrs89\gml\K32.zip tieviiva -gt 65536
ogr2ogr -f sqlite -append mtk_tos.sqlite -dim 2 /vsizip/e:\mtk_tos\etrs89\gml\K34.zip tieviiva -gt 65536
... repeat the latter command for 113 more GML files...

I am not sure if this is a common use case everywhere. However, our National
Land Survey delivers vector data divided by map sheets, which means that each
countrywide dataset contains hundreds or thousands of GML files.
 
Even Rouault wrote:
 
 Jukka,
 
 I'm very surprised that you need to do that explicitly, since the driver
 should already do that by default. This is something we discovered last year
 or the year before when you were experimenting with OSM to Spatialite
 conversions. Since OGR 1.10, the spatial index is created when the
 datasource is closed.
 
 Look at this debug trace:
 
 $ ogr2ogr -f sqlite /vsimem/out.sqlite ../autotest/ogr/data/poly.shp -dsco spatialite=yes --debug on
 OGR: OGROpen(../autotest/ogr/data/poly.shp/0x69a3d0) succeeded as ESRI Shapefile.
 SQLITE: SpatiaLite v4 DB found !
 OGR_SQLITE: exec(CREATE TABLE 'poly' (   OGC_FID INTEGER PRIMARY KEY))
 OGR_SQLITE: exec(DELETE FROM geometry_columns WHERE f_table_name = 'poly')
 OGR_SQLITE: exec(SELECT AddGeometryColumn('poly', 'GEOMETRY', 325834, 'POLYGON', 2))
 OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'area' FLOAT)
 OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'eas_id' FLOAT)
 OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'prfedea' VARCHAR(16))
 OGR_SQLITE: BEGIN Transaction
 OGR_SQLITE: prepare(INSERT INTO 'poly' (GEOMETRY,area,eas_id,prfedea) VALUES (?,?,?,?))
 OGR_SQLITE: COMMIT Transaction
 OGR2OGR: 10 features written in layer 'poly'
 OGR_SQLITE: exec(SELECT CreateSpatialIndex('poly', 'GEOMETRY'))
 SQLITE: Error no such table: layer_statistics
 OGR: Unloading VirtualOGR module
 Shape: 10 features read on layer 'poly'.
 
 Could you try adding --debug on to your ogr2ogr command line and see when
 CreateSpatialIndex() is called?
 
 Even
 
  Hi,
 
  I took timings of adding 115 GML files (548 MB together, 3.2 million
  linestrings) into a Spatialite table. With default settings the table
  gets initialized with a spatial index, which makes the following inserts
  slower. An alternative is to create the table without a spatial
  index, append all the data first and, as a last step, create the spatial
  index for the ready-made table.
 
  With spatial index on:
  Append + index: 71 minutes
 
  With spatial index off:
  Append: 9 minutes
  Create spatial index: 6 minutes
  Total: 15 minutes
 
  I was rather happy with the initial conversion speed until I made this
  test, which revealed that creating the spatial index as a final step made
  the whole process more than four times faster! This way both the data
  table and the spatial index are probably in contiguous chunks in the
  SQLite datafile and there is no need for a post-process VACUUM.
  Vacuuming in SQLite is rather slow and for this 1.3 GB database it takes
  more than 4 minutes to run.
 
  Suggestion: Add a new performance hint on page
  http://www.gdal.org/ogr/drv_sqlite.html
 
  If many source files will be collected into the same Spatialite table,
  it can be much faster to initialize the table without a spatial index
  by using -lco SPATIAL_INDEX=NO and to create the spatial index with a
  separate command after all the data are appended. The spatial index can be
  created with the ogrinfo command
  ogrinfo db.sqlite -sql "SELECT CreateSpatialIndex('table_name','geometry_column_name')"
 
  Perhaps it could also be mentioned as a performance hint that VACUUM
  can also be done from ogrinfo as ogrinfo db.sqlite -sql VACUUM
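
The load-first, index-last pattern behind this hint can be sketched with Python's sqlite3 module. This is only an illustration of the order of operations: an ordinary B-tree index stands in for the Spatialite R-tree spatial index, and the table, column and index names are made up. No timings are claimed for this toy example.

```python
import sqlite3


def load_batches(conn, batches):
    """Create the table without an index, append every batch, index once at the end."""
    conn.execute(
        "CREATE TABLE roads (id INTEGER PRIMARY KEY, minx REAL, miny REAL)"
    )
    for batch in batches:
        with conn:   # one transaction per appended batch, like one ogr2ogr run
            conn.executemany(
                "INSERT INTO roads (minx, miny) VALUES (?, ?)", batch
            )
    with conn:       # the index is built only after all data are in place
        conn.execute("CREATE INDEX roads_xy ON roads (minx, miny)")


conn = sqlite3.connect(":memory:")
# Five hypothetical map sheets of 1000 rows each.
batches = [
    [(float(i), float(i + sheet)) for i in range(1000)] for sheet in range(5)
]
load_batches(conn, batches)

count = conn.execute("SELECT COUNT(*) FROM roads").fetchone()[0]
indexes = [row[1] for row in conn.execute("PRAGMA index_list('roads')")]
print(count, "rows, indexes:", indexes)
```

With the index created last, every insert only touches the table itself, and the index is built in one pass over the finished data.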
 
  -Jukka Rahkonen-
 
 
 


Re: [gdal-dev] Something wrong with writing a big raster into PDF

2014-03-07 Thread Rahkonen Jukka (Tike)
Hi Even,

Yes, that way I can perhaps create an all-white PDF map a bit faster but with 
no other improvement, unfortunately.

-Jukka-

Even Rouault wrote:
 
 Hi Jukka,
 
 did you try with -co TILED=YES ?
 
 Even
 
 
  Hi,
 
  I can't make usable pdf files from some original maps and I do not
  understand what goes wrong.
  I am testing on Windows 7 and GDAL-dev from gisinternals, both 32 and
  64 bit. Images are like this one:
 
 
 http://kartat.kapsi.fi/files/taustakarttasarja/taustakartta_80/5m/etrs89/png/UM5L.png
 
  The command can be simplified to
  gdal_translate -of pdf UM5L.png UM5L.pdf
  Everything seems to go all right and gdalinfo shows normal information
  from the pdf. However, if I open the UM5L.pdf file with Acrobat Reader
  it does not show a map. Everything is just pure white.
 
  I planned to check what happens if I read the map back from the pdf with
  gdal_translate -of gtiff um5l.pdf um5l.tif
  After fifteen minutes I estimate that the output will be ready after an
  hour or so, which is all too slow for a Core i7 CPU and 8 GB of RAM.
 
  I had another try with the same binaries with a small png file
  (2000x2000 pixels) and conversion into pdf was fast and successful. Is
  there perhaps something that does not scale up properly when the raster
  size gets bigger?
 
  -Jukka Rahkonen-
 
 
 


Re: [gdal-dev] Something wrong with writing a big raster into PDF

2014-03-07 Thread Rahkonen Jukka (Tike)

Even Rouault wrote:
 
 On Friday 07 March 2014 21:11:40, Even Rouault wrote:
  On Friday 07 March 2014 14:20:13, Rahkonen Jukka (Tike) wrote:
   Hi Even,
  
   Yes, that way I can perhaps create an all-white PDF map a bit faster
   but with no other improvement, unfortunately.
 
  I've tried gdal_translate UM5L.png UM5L.pdf -of pdf on my Linux
  workstation and the following viewers:
 
 I forgot to mention the -co TILED=YES which was actually used and makes a
 dramatic improvement for the xpdf and QGIS use cases, compared to non-tiled mode.
 
  - Adobe Reader 9.4.2 Linux: blank image
  - Okular, KDE-based PDF viewer, based on poppler library: can display
    a fit-to-page overview in a few seconds, but not the full resolution
    image since it tries to allocate a 15360x15360 buffer
  - Evince, GNOME-based PDF viewer, based on poppler library, uses cairo
    for display: blank image with error message "cairo context error: out of
    memory"
  - xpdf, viewer whose code has been forked to build the poppler
    library: can display the PDF at full resolution, and in reasonable time.
  - qgis (based on GDAL PDF driver, and poppler backend): the initial
    overview computation is really really slow. Roughly half an hour.
    Because there's currently no optimization in the GDAL PDF reader to
    get a lower resolution image. Could potentially be dramatically
    improved by asking a rendering at lower DPI for overview levels,
    instead of querying at best DPI and then downsampling. But once that
    32x32 overview is computed, if you go to 100% scale and pan, the
    refresh time is around 1 second after each pan.
  - command line rendering (based on GDAL PDF driver, and poppler backend):
    gdal_translate UM5L_tiled.pdf out.tif -co TILED=YES : 1 minute 20 sec
    (less if you specify a higher value than the default for
    GDAL_CACHEMAX)
 
  I've also tried generating a JPEG2000 compressed PDF and Adobe Reader
  is not happier.
 
  Conclusion: most PDF viewers assume that a PDF page can fit in memory
  and don't use a tiling strategy to display it.
 
  Even

I made some tests too by increasing the output size step by step. On my
laptop running 64-bit Windows 7 with 8 GB RAM and 32-bit Acrobat Reader XI, the
first failing pdf was made as
gdal_translate -of png -outsize 80% 80% ul4l.png ul4l_80p.png -co tiled=yes
Input file size is 19200, 19200

Output size is then 15360, 15360.
Acrobat Reader shows an empty map, but the task manager lists only a nominal
memory consumption of 24 MB for the AcroRd32.exe process. According to the
Reader file properties menu, the page size of the failing file is 5080x5080.
For the version reduced to 60%, which does show the map, the PDF page size is
4064x4064.

Now I wonder if this is really about a too big image or something else. 15360 by
15360 pixels is not extremely much. Is there some secret 200 inch limit hiding
in the background? See
http://indesignsecrets.com/beware-200-limit-for-pdfs.php. 5080 mm is exactly
200 inches.
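
The page sizes can be checked with a little arithmetic, sketched here in Python. The 72 DPI figure is an assumption that is not stated in this thread; it is used because it reproduces the 4064 mm reported for the 60% version. The function name is made up for illustration.

```python
MM_PER_INCH = 25.4
ASSUMED_DPI = 72.0   # assumption: the PDF pages were written at 72 DPI


def page_side_mm(pixels, dpi=ASSUMED_DPI):
    """Physical side length in millimetres of `pixels` pixels at `dpi`."""
    return pixels / dpi * MM_PER_INCH


for percent in (60, 80, 100):
    pixels = 19200 * percent // 100
    print(percent, "%:", pixels, "px ->", round(page_side_mm(pixels), 1), "mm")

limit_mm = 200 * MM_PER_INCH
print("200 inches =", round(limit_mm, 1), "mm")
```

Under that assumption the 60% image (11520 pixels) comes out at 4064 mm, matching what Reader reports, while the 80% image (15360 pixels) would need more than 5080 mm. That is consistent with the page having been capped at 200 inches, although the thread does not confirm where such a cap would be applied.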
 
-Jukka Rahkonen-



Re: [gdal-dev] GeoPackage fails after touching it with Spatialite-gui

2014-02-18 Thread Rahkonen Jukka (Tike)
Hi,

Is this the same issue with how GPKG handles ExecuteSQL()? If it is, then 
I will open a ticket, because a database without indexes is missing a lot. Or is 
there some other way I could create an index in a GPKG with GDAL? Alessandro 
Furieri already said that using spatialite-gui/tools for GeoPackages is 
currently not safe.

ogrinfo test.gpkg -dialect sqlite -sql "create index knro_idx on test (knro)"
INFO: Open of `test.gpkg'
  using driver `GPKG' successful.
ERROR 1: In ExecuteSQL(): sqlite3_prepare(create index knro_idx on test (knro)):
  no such table: main.test

-Jukka-

Even Rouault wrote:
 
 Quoting Jukka Rahkonen jukka.rahko...@mmmtike.fi:
 
  Even Rouault even.rouault at mines-paris.org writes:
 
  
   Jukka,
  
   I highly suspect that the ALTER TABLE implies a rewriting of the
   file by spatialite/spatialite-gui, and that when doing so it doesn't
   preserve the application id (4 bytes in the header of the SQLite
   file), which the GeoPackage specification requires to be set to a
   particular value. Consequently the GeoPackage driver later fails when
   checking the signature.
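
The application id mentioned above can be read straight from the file header. This is a minimal sketch: the 4 bytes sit at offset 68 of the SQLite header as a big-endian integer, and 0x47503130 ("GP10") is the value the GeoPackage 1.0 specification asks for. The file name and helper functions are made up for the demonstration.

```python
import os
import sqlite3
import struct
import tempfile

GP10 = 0x47503130  # ASCII "GP10", expected by GeoPackage 1.0


def read_application_id(path):
    """Return the application id stored at bytes 68..71 of the SQLite header."""
    with open(path, "rb") as f:
        header = f.read(100)
    if header[:16] != b"SQLite format 3\x00":
        raise ValueError("not an SQLite 3 database")
    return struct.unpack(">I", header[68:72])[0]


def make_demo_db(path, application_id=GP10):
    """Create a tiny database carrying the given application id (demo only)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA application_id = %d" % application_id)
    conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY)")
    conn.commit()
    conn.close()


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.gpkg")  # hypothetical file name
        make_demo_db(path)
        return read_application_id(path)


print(hex(demo()))
```

Running `read_application_id()` on a file before and after editing it with another tool would show whether that tool reset the field.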
 
  I have not yet found a way to reproduce the crash reliably, but
  while testing I found something else. Does the following mean that the
  GPKG file is not a totally valid SQLite file, or just that GDAL gets
  puzzled because there are separate implementations for SQLite/Spatialite
  and GPKG?
 
 
  Step 1: create plain SQLite and GPKG databases
 
  C:\data>ogr2ogr -f sqlite test.sqlite temp.shp
  C:\data>ogr2ogr -f gpkg test.gpkg temp.shp
 
  Step 2: try to rename a table
  C:\data>ogrinfo test.sqlite -dialect sqlite -sql "alter table temp rename to temp2"
  INFO: Open of `test.sqlite'
        using driver `SQLite' successful.
 
  C:\data>ogrinfo test.gpkg -dialect sqlite -sql "alter table temp rename to temp2"
  INFO: Open of `test.gpkg'
        using driver `GPKG' successful.
  ERROR 1: In ExecuteSQL(): sqlite3_prepare(alter table temp rename to temp2):
        no such table: temp
 
 Hum, well I can see that you are going to run into problems. The GPKG driver
 has no ExecuteSQL() implementation of its own that would directly handle your
 SQL requests (as the SQLite driver does). So it falls back to the generic
 ExecuteSQL() implementation, which uses the SQLite VirtualOGR mechanism, which
 does not support table renaming.
 Ideally, the GPKG driver should be extended to implement ExecuteSQL() in a
 similar way to the SQLite driver.
 
 
  I noticed that the OGR dialect does not support renaming tables; it
  suggests renaming a column instead.
 
  -Jukka-
 
 
 



Re: [gdal-dev] GeoPackage fails after touching it with Spatialite-gui

2014-02-18 Thread Rahkonen Jukka (Tike)
Hi,

Creating indexes with sqlite3 feels safe. Renaming a table is unsafe and leads 
to an ogrinfo crash.

sqlite> alter table test rename to test2;

C:\ohjelmat\sqlite3>ogrinfo kuti.gpkg
ERROR 1: (null)
INFO: Open of `kuti.gpkg'
  using driver `GPKG' successful.

Next: Crash.

The good news is that everything works again after doing 
sqlite> alter table test2 rename to test;

Renaming a table is not something that is absolutely needed, but it is not a 
totally odd idea either, and it should not lead to a program crash. It now looks 
like the crash comes from the metadata and the real table names getting out of 
sync, and a proper way to support renaming tables would also mean creating 
triggers in the database to update the metadata fields correctly.
Because renaming a table is currently certain to leave the GeoPackage 
unusable, it might be good to mention this in the Limitations section of the 
driver page http://www.gdal.org/ogr/drv_geopackage.html
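
To illustrate the synchronization problem, here is a minimal sketch that renames a table and updates the metadata rows in the same transaction. It is a simplification, not a safe recipe for real files: the two metadata tables are cut down to a couple of columns, the helper names are made up, and a real GeoPackage may hold further references to the table name (spatial index tables, triggers, extensions) that this does not touch.

```python
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def rename_with_metadata(conn, old, new):
    """Rename a table and its metadata rows together, or not at all."""
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE %s RENAME TO %s"
                     % (quote_ident(old), quote_ident(new)))
        conn.execute("UPDATE gpkg_contents SET table_name = ? "
                     "WHERE table_name = ?", (new, old))
        conn.execute("UPDATE gpkg_geometry_columns SET table_name = ? "
                     "WHERE table_name = ?", (new, old))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


def make_demo():
    """Build an in-memory stand-in with simplified metadata tables."""
    # isolation_level=None: transactions are controlled explicitly above.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
        CREATE TABLE gpkg_contents (table_name TEXT PRIMARY KEY, data_type TEXT);
        CREATE TABLE gpkg_geometry_columns (table_name TEXT, column_name TEXT);
        CREATE TABLE test (fid INTEGER PRIMARY KEY, geom BLOB, knro TEXT);
        INSERT INTO gpkg_contents VALUES ('test', 'features');
        INSERT INTO gpkg_geometry_columns VALUES ('test', 'geom');
    """)
    return conn


conn = make_demo()
rename_with_metadata(conn, "test", "test2")
print(conn.execute("SELECT table_name FROM gpkg_contents").fetchall())
```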

-Jukka-


 -----Original Message-----
 From: Even Rouault [mailto:even.roua...@mines-paris.org]
 Sent: 18 February 2014 11:07
 To: Rahkonen Jukka (Tike)
 Cc: 'gdal-dev@lists.osgeo.org'
 Subject: Re: [gdal-dev] GeoPackage fails after touching it with Spatialite-gui
 
 Hi Jukka,
 
 yes, this is the same issue. -dialect sqlite is quite powerful, but can only
 handle SELECT / UPDATE / DELETE, no other fancy stuff. The GPKG driver should
 just have its own ExecuteSQL() implementation.
 Currently I guess you could open a .gpkg with the sqlite3 binary. I wouldn't
 expect it to alter the application_id field (or perhaps only in the same
 situations where spatialite_gui does).
 
 Even
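
The expectation that plain SQLite leaves application_id alone when adding an index can be checked with a short sketch. It uses Python's sqlite3 module and an in-memory stand-in rather than a real .gpkg. The table and column names follow the example in this thread, and the helper function is made up for the demonstration.

```python
import sqlite3

GP10 = 0x47503130  # ASCII "GP10", expected by GeoPackage 1.0


def create_index_and_check(conn, table, column, index_name=None):
    """Create an attribute index and return the application id before and after."""
    before = conn.execute("PRAGMA application_id").fetchone()[0]
    if index_name is None:
        index_name = "%s_idx" % column
    conn.execute('CREATE INDEX "%s" ON "%s" ("%s")' % (index_name, table, column))
    conn.commit()
    after = conn.execute("PRAGMA application_id").fetchone()[0]
    return index_name, before, after


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA application_id = %d" % GP10)
conn.execute("CREATE TABLE test (fid INTEGER PRIMARY KEY, knro TEXT)")
print(create_index_and_check(conn, "test", "knro"))
```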
 


Re: [gdal-dev] Fast Pixel Access

2014-02-03 Thread Rahkonen Jukka (Tike)
Hi,

Perhaps, but in this game the rule was not to have any GIS servers. Personally 
I would rather consider WFS. It could send heights from single points but also 
a profile along a line or all values within a polygon.

-Jukka-

Brian Case [r...@winkey.org] wrote:

 -Jukka

 tileindex, mapserver, and the gdal wms driver



 On Mon, 2014-02-03 at 17:20 +, Jukka Rahkonen wrote:
 Luke Roth roth.luke at gmail.com writes:

 
  Another thing that might speed up access is setting the config
 option GDAL_DISABLE_READDIR_ON_OPEN = TRUE, either as an environment
 variable or on the command line.  That should help with GDAL reading the
 directory each time it opens a dataset.  I have an application which reads
 one value from each of a large number of datasets and setting this option
 made it run about 3 times faster.
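
Both ways of setting the option can be sketched as follows. The choice of gdallocationinfo as the utility, the dataset URL and the coordinates are assumptions for illustration. The sketch only builds the command and environment and does not run GDAL.

```python
import os

OPTION = "GDAL_DISABLE_READDIR_ON_OPEN"


def env_with_readdir_disabled(base_env=None):
    """Return a copy of the environment with the config option set."""
    env = dict(os.environ if base_env is None else base_env)
    env[OPTION] = "TRUE"
    return env


def location_query_command(dataset, x, y):
    """Build a gdallocationinfo call that passes the option via --config."""
    return ["gdallocationinfo", "--config", OPTION, "TRUE",
            "-valonly", dataset, str(x), str(y)]


# Hypothetical remote raster and pixel position.
cmd = location_query_command("/vsicurl/http://example.com/dem.tif", 100, 200)
print(" ".join(cmd))
# To run it, with GDAL installed:
#   subprocess.run(cmd, env=env_with_readdir_disabled(), check=True)
```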


 Hi,

 You are right. This config option makes GDAL skip reading the
 remote directory and saves a lot of bandwidth:

 VRT case:
 Bytes Received:            4 244 509 (of which the vrt file: 4 192 577)
 Sequence (clock) duration: 00:00:09.9996000
 Was:
 Bytes Received:            6 459 443
 Sequence (clock) duration: 00:00:37.813

 BigTIFF case:
 Bytes Received:            2 158 917
 Sequence (clock) duration: 00:00:04.4368000
 Was:
 Bytes Received:            4 374 137
 Sequence (clock) duration: 00:00:30.9192000
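
The figures above can be put side by side with a few lines of arithmetic. This sketch only uses the byte counts and durations quoted in this message.

```python
# (bytes received, seconds) without and with GDAL_DISABLE_READDIR_ON_OPEN
measurements = {
    "VRT": {"before": (6459443, 37.813), "after": (4244509, 9.9996)},
    "BigTIFF": {"before": (4374137, 30.9192), "after": (2158917, 4.4368)},
}


def summarize(data):
    """Return {case: (bytes saved, speed-up factor)} for each test case."""
    summary = {}
    for name, runs in data.items():
        bytes_before, secs_before = runs["before"]
        bytes_after, secs_after = runs["after"]
        summary[name] = (bytes_before - bytes_after,
                         round(secs_before / secs_after, 1))
    return summary


for name, (saved, speedup) in summarize(measurements).items():
    print(name, saved, speedup)
```

Both cases save about 2.2 MB, which suggests that the directory listing alone accounted for roughly that much traffic.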


 Conclusion:
 Both options are unsuitable for serious use, although amusing to play with.
 Reading the BigTIFF tile offset index (or whatever it is) seems to mean
 about 2 MB of compulsory payload traffic. Reading the VRT file means in this
 example 4 MB of payload. If this sort of network access to a large directory
 of raster files is important for someone, there should be a way to find
 the right raster file and the right data range in that file with a minimum
 number of bytes. Perhaps some kind of rtree indexed vrt file? First aid might
 be to keep the vrt file on the client side.

 -Jukka Rahkonen-






Re: [gdal-dev] New option INDEX_COLUMNS for SQLite and GPKG

2014-01-25 Thread Rahkonen Jukka (Tike)
Hi Even,

Good points. Vacuuming is also very important sometimes, but it does not 
suit a creation option at all. I will think about some text and examples 
that could be added to the SQLite/Spatialite performance hints.

-Jukka-


Even Rouault wrote:

 Hi Jukka,

 I'm a bit ambivalent about providing a creation option for that (although that
 would not be a heresy). For a few reasons:
 - it is relatively easy to create an attribute index manually (once you know
 the syntax)
 - it is not necessary to create it at layer creation time
 - it would potentially apply to all drivers that have a SQL engine behind

 But I agree that improving the documentation to advertise the usefulness of
 indexes and how to create them could be useful. Perhaps you would want to
 propose a modified version of the HTML page? (I think that the GPKG one could
 just point to the relevant section of the SQLite one, to avoid doc
 duplication)

 Even

 Hi,

 I guess that both SQLite/Spatialite and especially OGC GeoPackage will be
 used by people who consider them more as file formats than as databases.
 Such users may not know the power of indexes and they do not necessarily
 know anything about SQL and such.

 How about writing a few lines about the power and importance of indexes
 into the SQLite and GPKG driver pages? There could at least be one example
 of how to create a new index with ogrinfo, which must be a secret even
 for many advanced GDAL users. Better still, from the user's point of view,
 would be to implement a new layer creation option. I see a very similar
 case in SQLite/Spatialite
 COMPRESS_COLUMNS=column_name1[,column_name2, ...]: (Starting with GDAL
 1.10.0) A list of (String) columns that must be compressed
 The new -lco might be
 INDEX_COLUMNS=column_name1[,column_name2, ...]: (Starting with GDAL 2.0) A
 list of columns that will be indexed.

 I am not sure whether -lco should somehow also support composite indexes.
 Perhaps users who know what those (and unique indexes) mean can also use
 SQL and create them with ogrinfo if they can see an example in the
 documentation. Thus -lco could be made to accept only one column per
 index. The index name in the db could be set automatically to something
 like
 layer_name_column_name_idx. The SQL that this -lco should fire is simply

 CREATE INDEX table_name_column_name_idx
 ON table_name (column_name);
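
The proposal above can be sketched in a few lines. INDEX_COLUMNS is only a suggested option here, not an existing GDAL feature, so the function emulates what such an option might do: one single-column index per listed column, named layer_name_column_name_idx. The layer and column names in the demonstration are made up.

```python
import sqlite3


def index_statements(layer_name, index_columns):
    """Turn an INDEX_COLUMNS-style value ("col1,col2") into CREATE INDEX SQL."""
    statements = []
    for column in (c.strip() for c in index_columns.split(",")):
        if not column:
            continue  # ignore empty entries such as a trailing comma
        index_name = "%s_%s_idx" % (layer_name, column)
        statements.append('CREATE INDEX "%s" ON "%s" ("%s")'
                          % (index_name, layer_name, column))
    return statements


# Demonstration on an in-memory table standing in for a layer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (fid INTEGER PRIMARY KEY, knro TEXT, name TEXT)")
for sql in index_statements("parcels", "knro,name"):
    conn.execute(sql)
print(conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name").fetchall())
```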

 -Jukka Rahkonen-