Re: [gdal-dev] Simple schema support for GeoJSON
Hi,

I have no use for this feature myself, but from reading various mailing lists and forums I have learned that many people think it is always a good idea to read data, for example from WFS services, as GeoJSON instead of GML. I can easily imagine that the guess-by-data method will cause trouble if they make subsequent requests to the same service. For example, strings that consist only of digits but may contain leading zeroes can end up as either integers or strings, if the leading zeroes are interpreted correctly at all. The same goes for floats that do not always contain decimals, or list attributes that sometimes have only zero or one member.

An embedded schema feels optimal because it would always travel together with the data, and we have all probably lost .tfw or .prj files at some point.

-Jukka-

Even Rouault wrote:

Jukka,

Data type guessing implemented in the OGR GeoJSON driver is quite natural hopefully. A whole scan of the GeoJSON file is made and the following rules are applied:

- if an attribute has integer-only content -> Integer
- if an attribute has an array of integer-only content -> IntegerList
- if an attribute has integer or floating point content -> Real
- if an attribute has an array of integer or floating point content -> RealList
- if an attribute has an array of anything else content -> StringList
- otherwise -> String

With RFC 50 and other pending improvements in the driver:

- if an attribute has boolean-only content -> Integer(Boolean)
- if an attribute has an array of boolean-only content -> IntegerList(Boolean)
- if an attribute has date-only content -> Date
- if an attribute has time-only content -> Time
- if an attribute has datetime or date content -> DateTime

I'm not sure we want to invent a .jsont format, but if you download http://svn.osgeo.org/gdal/trunk/gdal/swig/python/samples/ogr2vrt.py and run:

python ogr2vrt.py "http://demo.opengeo.org/geoserver/wfs?service=wfs&version=1.0.0&request=getfeature&typename=topp:states&outputformat=json" test.vrt

this will create a VRT with the default schema, which you can easily edit.

Note: as with OGR SQL CAST, this is post processing. So if the guess done by the GeoJSON driver leads to a loss of information, you cannot recover it. Hopefully the implemented rules will not lead to information loss.

A better approach would be to have the schema embedded in a JSON way in the GeoJSON file itself. That could be an evolution of the format, but I'm not sure this would be really popular, given JSON/GeoJSON is heavily used by NoSQL approaches...

Hum, doing a quick search, I just found http://json-schema.org/ that appears to be an IETF draft. It doesn't look like the schema is embedded in the data file itself. There's also GeoJSON-LD that might be a bit related: https://github.com/geojson/geojson-ld

CC'ing Sean in case he has thoughts on this.

Even

Hi,

I wonder if GDAL could have some simple and relatively user-friendly way of defining a schema for GeoJSON data. The GeoJSON driver seems to guess the data types of attributes in some undocumented way, but users could have better knowledge about the desired schema. I know I can control the data type by using OGR SQL and CAST as in

ogrinfo -sql "select cast(EMPLOYED as float) from OGRGeojson" states.json -so

However, perhaps GeoJSON is popular enough to deserve an easier way of writing a schema. First I thought that it would be enough to copy the csvt text file mechanism from the GDAL CSV driver, http://www.gdal.org/drv_csv.html. However, the csvt file is a plain list of types which are applied to the attributes in the same order as they appear in the text file:

Integer(5),Real(10.7),String(15)

For GeoJSON it would feel more user friendly to include the attribute names in the list, somehow like

population;Integer(5),area;Real(10.7),name;String(15)

This would make it easier for users to write a valid jsont file. A list with attribute names could perhaps also help GDAL, because the features in a GeoJSON file do not necessarily have the same attributes.

As an example, this is the right schema for a WFS feature type, captured from http://demo.opengeo.org/geoserver/wfs?service=wfs&version=1.0.0&request=describefeaturetype&typename=topp:states

name="the_geom" type="gml:MultiPolygonPropertyType"
name="STATE_NAME" type="xsd:string"
name="STATE_FIPS" type="xsd:string"
name="SUB_REGION" type="xsd:string"
name="STATE_ABBR" type="xsd:string"
name="LAND_KM" type="xsd:double"
name="WATER_KM" type="xsd:double"
name="PERSONS" type="xsd:double"
name="FAMILIES" type="xsd:double"
name="HOUSHOLD" type="xsd:double"
name="MALE" type="xsd:double"
name="FEMALE" type="xsd:double"
name="WORKERS" type="xsd:double"
name="DRVALONE" type="xsd:double"
name="CARPOOL" type="xsd:double"
name="PUBTRANS" type="xsd:double"
name="EMPLOYED" type="xsd:double"
name="UNEMPLOY" type="xsd:double"
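The six basic guessing rules quoted in this message can be sketched in a few lines of Python. This is only an illustration of the rules as listed in the thread, not the driver's actual code: the function names (`guess_type`, `guess_schema`) are made up for the example, the RFC 50 additions (boolean, date, time) are left out, and the handling of arrays that are always empty is an assumption, since the thread does not say what the driver does with them.

```python
import json


def _kind(value):
    """Classify one JSON scalar as 'int', 'real' or 'other'."""
    # bool is a subclass of int in Python, so it must be ruled out first.
    if isinstance(value, bool):
        return "other"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "real"
    return "other"


def guess_type(values):
    """Apply the six basic rules to every value seen for one attribute."""
    values = [v for v in values if v is not None]
    if values and all(isinstance(v, list) for v in values):
        kinds = {_kind(item) for v in values for item in v}
        # Assumption: arrays that are always empty end up as IntegerList here.
        if kinds <= {"int"}:
            return "IntegerList"
        if kinds <= {"int", "real"}:
            return "RealList"
        return "StringList"
    kinds = {"other" if isinstance(v, list) else _kind(v) for v in values}
    if kinds == {"int"}:
        return "Integer"
    if kinds and kinds <= {"int", "real"}:
        return "Real"
    return "String"


def guess_schema(geojson_text):
    """Scan the whole FeatureCollection and return {attribute: type}."""
    seen = {}
    for feature in json.loads(geojson_text)["features"]:
        for name, value in (feature.get("properties") or {}).items():
            seen.setdefault(name, []).append(value)
    return {name: guess_type(vals) for name, vals in seen.items()}


# Two hypothetical responses from the same service: in the second one the
# only LAND_KM value happens to have no decimals.
first = '{"features": [{"properties": {"LAND_KM": 143.5, "STATE_FIPS": "01"}}]}'
second = '{"features": [{"properties": {"LAND_KM": 144, "STATE_FIPS": "01"}}]}'
print(guess_schema(first))
print(guess_schema(second))
```

The two sample responses show the concern about subsequent requests: the same attribute is guessed as Real in one response and as Integer in the other, purely because of the values that happened to be returned.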
Re: [gdal-dev] Simple schema support for GeoJSON
Even Rouault wrote:

On Friday 21 November 2014 15:35:43, Rahkonen Jukka (Tike) wrote:

Hi, I have no use for this feature myself but by reading various mailing lists and forums I have learned that many people consider it is always a good idea to read data for example from WFS services as GeoJSON instead of GML.

Because it consumes less bandwidth?

I suppose rather that they generalize their good experiences with GeoJSON in browsers to mean that GML is poor for everything. I found an interesting site, http://jsperf.com/openlayers-format-reading-speed/2, and I imagine that I can indeed see a meaningful speed difference there.

-Jukka-

___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev
Re: [gdal-dev] Simple schema support for GeoJSON
Hi,

As I wrote, I was motivated to send my first mail because I have seen that people quite often use GeoJSON for delivering geospatial data as data, to be saved on disk and used like shapefiles, GML etc. As a result you get stuff like this:

http://demo.opengeo.org/geoserver/wfs?service=wfs&version=1.0.0&request=getfeature&typename=topp:states&outputformat=application/json

You wrote, and I agree, that "XML and JSON have very different strengths and use cases". However, people do what they want, and I do feel that GeoJSON will be used for use cases where XML could be stronger, for example as the only supported format in some download services.

About the nonsensical 4-field schema: it is a little bit violent, but it is just what nearly everybody who uses OpenStreetMap data is doing all the time. OSM features are pushed into the traditional simple feature model and a set of tags is converted to attributes in a fixed schema. There are lots of null fields in the data, and even if that is in a way nonsensical, it is also practical because it makes it possible to use osm2pgsql, PostGIS and Mapnik for rendering.

I am so fixated on consuming data that I was not thinking at all about how to write GeoJSON with GDAL. I was just thinking that if some data are only available as GeoJSON, how could users convert them to PostGIS etc. so that the data types of the attributes will be the same as in the original data. Because GeoJSON does not carry the data types as a payload, I suppose that the current guess-the-datatype approach is the best starting point. The workaround with VRT that Even suggested is good for fine tuning, and cast with SQL works as well. The correct data types may still be somewhat uncertain, but perhaps those who maintain such services will announce the structure of their data on their web pages if they feel that it is important, for example if they are expecting data updates from users.

When it comes to WFS, it seems to be an easy case because the XML schema can be reused as the GeoJSON schema.

-Jukka Rahkonen-

Sean Gillies s...@mapbox.com wrote:

Hi Even, Jukka,

While the OGC service architecture is heavily dependent on schemas, OGR type schemas are not *generally* useful for GeoJSON. Consider the following abbreviated feature collection:

"features": [ {"properties": {"a": 0, "b": "lol"}, ...}, {"properties": {"c": "2014-11-21", "d": "wut"}, ...} ]

It has two features and they are distinctly different types. A schema that says these features have 4 fields would be nonsensical.

There are a bunch of different JSON schema approaches and none of them seem to have any traction. https://github.com/json-schema/json-schema for example looks to be stalled. I think the lack of traction reflects some deeper reality: that XML and JSON have very different strengths and use cases and that attempts to XML-ize JSON by adding schemas will always eventually run out of steam.

For OGR to write schemas into GeoJSON would be a mistake. They could be misleading and because there will never (as far as I can tell) be consensus in the JSON community on the right form of schema, anything OGR implemented would end up being a loser.
Re: [gdal-dev] LZW Compression on geotiffs
Even Rouault wrote:

On Thursday 02 October 2014 01:06:57, David Strip wrote:

On 10/1/2014 12:02 PM, Jukka Rahkonen wrote:

For comparison:
TIFF as zipped: 347 MB
TIFF into PNG: 263 MB

If I have understood right, both zip and png use the deflate algorithm, so there might be some room for improving deflate compression in GDAL.

I was curious how png could achieve so much better compression if it uses the same deflate algorithm. I wouldn't think different implementations would account for so much improvement. It turns out the png compression uses a filtering step ahead of compression. This is explained here. The filter is similar to differential pulse code modulation, in which the pixel is represented as the difference from the pixels to the left, left upper diagonal, and above. This typically reduces the magnitude of the value to something close to zero, making the encoding more efficient.

True, a way to improve things might be to specify -co PREDICTOR=2. Should apply to both LZW and DEFLATE. This is one of the filters that might be used by PNG, except that PNG has different filters, so it will eventually beat TIFF deflate.

Not a bad suggestion.

Original: 424 MB
DEFLATE without predictor: 380 MB
DEFLATE with -co predictor=2: 280 MB

-Jukka Rahkonen-
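The effect of a differencing step ahead of deflate, as described in this thread, can be reproduced with nothing but Python's `zlib`. The sketch below is only an illustration of the idea (storing each sample as the difference from its left neighbour, which is what horizontal differencing does); it is not GDAL, libtiff or libpng code, and the synthetic data and all function names are made up for the example.

```python
import random
import zlib


def horizontal_difference(row):
    """Keep the first sample, then store each sample minus its left neighbour (mod 256)."""
    return bytes([row[0]] + [(row[i] - row[i - 1]) % 256 for i in range(1, len(row))])


def undo_horizontal_difference(row):
    """Rebuild the original samples with a running sum (mod 256)."""
    out = [row[0]]
    for delta in row[1:]:
        out.append((out[-1] + delta) % 256)
    return bytes(out)


def make_smooth_rows(width=1024, height=64, seed=0):
    """Synthetic 8-bit rows in which neighbouring samples differ by at most 2."""
    rng = random.Random(seed)
    rows = []
    for _ in range(height):
        value = rng.randrange(256)
        row = []
        for _ in range(width):
            value = (value + rng.randint(-2, 2)) % 256
            row.append(value)
        rows.append(bytes(row))
    return rows


rows = make_smooth_rows()
raw = b"".join(rows)
predicted = b"".join(horizontal_difference(r) for r in rows)

plain_size = len(zlib.compress(raw, 9))
predicted_size = len(zlib.compress(predicted, 9))
print("deflate only:        ", plain_size, "bytes")
print("difference + deflate:", predicted_size, "bytes")
```

On smoothly varying data the differenced stream contains only a handful of distinct small values, which is why deflate does better on it. Real imagery will not behave exactly like this synthetic random walk, so the ratio seen here says nothing about the 380 MB versus 280 MB figures reported above.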
Re: [gdal-dev] Binary Predicates in SQLite SQL dialect
Even Rouault wrote:

On Tuesday 30 September 2014 23:20:12, Jukka Rahkonen wrote:

Even Rouault <even.rouault at spatialys.com> writes:

On Tuesday 30 September 2014 20:20:14, Andre Vautour wrote:

...

I would build SpatiaLite with GEOS support, but unfortunately its LGPL licensing is too restrictive for our application. So, would it make sense to change the logic in OGRSQLiteRegisterSQLFunctions to account for the case of SpatiaLite being built without GEOS?

Andre,

That makes sense. I guess that can only be checked at runtime though, probably by issuing a ST_Intersects() and checking the error code. I'd note that the spatial predicates in OGR geometry are also based on GEOS. Except OGRGeometry::Intersects() that has a simplified implementation based on bounding box intersection when GEOS is not available.

Hi,

From the end user's point of view this example feels unpleasant. If Intersects gives different results with the OGR and SQLite dialects because of different implementations, a somewhat clever user can handle it now by selecting the dialect.

Actually the OGR SQL dialect doesn't directly support spatial predicates. That can only be done through the SetSpatialFilter() API that is typically triggered with the -spat option of ogrinfo/ogr2ogr. What was discussed here is SQLite user functions installed by OGR. Currently, even if Spatialite isn't available, there's an implementation of the most common spatial predicates that goes through the OGRGeometry methods. Kind of a poor man's Spatialite.

But if selecting the SQLite dialect might still lead to use of OGRGeometry::Intersects(), depending on how Spatialite has been built, I would say that an end user can't manage the situation. How about renaming it to OGR_Intersects in such a case?

I'm not sure that would help for consistency, because OGRGeometry::Intersects() can be equivalent to Spatialite ST_Intersects() when GDAL is compiled with GEOS support. So OGR_Intersects would have different behaviour. Perhaps that should rather be an OGR_Envelope_Intersects() that is guaranteed to always do bounding box intersection.

I don't know if we must worry too much about that, as most users should have GDAL and Spatialite binaries with GEOS support. I'd say that only people who know what they are doing will build them without GEOS support, in which case they must be aware of the implications.

You are right and I feel rather relaxed. What I was worried about was that people who know what they are doing when they build GDAL can package the result into applications which they deliver to poor people who have no idea about all that. You read the QGIS users mailing lists and have surely noticed how users ask why QGIS sometimes, on some computers, behaves differently from how it used to behave because of differences in the GDAL used for builds. You know, different SQLite and Spatialite versions as a typical example, or the variety of JPEG2000 drivers.

-Jukka-
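Why a bounding-box test and a true intersection test can disagree is easy to show with two line segments. The following is a minimal sketch in plain Python; it is not how OGR or GEOS implement the predicates, and the function names are hypothetical.

```python
def envelope(points):
    """Bounding box (minx, miny, maxx, maxy) of a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def envelopes_intersect(a, b):
    """True when the two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _within_box(p, q, r):
    """For r collinear with p-q: does r lie between p and q?"""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(p1, p2, p3, p4):
    """Exact test for segment p1-p2 against segment p3-p4."""
    d1, d2 = _orient(p3, p4, p1), _orient(p3, p4, p2)
    d3, d4 = _orient(p1, p2, p3), _orient(p1, p2, p4)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # each segment straddles the line through the other
    if d1 == 0 and _within_box(p3, p4, p1):
        return True
    if d2 == 0 and _within_box(p3, p4, p2):
        return True
    if d3 == 0 and _within_box(p1, p2, p3):
        return True
    if d4 == 0 and _within_box(p1, p2, p4):
        return True
    return False


diagonal = [(0, 0), (10, 10)]
below = [(6, 0), (10, 3)]      # stays under the diagonal
crossing = [(0, 10), (10, 0)]  # crosses it at (5, 5)

for name, other in (("below", below), ("crossing", crossing)):
    by_envelope = envelopes_intersect(envelope(diagonal), envelope(other))
    exact = segments_intersect(*diagonal, *other)
    print(name, "envelope:", by_envelope, "exact:", exact)
```

The segment called `below` lies entirely under the diagonal, yet its bounding box sits inside the diagonal's bounding box, so an envelope-only predicate reports an intersection where the exact test does not. This is the kind of discrepancy that prompted the suggestion of a separately named OGR_Envelope_Intersects().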
Re: [gdal-dev] Wrong statistics from gdalinfo
Jukka,

I suspect the file has been updated after the first statistics were computed, or perhaps they were computed in -approx_stats mode. gdalinfo -stats doesn't recompute statistics if they are already stored in the file.

All right, the statistics are what they are because the big GeoTIFF was created from this VRT file: http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt

The wrong statistics seem to be stored in the VRT. I do not know how the statistics got into the VRT. The VRT was created with gdalbuildvrt.

-Jukka-
Re: [gdal-dev] [OSGeo-Standards] Idea: GeoTIFF box in JPEG to addgeoreferencing
Even Rouault wrote:

A fundamental question that has not been much discussed in this thread is: do people really see interest in capturing georeferencing in popular image formats like PNG or JPG? My thinking is that it could be potentially useful in a few situations:
* when a WMS server returns a PNG or JPG.
* when generating tile caches for WMTS / TMS.

If an image is a map or an aerial image, it is worth more with georeferencing than without, or of equal value for some users, but not less for anybody. WMS is a good example (we have GeoTIFF output format as an option just because of that), but there are lots of static map services on the web which deliver just images of maps but which could, with minimal effort, attach georeferencing to the maps so that they could be loaded into a GPS device to act as a moving map background, also for offline usage.

I think that use cases for PNG and JPG are mostly very simple. However, a world file is not enough; the projection code should also be advertised. And support for georeferencing by ground control points would suit scanned maps and digital photos of maps very well.

Inserting a minimal TIFF file inside an image in another format may be a dirty trick, but at least it works pretty well with JPEG2000.

-Jukka Rahkonen-
Re: [gdal-dev] Idea: GeoTIFF box in JPEG to add georeferencing
Hi,

This is just a small detail and I may be wrong, but I fear that the axisLabels element does not universally remove the need to go to the database, because there are systems which use X and Y as axis names while the meaning of X and Y can differ. For example, for EPSG:3857 X means Easting, but for the old Finnish KKJ it means Northing. I may also be wrong in saying that X is an axis name, because in the following EPSG reports the names seem to be Easting and Northing, and X and Y are abbreviations. However, looking at the EPSG:3857 example from the package https://portal.opengeospatial.org/files/?artifact_id=50118, just the ambiguous abbreviations are used as axisLabels:

<gml:axisLabels>x y</gml:axisLabels>

Am I right or am I wrong?

http://epsg-registry.org/report.htm?type=selection&entity=urn:ogc:def:crs:EPSG::2393&reportDetail=short&style=urn:uuid:report-style:default-with-code&style_name=OGP%20Default%20With%20Code&title=Finnish%20KKJ

http://epsg-registry.org/report.htm?type=selection&entity=urn:ogc:def:crs:EPSG::3857&reportDetail=short&style=urn:uuid:report-style:default-with-code&style_name=OGP%20Default%20With%20Code&title=EPSG:3857

Regards,

-Jukka Rahkonen-

Peter Baumann wrote:

Hi all,

just saw this, thought I'd chime in being the editor of GMLCOV :) Any questions, I'll gladly try to answer, I'm just on the road currently, so expect a delay of a day or so. Trying to respond to what has been raised below:

On 05/12/2014 11:18 PM, Even Rouault wrote:

On Monday 12 May 2014 23:05:21, Jukka Rahkonen wrote:

Even Rouault <even.rouault at mines-paris.org> writes:

... In light of this, it may be better to use an xml or textual representation and embed it inside an XMP block, which is supported for many formats[1]. Also it would allow for easier human-reading.

Yes, that's one possibility. Which comes back to GMLJP2 unless there are other standards... I'd want to build on something that is a real standard or a de-facto standard, but not reinvent everything from scratch.

What GMLJP2 gives as a bonus is the axis order trouble

actually, no. GMLJP2 relies on GMLCOV which defines axis order. Each coverage has a Native CRS which is defined via an OGC URI (which, in the case of EPSG, refers to the OGP maintained database). In the CRS definition the axis is given unambiguously. If you don't want to go to that database, use the axisLabels attribute in the Envelope, it lists axes in their proper sequence.

and not totally clear interpretation if origin is in the centre (probably it is) or in the corner of a pixel,

naively, GMLCOV follows the pixel-in-center interpretation. However, encodings may override this, such as GeoTIFF (see the pertaining adopted spec [1]).

[1] https://portal.opengeospatial.org/files/?artifact_id=50118

and rectified grid does not support ground control points.

yes, because it is rectified. If you want more degrees of freedom, do not use RectifiedGridCoverage but ReferenceableGridCoverage. In conjunction with GML 3.3 (a compatible enhancement of GML 3.2.1) this gives you irregular and warped grids. We're not completely done, though, and would be glad about your support in some advanced issues. Note that this work is all voluntary and relies on some project financing to be done. Up to now, EC and ESA have been generous, and so we are going ahead step by step.

Thus GMLJP2 is at least not better in everything, it also has drawbacks.

Hi Jukka,

Yes, for all your above reasons, I would prefer to avoid it.

so which ones are remaining, following the clarification you mention below?

cheers,
Peter

Although the axis order trouble (the one of EPSG) and interpretation of origin (pixel center) should be mostly clarified with the revised version. As far as ground control points are concerned, I've not really looked at the capabilities of GMLCov, so perhaps there's something in it for that. Otherwise, that would indeed be a drawback.

-Jukka Rahkonen-

--
Dr. Peter Baumann
- Professor of Computer Science, Jacobs University Bremen
  www.faculty.jacobs-university.de/pbaumann
  mail: p.baum...@jacobs-university.de
  tel: +49-421-200-3178, fax: +49-421-200-493178
- Executive Director, rasdaman GmbH Bremen (HRB 26793)
  www.rasdaman.com, mail: baum...@rasdaman.com
  tel: 0800-rasdaman, fax: 0800-rasdafax, mobile: +49-173-5837882
Si forte in alienas manus oberraverit hec peregrina epistola incertis ventis dimissa, sed Deo commendata, precamur ut ei reddatur cui soli destinata, nec preripiat quisquam non sibi parata. (mail disclaimer, AD 1083)
Re: [gdal-dev] Add a new performance hint for Spatialite
Hi,

Sorry, I must not have emphasized enough that the hint is only valid for appending data with subsequent ogr2ogr commands:

ogr2ogr -f sqlite -dsco spatialite=yes mtk_tos.sqlite -dim 2 -lco spatial_index=no /vsizip/e:\mtk_tos\etrs89\gml\K32.zip tieviiva -gt 65536
ogr2ogr -f sqlite -append mtk_tos.sqlite -dim 2 /vsizip/e:\mtk_tos\etrs89\gml\K34.zip tieviiva -gt 65536
... repeat the latter command for 113 more GML files ...

I am not sure if this is a common use case everywhere. However, our National Land Survey delivers vector data divided by map sheets, which means that each countrywide dataset contains hundreds or thousands of GML files.

Even Rouault wrote:

Jukka,

I'm very surprised that you need to do that explicitly, since the driver should already do that by default. This is something we discovered last year or the year before when you were experimenting with OSM -> Spatialite conversions. Since OGR 1.10, the spatial index is created when the datasource is closed. Look at this debug trace:

$ ogr2ogr -f sqlite /vsimem/out.sqlite ../autotest/ogr/data/poly.shp -dsco spatialite=yes --debug on
OGR: OGROpen(../autotest/ogr/data/poly.shp/0x69a3d0) succeeded as ESRI Shapefile.
SQLITE: SpatiaLite v4 DB found !
OGR_SQLITE: exec(CREATE TABLE 'poly' ( OGC_FID INTEGER PRIMARY KEY))
OGR_SQLITE: exec(DELETE FROM geometry_columns WHERE f_table_name = 'poly')
OGR_SQLITE: exec(SELECT AddGeometryColumn('poly', 'GEOMETRY', 325834, 'POLYGON', 2))
OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'area' FLOAT)
OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'eas_id' FLOAT)
OGR_SQLITE: exec(ALTER TABLE 'poly' ADD COLUMN 'prfedea' VARCHAR(16))
OGR_SQLITE: BEGIN Transaction
OGR_SQLITE: prepare(INSERT INTO 'poly' (GEOMETRY,area,eas_id,prfedea) VALUES (?,?,?,?))
OGR_SQLITE: COMMIT Transaction
OGR2OGR: 10 features written in layer 'poly'
OGR_SQLITE: exec(SELECT CreateSpatialIndex('poly', 'GEOMETRY'))
SQLITE: Error no such table: layer_statistics
OGR: Unloading VirtualOGR module
Shape: 10 features read on layer 'poly'.

Could you try adding --debug on to your ogr2ogr command line and see when CreateSpatialIndex() is called?

Even

Hi,

I took timings of adding 115 GML files (548 MB together, 3.2 million linestrings) into a Spatialite table. With default settings the table gets initialized with a spatial index, which makes the following inserts slower. The alternative is to create the table without a spatial index, append all the data first, and as a last step create the spatial index for the finished table.

With spatial index on:
Append + index: 71 minutes

With spatial index off:
Append: 9 minutes
Create spatial index: 6 minutes
Total: 15 minutes

I was rather happy with the initial conversion speed until I made this test, which revealed that creating the spatial index as a final step made the whole process more than four times faster! This way both the data table and the spatial index are probably in contiguous chunks in the SQLite data file and there is no need for a post-process VACUUM. Vacuuming in SQLite is rather slow, and for this 1.3 GB database it takes more than 4 minutes to run.

Suggestion: add a new performance hint on the page http://www.gdal.org/ogr/drv_sqlite.html

If many source files are to be collected into the same Spatialite table, it can be much faster to initialize the table without a spatial index by using -lco SPATIAL_INDEX=NO and to create the spatial index with a separate command after all the data are appended. The spatial index can be created with the ogrinfo command

ogrinfo db.sqlite -sql "SELECT CreateSpatialIndex('table_name','geometry_column_name')"

Perhaps it could also be mentioned as a performance hint that VACUUM can be run from ogrinfo as well:

ogrinfo db.sqlite -sql "VACUUM"

-Jukka Rahkonen-
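For a dataset split into hundreds of files, the workflow suggested in this thread (first file creates the table without a spatial index, remaining files are appended, the index is built once at the end) is easy to script. The sketch below only assembles the command lines and prints them; it does not run anything. The options are the ones quoted in the messages above, while the helper name `build_load_commands` and the short file list are made up for the example, and the geometry column name `GEOMETRY` is an assumption taken from the debug trace.

```python
import shlex


def build_load_commands(db_path, source_files, layer, geometry_column="GEOMETRY"):
    """Return the argv lists for: create without index, append the rest, index last."""
    if not source_files:
        raise ValueError("at least one source file is needed")
    first, rest = source_files[0], source_files[1:]
    commands = [[
        "ogr2ogr", "-f", "SQLite", "-dsco", "SPATIALITE=YES",
        "-lco", "SPATIAL_INDEX=NO", "-dim", "2", "-gt", "65536",
        db_path, first, layer,
    ]]
    for source in rest:
        commands.append([
            "ogr2ogr", "-f", "SQLite", "-append", "-dim", "2", "-gt", "65536",
            db_path, source, layer,
        ])
    # The spatial index is created once, after all data has been appended.
    sql = "SELECT CreateSpatialIndex('%s', '%s')" % (layer, geometry_column)
    commands.append(["ogrinfo", db_path, "-sql", sql])
    return commands


# Hypothetical map sheet archives; the real dataset in the thread has 115 files.
sheets = ["/vsizip/K32.zip", "/vsizip/K34.zip", "/vsizip/K41.zip"]
for argv in build_load_commands("mtk_tos.sqlite", sheets, "tieviiva"):
    print(shlex.join(argv))
```

Each printed line could be passed to `subprocess.run` once the paths are adjusted. Whether deferring the index gives the same gain on other data is not something this sketch can show; the 71 versus 15 minute figures come from the one test reported above.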
Re: [gdal-dev] Something wrong with writing a big raster into PDF
Hi Even,

Yes, that way I can perhaps create an all-white PDF map a bit faster, but with no other improvement, unfortunately.

-Jukka-

Even Rouault wrote:

Hi Jukka,

did you try with -co TILED=YES ?

Even

Hi,

I can't make usable PDF files from some original maps and I do not understand what goes wrong. I am testing on Windows 7 with GDAL-dev from gisinternals, both 32 and 64 bit. The images are like this one: http://kartat.kapsi.fi/files/taustakarttasarja/taustakartta_80/5m/etrs89/png/UM5L.png

The command can be simplified to

gdal_translate -of pdf UM5L.png UM5L.pdf

Everything seems to go all right and gdalinfo shows normal information for the PDF. However, if I open the UM5L.pdf file with Acrobat Reader it does not show a map. Everything is just pure white.

I planned to check what happens if I read the map back from the PDF with

gdal_translate -of gtiff um5l.pdf um5l.tif

After fifteen minutes I estimate that the output will be ready after an hour or so, which is far too slow for a Core i7 CPU and 8 GB of RAM.

I had another try with the same binaries and a small PNG file (2000x2000 pixels), and conversion into PDF was fast and successful. Is there perhaps something that does not scale up properly when the raster size gets bigger?

-Jukka Rahkonen-
Re: [gdal-dev] Something wrong with writing a big raster into PDF
Even Rouault wrote:

On Friday 07 March 2014 21:11:40, Even Rouault wrote:

On Friday 07 March 2014 14:20:13, Rahkonen Jukka (Tike) wrote:

Hi Even, Yes, that way I can perhaps create an all-white PDF map a bit faster but with no other improvement, unfortunately.

I've tried gdal_translate UM5L.png UM5L.pdf -of pdf on my Linux workstation with the following viewers:

I forgot to mention the -co TILED=YES which was actually used and makes a dramatic improvement for the xpdf and QGIS use cases, compared to the non-tiled mode.

- Adobe Reader 9.4.2 Linux: blank image
- Okular, KDE-based PDF viewer, based on the poppler library: can display a fit-to-page overview in a few seconds, but not the full resolution image since it tries to allocate a 15360x15360 buffer
- Evince, GNOME-based PDF viewer, based on the poppler library, uses cairo for display: blank image with error message "cairo context error: out of memory"
- xpdf, the viewer whose code was forked to build the poppler library: can display the PDF at full resolution, and in reasonable time.
- qgis (based on GDAL PDF driver, and poppler backend): the initial overview computation is really really slow. Roughly half an hour. Because there's currently no optimization in the GDAL PDF reader to get a lower resolution image. Could potentially be dramatically improved by asking for a rendering at lower DPI for overview levels, instead of querying at best DPI and then downsampling. But once that 32x32 overview is computed, if you go to 100% scale and pan, the refresh time is around 1 second after each pan.
- command line rendering (based on GDAL PDF driver, and poppler backend): gdal_translate UM5L_tiled.pdf out.tif -co TILED=YES: 1 minute 20 sec (less if you specify a higher value than the default for GDAL_CACHEMAX)

I've also tried generating a JPEG2000 compressed PDF and Adobe Reader is not happier.

Conclusion: most PDF viewers assume that a PDF page can fit in memory and don't use a tiling strategy to display it.

Even

I made some tests too by increasing the output size step by step. On my laptop running Windows 7 64 bit with 8 GB RAM and 32-bit Acrobat Reader XI, the first failing PDF was made with

gdal_translate -of png -outsize 80% 80% ul4l.png ul4l_80p.png -co tiled=yes

Input file size is 19200, 19200. Output size is then 15360, 15360.

Acrobat Reader shows an empty map, but the task manager lists only a nominal memory consumption of 24 MB for the AcroRd32.exe process.

According to the Reader file properties menu, the page size of the failing one is 5080x5080. For the version reduced to 60%, which does show the map, the PDF page size is 4064x4064. Now I wonder if this is really about a too big image or about something else. 15360 by 15360 pixels is not extremely much. Is there some secret 200 inch limit hiding in the background? See http://indesignsecrets.com/beware-200-limit-for-pdfs.php. 5080 mm is exactly 200 inches.

-Jukka Rahkonen-
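The page sizes reported by Reader can be checked with a little arithmetic. The sketch below assumes that the PDF was written at 72 dots per inch; that value is not stated in the thread, but it reproduces the 4064 mm figure for the 60% version exactly. The 200 inch limit is the one described in the linked article, and the names are made up for the example.

```python
MM_PER_INCH = 25.4
ASSUMED_DPI = 72.0         # assumption: resolution used when the PDF was written
READER_LIMIT_INCHES = 200  # page side limit described in the linked article


def page_side_mm(pixels, dpi=ASSUMED_DPI):
    """Physical length of one page side for a raster side of `pixels`."""
    return pixels / dpi * MM_PER_INCH


def exceeds_limit(pixels, dpi=ASSUMED_DPI):
    """True when the page side would be longer than the 200 inch limit."""
    return pixels / dpi > READER_LIMIT_INCHES


SOURCE_SIDE = 19200  # pixels, from "Input file size is 19200, 19200"
for percent in (60, 80, 100):
    side = SOURCE_SIDE * percent // 100
    print(percent, side, round(page_side_mm(side), 1), exceeds_limit(side))
```

Under that assumption the 60% version (11520 pixels) gives a 160 inch, 4064 mm page, which is what Reader reports, while the 80% version (15360 pixels) would need about 213 inches. Reader showing 5080 mm, that is 200 inches, for the failing file is consistent with the page size being capped at the limit, which supports the suspicion that the blank page comes from the page dimensions rather than from memory use.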
Re: [gdal-dev] GeoPackage fails after touching it with Spatialite-gui
Hi,

Is this the same issue with how the GPKG driver handles ExecuteSQL()? If it is, I will file a ticket, because a database without indexes is missing a lot. Or is there some other way to create an index in a GPKG with GDAL? Alessandro Furieri has already said that using spatialite-gui/tools on GeoPackages is currently not safe.

ogrinfo test.gpkg -dialect sqlite -sql "create index knro_idx on test (knro)"
INFO: Open of `test.gpkg' using driver `GPKG' successful.
ERROR 1: In ExecuteSQL(): sqlite3_prepare(create index knro_idx on test (knro)): no such table: main.test

-Jukka-

Even Rouault wrote:

> Quoting Jukka Rahkonen jukka.rahko...@mmmtike.fi:
>
>> Even Rouault even.rouault at mines-paris.org writes:
>>
>>> Jukka, I highly suspect that the ALTER TABLE must imply a rewriting of the file by spatialite/spatialite-gui, and that in doing so it does not preserve the application id (4 bytes in the header of the sqlite file), which the geopackage specification requires to be set to a particular value. Consequently the geopackage driver later fails when checking the signature.
>>
>> I have not yet found a way to reproduce the crash reliably, but while testing I found something else. Does the following mean that the GPKG file is not a totally valid SQLite file, or just that GDAL gets puzzled because there are separate implementations for SQLite/Spatialite and GPKG?
>>
>> Step 1: create plain SQLite and GPKG databases
>>
>> C:\data>ogr2ogr -f sqlite test.sqlite temp.shp
>> C:\data>ogr2ogr -f gpkg test.gpkg temp.shp
>>
>> Step 2: try to rename a table
>>
>> C:\data>ogrinfo test.sqlite -dialect sqlite -sql "alter table temp rename to temp2"
>> INFO: Open of `test.sqlite' using driver `SQLite' successful.
>>
>> C:\data>ogrinfo test.gpkg -dialect sqlite -sql "alter table temp rename to temp2"
>> INFO: Open of `test.gpkg' using driver `GPKG' successful.
>> ERROR 1: In ExecuteSQL(): sqlite3_prepare(alter table temp rename to temp2): no such table: temp
>
> Hum, well I can see that you are going to run into problems. The GPKG driver has no ExecuteSQL() implementation of its own that would handle your SQL requests directly (which is what the SQLite driver does). So it falls back to the generic ExecuteSQL() implementation, which uses the sqlite VirtualOGR mechanism, which does not support table renaming. Ideally, the GPKG driver should be extended to implement ExecuteSQL() in a similar way to the SQLite driver.
>
>> I noticed that the OGR dialect does not support renaming tables; it suggests renaming a column instead.
>>
>> -Jukka-

___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev
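The application id mentioned in this message can be inspected without GDAL. Below is a minimal sketch using only Python's standard sqlite3 module. It assumes that GeoPackage 1.0 files carry the ASCII bytes "GP10" in the application_id field, and it relies on the SQLite file format storing that field as a 4-byte big-endian integer at offset 68 of the database header. The function names and the demo table are hypothetical, and the check only resembles in spirit whatever the GPKG driver really does.

```python
import os
import sqlite3
import struct
import tempfile

# Assumption: GeoPackage 1.0 stamps its files with the ASCII bytes "GP10".
GPKG_APPLICATION_ID = int.from_bytes(b"GP10", "big")

# The application_id is a 4-byte big-endian integer at offset 68
# of the 100-byte SQLite database header.
APPLICATION_ID_OFFSET = 68


def read_application_id(path):
    """Read the application_id straight from the file header."""
    with open(path, "rb") as f:
        f.seek(APPLICATION_ID_OFFSET)
        return struct.unpack(">I", f.read(4))[0]


def looks_like_geopackage(path):
    """Hypothetical signature check on the header field only."""
    return read_application_id(path) == GPKG_APPLICATION_ID


def make_demo_db(path, application_id=None):
    """Create a tiny SQLite file, optionally stamping an application_id."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE test (knro INTEGER)")
    if application_id is not None:
        con.execute(f"PRAGMA application_id = {application_id}")
    con.commit()
    con.close()


def demo():
    """Compare a stamped file with a plain SQLite file."""
    tmpdir = tempfile.mkdtemp()
    stamped = os.path.join(tmpdir, "stamped.gpkg")
    plain = os.path.join(tmpdir, "plain.sqlite")
    make_demo_db(stamped, GPKG_APPLICATION_ID)
    make_demo_db(plain)
    return looks_like_geopackage(stamped), looks_like_geopackage(plain)
```

Running read_application_id() on a .gpkg before and after editing it with another tool would show whether that tool reset the field, which is the situation suspected above.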
Re: [gdal-dev] GeoPackage fails after touching it with Spatialite-gui
Hi,

Creating indexes with sqlite3 feels safe. Renaming a table is unsafe and leads to an ogrinfo crash.

sqlite> alter table test rename to test2;

C:\ohjelmat\sqlite3>ogrinfo kuti.gpkg
ERROR 1: (null)
INFO: Open of `kuti.gpkg' using driver `GPKG' successful.

Next: crash.

The good news is that everything works again after doing

sqlite> alter table test2 rename to test;

Renaming a table is not something that is absolutely needed, but it is not a totally odd idea either, and it should not lead to a program crash. It now looks like the crash comes from the metadata and the real table names being out of sync. Supporting table renames properly would also mean creating triggers in the database that update the metadata fields correctly. Because renaming a table is currently certain to leave the GeoPackage unusable, it might be good to mention this in the Limitations section of the driver page http://www.gdal.org/ogr/drv_geopackage.html

-Jukka-

-----Original Message-----
From: Even Rouault [mailto:even.roua...@mines-paris.org]
Sent: 18 February 2014 11:07
To: Rahkonen Jukka (Tike)
Cc: 'gdal-dev@lists.osgeo.org'
Subject: Re: [gdal-dev] GeoPackage fails after touching it with Spatialite-gui

Hi Jukka,

yes, this is the same issue. -dialect sqlite is quite powerful, but it can only handle SELECT / UPDATE / DELETE, no other fancy stuff. The GPKG driver should just have its own ExecuteSQL() implementation. For now, I guess you could open a .gpkg with the sqlite3 binary. I wouldn't expect it to alter the application_id field (or perhaps only in the same situation as spatialite_gui does).

Even
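The "metadata and real table names out of sync" situation described in this message can be reproduced with plain SQLite. The following is only a sketch: it models a cut-down gpkg_contents table with two columns (a real GeoPackage has more columns and more metadata tables), and the helper names are hypothetical. It illustrates why a bare ALTER TABLE ... RENAME leaves a stale entry behind. It does not reproduce the ogrinfo crash itself.

```python
import sqlite3


def make_demo_geopackage_like_db():
    """Build an in-memory database with a cut-down metadata table.

    Assumption: only table_name and data_type of gpkg_contents are modelled.
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE gpkg_contents (table_name TEXT PRIMARY KEY, data_type TEXT)"
    )
    con.execute("CREATE TABLE test (fid INTEGER PRIMARY KEY, knro INTEGER)")
    con.execute("INSERT INTO gpkg_contents VALUES ('test', 'features')")
    con.commit()
    return con


def dangling_contents_entries(con):
    """Return registered table names that no longer exist as real tables."""
    rows = con.execute(
        "SELECT c.table_name FROM gpkg_contents AS c "
        "WHERE NOT EXISTS (SELECT 1 FROM sqlite_master AS m "
        "WHERE m.type = 'table' AND m.name = c.table_name) "
        "ORDER BY c.table_name"
    ).fetchall()
    return [name for (name,) in rows]


def demo():
    """Create an index, rename the table, then rename it back."""
    con = make_demo_geopackage_like_db()
    # Creating an index does not touch the metadata table.
    con.execute("CREATE INDEX knro_idx ON test (knro)")
    after_index = dangling_contents_entries(con)
    # Renaming the table leaves the old name registered in the metadata.
    con.execute("ALTER TABLE test RENAME TO test2")
    after_rename = dangling_contents_entries(con)
    # Renaming it back restores consistency.
    con.execute("ALTER TABLE test2 RENAME TO test")
    after_restore = dangling_contents_entries(con)
    return after_index, after_rename, after_restore
```

A trigger-based or driver-side fix would have to update the metadata row in the same transaction as the rename, which is the extra work mentioned above.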
Re: [gdal-dev] Fast Pixel Access
Hi,

Perhaps, but in this game the rule was not to have any GIS servers. Myself, I would rather consider WFS. It could send heights for single points, but also a profile along a line or all values within a polygon.

-Jukka-

Brian Case [r...@winkey.org] wrote:

> -Jukka
>
> tileindex, mapserver, and the gdal wms driver
>
> On Mon, 2014-02-03 at 17:20 +, Jukka Rahkonen wrote:
>
>>> Luke Roth roth.luke at gmail.com writes:
>>>
>>> Another thing that might speed up access is setting the config option GDAL_DISABLE_READDIR_ON_OPEN = TRUE, either as an environment variable or on the command line. That should stop GDAL from reading the directory each time it opens a dataset. I have an application which reads one value from each of a large number of datasets, and setting this option made it run about 3 times faster.
>>
>> Hi,
>>
>> You are right. This config option makes GDAL skip reading the remote directory and saves a lot of bandwidth:
>>
>> VRT case:
>> Bytes Received: 4 244 509 (of which the vrt file: 4 192 577)
>> Sequence (clock) duration: 00:00:09.9996000
>> Was:
>> Bytes Received: 6 459 443
>> Sequence (clock) duration: 00:00:37.813
>>
>> BigTIFF case:
>> Bytes Received: 2 158 917
>> Sequence (clock) duration: 00:00:04.4368000
>> Was:
>> Bytes Received: 4 374 137
>> Sequence (clock) duration: 00:00:30.9192000
>>
>> Conclusion: both options are unsuitable for serious use, though amusing to play with. Reading the BigTIFF tile offset index (or whatever it is) seems to mean about 2 MB of compulsory payload traffic. Reading the VRT file means 4 MB of payload in this example. If this sort of network access to a large directory of raster files is important to someone, there should be a way to find the right raster file and the right data range in that file with a minimum number of bytes. Perhaps some kind of rtree-indexed vrt file? First aid might be to keep the vrt file on the client side.
>>
>> -Jukka Rahkonen-
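The figures quoted in this message are easier to compare once the differences are worked out. A small sketch, using only the numbers reported above: "before" is the run without GDAL_DISABLE_READDIR_ON_OPEN=TRUE and "after" is the run with it. The dictionary layout and function names are just for illustration.

```python
# Figures copied from the message: (bytes received, elapsed seconds).
MEASUREMENTS = {
    "VRT": {"before": (6_459_443, 37.813), "after": (4_244_509, 9.9996)},
    "BigTIFF": {"before": (4_374_137, 30.9192), "after": (2_158_917, 4.4368)},
}

# Size of the .vrt file itself, as reported for the VRT case.
VRT_FILE_BYTES = 4_192_577


def summarize(case):
    """Return bytes saved and the speed-up factor for one test case."""
    before_bytes, before_secs = MEASUREMENTS[case]["before"]
    after_bytes, after_secs = MEASUREMENTS[case]["after"]
    return {
        "bytes_saved": before_bytes - after_bytes,
        "speedup": before_secs / after_secs,
    }


def vrt_payload_excluding_vrt_file():
    """Bytes transferred in the VRT case apart from the .vrt file itself."""
    after_bytes, _ = MEASUREMENTS["VRT"]["after"]
    return after_bytes - VRT_FILE_BYTES


if __name__ == "__main__":
    for case in MEASUREMENTS:
        result = summarize(case)
        print(case, result["bytes_saved"], round(result["speedup"], 2))
    print("VRT payload without the .vrt file:", vrt_payload_excluding_vrt_file())
```

In both cases the option saves roughly 2.2 MB, and in the VRT case nearly all of the remaining traffic is the .vrt file itself, which fits the suggestion to keep the vrt file on the client side.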
Re: [gdal-dev] New option INDEX_COLUMNS for SQLite and GPKG
Hi Even,

Good points. Vacuuming is also very important sometimes, but it does not suit a creation option at all. I will think about some text and examples that could be added to the SQLite/Spatialite performance hints.

-Jukka-

Even Rouault wrote:

> Hi Jukka,
>
> I'm a bit ambivalent about providing a creation option for that (although that would not be a heresy), for a few reasons:
> - it is relatively easy to create an attribute index manually (once you know the syntax)
> - it is not necessary to create it at layer creation time
> - it would potentially apply to all drivers that have a SQL engine behind them
>
> But I agree that improving the documentation to advertise the benefit of indexes and how to create them could be useful. Perhaps you would want to propose a modified version of the HTML page? (I think that the GPKG one could just point to the relevant section of the SQLite one, to avoid doc duplication.)
>
> Even
>
>> Hi,
>>
>> I guess that both SQLite/Spatialite and especially OGC GeoPackage will be used by people who consider them file formats rather than databases. Such users may not know the power of indexes, and they do not necessarily know anything about SQL and such.
>>
>> How about writing a few lines about the power and importance of indexes into the SQLite and GPKG driver pages? There could at least be one example of how to create a new index with ogrinfo, which must be a secret even for many advanced GDAL users.
>>
>> Better still, from the user's point of view, would be to implement a new layer creation option. I see a very similar case in SQLite/Spatialite:
>>
>> COMPRESS_COLUMNS=column_name1[,column_name2, ...]: (Starting with GDAL 1.10.0) A list of (String) columns that must be compressed
>>
>> The new -lco might be:
>>
>> INDEX_COLUMNS=column_name1[,column_name2, ...]: (Starting with GDAL 2.0) A list of columns that will be indexed.
>>
>> I am not sure whether the -lco should somehow also support composite indexes. Perhaps users who know what those (and unique indexes) mean can also use SQL and create them with ogrinfo, if they can see an example in the documentation. Thus the -lco could be made to accept only one column per index, and the index name in the db could be set automatically to something like layer_name_column_name_idx. The SQL that this -lco should fire is simply
>>
>> CREATE INDEX table_name_column_name_idx ON table_name (column_name);
>>
>> -Jukka Rahkonen-
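To make the proposal concrete, here is a rough sketch of what such an option would boil down to, written against Python's standard sqlite3 module rather than GDAL. INDEX_COLUMNS is only the option proposed in this thread, not an existing GDAL creation option as far as this message goes, and both helper functions are hypothetical. One single-column index is created per listed column, named after the table_name_column_name_idx pattern.

```python
import sqlite3


def quote_identifier(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def parse_index_columns(option_value):
    """Split a hypothetical INDEX_COLUMNS=col1,col2 value into column names."""
    return [part.strip() for part in option_value.split(",") if part.strip()]


def create_column_indexes(con, table_name, column_names):
    """Create one single-column index per listed column.

    Index names follow the table_name_column_name_idx pattern from the
    proposal. Returns the names of the indexes that were created.
    """
    created = []
    for column_name in column_names:
        index_name = f"{table_name}_{column_name}_idx"
        con.execute(
            f"CREATE INDEX {quote_identifier(index_name)} "
            f"ON {quote_identifier(table_name)} ({quote_identifier(column_name)})"
        )
        created.append(index_name)
    return created


if __name__ == "__main__":
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE test (fid INTEGER PRIMARY KEY, knro INTEGER, name TEXT)")
    print(create_column_indexes(con, "test", parse_index_columns("knro,name")))
```

Composite and unique indexes are deliberately left out, in line with the suggestion that users who need them can write the CREATE INDEX statement themselves.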