Re: [gdal-dev] Fast Pixel Access
Another thing that might speed up access is setting the config option GDAL_DISABLE_READDIR_ON_OPEN = TRUE, either as an environment variable or on the command line. That should keep GDAL from reading the directory each time it opens a dataset. I have an application which reads one value from each of a large number of datasets, and setting this option made it run about 3 times faster.

Luke

On Sun, Feb 2, 2014 at 2:12 PM, Jukka Rahkonen jukka.rahko...@mmmtike.fi wrote:

Hi,

I made a few tests and here are my conclusions. The hypothesis is that someone wants to build a DEM query service which uses gdallocationinfo for queries, with the DEM data accessed as files from a standard web site. I compared three alternatives:

1) There are thousands of DEM files on the server and they are combined together with a VRT file.
2) There is only one DEM file, as BigTIFF.
3) The DEM is split into tiles in an x/y/z tile directory structure like Google Maps or OpenStreetMap tiles.

My test data covers Finland with a 10 m grid size, and as deflate-compressed TIFFs they make about 10 GB together.

Before going on, keep in mind that speed needs indexes. The better the index, the less unnecessary data there is to read. In case 1) the first-level index is the VRT file. The second-level index, if it exists, is in the headers of the real DEM files. It may be possible to jump to the correct offset from the beginning of the DEM data and read only a part of the file. In case 2) the index is in the internal TIFF directory. If the BigTIFF is tiled, access to tiles should be rather effective. And finally, in case 3) the index is built into the directory structure and tiling schema that is used for saving the tiles. The schema is so well known that tile map service clients can directly ask for a certain file name if they know the coordinates and scale.

Conclusions:

1)
- The whole VRT file must be read. Caching the VRT file would make subsequent requests faster.
- For some reason gdallocationinfo wants to get the directory listing of the directory where the VRT file is. This is slow and generates lots of traffic if the thousands of DEM files are in the same directory. It would probably be faster to have them in another directory.

2)
- The BigTIFF route is more straightforward, but gdallocationinfo still needs to do many big range reads.
- Also in this case gdallocationinfo reads the target file's directory. It would be good to keep this directory small. Don't do as I did and keep in the same directory not only the BigTIFF DEM file, which was the only file needed, but also the VRT and the thousands of original DEMs from the previous test. At least this is a known issue now, and I know how to avoid it. In my case reading the directory made 2.2 MB of web traffic, all or most of it in vain.

3)
- I used the OpenStreetMap tile service as the test data for the third test. In this case gdallocationinfo knows exactly which tile to request and makes only one request. It also seems to cache some tiles on the client side, which means that queries for nearby locations may hit the cached tile and be very fast.

Summary statistics:

1) gdallocationinfo makes 6 requests and reads 6.4 MB of data
2) gdallocationinfo makes 8 requests and reads 4.3 MB of data
3) gdallocationinfo makes 1 request and reads 10 kB of data

The requests I used are these:

1) gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc 389559 6677412
2) gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc 389559 6677412
3) gdallocationinfo frmt_wms_openstreetmap_tms.xml -geoloc 389559 6677412

I know that the queried place in 3) is not the same because the SRIDs of the data differ, and OSM does not return 16-bit DEM heights but 3-band RGB values instead, but that does not matter here; the idea is what is important.

My conclusion is that you should cut your DEM into tiles with, for example, gdal2tiles or MapTiler, and the result could actually be quite speedy; perhaps using 126x126 tiles could make it a bit faster still. I hope that they can create tiles as 16-bit TIFFs.

I am sure that these results are not scientifically sound, but I am also sure that the difference between 6.4 MB / 4.3 MB / 10 kB is something to think about, especially if you dream about a mobile service.

I placed the requests which gdallocationinfo made during these tests into http://latuviitta.org/documents/gdallocationinfo_requests.txt

-Jukka Rahkonen-

___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev
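
To illustrate why case 3) needs only one request: in an x/y/z tile scheme the file name is a pure function of the coordinates and the zoom level, so no index has to be downloaded first. The following is a minimal sketch, assuming the common Web Mercator XYZ convention used by OpenStreetMap-style tiles; the function names and the "z/x/y.ext" path layout are illustrative assumptions, and a DEM in a national projected CRS (such as the Finnish data above) would need its own grid definition.

```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return (x, y) of the XYZ tile containing a WGS84 point.

    Assumes the usual Web Mercator scheme: tile (0, 0) is the
    north-west corner and there are 2**zoom tiles per axis.
    """
    n = 2 ** zoom
    lat_rad = math.radians(lat_deg)
    xf = (lon_deg + 180.0) / 360.0 * n
    yf = (1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n
    # Clamp so that points on the eastern/southern edge or near the
    # poles still map to a valid tile.
    x = min(max(int(xf), 0), n - 1)
    y = min(max(int(yf), 0), n - 1)
    return x, y


def tile_path(lon_deg, lat_deg, zoom, ext="png"):
    """Build a hypothetical 'z/x/y.ext' path for the tile at a point."""
    x, y = lonlat_to_tile(lon_deg, lat_deg, zoom)
    return f"{zoom}/{x}/{y}.{ext}"
```

A client that knows the point and the zoom level can therefore request exactly one small file, which is consistent with the single 10 kB request measured above.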
Re: [gdal-dev] Fast Pixel Access
Dan,

I had not heard of the KEA format and it looks promising, except for the need to compile. I am hoping to do this with out-of-the-box GDAL. I also did not see a license statement. GDAL does support HDF5 (another format I am not familiar with), but it looks like the limit is 2 GB for the built-in driver.

My dataset also covers the US, with 55,501 7.5-minute tiles at 0.15012 arc-second resolution (~5 m), a total size of 2,289,001 x 756,001 pixels, 1.8T. Creating a single dataset from the tiles can be done, but in our environment it is not cheap.

David

-Original Message-
From: gdal-dev-boun...@lists.osgeo.org [mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Daniel Clewley
Sent: Saturday, February 01, 2014 1:45 PM
To: gdal-dev@lists.osgeo.org
Subject: Re: [gdal-dev] Fast Pixel Access

Hi David,

Following on from the VRT / BigTIFF comparison Jukka posted: have you considered storing the data as a single KEA format file, which is based on HDF5?

I have the National Elevation Dataset for the US, which comprises 3,605 1 x 1 degree tiles at 1 arc-second resolution. I first created a VRT, then used gdal_translate to convert it to a KEA file. The total size is 421,212 x 252,012 pixels, 77 GB. I also built overviews for fast display (this took a long time and I don't think it is needed for your case). I've just tried using gdallocationinfo on the file to get pixel information, and it takes 0.5 s to get the pixel value back.

The KEA library and GDAL driver source are available from:

https://bitbucket.org/chchrsc/kealib/

and the format is described in:

Peter Bunting, Sam Gillingham, The KEA image file format, Computers & Geosciences, Volume 57, August 2013, Pages 54-58, ISSN 0098-3004, http://dx.doi.org/10.1016/j.cageo.2013.03.025.

If you don't mind having one massive file (in addition to the individual tiles, which could be archived), this might work for your use case.

Thanks,

Dan

Message: 10
Date: Fri, 31 Jan 2014 16:15:53 +
From: David Baker (Geoscience) david.m.ba...@chk.com
To: 'gdal-dev@lists.osgeo.org' gdal-dev@lists.osgeo.org
Subject: [gdal-dev] Fast Pixel Access
Message-ID: 2a18a4344312134b937df938d992264a0508f...@okcexhprd122.chkenergy.net
Content-Type: text/plain; charset=us-ascii

Devs,

I have a set of 55,501 BIL files in a single directory. They are DEM data that cover the US in 7.5-minute tiles. I would like to randomly access elevations at given lat/lons from the whole dataset. I created a VRT file from the directory of BIL files and have been able to access the elevation at a given lat/lon using gdallocationinfo, but because of the size of the dataset this operation is somewhat slow.

Can the VRT be indexed? Or is there a faster, better way to access the pixels? I would first like to do this with the utilities before diving into code (C#). The files are regularly named based on their location within a 1 arc-second grid.

Thanks,

David

David M. Baker
Senior Advisor - Geoscience Technology
Chesapeake Energy Corporation
david.m.ba...@chk.com

This email (and attachments if any) is intended only for the use of the individual or entity to which it is addressed, and may contain information that is confidential or privileged and exempt from disclosure under applicable law. If the reader of this email is not the intended recipient, or the employee or agent responsible for delivering this message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify the sender immediately by return email and destroy all copies of the email (and attachments if any).
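
Since the original question notes that the files are regularly named by location, one way to picture "indexing" is to skip the VRT and compute the tile file directly from the coordinates. The sketch below only illustrates that idea: the naming pattern in `tile_filename` is entirely hypothetical (the thread does not give the real one), and `locationinfo_command` merely builds an argument list for gdallocationinfo (using its `-valonly` and `-wgs84` options) without running it.

```python
import math

TILE_DEG = 7.5 / 60.0  # a 7.5-minute tile is 0.125 degrees on each side


def tile_origin(lat, lon):
    """Return the south-west corner (lat, lon) of the tile containing the point."""
    row = math.floor(lat / TILE_DEG)
    col = math.floor(lon / TILE_DEG)
    return row * TILE_DEG, col * TILE_DEG


def tile_filename(lat, lon):
    """Build a file name from the tile origin.

    HYPOTHETICAL naming scheme, for illustration only; substitute the
    pattern the real BIL files follow.
    """
    sw_lat, sw_lon = tile_origin(lat, lon)
    ns = "n" if sw_lat >= 0 else "s"
    ew = "e" if sw_lon >= 0 else "w"
    return f"{ns}{abs(sw_lat):07.3f}_{ew}{abs(sw_lon):07.3f}.bil"


def locationinfo_command(lat, lon):
    """Argument list querying one tile directly instead of the whole VRT."""
    # With -wgs84, gdallocationinfo expects longitude first, then latitude.
    return ["gdallocationinfo", "-valonly", "-wgs84",
            tile_filename(lat, lon), str(lon), str(lat)]
```

With such a mapping each query opens one small file, so the cost no longer depends on how many tiles the whole dataset contains.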
[gdal-dev] Problems linking 1.10 with CURL support
Hi All,

I'm having trouble linking GDAL 1.10 with CURL support enabled. I can and have been building GDAL 1.9 with CURL, so I'm sure I have all the settings within the nmake.opt file correct, unless I have to do something different for the latest version. I'm on Windows 7 and have tried using both VS2005 and VS2010.

The linking error I'm getting is:

error LNK2019: unresolved external symbol void __cdecl CPLHTTPSetOptions(void *,char * *) (?CPLHTTPSetOptions@@YAXPAXPAPAD@Z) referenced in function void __cdecl CPLHTTPInitializeRequest(struct CPLHTTPRequest *,char const *,char const * const *) (?CPLHTTPInitializeRequest@@YAXPAUCPLHTTPRequest@@PBDPBQBD@Z) gdalhttp.obj

My nmake.opt section looks like the following:

# Uncomment to use libcurl (DLL by default)
# The cURL library is used for WCS, WMS, GeoJSON, SRS call importFromUrl(), WFS, GFT, CouchDB, /vsicurl/ etc.
CURL_DIR=W:\3rdpartyLibs\curl\7_35\src\
CURL_INC=-I$(CURL_DIR)/include
# Uncoment following line to use libcurl as dynamic library
CURL_LIB = w:/lib/libcurl.lib wsock32.lib wldap32.lib winmm.lib
# Uncoment following two lines to use libcurl as static library
#CURL_LIB = $(CURL_DIR)/libcurl.lib wsock32.lib wldap32.lib winmm.lib
#CURL_CFLAGS = -DCURL_STATICLIB

The file cpl_http.cpp does have an implementation, but VS2005 would appear to indicate that the #define HAVE_CURL is not defined???

Any help would be appreciated.

Cheers

Andy
Re: [gdal-dev] Fast Pixel Access
Luke Roth roth.luke at gmail.com writes:

Another thing that might speed up access is setting the config option GDAL_DISABLE_READDIR_ON_OPEN = TRUE, either as an environment variable or on the command line. That should keep GDAL from reading the directory each time it opens a dataset. I have an application which reads one value from each of a large number of datasets, and setting this option made it run about 3 times faster.

Hi,

You are right. This config option makes GDAL skip the reading of the remote directory and saves a lot of bandwidth:

VRT case:
Bytes Received: 4 244 509 (of which the VRT file: 4 192 577)
Sequence (clock) duration: 00:00:09.9996000
Was:
Bytes Received: 6 459 443
Sequence (clock) duration: 00:00:37.813

BigTIFF case:
Bytes Received: 2 158 917
Sequence (clock) duration: 00:00:04.4368000
Was:
Bytes Received: 4 374 137
Sequence (clock) duration: 00:00:30.9192000

Conclusion: Both options are unsuitable for serious use, though amusing to play with. Reading the BigTIFF tile offset index (or whatever it is) seems to mean about 2 MB of compulsory payload traffic. Reading the VRT file means in this example 4 MB of payload. If this sort of net access to a large directory of raster files is important for someone, there should be a way to find the right raster file and the right data range in that file with a minimum number of bytes. Perhaps some kind of R-tree indexed VRT file? First aid might be to keep the VRT file on the client side.

-Jukka Rahkonen-
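
For readers who want to try the option from a script, here is a minimal sketch of the two ways mentioned in the thread (environment variable or command line). The helper name `locationinfo_invocation` and its `via` parameter are assumptions made for this example; it only builds the argument list and environment and does not run anything.

```python
import os

OPTION = "GDAL_DISABLE_READDIR_ON_OPEN"


def locationinfo_invocation(dataset, x, y, via="env"):
    """Build (argv, env) for a gdallocationinfo call with the option set.

    via="env" sets the option as an environment variable;
    via="cli" passes it with GDAL's generic --config switch.
    """
    argv = ["gdallocationinfo"]
    env = dict(os.environ)  # copy, so the caller's environment is untouched
    if via == "cli":
        argv += ["--config", OPTION, "TRUE"]
    elif via == "env":
        env[OPTION] = "TRUE"
    else:
        raise ValueError("via must be 'env' or 'cli'")
    argv += [dataset, "-geoloc", str(x), str(y)]
    return argv, env
```

The result could then be passed to something like `subprocess.run(argv, env=env, capture_output=True, text=True)`; either mechanism alone should be enough, so there is no need to set both.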
Re: [gdal-dev] Fast Pixel Access
-Jukka

tileindex, mapserver, and the gdal wms driver
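
The tile index idea raised in this exchange can be pictured as a lookup from a point to the file whose extent contains it, so that only that file needs to be opened. Below is a minimal in-memory sketch using a uniform grid of buckets; it is not how gdaltindex or MapServer store a tile index (they use a vector layer of footprints), and the `TileIndex` class and file names are hypothetical.

```python
import math
from collections import defaultdict


class TileIndex:
    """Toy spatial index: maps grid cells to the raster extents overlapping them."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self._buckets = defaultdict(list)

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def add(self, name, minx, miny, maxx, maxy):
        """Register a raster file under every grid cell its extent touches."""
        cx0, cy0 = self._cell(minx, miny)
        cx1, cy1 = self._cell(maxx, maxy)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                self._buckets[(cx, cy)].append((name, minx, miny, maxx, maxy))

    def find(self, x, y):
        """Return the names of the files whose extent contains the point."""
        candidates = self._buckets.get(self._cell(x, y), ())
        return [name for name, minx, miny, maxx, maxy in candidates
                if minx <= x < maxx and miny <= y < maxy]
```

A lookup inspects only the handful of extents registered in one cell, instead of scanning every source listed in a multi-megabyte VRT.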
Re: [gdal-dev] Problems linking 1.10 with CURL support
Andy Cheetham andy.cheetham at geomaticSolutions.com writes:

Hi All, I'm having trouble linking GDAL 1.10 with CURL support enabled. I can and have been building GDAL 1.9 with CURL, so I'm sure I have all the settings within the nmake.opt file correct, unless I have to do something different for the latest version. I'm on Windows 7 and have tried using both VS2005 and VS2010.

If the ready-made Windows builds at http://gisinternals.com/sdk/ do not suit you, perhaps the build logs can still help you.

-Jukka Rahkonen-
Re: [gdal-dev] Fast Pixel Access
Hi,

Perhaps, but in this game the rule was not to have any GIS servers. Myself I would rather consider WFS. It could send heights from single points but also a profile along a line or all values within a polygon.

-Jukka-

Brian Case [r...@winkey.org] wrote:

-Jukka

tileindex, mapserver, and the gdal wms driver
Re: [gdal-dev] Fast Pixel Access
Jukka,

No matter the endpoint the user uses to access the data, behind the scenes there must be fast pixel access, correct? Or are you saying that a WFS would do it quickly out of the box?

David
Re: [gdal-dev] Fast Pixel Access
Hi Jukka, David and others,

In the past we rendered elevation data into Mercator tiles with http://www.maptiler.com/, the successor of my GDAL2Tiles.py script, for extremely fast pixel access to the elevation values at a given geographic location, without the need for server software to host such data.

We made the tiles for the CGIAR 90 m DEM for the whole world. In fact, the raw elevation was mapped into RGB space for this purpose. Decoding on the client side is then very easy, and you have precise elevation values for the whole nearby area preloaded.

You may need to do something similar if you want to implement very fast client-side hill-shading in a web browser canvas (similar to http://dev.klokantech.com/klokan/hillshading/), or a dynamic elevation profile drawn while moving the mouse over a map, and many other tasks with data loaded directly in the client.

Visualisation of the tiles is also possible in 3D, see: http://vimeo.com/29605292

For WebGL there are now more efficient ways (direct binary data structures instead of images).

See:
http://dev.klokantech.com/srtm/srtm_decode.html
http://www.klokantech.com/labs/dem-color-encoding/
http://dev.klokantech.com/srtm/googlemaps.html

The source code of the GDAL utility for encoding elevation data into RGB is available online: https://github.com/webglearth/gdaldem_web but you may not need it if direct reading via JavaScript in the web browser is not required for your application.

Regards,

Klokan Petr Pridal

On Mon, Feb 3, 2014 at 7:48 PM, David Baker (Geoscience) david.m.ba...@chk.com wrote:

Jukka,

No matter the endpoint the user uses to access the data, behind the scenes there must be fast pixel access, correct? Or are you saying that a WFS would do it quickly out of the box?
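
The mapping of raw elevation into RGB space described in Klokan's message can be sketched as packing a scaled integer into the three 8-bit channels. The message does not spell out the exact encoding used for those tiles, so the offset and resolution below are assumptions chosen for illustration (0.1 m steps, starting 10,000 m below zero); the function names are likewise made up for this example.

```python
OFFSET_M = 10000.0     # assumed: lowest representable elevation is -10000 m
UNITS_PER_METRE = 10   # assumed: 0.1 m vertical resolution


def encode_elevation(elev_m):
    """Pack an elevation in metres into an (r, g, b) triple of 0..255 values."""
    value = int(round((elev_m + OFFSET_M) * UNITS_PER_METRE))
    if not 0 <= value < (1 << 24):
        raise ValueError("elevation outside the encodable range")
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


def decode_elevation(r, g, b):
    """Recover the elevation in metres from an (r, g, b) triple."""
    value = (r << 16) | (g << 8) | b
    return value / UNITS_PER_METRE - OFFSET_M
```

Decoding is a few integer operations per pixel, which is why a client that has already downloaded an image tile can read any elevation in it without another request. Note that this only works with lossless image formats such as PNG, since lossy compression would corrupt the packed values.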