Re: [gdal-dev] Fast Pixel Access
Even,

No, not an i386... A Dell Precision T3500 with an Intel W3680 @ 3.33 GHz (6x2 cores) and 12.0 GB. Though the data is on the network, not local, with 1 Gbps access.

GDAL_DISABLE_READDIR_ON_OPEN = TRUE did significantly increase the speed.

Does the BIL driver read the whole file into memory first? Might a direct read be faster?

And Even, please excuse my ignorance, but what is gdb? I really would like to do the profiling.

David

-----Original Message-----
From: Even Rouault [mailto:even.roua...@mines-paris.org]
Sent: Sunday, February 09, 2014 6:36 AM
To: David Baker (Geoscience)
Cc: 'Brian Case'; 'gdal-dev@lists.osgeo.org'
Subject: Re: [gdal-dev] Fast Pixel Access

On Saturday, 1 February 2014 15:04:46, David Baker (Geoscience) wrote:
> Even,
>
> I am not sure how to profile as I do not have access to the code to profile. I did do a timing test...
>
> vrt file = 22,970 KB
> bil file = 35,180 KB * 55,501
>
> I piped five locations from the loc.txt file:
>
> -96.0 36.0
> -98.0 37.0
> -100.0 38.0
> -99.0 39.0
> -101.0 35.0
>
> gdallocationinfo -valonly -geoloc intermap.vrt loc.txt
>
> 189.841857910156   25.5 sec
> 384.857452392578   22.6 sec
> 762.015930175781   22.9 sec
> 550.719116210938   23.6 sec
> 883.637023925781   22.9 sec
>
> Note: I used a lap timer on my iPhone to capture the split times as the results appeared in the console window.
>
> Does this give any insight?

Woo, I agree that's utterly slow! When you mentioned slow I thought it was more on the order of 0.1 second!

We can already exclude the parsing time of the VRT, since you do that in the same gdallocationinfo session and there will be just one parsing. And I can't believe that the intersection test for 55 000 rectangles takes ~20 seconds, unless you have an old i386 at 5 MHz ;-)

My usual way of profiling stuff that is slow on the order of more than one second is to run under gdb, break with Ctrl+C, display the stack trace, continue the run, break again, display the stack trace, etc. If you end up breaking in the same function, then you've found the bottleneck.

I see now that in that thread GDAL_DISABLE_READDIR_ON_OPEN = TRUE was suggested and seems to improve things significantly. Perhaps we should try to cache the result of the initial readdir so it can benefit later attempts, but I haven't checked how easily that could be implemented. Or perhaps we should just change the default value of GDAL_DISABLE_READDIR_ON_OPEN since it causes problems from time to time. But generally filesystems don't behave very well when there are a lot of files in the same directory. You'd be better off organizing your tiles in subdirectories.

But still, 1 to 3 seconds sounds a bit slow to me. Would be cool if you could try the above suggestion to identify where the time is spent.

Even

> David
>
> -----Original Message-----
> From: gdal-dev-boun...@lists.osgeo.org [mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Even Rouault
> Sent: Saturday, February 01, 2014 1:28 AM
> To: Brian Case
> Cc: gdal-dev@lists.osgeo.org
> Subject: Re: [gdal-dev] Fast Pixel Access
>
> On Saturday, 1 February 2014 00:23:13, Brian Case wrote:
>> evenr what about the use of a tileindex?
>
> You really mean a tileindex as produced by gdaltindex? Well, that's not exactly the same beast as a VRT, but yes, if it was recognized as a GDAL dataset then you could potentially save the cost of XML parsing. One could imagine that the VRT driver would accept a tileindex as an alternate connection string.
>
> Anyway, it would be interesting to first profile where the time is spent in David's use case. If it's in the XML parsing, then I can't see what could be easily improved in that area. If it's the intersection, then there's potential for improvement.
>
>> seems an intersection with a set of polys first would be quick
>>
>> brian
>>
>> On Fri, 2014-01-31 at 19:30 +0100, Even Rouault wrote:
>>> On Friday, 31 January 2014 17:15:53, David Baker (Geoscience) wrote:
>>>> Dev's,
>>>>
>>>> I have a set of 55,501 bil files in a single directory. They are DEMS data that cover the US in 7.5 minute tiles. I would like to randomly access elevations at given lat/lon's from the whole dataset.
>>>>
>>>> I created a vrt file from the directory of bil files, and have been able to access the elevation at a given lat/lon using gdallocationinfo, but because of the size of the dataset, this operation is somewhat slow.
>>>>
>>>> Can the vrt be indexed?
>>>
>>> No, it isn't currently, although I think it could be improved to have an in-memory index with moderate effort. But are you sure the slowness is due to the lack of an index? 55,000 is a big number, but not that big. Maybe the slowness just comes from the opening time (XML parsing) of such a big VRT. That would need to be profiled to be sure where the bottleneck is.
>>>
>>>> Or, is there a faster, better way to access the pixels? I would first like to do this with the utilities before diving into code (C
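Even's remark that an intersection test over 55 000 rectangles cannot account for ~20 seconds can be checked with a rough sketch. The Python below is not GDAL's code: it builds a synthetic grid of 55,501 rectangles of 7.5 minutes each (the grid origin and the 461-column layout are assumptions made up for the illustration) and does a plain linear scan for one of the points from the timing test. No timing figure is claimed here; run it to see what your own machine reports.

```python
import time

# Synthetic stand-in for the VRT sources: 55,501 non-overlapping
# 7.5-minute (0.125 degree) rectangles on a regular grid.
# Origin and column count are hypothetical, not taken from the real dataset.
TILE = 0.125
COLS = 461
tiles = []
for i in range(55_501):
    row, col = divmod(i, COLS)
    xmin = -125.0 + col * TILE
    ymin = 24.0 + row * TILE
    tiles.append((xmin, ymin, xmin + TILE, ymin + TILE))


def find_tile(lon, lat):
    """Linear scan: index of the first rectangle containing the point, or None."""
    for idx, (xmin, ymin, xmax, ymax) in enumerate(tiles):
        if xmin <= lon < xmax and ymin <= lat < ymax:
            return idx
    return None


start = time.perf_counter()
hit = find_tile(-96.0, 36.0)
elapsed = time.perf_counter() - start
print(f"tile index {hit}, linear scan took {elapsed * 1000:.1f} ms")
```

Even an unindexed scan like this touches each rectangle only once per query, which is why the thread turns to file and directory access as the likelier bottleneck.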
Re: [gdal-dev] Fast Pixel Access
Quoting David Baker (Geoscience) david.m.ba...@chk.com:

> Even,
>
> No, not an i386... A Dell Precision T3500 with an Intel W3680 @ 3.33 GHz (6x2 cores) and 12.0 GB. Though the data is on the network, not local, with 1 Gbps access.
>
> GDAL_DISABLE_READDIR_ON_OPEN = TRUE did significantly increase the speed.
>
> Does the BIL driver read the whole file into memory first? Might a direct read be faster?

No, the BIL driver will just read the line where the pixel is.

> And Even, please excuse my ignorance, but what is gdb? I really would like to do the profiling.

gdb is the GNU debugger ( https://www.gnu.org/software/gdb/ ). Assuming you use Linux (it is likely available on Mac OS X too), you should be able to install it with the usual package management system of the distribution: apt-get install gdb, yum install gdb, ... Otherwise, on Windows, I'm far less familiar with the debugging tools.

gdb --args gdallocationinfo -valonly -geoloc intermap.vrt

Then:

  run            to start the program
  -96.0 36.0     type a coordinate
  Ctrl+C         to suspend execution
  bt             to display the stack trace
  c              to resume execution

> David
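To illustrate why a BIL file does not have to be loaded whole, here is a minimal Python sketch of the band-interleaved-by-line layout. It is not the GDAL driver's code, and it assumes a headerless file of little-endian float32 samples; the function names are invented for the example. It seeks straight to one sample, whereas Even describes the driver as reading the line that contains the pixel.

```python
import os
import struct
import tempfile


def bil_offset(row, col, band, ncols, nbands, sample_size):
    """Byte offset of one sample in a headerless band-interleaved-by-line file."""
    line_index = row * nbands + band  # every row stores one line per band
    return (line_index * ncols + col) * sample_size


def read_bil_value(path, row, col, ncols, nbands=1, band=0, fmt="<f"):
    """Seek to a single sample and decode it (little-endian float32 by default)."""
    size = struct.calcsize(fmt)
    with open(path, "rb") as f:
        f.seek(bil_offset(row, col, band, ncols, nbands, size))
        return struct.unpack(fmt, f.read(size))[0]


# Demo on a throwaway 3 x 4 single-band raster holding 0.0 .. 11.0.
ncols, nrows = 4, 3
samples = [float(v) for v in range(ncols * nrows)]
with tempfile.NamedTemporaryFile(suffix=".bil", delete=False) as tmp:
    tmp.write(struct.pack(f"<{len(samples)}f", *samples))
value = read_bil_value(tmp.name, row=2, col=1, ncols=ncols)
os.remove(tmp.name)
print(value)
```

The point of the sketch is only that the position of a pixel follows from arithmetic on row, band and column, so the cost of one lookup does not depend on the file size.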
Re: [gdal-dev] Fast Pixel Access
Even,

I am on Windows and only have the binaries installed.

David

-----Original Message-----
From: Even Rouault [mailto:even.roua...@mines-paris.org]
Sent: Monday, February 10, 2014 8:54 AM
To: David Baker (Geoscience)
Cc: 'Even Rouault'; 'Brian Case'; 'gdal-dev@lists.osgeo.org'
Subject: RE: [gdal-dev] Fast Pixel Access
Re: [gdal-dev] Fast Pixel Access
On Tuesday, 11 February 2014 00:10:20, David Baker (Geoscience) wrote:
> Even, I am on Windows and only have the binaries installed.

Well, I will let the Windows developers lurking here answer if they have some good advice. I imagine you would need binaries with debugging symbols to be able to get a useful stack trace.
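The break-and-backtrace routine Even describes amounts to manual sampling: note the innermost function each time the program is interrupted, then see which one keeps coming up. A minimal Python sketch of the tallying step follows. The frame names are made-up placeholders, not real GDAL stack traces, and `rank_hotspots` is a helper invented for this illustration.

```python
from collections import Counter

# Hypothetical innermost-frame names written down after successive `bt` outputs.
samples = [
    "read_directory",
    "read_directory",
    "parse_header",
    "read_directory",
    "read_pixel",
]


def rank_hotspots(top_frames):
    """Return (function, share of samples) pairs, most frequent first."""
    counts = Counter(top_frames)
    total = len(top_frames)
    return [(name, n / total) for name, n in counts.most_common()]


ranking = rank_hotspots(samples)
for name, share in ranking:
    print(f"{name}: {share:.0%}")
```

With only a handful of samples the shares are coarse, but as the thread notes, landing in the same function again and again is usually enough to point at the bottleneck.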
Re: [gdal-dev] Fast Pixel Access
Luke,

Thank you for this suggestion. This took the access times from 15-20 seconds down to 1 to 3 seconds. The majority of the time seems to be spent on the initial read of the vrt, as subsequent piped locations after the first are returned sub-second. For my current application, this should be okay.

David

From: gdal-dev-boun...@lists.osgeo.org [mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Luke Roth
Sent: Monday, February 03, 2014 8:11 AM
To: Jukka Rahkonen
Cc: gdal-dev@lists.osgeo.org
Subject: Re: [gdal-dev] Fast Pixel Access

Another thing that might speed up access is setting the config option GDAL_DISABLE_READDIR_ON_OPEN = TRUE, either as an environment variable or on the command line. That should help with GDAL reading the directory each time it opens a dataset. I have an application which reads one value from each of a large number of datasets and setting this option made it run about 3 times faster.

Luke
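For anyone scripting these lookups, the environment-variable route Luke mentions can be sketched from Python as below. This is a sketch under assumptions: `intermap.vrt` is the file name used earlier in the thread, the two helper functions are invented for the example, and the coordinates are fed on standard input in the same "lon lat" form that was piped in the timing test. The call is only attempted when GDAL and the file are actually present.

```python
import os
import shutil
import subprocess


def gdal_env(base=None):
    """Copy the environment and disable the directory scan GDAL does on open."""
    env = dict(os.environ if base is None else base)
    env["GDAL_DISABLE_READDIR_ON_OPEN"] = "TRUE"
    return env


def locationinfo_command(dataset):
    """The gdallocationinfo call from the thread; points are supplied on stdin."""
    return ["gdallocationinfo", "-valonly", "-geoloc", dataset]


cmd = locationinfo_command("intermap.vrt")
env = gdal_env()
points = "-96.0 36.0\n-98.0 37.0\n"

# Only attempt the call when GDAL and the thread's VRT are actually available.
if shutil.which("gdallocationinfo") and os.path.exists("intermap.vrt"):
    result = subprocess.run(cmd, input=points, env=env,
                            capture_output=True, text=True)
    print(result.stdout)
else:
    print("would run:", " ".join(cmd))
```

Sending all points to one process also matches David's observation that only the first lookup pays for opening the VRT.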
Re: [gdal-dev] Fast Pixel Access
David Baker (Geoscience) david.m.baker at chk.com writes:

> Jukka,
>
> No matter the endpoint the user uses to access the data, behind the scenes, there must be fast pixel access, correct? Or are you saying that a WFS would do it quickly out of the box?

Hi,

Fast access to data, yes, but using WFS is a rather different scenario. WFS is made for vectors, and before using it the raster DEM would have to be converted into small polygons. For the whole country it would mean a whole lot of small polygons. Perhaps that is not a realistic approach.

But now I think I must shut my mouth because I have never done anything DEM related in my job myself. Let's hope some specialist appears and gives a good solution for you. Oh well, Klokan appeared already.

-Jukka-

_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev
Re: [gdal-dev] Fast Pixel Access
Another thing that might speed up access is setting the config option GDAL_DISABLE_READDIR_ON_OPEN = TRUE, either as an environment variable or on the command line. That should help with GDAL reading the directory each time it opens a dataset. I have an application which reads one value from each of a large number of datasets and setting this option made it run about 3 times faster.

Luke

On Sun, Feb 2, 2014 at 2:12 PM, Jukka Rahkonen jukka.rahko...@mmmtike.fi wrote:

> Hi,
>
> I made a few tests and here come my conclusions. The hypothesis is that someone wants to make a DEM query service which uses gdallocationinfo for queries, and the DEM data is to be accessed as files from a standard web site. I compared three alternatives:
>
> 1) There are thousands of DEM files on the server and they are combined together with a VRT file.
> 2) There is only one DEM file as BigTIFF.
> 3) The DEM is split into tiles in an x/y/z tile directory structure like in Google Maps or OpenStreetMap tiles.
>
> My test data covers Finland with a 10 m grid size, and as deflate-compressed TIFFs they make about 10 GB together.
>
> Before going on, keep in mind that speed needs indexes. The better the index, the less unnecessary data to read. In case 1) the first level index is the VRT file. The second level index, if it exists, is in the headers of the real DEM files. It may be possible to jump to a correct offset from the beginning of the DEM data and read only a part of the file. In case 2) the index is in the internal TIFF directory. If the BigTIFF is tiled, the access to tiles should be rather effective. And finally in case 3) the index is built into the directory structure and tiling schema that is used for saving the tiles. The schema is so well known that tile map service clients can directly ask for a certain file name if they know the coordinates and scale.
>
> Conclusions:
>
> 1)
> - The whole VRT file must be read. Caching the vrt file would make the next requests faster.
> - For some reason gdallocationinfo wants to get the directory list of the directory where the vrt file is. This is slow and generates lots of traffic if the thousands of DEM files are in the same directory. Probably it would be faster to have them in another directory.
>
> 2)
> - The BigTIFF route is more straightforward, but gdallocationinfo still needs to do many big range reads.
> - Also in this case gdallocationinfo reads the target file directory. It would be good to keep this directory small. Don't do as I did: my directory held the BigTIFF DEM file, which was the only file needed, but also the vrt and the thousands of original DEMs from the previous test. At least this is a known issue now and I know how to avoid it. In my case reading the directory made 2.2 MB of web traffic, all or most of it in vain.
>
> 3)
> - I used the OpenStreetMap tile service as the test data for the third test. In this case gdallocationinfo knows exactly which tile to request and it makes only one request. It also seems to cache some tiles on the client side, which means that queries for nearby locations may hit the cached tile and be very fast.
>
> Summary statistics:
>
> 1) Gdallocationinfo makes 6 requests and reads 6.4 MB of data
> 2) Gdallocationinfo makes 8 requests and reads 4.3 MB of data
> 3) Gdallocationinfo makes 1 request and reads 10 kB of data
>
> The requests I used are these:
>
> 1) gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc 389559 6677412
> 2) gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc 389559 6677412
> 3) gdallocationinfo frmt_wms_openstreetmap_tms.xml -geoloc 389559 6677412
>
> I know that the queried place in 3) is not the same because the SRIDs of the data differ, and that OSM does not return 16-bit DEM heights but 3-band RGB values instead, but it does not matter here; the idea is what is important.
>
> My conclusion is that you should cut your DEM into tiles with, for example, gdal2tiles or MapTiler. The result could actually be quite speedy, and perhaps using 126x126 tiles could make it still a bit faster. I hope that they can create tiles as 16-bit TIFFs.
>
> I am sure that these results are not scientifically sound, but I am also sure that the difference between 6.4 MB / 4.3 MB / 10 kB is something to think about, especially if you dream about a mobile service. I placed the requests which gdallocationinfo made during these tests into http://latuviitta.org/documents/gdallocationinfo_requests.txt
>
> -Jukka Rahkonen-
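Jukka's case 3) works because the tile address can be computed from the coordinates alone, with no index to download. As a sketch, assuming the Web Mercator scheme used by OpenStreetMap tiles (paths ordered zoom/x/y) and longitude/latitude input in degrees, the arithmetic looks like this in Python. The `png` extension and the function names are assumptions for the illustration; the thread's own test coordinates are projected and are not used here.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Tile column/row containing a lon/lat point in the OSM tiling scheme."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_path(lon, lat, zoom, ext="png"):
    """Relative path of the tile file; the extension is a placeholder."""
    x, y = lonlat_to_tile(lon, lat, zoom)
    return f"{zoom}/{x}/{y}.{ext}"


print(tile_path(0.0, 0.0, 1))
```

Because the client derives the file name itself, one small request is enough, which is consistent with the single 10 kB request reported for case 3).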
Re: [gdal-dev] Fast Pixel Access
Dan,

I had not heard of the KEA format and it looks promising, except for the need to compile. I am hoping to do this with out-of-the-box GDAL. I also did not see a license statement. GDAL does support HDF5 (another format I am not familiar with), but it looks like the limit is 2 GB for the built-in driver.

My dataset also covers the US, with 55,501 tiles of 7.5 minutes at 0.15012 arc-second resolution (~5 m), a total size of 2,289,001 x 756,001 pixels, 1.8 TB. Creating a single dataset from the tiles can be done but, in our environment, is not cheap.

David

-----Original Message-----
From: gdal-dev-boun...@lists.osgeo.org [mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Daniel Clewley
Sent: Saturday, February 01, 2014 1:45 PM
To: gdal-dev@lists.osgeo.org
Subject: Re: [gdal-dev] Fast Pixel Access

Hi David,

Following on from the VRT / BigTIFF comparison Jukka posted: have you considered storing the data as a single KEA format file, which is based on HDF5?

I have the National Elevation Dataset for the US, which comprises 3,605 tiles of 1 x 1 degree at 1 arc-second resolution. I first created a VRT, then used gdal_translate to convert it to a KEA file. Total size is 421,212 x 252,012 pixels, 77 GB. I also built overviews for fast display (this took a long time and I don't think it is needed for your case).

I've just tried using gdallocationinfo on the file to get pixel information and it takes 0.5 s to get the pixel value back.

The KEA library and GDAL driver source are available from:

https://bitbucket.org/chchrsc/kealib/

and the format is described in:

Peter Bunting, Sam Gillingham, The KEA image file format, Computers & Geosciences, Volume 57, August 2013, Pages 54-58, ISSN 0098-3004, http://dx.doi.org/10.1016/j.cageo.2013.03.025.

If you don't mind having one massive file (in addition to the individual tiles, which could be archived), this might work for your use case.

Thanks,

Dan

> Message: 10
> Date: Fri, 31 Jan 2014 16:15:53 +
> From: David Baker (Geoscience) david.m.ba...@chk.com
> To: 'gdal-dev@lists.osgeo.org' gdal-dev@lists.osgeo.org
> Subject: [gdal-dev] Fast Pixel Access
> Message-ID: 2a18a4344312134b937df938d992264a0508f...@okcexhprd122.chkenergy.net
> Content-Type: text/plain; charset=us-ascii
>
> Dev's,
>
> I have a set of 55,501 bil files in a single directory. They are DEMS data that cover the US in 7.5 minute tiles. I would like to randomly access elevations at given lat/lon's from the whole dataset.
>
> I created a vrt file from the directory of bil files, and have been able to access the elevation at a given lat/lon using gdallocationinfo, but because of the size of the dataset, this operation is somewhat slow.
>
> Can the vrt be indexed? Or, is there a faster, better way to access the pixels? I would first like to do this with the utilities before diving into code (C#).
>
> The files are regularly named based on their location within a 1 arc-second grid.
>
> Thanks,
> David
>
> David M. Baker
> Senior Advisor - Geoscience Technology
> Chesapeake Energy Corporation
> david.m.ba...@chk.com
Re: [gdal-dev] Fast Pixel Access
Luke Roth roth.luke at gmail.com writes: Another thing that might speed up access is setting the config option GDAL_DISABLE_READDIR_ON_OPEN = TRUE, either as an environment variable or on the command line. That should help with GDAL reading the directory each time it opens a dataset. I have an application which reads one value from each of a large number of datasets, and setting this option made it run about 3 times faster.

Hi, You are right. This config option makes GDAL skip reading the remote directory and saves a lot of bandwidth:

VRT case:
  Bytes Received: 4 244 509 (of which the vrt file: 4 192 577)
  Sequence (clock) duration: 00:00:09.9996000
  Was: Bytes Received: 6 459 443, Sequence (clock) duration: 00:00:37.813

BigTIFF case:
  Bytes Received: 2 158 917
  Sequence (clock) duration: 00:00:04.4368000
  Was: Bytes Received: 4 374 137, Sequence (clock) duration: 00:00:30.9192000

Conclusion: Both options are unsuitable for serious use, though amusing to play with. Reading the BigTIFF tile offset index (or whatever it is) seems to mean about 2 MB of compulsory payload traffic. Reading the VRT file means, in this example, 4 MB of payload. If this sort of net access to a large directory of raster files is important to someone, there should be a way to find the right raster file and the right data range in that file with a minimum number of bytes. Perhaps some kind of rtree-indexed vrt file? First aid might be to keep the vrt file on the client side.

-Jukka Rahkonen-
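For readers who want to script the option Luke describes, here is a minimal Python sketch of both ways of setting it: the general GDAL `--config KEY VALUE` switch on the command line, and the environment variable. The helper names `build_locationinfo_command` and `query_value` are made up for illustration, the dataset path is the one from Jukka's examples elsewhere in the thread, and the last lines simply redo the arithmetic on the clock durations reported above.

```python
import os
import shutil
import subprocess


def build_locationinfo_command(dataset, x, y, disable_readdir=True):
    """Return the argument list for one gdallocationinfo point query."""
    cmd = ["gdallocationinfo"]
    if disable_readdir:
        # --config KEY VALUE is GDAL's general switch for config options.
        cmd += ["--config", "GDAL_DISABLE_READDIR_ON_OPEN", "TRUE"]
    # -valonly prints only the pixel value; -geoloc means x/y are given
    # in the georeferenced coordinates of the dataset.
    cmd += ["-valonly", "-geoloc", dataset, str(x), str(y)]
    return cmd


def query_value(dataset, x, y):
    """Run the query if gdallocationinfo is installed, else return None."""
    if shutil.which("gdallocationinfo") is None:
        return None
    # Alternative to --config: pass the option as an environment variable.
    env = dict(os.environ, GDAL_DISABLE_READDIR_ON_OPEN="TRUE")
    result = subprocess.run(
        build_locationinfo_command(dataset, x, y, disable_readdir=False),
        env=env, capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


# Speed-ups implied by the durations in the message above (seconds).
vrt_speedup = 37.813 / 9.9996
bigtiff_speedup = 30.9192 / 4.4368

command = build_locationinfo_command(
    "/vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt",
    389559, 6677412)
print(" ".join(command))
print(f"VRT: {vrt_speedup:.1f}x faster, BigTIFF: {bigtiff_speedup:.1f}x faster")
```

Nothing is executed against the network unless `query_value` is called; the script only prints the command line it would run.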
Re: [gdal-dev] Fast Pixel Access
-Jukka tileindex, MapServer, and the GDAL WMS driver

On Mon, 2014-02-03 at 17:20 +, Jukka Rahkonen wrote: If this sort of net access to a large directory of raster files should be important for someone there should be a way to find the right raster file and right data range in that file with minimum amount of bytes. Perhaps some kind of rtree indexed vrt file? First aid might be to keep the vrt file on the client side. -Jukka Rahkonen-
Re: [gdal-dev] Fast Pixel Access
Hi, Perhaps, but in this game the rule was not to have any GIS servers. Myself, I would rather consider WFS. It could send heights for single points, but also a profile along a line or all values within a polygon. -Jukka-

Brian Case [r...@winkey.org] wrote: -Jukka tileindex, mapserver, and the gdal wms driver
Re: [gdal-dev] Fast Pixel Access
Jukka, No matter which endpoint the user uses to access the data, behind the scenes there must be fast pixel access, correct? Or are you saying that a WFS would do it quickly out of the box? David

-Original Message- From: gdal-dev-boun...@lists.osgeo.org [mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Rahkonen Jukka (Tike) Sent: Monday, February 03, 2014 11:37 AM To: gdal-dev@lists.osgeo.org Subject: Re: [gdal-dev] Fast Pixel Access

Hi, Perhaps, but in this game the rule was not to have any GIS servers. Myself I would rather consider WFS. It could send heights from single points but also a profile along a line or all values within a polygon. -Jukka-
Re: [gdal-dev] Fast Pixel Access
Hi Jukka, David and others,

In the past we rendered elevation data into Mercator tiles with http://www.maptiler.com/, the successor to my GDAL2Tiles.py script, for extremely fast pixel access to the elevation values at a given geographic location, without needing server software to host the data. We made the tiles from the CGIAR 90m DEM for the whole world.

In fact, the raw elevation was mapped into RGB space for this purpose. Decoding on the client side is then very easy, and you have precise elevation values for the whole surrounding area preloaded.

You may need to do something similar if you want to implement very fast client-side hill-shading in a web browser canvas (similar to http://dev.klokantech.com/klokan/hillshading/), or a dynamic elevation profile drawn while moving the mouse over a map, and many other tasks with data loaded directly in the client. The tiles can also be visualised in 3D, see: http://vimeo.com/29605292 For WebGL there are now more efficient ways (direct binary data structures instead of images).

See:
http://dev.klokantech.com/srtm/srtm_decode.html
http://www.klokantech.com/labs/dem-color-encoding/
http://dev.klokantech.com/srtm/googlemaps.html

The source code of the GDAL utility for encoding elevation data into RGB is available online: https://github.com/webglearth/gdaldem_web but you may not need it if direct reading via JavaScript in the web browser is not required for your application.

Regards, Klokan Petr Pridal

On Mon, Feb 3, 2014 at 7:48 PM, David Baker (Geoscience) david.m.ba...@chk.com wrote: Jukka, No matter the endpoint the user uses to access the data, behind the scenes, there must be fast pixel access, correct? Or are you saying that at WFS would do it quickly out of the box? David
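Klokan's remark that the raw elevation was mapped into RGB space can be illustrated with a short Python sketch. The message does not say which encoding his tiles use, so the scheme below is an assumption made up for illustration: a 24-bit fixed-point integer split across the three colour channels, with a hypothetical offset (`OFFSET_M`) and vertical step (`STEP_M`). It is not the encoding of the linked tools.

```python
# Assumed parameters, chosen only for this illustration.
OFFSET_M = 10000.0  # shifts negative elevations into the non-negative range
STEP_M = 0.1        # vertical resolution represented by one integer step


def encode_elevation(elev_m):
    """Pack an elevation in metres into an (R, G, B) triple of 8-bit values."""
    steps = round((elev_m + OFFSET_M) / STEP_M)
    if not 0 <= steps < 2 ** 24:
        raise ValueError("elevation outside the encodable range")
    # High byte goes to red, middle byte to green, low byte to blue.
    return (steps >> 16) & 0xFF, (steps >> 8) & 0xFF, steps & 0xFF


def decode_elevation(r, g, b):
    """Invert encode_elevation: rebuild the integer and rescale to metres."""
    steps = (r << 16) | (g << 8) | b
    return steps * STEP_M - OFFSET_M


for elevation in (189.84, 883.64, -5.0):
    rgb = encode_elevation(elevation)
    print(elevation, rgb, round(decode_elevation(*rgb), 1))
```

With these assumed parameters the round trip is exact to within half a step (0.05 m); a client that has the tile image in memory only needs the three channel values of one pixel to recover the height.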
Re: [gdal-dev] Fast Pixel Access
Hi, I made a few tests, and here are my conclusions. The hypothesis is that someone wants to build a DEM query service that uses gdallocationinfo for queries, with the DEM data accessed as files from a standard web site. I compared three alternatives:

1) There are thousands of DEM files on the server, combined with a VRT file.
2) There is only one DEM file, a BigTIFF.
3) The DEM is split into tiles in an x/y/z tile directory structure, as in Google Maps or OpenStreetMap tiles.

My test data covers Finland with a 10 m grid size; as deflate-compressed TIFFs they make about 10 GB together.

Before going on, keep in mind that speed needs indexes. The better the index, the less unnecessary data to read. In case 1) the first-level index is the VRT file. The second-level index, if it exists, is in the headers of the real DEM files. It may be possible to jump to the correct offset from the beginning of the DEM data and read only a part of the file. In case 2) the index is in the internal TIFF directory. If the BigTIFF is tiled, access to the tiles should be rather effective. And finally, in case 3) the index is built into the directory structure and tiling schema used for saving the tiles. The schema is so well known that tile map service clients can directly ask for a certain file name if they know the coordinates and scale.

Conclusions:

1) - The whole VRT file must be read. Caching the vrt file would make subsequent requests faster.
   - For some reason gdallocationinfo wants to get the directory listing of the directory where the vrt file is. This is slow and generates lots of traffic if the thousands of DEM files are in the same directory. It would probably be faster to have them in another directory.
2) - The BigTIFF route is more straightforward, but gdallocationinfo still needs to do many big range reads.
   - In this case, too, gdallocationinfo reads the directory of the target file. It would be good to keep this directory small. Don't do as I did: besides the BigTIFF DEM file, which was the only file needed, my directory also held the vrt and the thousands of original DEMs from the previous test. At least this is a known issue now, and we know how to avoid it. In my case reading the directory made 2.2 MB of web traffic, all or most of it in vain.
3) - I used the OpenStreetMap tile service as the test data for the third test. In this case gdallocationinfo knows exactly which tile to request, and it makes only one request. It also seems to cache some tiles on the client side, which means that queries for nearby locations may hit the cached tile and be very fast.

Summary statistics:

1) gdallocationinfo makes 6 requests and reads 6.4 MB of data
2) gdallocationinfo makes 8 requests and reads 4.3 MB of data
3) gdallocationinfo makes 1 request and reads 10 kB of data

The requests I used are these:

1) gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc 389559 6677412
2) gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc 389559 6677412
3) gdallocationinfo frmt_wms_openstreetmap_tms.xml -geoloc 389559 6677412

I know that the queried place in 3) is not the same, because the SRIDs of the data differ, and that OSM returns 3-band RGB values rather than 16-bit DEM heights. That does not matter here; the idea is what is important.

My conclusion is that you should cut your DEM into tiles with, for example, gdal2tiles or MapTiler. The result could actually be quite speedy, and perhaps using 126x126 tiles could make it a bit faster still. I hope they can create tiles as 16-bit TIFFs.

I am sure that these results are not scientifically sound, but I am also sure that the difference between 6.4 MB / 4.3 MB / 10 kB is something to think about, especially if you dream about a mobile service. I placed the requests which gdallocationinfo made during these tests at http://latuviitta.org/documents/gdallocationinfo_requests.txt

-Jukka Rahkonen-

___ gdal-dev mailing list gdal-dev@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/gdal-dev
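Jukka's point that in case 3) the index is built into the tiling schema can be made concrete with a Python sketch. It uses the standard Web Mercator ("slippy map") tile formula and the zoom/x/y path layout of OpenStreetMap tiles; the 256-pixel tile size and the `.png` extension are assumptions for illustration, not something stated in the message.

```python
import math

MAX_LAT = 85.05112878  # latitude limit of the Web Mercator tile pyramid


def lonlat_to_tile_pixel(lon_deg, lat_deg, zoom, tile_size=256):
    """Return (tile_x, tile_y, pixel_x, pixel_y) for a WGS84 position."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lat_deg = max(-MAX_LAT, min(MAX_LAT, lat_deg))
    x = (lon_deg + 180.0) / 360.0 * n
    y = (1.0 - math.asinh(math.tan(math.radians(lat_deg))) / math.pi) / 2.0 * n
    # Clamp so that lon = 180 or the southern limit stays inside the grid.
    tile_x = min(int(x), n - 1)
    tile_y = min(int(y), n - 1)
    pixel_x = min(int((x - tile_x) * tile_size), tile_size - 1)
    pixel_y = min(int((y - tile_y) * tile_size), tile_size - 1)
    return tile_x, tile_y, pixel_x, pixel_y


def tile_path(zoom, tile_x, tile_y, ext="png"):
    """Build the relative file name a client would request directly."""
    return f"{zoom}/{tile_x}/{tile_y}.{ext}"


tx, ty, px, py = lonlat_to_tile_pixel(24.94, 60.17, zoom=12)
print(tile_path(12, tx, ty), "pixel", (px, py))
```

No lookup structure has to be downloaded: the client computes the one file name it needs from the coordinates, which is consistent with the single small request Jukka measured for the tiled case.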
Re: [gdal-dev] Fast Pixel Access
David Baker (Geoscience) david.m.baker at chk.com writes: I created a vrt file from the directory of bil files, and have been able to access the elevation at a given lat/lon using gdallocationinfo, but because of the size of the dataset, this operation is somewhat slow. Can the vrt be indexed? Or, is there a faster, better way to access the pixels?

I was experimenting with something similar (a GIS service without a GIS server), and I have some examples online, but because of the HTTP connection a speed comparison does not make sense.

Vrt combining biomass data from 13 single-band tif files:
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/metla/puuston_tilavuus.vrt -geoloc 389559 6677412

DEM of Finland with 10x10 m grid through vrt:
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc 389559 6677412

The same from a single BigTIFF:
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc 389559 6677412

Feel free to download the originals if you want; they are all made from open data. Just mention the National Land Survey of Finland, 2013 for the DEM and the Finnish Forest Research Institute, 2013 for the biomass data if you publish the data somewhere. The DEM datasets are about 10 GB each (BigTIFF + the original small ones). My TIFFs have tiles, but for this usage, where only the value of a single pixel is interesting, striped TIFFs could be as fast to read as tiled TIFFs. A trial would tell everything.

-Jukka Rahkonen-

___ gdal-dev mailing list gdal-dev@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/gdal-dev
Re: [gdal-dev] Fast Pixel Access
Even, I am not sure how to profile, as I do not have access to the code to profile. I did do a timing test...

vrt file = 22,970 KB
bil file = 35,180 KB * 55,501

I piped five locations from the loc.txt file:

-96.0 36.0
-98.0 37.0
-100.0 38.0
-99.0 39.0
-101.0 35.0

gdallocationinfo -valonly -geoloc intermap.vrt < loc.txt

189.841857910156   25.5 sec
384.857452392578   22.6 sec
762.015930175781   22.9 sec
550.719116210938   23.6 sec
883.637023925781   22.9 sec

Note: I used a lap timer on my iPhone to capture the split times as the results appeared in the console window. Does this give any insight? David

-Original Message- From: gdal-dev-boun...@lists.osgeo.org [mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Even Rouault Sent: Saturday, February 01, 2014 1:28 AM To: Brian Case Cc: gdal-dev@lists.osgeo.org Subject: Re: [gdal-dev] Fast Pixel Access

On Saturday 01 February 2014 00:23:13, Brian Case wrote: evenr what about the use of a tileindex?

Do you really mean a tileindex as produced by gdaltindex? Well, that's not exactly the same beast as a VRT, but yes, if it was recognized as a GDAL dataset then you could potentially save the cost of XML parsing. One could imagine that the VRT driver would accept a tileindex as an alternate connection string. Anyway, it would be interesting to first profile where the time is spent in David's use case. If it's in the XML parsing, then I can't see what could easily be improved in that area. If it's the intersection, then there's potential for improvement.

seems an intersection with a set of polys first would be quick brian

On Fri, 2014-01-31 at 19:30 +0100, Even Rouault wrote: On Friday 31 January 2014 17:15:53, David Baker (Geoscience) wrote: Dev's, I have a set of 55,501 bil files in a single directory. [...] Can the vrt be indexed?

No, it isn't currently, although I think it could be improved to have an in-memory index with moderate effort. But are you sure the slowness is due to the lack of an index? 55,000 is a big number, but not that big. Maybe the slowness just comes from the opening time (XML parsing) of such a big VRT. That would need to be profiled to be sure where the bottleneck is.

-- Geospatial professional services http://even.rouault.free.fr/services.html
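Even's two hypotheses (XML parsing time versus the search for the intersecting source) can be separated with a rough experiment on a synthetic file. The Python sketch below is only an analogy: it builds a VRT-shaped document with about as many `SimpleSource` entries as David's mosaic, parses it with Python's `xml.etree` rather than GDAL's own XML parser, and then does a plain linear scan over the `DstRect` rectangles. The tile names, the 3000-pixel tile size and the 236 x 236 layout are all invented for the test, so the timings indicate orders of magnitude only and say nothing definite about GDAL itself.

```python
import time
import xml.etree.ElementTree as ET


def make_synthetic_vrt(cols, rows, tile=3000):
    """Build a VRT-like XML string with cols * rows SimpleSource entries."""
    parts = [
        f'<VRTDataset rasterXSize="{cols * tile}" rasterYSize="{rows * tile}">',
        '<VRTRasterBand dataType="Float32" band="1">',
    ]
    for r in range(rows):
        for c in range(cols):
            parts.append(
                "<SimpleSource>"
                f'<SourceFilename relativeToVRT="1">tile_{r}_{c}.bil</SourceFilename>'
                "<SourceBand>1</SourceBand>"
                f'<SrcRect xOff="0" yOff="0" xSize="{tile}" ySize="{tile}"/>'
                f'<DstRect xOff="{c * tile}" yOff="{r * tile}" '
                f'xSize="{tile}" ySize="{tile}"/>'
                "</SimpleSource>"
            )
    parts += ["</VRTRasterBand>", "</VRTDataset>"]
    return "\n".join(parts)


def find_source(root, px, py):
    """Linear scan: file name of the source whose DstRect contains (px, py)."""
    for src in root.iter("SimpleSource"):
        d = src.find("DstRect").attrib
        x0, y0 = int(d["xOff"]), int(d["yOff"])
        if x0 <= px < x0 + int(d["xSize"]) and y0 <= py < y0 + int(d["ySize"]):
            return src.find("SourceFilename").text
    return None


xml_text = make_synthetic_vrt(cols=236, rows=236)  # 55,696 sources

t0 = time.perf_counter()
root = ET.fromstring(xml_text)
t1 = time.perf_counter()
hit = find_source(root, 500_000, 400_000)
t2 = time.perf_counter()

print(f"XML size: {len(xml_text) / 1e6:.1f} MB")
print(f"parse: {t1 - t0:.3f} s, linear scan: {t2 - t1:.3f} s, hit: {hit}")
```

If both steps come out far below the 20-odd seconds per lookup that David measured, that would point away from parsing and intersection and toward per-file open costs, which is the kind of question the profiling suggested in the thread is meant to answer.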
Re: [gdal-dev] Fast Pixel Access
Jukka,

Jukka wrote: I was experimenting with something like a GIS service without a GIS server) and I have some examples online but...

I am looking to do as you have: a RESTful service to query the elevation at a given location. This will be used in a DQM process as well as in a geologic application that needs the elevation of a proposed wellsite for data mining. In both cases, thousands if not tens of thousands of calls will be made, so performance is an issue. David
Re: [gdal-dev] Fast Pixel Access
This is an application that is just screaming for a spatial index. For starters, you could build a SpatiaLite db of the individual file extents that returns the filename to pass to gdallocationinfo.

On Feb 1, 2014, at 9:14 AM, David Baker (Geoscience) david.m.ba...@chk.com wrote: I am looking to do as you have, a RESTful service to query the elevation at a given location. [...] In both cases, 1000's if not tens of 1000's of calls will be made so performance is an issue. David
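Norman's idea of an index that maps a location to the one file to pass to gdallocationinfo can be sketched in a few lines of Python. Note the substitution: instead of a SpatiaLite database, this sketch uses a plain in-memory dictionary keyed by grid cell, which only works because David's tiles sit on a regular 7.5-minute grid. The file names and the `(filename, min_lon, min_lat)` input format are hypothetical.

```python
import math

TILE_DEG = 7.5 / 60.0  # 7.5-minute tiles, as described in the original post


def cell_of(lon, lat):
    """Integer grid cell (column, row) containing a lon/lat position."""
    return math.floor(lon / TILE_DEG), math.floor(lat / TILE_DEG)


def build_index(extents):
    """Map grid cells to file names.

    extents: iterable of (filename, min_lon, min_lat), where min_lon and
    min_lat are the lower-left corner of each tile.
    """
    index = {}
    for name, min_lon, min_lat in extents:
        # Use the tile centre so corner rounding cannot pick a neighbour.
        centre = (min_lon + TILE_DEG / 2, min_lat + TILE_DEG / 2)
        index[cell_of(*centre)] = name
    return index


def find_tile(index, lon, lat):
    """Return the file covering (lon, lat), or None if no tile is indexed."""
    return index.get(cell_of(lon, lat))


# Hypothetical file names; real extents would come from the tiles themselves.
tiles = [
    ("n36w096_125.bil", -96.125, 36.0),
    ("n36w096_000.bil", -96.0, 36.0),
]
index = build_index(tiles)
print(find_tile(index, -96.05, 36.01))
print(find_tile(index, -96.0, 36.0))
```

Each lookup is a single dictionary access, so the per-query cost no longer depends on the number of tiles; the file name returned would then be handed to gdallocationinfo in place of the large VRT. For irregular extents a real spatial index such as the SpatiaLite table Norman suggests would be the appropriate tool.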
Re: [gdal-dev] Fast Pixel Access
Norman,

Yes it does... At first I am looking to see if I can do this off the shelf with just the tools in the GDAL/OGR toolset. Are you thinking of using OGR to do the spatial query with the spatialite db? Would a tile index shapefile with a .qix index work?

David

From: Norman Vine [mailto:n...@cape.com]
Sent: Saturday, February 01, 2014 8:23 AM
To: David Baker (Geoscience)
Cc: gdal-dev
Subject: Re: [gdal-dev] Fast Pixel Access

This is an application that is just screaming for a spatial index.

For starters you could build a spatialite db of the individual file extents that returns the filename to pass to gdallocationinfo.

On Feb 1, 2014, at 9:14 AM, David Baker (Geoscience) david.m.ba...@chk.com wrote:

> Jukka,
>
> Jukka wrote:
> > I was experimenting with something like a GIS service without a GIS server and I have some examples online but...
>
> I am looking to do as you have: a RESTful service to query the elevation at a given location. This will be used in a DQM process as well as a geologic application that needs the elevation of a proposed wellsite for data mining. In both cases, 1000's if not tens of 1000's of calls will be made, so performance is an issue.
>
> David
>
> -----Original Message-----
> From: gdal-dev-boun...@lists.osgeo.org [mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Jukka Rahkonen
> Sent: Saturday, February 01, 2014 7:09 AM
> To: gdal-dev@lists.osgeo.org
> Subject: Re: [gdal-dev] Fast Pixel Access
>
> David Baker (Geoscience) david.m.baker at chk.com writes:
> > Dev's,
> >
> > I have a set of 55,501 bil files in a single directory. They are DEM data that cover the US in 7.5 minute tiles. I would like to randomly access elevations at given lat/lon locations from the whole dataset.
> >
> > I created a vrt file from the directory of bil files, and have been able to access the elevation at a given lat/lon using gdallocationinfo, but because of the size of the dataset, this operation is somewhat slow.
> >
> > Can the vrt be indexed? Or, is there a faster, better way to access the pixels?
> >
> > I would first like to do this with the utilities before diving into code (C#). The files are regularly named based on their location within a 1 arc-second grid.
>
> I was experimenting with something like a GIS service without a GIS server and I have some examples online, but because of the http connection the speed comparison does not make sense.
>
> Vrt combining biomass data from 13 single band tif files
> gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/metla/puuston_tilavuus.vrt -geoloc 389559 6677412
>
> DEM of Finland with 10x10 m grid through vrt
> gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc 389559 6677412
>
> The same from a single BigTIFF
> gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc 389559 6677412
>
> Feel free to download the originals if you want; they are all made from open data. Just mention the National Land Survey of Finland, 2013 for the DEM and the Finnish Forest Research Institute, 2013 for the biomass data if you publish data somewhere. The DEM datasets are about 10 GB each (BigTIFF + the original small ones). My tiffs are tiled, but for this usage, where only the value of a single pixel is of interest, striped tiffs could be as fast to read as tiled tiffs. A trial would tell everything.
>
> -Jukka Rahkonen-
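Norman's suggestion of a spatial index over the individual file extents can be illustrated without any GIS library. The following is only a minimal sketch in plain Python, not the spatialite or .qix approach discussed above: it buckets extents into fixed-size cells so that a point lookup only tests the few rectangles sharing its cell. The class name `ExtentIndex`, the cell size, and the file names are assumptions made up for illustration.

```python
import math
from collections import defaultdict


class ExtentIndex:
    """Bucket file extents into fixed-size cells for fast point lookup."""

    def __init__(self, cell_size=1.0):
        self.cell_size = cell_size
        # (col, row) -> list of ((xmin, ymin, xmax, ymax), filename)
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def add(self, filename, xmin, ymin, xmax, ymax):
        """Register one file under every cell its extent touches."""
        c0, r0 = self._cell(xmin, ymin)
        c1, r1 = self._cell(xmax, ymax)
        for col in range(c0, c1 + 1):
            for row in range(r0, r1 + 1):
                self.cells[(col, row)].append(((xmin, ymin, xmax, ymax), filename))

    def query(self, x, y):
        """Return the filenames whose extent contains (x, y)."""
        hits = []
        for (xmin, ymin, xmax, ymax), filename in self.cells.get(self._cell(x, y), []):
            # Half-open intervals so a point on a shared edge matches one tile only.
            if xmin <= x < xmax and ymin <= y < ymax:
                hits.append(filename)
        return hits


# Hypothetical file names and extents (two adjacent 7.5-minute tiles).
index = ExtentIndex(cell_size=1.0)
index.add("tile_a.bil", -96.125, 36.0, -96.0, 36.125)
index.add("tile_b.bil", -96.0, 36.0, -95.875, 36.125)
print(index.query(-96.0, 36.0))
```

The filename returned by `query` is what would then be passed to gdallocationinfo instead of the full VRT. A real deployment would more likely let OGR query an indexed tile index, as David asks; the sketch only shows why the lookup itself can be cheap.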
Re: [gdal-dev] Fast Pixel Access
David Baker (Geoscience) david.m.baker at chk.com writes:

> Jukka,
>
> Jukka wrote:
> > I was experimenting with something like a GIS service without a GIS server and I have some examples online but...
>
> I am looking to do as you have: a RESTful service to query the elevation at a given location. This will be used in a DQM process as well as a geologic application that needs the elevation of a proposed wellsite for data mining. In both cases, 1000's if not tens of 1000's of calls will be made, so performance is an issue.

I don't believe that gdallocationinfo is the right thing for you. There are only 60 seconds in a minute and our experiments show that each request takes seconds or even tens of seconds. Your users will hang you.

Now what to do instead? A WMS/WCS service can send a piece of DEM with thousands of pixels as GeoTIFF in a second, and a heavy client like QGIS could continue the analysis. Or you can use the existing vrt file and read the region of interest with gdal_translate. This request is not very fast either, but it brings you one million height values faster than possible users could find the rope:

gdal_translate -of GTiff -srcwin 1 1 1000 1000 /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt test_1000by1000.tif

Or a WFS service could be used to send only those DEM cells which intersect the requested point/line/area. That could be nice for further processing, but in that case your DEM would have to be converted into points or polygons. I wonder if PostGIS raster has something to offer you. Or perhaps there is some other clever way to do the job with light and powerful tools. Ideas are welcome.

-Jukka Rahkonen-
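The -srcwin option used above takes pixel offsets, so reading a region of interest around a known coordinate needs a conversion from map coordinates to column and row. The sketch below shows that arithmetic for a north-up raster using GDAL's six-element geotransform layout (origin x, pixel width, rotation, origin y, rotation, negative pixel height). The function name `srcwin_around` and the geotransform values in the example are assumptions for illustration; they are not the actual values of the Finnish DEM, and the window is not clamped to the raster's far edges.

```python
import math


def srcwin_around(geotransform, x, y, size=1000):
    """Return (xoff, yoff, xsize, ysize) for a size-by-size window centred on (x, y).

    Assumes a north-up raster, i.e. the two rotation terms are zero.
    """
    origin_x, pixel_w, _, origin_y, _, pixel_h = geotransform
    col = math.floor((x - origin_x) / pixel_w)
    row = math.floor((y - origin_y) / pixel_h)  # pixel_h is negative for north-up
    # Keep the window from starting left of or above the raster.
    xoff = max(col - size // 2, 0)
    yoff = max(row - size // 2, 0)
    return xoff, yoff, size, size


# Made-up geotransform: upper-left corner at (300000, 6700000), 10 m pixels.
gt = (300000.0, 10.0, 0.0, 6700000.0, 0.0, -10.0)
print(srcwin_around(gt, 389559, 6677412))
```

The four numbers returned are what would follow -srcwin on the gdal_translate command line.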
Re: [gdal-dev] Fast Pixel Access
> Now what to do instead? A WMS/WCS service can send a piece of DEM with thousands of pixels as GeoTIFF in a second, and a heavy client like QGIS could continue the analysis. Or you can use the existing vrt file and read the region of interest with gdal_translate. This request is not very fast either, but it brings you one million height values faster than possible users could find the rope:
>
> gdal_translate -of GTiff -srcwin 1 1 1000 1000 /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt test_1000by1000.tif
>
> Or a WFS service could be used to send only those DEM cells which intersect the requested point/line/area. That could be nice for further processing, but in that case your DEM would have to be converted into points or polygons. I wonder if PostGIS raster has something to offer you. Or perhaps there is some other clever way to do the job with light and powerful tools. Ideas are welcome.

Funny, I had tried this idea with GMT, but it took an awfully long time, ~34 s, while the gdallocationinfo approach took ~14 s:

echo 389559 6677412 | grdtrack -G/vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -R389000/39/6677400/6678000

I have no idea why it takes longer through GMT, since it's still GDAL that gets the data from the remote location. Furthermore, I request only a small grid chunk (the -R... limits).

Joaquim
Re: [gdal-dev] Fast Pixel Access
Hi David,

Following on from the VRT / BigTIFF comparison Jukka posted: have you considered storing the data as a single KEA format file, which is based on HDF5?

I have the National Elevation Dataset for the US, which comprises 3,605 1 x 1 degree tiles at 1 arc-second resolution. I first created a VRT, then used gdal_translate to convert it to a KEA file. Total size is 421,212 x 252,012 pixels, 77 GB. I also built overviews for fast display (this took a long time and I don't think it is needed for your case).

I've just tried using gdallocationinfo on the file to get pixel information, and it takes 0.5 s to get the pixel value back.

The KEA library and GDAL driver source are available from:
https://bitbucket.org/chchrsc/kealib/

and the format is described in:
Peter Bunting, Sam Gillingham, The KEA image file format, Computers & Geosciences, Volume 57, August 2013, Pages 54-58, ISSN 0098-3004, http://dx.doi.org/10.1016/j.cageo.2013.03.025.

If you don't mind having one massive file (in addition to the individual tiles, which could be archived), this might work for your use case.

Thanks,
Dan

Message: 10
Date: Fri, 31 Jan 2014 16:15:53 +
From: David Baker (Geoscience) david.m.ba...@chk.com
To: 'gdal-dev@lists.osgeo.org' gdal-dev@lists.osgeo.org
Subject: [gdal-dev] Fast Pixel Access
Message-ID: 2a18a4344312134b937df938d992264a0508f...@okcexhprd122.chkenergy.net

Dev's,

I have a set of 55,501 bil files in a single directory. They are DEM data that cover the US in 7.5 minute tiles. I would like to randomly access elevations at given lat/lon locations from the whole dataset.

I created a vrt file from the directory of bil files, and have been able to access the elevation at a given lat/lon using gdallocationinfo, but because of the size of the dataset, this operation is somewhat slow.

Can the vrt be indexed? Or, is there a faster, better way to access the pixels?

I would first like to do this with the utilities before diving into code (C#). The files are regularly named based on their location within a 1 arc-second grid.

Thanks,
David
[gdal-dev] Fast Pixel Access
Dev's,

I have a set of 55,501 bil files in a single directory. They are DEM data that cover the US in 7.5 minute tiles. I would like to randomly access elevations at given lat/lon locations from the whole dataset.

I created a vrt file from the directory of bil files, and have been able to access the elevation at a given lat/lon using gdallocationinfo, but because of the size of the dataset, this operation is somewhat slow.

Can the vrt be indexed? Or, is there a faster, better way to access the pixels?

I would first like to do this with the utilities before diving into code (C#). The files are regularly named based on their location within a 1 arc-second grid.

Thanks,
David

David M. Baker
Senior Advisor - Geoscience Technology
Chesapeake Energy Corporation
david.m.ba...@chk.com
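Because the tiles are 7.5 minutes wide and regularly named, one option worth noting is to compute the tile directly from the coordinate and skip the search altogether. The sketch below shows the arithmetic only. The thread does not give the actual naming scheme, so the pattern in `tile_filename` is purely hypothetical, as are both function names.

```python
import math

TILE_DEG = 7.5 / 60.0  # 7.5 arc-minutes = 0.125 degrees


def tile_for(lon, lat):
    """Return the south-west corner (lon, lat) of the 7.5-minute tile holding the point."""
    col = math.floor(lon / TILE_DEG)
    row = math.floor(lat / TILE_DEG)
    return col * TILE_DEG, row * TILE_DEG


def tile_filename(lon, lat):
    """Build a file name from the tile corner; this pattern is an assumption."""
    west, south = tile_for(lon, lat)
    return "dem_{:.3f}_{:.3f}.bil".format(west, south)


print(tile_for(-98.03, 37.2))
print(tile_filename(-98.03, 37.2))
```

With the real naming rule substituted, the resulting single file could be handed to gdallocationinfo directly.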
Re: [gdal-dev] Fast Pixel Access
On Friday, 31 January 2014 17:15:53, David Baker (Geoscience) wrote:
> Dev's,
>
> I have a set of 55,501 bil files in a single directory. They are DEM data that cover the US in 7.5 minute tiles. I would like to randomly access elevations at given lat/lon locations from the whole dataset.
>
> I created a vrt file from the directory of bil files, and have been able to access the elevation at a given lat/lon using gdallocationinfo, but because of the size of the dataset, this operation is somewhat slow.
>
> Can the vrt be indexed?

No, it isn't currently, although I think it could be improved to have an in-memory index with moderate effort. But are you sure the slowness is due to the lack of an index? 55,000 is a big number, but not that big. Maybe the slowness just comes from the opening time (XML parsing) of such a big VRT. That would need to be profiled to be sure where the bottleneck is.

> Or, is there a faster, better way to access the pixels?
>
> I would first like to do this with the utilities before diving into code (C#). The files are regularly named based on their location within a 1 arc-second grid.
>
> Thanks,
> David

--
Geospatial professional services
http://even.rouault.free.fr/services.html
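Before reaching for a real profiler, a coarse first measurement is simply to time the whole command a few times. The helper below is a minimal sketch using only the Python standard library; the name `time_command` is an assumption. It measures wall-clock time per run, which lumps together process start-up, VRT opening and the pixel lookup, so it can show how slow a call is but not which phase is responsible.

```python
import subprocess
import sys
import time


def time_command(args, repeats=3):
    """Run a command several times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time is of interest here.
        subprocess.run(args, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        timings.append(time.perf_counter() - start)
    return timings


# Self-check with the Python interpreter itself, so the example runs anywhere.
print(time_command([sys.executable, "-c", "pass"], repeats=2))

# With GDAL installed, the same helper could wrap a lookup such as:
# time_command(["gdallocationinfo", "-valonly", "-geoloc",
#               "intermap.vrt", "-96.0", "36.0"])
```

Comparing the time for the VRT against the time for a single tile file gives a rough idea of how much of the cost comes from the VRT itself.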
Re: [gdal-dev] Fast Pixel Access
evenr

what about the use of a tileindex? seems an intersection with a set of polys first would be quick

brian

On Fri, 2014-01-31 at 19:30 +0100, Even Rouault wrote:
> On Friday, 31 January 2014 17:15:53, David Baker (Geoscience) wrote:
> > Dev's,
> >
> > I have a set of 55,501 bil files in a single directory. They are DEM data that cover the US in 7.5 minute tiles. I would like to randomly access elevations at given lat/lon locations from the whole dataset.
> >
> > I created a vrt file from the directory of bil files, and have been able to access the elevation at a given lat/lon using gdallocationinfo, but because of the size of the dataset, this operation is somewhat slow.
> >
> > Can the vrt be indexed?
>
> No, it isn't currently, although I think it could be improved to have an in-memory index with moderate effort. But are you sure the slowness is due to the lack of an index? 55,000 is a big number, but not that big. Maybe the slowness just comes from the opening time (XML parsing) of such a big VRT. That would need to be profiled to be sure where the bottleneck is.
>
> > Or, is there a faster, better way to access the pixels?
> >
> > I would first like to do this with the utilities before diving into code (C#). The files are regularly named based on their location within a 1 arc-second grid.
> >
> > Thanks,
> > David
Re: [gdal-dev] Fast Pixel Access
On Saturday, 1 February 2014 00:23:13, Brian Case wrote:
> evenr
>
> what about the use of a tileindex?

You really mean a tileindex as produced by gdaltindex? Well, that's not exactly the same beast as a VRT, but yes, if it was recognized as a GDAL dataset then you could potentially save the cost of XML parsing. One could imagine that the VRT driver would accept a tileindex as an alternate connection string.

Anyway, it would be interesting to first profile where the time is spent in David's use case. If it's in the XML parsing, then I can't see what could be easily improved in that area. If it's the intersection, then there's potential for improvement.

> seems an intersection with a set of polys first would be quick
>
> brian
>
> On Fri, 2014-01-31 at 19:30 +0100, Even Rouault wrote:
> > On Friday, 31 January 2014 17:15:53, David Baker (Geoscience) wrote:
> > > Dev's,
> > >
> > > I have a set of 55,501 bil files in a single directory. They are DEM data that cover the US in 7.5 minute tiles. I would like to randomly access elevations at given lat/lon locations from the whole dataset.
> > >
> > > I created a vrt file from the directory of bil files, and have been able to access the elevation at a given lat/lon using gdallocationinfo, but because of the size of the dataset, this operation is somewhat slow.
> > >
> > > Can the vrt be indexed?
> >
> > No, it isn't currently, although I think it could be improved to have an in-memory index with moderate effort. But are you sure the slowness is due to the lack of an index? 55,000 is a big number, but not that big. Maybe the slowness just comes from the opening time (XML parsing) of such a big VRT. That would need to be profiled to be sure where the bottleneck is.
> >
> > > Or, is there a faster, better way to access the pixels?
> > >
> > > I would first like to do this with the utilities before diving into code (C#). The files are regularly named based on their location within a 1 arc-second grid.
> > >
> > > Thanks,
> > > David

--
Geospatial professional services
http://even.rouault.free.fr/services.html
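To get a feel for how expensive an unindexed intersection test over this many rectangles is, the brute-force case can be timed in isolation. The sketch below is plain Python and has nothing to do with how the VRT driver is implemented: it builds a synthetic grid of 55,500 rectangles (300 x 185 tiles of 0.125 degrees, a made-up layout close to David's 55,501 files) and scans all of them for one point. The function names are assumptions.

```python
import time


def build_tiles(n_cols=300, n_rows=185, size=0.125, west=-125.0, south=24.0):
    """Lay out n_cols * n_rows rectangles as (xmin, ymin, xmax, ymax); synthetic data."""
    return [
        (west + c * size, south + r * size,
         west + (c + 1) * size, south + (r + 1) * size)
        for r in range(n_rows)
        for c in range(n_cols)
    ]


def linear_lookup(tiles, x, y):
    """Return the indices of all rectangles containing (x, y), by brute force."""
    return [i for i, (xmin, ymin, xmax, ymax) in enumerate(tiles)
            if xmin <= x < xmax and ymin <= y < ymax]


tiles = build_tiles()
start = time.perf_counter()
hits = linear_lookup(tiles, -96.0, 36.0)
elapsed = time.perf_counter() - start
print(len(tiles), "rectangles,", len(hits), "hit(s),",
      round(elapsed * 1000, 2), "ms")
```

Even in an interpreted language a full scan like this should finish in a small fraction of a second, which suggests the intersection test on its own is unlikely to explain lookups that take many seconds.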