Re: [gdal-dev] Fast Pixel Access

2014-02-10 Thread David Baker (Geoscience)
Even,

No, not an i386...  A Dell Precision T3500 w/Intel W3680 @ 3.33 GHz, 6x2 cores, 
with 12.0 GB.  Though the data is on the network, not local, with 1 Gbps access. 
Setting GDAL_DISABLE_READDIR_ON_OPEN = TRUE did significantly increase the speed.  
Does the BIL driver read the whole file into memory first?  Might a direct read 
be faster?

And Even, please excuse my ignorance, but what is gdb?  I really would like 
to do the profiling.

David

-Original Message-
From: Even Rouault [mailto:even.roua...@mines-paris.org]
Sent: Sunday, February 09, 2014 6:36 AM
To: David Baker (Geoscience)
Cc: 'Brian Case'; 'gdal-dev@lists.osgeo.org'
Subject: Re: [gdal-dev] Fast Pixel Access

On Saturday 01 February 2014 15:04:46, David Baker (Geoscience) wrote:
 Even,

 I am not sure how to profile as I do not have access to the code to
 profile.  I did do a timing test...

 vrt file = 22,970 KB
 bil file = 35,180 KB * 55,501

 I piped five locations from the loc.txt file:
 -96.0 36.0
 -98.0 37.0
 -100.0 38.0
 -99.0 39.0
 -101.0 35.0

 gdallocationinfo -valonly -geoloc intermap.vrt < loc.txt
 189.841857910156   25.5 sec
 384.857452392578   22.6 sec
 762.015930175781   22.9 sec
 550.719116210938   23.6 sec
 883.637023925781   22.9 sec

 Note: I used a lap timer on my iPhone to capture the split times as the
 results appeared in the console window.  Does this give any insight?
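
 A stopwatch works, but the split times could also be captured in code. The
 following is only a minimal sketch: time_lookups and fake_lookup are
 hypothetical names, and fake_lookup is a stand-in for whatever actually
 fetches the elevation (it does not call GDAL).

```python
import time

def time_lookups(lookup, points):
    """Return (point, value, seconds) for each lookup, timed individually."""
    results = []
    for lon, lat in points:
        start = time.perf_counter()
        value = lookup(lon, lat)
        results.append(((lon, lat), value, time.perf_counter() - start))
    return results

def fake_lookup(lon, lat):
    # Hypothetical stand-in for the real elevation query.
    return round(lon + lat, 1)

# The five locations from loc.txt.
points = [(-96.0, 36.0), (-98.0, 37.0), (-100.0, 38.0), (-99.0, 39.0), (-101.0, 35.0)]
timings = time_lookups(fake_lookup, points)
for point, value, seconds in timings:
    print(point, value, f"{seconds:.6f} sec")
```

 Swapping fake_lookup for a real query function would give per-location
 timings without relying on a lap timer.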

Wow, I agree that's utterly slow! When you mentioned slow I thought it was
more on the order of 0.1 second! We can already exclude the parsing time of
the VRT, since all five lookups run in the same gdallocationinfo session and
the VRT is parsed only once.
And I can't believe that the intersection test for 55,000 rectangles takes
~20 seconds, unless you have an old i386 at 5 MHz ;-)
My usual way of profiling something that takes more than about one
second is to run it under gdb, break with Ctrl+C, display the stack trace,
continue the run, break again, display the stack trace, and so on. If you
keep breaking in the same function, then you've found the bottleneck.

I see now that GDAL_DISABLE_READDIR_ON_OPEN = TRUE was suggested in that
thread and seems to improve things significantly. Perhaps we should try to
cache the result of the initial readdir so that later attempts can benefit
from it, but I haven't checked how easily that could be implemented. Or perhaps
we should just change the default value of GDAL_DISABLE_READDIR_ON_OPEN, since
it causes problems from time to time.
But generally filesystems don't behave very well when there are a lot of files
in the same directory. You'd be better off organizing your tiles in
subdirectories. Still, 1 to 3 seconds sounds a bit slow to me. It would be
great if you could try the above suggestion to identify where the time is spent.
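
One way to spread the tiles over subdirectories is to bucket them by the
1-degree cell that contains each tile's south-west corner. The sketch below
is an assumption for illustration only: the function name and the "n36/w096"
directory layout are hypothetical, not a GDAL convention. With 7.5-minute
tiles, each such directory would hold at most 8 x 8 = 64 tiles.

```python
import math
from pathlib import PurePosixPath

def subdir_for_point(lon, lat):
    """Return a hypothetical 'n36/w096' style directory for the 1-degree cell."""
    lat_floor = math.floor(lat)
    lon_floor = math.floor(lon)
    ns = "n" if lat_floor >= 0 else "s"
    ew = "e" if lon_floor >= 0 else "w"
    return PurePosixPath(f"{ns}{abs(lat_floor):02d}") / f"{ew}{abs(lon_floor):03d}"

# South-west corners of two example tiles.
print(subdir_for_point(-96.0, 36.0))
print(subdir_for_point(-100.875, 38.5))
```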

Even


 David

 -Original Message-
 From: gdal-dev-boun...@lists.osgeo.org
 [mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Even Rouault Sent:
 Saturday, February 01, 2014 1:28 AM
 To: Brian Case
 Cc: gdal-dev@lists.osgeo.org
 Subject: Re: [gdal-dev] Fast Pixel Access

 On Saturday 01 February 2014 00:23:13, Brian Case wrote:
  evenr
 
 
  what about the use of a tileindex?

 Do you really mean a tileindex as produced by gdaltindex? Well, that's not
 exactly the same beast as a VRT, but yes, if it were recognized as a GDAL
 dataset then you could potentially save the cost of XML parsing. One could
 imagine that the VRT driver would accept a tileindex as an alternate
 connection string.

 Anyway, it would be interesting to first profile where the time is spent in
 David's use case. If it's in the XML parsing, then I can't see what could be
 easily improved in that area. If it's the intersection, then there's
 potential for improvement.
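
 To get a feel for what an unindexed intersection test costs, here is a rough
 sketch that scans 55,501 synthetic rectangles for the one containing a point.
 The grid layout (250 columns starting at 125W, 24N) is made up for the
 example, and this is plain Python, not the VRT driver's code.

```python
import time

TILE = 0.125  # 7.5 arc-minutes in degrees

def make_tiles(n, cols=250, lon0=-125.0, lat0=24.0):
    """Build n synthetic (xmin, ymin, xmax, ymax) rectangles on a regular grid."""
    tiles = []
    for i in range(n):
        x = lon0 + (i % cols) * TILE
        y = lat0 + (i // cols) * TILE
        tiles.append((x, y, x + TILE, y + TILE))
    return tiles

def find_tiles(tiles, lon, lat):
    """Linear scan: indices of all rectangles containing the point."""
    return [i for i, (x0, y0, x1, y1) in enumerate(tiles)
            if x0 <= lon < x1 and y0 <= lat < y1]

tiles = make_tiles(55501)
start = time.perf_counter()
hits = find_tiles(tiles, -96.05, 36.05)
elapsed = time.perf_counter() - start
print(hits, f"{elapsed * 1000:.1f} ms")
```

 Timing this on the machine in question gives a baseline to compare against
 the multi-second lookups reported in the thread.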

  seems an intersection with a set of
  polys first would be quick
 
 
 
  brian
 
  On Fri, 2014-01-31 at 19:30 +0100, Even Rouault wrote:
   On Friday 31 January 2014 17:15:53, David Baker (Geoscience) wrote:
Dev's,
   
I have a set of 55,501 bil files in a single directory.  They are
DEM data that cover the US in 7.5-minute tiles.  I would like to
randomly access elevations at given lat/lon locations from the whole
dataset.  I created a vrt file from the directory of bil files, and
have been able to access the elevation at a given lat/lon using
gdallocationinfo, but because of the size of the dataset, this
operation is somewhat slow. Can the vrt be indexed?
  
   No, it isn't currently, although I think it could be improved to have an
   in-memory index with moderate effort.
  
   But are you sure the slowness is due to the lack of an index? 55,000 is a
   big number, but not that big. Maybe the slowness just comes from the
   opening time (XML parsing) of such a big VRT. That would need to be
   profiled to be sure where the bottleneck is.
  
Or, is there a faster, better way to access the pixels?  I would
first like to do this with the utilities before diving into code
(C

Re: [gdal-dev] Fast Pixel Access

2014-02-10 Thread Even Rouault
Quoting David Baker (Geoscience) david.m.ba...@chk.com:

 Even,

 No, not an i386...  A Dell Precision T3500 w/Intel W3680 @ 3.33 GHz, 6x2 cores,
 with 12.0 GB.  Though the data is on the network, not local, with 1 Gbps
 access.  Setting GDAL_DISABLE_READDIR_ON_OPEN = TRUE did significantly increase
 the speed.  Does the BIL driver read the whole file into memory first?  Might
 a direct read be faster?

No, the BIL driver will just read the line where the pixel is.


 And Even, please excuse my ignorance, but what is gdb?  I really would like
 to do the profiling.

gdb is the GNU debugger ( https://www.gnu.org/software/gdb/ ). I'm assuming you
use Linux (it is likely available on Mac OS X too). You should be able to
install it with the usual package management system of your distribution:
apt-get install gdb, yum install gdb, etc. On Windows, I'm far less familiar
with the debugging tools.

gdb --args gdallocationinfo -valonly -geoloc intermap.vrt

Then type run
Type a coordinate -96.0 36.0
ctrl+c to suspend execution
bt to display the stack trace
c to resume execution
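
After repeating the break / bt / c cycle a few times, the backtraces can be
tallied to see which function keeps showing up on top. The sketch below shows
the idea only; the frame names are hypothetical placeholders, not real GDAL
symbols, and each backtrace is listed innermost frame first.

```python
from collections import Counter

def hottest_frames(samples):
    """Count how often each innermost frame appears across stack samples."""
    counts = Counter(stack[0] for stack in samples if stack)
    return counts.most_common()

# Hypothetical backtraces collected by hand, innermost frame first.
samples = [
    ["read_directory", "open_dataset", "main"],
    ["read_directory", "open_dataset", "main"],
    ["read_pixel", "main"],
    ["read_directory", "open_dataset", "main"],
]
print(hottest_frames(samples))
```

The frame that dominates the tally is the likely bottleneck.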



Re: [gdal-dev] Fast Pixel Access

2014-02-10 Thread David Baker (Geoscience)
Even, I am on Windows and only have the binaries installed.

David


Re: [gdal-dev] Fast Pixel Access

2014-02-10 Thread Even Rouault
On Tuesday 11 February 2014 00:10:20, David Baker (Geoscience) wrote:
 Even, I am on Windows and only have the binaries installed.

Well, I'll let the Windows developers lurking here answer if they have some good 
advice. I imagine you would need binaries with debugging symbols to be able to 
get a useful stack trace.

 

Re: [gdal-dev] Fast Pixel Access

2014-02-05 Thread David Baker (Geoscience)
Luke,

Thank you for this suggestion.  This took the access times from 15-20 seconds 
down to 1 to 3 seconds.  The majority of the time seems to be spent on the 
initial read of the vrt, as subsequent piped locations after the first are 
returned in under a second.  For my current application, this should be okay.

David



Re: [gdal-dev] Fast Pixel Access

2014-02-04 Thread Jukka Rahkonen
David Baker (Geoscience) david.m.baker at chk.com writes:

 
 Jukka,
 
 No matter the endpoint the user uses to access the data, behind the
 scenes, there must be fast pixel access, correct?  Or are you saying
 that a WFS would do it quickly out of the box?

Hi,

Fast access to data, yes, but using WFS is a rather different scenario. WFS is
made for vectors, and before using it the raster DEM would have to be converted
into small polygons. For the whole country that would mean a whole lot of small
polygons. Perhaps that is not a realistic approach. But now I think I must
shut my mouth, because I have never done anything DEM-related in my job
myself. Let's hope some specialist appears and gives you a good solution.
Oh well, Klokan appeared already.

-Jukka-

___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev


Re: [gdal-dev] Fast Pixel Access

2014-02-03 Thread Luke Roth
Another thing that might speed up access is setting the config option
GDAL_DISABLE_READDIR_ON_OPEN = TRUE, either as an environment variable or on
the command line.  That should help with GDAL reading the directory each time
it opens a dataset.  I have an application which reads one value from each of a
large number of datasets, and setting this option made it run about 3 times
faster.
Luke
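
For anyone scripting this, here is a small sketch of both routes: an
environment dictionary with the option set, and a command line using GDAL's
general --config KEY VALUE switch. The helper names are hypothetical, the
dataset name is the intermap.vrt from earlier in the thread, and nothing is
executed unless the last (commented) line is enabled.

```python
import os

def readdir_disabled_env(base_env=None):
    """Copy an environment mapping and set GDAL_DISABLE_READDIR_ON_OPEN=TRUE."""
    env = dict(os.environ if base_env is None else base_env)
    env["GDAL_DISABLE_READDIR_ON_OPEN"] = "TRUE"
    return env

def locationinfo_command(dataset, lon, lat):
    """Build a gdallocationinfo call that passes the option on the command line."""
    return [
        "gdallocationinfo",
        "--config", "GDAL_DISABLE_READDIR_ON_OPEN", "TRUE",
        "-valonly", "-geoloc", dataset, str(lon), str(lat),
    ]

cmd = locationinfo_command("intermap.vrt", -96.0, 36.0)
print(" ".join(cmd))
# To actually run it (requires GDAL on PATH):
# import subprocess; subprocess.run(cmd, env=readdir_disabled_env(), check=True)
```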

On Sun, Feb 2, 2014 at 2:12 PM, Jukka Rahkonen jukka.rahko...@mmmtike.fi wrote:

 Hi,

 I made a few tests and here are my conclusions. The hypothesis is that someone
 wants to build a DEM query service which uses gdallocationinfo for
 queries, with the DEM data accessed as files from a standard web site. I
 compared three alternatives:
 1) There are thousands of DEM files on the server and they are combined
 together with a VRT file.
 2) There is only one DEM file, stored as a BigTIFF.
 3) The DEM is split into tiles in an x/y/z tile directory structure, as in
 Google Maps or OpenStreetMap tiles.

 My test data covers Finland with 10 m grid size and as deflate compressed
 tiffs they make about 10 GB together.

 Before going on, keep in mind that speed needs indexes. The better the
 index, the less unnecessary data there is to read. In case 1) the first-level
 index is the VRT file. The second-level index, if it exists, is in the headers
 of the real DEM files. It may be possible to jump to the correct offset from
 the beginning of the DEM data and read only a part of the file.  In case 2) the
 index is in the internal TIFF directory. If the BigTIFF is tiled, access
 to tiles should be rather effective. And finally, in case 3) the index is
 built into the directory structure and tiling schema that is used for saving
 the tiles. The schema is so well known that tile map service clients can
 directly ask for a certain file name if they know the coordinates and
 scale.
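
 As an illustration of case 3), this is the widely used Web Mercator tile
 formula that OpenStreetMap-style services follow: longitude, latitude and
 zoom level map straight to a tile name. It is a sketch, not taken from the
 tests above; the function names and the .png extension are assumptions, and
 OSM orders the path as zoom/x/y.

```python
import math

def tile_for_point(lon_deg, lat_deg, zoom):
    """Return the (x, y) tile indices containing a lon/lat at a zoom level."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

def tile_path(lon_deg, lat_deg, zoom, ext="png"):
    """Build a zoom/x/y relative path; the extension is an assumption."""
    x, y = tile_for_point(lon_deg, lat_deg, zoom)
    return f"{zoom}/{x}/{y}.{ext}"

print(tile_path(0.0, 0.0, 1))
```

 No directory listing or index file is needed: the client computes the name
 and requests exactly one file.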

 Conclusions:

 1)
 - The whole VRT file must be read. Caching the VRT file would make subsequent
 requests faster.
 - For some reason gdallocationinfo wants to get the directory listing of the
 directory where the VRT file is. This is slow and generates lots of traffic
 if the thousands of DEM files are in the same directory. It would probably
 be faster to have them in another directory.

 2)
 - The BigTIFF route is more straightforward, but gdallocationinfo still needs
 to do many big range reads.
 - Also in this case gdallocationinfo reads the target file's directory. It
 would be good to keep this directory small. Don't do as I did and keep in the
 same directory not only the BigTIFF DEM file, which was the only file needed,
 but also the VRT and the thousands of original DEMs from the previous test. At
 least this is a known issue now and I know how to avoid it. In my case
 reading the directory made 2.2 MB of web traffic, all or most of it in vain.

 3)
 - I used the OpenStreetMap tile service as the test data for the third test.
 In this case gdallocationinfo knows exactly which tile to request and makes
 only one request. It also seems to cache some tiles on the client
 side, which means that queries for nearby locations may hit the cached tile
 and be very fast.

 Summary statistics:

 1) gdallocationinfo makes 6 requests and reads 6.4 MB of data
 2) gdallocationinfo makes 8 requests and reads 4.3 MB of data
 3) gdallocationinfo makes 1 request and reads 10 kB of data
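
 To put those numbers side by side, this short calculation expresses each
 alternative's traffic as a multiple of the tiled case. It assumes decimal
 units (1 MB = 1000 kB); the labels are just descriptive strings.

```python
# (requests, kilobytes read) per alternative, from the summary statistics.
requests = {
    "1) VRT over many files": (6, 6400),
    "2) single BigTIFF": (8, 4300),
    "3) x/y/z tiles": (1, 10),
}

baseline_kb = requests["3) x/y/z tiles"][1]
ratios = {name: kb / baseline_kb for name, (_, kb) in requests.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f}x the data of the tiled case")
```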

 Requests I used are these:

 1)
 gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc 389559 6677412
 2)
 gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc 389559 6677412
 3)
 gdallocationinfo frmt_wms_openstreetmap_tms.xml -geoloc 389559 6677412

 I know that the queried place in 3) is not the same, because the SRIDs of the
 data differ, and OSM does not return 16-bit DEM heights but 3-band RGB values
 instead, but that does not matter here; the idea is what is important.

 My conclusion is that you should cut your DEM into tiles with, for example,
 gdal2tiles or MapTiler, and the result could actually be quite speedy;
 perhaps using 126x126 tiles could make it a bit faster still. I hope that
 they can create tiles as 16-bit TIFFs.

  I am sure that these results are not scientifically sound but I am also
 sure that the difference between 6.4 MB/4.3 MB/10 kB is something to think
 about especially if you dream about a mobile service.

 I placed the requests which gdallocationinfo made during these tests into
 http://latuviitta.org/documents/gdallocationinfo_requests.txt

 -Jukka Rahkonen-



Re: [gdal-dev] Fast Pixel Access

2014-02-03 Thread David Baker (Geoscience)
Dan,

I had not heard of the KEA format and it looks promising, except for the need to 
compile.  I am hoping to do this with out-of-the-box GDAL.  I also did not see a 
license statement.  GDAL does support HDF5 (another format I am not familiar 
with), but it looks like the limit is 2 GB for the built-in driver.  My dataset 
also covers the US, with 55,501 7.5-minute tiles at 0.15012 arc-second 
resolution (~5 m), a total size of 2,289,001 x 756,001 pixels, 1.8 TB.  Creating a 
single dataset from the tiles can be done, but in our environment it is not cheap.
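
As a rough cross-check of those sizes, the per-file figure from the timing
message elsewhere in this thread (35,180 KB per bil file) multiplied by the
tile count lands close to the 1.8 T quoted above. This is a sketch that
assumes 1 KB = 1024 bytes and that every tile is the same size.

```python
tile_kb = 35_180        # size of one bil file, in KB
tile_count = 55_501     # number of 7.5-minute tiles

total_kb = tile_kb * tile_count
total_tib = total_kb / 1024 ** 3   # KB -> MB -> GB -> TB in binary units

width_px, height_px = 2_289_001, 756_001
total_pixels = width_px * height_px

print(f"{total_kb:,} KB in total, about {total_tib:.2f} TiB")
print(f"{total_pixels:,} pixels in the full mosaic")
```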

David

-Original Message-
From: gdal-dev-boun...@lists.osgeo.org 
[mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Daniel Clewley
Sent: Saturday, February 01, 2014 1:45 PM
To: gdal-dev@lists.osgeo.org
Subject: Re: [gdal-dev] Fast Pixel Access

Hi David,

Following on from the VRT / BigTIFF comparison Jukka posted, have you 
considered storing the data as a single KEA format file, which is based on HDF5?

I have the National Elevation Dataset for the US, which comprises 3,605 1 x 1 
degree tiles at 1 arc sec resolution. I first created a VRT then used 
gdal_translate to convert to a KEA file. Total size is 421,212 x 252,012 
pixels, 77 GB. I also built overviews for a fast display (this took a long time 
and I don't think is needed for your case).

I've just tried using gdallocationinfo on the file to get pixel information and 
it takes < 0.5s to get the pixel value back.

The KEA library and GDAL driver source are available from:

https://bitbucket.org/chchrsc/kealib/

and the format is described in:

Peter Bunting, Sam Gillingham, The KEA image file format, Computers & 
Geosciences, Volume 57, August 2013, Pages 54-58, ISSN 0098-3004, 
http://dx.doi.org/10.1016/j.cageo.2013.03.025.

If you don't mind having one massive file (in addition to the individual tiles, 
which could be archived), this might work for your use case.

Thanks,

Dan


 Message: 10
 Date: Fri, 31 Jan 2014 16:15:53 +
 From: David Baker (Geoscience) david.m.ba...@chk.com
 To: 'gdal-dev@lists.osgeo.org' gdal-dev@lists.osgeo.org
 Subject: [gdal-dev] Fast Pixel Access
 Message-ID:
   2a18a4344312134b937df938d992264a0508f...@okcexhprd122.chkenergy.net
 Content-Type: text/plain; charset=us-ascii

 Dev's,

 I have a set of 55,501 bil files in a single directory.  They are DEMS data 
 that cover the US in 7.5 minute tiles.  I would like to randomly access 
 elevations at a given lat/lon's from the whole dataset.  I created a vrt file 
 from the directory of bil files, and have been able to access the elevation 
 at a given lat/lon using gdallocationinfo, but because of the size of the 
 dataset, this operation is somewhat slow.  Can the vrt be indexed? Or, is 
 there a faster, better way to access the pixels?  I would first like to do 
 this with the utilities before diving into code (C#).  The files are 
 regularly named base on their location within a 1 arc-second grid.

 Thanks,
 David

 David M. Baker
 Senior Advisor - Geoscience Technology
 Chesapeake Energy Corporation
 david.m.ba...@chk.com







This email (and attachments if any) is intended only for the use of the 
individual or entity to which it is addressed, and may contain information that 
is confidential or privileged and exempt from disclosure under applicable law. 
If the reader of this email is not the intended recipient, or the employee or 
agent responsible for delivering this message to the intended recipient, you 
are hereby notified that any dissemination, distribution or copying of this 
communication is strictly prohibited. If you have received this communication 
in error, please notify the sender immediately by return email and destroy all 
copies of the email (and attachments if any).


Re: [gdal-dev] Fast Pixel Access

2014-02-03 Thread Jukka Rahkonen
Luke Roth roth.luke at gmail.com writes:

 
 Another thing that might speed up access is setting the config
option GDAL_DISABLE_READDIR_ON_OPEN = TRUE, either as an environment
variable or on the command line.  That should help with GDAL reading the
directory each time it opens a dataset.  I have an application which reads
one value from each of a large number of datasets and setting this option
made it run about 3 times faster.


Hi,

You are right. This config option makes GDAL skip reading the
remote directory and saves a lot of bandwidth:

VRT case: 
Bytes Received:  4 244 509 (of which the vrt file: 4 192 577)
Sequence (clock) duration:  00:00:09.9996000
Was:
Bytes Received:  6 459 443
Sequence (clock) duration:  00:00:37.813

BigTIFF case: 
Bytes Received:  2 158 917
Sequence (clock) duration:  00:00:04.4368000
Was:
Bytes Received:  4 374 137
Sequence (clock) duration:  00:00:30.9192000
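
For scripted use, the option can be supplied in either of the two ways Luke mentions. The following is a minimal sketch that only builds the command line and environment for one gdallocationinfo query; the helper name build_query is hypothetical (not part of GDAL), and the dataset name is whatever you pass in.

```python
import os

OPTION = "GDAL_DISABLE_READDIR_ON_OPEN"


def build_query(dataset, x, y, use_env=False):
    """Return (argv, env) for one gdallocationinfo point query.

    build_query is an illustrative helper, not a GDAL function.
    """
    argv = ["gdallocationinfo"]
    env = dict(os.environ)
    if use_env:
        # Variant 1: set the option as an environment variable.
        env[OPTION] = "TRUE"
    else:
        # Variant 2: pass it on the command line with the generic --config switch.
        argv += ["--config", OPTION, "TRUE"]
    argv += ["-valonly", "-geoloc", dataset, str(x), str(y)]
    return argv, env
```

The result can be handed to subprocess.run(argv, env=env, capture_output=True, text=True) on a machine where the GDAL utilities are installed.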


Conclusion:
Both options are amusing to play with but unsuitable for serious use.
Reading the BigTIFF tile offset index (or whatever it is) seems to mean
about 2 MB of compulsory payload traffic. Reading the VRT file means in this
example 4 MB of payload. If this sort of network access to a large directory of
raster files is important to someone, there should be a way to find
the right raster file and the right data range in that file with a minimum
number of bytes. Perhaps some kind of R-tree indexed VRT file? First aid might
be to keep the VRT file on the client side.

-Jukka Rahkonen- 



Re: [gdal-dev] Fast Pixel Access

2014-02-03 Thread Brian Case
-Jukka

tileindex, mapserver, and the gdal wms driver





Re: [gdal-dev] Fast Pixel Access

2014-02-03 Thread Rahkonen Jukka (Tike)
Hi,

Perhaps, but in this game the rule was not to have any GIS servers. Personally, I 
would rather consider WFS. It could return heights for single points, but also a 
profile along a line or all values within a polygon.

-Jukka-

Brian Case [r...@winkey.org] wrote:

 -Jukka

 tileindex, mapserver, and the gdal wms driver





Re: [gdal-dev] Fast Pixel Access

2014-02-03 Thread David Baker (Geoscience)
Jukka,

No matter which endpoint the user uses to access the data, behind the scenes 
there must be fast pixel access, correct?  Or are you saying that a WFS would 
do it quickly out of the box?

David



Re: [gdal-dev] Fast Pixel Access

2014-02-03 Thread Klokan Petr Přidal
Hi Jukka, David and others,

In the past we rendered elevation data into Mercator tiles with
http://www.maptiler.com/, the successor of my GDAL2Tiles.py script, for
extremely fast pixel access to the elevation values at a given geographic
location, without the need for server software to host such data. We
made the tiles for the CGIAR 90m DEM for the whole world.

In fact the raw elevation was mapped into RGB space for this purpose.
Decoding on the client side is then very easy, and you have precise
elevation values for the whole nearby area preloaded.
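
To illustrate the idea of mapping elevation into RGB space, here is a minimal sketch of one possible packing. The offset and step constants are assumptions chosen for illustration; this is not necessarily the encoding used by the tiles or the gdaldem_web utility mentioned in this message.

```python
OFFSET_M = 10000.0  # assumed offset so that negative elevations map to positive codes
STEP_M = 0.1        # assumed vertical resolution of one code step, in metres


def encode_elevation(elev_m):
    """Pack an elevation in metres into an (R, G, B) triple of 8-bit values."""
    code = int(round((elev_m + OFFSET_M) / STEP_M))
    if not 0 <= code < (1 << 24):
        raise ValueError("elevation outside the encodable range")
    return (code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF


def decode_elevation(r, g, b):
    """Recover the elevation in metres from an (R, G, B) triple."""
    code = (r << 16) | (g << 8) | b
    return code * STEP_M - OFFSET_M
```

A scheme like this only works if the tiles are stored losslessly (for example as PNG), because lossy compression such as JPEG would alter the packed values.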

You may need to do something similar if you want to implement very fast
client-side hill-shading in a web browser canvas (similar to
http://dev.klokantech.com/klokan/hillshading/), or a dynamic elevation
profile drawn while moving the mouse over a map, or many other tasks with
data loaded directly in the client.

The visualisation of the tiles is possible also in 3D, see:
http://vimeo.com/29605292
For WebGL there are now more efficient ways (direct binary data structures
instead of images).

See:
http://dev.klokantech.com/srtm/srtm_decode.html
http://www.klokantech.com/labs/dem-color-encoding/
http://dev.klokantech.com/srtm/googlemaps.html

The source code of the GDAL utility for encoding elevation data into
RGB is available online:
https://github.com/webglearth/gdaldem_web
but you may not need it if direct reading via JavaScript in the web
browser is not required for your application.

Regards,

Klokan Petr Pridal



Re: [gdal-dev] Fast Pixel Access

2014-02-02 Thread Jukka Rahkonen
Hi,

I made a few tests and here are my conclusions. The hypothesis is that someone
wants to make a DEM query service which uses gdallocationinfo for
queries, and the DEM data is to be accessed as files from a standard web site. I
compared three alternatives:
1) There are thousands of DEM files on the server and they are combined
together with a VRT file.
2) There is only one DEM file as BigTIFF.
3) DEM is split into tiles into x/y/z tile directory structure like in
Google maps or OpenStreetMap tiles.

My test data covers Finland with 10 m grid size and as deflate compressed
tiffs they make about 10 GB together.

Before going on, keep in mind that speed needs indexes. The better the
index, the less unnecessary data there is to read. In case 1) the first-level
index is the VRT file. The second-level index, if it exists, is in the headers of
the real DEM files. It may be possible to jump to the correct offset from the
beginning of the DEM data and read only a part of the file.  In case 2) the
index is in the internal TIFF directory. If the BigTIFF is tiled, access
to the tiles should be rather effective. And finally in case 3) the index is
built into the directory structure and tiling schema that is used for saving the
tiles. The schema is so well known that tile map service clients can
directly ask for a certain file name if they know the coordinates and scale.
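
As an illustration of how the tiling schema itself acts as the index in case 3), the sketch below computes the tile that contains a given longitude/latitude using the widely used Web Mercator "slippy map" convention. The z/x/y.png path layout is the OpenStreetMap convention and is an assumption here; a DEM tile service could use a different layout or extension.

```python
import math


def tile_for(lon, lat, zoom):
    """Return (x, y) of the Web Mercator tile containing lon/lat at this zoom."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the east or south edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_path(lon, lat, zoom):
    """Build a z/x/y.png style relative path (assumed OSM-like layout)."""
    x, y = tile_for(lon, lat, zoom)
    return f"{zoom}/{x}/{y}.png"
```

No lookup table or directory listing is involved: the client derives the file name from the coordinates and zoom level alone, which is why a single small request is enough.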

Conclusions:

1)
- The whole VRT file must be read. Caching the VRT file would make subsequent
requests faster.
- For some reason gdallocationinfo wants to get the directory listing of the
directory where the VRT file is. This is slow and generates lots of traffic
if the thousands of DEM files are in the same directory. It would probably
be faster to have them in another directory.

2) 
- The BigTIFF route is more straightforward, but gdallocationinfo still needs to
do many big range reads. 
- Also in this case gdallocationinfo reads the target file's directory. It
would be good to keep this directory small. Don't do as I did and keep in the
directory not only the BigTIFF DEM file, which was the only file needed, but
also the VRT and thousands of original DEMs from the previous test. At
least this is a known issue now and I know how to avoid it. In my case
reading the directory made 2.2 MB of web traffic, all or most of it in vain.

3)
- I used OpenStreetMap tile service as the test data for the third test. In
this case gdallocationinfo knows exactly which tile to request and it is
making only one request. It also seems to cache some tiles on the client
side which means that queries for close locations may hit the cached tile
and be very fast.

Summary statistics:

1) Gdallocationinfo makes 6 requests and reads 6.4 MB of data
2) Gdallocationinfo makes 8 requests and reads 4.3 MB of data
3) Gdallocationinfo makes 1 request and reads 10 kB of data

Requests I used are these:

1)
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc  389559 6677412
2)
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc  389559 6677412
3)
gdallocationinfo  frmt_wms_openstreetmap_tms.xml -geoloc  389559 6677412

I know that the queried place in 3) is not the same, because the SRIDs of the
data differ, and that OSM does not return 16-bit DEM heights but 3-band RGB
values instead, but it does not matter here; the idea is what is important.

My conclusion is that you should cut your DEM into tiles with, for example,
gdal2tiles or MapTiler, and the result could actually be quite speedy, and
perhaps using 126x126 tiles could make it still a bit faster. I hope that they
can create tiles as 16-bit tiffs.

 I am sure that these results are not scientifically sound but I am also
sure that the difference between 6.4 MB/4.3 MB/10 kB is something to think
about especially if you dream about a mobile service. 

I placed the requests which gdallocationinfo made during these tests into
http://latuviitta.org/documents/gdallocationinfo_requests.txt

-Jukka Rahkonen-




Re: [gdal-dev] Fast Pixel Access

2014-02-01 Thread Jukka Rahkonen
I was experimenting with something like a GIS service without a GIS server,
and I have some examples online, but because of the HTTP connection a speed
comparison does not make sense.

Vrt combining biomass data from 13 single band tif files
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/metla/puuston_tilavuus.vrt -geoloc  389559 6677412

DEM of Finland with 10x10 m grid through vrt
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc  389559 6677412

The same from a single BigTIFF
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc  389559 6677412

Feel free to download the originals if you want; they are all made from open
data. Just mention the National Land Survey of Finland, 2013 for the DEM and
the Finnish Forest Research Institute, 2013 for the biomass data if you publish
the data somewhere. The DEM datasets are about 10 GB each (BigTIFF + the
original small ones).

My tiffs have tiles, but for this usage, where only the value of a single
pixel is of interest, striped tiffs could be as fast to read as tiled
tiffs. A trial would tell everything.

-Jukka Rahkonen-



Re: [gdal-dev] Fast Pixel Access

2014-02-01 Thread David Baker (Geoscience)
Even,

I am not sure how to profile as I do not have access to the code to profile.  I 
did do a timing test...

vrt file = 22,970 KB
bil file = 35,180 KB * 55,501

I piped five locations from the loc.txt file:
-96.0 36.0
-98.0 37.0
-100.0 38.0
-99.0 39.0
-101.0 35.0

gdallocationinfo -valonly -geoloc intermap.vrt < loc.txt
189.841857910156    25.5 sec
384.857452392578    22.6 sec
762.015930175781    22.9 sec
550.719116210938    23.6 sec
883.637023925781    22.9 sec

Note: I used a lap timer on my iPhone to capture the split times as the results 
appeared in the console window.  Does this give any insight?
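
As an alternative to a hand-held lap timer, per-query times could be captured in a script. This is a minimal sketch: time_calls is a generic timing helper, and query_elevation is a hypothetical wrapper that assumes gdallocationinfo is on the PATH and that the dataset is named intermap.vrt as above. Note that it starts one process per point, so each measurement includes the cost of opening the VRT, unlike the single piped session timed above.

```python
import subprocess
import time


def time_calls(func, inputs):
    """Call func once per input and return a list of (input, result, seconds)."""
    rows = []
    for item in inputs:
        start = time.perf_counter()
        result = func(item)
        rows.append((item, result, time.perf_counter() - start))
    return rows


def query_elevation(point, dataset="intermap.vrt"):
    """Hypothetical wrapper: run gdallocationinfo for one (lon, lat) pair."""
    lon, lat = point
    done = subprocess.run(
        ["gdallocationinfo", "-valonly", "-geoloc", dataset, str(lon), str(lat)],
        capture_output=True, text=True, check=True,
    )
    return done.stdout.strip()
```

For example, time_calls(query_elevation, [(-96.0, 36.0), (-98.0, 37.0)]) would return one row per location with the elapsed wall-clock seconds.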

David

-Original Message-
From: gdal-dev-boun...@lists.osgeo.org 
[mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Even Rouault
Sent: Saturday, February 01, 2014 1:28 AM
To: Brian Case
Cc: gdal-dev@lists.osgeo.org
Subject: Re: [gdal-dev] Fast Pixel Access

Le samedi 01 février 2014 00:23:13, Brian Case a écrit :
 evenr


 what about the use of a tileindex?

You really mean a tileindex as produced by gdaltindex ? Well, that's not
exactly the same beast as a VRT, but yes, if it was recognized as a GDAL
dataset then you could potentially save the cost of XML parsing. One could
imagine that the VRT driver would accept a tileindex as an alternate connection
string.

Anyway it would be interesting to first profile where the time is spent in
David's use case. If it's in the XML parsing, then I can't see what could be
easily improved in that area. If it's the intersection, then there's potential
for improvement.
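
To get a feeling for the cost of the intersection step in isolation, one can time a plain linear scan over a comparable number of rectangles. The sketch below uses a synthetic grid of 7.5-minute (0.125 degree) extents; the grid origin, dimensions and file names are assumptions for illustration and do not reflect David's actual tile layout or how the VRT driver performs the test internally.

```python
import time

TILE_DEG = 0.125  # 7.5 arc minutes expressed in degrees


def make_grid(lon0, lat0, ncols, nrows):
    """Build synthetic tile extents as (xmin, ymin, xmax, ymax, name) tuples."""
    tiles = []
    for row in range(nrows):
        for col in range(ncols):
            xmin = lon0 + col * TILE_DEG
            ymin = lat0 + row * TILE_DEG
            name = f"tile_{row}_{col}.bil"  # hypothetical file name
            tiles.append((xmin, ymin, xmin + TILE_DEG, ymin + TILE_DEG, name))
    return tiles


def find_tiles(tiles, lon, lat):
    """Linear scan: names of all tiles whose half-open extent contains the point."""
    return [name for xmin, ymin, xmax, ymax, name in tiles
            if xmin <= lon < xmax and ymin <= lat < ymax]


if __name__ == "__main__":
    grid = make_grid(-125.0, 24.0, 464, 120)  # 55,680 synthetic tiles
    start = time.perf_counter()
    hits = find_tiles(grid, -96.03, 36.03)
    print(hits, f"{time.perf_counter() - start:.4f} s")
```

Running it on your own machine gives a rough baseline for an unindexed rectangle test, which can then be compared with the observed query times.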

 seems an intersection with a set of
 polys first would be quick



 brian

 On Fri, 2014-01-31 at 19:30 +0100, Even Rouault wrote:
  Le vendredi 31 janvier 2014 17:15:53, David Baker (Geoscience) a écrit :
   Dev's,
  
   I have a set of 55,501 bil files in a single directory.  They are DEMS
   data that cover the US in 7.5 minute tiles.  I would like to randomly
   access elevations at a given lat/lon's from the whole dataset.  I
   created a vrt file from the directory of bil files, and have been able
   to access the elevation at a given lat/lon using gdallocationinfo, but
   because of the size of the dataset, this operation is somewhat slow.
   Can the vrt be indexed?
 
  No, it isn't currently, although I think it could be improved to have a
  in- memory index with moderate effort.
 
  But are you sure the slowness is due to the lack of index ? 55,000 is a
  big number, but not that big. Maybe the slowness just comes from the
  opening time (XML parsing) of such a big VRT. That would need to be
  profiled to be sure where the bottleneck is.
 
   Or, is there a faster, better way to access the pixels?  I would
   first like to do this with the utilities before diving into code (C#).
   The files are regularly named base on their location within a 1
   arc-second grid.
  
   Thanks,
   David
  
   David M. Baker
   Senior Advisor - Geoscience Technology
   Chesapeake Energy Corporation
   david.m.ba...@chk.com
  
  
   
  

--
Geospatial professional services
http://even.rouault.free.fr/services.html


Re: [gdal-dev] Fast Pixel Access

2014-02-01 Thread David Baker (Geoscience)
Jukka,

Jukka wrote:
I was experimenting with something like a GIS service without a GIS server,
and I have some examples online but...

I am looking to do as you have: a RESTful service to query the elevation at a 
given location.  This will be used in a DQM process as well as in a geologic 
application that needs the elevation of a proposed wellsite for data mining.  
In both cases, thousands if not tens of thousands of calls will be made, so 
performance is an issue.

David



Re: [gdal-dev] Fast Pixel Access

2014-02-01 Thread Norman Vine
This is an application that is just screaming for a spatial index.

For starters you could build a spatialite db of the individual file extents
that returns the filename to pass to gdallocationinfo.
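
A rough sketch of that idea follows. To stay self-contained it uses plain SQLite from the Python standard library with an ordinary table of bounding boxes rather than SpatiaLite, so it is a simplification of the suggestion, and the table layout and file names are assumptions.

```python
import sqlite3


def build_index(extents):
    """Create an in-memory table from (name, xmin, ymin, xmax, ymax) rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tiles (name TEXT, xmin REAL, ymin REAL, xmax REAL, ymax REAL)"
    )
    con.executemany("INSERT INTO tiles VALUES (?, ?, ?, ?, ?)", extents)
    # An ordinary B-tree index on the lower-left corner; a true spatial index
    # (SpatiaLite or SQLite's R*Tree module) would replace this.
    con.execute("CREATE INDEX tiles_bbox ON tiles (xmin, ymin)")
    con.commit()
    return con


def lookup(con, lon, lat):
    """Return the names of the files whose half-open extent contains the point."""
    rows = con.execute(
        "SELECT name FROM tiles "
        "WHERE xmin <= ? AND xmax > ? AND ymin <= ? AND ymax > ?",
        (lon, lon, lat, lat),
    ).fetchall()
    return [row[0] for row in rows]
```

The name returned by lookup would then be passed to gdallocationinfo in place of the large VRT, so that only one small file is opened per query.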


Re: [gdal-dev] Fast Pixel Access

2014-02-01 Thread David Baker (Geoscience)
Norman,

Yes it does...  At first I am looking to see if I can do this off the shelf 
with just the tools in the GDAL/OGR toolset.  Are you thinking of using OGR to 
do the spatial query with the SpatiaLite db?  Would a tile index shapefile 
with a .qix spatial index work?

David


From: Norman Vine [mailto:n...@cape.com]
Sent: Saturday, February 01, 2014 8:23 AM
To: David Baker (Geoscience)
Cc: gdal-dev
Subject: Re: [gdal-dev] Fast Pixel Access

This is an application that is just screaming for a spatial index

For starters you could build a spatialite db of the individual file extents
that returned the filename to pass to gdallocationinfo

On Feb 1, 2014, at 9:14 AM, David Baker (Geoscience) 
david.m.ba...@chk.com wrote:


Jukka,

Jukka wrote:

I was experimenting with something like a GIS service without a GIS server,
and I have some examples online but...

I am looking to do as you have, a RESTful service to query the elevation at a 
given location.  This will be used in a DQM process as well as a geologic 
application that needs the elevation of a proposed wellsite for data mining.  
In both cases, thousands if not tens of thousands of calls will be made, so 
performance is an issue.

David


-Original Message-
From: gdal-dev-boun...@lists.osgeo.org 
[mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Jukka Rahkonen
Sent: Saturday, February 01, 2014 7:09 AM
To: gdal-dev@lists.osgeo.org
Subject: Re: [gdal-dev] Fast Pixel Access

David Baker (Geoscience) david.m.baker at chk.com writes:





Dev's,

I have a set of 55,501 bil files in a single directory.  They are DEM
data that cover the US in 7.5 minute tiles.  I would like to randomly access
elevations at given lat/lons from the whole dataset.  I created a vrt
file from the directory of bil files, and have been able to access the
elevation at a given lat/lon using gdallocationinfo, but because of the size
of the dataset, this operation is somewhat slow.  Can the vrt be indexed? Or,
is there a faster, better way to access the pixels?  I would first like to do
this with the utilities before diving into code (C#).  The files are
regularly named based on their location within a 1 arc-second grid.

I was experimenting with something like a GIS service without a GIS server,
and I have some examples online, but because of the HTTP connection a speed
comparison does not make sense.

VRT combining biomass data from 13 single-band TIFF files:
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/metla/puuston_tilavuus.vrt -geoloc 389559 6677412

DEM of Finland with a 10x10 m grid through a VRT:
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -geoloc 389559 6677412

The same from a single BigTIFF:
gdallocationinfo /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.tif -geoloc 389559 6677412

Feel free to download the originals if you want, they are all made from open
data. Just mention the National Land Survey of Finland, 2013 for the DEM and
Finnish Forest Research Institute, 2013 for the biomass data if you publish
data somewhere. The DEM datasets are about 10 GB each (BigTIFF + the
original small ones).

My TIFFs are tiled, but for this usage, where only the value of a single
pixel is of interest, striped TIFFs could be as fast to read as tiled
ones. A trial would tell.

-Jukka Rahkonen-



Re: [gdal-dev] Fast Pixel Access

2014-02-01 Thread Jukka Rahkonen
David Baker (Geoscience david.m.baker at chk.com writes:

 
 Jukka,
 
 Jukka wrote:
 I was experimenting with something like a GIS service without a GIS server,
 and I have some examples online but...
 
 I am looking to do as you have, a RESTful service to query the elevation
 at a given location.  This will be used in a DQM process as well as a
 geologic application that needs the elevation of a proposed wellsite for
 data mining.  In both cases, thousands if not tens of thousands of calls
 will be made, so performance is an issue.

I don't believe that gdallocationinfo is the right thing for you. There are
only 60 seconds in a minute and our experiments show that each request takes
seconds or even tens of seconds. Your users will hang you.

Now what to do instead? A WMS/WCS service can send a piece of the DEM with
thousands of pixels as GeoTIFF in a second, and a heavy client like QGIS
could continue the analysis. Or you can use the existing VRT file and read
the region of interest with gdal_translate. This request is not very fast
either, but it brings you one million height values faster than your
potential users could find the rope:

gdal_translate -of GTiff -srcwin 1 1 1000 1000 /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt test_1000by1000.tif

Alternatively, a WFS service could be used to send only those DEM cells that
intersect the requested point/line/area. That could be nice for further
processing, but in that case your DEM would have to be converted into points
or polygons. I wonder if PostGIS raster has something to offer you. Or
perhaps there is some other clever way to do the job with light and powerful
tools. Ideas are welcome.

-Jukka Rahkonen-
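
For readers wondering how a -srcwin window like the one in the gdal_translate
command above relates to a map coordinate, here is a minimal sketch for a
north-up raster. The geotransform values are made up for illustration (they
are not read from dem_10m.vrt); in practice they would come from gdalinfo.

```python
import math

def geo_to_pixel(gt, x, y):
    """Convert a georeferenced (x, y) to (column, row) for a north-up raster.

    gt is a GDAL-style geotransform:
    (origin_x, pixel_width, 0, origin_y, 0, -pixel_height).
    """
    col = math.floor((x - gt[0]) / gt[1])
    row = math.floor((y - gt[3]) / gt[5])
    return col, row

def srcwin_around(gt, x, y, size):
    """Return (xoff, yoff, xsize, ysize) for a size-by-size window around (x, y).

    Offsets are clamped at 0 only; the far edge of the raster is not checked.
    """
    col, row = geo_to_pixel(gt, x, y)
    half = size // 2
    return max(col - half, 0), max(row - half, 0), size, size

# Hypothetical geotransform for a 10 m grid (assumption, not the real DEM).
gt = (0.0, 10.0, 0.0, 7800000.0, 0.0, -10.0)
print(srcwin_around(gt, 389559, 6677412, 1000))
```

The four numbers returned are in the order that -srcwin expects: xoff, yoff,
xsize, ysize.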



Re: [gdal-dev] Fast Pixel Access

2014-02-01 Thread Joaquim Luis



 Now what to do instead? A WMS/WCS service can send a piece of the DEM with
 thousands of pixels as GeoTIFF in a second, and a heavy client like QGIS
 could continue the analysis. Or you can use the existing VRT file and read
 the region of interest with gdal_translate. This request is not very fast
 either, but it brings you one million height values faster than your
 potential users could find the rope:
 
 gdal_translate -of GTiff -srcwin 1 1 1000 1000 /vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt test_1000by1000.tif
 
 Alternatively, a WFS service could be used to send only those DEM cells that
 intersect the requested point/line/area. That could be nice for further
 processing, but in that case your DEM would have to be converted into points
 or polygons. I wonder if PostGIS raster has something to offer you. Or
 perhaps there is some other clever way to do the job with light and powerful
 tools. Ideas are welcome.


Funny, I had tried this idea with GMT, but it took an awfully long time (~34 s), 
while the gdallocationinfo approach took ~14 s.


echo 389559 6677412 | grdtrack -G/vsicurl/http://latuviitta.kapsi.fi/data/dem10m/dem_10m.vrt -R389000/39/6677400/6678000


I have no idea why it takes longer through GMT, since it is still GDAL that 
gets the data from the remote location. Furthermore, I request only a 
small grid chunk (the -R... limits).


Joaquim



Re: [gdal-dev] Fast Pixel Access

2014-02-01 Thread Daniel Clewley
Hi David,

Following on from the VRT / BigTIFF comparison Jukka posted, have you 
considered storing the data as a single KEA format file, which is based on HDF5?

I have the National Elevation Dataset for the US, which comprises 3,605 1 x 1 
degree tiles at 1 arc-second resolution. I first created a VRT, then used 
gdal_translate to convert it to a KEA file. The total size is 421,212 x 252,012 
pixels, 77 GB. I also built overviews for fast display (this took a long time 
and I don't think it is needed for your case).

I've just tried using gdallocationinfo on the file to get pixel information and 
it takes  0.5s to get the pixel value back. 

The KEA library and GDAL driver source are available from:

https://bitbucket.org/chchrsc/kealib/

and the format is described in:

Peter Bunting, Sam Gillingham, The KEA image file format, Computers & 
Geosciences, Volume 57, August 2013, Pages 54-58, ISSN 0098-3004, 
http://dx.doi.org/10.1016/j.cageo.2013.03.025.

If you don't mind having one massive file (in addition to the individual tiles, 
which could be archived), this might work for your use case. 

Thanks,

Dan

 



[gdal-dev] Fast Pixel Access

2014-01-31 Thread David Baker (Geoscience)
Dev's,

I have a set of 55,501 bil files in a single directory.  They are DEM data 
that cover the US in 7.5 minute tiles.  I would like to randomly access 
elevations at given lat/lons from the whole dataset.  I created a vrt file 
from the directory of bil files, and have been able to access the elevation at 
a given lat/lon using gdallocationinfo, but because of the size of the dataset, 
this operation is somewhat slow.  Can the vrt be indexed? Or, is there a 
faster, better way to access the pixels?  I would first like to do this with 
the utilities before diving into code (C#).  The files are regularly named 
based on their location within a 1 arc-second grid.

Thanks,
David

David M. Baker
Senior Advisor - Geoscience Technology
Chesapeake Energy Corporation
david.m.ba...@chk.com
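
Since the tiles are on a regular 7.5-minute grid and regularly named, one
possibility is to compute the tile for a given lat/lon directly, with no
index at all. The sketch below assumes a naming scheme built from the tile's
south-west corner counted in 7.5-minute steps; the actual file naming is not
given in this thread, so tile_name() and its format are purely hypothetical.

```python
import math

TILE_DEG = 7.5 / 60.0  # 7.5 arc-minutes expressed in degrees (0.125)

def tile_name(lon, lat):
    """Return a hypothetical file name for the 7.5-minute tile containing (lon, lat).

    The row/column are the tile's south-west corner in units of 7.5
    arc-minutes. The naming pattern is an assumption for illustration.
    """
    col = math.floor(lon / TILE_DEG)
    row = math.floor(lat / TILE_DEG)
    return f"dem_{row:+05d}_{col:+05d}.bil"

print(tile_name(-96.0, 36.0))
```

With the real naming convention substituted, the resulting single file could
be queried with gdallocationinfo instead of the 55,501-source VRT.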





Re: [gdal-dev] Fast Pixel Access

2014-01-31 Thread Even Rouault
On Friday 31 January 2014 17:15:53, David Baker (Geoscience) wrote:
 Dev's,
 
 I have a set of 55,501 bil files in a single directory.  They are DEMS data
 that cover the US in 7.5 minute tiles.  I would like to randomly access
 elevations at a given lat/lon's from the whole dataset.  I created a vrt
 file from the directory of bil files, and have been able to access the
 elevation at a given lat/lon using gdallocationinfo, but because of the
 size of the dataset, this operation is somewhat slow.  Can the vrt be
 indexed?

No, not currently, although I think it could be improved to have an
in-memory index with moderate effort.

But are you sure the slowness is due to the lack of an index? 55,000 is a big 
number, but not that big. Maybe the slowness just comes from the opening time 
(XML parsing) of such a big VRT. That would need to be profiled to be sure 
where the bottleneck is.
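
One crude way to get a feel for the XML parsing cost is to time a plain parse
of a VRT-sized document. The sketch below builds a synthetic VRT-like string
with 55,501 SimpleSource entries and parses it with Python's
xml.etree.ElementTree. This is not GDAL's own XML parser and the file content
is invented, so the number is only an order-of-magnitude indication.

```python
import time
import xml.etree.ElementTree as ET

def make_vrt(n_sources):
    """Build a synthetic VRT-like XML string with n_sources SimpleSource entries."""
    parts = ['<VRTDataset rasterXSize="1" rasterYSize="1"><VRTRasterBand band="1">']
    for i in range(n_sources):
        parts.append(
            "<SimpleSource>"
            f'<SourceFilename relativeToVRT="1">tile_{i}.bil</SourceFilename>'
            "<SourceBand>1</SourceBand>"
            f'<DstRect xOff="{i}" yOff="0" xSize="1" ySize="1"/>'
            "</SimpleSource>"
        )
    parts.append("</VRTRasterBand></VRTDataset>")
    return "".join(parts)

def time_parse(xml_text):
    """Return (seconds spent parsing, number of SimpleSource elements found)."""
    start = time.perf_counter()
    root = ET.fromstring(xml_text)
    elapsed = time.perf_counter() - start
    return elapsed, len(root.findall(".//SimpleSource"))

elapsed, count = time_parse(make_vrt(55501))
print(f"parsed {count} sources in {elapsed:.3f} s")
```

Timing the real file the same way (reading it from disk first) would also
separate the file read from the parse itself.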

 Or, is there a faster, better way to access the pixels?  I would
 first like to do this with the utilities before diving into code (C#). 
 The files are regularly named base on their location within a 1 arc-second
 grid.
 
 Thanks,
 David
 
 David M. Baker
 Senior Advisor - Geoscience Technology
 Chesapeake Energy Corporation
 david.m.ba...@chk.commailto:david.m.ba...@chk.com
 
 
 
 

-- 
Geospatial professional services
http://even.rouault.free.fr/services.html

Re: [gdal-dev] Fast Pixel Access

2014-01-31 Thread Brian Case
evenr


What about the use of a tileindex? It seems an intersection with a set of
polygons first would be quick.

brian







Re: [gdal-dev] Fast Pixel Access

2014-01-31 Thread Even Rouault
On Saturday 01 February 2014 00:23:13, Brian Case wrote:
 evenr
 
 
 what about the use of a tileindex?

Do you really mean a tileindex as produced by gdaltindex? Well, that's not 
exactly the same beast as a VRT, but yes, if it were recognized as a GDAL 
dataset then you could potentially save the cost of XML parsing. One could 
imagine the VRT driver accepting a tileindex as an alternate connection 
string.

Anyway, it would be interesting to first profile where the time is spent in 
David's use case. If it's in the XML parsing, then I can't see what could be 
easily improved in that area. If it's the intersection, then there's potential 
for improvement.
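
To illustrate what an in-memory index over the source rectangles could look
like, here is a minimal uniform-grid sketch. It is not how the VRT driver
works; the class name, cell size and tile names are assumptions for the
example. Each rectangle is registered in the grid cells it touches, so a
point query only tests the few rectangles sharing its cell instead of all
55,000.

```python
import math
from collections import defaultdict

class GridIndex:
    """Uniform-grid index over axis-aligned rectangles (minx, miny, maxx, maxy)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        return math.floor(x / self.cell_size), math.floor(y / self.cell_size)

    def insert(self, rect, payload):
        minx, miny, maxx, maxy = rect
        cx0, cy0 = self._cell(minx, miny)
        cx1, cy1 = self._cell(maxx, maxy)
        # Register the rectangle in every grid cell it touches.
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                self.cells[(cx, cy)].append((rect, payload))

    def query(self, x, y):
        """Return payloads of rectangles containing the point (x, y)."""
        hits = []
        for (minx, miny, maxx, maxy), payload in self.cells.get(self._cell(x, y), []):
            if minx <= x < maxx and miny <= y < maxy:
                hits.append(payload)
        return hits

index = GridIndex(cell_size=1.0)
index.insert((-96.125, 36.0, -96.0, 36.125), "tile_a.bil")
index.insert((-96.0, 36.0, -95.875, 36.125), "tile_b.bil")
print(index.query(-96.05, 36.05))
```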

 seems an intersection with a set of
 polys first would be quick


 
 brian
 

-- 
Geospatial professional services
http://even.rouault.free.fr/services.html