Re: [gdal-dev] TOC sub-dataset retrieving when frame with same coverage share the same boundary rectangle.
Hi

An RPF structure (CADRG, CIB, ECRG, etc.) may include any combination of scale levels and/or geographic zones as referenced in the A.TOC file, so there may be both scale-overlaps and zone-overlaps. There is also a frame-based update concept, but this is seldom used. An example from a CADRG dataset containing 14 subdatasets:

SUBDATASET_6_NAME=NITF_TOC_ENTRY:CADRG_TFC_250K_6_5:A.TOC
SUBDATASET_6_DESC=CADRG:TFC:Transit Flying Chart (UK):250K:6:5
SUBDATASET_7_NAME=NITF_TOC_ENTRY:CADRG_LFC-FR_(Day)_500K_1_6:A.TOC
SUBDATASET_7_DESC=CADRG:LFC-FR (Day):Low Flying Chart (Day) - Host Nation:500K:1:6

This means that you must analyze the individual subdatasets to extract/combine info for each scale level individually. To my knowledge, there is no consistent naming principle in the standard - so the scale levels are just logical names in the A.TOC file. There should be overlaps between the zones, but this data content shall be identical.

Best Regards
Andreas Oxenstierna

Selon Nicolas G nicolas.g-...@outlook.fr:

hi, yes, you have well described the behaviour of the driver. The subdatasets match the boundary rectangles, and it is very odd to have two frames with the same col/row coordinates for the same boundary rectangle. In your case you have to directly open the NITF tile instead of the A.TOC.

Hi, I'm using GDAL to get the CADRG database sub-dataset list. As an example (TOC file content):
- Only one boundary rectangle defined in the "boundary rectangle section",
- In the "frame file index section", I have two frames covering the same area (same resolution, ..., but different functional content),
- These two frames are stored in different paths but all refer to the same boundary rectangle.
In that case, the message "Frame entry(%d,%d) for frame file index %d was already found." is raised by GDAL when loading the TOC file, and finally only one sub-dataset is given back to me (the first one in the TOC file) - method RPFTOCReadFromBuffer() of Rpftocfile.cpp.
It seems that GDAL uses boundary rectangles and frame files to build the sub-dataset list, and if, for the same boundary rectangle, two or more frames with the same coverage belong to different paths, then only one GDAL sub-dataset is created. Can someone help me and explain how to access the data of these two frame files (same coverage, same boundary rectangle, but different paths) through the GDAL API? Or is there any development on this subject?

Thanks, Nicolas

___ gdal-dev mailing list gdal-dev@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/gdal-dev
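For reference, GDAL exposes such TOC entries as paired NAME/DESC keys in the "SUBDATASETS" metadata domain; with the osgeo bindings the dictionary would come from gdal.Open("A.TOC").GetMetadata("SUBDATASETS"). A minimal sketch pairing the entries up, using the sample values from Andreas's message:

```python
import re

def pair_subdatasets(md):
    """Return {index: (name, desc)} from a SUBDATASETS metadata dict."""
    out = {}
    for key, value in md.items():
        m = re.match(r"SUBDATASET_(\d+)_(NAME|DESC)$", key)
        if not m:
            continue
        idx, kind = int(m.group(1)), m.group(2)
        name, desc = out.get(idx, (None, None))
        out[idx] = (value, desc) if kind == "NAME" else (name, value)
    return out

# Sample values from the CADRG dataset described above.
md = {
    "SUBDATASET_6_NAME": "NITF_TOC_ENTRY:CADRG_TFC_250K_6_5:A.TOC",
    "SUBDATASET_6_DESC": "CADRG:TFC:Transit Flying Chart (UK):250K:6:5",
    "SUBDATASET_7_NAME": "NITF_TOC_ENTRY:CADRG_LFC-FR_(Day)_500K_1_6:A.TOC",
    "SUBDATASET_7_DESC": "CADRG:LFC-FR (Day):Low Flying Chart (Day) - Host Nation:500K:1:6",
}
subs = pair_subdatasets(md)
print(subs[6][0])  # NITF_TOC_ENTRY:CADRG_TFC_250K_6_5:A.TOC
```

Each subdataset name can then be passed back to gdal.Open() to read that individual scale level.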
Re: [gdal-dev] gdalwarp EPSG:32662 problem
2014-02-09 8:40 GMT+02:00 Andre Joost andre+jo...@nurfuerspam.de: On 09.02.2014 00:42, Even Rouault wrote:

But Antti's guess seems right. Instead of +ellps=WGS84 (or +datum=WGS84), if you play with the a (semi-major axis) and b (semi-minor axis) parameters, you can see that only +a has an influence, so the latest proj version seems to use a spherical version of eqc.

If you look at http://trac.osgeo.org/proj/browser/trunk/proj/src/PJ_eqc.c and chapter 12 of Snyder's manual, you will only find formulas for the sphere. So I guess there is no other way to calculate eqc. Maybe older versions calculated another radius for the sphere when an ellipsoid was given. Stephen's shift was about 20 km south, which correlates quite well if you use the semi-minor axis of WGS84 as the radius of the sphere while calculating forward, and the semi-major axis as the radius of the sphere while calculating the inverse. At a latitude of 55 degrees the difference is ca. 20 530 meters (55 degrees - 54.8156 degrees).

There are several different radii of the Earth, and some of them could arguably be used in this context in place of the semi-major axis. All radii are not created equal: http://en.wikipedia.org/wiki/Earth_radius

You have to bear in mind that this projection is intended for small-scale mapping, for example mapping the whole world. At that scale 20 km is nothing. If you need a better representation of the Earth you have to use a projection which takes ellipsoidal properties into account. Of course the beef in this thread is not about choosing a projection, but the change/difference in the formulae used, which can create problems as Stephen pointed out.

Cheers, Antti
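Antti's ~20 km figure at 55 degrees can be reproduced in a few lines, assuming the scenario described above: the forward transform used the WGS84 semi-minor axis as the sphere radius and the inverse used the semi-major axis. Eqc on a sphere maps latitude phi to y = R * phi, so a mismatched R on the way back shrinks the latitude by the factor b/a:

```python
import math

a = 6378137.0        # WGS84 semi-major axis (m)
b = 6356752.314245   # WGS84 semi-minor axis (m)

phi = math.radians(55.0)
y = b * phi                 # forward projection with sphere radius b
phi_back = y / a            # inverse projection with sphere radius a

lat_back = math.degrees(phi_back)        # ~54.8156 deg, as in the thread
shift_m = (a - b) * phi                  # southward shift along the meridian
print(lat_back, shift_m)
```

The round-tripped latitude matches the 54.8156 degrees quoted above, and the meridional shift comes out around 20.5 km.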
Re: [gdal-dev] WMS Issues Two Requests
Unfortunately, Even, 'that's life' in my customers' world was too slow. Now I understand what the WMS driver is doing in terms of BlockSize and DataWindow, I can work around the issue for the time being by requesting a data-window-sized block in a single request rather than lots of smaller blocks in several requests. This has effectively sped up the user interaction from about 5 secs to 1-2 secs, an acceptable delay when navigating around a map.

Thanks Martin

-- View this message in context: http://osgeo-org.1560.x6.nabble.com/Re-WMS-Issues-Two-Requests-tp5102423p5102848.html Sent from the GDAL - Dev mailing list archive at Nabble.com.
Re: [gdal-dev] TOC sub-dataset retrieving when frame with same coverage share the same boundary rectangle.
Hi Andreas,

The fact is that I can't use the GDAL API to get the different sub-datasets in this particular case, so I can't make any combination of frames without going directly to all the NITF frames as suggested by Even.

Regarding your answer, maybe the meaning of Zone and Scale is not clear to me: Zone = Arc Zone? Scale = logical name (as in your example LFC, TFC...)? But not really the resolution. For me, the subdataset could be a directory in the RPF directory tree, as each directory is generated according to different parameters. You normally have one directory (path) per [Arc Zone / Scale logical name (LFC, TFC...)] couple (I don't take into account the fact that a directory can be split if frame files are not contiguous). The problem appears when two [Arc Zone / Scale logical name (LFC, TFC...)] contain data at the same resolution, so with the same tiling, so potentially with frames at the same col/row.

As you have written "There should be overlaps between the zones but this data content shall be identical", the following example is not really possible?

Example: I have one boundary rectangle which belongs to two different paths (but for the same coverage (same Arc zone)), for example a path with LFC-FR_(Night) content and a path with LFC-FR_(Day) content (all at the same resolution). For each frame coverage I have one frame in the LFC-FR_(Night) path and one frame in the LFC-FR_(Day) path (so different paths and frame names and extensions); GDAL then reports only one sub-dataset to me, so I can't access the data in the two paths through the GDAL subdataset API. I need to analyse the TOC file in my app to get the paths, boundaries... and then access the frame files.

Nico

Date: Mon, 10 Feb 2014 09:27:03 +0100 From: a...@t-kartor.se To: even.roua...@mines-paris.org; nicolas.g-...@outlook.fr CC: gdal-dev@lists.osgeo.org Subject: Re: [gdal-dev] TOC sub-dataset retrieving when frame with same coverage share the same boundary rectangle.
Re: [gdal-dev] WMS Issues Two Requests
Selon Martin Lewis mle...@ngms.eu.com:

Unfortunately, Even, 'that's life' in my customers' world was too slow.

I didn't imply that you had to bear poor performance. For higher performance, other protocols that are tile-oriented and cache-friendly, such as WMTS or TMS, are more appropriate.

Now I understand what the WMS driver is doing in terms of BlockSize and DataWindow, I can work around the issue for the time being by requesting a data-window-sized block in a single request rather than lots of smaller blocks in several requests. This has effectively sped up the user interaction from about 5 secs to 1-2 secs, an acceptable delay when navigating around a map.

Thanks Martin
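Martin's workaround helps because the WMS driver fetches data block by block, so on a cold cache a read roughly costs one HTTP round-trip per block it touches. A quick sketch of that request count (the block sizes are illustrative, not taken from his configuration):

```python
import math

def blocks_needed(win_w, win_h, block_w, block_h):
    """Worst-case number of block fetches a win_w x win_h read touches,
    assuming one request per cache-missed block and aligned windows."""
    return math.ceil(win_w / block_w) * math.ceil(win_h / block_h)

# A 2048x2048 map view read through 256x256 blocks vs. one full-window block:
print(blocks_needed(2048, 2048, 256, 256))    # 64 requests
print(blocks_needed(2048, 2048, 2048, 2048))  # 1 request
```

With per-request latency dominating, collapsing 64 round-trips into one explains the drop from ~5 s to 1-2 s.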
Re: [gdal-dev] Fast Pixel Access
Even,

No, not an i386... a Dell Precision T3500 w/ Intel W3680 @ 3.33 GHz, 6x2 cores, with 12.0 GB. Though the data is on the network, not local, with 1 Gbps access. GDAL_DISABLE_READDIR_ON_OPEN = TRUE did significantly increase the speed. Does the BIL driver read the whole file into memory first? Might a direct read be faster? And Even, please excuse my ignorance, but what is gdb? I really would like to do the profiling.

David

-Original Message- From: Even Rouault [mailto:even.roua...@mines-paris.org] Sent: Sunday, February 09, 2014 6:36 AM To: David Baker (Geoscience) Cc: 'Brian Case'; 'gdal-dev@lists.osgeo.org' Subject: Re: [gdal-dev] Fast Pixel Access

On Saturday, February 1, 2014 at 15:04:46, David Baker (Geoscience) wrote:

Even, I am not sure how to profile as I do not have access to the code to profile. I did do a timing test...

vrt file = 22,970 KB
bil file = 35,180 KB * 55,501

I piped five locations from the loc.txt file: -96.0 36.0 -98.0 37.0 -100.0 38.0 -99.0 39.0 -101.0 35.0

gdallocationinfo -valonly -geoloc intermap.vrt loc.txt

189.841857910156   25.5 sec
384.857452392578   22.6 sec
762.015930175781   22.9 sec
550.719116210938   23.6 sec
883.637023925781   22.9 sec

Note: I used a lap timer on my iPhone to capture the split times as the results appeared in the console window. Does this give any insight?

Woo, I agree that's utterly slow!
When you mentioned slow I thought it was more on the order of 0.1 second! We can already exclude the parsing time of the VRT since you do that in the same gdallocationinfo session, so there will be just one parse. And I can't believe that the intersection test for 55,000 rectangles takes ~20 seconds, unless you have an old i386 at 5 MHz ;-) My usual way of profiling stuff that is slow on the order of more than one second is to run under gdb, break with Ctrl+C, display the stack trace, continue the run, break again, display the stack trace, etc. If you end up breaking in the same function, then you've found the bottleneck.

I see now that in that thread GDAL_DISABLE_READDIR_ON_OPEN = TRUE was suggested and seems to improve things significantly. Perhaps we should try to cache the result of the initial readdir so it can benefit later attempts, but I haven't checked how easily that could be implemented. Or perhaps we should just change the default value of GDAL_DISABLE_READDIR_ON_OPEN, since it causes problems from time to time. But generally filesystems don't behave very well when there are a lot of files in the same directory. You'd be better off organizing your tiles in subdirectories. But still, 1 to 3 seconds sounds a bit slow to me. Would be cool if you could try the above suggestion to identify where the time is spent.

Even

-Original Message- From: gdal-dev-boun...@lists.osgeo.org [mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Even Rouault Sent: Saturday, February 01, 2014 1:28 AM To: Brian Case Cc: gdal-dev@lists.osgeo.org Subject: Re: [gdal-dev] Fast Pixel Access

On Saturday, February 1, 2014 at 00:23:13, Brian Case wrote:

evenr, what about the use of a tileindex?

You really mean a tileindex as produced by gdaltindex? Well, that's not exactly the same beast as a VRT, but yes, if it was recognized as a GDAL dataset then you could potentially save the cost of XML parsing. One could imagine that the VRT driver would accept a tileindex as an alternate connection string. Anyway, it would be interesting to first profile where the time is spent in David's use case. If it's in the XML parsing, then I can't see what could be easily improved in that area. If it's the intersection, then there's potential for improvement.

seems an intersection with a set of polys first would be quick. brian

On Fri, 2014-01-31 at 19:30 +0100, Even Rouault wrote: On Friday, January 31, 2014 at 17:15:53, David Baker (Geoscience) wrote:

Devs, I have a set of 55,501 bil files in a single directory. They are DEM data that cover the US in 7.5-minute tiles.
I would like to randomly access elevations at given lat/lons from the whole dataset. I created a vrt file from the directory of bil files, and have been able to access the elevation at a given lat/lon using gdallocationinfo, but because of the size of the dataset, this operation is somewhat slow. Can the vrt be indexed?

No, it isn't currently, although I think it could be improved to have an in-memory index with moderate effort. But are you sure the slowness is due to the lack of an index? 55,000 is a big number, but not that big. Maybe the slowness just comes from the opening time (XML parsing) of such a big VRT. That would need to be profiled to be sure where the bottleneck is.

Or, is there a faster, better way to access the pixels? I would first like to do this with the utilities before diving into code (C#). The
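For context, the per-point work gdallocationinfo does once the dataset is open boils down to inverting the geotransform to get a pixel/line address. A minimal sketch of that arithmetic for the axis-aligned (north-up) case; the geotransform values are hypothetical, and with the osgeo bindings gt would come from ds.GetGeoTransform():

```python
def world_to_pixel(gt, x, y):
    """Invert gt = (x0, dx, 0, y0, 0, dy) for an axis-aligned raster.
    Returns the (pixel, line) containing georeferenced point (x, y)."""
    px = int((x - gt[0]) / gt[1])
    line = int((y - gt[3]) / gt[5])
    return px, line

# Hypothetical tile: origin (-100, 40), 0.001-degree pixels, north-up.
gt = (-100.0, 0.001, 0.0, 40.0, 0.0, -0.001)
print(world_to_pixel(gt, -99.5, 39.5))  # (500, 500)
```

The arithmetic itself is trivially fast; the slowness discussed in this thread comes from opening and searching the 55,501-source VRT, not from this step.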
Re: [gdal-dev] Fast Pixel Access
Selon David Baker (Geoscience) david.m.ba...@chk.com:

Even, No, not an i386... a Dell Precision T3500 w/ Intel W3680 @ 3.33 GHz, 6x2 cores, with 12.0 GB. Though the data is on the network, not local, with 1 Gbps access. GDAL_DISABLE_READDIR_ON_OPEN = TRUE did significantly increase the speed. Does the BIL driver read the whole file into memory first? Might a direct read be faster?

No, the BIL driver will just read the line where the pixel is.

And Even, please excuse my ignorance, but what is gdb? I really would like to do the profiling.

gdb is the GNU debugger ( https://www.gnu.org/software/gdb/ ), assuming you use Linux (it is likely available on Mac OS X too). You should be able to install it with the usual package management system of the distribution: apt-get install gdb, yum install gdb, etc. On Windows, I'm far less familiar with the debugging tools.

gdb --args gdallocationinfo -valonly -geoloc intermap.vrt
Then type "run"
Type a coordinate: -96.0 36.0
Ctrl+C to suspend execution
"bt" to display the stack trace
"c" to resume execution
[gdal-dev] Problems compressing tiff files
Hello all,

I'm currently trying to do the following, using GDAL:
1) Combining some tiffs into one using gdal_merge, COMPRESS=LZW
2) Using gdalwarp with a cutline to cut out a certain piece of the result
3) Compressing again using gdalwarp, COMPRESS=LZW
4) Generating pyramids with gdaladdo

However, my resulting TIFF is still very big, which greatly hinders performance in later steps of my workflow. I suspect that some compression steps don't function properly, but I'm not being notified if that happens. For instance, applying these steps to five tiffs, each no bigger than 1.5 MB, results in a tiff of 12 GB. Any help or suggestions would be greatly appreciated.

Kind regards, Rafael Krucker
Re: [gdal-dev] TOC sub-dataset retrieving when frame with same coverage share the same boundary rectangle.
Hi Even,

Thank you; in fact it seems clear according to your RPF spec analysis that the A.TOC file is not correctly formatted. The TOC file was provided by the data producer... As you say, the solution is to define another boundary rectangle with the same coordinates, and it will be OK.

Thanks, Nico

Date: Mon, 10 Feb 2014 13:15:24 +0100 From: even.roua...@mines-paris.org To: nicolas.g-...@outlook.fr CC: a...@t-kartor.se; even.roua...@mines-paris.org; gdal-dev@lists.osgeo.org Subject: RE: [gdal-dev] TOC sub-dataset retrieving when frame with same coverage share the same boundary rectangle.

Selon Nicolas G nicolas.g-...@outlook.fr:

The problem appears when two [Arc Zone/Scale Logical Name (LFC, TFC...)] contain data at the same resolution, so with the same tiling, so potentially with frames at the same col/row.

I suspect your A.TOC is not correctly formatted, or formatted in an uncommon way. For frames attached to one given (Arc Zone, Scale Logical Name), called a boundary rectangle in the RPF spec, you should not have 2 frames with the same col/row. According to http://earth-info.nga.mil/publications/specs/printed/2411/2411_RPF.pdf page 20: b.
Each boundary rectangle defined in the [boundary rectangle section] of a [table of contents file] will act as a logical container for a rectangular "virtual matrix" of frames (see 5.2.1 below). Individual frames may be referenced by their (row, column) position within such a matrix.

And then page 30: (3) The [boundary rectangle section] will contain the boundaries of one or more boundary rectangles, each defining the periphery of a geographic area containing all of the [frame file]s in the given data interchange volume that have a given data type, compression ratio, producer, latitudinal zone, and scale (or resolution). A given record will also specify the dimensions of a rectangular "virtual matrix" of fixed-size frames of the given scale or resolution that fills the given boundary rectangle.

Page 31: (4) The [frame file index section] will contain scales and data types for all [frame file]s in the given volume. Each entry will identify the boundary rectangle (named in the [boundary rectangle section]) where the frame is located, and it will specify the row and column in a "virtual matrix" of frames within the boundary rectangle where the specific frame is located.

== So this sounds pretty clear to me that you shouldn't have 2 frames within the same boundary rectangle that have identical row,col. The solution would be to have 2 boundary rectangles (with the same extent) and attach each frame to one of those boundary rectangles.

Where does your A.TOC come from: is it provided by a data producer, or your own creation?
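The "virtual matrix" addressing in the spec excerpts above can be sketched as plain arithmetic: each frame (row, col) occupies a fixed-size cell counted from the rectangle's NW corner. The function and the rectangle values below are illustrative, not the RPF record layout:

```python
def frame_extent(nw_lon, nw_lat, frame_dlon, frame_dlat, row, col):
    """Geographic extent (west, south, east, north) of frame (row, col)
    in a boundary rectangle whose NW corner is (nw_lon, nw_lat) and whose
    frames are frame_dlon x frame_dlat degrees. Rows count southward."""
    west = nw_lon + col * frame_dlon
    north = nw_lat - row * frame_dlat
    return (west, north - frame_dlat, west + frame_dlon, north)

# Frame (1, 2) in a hypothetical rectangle with NW corner (0, 50), 1x1 deg frames:
print(frame_extent(0.0, 50.0, 1.0, 1.0, 1, 2))  # (2.0, 48.0, 3.0, 49.0)
```

This also shows why two frames with the same (row, col) in one rectangle would claim the same cell, which is what trips GDAL's duplicate-entry check.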
Re: [gdal-dev] Problems compressing tiff files
Selon Rafael Krucker rkruc...@hsr.ch:

Hello all, I'm currently trying to do the following, using GDAL: 1) Combining some tiffs into one using gdal_merge, COMPRESS=LZW 2) Using gdalwarp with a cutline to cut out a certain piece of the result 3) Compressing again using gdalwarp, COMPRESS=LZW

The way gdalwarp works by default will not produce optimal compressed sizes. See https://trac.osgeo.org/gdal/wiki/UserDocs/GdalWarp#GeoTIFFoutput-coCOMPRESSisbroken for explanations and solutions. But if you just need to compress, as it seems in your step 3, you should rather use gdal_translate than gdalwarp. It will be faster and give optimally compressed files out of the box.
Re: [gdal-dev] how to create just the msk file from a rgba vrt file
Hi Even, and many thanks! Your pointers have solved my question.

As for the problem building overviews, it seems that it occurs when you enter an OVERVIEW tag pointing to a regular image that doesn't have the INTERNAL_MASK_FLAGS_# set. On the other hand, these flags allowed me to create a mask file even faster than using gdal_translate. The process is:

1) get a shapefile that represents your mask (maybe use gdaltindex and dissolve the result)
2) use gdalinfo to get the extent and cell size of your vrt mosaic
3) use gdal_rasterize to get your mask raster, burning the shapefile into a new image initialized with 0 and burnt with 255
4) set the metadata tags to make this raster a mask (I used a small Python script for this)
5) rename your raster mask to match your vrt mosaic (e.g. mosaic.vrt.msk)

That's it. An (almost) instant mask file: instead of 4h41min it took 10min to create my mask. (The vrt mosaic references 487 rgb tiff files, totaling 240001x25 pixels in size.)

The command to build the mask was the following:

gdal_rasterize --config GDAL_CACHEMAX 512 -of GTiff -co compress=deflate -co tiled=yes -l quads_diss_etrs -te -79997.180 -22.390 40002.909 -104999.818 -tr 0.44258518990 0.44258518990 -ot Byte -burn 255 -init 0 -co INTERLEAVE=BAND -co NBITS=1 quads_diss_etrs.shp maskfile.tif

I'm not sure about the NBITS option's effect though...

Then to insert the mask flags I used this Python script:

import sys
from osgeo import gdal
from osgeo.gdalconst import GA_Update

gdal.UseExceptions()
try:
    filename = sys.argv[1]
    dataset = gdal.Open(filename, GA_Update)
    dataset.SetMetadataItem("INTERNAL_MASK_FLAGS_1", "2")
    dataset.SetMetadataItem("INTERNAL_MASK_FLAGS_2", "2")
    dataset.SetMetadataItem("INTERNAL_MASK_FLAGS_3", "2")
except Exception as e:
    print("Error: " + str(e))
dataset = None

Thanks again.
Duarte

-Original Message- From: Even Rouault [mailto:even.roua...@mines-paris.org] Sent: Sunday, 9 February 2014 09:56 To: gdal-dev@lists.osgeo.org Cc: Duarte Carreira; Brian Case Subject: Re: [gdal-dev] how to create just the msk file from a rgba vrt file

On Friday, February 7, 2014 at 10:55:06, Duarte Carreira wrote:

Thanks Brian. But this way you rewrite the whole image to disk. It uses lots of disk space and takes forever. I want to avoid that and just get the mask out to a .msk file as fast as possible. I don't want to convert my rgba vrt mosaic. The final objective is to get a .msk file I can use with the simple rgb vrt mosaic. Then I'll be able to build overviews with jpeg/ycbcr compression, which I can't do with the rgba vrt because of the 4 bands.

For now I have tried 3 ways:

1) use gdal_rasterize to create a mask directly from the mask polygon shapefile. Then just edit the rgb vrt mosaic and add a maskband to it. The problem here is that gdaladdo does not honor the maskband. This is the fastest way I know; a pity it doesn't work in the end.

I'd be curious for you to provide ways of reproducing this. The overview computation code has explicit code to deal with mask bands.

2) use gdal_translate like you suggested but use -scale to write all 0s in all 3 bands, and compress with deflate. You get a valid mask and a very, very small useless mosaic. This works but still takes a while.

3) use gdal_translate like you suggested but exaggerate the jpeg compression so it errors out (jpeg_quality=15). You get an invalid 1 KB mosaic and an apparently good msk. But it's corrupted in some way. So it doesn't work.

I think #1 has potential. If there was a way to somehow turn the tif created by gdal_rasterize into a true mask file and have it honored by gdaladdo, we would have a winner. Maybe there's a way to directly export the alpha band from the rgba vrt mosaic to a .msk file without writing anything else? That I guess would be the fastest way of all.
You can generate a valid .msk file with the following gdal_translate command:

gdal_translate -b 4 rgba.tif out.tif.msk \
 -mo INTERNAL_MASK_FLAGS_1=2 -mo INTERNAL_MASK_FLAGS_2=2 \
 -mo INTERNAL_MASK_FLAGS_3=2 -co COMPRESS=DEFLATE -co INTERLEAVE=BAND \
 -co NBITS=1

You need to provide as many -mo INTERNAL_MASK_FLAGS_X=2 options as there are bands (so 3 for an RGB dataset, as in the above example). This is the important option that makes a .msk file be recognized as a mask band.

Even

-- Geospatial professional services http://even.rouault.free.fr/services.html
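For context on the value 2 used throughout this thread: the INTERNAL_MASK_FLAGS_n values form a small bitmask mirroring GDAL's GMF_* constants from gdal.h, and 2 marks the mask as per-dataset. A short sketch of the flag values and how they combine:

```python
# GDAL mask-flag bit values (mirroring GMF_* in gdal.h); the "2" written
# into INTERNAL_MASK_FLAGS_n above marks the mask as per-dataset.
GMF_ALL_VALID   = 0x01  # band has no invalid pixels; mask is all 255
GMF_PER_DATASET = 0x02  # one mask band shared by all bands of the dataset
GMF_ALPHA       = 0x04  # mask is to be interpreted as an alpha band
GMF_NODATA      = 0x08  # mask derived from the band's nodata value

def describe_mask_flags(flags):
    """Human-readable names of the bits set in a mask-flags value."""
    names = {GMF_ALL_VALID: "ALL_VALID", GMF_PER_DATASET: "PER_DATASET",
             GMF_ALPHA: "ALPHA", GMF_NODATA: "NODATA"}
    return [name for bit, name in names.items() if flags & bit]

print(describe_mask_flags(2))  # ['PER_DATASET']
```

With the osgeo bindings, band.GetMaskFlags() returns such a value, so the helper above can be used to inspect what kind of mask GDAL actually sees after the .msk trick.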
Re: [gdal-dev] Fast Pixel Access
Even, I am on Windows and only have the binaries installed.

David

-----Original Message-----
From: Even Rouault [mailto:even.roua...@mines-paris.org]
Sent: Monday, February 10, 2014 8:54 AM
To: David Baker (Geoscience)
Cc: 'Even Rouault'; 'Brian Case'; 'gdal-dev@lists.osgeo.org'
Subject: RE: [gdal-dev] Fast Pixel Access

According to David Baker (Geoscience) david.m.ba...@chk.com:

Even, no, not an i386... a Dell Precision T3500 with an Intel W3680 @ 3.33 GHz (6x2 cores) and 12.0 GB of RAM. Though the data is on the network, not local, with 1 Gbps access. GDAL_DISABLE_READDIR_ON_OPEN=TRUE did significantly increase the speed. Does the BIL driver read the whole file into memory first? Might a direct read be faster?

No, the BIL driver will just read the line where the pixel is.

And Even, please excuse my ignorance, but what is gdb? I really would like to do the profiling.

gdb is the GNU debugger ( https://www.gnu.org/software/gdb/ ). Assuming you use Linux (it is likely available on MacOS X too), you should be able to install it with the usual package management system of the distribution: apt-get install gdb, yum install gdb, etc. On Windows, I'm far less familiar with the debugging tools.

gdb --args gdallocationinfo -valonly -geoloc intermap.vrt
Then type "run"
Type a coordinate: -96.0 36.0
Ctrl+C to suspend execution
"bt" to display the stack trace
"c" to resume execution

David

-----Original Message-----
From: Even Rouault [mailto:even.roua...@mines-paris.org]
Sent: Sunday, February 09, 2014 6:36 AM
To: David Baker (Geoscience)
Cc: 'Brian Case'; 'gdal-dev@lists.osgeo.org'
Subject: Re: [gdal-dev] Fast Pixel Access

On Saturday, February 1, 2014 15:04:46, David Baker (Geoscience) wrote:

Even, I am not sure how to profile as I do not have access to the code to profile. I did do a timing test...
vrt file = 22,970 KB; bil files = 35,180 KB * 55,501

I piped five locations from the loc.txt file:

-96.0 36.0
-98.0 37.0
-100.0 38.0
-99.0 39.0
-101.0 35.0

gdallocationinfo -valonly -geoloc intermap.vrt loc.txt

189.841857910156    25.5 sec
384.857452392578    22.6 sec
762.015930175781    22.9 sec
550.719116210938    23.6 sec
883.637023925781    22.9 sec

Note: I used a lap timer on my iPhone to capture the split times as the results appeared in the console window. Does this give any insight?

Woo, I agree that's utterly slow! When you mentioned slow, I thought it was more in the order of 0.1 second! We can already exclude the parsing time of the VRT, since you do that in the same gdallocationinfo session and there will be just one parsing. And I can't believe that the intersection test for 55,000 rectangles takes ~20 seconds, unless you have an old i386 at 5 MHz ;-)

My usual way of profiling stuff that is slow in the order of more than one second is to run under gdb, break with Ctrl+C, display the stack trace, continue the run, break again, display the stack trace, etc. If you end up breaking in the same function, then you've found the bottleneck.

I see now that in that thread GDAL_DISABLE_READDIR_ON_OPEN=TRUE was suggested and seems to improve things significantly. Perhaps we should try to cache the result of the initial readdir so it can benefit later attempts, but I haven't checked how easily that could be implemented. Or perhaps we should just change the default value of GDAL_DISABLE_READDIR_ON_OPEN, since it causes problems from time to time. But generally filesystems don't behave very well when there are a lot of files in the same directory; you'd do better organizing your tiles in subdirectories. But still, 1 to 3 seconds sounds a bit slow to me. Would be cool if you could try the above suggestion to identify where the time is spent.
Even

David

-----Original Message-----
From: gdal-dev-boun...@lists.osgeo.org [mailto:gdal-dev-boun...@lists.osgeo.org] On Behalf Of Even Rouault
Sent: Saturday, February 01, 2014 1:28 AM
To: Brian Case
Cc: gdal-dev@lists.osgeo.org
Subject: Re: [gdal-dev] Fast Pixel Access

On Saturday, February 1, 2014 00:23:13, Brian Case wrote:

evenr, what about the use of a tileindex?

You really mean a tileindex as produced by gdaltindex? Well, that's not exactly the same beast as a VRT, but yes, if it was recognized as a GDAL dataset then you could potentially save the cost of the XML parsing. One could imagine that the VRT driver would accept a tileindex as an alternate connection string. Anyway, it would be interesting to first profile where the time is spent in David's use case. If it's in the XML parsing, then I can't see what could be easily improved in that area. If it's the intersection, then there's potential for improvement.

seems an intersection with a set of polys first would be quick

brian

On Fri, 2014-01-31 at 19:30 +0100, Even Rouault wrote:

On Friday, January 31, 2014 17:15:53, David Baker
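Even's point above, that a brute-force intersection test over ~55,000 rectangles cannot plausibly account for 20 seconds, is easy to sanity-check outside GDAL. The following is a minimal standalone sketch with synthetic tile extents (it is not GDAL's actual VRT source-lookup code) that times a point-in-bbox scan over 55,000 boxes:

```python
import random
import time

# Build a synthetic grid of ~55,000 tile bounding boxes (xmin, ymin, xmax, ymax)
# roughly matching the geographic extent of the queries in the thread.
random.seed(0)
tiles = []
for _ in range(55_000):
    x = random.uniform(-110.0, -90.0)
    y = random.uniform(30.0, 45.0)
    tiles.append((x, y, x + 0.1, y + 0.1))

def tiles_containing(px, py, boxes):
    """Brute-force point-in-bbox test over every tile, as a reader without
    any spatial index would effectively do."""
    return [b for b in boxes
            if b[0] <= px <= b[2] and b[1] <= py <= b[3]]

start = time.perf_counter()
hits = tiles_containing(-96.0, 36.0, tiles)
elapsed = time.perf_counter() - start

# Even in pure Python this scan takes on the order of milliseconds,
# nowhere near 20 seconds.
print(f"{len(hits)} hits in {elapsed * 1000:.2f} ms")
```

This supports the conclusion in the thread that the time was going to per-open filesystem work (hence the effect of GDAL_DISABLE_READDIR_ON_OPEN) rather than to the rectangle intersection itself.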
Re: [gdal-dev] Fast Pixel Access
On Tuesday, February 11, 2014 00:10:20, David Baker (Geoscience) wrote:

Even, I am on Windows and only have the binaries installed.

Well, I'll let the Windows developers lurking here answer if they have some good advice. I imagine you would need binaries with debugging symbols to be able to get a useful stack trace.