Re: [gdal-dev] Filesize too large when writing compressed float's to a Geotiff from Python

2015-06-04 Thread Even Rouault

 It makes sense that the order in which the data is written/stored affects
 the performance of the compression, but I don't get why it would be
 different for integers as compared to floats?

Floats are larger than Int8 and Int16, so for the same GDAL_CACHEMAX you can
cache fewer blocks. That causes more temporary flushes of partial tiles/strips
to disk (partial = holding data for only one or two bands, but not all 3).
Those blocks then need to be refetched when data for the remaining bands
becomes available, and are then recompressed and rewritten.
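
As a back-of-the-envelope illustration of the cache argument above, the sketch
below counts how many single-band blocks fit in a fixed cache for several data
types. The 40 MB cache size and the 256x256 block size are assumptions chosen
for the example, and the function name is hypothetical; GDAL's real block cache
accounting is more involved than this.

```python
# Bytes per sample for a few GDAL data types.
BYTES_PER_SAMPLE = {"Byte": 1, "Int16": 2, "Float32": 4, "Float64": 8}

# Assumed cache size for the example (40 MB); use your own GDAL_CACHEMAX.
CACHE_BYTES = 40 * 1024 * 1024


def blocks_in_cache(cache_bytes, block_x, block_y, dtype):
    """Return how many single-band blocks of the given type fit in the cache."""
    block_bytes = block_x * block_y * BYTES_PER_SAMPLE[dtype]
    return cache_bytes // block_bytes


for dtype in ("Byte", "Int16", "Float32", "Float64"):
    # 256x256 blocks are an assumption; strips or other tile sizes scale alike.
    print(dtype, blocks_in_cache(CACHE_BYTES, 256, 256, dtype))
```

With these assumed numbers, Float64 leaves room for an eighth as many cached
blocks as Byte, so partially filled blocks are evicted much sooner.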

 
 

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com
___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev

Re: [gdal-dev] Filesize too large when writing compressed float's to a Geotiff from Python

2015-06-04 Thread Rutger
Even, 

Thanks for the suggestions; the first two work well. I'll have a look at
ds.WriteRaster, which seems like an interesting approach since it also avoids
unnecessary looping over the bands.
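
A minimal sketch of the Dataset.WriteRaster() route, assuming the data is a
numpy array shaped (bands, rows, cols) and that the target dataset was created
with a matching size, band count and data type. The helper name
write_all_bands is hypothetical. WriteRaster() expects a band-sequential
buffer by default, which is what tobytes() produces for an array with that
shape.

```python
def write_all_bands(ds, data, gdal_type):
    """Write a (bands, rows, cols) array to an open GDAL dataset in one call.

    ds        -- an open, writable osgeo.gdal.Dataset
    data      -- numpy array shaped (bands, rows, cols)
    gdal_type -- GDAL type matching data.dtype, e.g. gdal.GDT_Float32
    """
    bands, ysize, xsize = data.shape
    # tobytes() returns the samples in C order: all of band 1, then band 2, ...
    buf = data.tobytes()
    ds.WriteRaster(0, 0, xsize, ysize, buf,
                   buf_type=gdal_type,
                   band_list=list(range(1, bands + 1)))


# Usage (assumes GDAL and numpy are installed and `ds` already exists):
#   from osgeo import gdal
#   write_all_bands(ds, data.astype("float32"), gdal.GDT_Float32)
#   ds = None  # close the dataset to flush it to disk
```

On older numpy releases the equivalent of tobytes() is tostring().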

Writing per block is what I usually do; maybe that's why I never noticed it
before. I ran into it now while fetching and writing a dataset from OpenDAP,
whereas I usually read blocks from GTiffs.

It makes sense that the order in which the data is written/stored affects
the performance of the compression, but I don't get why it would be
different for integers as compared to floats?


Regards,
Rutger









--
View this message in context: 
http://osgeo-org.1560.x6.nabble.com/Filesize-too-large-when-writing-compressed-float-s-to-a-Geotiff-from-Python-tp5208916p5209075.html
Sent from the GDAL - Dev mailing list archive at Nabble.com.

Re: [gdal-dev] Filesize too large when writing compressed float's to a Geotiff from Python

2015-06-03 Thread Even Rouault
Le mercredi 03 juin 2015 15:21:07, Rutger a écrit :
 Dear list,
 
 When I try to write a floating point GeoTIFF from Python (both 32- and
 64-bit), the resulting file size is significantly larger compared to the
 output of gdal_translate. I wrote a small test script which tests this
 for various creation options like compression and block sizes. It seems to
 be the case for any compression method (packbits, lzw, deflate), and other
 creation options don't seem to matter. Uncompressed data, or compressed
 integers work fine; the file size matches gdal_translate very well.
 
 Here is the notebook I used:
 http://nbviewer.ipython.org/gist/RutgerK/27c4af235035621fb609
 
 I would first of all be interested to know if anyone can replicate this
 behavior? And if there is something I can do to prevent this? I would
 rather avoid having to run gdal_translate after each file written from
 Python.

Rutger,

the issue is that you write data band after band, whereas by default the GTiff
driver creates pixel-interleaved datasets. So some blocks in the GTiff might be
reread and rewritten several times as the data from the various bands comes
in.
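
To make the two layouts concrete, here is a toy sketch (plain Python, no GDAL)
of how the samples of a 3-band, 2-pixel image are ordered under each
interleaving. The function names are made up for this illustration.

```python
def pixel_interleaved(bands):
    """Order samples pixel by pixel: r0 g0 b0 r1 g1 b1 (INTERLEAVE=PIXEL)."""
    return [sample for pixel in zip(*bands) for sample in pixel]


def band_interleaved(bands):
    """Order samples band by band: r0 r1 g0 g1 b0 b1 (INTERLEAVE=BAND)."""
    return [sample for band in bands for sample in band]


bands = [["r0", "r1"], ["g0", "g1"], ["b0", "b1"]]
print(pixel_interleaved(bands))  # ['r0', 'g0', 'b0', 'r1', 'g1', 'b1']
print(band_interleaved(bands))   # ['r0', 'r1', 'g0', 'g1', 'b0', 'b1']
```

In the pixel-interleaved case every block holds samples from all three bands,
so writing band 1 alone only fills a third of each block, and the block has to
be revisited when bands 2 and 3 arrive.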

Several fixes/workarounds:
- if you have sufficient RAM to hold another copy of the uncompressed dataset,
increase GDAL_CACHEMAX
- or add options = [ 'INTERLEAVE=BAND' ] in the Create() call to create a
band-interleaved dataset
- more involved fix: since there's no dataset WriteArray() in GDAL Python for
now, you would have to iterate block by block and for each block write the
corresponding region of each band.
- you could also use Dataset.WriteRaster() if you can get a buffer from the
numpy array
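
The creation-option and block-by-block workarounds above could look roughly
like the sketch below. It assumes the data is a numpy array shaped
(bands, rows, cols); the helper names iter_windows and write_blockwise, and the
TILED/COMPRESS choices, are illustrative assumptions rather than anything from
the original script.

```python
def iter_windows(xsize, ysize, block_x, block_y):
    """Yield (xoff, yoff, xcount, ycount) windows covering the raster."""
    for yoff in range(0, ysize, block_y):
        ycount = min(block_y, ysize - yoff)
        for xoff in range(0, xsize, block_x):
            xcount = min(block_x, xsize - xoff)
            yield xoff, yoff, xcount, ycount


def write_blockwise(path, data, gdal_type, interleave="PIXEL"):
    """Write a (bands, rows, cols) numpy array to a compressed, tiled GTiff.

    Every band of a block is written before moving to the next block, so a
    pixel-interleaved block is complete by the time it is flushed.
    Pass interleave="BAND" to create a band-interleaved file instead.
    """
    # Imported here so iter_windows() can be used without GDAL installed.
    from osgeo import gdal

    bands, ysize, xsize = data.shape
    options = ["TILED=YES", "COMPRESS=DEFLATE", "INTERLEAVE=" + interleave]
    ds = gdal.GetDriverByName("GTiff").Create(
        path, xsize, ysize, bands, gdal_type, options=options)
    block_x, block_y = ds.GetRasterBand(1).GetBlockSize()
    for xoff, yoff, xcount, ycount in iter_windows(xsize, ysize,
                                                   block_x, block_y):
        for b in range(bands):
            window = data[b, yoff:yoff + ycount, xoff:xoff + xcount]
            ds.GetRasterBand(b + 1).WriteArray(window, xoff, yoff)
    ds = None  # close the dataset so everything is flushed to disk
```

For the first workaround, the cache can be raised before writing, for example
with gdal.SetCacheMax(512 * 1024 * 1024) (a value in bytes) or
gdal.SetConfigOption('GDAL_CACHEMAX', '512') (a value in megabytes); the 512 MB
figure is only an example.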

Even


 



[gdal-dev] Filesize too large when writing compressed float's to a Geotiff from Python

2015-06-03 Thread Rutger
Dear list,

When I try to write a floating point GeoTIFF from Python (both 32- and
64-bit), the resulting file size is significantly larger compared to the output
of gdal_translate. I wrote a small test script which tests this for
various creation options like compression and block sizes. It seems to be
the case for any compression method (packbits, lzw, deflate), and other
creation options don't seem to matter. Uncompressed data, or compressed
integers work fine; the file size matches gdal_translate very well.

Here is the notebook I used:
http://nbviewer.ipython.org/gist/RutgerK/27c4af235035621fb609

I would first of all be interested to know if anyone can replicate this
behavior? And if there is something I can do to prevent this? I would rather
avoid having to run gdal_translate after each file written from Python.

I'm running it on Windows 7 64-bit. My GDAL version comes from the default
Conda repository, which contains both bindings and utilities (at least for
version 1.11.1).

All I could find was this thread from 2010. It seems somewhat similar, except
that there it's about UInt16, for which it works for me:
http://osgeo-org.1560.x6.nabble.com/gdal-dev-RE-Compression-using-the-create-method-in-python-and-aggregation-methods-td3747703.html


Regards,
Rutger



--
View this message in context: 
http://osgeo-org.1560.x6.nabble.com/Filesize-too-large-when-writing-compressed-float-s-to-a-Geotiff-from-Python-tp5208916.html
Sent from the GDAL - Dev mailing list archive at Nabble.com.