You could put Zarr into a ZIP. But there's little point in using SOZip for that use case (SOZip was merged into master 6 months ago by the way, in GDAL 3.7.0), since SOZip is meant for compressing large files. In a Zarr archive, you would have a lot of small or medium-sized files, one per chunk/tile, and when you need to read one, you read it in its entirety (whereas the aim of SOZip is to be able to efficiently read a subset of a compressed file). SOZip's main use case is more for vector datasets (GeoPackage, FlatGeobuf, potentially Esri File Geodatabase...)

For Zarr in ZIP, you should either use uncompressed Zarr with ZIP Deflate compression, or compressed Zarr (Blosc, whatever) with uncompressed ZIP (the "store" method). If you have a Zarr dataset with lots of tiles, it might actually be relevant to use the zipindex (https://github.com/minio/zipindex) extension to locate each Zarr chunk more quickly, but GDAL won't make use of it.
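
As a rough illustration of the "compress only once" rule above, here is a minimal sketch using only Python's standard zipfile module. The chunk names, the pack_chunks helper and the use of zlib as a stand-in for Blosc are assumptions made for the example; this is not how GDAL or any Zarr library writes its archives.

```python
import io
import zipfile
import zlib


def pack_chunks(chunks, chunks_already_compressed):
    """Pack {name: bytes} into an in-memory ZIP, compressing exactly once.

    Hypothetical helper: already-compressed chunks are stored as-is,
    raw chunks are deflated by the ZIP layer.
    """
    method = zipfile.ZIP_STORED if chunks_already_compressed else zipfile.ZIP_DEFLATED
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        for name, payload in chunks.items():
            zf.writestr(name, payload)
    return buf.getvalue()


# Four fake 4 KiB chunks (all zeros), named like Zarr chunk keys.
raw_chunks = {f"0.{i}": bytes(4096) for i in range(4)}

# Option 1: uncompressed chunks, ZIP Deflate does the compression.
deflated_zip = pack_chunks(raw_chunks, chunks_already_compressed=False)

# Option 2: chunks compressed beforehand (zlib stands in for Blosc here),
# ZIP only stores them.
compressed_chunks = {k: zlib.compress(v) for k, v in raw_chunks.items()}
stored_zip = pack_chunks(compressed_chunks, chunks_already_compressed=True)

# Reading one chunk back from the "store" archive: one ZIP read, one decompress.
with zipfile.ZipFile(io.BytesIO(stored_zip)) as zf:
    first_chunk = zlib.decompress(zf.read("0.0"))
```

With the "store" method, each chunk sits in the archive byte for byte as the Zarr codec produced it, so a reader that knows the member's offset can fetch it without going through the ZIP decompressor.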

On 08/12/2023 at 21:23, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via gdal-dev wrote:

The underlying network file system is opaque to us and can change on occasion.  But recently our team was asked to cull unused files due to inode counts.

We’re excited to explore SOZip on our vector data, where random seek is important to us, but we’re waiting for that branch to be merged into master.   I don’t trust standard zip libraries to be performant for this use case, but I’m willing to be shown otherwise.

Jesse

*From: *gdal-dev <gdal-dev-boun...@lists.osgeo.org> on behalf of Laurențiu Nicola via gdal-dev <gdal-dev@lists.osgeo.org>
*Reply-To: *Laurențiu Nicola <lnic...@dend.ro>
*Date: *Friday, December 8, 2023 at 3:01 PM
*To: *gdallists <gdal-dev@lists.osgeo.org>
*Subject: *[EXTERNAL] [BULK] Re: [gdal-dev] GTiff bit shuffle compression feature request


On Fri, Dec 8, 2023, at 21:32, Even Rouault wrote:

    yes, poor wording of mine. I meant that if using PREDICTOR=3, one
    should compare with FILTER=DELTA. But looking more closely, they
    are not strictly equivalent. PREDICTOR=3 applies the delta as
    b[0]-a[0], b[1]-a[1], b[2]-a[2], b[3]-a[3] where a[0...3] and
    b[0...3] are seen as the 4 byte representation of the float32,
    whereas FILTER=DELTA does the difference b_float - a_float as
    floating point. This isn't the same...

https://www.blosc.org/posts/bytedelta-enhance-compression-toolset/ seems to be the equivalent.
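
To make the difference described in the quoted message concrete, here is a small sketch with Python's standard struct module. It follows only the description given above (a per-byte difference of the float32 representations versus a difference computed as floating point); it is not the actual libtiff or Zarr filter code, and the function names are made up for the example.

```python
import struct


def bytewise_delta(a, b):
    """Per-byte difference (mod 256) of the little-endian float32 bytes of a and b."""
    a_bytes = struct.pack("<f", a)
    b_bytes = struct.pack("<f", b)
    return bytes((y - x) % 256 for x, y in zip(a_bytes, b_bytes))


def float_delta(a, b):
    """Difference computed as a number, then serialized as little-endian float32."""
    return struct.pack("<f", b - a)


a, b = 1.0, 1.5
print(bytewise_delta(a, b).hex())  # 00004000
print(float_delta(a, b).hex())     # 0000003f
```

The two outputs differ even for this simple pair of values, which is the point made in the quoted message: the two filters produce different byte streams for the compressor.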

> inode allocation

XFS or ZIP?

> extra step to decompress Zarr out of ZIP

Most libraries should be able to read Zarr directly from a ZIP archive.


_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev

--
http://www.spatialys.com
My software is free, but my time generally not.