Hi Brett,

I'm not sure that this is authoritative, but the Java API documentation
says that it "aims" to provide "Full support for the .xz file format
specification version 1.0.4".
I am using the latest release of the Java library, version 1.9.

Gary

On Sat, Jul 9, 2022 at 7:58 PM Brett Okken <brett.okken...@gmail.com> wrote:
>
> What version of xz are you using?
>
> The differences between xz and lzma are a bit more involved. One such
> example is that xz is a framed format which includes checksums on each
> "frame". I would not expect checksum verification to account for all of
> that difference, but it can be disabled to confirm.
>
> On Sat, Jul 9, 2022 at 6:31 AM Gary Lucas <gwluca...@gmail.com> wrote:
>>
>> Hi,
>>
>> Would anyone be able to confirm that I am using the Java library
>> xz-java-1.9.zip correctly? If not, could you suggest a better way to
>> use it? Code snippets are included below.
>>
>> I am using the library to compress a public-domain data product called
>> ETOPO1. ETOPO1 provides a global-scale grid of 233 million elevation
>> and ocean depth samples as integer meters. My implementation
>> compresses the data in separate blocks of about 20 thousand values
>> each. Previously, I used Huffman coding and Deflate to reduce the size
>> of the data to about 4.39 bits per value. With your library, LZMA
>> reduces that to 4.14 bits per value and XZ to 4.16 bits per value, so
>> both techniques represent a substantial improvement in compression
>> over the Huffman/Deflate approach. That improvement comes at a
>> reasonable cost: decompression using LZMA and XZ is slower than
>> Huffman/Deflate. The original implementation requires an average of
>> 4.8 seconds to decompress the full set of 233 million points. The LZMA
>> version requires 15.2 seconds, and the XZ version requires 18.9
>> seconds.
>>
>> My understanding is that XZ should perform better than LZMA. Since
>> that is not the case, could there be something suboptimal with the way
>> my code uses the API?
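Brett's suggestion to rule out checksum overhead can be tried on the
encoding side. A minimal sketch, assuming the xz-java (org.tukaani.xz)
API, where XZOutputStream accepts an integrity-check type and
XZ.CHECK_NONE disables checks (the two-argument constructor defaults to
CRC64):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.tukaani.xz.LZMA2Options;
import org.tukaani.xz.XZ;
import org.tukaani.xz.XZOutputStream;

public class XzNoCheck {
    // Compress one block of raster data with integrity checks disabled.
    // A stream written with XZ.CHECK_NONE carries no per-block checksum,
    // so decoding it also skips checksum computation.
    static byte[] compressNoCheck(byte[] input) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (XZOutputStream xzOut =
                new XZOutputStream(baos, new LZMA2Options(), XZ.CHECK_NONE)) {
            xzOut.write(input, 0, input.length);
            xzOut.finish();
        }
        return baos.toByteArray();
    }
}
```

Timing decompression of streams written this way against the default
CRC64 variant would show how much of the LZMA-vs-XZ gap the per-frame
checks explain.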
>>
>> If you would like more detail about the implementation, please visit:
>>
>> Compression Algorithms for Raster Data:
>> https://gwlucastrig.github.io/GridfourDocs/notes/GridfourDataCompressionAlgorithms.html
>> Compression using Lagrange Multipliers for Optimal Predictors:
>> https://gwlucastrig.github.io/GridfourDocs/notes/CompressionUsingOptimalPredictors.html
>> GVRS Frequently Asked Questions (FAQ):
>> https://github.com/gwlucastrig/gridfour/wiki/A-GVRS-FAQ
>>
>> Thank you for your great data compression library.
>>
>> Gary
>>
>> And here are the code snippets:
>>
>> The Gridfour Virtual Raster Store (GVRS) is a wrapper format that
>> stores separate blocks of compressed data to provide random access
>> by application code.
>>
>> LZMA ------------------------------------------
>> // byte[] input is the input data
>> ByteArrayOutputStream baos = new ByteArrayOutputStream();
>> LZMAOutputStream lzmaOut =
>>     new LZMAOutputStream(baos, new LZMA2Options(), input.length);
>> lzmaOut.write(input, 0, input.length);
>> lzmaOut.finish();
>> lzmaOut.close();
>> return baos.toByteArray(); // return byte[] which is stored to file
>>
>> // reading the compressed data:
>> ByteArrayInputStream bais =
>>     new ByteArrayInputStream(compressedInput, 0, compressedInput.length);
>> LZMAInputStream lzmaIn = new LZMAInputStream(bais);
>> byte[] output = new byte[expectedOutputLength];
>> lzmaIn.read(output, 0, output.length);
>>
>> XZ ----------------------------------------------------
>> // byte[] input is the input data
>> ByteArrayOutputStream baos = new ByteArrayOutputStream();
>> XZOutputStream xzOut = new XZOutputStream(baos, new LZMA2Options());
>> xzOut.write(input, 0, input.length);
>> xzOut.finish();
>> xzOut.close();
>> return baos.toByteArray(); // return byte[] which is stored to file
>>
>> // reading the compressed data:
>> ByteArrayInputStream bais =
>>     new ByteArrayInputStream(compressedInput, 0, compressedInput.length);
>> XZInputStream xzIn = new XZInputStream(bais);
>> byte[] output = new byte[expectedOutputLength];
>> xzIn.read(output, 0, output.length);
>>
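One detail worth noting in the snippets above: InputStream.read(byte[],
int, int) is allowed to return fewer bytes than requested, so the single
read call at the end of each snippet may leave the output buffer only
partially filled. A small loop avoids that; the sketch below uses plain
java.io streams (ReadFully, readFully, and the 20,000-byte demo buffer
are illustrative names, not part of the thread's code), but the same
pattern applies to LZMAInputStream and XZInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadFully {
    // Read until the buffer is full or the stream ends; returns the
    // number of bytes actually read.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                break; // end of stream before the buffer was filled
            }
            off += n;
        }
        return off;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for one decompressed block of about 20 thousand values.
        byte[] data = new byte[20000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        byte[] output = new byte[data.length];
        int n = readFully(new ByteArrayInputStream(data), output);
        System.out.println(n == data.length ? "filled" : "short read");
    }
}
```

A short read is unlikely with a ByteArrayInputStream backing the data,
but decompressing streams routinely return less than the requested
length, so the loop is the safe form for the LZMA and XZ readers too.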