Hi Brett,

I'm not sure that this is authoritative, but the Java API documentation
says that it "aims" to provide "Full support for the .xz file format
specification version 1.0.4".
I am using the latest release of the Java library, version 1.9.

Gary

On Sat, Jul 9, 2022 at 7:58 PM Brett Okken <brett.okken...@gmail.com> wrote:
>
> What version of xz are you using?
>
> The differences between xz and lzma are a bit more involved. One such
> example is that xz is a framed format which includes checksums on each
> "frame". I would not expect checksum verification to account for all of
> that difference, but it can be disabled to confirm.
>
> On Sat, Jul 9, 2022 at 6:31 AM Gary Lucas <gwluca...@gmail.com> wrote:
>>
>> Hi,
>>
>> Would anyone be able to confirm that I am using the Java library
>> xz-java-1.9.zip correctly? If not, could you suggest a better way to
>> use it? Code snippets are included below.
>>
>> I am using the library to compress a public-domain data product called
>> ETOPO1. ETOPO1 provides a global-scale grid of 233 million elevation
>> and ocean depth samples as integer meters. My implementation
>> compresses the data in separate blocks of about 20 thousand values
>> each. Previously, I used Huffman coding and Deflate to reduce the size
>> of the data to about 4.39 bits per value. With your library, LZMA
>> reduces that to 4.14 bits per value and XZ to 4.16 bits per value, so
>> both techniques represent a substantial improvement in compression
>> over the Huffman/Deflate approach. That improvement comes at a
>> reasonable cost: decompression using LZMA and XZ is slower than
>> Huffman/Deflate. The original implementation requires an average of
>> 4.8 seconds to decompress the full set of 233 million points. The LZMA
>> version requires 15.2 seconds, and the XZ version requires 18.9
>> seconds.
>>
>> My understanding is that XZ should perform better than LZMA. Since
>> that is not the case, could there be something suboptimal with the way
>> my code uses the API?
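Brett's suggestion to rule out checksum overhead can be tried on the
encoding side. A minimal sketch, assuming the xz-java (org.tukaani.xz)
API, where XZOutputStream accepts an integrity-check type and
XZ.CHECK_NONE disables checks (the two-argument constructor defaults to
CRC64):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.tukaani.xz.LZMA2Options;
import org.tukaani.xz.XZ;
import org.tukaani.xz.XZOutputStream;

public class XzNoCheck {
    // Compress one block of raster data with integrity checks disabled.
    // A stream written with XZ.CHECK_NONE carries no per-block checksum,
    // so decoding it also skips checksum computation.
    static byte[] compressNoCheck(byte[] input) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (XZOutputStream xzOut =
                new XZOutputStream(baos, new LZMA2Options(), XZ.CHECK_NONE)) {
            xzOut.write(input, 0, input.length);
            xzOut.finish();
        }
        return baos.toByteArray();
    }
}
```

Timing decompression of streams written this way against the default
CRC64 variant would show how much of the LZMA-vs-XZ gap the per-frame
checks explain.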
>>
>> If you would like more detail about the implementation, please visit:
>>
>> Compression Algorithms for Raster Data:
>> https://gwlucastrig.github.io/GridfourDocs/notes/GridfourDataCompressionAlgorithms.html
>> Compression using Lagrange Multipliers for Optimal Predictors:
>> https://gwlucastrig.github.io/GridfourDocs/notes/CompressionUsingOptimalPredictors.html
>> GVRS Frequently Asked Questions (FAQ):
>> https://github.com/gwlucastrig/gridfour/wiki/A-GVRS-FAQ
>>
>> Thank you for your great data compression library.
>>
>> Gary
>>
>> And here are the code snippets:
>>
>> The Gridfour Virtual Raster Store (GVRS) is a wrapper format that
>> stores separate blocks of compressed data to provide random access
>> by application code.
>>
>> LZMA ------------------------------------------
>> // byte[] input is the input data
>> ByteArrayOutputStream baos = new ByteArrayOutputStream();
>> LZMAOutputStream lzmaOut =
>>     new LZMAOutputStream(baos, new LZMA2Options(), input.length);
>> lzmaOut.write(input, 0, input.length);
>> lzmaOut.finish();
>> lzmaOut.close();
>> return baos.toByteArray(); // return byte[] which is stored to file
>>
>> // reading the compressed data:
>> ByteArrayInputStream bais =
>>     new ByteArrayInputStream(compressedInput, 0, compressedInput.length);
>> LZMAInputStream lzmaIn = new LZMAInputStream(bais);
>> byte[] output = new byte[expectedOutputLength];
>> lzmaIn.read(output, 0, output.length);
>>
>> XZ ----------------------------------------------------
>> // byte[] input is the input data
>> ByteArrayOutputStream baos = new ByteArrayOutputStream();
>> XZOutputStream xzOut = new XZOutputStream(baos, new LZMA2Options());
>> xzOut.write(input, 0, input.length);
>> xzOut.finish();
>> xzOut.close();
>> return baos.toByteArray(); // return byte[] which is stored to file
>>
>> // reading the compressed data:
>> ByteArrayInputStream bais =
>>     new ByteArrayInputStream(compressedInput, 0, compressedInput.length);
>> XZInputStream xzIn = new XZInputStream(bais);
>> byte[] output = new byte[expectedOutputLength];
>> xzIn.read(output, 0, output.length);
>>
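One detail worth noting in the snippets above: InputStream.read(byte[],
int, int) is allowed to return fewer bytes than requested, so the single
read call at the end of each snippet may leave the output buffer only
partially filled. A small loop avoids that; the sketch below uses plain
java.io streams (ReadFully, readFully, and the 20,000-byte demo buffer
are illustrative names, not part of the thread's code), but the same
pattern applies to LZMAInputStream and XZInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadFully {
    // Read until the buffer is full or the stream ends; returns the
    // number of bytes actually read.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                break; // end of stream before the buffer was filled
            }
            off += n;
        }
        return off;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for one decompressed block of about 20 thousand values.
        byte[] data = new byte[20000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        byte[] output = new byte[data.length];
        int n = readFully(new ByteArrayInputStream(data), output);
        System.out.println(n == data.length ? "filled" : "short read");
    }
}
```

A short read is unlikely with a ByteArrayInputStream backing the data,
but decompressing streams routinely return less than the requested
length, so the loop is the safe form for the LZMA and XZ readers too.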