On 2024-03-05 Dennis Ens wrote:
> > The XZ for Java development is becoming active again but it may
> > still take a while until the next stable release is out. A few
> > other things are waiting in the queue from the past three years.
>
> Ah, I see. Thank you for the answer. Do you have a timeline of when
> the changes are expected?

I hope 1.10 could be done in a month or two but I don't want to make
any promises or serious predictions. Historically those haven't been
accurate at all.

> First, xz-java seems much slower. I tested compressing and
> decompressing a ~1.2 gigabyte file, and xz-java took 17m32.345s
> compared to xz's 7m7.615s to compress. Decompressing was 0m21.760s to
> 0m6.223s. Is there anything that can be done to improve the speed of
> the Java version, or is c just a much more efficient programming
> language?

Brett Okken's patches (originally from early 2021) should improve
compression speed. They are currently under review and are among the
things to get into the next stable release.

However, Java in general is slower. Some compressors have a Java API
but the performance-critical code is native code. For example,
java.util.zip calls into native code from zlib. XZ for Java doesn't
use any native code (for now at least).

XZ for Java still lacks threading. Implementing it is among the most
important tasks in XZ for Java. Threading helps with big files like
your test file but makes the compressed file a little bigger.

From your numbers I'm not certain if you used xz in threaded mode or
not. The time difference looks unusually high for single-threaded mode
for both compression and decompression. The difference for a big input
file in threaded mode looks small though (unless it had lots of
trivially-compressible sections). In single-threaded mode, I would
expect compressing with xz to take around 30-40 % less time than XZ
for Java but your numbers show a 60 % time reduction.

XZ Utils 5.6.0 added x86-64 assembly (GCC & Clang only) which reduces
per-thread decompression time by 20-40 % depending on the file and the
computer. So that increases the difference between XZ Utils and XZ for
Java too: decompression time can be roughly 50 % less with XZ Utils
5.6.0 in single-threaded mode on x86-64 compared to XZ for Java.

XZ Utils 5.6.0 also enables threaded mode by default.

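For reference, the time-reduction percentages can be worked out from
the quoted timings roughly as follows. This is only a small sketch; the
class and method names are made up for illustration and are not part of
XZ for Java or XZ Utils.

```java
import java.util.Locale;

public class TimeReduction {
    // Convert a string like "17m32.345s" (the format printed by the
    // shell's time command) into seconds.
    static double parseSeconds(String t) {
        int m = t.indexOf('m');
        double minutes = Double.parseDouble(t.substring(0, m));
        double seconds = Double.parseDouble(t.substring(m + 1, t.length() - 1));
        return minutes * 60.0 + seconds;
    }

    // Percentage by which the faster time is shorter than the slower one.
    static double reductionPercent(double slow, double fast) {
        return (1.0 - fast / slow) * 100.0;
    }

    public static void main(String[] args) {
        // Timings quoted in the message above: xz-java first, xz second.
        double comp = reductionPercent(parseSeconds("17m32.345s"),
                                       parseSeconds("7m7.615s"));
        double decomp = reductionPercent(parseSeconds("0m21.760s"),
                                         parseSeconds("0m6.223s"));
        System.out.println(String.format(Locale.ROOT,
                "compression:   %.1f %% less time with xz", comp));
        System.out.println(String.format(Locale.ROOT,
                "decompression: %.1f %% less time with xz", decomp));
    }
}
```

With the quoted numbers this gives about 59 % for compression and
about 71 % for decompression, both above what I would expect in
single-threaded mode.
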
> Also, I noticed that the results of compressing the files were
> different sizes. They both worked, so I don't know if it's an issue,
> but it does seem strange. The xz-java one was slightly smaller than
> the xz one.

The encoder implementations have some minor differences which affect
both output and speed. Different releases can in theory have different
output. XZ Utils output might change in future versions too.

--
Lasse Collin