On 2024-03-05 Dennis Ens wrote:
> > The XZ for Java development is becoming active again but it may
> > still take a while until the next stable release is out. A few
> > other things are waiting in the queue from the past three years.  
> 
> Ah, I see. Thank you for the answer. Do you have a timeline of when
> the changes are expected?

I hope 1.10 could be done in a month or two but I don't want to make any
promises or serious predictions. Historically those haven't been
accurate at all.

> First, xz-java seems much slower. I tested compressing and
> decompressing a ~1.2 gigabyte file, and xz-java took 17m32.345s
> compared to xz's 7m7.615s to compress. Decompressing was 0m21.760s to
> 0m6.223s. Is there anything that can be done to improve the speed of
> the Java version, or is c just a much more efficient programming
> language?

Brett Okken's patches (originally from early 2021) should improve
compression speed. They are currently under review and are among the
things to get into the next stable release.

However, Java in general is slower. Some compressors have a Java API but
the performance-critical code is native code. For example, java.util.zip
calls into native code from zlib. XZ for Java doesn't use any native
code (for now at least).
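
To illustrate the java.util.zip case, here is a minimal round-trip
sketch that uses only the standard library. The class and method names
(ZlibRoundTrip, compress, decompress) are made up for this example. The
Java side only moves buffers around; the actual DEFLATE work happens
inside Deflater and Inflater.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical example class, not part of XZ for Java or XZ Utils.
public final class ZlibRoundTrip {
    /** Compresses input into a zlib stream using java.util.zip.Deflater. */
    public static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release the resources held by the Deflater
        }
    }

    /** Decompresses a complete zlib stream using java.util.zip.Inflater. */
    public static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported input");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt zlib stream", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("XZ for Java ");
        }
        byte[] input = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(input);
        byte[] unpacked = decompress(packed);
        System.out.println(input.length + " -> " + packed.length + " bytes, round trip ok: "
                + Arrays.equals(input, unpacked));
    }
}
```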

XZ for Java still lacks threading. Implementing it is among the most
important tasks in XZ for Java. It helps with big files like your test
file but makes the compressed file a little bigger. From your numbers
I'm not certain if you used xz in threaded mode or not. The time
difference looks unusually high for single-threaded mode for both
compression and decompression. For threaded mode, though, the
difference looks small for such a big input file (unless it had lots
of trivially-compressible sections).

In single-threaded mode, I would expect compressing with xz to take
around 30-40 % less time than XZ for Java but your numbers show about
a 60 % time reduction.
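
The percentages can be checked against the timings quoted above. The
following is only a small arithmetic sketch; the class and method names
(TimeReduction, seconds, percentLess) are made up for this example.
With the quoted numbers it gives roughly 59 % less time for compression
and roughly 71 % less time for decompression.

```java
// Hypothetical helper for checking the quoted timings; not part of any XZ project.
public final class TimeReduction {
    /** Converts a "XmY.Zs" timing into seconds. */
    public static double seconds(int minutes, double secs) {
        return minutes * 60 + secs;
    }

    /** Returns by how many percent the faster time is shorter than the slower one. */
    public static double percentLess(double slowSeconds, double fastSeconds) {
        return (1.0 - fastSeconds / slowSeconds) * 100.0;
    }

    public static void main(String[] args) {
        // Timings as quoted in the original message.
        double javaCompress = seconds(17, 32.345);
        double xzCompress = seconds(7, 7.615);
        double javaDecompress = seconds(0, 21.760);
        double xzDecompress = seconds(0, 6.223);

        System.out.println("compression, percent less time with xz: "
                + percentLess(javaCompress, xzCompress));
        System.out.println("decompression, percent less time with xz: "
                + percentLess(javaDecompress, xzDecompress));
    }
}
```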

XZ Utils 5.6.0 added x86-64 assembly (GCC & Clang only) which reduces
per-thread decompression time by 20-40 % depending on the file and the
computer. So that increases the difference between XZ Utils and XZ for
Java too: decompression time can be roughly 50 % less with XZ Utils
5.6.0 in single-threaded mode on x86-64 compared to XZ for Java.

XZ Utils 5.6.0 also enables threaded mode by default.

> Also, I noticed that the results of compressing the files were
> different sizes. They both worked, so I don't know if it's an issue,
> but it does seem strange. The xz-java one was slightly smaller than
> the xz one.

The encoder implementations have some minor differences which affect
both output and speed. Different releases can in theory have different
output. XZ Utils output might change in future versions too.

-- 
Lasse Collin
