I found and resolved the difference:
https://github.com/tukaani-project/xz-java/pull/13/commits/1e4550e06d8cbec4079b2b2fba4a2245307cc4e6

It was indeed in BT4 and had to do with searching for the
niceLenLimit. I will update the benchmarks over the weekend, as they
take some time to run.

Brett

On Thu, Feb 29, 2024 at 8:47 PM Brett Okken <brett.okken...@gmail.com> wrote:
>
> > Thanks! Ideally there would be one commit to add the minimal portable
> > version, then separate commits for each optimized variant.
>
> Would you like me to remove the Unsafe based impl from
> https://github.com/tukaani-project/xz-java/pull/13?
>
> > So far I have given it only a quick try. array_comp_incremental seems
> > faster than xz-java.git master. Compression time was reduced by about
> > 10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my
> > computer is old so I don't doubt your higher numbers.
>
> How are you testing? I am using jmh, so it has a warm up period before
> actually measuring, giving the jvm plenty of opportunity to perform
> optimizations. If you are doing single shot executions to compress a
> file, that could provide pretty different results.
>
> > With array_comparison_performance the improvement seems to be less,
> > maybe 5 %. I didn't test much yet but it still seems clear that
> > array_comp_incremental is faster on my computer.
>
> Going back to the previous question, this could be due to fact that I
> collapsed some class hierarchy in the _incremental pr. This could take
> the optimizer a bit longer to figure out.
>
> > However, your code produces different output compared to xz-java.git
> > master so the speed comparison isn't entirely fair. I assume there was
> > no intent to affect the encoder output with these changes so I wonder
> > what is going on. Both of your branches produce the same output so it's
> > something common between them that makes the difference.
>
> This was definitely not the intent, and I had not noticed this previously.
>
> With the 3 files I test with, none have any difference with preset of
> 3. The smallest file (ihe_ovly_pr.cm) also has no difference at preset
> 6.
>
> With the ~25MB image1.dcm (mostly a greyscale bmp), the PR versions
> produce more compressed content at preset 6.
> 1.9 = 4,041,476
> PR = 4,004,156
>
> There is a smaller difference with the ~50MB xml file, but strangely,
> the PR version is slightly bigger.
> 1.9 = 1,589,512
> PR = 1,589,564
>
> Given that I am only seeing differences with preset of 6, I am
> guessing the difference must be in BT4.
> The result still seems to be valid (at least the java XZInputStream
> reads it back correctly).
> There is clearly a subtle "defect" somewhere, but I cannot tell if it
> is in the current trunk or the PR. My best guess is that there is an
> off by 1 error in one or the other.
>
> Brett
>
> On Thu, Feb 29, 2024 at 11:35 AM Lasse Collin <lasse.col...@tukaani.org> 
> wrote:
> >
> > On 2024-02-25 Brett Okken wrote:
> > > I created https://github.com/tukaani-project/xz-java/pull/13 with the
> > > bare bones changes to utilize a utility for array comparisons and an
> > > Unsafe implementation.
> > > When/if that is reviewed and approved, we can move on through the
> > > other implementation options.
> >
> > Thanks! Ideally there would be one commit to add the minimal portable
> > version, then separate commits for each optimized variant.
> >
> > So far I have given it only a quick try. array_comp_incremental seems
> > faster than xz-java.git master. Compression time was reduced by about
> > 10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my
> > computer is old so I don't doubt your higher numbers.
> >
> > With array_comparison_performance the improvement seems to be less,
> > maybe 5 %. I didn't test much yet but it still seems clear that
> > array_comp_incremental is faster on my computer.
> >
> > However, your code produces different output compared to xz-java.git
> > master so the speed comparison isn't entirely fair. I assume there was
> > no intent to affect the encoder output with these changes so I wonder
> > what is going on. Both of your branches produce the same output so it's
> > something common between them that makes the difference.
> >
> > I plan to get back to this next week.
> >
> > > > One thing I wonder is if JNI could help.
> > >
> > > It would most likely make things faster, but also more complicated. I
> > > like the java version for the simplicity. I am not necessarily looking
> > > to compete with native performance, but would like to get improvements
> > > where they are reasonably available. Here there is some complexity in
> > > supporting multiple implementations for different versions and/or
> > > architectures, but that complexity does not intrude into the core of
> > > the xz code.
> >
> > I think your thoughts are similar to mine here. Java version is clearly
> > slower but it's nicer code to read too. A separate class for buffer
> > comparisons indeed doesn't hurt the readability of the core code.
> >
> > On the other hand, if Java version happened to be used a lot then JNI
> > could save both time (up to 50 %) and even electricity. java.util.zip
> > uses native zlib for the performance-critical code.
> >
> > In the long run both faster Java code and JNI might be worth doing.
> > There's more than enough pure Java stuff to do for now so any JNI
> > thoughts have to wait.
> >
> > --
> > Lasse Collin
> >

Reply via email to