I have added a comment to the PR with updated benchmark results:
https://github.com/tukaani-project/xz-java/pull/13#issuecomment-1977705691

On Fri, Mar 1, 2024 at 6:23 AM Brett Okken <brett.okken...@gmail.com> wrote:
>
> I found and resolved the difference:
> https://github.com/tukaani-project/xz-java/pull/13/commits/1e4550e06d8cbec4079b2b2fba4a2245307cc4e6
>
> It was indeed in BT4 and had to do with searching for the
> niceLenLimit. I will update the benchmarks over the weekend, as they
> take some time to run.
>
> Brett
>
> On Thu, Feb 29, 2024 at 8:47 PM Brett Okken <brett.okken...@gmail.com> wrote:
> >
> > > Thanks! Ideally there would be one commit to add the minimal portable
> > > version, then separate commits for each optimized variant.
> >
> > Would you like me to remove the Unsafe based impl from
> > https://github.com/tukaani-project/xz-java/pull/13?
> >
> > > So far I have given it only a quick try. array_comp_incremental seems
> > > faster than xz-java.git master. Compression time was reduced by about
> > > 10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my
> > > computer is old so I don't doubt your higher numbers.
> >
> > How are you testing? I am using jmh, so it has a warm up period before
> > actually measuring, giving the jvm plenty of opportunity to perform
> > optimizations. If you are doing single shot executions to compress a
> > file, that could provide pretty different results.
> >
> > > With array_comparison_performance the improvement seems to be less,
> > > maybe 5 %. I didn't test much yet but it still seems clear that
> > > array_comp_incremental is faster on my computer.
> >
> > Going back to the previous question, this could be due to fact that I
> > collapsed some class hierarchy in the _incremental pr. This could take
> > the optimizer a bit longer to figure out.
> >
> > > However, your code produces different output compared to xz-java.git
> > > master so the speed comparison isn't entirely fair. I assume there was
> > > no intent to affect the encoder output with these changes so I wonder
> > > what is going on. Both of your branches produce the same output so it's
> > > something common between them that makes the difference.
> >
> > This was definitely not the intent, and I had not noticed this previously.
> >
> > With the 3 files I test with, none have any difference with preset of
> > 3. The smallest file (ihe_ovly_pr.cm) also has no difference at preset
> > 6.
> >
> > With the ~25MB image1.dcm (mostly a greyscale bmp), the PR versions
> > produce more compressed content at preset 6.
> > 1.9 = 4,041,476
> > PR = 4,004,156
> >
> > There is a smaller difference with the ~50MB xml file, but strangely,
> > the PR version is slightly bigger.
> > 1.9 = 1,589,512
> > PR = 1,589,564
> >
> > Given that I am only seeing differences with preset of 6, I am
> > guessing the difference must be in BT4.
> > The result still seems to be valid (at least the java XZInputStream
> > reads it back correctly).
> > There is clearly a subtle "defect" somewhere, but I cannot tell if it
> > is in the current trunk or the PR. My best guess is that there is an
> > off by 1 error in one or the other.
> >
> > Brett
> >
> > On Thu, Feb 29, 2024 at 11:35 AM Lasse Collin <lasse.col...@tukaani.org> 
> > wrote:
> > >
> > > On 2024-02-25 Brett Okken wrote:
> > > > I created https://github.com/tukaani-project/xz-java/pull/13 with the
> > > > bare bones changes to utilize a utility for array comparisons and an
> > > > Unsafe implementation.
> > > > When/if that is reviewed and approved, we can move on through the
> > > > other implementation options.
> > >
> > > Thanks! Ideally there would be one commit to add the minimal portable
> > > version, then separate commits for each optimized variant.
> > >
> > > So far I have given it only a quick try. array_comp_incremental seems
> > > faster than xz-java.git master. Compression time was reduced by about
> > > 10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my
> > > computer is old so I don't doubt your higher numbers.
> > >
> > > With array_comparison_performance the improvement seems to be less,
> > > maybe 5 %. I didn't test much yet but it still seems clear that
> > > array_comp_incremental is faster on my computer.
> > >
> > > However, your code produces different output compared to xz-java.git
> > > master so the speed comparison isn't entirely fair. I assume there was
> > > no intent to affect the encoder output with these changes so I wonder
> > > what is going on. Both of your branches produce the same output so it's
> > > something common between them that makes the difference.
> > >
> > > I plan to get back to this next week.
> > >
> > > > > One thing I wonder is if JNI could help.
> > > >
> > > > It would most likely make things faster, but also more complicated. I
> > > > like the java version for the simplicity. I am not necessarily looking
> > > > to compete with native performance, but would like to get improvements
> > > > where they are reasonably available. Here there is some complexity in
> > > > supporting multiple implementations for different versions and/or
> > > > architectures, but that complexity does not intrude into the core of
> > > > the xz code.
> > >
> > > I think your thoughts are similar to mine here. Java version is clearly
> > > slower but it's nicer code to read too. A separate class for buffer
> > > comparisons indeed doesn't hurt the readability of the core code.
> > >
> > > On the other hand, if Java version happened to be used a lot then JNI
> > > could save both time (up to 50 %) and even electricity. java.util.zip
> > > uses native zlib for the performance-critical code.
> > >
> > > In the long run both faster Java code and JNI might be worth doing.
> > > There's more than enough pure Java stuff to do for now so any JNI
> > > thoughts have to wait.
> > >
> > > --
> > > Lasse Collin
> > >

Reply via email to