Looks great to me.
I apologize for not getting to the arm stuff myself.
Thanks,
Brett
On Thu, Jul 11, 2024 at 11:47 AM Lasse Collin
wrote:
> On 2024-03-26 Lasse Collin wrote:
> > I suppose the ARM64 speed is still to be determined by you or someone
> > else.
>
> I was given results from one ARM
On 2024-03-26 Lasse Collin wrote:
> I suppose the ARM64 speed is still to be determined by you or someone
> else.
I was given results from one ARM64 system with Java 23:
- Java 8 code: 5.6 s
- Basic: 3.8 s
- UnalignedLongLE: 2.4 s
The test command:
time head -c1 /
On 2024-03-20 Brett Okken wrote:
> The jdk8 changes show nice improvements over head. My assumption is
> that having less math going on in the offsets of the while loop allowed
> the JVM to optimize better.
Sounds good, thanks! :-)
> I am surprised with the binary math behind your handling of long
> I still need to check a few of your edits if some of them should be
> included. :-)
I think the changes to LZMAEncoderNormal as part of this PR to avoid the
negative length comparison would be good to carry forward. It basically
compares to the MATCH_LEN_MIN instead of 0. This can avoid some (sh
On 2024-03-12 Brett Okken wrote:
> I am still working on digesting your branch.
I still need to check a few of your edits if some of them should be
included. :-)
> The difference in method signature is subtle, but I think a key part
> of the improvements you are getting. Could you add javadoc to
I am still working on digesting your branch. The difference in method
signature is subtle, but I think a key part of the improvements you are
getting. Could you add javadoc to more clearly describe how the args are to
be interpreted and what the return value means?
I am playing with manually unrol
On 2024-03-09 Brett Okken wrote:
> When I tested graviton2 (arm64) previously, Arrays.mismatch was
> better than comparing longs using a VarHandle.
Sounds promising. :-) However, your array_comparison_performance
handles the last 1-7 bytes byte-by-byte. My array_compare branch
reserves extra 7 byt
When I tested graviton2 (arm64) previously, Arrays.mismatch was better than
comparing longs using a VarHandle.
The benefits are definitely with content that compresses more - because
there are more long matches.
I do like Unsafe as an option for jdk 8 users on x86 or arm64.
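For readers following the comparison above, here is a minimal sketch of the two Java 9+ approaches being discussed: Arrays.mismatch and reading longs through a VarHandle. It is not the ArrayUtil class or the array_compare branch from this thread. The class name MatchLenSketch, the method names, and the (buf, a, b, limit) parameters are assumptions made only for illustration, and this version handles the last 1-7 bytes byte-by-byte.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper for illustration; not code from xz-java.
final class MatchLenSketch {
    // Reads 8 bytes of a byte[] as one little-endian long (Java 9+).
    private static final VarHandle LONG_LE =
            MethodHandles.byteArrayViewVarHandle(long[].class,
                                                 ByteOrder.LITTLE_ENDIAN);

    // Counts how many bytes of buf[a..] and buf[b..] are equal, up to limit.
    // The caller must ensure a + limit and b + limit are within the array.
    static int viaMismatch(byte[] buf, int a, int b, int limit) {
        int m = Arrays.mismatch(buf, a, a + limit, buf, b, b + limit);
        // Arrays.mismatch returns -1 when the two ranges are identical.
        return m < 0 ? limit : m;
    }

    // Same result, comparing 8 bytes at a time.
    static int viaLongs(byte[] buf, int a, int b, int limit) {
        int len = 0;
        while (len + 8 <= limit) {
            long x = (long) LONG_LE.get(buf, a + len)
                     ^ (long) LONG_LE.get(buf, b + len);
            if (x != 0) {
                // In little-endian order the lowest set bit belongs to the
                // first differing byte; divide the bit index by 8.
                return len + (Long.numberOfTrailingZeros(x) >>> 3);
            }
            len += 8;
        }
        // Remaining 0-7 bytes are compared one at a time.
        while (len < limit && buf[a + len] == buf[b + len])
            ++len;
        return len;
    }
}
```

Both methods should return the same match length for the same input; which one is faster depends on the JVM and the CPU, as the results in this thread show.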
On Sat, Mar 9, 2024 a
I created a branch array_compare. It has a simple version for Java <= 8
which seems very slightly faster than the current code in master, at
least when tested with OpenJDK 21. For Java >= 9 there is
Arrays.mismatch for portability and VarHandle for x86-64 and ARM64.
These are clearly faster than th
On 2024-02-29 Brett Okken wrote:
> > Thanks! Ideally there would be one commit to add the minimal
> > portable version, then separate commits for each optimized variant.
>
> Would you like me to remove the Unsafe based impl from
> https://github.com/tukaani-project/xz-java/pull/13?
There are new
I have added a comment to the PR with updated benchmark results:
https://github.com/tukaani-project/xz-java/pull/13#issuecomment-1977705691
On Fri, Mar 1, 2024 at 6:23 AM Brett Okken wrote:
>
> I found and resolved the difference:
> https://github.com/tukaani-project/xz-java/pull/13/commits/1e455
I found and resolved the difference:
https://github.com/tukaani-project/xz-java/pull/13/commits/1e4550e06d8cbec4079b2b2fba4a2245307cc4e6
It was indeed in BT4 and had to do with searching for the
niceLenLimit. I will update the benchmarks over the weekend, as they
take some time to run.
Brett
On
> Thanks! Ideally there would be one commit to add the minimal portable
> version, then separate commits for each optimized variant.
Would you like me to remove the Unsafe based impl from
https://github.com/tukaani-project/xz-java/pull/13?
> So far I have given it only a quick try. array_comp_inc
On 2024-02-25 Brett Okken wrote:
> I created https://github.com/tukaani-project/xz-java/pull/13 with the
> bare bones changes to utilize a utility for array comparisons and an
> Unsafe implementation.
> When/if that is reviewed and approved, we can move on through the
> other implementation options
> Thanks! It could be good to split into smaller commits to make reviewing
> easier.
I created https://github.com/tukaani-project/xz-java/pull/13 with the
bare bones changes to utilize a utility for array comparisons and an
Unsafe implementation.
When/if that is reviewed and approved, we can move o
On 2024-02-19 Brett Okken wrote:
> I have created a PR to the GitHub project.
>
> https://github.com/tukaani-project/xz-java/pull/12
Thanks! It could be good to split into smaller commits to make reviewing
easier.
> It is not clear to me if that is actually seeing active dev on the
> Java project
I have created a PR to the GitHub project.
https://github.com/tukaani-project/xz-java/pull/12
It is not clear to me if that is actually seeing active dev on the Java
project yet.
Thanks,
Brett
On Sat, Feb 12, 2022 at 11:45 AM Brett Okken
wrote:
> Can this be taken up again?
>
> On Wed, Mar 24
> Can this be taken up again?
+1
Any updates on this?
--
Dennis Ens
Can this be taken up again?
On Wed, Mar 24, 2021 at 6:20 AM Brett Okken
wrote:
> I grabbed an older version in the last mail. This is the updated
> version for aarch64.
>
I grabbed an older version in the last mail. This is the updated
version for aarch64.
ArrayUtil.java
Description: Binary data
I was able to test on AWS graviton2 instances (aarch64), but only with
jdk 15. The results show that the vectorized approach appears the best
option, though long comparisons are also an improvement over baseline.
Based on this, I made a small change to ArrayUtil to, by default, use
unsafe long co
I have attached updated patches and ArrayUtil.java.
HC4 needed changes/optimizations in both locations.
I also found a better way to handle BT4 occasionally sending -1 as the length.
diff --git a/src/org/tukaani/xz/lz/BT4.java b/src/org/tukaani/xz/lz/BT4.java
index 6c46feb..7d78aef 100644
--- a/src
On Tue, Feb 16, 2021 at 12:48 PM Lasse Collin wrote:
>
> I quickly tried these with "XZEncDemo 2". I used the preset 2 because
> that uses LZMAEncoderFast instead of LZMAEncoderNormal where the
> negative lengths result in a crash.
I updated the mismatch method to check for negative lengths upfro
I quickly tried these with "XZEncDemo 2". I used the preset 2 because
that uses LZMAEncoderFast instead of LZMAEncoderNormal where the
negative lengths result in a crash. The performance was about the same
or worse than the original code. I don't know why. I didn't spend much
time on this and it's
Replacing while loops with switch statements for the "extra bytes"
also yields a small improvement. Pulling that common logic out into a
utility method negates most of the benefit.
Here is the updated ArrayUtil class.
package org.tukaani.xz.common;
import static java.lang.invoke.MethodType.meth
Based on some playing around with unrolling loops as part of the crc64
implementation, I tried unrolling the "legacy" implementation and
found it provided some nice improvements. The improvements were most
pronounced on 32 bit jdk 11:
32 jdk 11 - LEGACY
Benchmark (f
package org.tukaani.xz.common;
import static java.lang.invoke.MethodType.methodType;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.nio.ByteOrder;
i
diff --git a/src/org/tukaani/xz/lz/BT4.java b/src/org/tukaani/xz/lz/BT4.java
index 6c46feb..c96c766 100644
--- a/src/org/tukaani/xz/lz/BT4.java
+++ b/src/org/tukaani/xz/lz/BT4.java
@@ -11,6 +11,7 @@
package org.tukaani.xz.lz;
import org.tukaani.xz.ArrayCache;
+import org.tukaani.xz.common.ArrayU
> Have you tested with 32-bit Java too? It's quite possible that it's
> better to use ints than longs on 32-bit system. If so, that should be
> detected at runtime too, I guess.
I have now run benchmarks using the 32-bit JRE on a 64-bit Windows system.
That actually introduces additional interesting i
On 2021-01-11 Brett Okken wrote:
> I threw together a quick jmh test, and there is no value in the
> changes to Hash234.
OK, let's forget that then.
On 2021-01-16 Brett Okken wrote:
> I have found a way to use VarHandle byte array access at runtime in
> code which is compile time compatible with
Lasse,
I have found a way to use VarHandle byte array access at runtime in
code which is compile time compatible with jdk 7. So here is an
updated ArrayUtil class which will use a VarHandle to read long values
in jdk 9+. If that is not available, it will attempt to use
sun.misc.Unsafe. If that can
I have continued to refine the changes around the array comparisons
and think I am pretty well done there.
I did a small benchmark measuring the time to compress 3 different
files using new XZOutputStream(cos, new LZMA2Options()). Where cos was
an OutputStream which simply calculated the crc32 of
It turns out that reading the longs in native byte order provides
noticeable improvement.
I did find that there was cost overhead of ~1 ns/op by using an
interface/implementation to flex behavior if Unsafe could not be
loaded. That cost goes away by using java.lang.invoke.MethodHandle.
So here is a
I threw together a quick jmh test, and there is no value in the
changes to Hash234.
For the array mismatch, the results are kind of interesting. My
observation, stepping through some compression uses, is that the
comparison length is typically 100-200 bytes in length, but the actual
match length i
On 2021-01-09 Brett Okken wrote:
> This would seem to be a potential candidate for a multi-release
> jar[1], if you can figure out a reasonable way to get a build system
> to generate one.
I suppose it can be done. The build system uses Apache Ant. From some
sources I've understood that there are
Here is a class which is compatible with jdk 7. It will use a
MethodHandle to invoke Arrays.mismatch if that is found at runtime. If
that is not found, it will see if it can find Unsafe to read 4 bytes
at a time and compare as ints. If that cannot be found/loaded/invoked,
it falls back to iterating
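The runtime-detection idea described in this message can be sketched roughly as follows. This is not the class attached to the mail: the name RuntimeMismatchSketch and the matchLength signature are assumptions for illustration, and the intermediate Unsafe step is left out, so the sketch goes straight from Arrays.mismatch to a plain byte loop. The source compiles against Java 7 because Arrays.mismatch is only looked up by name at run time.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;

// Hypothetical class for illustration only; not code from xz-java.
final class RuntimeMismatchSketch {
    // Resolved once; null when running on a JRE older than Java 9.
    private static final MethodHandle MISMATCH = findMismatch();

    private static MethodHandle findMismatch() {
        // int Arrays.mismatch(byte[], int, int, byte[], int, int)
        MethodType type = MethodType.methodType(int.class,
                byte[].class, int.class, int.class,
                byte[].class, int.class, int.class);
        try {
            return MethodHandles.publicLookup()
                    .findStatic(Arrays.class, "mismatch", type);
        } catch (ReflectiveOperationException e) {
            return null; // Method not available: use the fallback loop.
        }
    }

    // Counts how many bytes of buf[a..] and buf[b..] are equal, up to limit.
    static int matchLength(byte[] buf, int a, int b, int limit) {
        if (MISMATCH != null) {
            try {
                int m = (int) MISMATCH.invokeExact(
                        buf, a, a + limit, buf, b, b + limit);
                return m < 0 ? limit : m; // -1 means no mismatch was found.
            } catch (RuntimeException | Error e) {
                throw e;
            } catch (Throwable t) {
                throw new IllegalStateException(t);
            }
        }
        // Fallback: compare one byte at a time.
        int len = 0;
        while (len < limit && buf[a + len] == buf[b + len])
            ++len;
        return len;
    }
}
```

Keeping the handle in a static final field is a common way to let the JIT treat it as a constant, which is one reason a MethodHandle can be cheaper here than dispatching through an interface.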
This would seem to be a potential candidate for a multi-release
jar[1], if you can figure out a reasonable way to get a build system
to generate one.
The 4 uses I found of comparing byte[] could be refactored to call a
new utility class to do the comparison. The "regular" implementation
could be j
On 2021-01-08 Brett Okken wrote:
> Are there any plans to update xz-java to take advantage of newer
> features in jdk 9+?
There aren't many plans at all. Adding module-info.java is likely to
happen in the next release, whenever that will be.
Apache Commons Compress 1.20 requires Java 7. It depend