I created a branch array_compare. It has a simple version for Java <= 8 which seems very slightly faster than the current code in master, at least when tested with OpenJDK 21. For Java >= 9 there is Arrays.mismatch for portability and VarHandle for x86-64 and ARM64. These are clearly faster than the basic version.
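
For readers who have not looked at the branch, the three approaches can be sketched roughly as follows. This is only an illustration, not the code in array_compare: the class name MatchLengthSketch, the method names, and the (buf, a, b, limit) signature are hypothetical, and the caller is assumed to guarantee that a + limit and b + limit do not exceed buf.length. Each method returns how many bytes match starting at offsets a and b, up to limit.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical illustration; names and signatures are not from the branch.
final class MatchLengthSketch {
    // Reads 8 bytes of a byte[] as one little-endian long at any byte offset.
    private static final VarHandle LONG_LE =
            MethodHandles.byteArrayViewVarHandle(long[].class,
                                                 ByteOrder.LITTLE_ENDIAN);

    // Basic version: byte-by-byte loop, works on any Java version.
    static int basic(byte[] buf, int a, int b, int limit) {
        int len = 0;
        while (len < limit && buf[a + len] == buf[b + len])
            ++len;
        return len;
    }

    // Portable Java >= 9 version: Arrays.mismatch returns the relative
    // index of the first difference, or -1 if the ranges are equal.
    static int withMismatch(byte[] buf, int a, int b, int limit) {
        int i = Arrays.mismatch(buf, a, a + limit, buf, b, b + limit);
        return i < 0 ? limit : i;
    }

    // Java >= 9 VarHandle version: compare 8 bytes at a time. Because the
    // view is little-endian, the lowest set bit of x ^ y belongs to the
    // first differing byte.
    static int withVarHandle(byte[] buf, int a, int b, int limit) {
        int len = 0;
        while (len + Long.BYTES <= limit) {
            long x = (long) LONG_LE.get(buf, a + len);
            long y = (long) LONG_LE.get(buf, b + len);
            if (x != y)
                return len + (Long.numberOfTrailingZeros(x ^ y) >>> 3);
            len += Long.BYTES;
        }
        // Fewer than 8 bytes remain: finish byte by byte.
        while (len < limit && buf[a + len] == buf[b + len])
            ++len;
        return len;
    }
}
```

All three methods should return the same value for the same arguments; only the speed differs, and the difference is expected to matter mostly on long matches.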

sun.misc.Unsafe would be a little faster than VarHandle but I feel it's not enough to be worth the downsides (non-standard and not memory safe).

32-bit archs I didn't include, for now at least, since if people want speed I hope they don't run 32-bit Java.

Speed differences are very minor when testing with files that don't compress extremely well. That was the problem I had with my earlier test results. With files that have compression ratio like 0.05 the speed differences are clear.

I cannot test on ARM64 so it would be great if someone can, comparing the three versions.

The most extreme difference is when compressing just zeros:

    time head -c100000000 /dev/zero \
        | java -jar build/jar/XZEncDemo.jar > /dev/null

Internal docs should be added to the branch and perhaps there are other related optimizations to do still. So it's not fully finished yet but now it's ready for testing and feedback. For example, some tweaks from your array_comp_incremental could be considered after testing.

-- 
Lasse Collin