On 2021-02-03 Brett Okken wrote:
> On Wed, Feb 3, 2021 at 2:56 PM Lasse Collin
> <lasse.col...@tukaani.org> wrote:
> > It seems to regress horribly if dist is zero. A file with a very
> > long sequence of the same byte is good for testing.
> 
> Would this be a valid test of what you are describing?
[...]
> The source is effectively 160MB of the same byte value.

Yes, it's fine.

> I found a strange bit of behavior with this case in the compression.
> In LZMAEncoderNormal.calcLongRepPrices, I am seeing a case where
> 
>             int len2Limit = Math.min(niceLen, avail - len - 1);
> 
> results in -1, (avail and len are both 8). This results in calling
> LZEncoder.getMatchLen with a lenLimit of -1. Is that expected?

I didn't check in detail now, but I think it's expected. There are two
such places. Because of this detail, a speed optimization was left out
of liblzma in these two places. I finally remembered to add the
optimization in 5.2.5.
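
To illustrate why a negative limit can be harmless, here is a minimal
sketch, assuming the match-length helper is a plain bounded comparison
loop. The class name, the explicit buf and readPos parameters, and the
niceLen value in the usage note are assumptions for illustration, not
the actual XZ for Java source.

```java
// Hypothetical stand-in for a match-length helper; not the real LZEncoder.
final class MatchLenSketch {
    /**
     * Counts how many bytes starting at readPos equal the bytes
     * (dist + 1) positions earlier, comparing at most lenLimit bytes.
     */
    static int getMatchLen(byte[] buf, int readPos, int dist, int lenLimit) {
        int backPos = readPos - dist - 1;
        int len = 0;

        // With lenLimit <= 0 the condition is false on the first check,
        // so no array element is read and the result is 0.
        while (len < lenLimit && buf[readPos + len] == buf[backPos + len])
            ++len;

        return len;
    }
}
```

With avail and len both 8, Math.min(niceLen, avail - len - 1) is -1 for
any positive niceLen, and a loop of this shape then returns 0 without
touching the buffer.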

On 2021-02-03 Brett Okken wrote:
> I still need to do more testing across jdk 8 and 15, but initial
> returns on this are pretty positive. The repeating byte file is
> meaningfully faster than baseline. One of my test files (image1.dcm)
> does not improve much from baseline, but the other 2 files do.

The repeating byte is indeed much faster than the baseline. With normal
files the speed seems to be about the same as the version I posted, so
a minor improvement over the baseline.

With a file with a two-byte repeat ("ababababababab"...) it's 50 %
slower than the baseline. Calling arraycopy in a loop, copying two bytes
at a time, is not efficient. I didn't check how big the copy needs to be
before the benefit outweighs the overhead of arraycopy, but clearly it
needs to be bigger than two bytes.
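
The pattern in question can be sketched roughly as follows. This is not
the patch under discussion; the class and method names are hypothetical,
and dist is assumed to be zero-based (dist 1 means a two-byte period).
Both methods produce the same bytes, but the chunked one issues one
arraycopy call per period, which for "abab..." means one call per two
bytes.

```java
// Hypothetical sketch of two ways to expand an overlapping LZ77 match.
final class RepeatCopySketch {
    /** Copies len bytes to pos from (dist + 1) bytes back, one byte at a time. */
    static void copyByteLoop(byte[] buf, int pos, int dist, int len) {
        int back = pos - dist - 1;
        for (int i = 0; i < len; i++)
            buf[pos + i] = buf[back + i];
    }

    /** Same result, using one arraycopy call per period of (dist + 1) bytes. */
    static void copyChunked(byte[] buf, int pos, int dist, int len) {
        int period = dist + 1;
        int back = pos - period;

        while (len > 0) {
            // n never exceeds pos - back, so source and destination
            // ranges do not overlap within a single call.
            int n = Math.min(period, len);
            System.arraycopy(buf, back, buf, pos, n);
            back += n;
            pos += n;
            len -= n;
        }
    }
}
```

Where the break-even chunk size lies would have to be measured; the
sketch only shows why a small dist leads to many tiny arraycopy calls.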

The use of Arrays.fill to optimize the case of one repeating byte looks
useful especially if it won't hurt performance in other situations.
Still, I'm not sure yet if the LZDecoder optimizations should go in 1.9.
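
As a rough sketch of the Arrays.fill idea (hypothetical names, zero-based
dist assumed, and a plain byte loop as the fallback, which is an
assumption and not necessarily what the patch does):

```java
import java.util.Arrays;

// Hypothetical sketch of special-casing a one-byte repeat in a decoder.
final class FillRepeatSketch {
    /** Appends len bytes at pos, repeating data from (dist + 1) bytes back. */
    static void repeat(byte[] buf, int pos, int dist, int len) {
        if (dist == 0) {
            // Every output byte equals the previous byte, so a single
            // fill call replaces the whole copy loop.
            Arrays.fill(buf, pos, pos + len, buf[pos - 1]);
            return;
        }

        // Fallback for all other distances: simple byte-by-byte copy.
        int back = pos - dist - 1;
        for (int i = 0; i < len; i++)
            buf[pos + i] = buf[back + i];
    }
}
```

The extra branch costs one comparison per match, so whether it hurts
other inputs is something only benchmarks can answer.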

-- 
Lasse Collin  |  IRC: Larhzu @ IRCnet & Freenode
