On Tue, 19 May 2026 19:50:56 GMT, Volodymyr Paprotski <[email protected]> 
wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp line 293:
>> 
>>> 291:     __ evpxorq( A7, k1,  A7, Address(buf,  7 * 8), false, 
>>> Assembler::AVX_128bit);
>>> 292:     __ evpxorq( A8, k1,  A8, Address(buf,  8 * 8), false, 
>>> Assembler::AVX_128bit);
>>> 293:     __ cmpl(block_size, 72);
>> 
>> Are we reading 8 extra bytes from buf (0 to 79 instead of 0 to 71)?
>> K1 has 0x1F, we have q(8 byte) elements, so only last two bits of K1 would 
>> be used which are set to 1 so it is not limiting the memory read. I must be 
>> missing something.
>
> ooh.. this one. this really tripped me up, because the way this intrinsic 
> loops over user data is very non-intuitive (though its probably more 
> intrinsic-friendly). Best I can describe: "limit is the start of the last 
> block". From DigestBase.java
> 
> 
>         // compress complete blocks
>         if (len >= blockSize) {
>             int limit = ofs + len;
>             ofs = implCompressMultiBlock(b, ofs, limit - blockSize); // <<<<< 
> HERE!!!
>             len = limit - ofs;
>         }
>         // copy remainder to buffer
>         if (len > 0) {
>             System.arraycopy(b, ofs, buffer, 0, len);
>             bufOfs = len;
>         }
>     }
>     // compress complete blocks
>     private int implCompressMultiBlock(byte[] b, int ofs, int limit) {
>         implCompressMultiBlockCheck(b, ofs, limit);
>         return implCompressMultiBlock0(b, ofs, limit);
>     }
> 
>     @IntrinsicCandidate
>     private int implCompressMultiBlock0(byte[] b, int ofs, int limit) {
>         for (; ofs <= limit; ofs += blockSize) {
>             implCompress(b, ofs);
>         }
>         return ofs;
>     }
> 
>     private void implCompressMultiBlockCheck(byte[] b, int ofs, int limit) {
>         if (limit < 0) {
>             return;  // not an error because implCompressMultiBlockImpl won't 
> execute if limit < 0
>                      // and an exception is thrown if ofs < 0.
>         }
> 
>         Objects.requireNonNull(b);
>         Preconditions.checkIndex(ofs, b.length, 
> Preconditions.AIOOBE_FORMATTER);
> 
>         int endIndex = (limit / blockSize) * blockSize  + blockSize - 1; // 
> <<<<< HERE
>         if (endIndex >= b.length) {
>             throw new ArrayIndexOutOfBoundsException(endIndex);
>         }
>     }

But `SHA3::implCompressCheck` only checks up to `ofs + blockSize - 1` which is 
why I believe the old code had an array of masks to round out the various 
blocks sizes in order to prevent accessing memory outside of the array in Java.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/31125#discussion_r3271934914

Reply via email to