On Fri, 28 Oct 2022 20:58:33 GMT, Volodymyr Paprotski <[email protected]> wrote:
>> No, going the WhiteBox route was not something I was thinking of. I sought
>> feedback from a couple of HotSpot-knowledgeable people about the use of
>> WhiteBox APIs, and both felt that it was not the right way to go. One said
>> that WhiteBox is really for VM testing and not for these kinds of Java classes.
>
> One idea I wanted to measure was to intrinsify a separate method instead
> (i.e. the while loop stays exactly the same, just moved into a different,
> non-static method):
>
>     private void processMultipleBlocks(byte[] input, int offset, int length) {
>         //, MutableIntegerModuloP A, IntegerModuloP R) {
>         while (length >= BLOCK_LENGTH) {
>             n.setValue(input, offset, BLOCK_LENGTH, (byte)0x01);
>             a.setSum(n);      // A += (temp | 0x01)
>             a.setProduct(r);  // A = (A * R) % p
>             offset += BLOCK_LENGTH;
>             length -= BLOCK_LENGTH;
>         }
>     }
>
>
> In principle, the Java version would not get any slower (there is only one
> extra method call), at the expense of the C++ glue getting more complex. In
> C++ I would need to dig out, via the IR,
> `(sun.security.util.math.intpoly.IntegerPolynomial.MutableElement)(this.a).limbs`
> and then convert the 5*26-bit limbs into 3*44-bit limbs. The IR is very new
> to me, so this will take some time. (I think I found some AES code that does
> something similar.)
>
> That said, I thought this idea would perhaps be a separate PR, if needed at
> all. Digging the limbs out is one thing, but I would also need to add asserts
> and safety checks. Mostly I would be happy just to measure whether it's
> worth it.
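
For readers following the limb conversion mentioned in the quoted text, here is a minimal Java sketch of repacking five 26-bit limbs into three 44-bit limbs. It is not the intrinsic's C++ glue and not code from the PR. The class and method names (`LimbRepack`, `repack26to44`, `toBigInteger`) are hypothetical, and it assumes every input limb is already fully reduced to the range [0, 2^26), which the real `limbs` array may not guarantee between reductions. It also assumes a 44/44/42 bit split for the 130-bit value.

```java
import java.math.BigInteger;

// Hypothetical helper, for illustration only.
final class LimbRepack {
    static final long MASK44 = (1L << 44) - 1;

    // Repack a 130-bit value from 5 x 26-bit limbs (little-endian limb order)
    // into 3 limbs of 44, 44 and 42 bits.
    // Assumption: each input limb is non-negative and below 2^26.
    static long[] repack26to44(long[] in) {
        if (in.length != 5) {
            throw new IllegalArgumentException("expected 5 limbs");
        }
        for (long limb : in) {
            if (limb < 0 || limb >= (1L << 26)) {
                throw new IllegalArgumentException("limb not reduced to 26 bits");
            }
        }
        long[] out = new long[3];
        // Value bits 0..43: all of limb 0, low 18 bits of limb 1.
        out[0] = (in[0] | (in[1] << 26)) & MASK44;
        // Value bits 44..87: high 8 bits of limb 1, all of limb 2,
        // low 10 bits of limb 3.
        out[1] = ((in[1] >>> 18) | (in[2] << 8) | (in[3] << 34)) & MASK44;
        // Value bits 88..129: high 16 bits of limb 3, all of limb 4.
        out[2] = (in[3] >>> 10) | (in[4] << 16);
        return out;
    }

    // Reference conversion used to cross-check both representations.
    static BigInteger toBigInteger(long[] limbs, int bitsPerLimb) {
        BigInteger value = BigInteger.ZERO;
        for (int i = limbs.length - 1; i >= 0; i--) {
            value = value.shiftLeft(bitsPerLimb).add(BigInteger.valueOf(limbs[i]));
        }
        return value;
    }
}
```

Comparing `toBigInteger(in, 26)` with `toBigInteger(repack26to44(in), 44)` is one way to check such a repacking. The range check at the top stands in for the kind of asserts the quoted text says would be needed.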
-------------
PR: https://git.openjdk.org/jdk/pull/10582