On Tue, 15 Nov 2022 17:42:08 GMT, Volodymyr Paprotski <d...@openjdk.org> wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp line 384:
>> 
>>> 382: void StubGenerator::poly1305_limbs(const Register limbs, const 
>>> Register a0, const Register a1, const Register a2, bool only128)
>>> 383: {
>>> 384:   const Register t1 = r13;
>> 
>> Please, make the temps explicit and lift them into arguments. Otherwise, 
>> it's hard to see what registers are clobbered when helper methods are called.
>
> Thanks for pointing this out.. I spent quite a bit of time and went back and 
> forth on 'register allocation'... it does make sense to pass all the temps 
> needed, when the number of temps is small. This is the case for the three 
> `*_limbs_*` functions. Maybe I should indeed do that...
> 
> On other hand, there are functions like `poly1305_multiply8_avx512` and 
> `poly1305_process_blocks_avx512` that use a _lot_ of temp registers. I think 
> it makes sense to keep those as 'function-header declarations'. 
> 
> Then there are functions like `poly1305_multiply_scalar` that could go either 
> way, has some temps and 'implicitly clobbered' registers, but probably should 
> stay 'as is'..
> 
> I ended up being 'pedantic' and making _all_ temps into 'header variables'. I 
> also tried to comment, but those probably mean more to me then anyone else in 
> hindsight?
> 
> 
> // Register Map:
> // GPRs:
> //   input        = rdi
> //   length       = rbx
> //   accumulator  = rcx
> //   R   = r8
> //   a0  = rsi
> //   a1  = r9
> //   a2  = r10
> //   r0  = r11
> //   r1  = r12
> //   c1  = r8;
> //   t1  = r13
> //   t2  = r14
> //   t3  = r15
> //   t0  = r14
> //   rscratch = r13
> //   stack(rsp, rbp)
> //   imul(rax, rdx)
> // ZMMs:
> //   T: xmm0-6
> //   C: xmm7-9
> //   A: xmm13-18
> //   B: xmm19-24
> //   R: xmm25-29
> ...
>   // Register Map:
>   // reserved: rsp, rbp, rcx
>   // PARAMs: rdi, rbx, rsi, r8-r12
>   // poly1305_multiply_scalar clobbers: r13-r15, rax, rdx
>   const Register t0 = r14;
>   const Register t1 = r13;
>   const Register rscratch = r13;
> 
>   // poly1305_limbs_avx512 clobbers: xmm0, xmm1
>   // poly1305_multiply8_avx512 clobbers: xmm0-xmm6
>   const XMMRegister T0 = xmm2;
> ...
> 
> 
> I think I am ok changing the `*limbs*` functions (even started, before I 
> remembered my train of thought from months back..) but let me know if you 
> agree with the rest of the reasoning?

Changed just the three `*limbs*` functions.

-------------

PR: https://git.openjdk.org/jdk/pull/10582

Reply via email to