On Tue, 15 Nov 2022 17:42:08 GMT, Volodymyr Paprotski <d...@openjdk.org> wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp line 384: >> >>> 382: void StubGenerator::poly1305_limbs(const Register limbs, const >>> Register a0, const Register a1, const Register a2, bool only128) >>> 383: { >>> 384: const Register t1 = r13; >> >> Please, make the temps explicit and lift them into arguments. Otherwise, >> it's hard to see what registers are clobbered when helper methods are called. > > Thanks for pointing this out.. I spent quite a bit of time and went back and > forth on 'register allocation'... it does make sense to pass all the temps > needed, when the number of temps is small. This is the case for the three > `*_limbs_*` functions. Maybe I should indeed do that... > > On other hand, there are functions like `poly1305_multiply8_avx512` and > `poly1305_process_blocks_avx512` that use a _lot_ of temp registers. I think > it makes sense to keep those as 'function-header declarations'. > > Then there are functions like `poly1305_multiply_scalar` that could go either > way, has some temps and 'implicitly clobbered' registers, but probably should > stay 'as is'.. > > I ended up being 'pedantic' and making _all_ temps into 'header variables'. I > also tried to comment, but those probably mean more to me then anyone else in > hindsight? > > > // Register Map: > // GPRs: > // input = rdi > // length = rbx > // accumulator = rcx > // R = r8 > // a0 = rsi > // a1 = r9 > // a2 = r10 > // r0 = r11 > // r1 = r12 > // c1 = r8; > // t1 = r13 > // t2 = r14 > // t3 = r15 > // t0 = r14 > // rscratch = r13 > // stack(rsp, rbp) > // imul(rax, rdx) > // ZMMs: > // T: xmm0-6 > // C: xmm7-9 > // A: xmm13-18 > // B: xmm19-24 > // R: xmm25-29 > ... > // Register Map: > // reserved: rsp, rbp, rcx > // PARAMs: rdi, rbx, rsi, r8-r12 > // poly1305_multiply_scalar clobbers: r13-r15, rax, rdx > const Register t0 = r14; > const Register t1 = r13; > const Register rscratch = r13; > > // poly1305_limbs_avx512 clobbers: xmm0, xmm1 > // poly1305_multiply8_avx512 clobbers: xmm0-xmm6 > const XMMRegister T0 = xmm2; > ... > > > I think I am ok changing the `*limbs*` functions (even started, before I > remembered my train of thought from months back..) but let me know if you > agree with the rest of the reasoning? Changed just the three `*limbs*` functions. ------------- PR: https://git.openjdk.org/jdk/pull/10582