Re: RFR: 8247645: ChaCha20 intrinsics

Andrew Haley Sun, 06 Nov 2022 23:36:46 -0800

On Fri, 2 Sep 2022 16:52:02 GMT, Jamil Nimeh <[email protected]> wrote:


>> src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 2521:
>> 
>>> 2519: #undef INSN3
>>> 2520: #undef INSN4
>>> 2521: 
>> 
>> This code to handle the AdvSIMD load/store single structure and AdvSIMD 
>> load/store single structure (post-indexed) is excessive.
>> 
>> Every one of these instructions has the format, 
>> 
>> `0|Q|0011010|L|R|00000|opcode|S|size|Rn|Rt`
>> 
>> or
>> 
>> `0|Q|0011011|L|R|   Rm|opcode|S|size|Rn|Rt`
>> 
>> Perhaps consider using a `RegSet regs` for the registers. Then the 
>> instruction encoding to use (1, 2, 3, or 4 consecutive registers) can be 
>> picked up from `regs.size()`. There only needs to be a single routine for 
>> all of the `ld_st` variants.
>
> Thanks for the suggestion.  I will look into this.  I can see how 
> `regs.size()` could simplify these macros.
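
The two formats quoted above differ only in bit 23 and in the `Rm` field, so
one encoder could plausibly cover both. Here is a minimal sketch of that idea.
The names `SingleStructFields` and `encode_single_struct` are hypothetical and
are not part of the HotSpot assembler; only the bit layout comes from the
formats above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bundle of the variable fields in the two quoted formats.
struct SingleStructFields {
  unsigned Q, L, R, opcode, S, size, Rn, Rt;
};

// Packs the fields into a 32-bit instruction word.
// post_indexed selects the 0011011 pattern and places Rm in bits 20..16;
// otherwise bits 20..16 stay zero, as in the first format.
inline uint32_t encode_single_struct(const SingleStructFields& f,
                                     bool post_indexed, unsigned Rm = 0) {
  assert(f.Q < 2 && f.L < 2 && f.R < 2 && f.S < 2);
  assert(f.opcode < 8 && f.size < 4 && f.Rn < 32 && f.Rt < 32 && Rm < 32);

  uint32_t insn = 0x0D000000u;            // bits 29..23 = 0011010
  if (post_indexed) {
    insn |= (1u << 23) | (Rm << 16);      // 0011011 plus the Rm field
  }
  insn |= f.Q << 30;
  insn |= f.L << 22;
  insn |= f.R << 21;
  insn |= f.opcode << 13;
  insn |= f.S << 12;
  insn |= f.size << 10;
  insn |= f.Rn << 5;
  insn |= f.Rt;
  return insn;
}
```

With something along these lines, the per-variant macros would only need to
supply the field values instead of each carrying its own encoding logic.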

Another option that may be better than a `RegSet`: if you use a C++11 template 
parameter pack, you can do something like this:


template<typename R, typename... Rx>
void foo(R first_register, Rx... more_registers) {
  // An array holding the first register followed by the additional ones
  const R regs[] = { first_register, more_registers... };
  // The number of additional registers (the total is count + 1)
  const int count = sizeof...(more_registers);
  ...
}

And then you can use the same logic, regardless of the number of registers.
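
To make that concrete, here is a small self-contained sketch. `VReg`,
`RegList` and `make_reg_list` are hypothetical stand-ins (HotSpot would use
its own register types), and the wrap-around check assumes the register list
is expected to be consecutive modulo 32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a vector register; only the encoding matters here.
struct VReg { int enc; };

// Hypothetical summary of a register list.
struct RegList {
  int first;         // encoding of the first register
  int count;         // total number of registers in the list
  bool consecutive;  // true if each register follows the previous one, mod 32
};

template<typename R, typename... Rx>
RegList make_reg_list(R first_register, Rx... more_registers) {
  static_assert(sizeof...(more_registers) <= 3, "expects 1 to 4 registers");

  // All registers, first one included, in call order.
  const R regs[] = { first_register, more_registers... };
  const int count = 1 + static_cast<int>(sizeof...(more_registers));
  assert(regs[0].enc >= 0 && regs[0].enc < 32);

  // The same loop serves every list length.
  bool consecutive = true;
  for (int i = 1; i < count; i++) {
    if (regs[i].enc != (regs[i - 1].enc + 1) % 32) {
      consecutive = false;
    }
  }
  return RegList{ regs[0].enc, count, consecutive };
}
```

A call such as `make_reg_list(VReg{30}, VReg{31}, VReg{0}, VReg{1})` goes
through exactly the same code path as `make_reg_list(VReg{5})`; the list
length is known at compile time, so passing too many registers fails to
compile rather than failing at runtime.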

> What I don't know is if one approach is better than the other for other 
> reasons like performance or memory consumption. Do you have any feelings one 
> way or the other?

`ADR` is smaller and faster at runtime; `lea(reg, ExternalAddress((address) 
foo))` with `const uint64_t foo[] = { ... }` will be slightly faster at 
start-up time. It makes no sense to emit the table with `emit_data64()` and 
then take the address of the table you've just emitted with `lea`. That's 
worse for startup time _and_ for runtime. So I don't much mind emitting the 
table at runtime, but if you do, get its address with `ADR`.

-------------

PR: https://git.openjdk.org/jdk/pull/7702
