On Fri, 24 May 2024 18:37:13 GMT, Vladimir Kozlov wrote:
>> Changed to `lea` with `InternalAddress()`. Generates the exact same code,
>> but makes more sense. I looked at `movdqu` and see no code that generates
>> RIP-relative loads. It merely checks `reachable()` and adds an intermediate
>
On Fri, 24 May 2024 15:33:46 GMT, Scott Gibbons wrote:
>> Thanks for checking. Well I know that the
>> `MacroAssembler::movdqu(XMMRegister dst, AddressLiteral src, Register
>> rscratch)` method actually generates rip-relative addresses. Maybe we could
>> copy some of that code.
>
> Changed to
On Fri, 24 May 2024 14:49:05 GMT, Daniel Jeliński wrote:
>> Just did the experiment and it turns out that `mov64(r15,
>> (int64_t)small_jump_table)` and `lea(r15,
>> ExternalAddress(small_jump_table))` produce exactly the same code:
>>
>> `0x7fffe463d68b: 49 bf a0 d5 63 e4 ff 7f 00 00 m
On Fri, 24 May 2024 14:19:13 GMT, Scott Gibbons wrote:
>> the RIP-relative lea should have a shorter encoding. I think something like
>> `lea(r15, ExternalAddress(small_jump_table))` should produce it (untested)
>
> Just did the experiment and it turns out that `mov64(r15,
> (int64_t)small_jump
On Fri, 24 May 2024 06:31:40 GMT, Daniel Jeliński wrote:
>> It may, but I believe the movq is shorter (although maybe not to r15). I'll
>> do some experimentation.
>
> the RIP-relative lea should have a shorter encoding. I think something like
> `lea(r15, ExternalAddress(small_jump_table))` sh
On Fri, 24 May 2024 06:31:36 GMT, Daniel Jeliński wrote:
>> Thanks for finding this. It was ignorance on my part as I thought the xorq
>> would have logic to not emit the REX prefix if not necessary, but it
>> doesn't. Fixed.
>
> Right, it seems to surprise people. There's a lot of preexistin
On Thu, 23 May 2024 19:26:10 GMT, Scott Gibbons wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 268:
>>
>>> 266: __ cmpq(needle_len_p, 0);
>>> 267: __ jg_b(L_nextCheck);
>>> 268: __ xorq(rax, rax);
>>
>> out of curiosity, is there any advantage to using `xorq` instead o
On Thu, 23 May 2024 19:02:05 GMT, Daniel Jeliński wrote:
>> Scott Gibbons has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Fix for IndexOf.java on mac
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 268:
>
>> 266: __ cmpq(
On Thu, 23 May 2024 17:25:34 GMT, Scott Gibbons wrote:
>> Re-write the IndexOf code without the use of the pcmpestri instruction, only
>> using AVX2 instructions. This change accelerates String.IndexOf on average
>> 1.3x for AVX2. The benchmark numbers:
>>
>>
>> Benchmark
> Re-write the IndexOf code without the use of the pcmpestri instruction, only
> using AVX2 instructions. This change accelerates String.IndexOf on average
> 1.3x for AVX2. The benchmark numbers:
>
>
> BenchmarkScore
> Latest