On Wed, 15 Nov 2023 00:39:29 GMT, Sandhya Viswanathan <sviswanat...@openjdk.org> wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 1186:
>>
>>> 1184: __ evmovntdquq(Address(dst, index, scale, offset + 0x40), xmm2, Assembler::AVX_512bit);
>>> 1185: __ evmovntdquq(Address(dst, index, scale, offset + 0x80), xmm3, Assembler::AVX_512bit);
>>> 1186: __ evmovntdquq(Address(dst, index, scale, offset + 0xC0), xmm4, Assembler::AVX_512bit);
>>
>> These are non-temporal memory moves, to force eviction from write combining
>> buffers we may need to emit additional fences, else a subsequent read from
>> destination memory may see incorrect values.
>
> @jatin-bhateja There is a sfence at line 781.

Thanks, there is a store fence upon completion of the main loop for the large-size code:

![image](https://github.com/openjdk/jdk/assets/3858882/3bcea3c6-3bda-458c-aa7c-29ed6010cde2)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16575#discussion_r1393511087