On Fri, 6 Mar 2026 00:36:48 GMT, Sandhya Viswanathan <[email protected]> wrote:
>> xinyangwu has updated the pull request with a new target base due to a merge
>> or a rebase. The incremental webrev excludes the unrelated changes brought
>> in by the merge/rebase. The pull request contains 11 additional commits
>> since the last revision:
>>
>>  - Remove trailing backslashes
>>  - Merge branch 'openjdk:master' into aes
>>  - refactor
>>  - Merge branch 'openjdk:master' into aes
>>  - 8376164: Optimize AES/ECB/PKCS5Padding implementation using full-message intrinsic stub and parallel RoundKey addition
>>  - Merge branch 'openjdk:master' into aes
>>  - 8376164: Optimize AES/ECB/PKCS5Padding implementation using full-message intrinsic stub and parallel RoundKey addition
>>  - Merge branch 'openjdk:master' into aes
>>  - Merge branch 'openjdk:master' into aes
>>  - Merge branch 'openjdk:master' into aes
>>  - ... and 1 more: https://git.openjdk.org/jdk/compare/c718cea6...ef2effbc
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_aes.cpp line 1494:
>
>> 1492:     DoFour(pxor, xmm_key_tmp);
>> 1493:     for (int i = 1; i < rounds[k]; i++) {
>> 1494:       load_key(xmm_key_tmp, key, i * 0x10, xmm_key_shuf_mask);
>
> We have 16 Xmm registers and we have used only 6 of them. We could use the
> remaining 10 Xmm registers to load 10 of the keys prior to the L_loop4 and
> hold them. That way we do not need to reload 10 of the keys again and again
> in the loop.

I’ve updated the implementation as you suggested. The benchmarks show a further performance improvement compared to the previous version. Thanks.

> src/hotspot/cpu/x86/stubGenerator_x86_64_aes.cpp line 1560:
>
>> 1558: // rax - input length
>> 1559: //
>> 1560: address StubGenerator::generate_electronicCodeBook_decryptAESCrypt_Parallel() {
>
> The encrypt and decrypt methods are very similar, wonder if we could
> parametrize and use one generate method to generate both?

Changed as suggested.
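
For readers following the thread, the two review suggestions (load the round keys once before the block loop and keep them, and use one parameterized routine for both encrypt and decrypt) can be sketched in plain C++. This is only an illustration under stated assumptions: it uses a toy XOR/rotate round rather than AES, and `process_blocks`, `kRounds`, and `RoundKeys` are hypothetical names. It is not the HotSpot MacroAssembler code and not the stub in this PR.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy stand-in for a block cipher; NOT AES. All names here are hypothetical.
constexpr int kRounds = 10;
using RoundKeys = std::array<uint64_t, kRounds>;

static inline uint64_t rotl7(uint64_t v) { return (v << 7) | (v >> 57); }
static inline uint64_t rotr7(uint64_t v) { return (v >> 7) | (v << 57); }

// One routine serves both directions, selected by `encrypt`, instead of two
// near-identical copies of the same loop.
void process_blocks(bool encrypt, const uint64_t* key_schedule,
                    const uint64_t* in, uint64_t* out, size_t nblocks) {
  // Read every round key once, before the block loop. This is the analogue
  // of holding the keys in otherwise unused registers rather than reloading
  // them from memory for each group of blocks.
  RoundKeys rk;
  for (int i = 0; i < kRounds; i++) rk[i] = key_schedule[i];

  for (size_t b = 0; b < nblocks; b++) {
    uint64_t s = in[b];
    if (encrypt) {
      // Forward rounds: mix in the key, then rotate.
      for (int i = 0; i < kRounds; i++) s = rotl7(s ^ rk[i]);
    } else {
      // Inverse rounds in reverse order: undo the rotate, then the key.
      for (int i = kRounds - 1; i >= 0; i--) s = rotr7(s) ^ rk[i];
    }
    out[b] = s;
  }
}
```

Calling `process_blocks(true, ...)` and then `process_blocks(false, ...)` with the same key schedule returns the original blocks, which is a quick way to check that the shared routine treats the two directions consistently.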
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/29385#discussion_r2895555423
PR Review Comment: https://git.openjdk.org/jdk/pull/29385#discussion_r2895557105
