On Mon, 25 Jan 2021 09:52:00 GMT, Andrew Haley <a...@openjdk.org> wrote:
>> Hello
>> Why is it not nice ?
>> linux_aarch64 uses some linux specific tls function
>> _ZN10JavaThread25aarch64_get_thread_helperEv from
>> hotspot/os_cpu/linux_aarch64/threadLS_linux_aarch64.s
>> which clobbers only r0 and r1
>> macos_aarch64 has no such tls code and uses generic C-call to
>> Thread::current();
>> Hence we are saving possibly clobbered regs here.
>
> Surely if you did as Linux does you wouldn't need to clobber all those
> registers.

I see how this can be done, from looking at a C example with `__thread`; it involves poorly documented relocations and a call into a private libc interface. But now I also wonder how much benefit we would get from this optimization.

`get_thread` is called from only a few places, and all of them are guarded by `#ifdef ASSERT`. As a verification, I removed the MacroAssembler::get_thread definition, and the release build is still fine.

Callers of get_thread:
* StubAssembler::call_RT
* Runtime1::generate_patching
* StubGenerator::generate_call_stub
* StubGenerator::generate_catch_exception

All of them are heavy-weight functions, so a suboptimal get_thread is unlikely to slow them down, even in fastdebug.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200