In the last couple of weeks I have been working on various issues related to the aarch64 port. I have made good progress and will be sending new patches soon.
Two of the issues had to do with making Java run on aarch64 - https://github.com/cloudius-systems/osv/issues/1145 and https://github.com/cloudius-systems/osv/issues/1157. After exchanging some emails on the OpenJDK mailing list and researching the problem, I finally discovered that it only happens when the JIT is enabled. The cause is that the JIT compiler generates machine code to access an arbitrary memory address in a way that assumes all addresses are 48 bits wide, meaning the upper 16 bits are 0. Here are the details:

"Once I got hold of the JDK debuginfo files and identified the patching code - MacroAssembler::pd_patch_instruction() - I was able to put a breakpoint in it and see something very revealing:

#0  MacroAssembler::pd_patch_instruction_size (branch=0x20000879465c "\351\377\237\322\351\377\277\362\351\377\337\362\n\243\352\227\037", target=0xffffa00042c862e0 "\020zB")
    at src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp:75
#1  0x0000100000bc13cc in MacroAssembler::pd_patch_instruction (file=0x0, line=0, target=0xffffa00042c862e0 "\020zB", branch=<optimized out>)
    at src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp:626
#2  NativeMovConstReg::set_data (this=0x20000879465c, x=-105551995837728)
    at src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp:262
#3  0x0000100000850bd0 in CompiledIC::set_ic_destination_and_value (value=0xffffa00042c862e0, entry_point=0x20000823d290 "(\b@\271\b\001]\322*\005@\371\037\001\n\353,\001", <incomplete sequence \371\200>, this=<optimized out>)
    at src/hotspot/share/code/compiledIC.hpp:193
#4  ICStub::finalize (this=<optimized out>)
    at src/hotspot/share/code/icBuffer.cpp:91
#5  ICStubInterface::finalize (this=<optimized out>, self=<optimized out>)
    at src/hotspot/share/code/icBuffer.cpp:43
#6  0x0000100000e30958 in StubQueue::stub_finalize (this=0xffffa00041555300, s=<optimized out>)
    at src/hotspot/share/code/stubs.hpp:168
#7  StubQueue::remove_first (this=0xffffa00041555300)
    at src/hotspot/share/code/stubs.cpp:175
....

The corresponding crash value of X9 was this:

    0x0000*a00042c862e0*

vs the target argument of pd_patch_instruction() (see above in the backtrace):

    0xffff*a00042c862e0*

Now given this comment:

    // Move a constant pointer into r. In AArch64 mode the virtual
    // address space is 48 bits in size, so we only need three
    // instructions to create a patchable instruction sequence that can
    // reach anywhere.

and this fragment of pd_patch_instruction() - https://github.com/openjdk/jdk17u/blob/6f0f42630eac1febf562062afc523fdf3d2a920a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L152-L161 - it seems that the code to load x8 register with an address gets patched with 0x0000a00042c862e0 instead of 0xffffa00042c862e0.

It is interesting that this assert - https://github.com/openjdk/jdk17u/blob/6f0f42630eac1febf562062afc523fdf3d2a920a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L77 - does not get hit.

The bottom line is that the valid address 0xffffa00042c862e0 gets truncated to 0x0000a00042c862e0, I guess based on the assumption that in Linux all userspace addresses are 48 bits long (see https://www.kernel.org/doc/html/latest/arm64/memory.html). In the OSv unikernel, there is no separation between user space and kernel space, and it happens that addresses returned by malloc fall into this range:

    0xffffa00000000000 - 0xffffafffffffffff

So I guess the only solution to fix it on the OSv side would be to tweak its virtual memory mapping for mallocs and make sure it never uses virtual addresses wider than 48 bits."

Currently OSv maps this part of virtual memory like so:

    ------ 0x ffff 8000 0000 0000  phys_mem --\
    |                                  |       |- Main Area    - 16T
    ------ 0x ffff 9000 0000 0000           --X
    |                                  |       |- Page Area    - 16T
    ------ 0x ffff a000 0000 0000           --X
    |                                  |       |- Mempool Area - 16T
    ------ 0x ffff b000 0000 0000           --X
    |                                  |       |- Debug Area   - 80T
    ------ 0x ffff ffff ffff ffff           --/

I wonder whether this was an arbitrary choice made early in the OSv design or whether there was some good reason for it.

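To make the truncation concrete, here is a minimal Python sketch of what a three-instruction pointer load does to a 64-bit value. It is not HotSpot code: the helper names (movz, movk, load_48bit_pointer) are made up for illustration, and it assumes the sequence is one MOVZ followed by two MOVKs, each writing a 16-bit chunk.

```python
MASK16 = 0xFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def movz(imm16, shift):
    """Model MOVZ: place imm16 at the given bit offset, zero all other bits."""
    return (imm16 & MASK16) << shift

def movk(reg, imm16, shift):
    """Model MOVK: overwrite one 16-bit field of reg, keep the other bits."""
    cleared = reg & ~(MASK16 << shift) & MASK64
    return cleared | ((imm16 & MASK16) << shift)

def load_48bit_pointer(target):
    """Hypothetical three-instruction load: only bits 0..47 of target are used."""
    reg = movz(target & MASK16, 0)
    reg = movk(reg, (target >> 16) & MASK16, 16)
    reg = movk(reg, (target >> 32) & MASK16, 32)
    return reg  # bits 48..63 were zeroed by MOVZ and never written again

target = 0xffffa00042c862e0            # address from the backtrace
patched = load_48bit_pointer(target)
print(f"{target:#018x} -> {patched:#018x}")
```

An address that already fits in 48 bits, such as the branch address 0x20000879465c from the backtrace, comes through such a sequence unchanged, which is why the problem would not show up on a system whose addresses all have zero upper bits.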
Could the layout be changed to this:

    ------ 0x 0000 8000 0000 0000  phys_mem --\
    |                                  |       |- Main Area    - 16T
    ------ 0x 0000 9000 0000 0000           --X
    |                                  |       |- Page Area    - 16T
    ------ 0x 0000 a000 0000 0000           --X
    |                                  |       |- Mempool Area - 16T
    ------ 0x 0000 b000 0000 0000           --X
    |                                  |       |- Debug Area   - 80T
    ------ 0x 0000 ffff ffff ffff           --/

I did manage to hack the code for aarch64 and it seems to be working.

I also found a similar case for x86_64. The rapidjson library used by dotnet uses the upper 16 bits of addresses to pack some extra information into them - see https://github.com/Tencent/rapidjson/pull/546#issue-133623698.

Going forward, I think Linux will eventually extend userspace addresses from 48 bits to 56 bits (see https://en.wikipedia.org/wiki/Intel_5-level_paging) or higher, and dotnet has actually made a fix to disable this high-16-bits hack. But given that there are Linux apps that may assume addresses are 48 bits wide and take advantage of it, it might be wise to change the OSv virtual memory layout to use the lower part only (<= 0x 0000 ffff ffff ffff).

What do you think?

Waldek