[osv-dev] [PATCH] aarch64: improve unexpected exception handling

2022-03-13 Thread Waldemar Kozaczuk
Debugging scenarios when OSv crashes due to an unexpected exception
can be quite tedious, given that they are handled by entry_invalid, which
simply makes the kernel "hang" waiting for an interrupt. One needs to connect
with gdb and inspect registers to make sense of what happened.

This patch improves the unexpected exception handling by defining
a proper handler for each exception level and exception type. When an exception
is triggered, the corresponding handler prints the exception type, the exception
level and all registers, and then aborts, potentially printing a backtrace.
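
For illustration only (this is not part of the patch below), the C++ side in
arch/aarch64/exceptions.cc can be pictured roughly like the following sketch;
the exact exception_frame field names, print helpers and output format in OSv
may differ:

// hypothetical sketch - the frame fields and print helpers are illustrative
extern "C" void handle_unexpected_exception(exception_frame* frame,
                                            uint64_t level, uint64_t type)
{
    static const char* levels[] = {"CURR_EL_SP0", "CURR_EL_SPX",
                                   "LOWER_EL_AARCH64", "LOWER_EL_AARCH32"};
    static const char* types[] = {"SYNC", "IRQ", "FIQ", "SERROR"};
    debug_ll("unexpected %s exception at %s, ESR: 0x%08x\n",
             types[type], levels[level], frame->esr);
    for (int i = 0; i < 31; i++) {      // dump the general purpose registers
        debug_ll("X%02d: 0x%016lx\n", i, frame->regs[i]);
    }
    // the assembly stub then calls abort(), which may print a backtrace
}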

Signed-off-by: Waldemar Kozaczuk 
---
 arch/aarch64/entry.S   | 125 +++--
 arch/aarch64/exceptions.cc |  30 +++--
 2 files changed, 103 insertions(+), 52 deletions(-)

diff --git a/arch/aarch64/entry.S b/arch/aarch64/entry.S
index 0c2a6e82..03266f9c 100644
--- a/arch/aarch64/entry.S
+++ b/arch/aarch64/entry.S
@@ -22,10 +22,10 @@
 Lower Exception level, from AArch32  0x600 0x680 0x700 0x780
  */
 
-.macro vector_entry label idx
+.macro vector_entry level, type
 /* every entry is at 2^7 bits distance */
 .align 7
-b   \label
+b   entry_\level\()_\type
 .endm
 
 .global exception_vectors
@@ -34,28 +34,28 @@
 .align 12
 exception_vectors:
 /* Current Exception level with SP_EL0 : unused */
-vector_entry entry_invalid  0   // Synchronous
-vector_entry entry_invalid  1   // IRQ or vIRQ
-vector_entry entry_invalid  2   // FIQ or vFIQ
-vector_entry entry_invalid  3   // SError or vSError
+vector_entry curr_el_sp0  sync   // Synchronous
+vector_entry curr_el_sp0  irq    // IRQ or vIRQ
+vector_entry curr_el_sp0  fiq    // FIQ or vFIQ
+vector_entry curr_el_sp0  serror // SError or vSError
 
 /* Current Exception level with SP_ELx : only actually used */
-vector_entry entry_sync 4
-vector_entry entry_irq  5
-vector_entry entry_fiq  6
-vector_entry entry_serror   7
+vector_entry curr_el_spx  sync
+vector_entry curr_el_spx  irq
+vector_entry curr_el_spx  fiq
+vector_entry curr_el_spx  serror
 
 /* Lower Exception level in AArch64 : unused since we don't go to EL0 */
-vector_entry entry_invalid  8
-vector_entry entry_invalid  9
-vector_entry entry_invalid 10
-vector_entry entry_invalid 11
+vector_entry lower_el_aarch64 sync
+vector_entry lower_el_aarch64 irq
+vector_entry lower_el_aarch64 fiq
+vector_entry lower_el_aarch64 serror
 
 /* Lower Exception level in AArch32 : no El0, no AArch32 */
-vector_entry entry_invalid 12
-vector_entry entry_invalid 13
-vector_entry entry_invalid 14
-vector_entry entry_invalid 15
+vector_entry lower_el_aarch32 sync
+vector_entry lower_el_aarch32 irq
+vector_entry lower_el_aarch32 fiq
+vector_entry lower_el_aarch32 serror
 
 /* keep in sync with the struct in exceptions.hh */
 .macro push_state_to_exception_frame
@@ -131,24 +131,61 @@ thread_main:
 .equ ESR_FLT_BEG,2 // we strip LL
 .equ ESR_FLT_END,5
 
-.global entry_invalid
-.hidden entry_invalid
-.type entry_invalid, @function
-entry_invalid:
-mrs x20, elr_el1   // Exception Link Register -> X20
-mrs x21, spsr_el1  // Saved PSTATE -> X21
-mrs x22, esr_el1   // Exception Syndrome Register -> X22
+.macro entry_unexpected_exception level, type, level_id, type_id
+.global entry_\level\()_\type
+.hidden entry_\level\()_\type
+.type entry_\level\()_\type, @function
+entry_\level\()_\type:
+.cfi_startproc simple
+.cfi_signal_frame
+.cfi_def_cfa sp, 0
+.cfi_offset x30, -32 // Point to the elr register located at the -32 offset
+ // of the exception frame to help gdb link to the
+ // address when interrupt was raised
+push_state_to_exception_frame
+mrs x1, esr_el1
+str w1, [sp, #272] // Store Exception Syndrome Register in the frame
+mov x0, sp // Save exception_frame to x0
+mov x1, \level_id
+mov x2, \type_id
+bl  handle_unexpected_exception
+pop_state_from_exception_frame
+bl  abort
+.cfi_endproc
+.endm
+
+.equ CURR_EL_SP0, 0x0
+.equ CURR_EL_SPX, 0x1
+.equ LOWER_EL_AARCH64, 0x2
+.equ LOWER_EL_AARCH32, 0x3
 
-ubfm x23, x22, #ESR_EC_BEG, #ESR_EC_END   // Exception Class -> X23
-ubfm x24, x22, #ESR_ISS_BEG, #ESR_ISS_END // Instruction-Specific Syndrome -> X24
+.equ EX_TYPE_SYNC, 0x0
+.equ EX_TYPE_IRQ, 0x1
+.equ EX_TYPE_FIQ, 0x2
+.equ EX_TYPE_SERROR, 0x3
 
-1:  wfi
-b   1b
+entry_unexpected_exception curr_el_sp0, sync, #CURR_EL_SP0, #EX_TYPE_SYNC
+entry_unexpected_exception curr_el_sp0, irq, 

Re: [osv-dev] Proposal to change virtual memory layout

2022-03-13 Thread Nadav Har'El
On Fri, Mar 11, 2022 at 7:03 AM Waldek Kozaczuk wrote:

> In the last couple of weeks I have been working on various issues related
> to the aarch64 port. I have managed to make good progress and I will be sending
> new patches soon.
>
> Two issues had to do with making Java run on aarch64 -
> https://github.com/cloudius-systems/osv/issues/1145 and
> https://github.com/cloudius-systems/osv/issues/1157.  After exchanging
> some emails on the openjdk mailing list and researching this problem, I
> finally discovered that the problem only happens when JIT is enabled and is
> caused by the fact that the JIT compiler generates machine code to access
> an arbitrary address in memory in a way that assumes all addresses are 48
> bits, meaning the upper 16 bits are 0. And here are the details:
>
> "Once I got hold of the JDK debuginfo files and identified the patching
> code - MacroAssembler::pd_patch_instruction(), I was able to put a
> breakpoint in it and see something very revealing:
>
> #0  MacroAssembler::pd_patch_instruction_size (branch=0x2879465c
> "\351\377\237\322\351\377\277\362\351\377\337\362\n\243\352\227\037",
> target=0xa00042c862e0 "\020zB") at
> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp:75
> #1  0x10bc13cc in MacroAssembler::pd_patch_instruction (file=0x0,
> line=0, target=0xa00042c862e0 "\020zB", branch=)
> at src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp:626
> #2  NativeMovConstReg::set_data (this=0x2879465c, x=-105551995837728)
> at src/hotspot/cpu/aarch64/nativeInst_aarch64.cpp:262
> #3  0x10850bd0 in CompiledIC::set_ic_destination_and_value
> (value=0xa00042c862e0,
> entry_point=0x2823d290 
> "(\b@\271\b\001]\322*\005@\371\037\001\n\353,\001",
> , this=)
> at src/hotspot/share/code/compiledIC.hpp:193
> #4  ICStub::finalize (this=) at
> src/hotspot/share/code/icBuffer.cpp:91
> #5  ICStubInterface::finalize (this=, self=)
> at src/hotspot/share/code/icBuffer.cpp:43
> #6  0x10e30958 in StubQueue::stub_finalize
> (this=0xa00041555300, s=) at
> src/hotspot/share/code/stubs.hpp:168
> #7  StubQueue::remove_first (this=0xa00041555300) at
> src/hotspot/share/code/stubs.cpp:175
> 
>
> The corresponding crash value of X9 was this:
>
> 0x*a00042c862e0*
>
> vs the target argument of pd_patch_instruction() (see above in the
> backtrace):
>
> 0x*a00042c862e0*
>
> Now given this comment:
>
> // Move a constant pointer into r.  In AArch64 mode the virtual
> // address space is 48 bits in size, so we only need three
> // instructions to create a patchable instruction sequence that can
> // reach anywhere.
>
> and this fragment of pd_patch_instruction() -
> https://github.com/openjdk/jdk17u/blob/6f0f42630eac1febf562062afc523fdf3d2a920a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L152-L161
>
> it seems that the code to load x8 register with an address gets patched
> with 0xa00042c862e0 instead of 0xa00042c862e0. It is interesting
> that this assert -
> https://github.com/openjdk/jdk17u/blob/6f0f42630eac1febf562062afc523fdf3d2a920a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L77
> - does not get hit.
>
> The bottom line is that the valid address 0xa00042c862e0 gets
> truncated to 0xa00042c862e0, I guess based on the assumption that in
> Linux all userspace addresses are 48 bits long (see
> https://www.kernel.org/doc/html/latest/arm64/memory.html). In the OSv
> unikernel, there is no separation between user space and kernel space, and
> it happens that addresses returned by malloc fall into this range:
>
> 0xffffa00000000000 - 0xffffafffffffffff
>
> So I guess the only solution to fix it on the OSv side would be to tweak
> its virtual memory mapping for mallocs and make sure it never uses virtual
> addresses wider than 48 bits."
>
> Currently OSv maps this part of virtual memory like so:
>
> -- 0xffff800000000000   phys_mem --\
>  |                                 |
>  |- Main Area - 16T                |
> -- 0xffff900000000000            --X
>  |                                 |
>  |- Page Area - 16T                |
> -- 0xffffa00000000000            --X
>  |                                 |
>  |- Mempool Area - 16T             |
> -- 0xffffb00000000000            --X
>  |                                 |
>  |- Debug Area - 80T               |
> -- 0xffffffffffffffff            --/
>
> I wonder if this was an arbitrary choice made early in OSv's design or
> whether there was some good reason for it.
>
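
To make the truncation described in the quoted message concrete, here is a
small standalone C++ sketch (the 0xffff... address value is made up purely for
illustration): a movz plus two movk instructions can only encode the low 48
bits of an immediate, so anything in the upper 16 bits is silently dropped.

#include <cinttypes>
#include <cstdint>
#include <cstdio>

int main()
{
    // hypothetical address in OSv's 0xffffa000... malloc range
    uint64_t target = 0xffffa00042c862e0ULL;
    // what a movz / movk,lsl #16 / movk,lsl #32 sequence can actually encode:
    // only bits 0..47 of the target address
    uint64_t patched = target & 0x0000ffffffffffffULL;
    std::printf("target:  0x%016" PRIx64 "\n", target);   // 0xffffa00042c862e0
    std::printf("patched: 0x%016" PRIx64 "\n", patched);  // 0x0000a00042c862e0
    return 0;
}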

Well, the story for x86 addresses is as follows (as far as I remember):

Although x86-64 nominally has a 64-bit address space, it has never fully
supported the entire 64 bits, and still doesn't.
Rather (see the good explanation of "canonical form addresses" in
https://en.wikipedia.org/wiki/X86-64), it only supports 48 bits.
Moreover, all the highest bits *must* be copies of bit 47. So basically you
have 47 bits (128 TB) with the highest bits all 0, and another 128 TB with the
highest bits all 1 - these are the 0xffff... addresses.
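
As a small illustrative check (not OSv code): an address is canonical on
x86-64 exactly when bits 63..47 are all equal, which yields a 128 TB lower
half and a 128 TB upper half starting at 0xffff800000000000.

#include <cstdint>
#include <cstdio>

// canonical on x86-64: bits 63..47 are all copies of bit 47
static bool is_canonical(uint64_t addr)
{
    return addr < (1ULL << 47) || addr >= 0xffff800000000000ULL;
}

int main()
{
    std::printf("%d\n", is_canonical(0x00007fffffffffffULL)); // 1: top of the lower half
    std::printf("%d\n", is_canonical(0xffff800000000000ULL)); // 1: bottom of the upper half
    std::printf("%d\n", is_canonical(0x0000800000000000ULL)); // 0: in the non-canonical hole
    return 0;
}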

So it was convenient for OSv to divide the address space into one half for
mmap and one half for malloc.

If we want OSv to only use the lower 128 TB of the address space, that should
be fairly easy to fix; we just need to check if we