On Sat, Feb 13, 2021 at 6:24 PM Waldek Kozaczuk <jwkozac...@gmail.com> wrote:
> Hi, > > On Thu, Feb 11, 2021 at 9:06 AM Nadav Har'El <n...@scylladb.com> wrote: > >> On Thu, Feb 11, 2021 at 7:42 AM Waldek Kozaczuk <jwkozac...@gmail.com> >> wrote: >> >>> >>> #1 0x0000100000037954 in test_bsd_tcp1::tcp_server (this=0x2000006ff988) >>> at /home/wkozaczuk/projects/osv/tests/tst-bsd-tcp1-zsnd.cc:114 >>> >>> 114 int bytes2 = zcopy_tx(client_s, &zm); >>> >>> (gdb) p client_s >>> >>> $1 = 5 >>> >>> (gdb) p &zm >>> >>> $2 = (zmsghdr *) 0xffff800041782d40 >>> >>> >>> As you can see the test app calls zcopy_tx() which takes 2 arguments: >>> >>> ssize_t zcopy_tx(int s, struct zmsghdr *zm) >>> >>> the 1st one is int and has value 5 in the caller - the test app - and is >>> received as such >>> >>> in the kernel zcopy_tx. >>> >>> >>> The second one - the address of struct zmsghdr - is problematic. On the >>> caller's side looks OK but when received in the kernel it is wrong - 0x1. >>> >>> Why? >>> >> >> Not being an expert on aarch64 or it's function calling conventions, all >> I can do is raise some wild guesses, I hope one of them is correct and you >> can figure out which - perhaps by reading the code or trying to reproduce >> it in new tests (you can perhaps write a new test which loops calling some >> function f() with a bunch of parameters in multiple threads, and printing >> an error if f ever gets called with wrong parameters) . >> >> One possibility is that our context-switch implementation is forgetting >> to save some of the registers, and the register which is used to hold the >> third argument of a function is lost on the context switch. >> >> Another possibility is that we lose this register in situations smaller >> asynchronous events, not just context switches between threads. We have >> interrupts (e.g., the timer interrupt), exceptions, and signals, which can >> run complex OSv code in the middle of the user's function without the >> function knowing that this is happening, so when we switch to these >> interrupts or exceptions we mustn't forget the registers which the OSv code >> may clobber. >> >> > > I think you are right that we are "losing" a register or overwriting stack > memory we restore registers from. It is just not clear where. > > So here is what I have tried since last email: > 1) I had a theory that maybe there is a bug in the __elf_resolve_pltgot > assembly which would be called lazily as the app is executing it might be > messing the registers. So I forced the app symbols to be resolved eagerly > and still getting same exact crash. So back to the drawing board. > 2) I made gradual changed to conf/debug.mk to find the minimal difference > with release.mk that makes the crashes happen: > > - Disabling the debug messages with 'onf-logger_debug=0' still give > the exact same crashes > - Removing '-Wno-maybe-uninitialized' from debug.mk also does change > anything - still same crash > - So at this point the only difference is optimization level and > missing '-DNDEBUG'. So I thought of bumping the debug optimization level to > '-O1'. Interestingly this makes the crashes to happen in the kernel code > when mounting ROFS filesystem in rofs_mount() static function and it is > very repeatable. But at the end of the day it is similar to the original > crashes with -O0 - some register - typically x2* (x24, x22 or x28) changes > in the middle of the function after it calls another one that register > typically holding an address has wrong bogus value just as it was not > restored by callee. > > In the specific kernel crash with -O1 above, the rofs_mount() > calls rofs_read_blocks() which reads from a block device so for sure it > causes an exit to hypervisor and results in an interrupt handled so for > sure there are multiples threads being switched. So again maybe > something wrong in interrupt handler? > > Now looking at all assembly code (which is I think they issue is) there > are not that many candidates: entry.S where we have sync (page fault) and > async (interrupts) handlers entry points. Both save and restore all > registers they should I think using those 2 macros: > .macro push_state_to_exception_frame > sub sp, sp, #48 // make space for align2, align1+ESR, PSTATE, > PC, SP > push_pair x28, x29 > push_pair x26, x27 > I have two problems trying to understand this: first, sadly my memory of how we did all of this in x86 is very rusty. Second, I have no idea what was done differently in aarch64, and if so why. You seem to be pushing registers on the stack here. Where is this stack? In x86, we had separate stacks for exceptions, for nested exceptions, and interrupts. Is this also true in the arm version? This discussion of the stack made me think of another possible reason for losing data in functions. The red zone. Do we have a "red zone" on arm64 as well? Basically, the "red zone" is 128 bytes below the stack pointer that a function can use as scratch space, and it can use it for example to store some of the parameters if it needs the registers to store something else - without wasting time on instructions to change the stack pointer. If some interrupt or exception overwrites this redzone, we lose data. To avoid this, we usually had separate stacks for interrupts and exceptions and nested exceptions, but where we didn't want to do this, e.g., in syscalls, we had to skip the redzone (see for example commit 499b9433ae748b6c04dedc2125ea17010ffbdaf1). I have another wild guess below - caller-saved registers: push_pair x24, x25 > push_pair x22, x23 > push_pair x20, x21 > push_pair x18, x19 > push_pair x16, x17 > push_pair x14, x15 > push_pair x12, x13 > push_pair x10, x11 > push_pair x8, x9 > push_pair x6, x7 > push_pair x4, x5 > push_pair x2, x3 > push_pair x0, x1 > add x1, sp, #288 // x1 := old SP (48 + 16 * 15 = 288) > mrs x2, elr_el1 > mrs x3, spsr_el1 > stp x30, x1, [sp, #240] // store lr, old SP > stp x2, x3, [sp, #256] // store elr_el1, spsr_el1 > > .macro pop_state_from_exception_frame > ldp x21, x22, [sp, #256] // load elr_el1, spsr_el1 > pop_pair x0, x1 > pop_pair x2, x3 > pop_pair x4, x5 > pop_pair x6, x7 > pop_pair x8, x9 > msr elr_el1, x21 // set loaded elr and spsr > msr spsr_el1, x22 > pop_pair x10, x11 > pop_pair x12, x13 > pop_pair x14, x15 > pop_pair x16, x17 > pop_pair x18, x19 > pop_pair x20, x21 > pop_pair x22, x23 > pop_pair x24, x25 > pop_pair x26, x27 > pop_pair x28, x29 > ldr x30, [sp], #48 > > There maybe an issue in context switch functions which look like this but > I do not see any obvious issues here: > > void thread::switch_to() > { > thread* old = current(); > asm volatile ("msr tpidr_el0, %0; isb; " :: "r"(_tcb) : "memory"); > > asm volatile("\n" > "str x29, %0 \n" > "mov x2, sp \n" > "adr x1, 1f \n" /* address of label */ > "stp x2, x1, %1 \n" > > "ldp x29, x0, %2 \n" > "ldp x2, x1, %3 \n" > > "mov sp, x2 \n" > "blr x1 \n" > > "1: \n" /* label */ > : > : "Q"(old->_state.fp), "Ump"(old->_state.sp), > "Ump"(this->_state.fp), "Ump"(this->_state.sp) > : "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8", > "x9", "x10", "x11", "x12", "x13", "x14", "x15", > "x16", "x17", "x18", "x30", "memory"); > } > > void thread::switch_to_first() > { > asm volatile ("msr tpidr_el0, %0; isb; " :: "r"(_tcb) : "memory"); > > /* check that the tls variable preempt_counter is correct */ > assert(sched::get_preempt_counter() == 1); > > s_current = this; > current_cpu = _detached_state->_cpu; > remote_thread_local_var(percpu_base) = > _detached_state->_cpu->percpu_base; > > asm volatile("\n" > "ldp x29, x0, %0 \n" > "ldp x2, x1, %1 \n" > "mov sp, x2 \n" > "blr x1 \n" > : > : "Ump"(this->_state.fp), "Ump"(this->_state.sp) > : "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8", > "x9", "x10", "x11", "x12", "x13", "x14", "x15", > "x16", "x17", "x18", "x30", "memory"); > } > > > Just for reference the ARM function calling convention is this (as far as > registers go): > > Caller: > * Save x0-x18 registers if calling a function (it may change them) > * Use x0-x7 registers for 8 first parameters > * push extra parameters on stack > The "typical" problem here (I don't know if it happens in your case, but it happened in the past in various cases) is that "something" (interrupt, exception, signal, etc.) gets called *in the middle *of the user's function code, so he did not know he was going to call a function, so it didn't save these caller-saved registers. This is why all of that asynchronous code needs to save all those caller-saved registers. In x86, we had these problems with the FPU and had to save the FPU state in a bunch of places. Maybe in aarch64 we need to save additional registers in the same place we saved the FPU state for x86? > * call a function > * Restore ant x0-x18 registers if saved > > > Callee: > * push lr and any x19-x30 registered if used on stack > * execute code > * pop any x19-x30 registered if used above from stack > > Waldek > >> >>> I saw another test crashing in a similar way when the caller (another >>> test) would pass 3 arguments to kernel function and 2 of those >>> (non-addresses) were passed correctly but the 3rd one - address one was not. >>> >>> >>> Any ideas what might be going on? >>> >>> >>> Waldek >>> >>> -- >>> You received this message because you are subscribed to the Google >>> Groups "OSv Development" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to osv-dev+unsubscr...@googlegroups.com. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/osv-dev/4a97809f-d207-48b9-88e7-06e218e5d829n%40googlegroups.com >>> <https://groups.google.com/d/msgid/osv-dev/4a97809f-d207-48b9-88e7-06e218e5d829n%40googlegroups.com?utm_medium=email&utm_source=footer> >>> . >>> >> -- You received this message because you are subscribed to the Google Groups "OSv Development" group. To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/CANEVyjv8jweaY8vFu9-W%3D6nsNqo-aLP0Q%3DkpF6FOQ_wuXVF3DQ%40mail.gmail.com.