On 7/4/23 12:52, Andreas Schwab wrote:
I think the issue is that the value returned from brk(0) is no longer page aligned.

$ ./qemu-riscv64 -strace ../exe1
18329 brk(NULL) = 0x0000000000303000
18329 faccessat(AT_FDCWD,"/etc/ld.so.preload",R_OK,0x3010d0) = -1 errno=2 (No such file or directory)
18329 openat(AT_FDCWD,"/etc/ld.so.cache",O_RDONLY|O_CLOEXEC) = 3
18329 newfstatat(3,"",0x00000040007fe900,0x1000) = 0
18329 mmap(NULL,8799,PROT_READ,MAP_PRIVATE,3,0) = 0x0000004000824000
18329 close(3) = 0
18329 openat(AT_FDCWD,"/lib64/lp64d/libc.so.6",O_RDONLY|O_CLOEXEC) = 3
18329 read(3,0x7fea70,832) = 832
18329 newfstatat(3,"",0x00000040007fe8f0,0x1000) = 0
18329 mmap(NULL,1405128,PROT_EXEC|PROT_READ,MAP_PRIVATE|MAP_DENYWRITE,3,0) = 0x0000004000827000
18329 mmap(0x000000400096d000,20480,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_DENYWRITE|MAP_FIXED,3,0x146000) = 0x000000400096d000
18329 mmap(0x0000004000972000,49352,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED,-1,0) = 0x0000004000972000
18329 close(3) = 0
18329 mmap(NULL,8192,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0x000000400097f000
18329 set_tid_address(0x400097f710) = 18329
18329 set_robust_list(0x400097f720,24) = -1 errno=38 (Function not implemented)
18329 mprotect(0x000000400096d000,12288,PROT_READ) = 0
18329 mprotect(0x0000004000820000,4096,PROT_READ) = 0
18329 prlimit64(0,RLIMIT_STACK,NULL,0x00000040007ff4f8) = 0 ({rlim_cur=8388608,rlim_max=-1})
18329 munmap(0x0000004000824000,8799) = 0
18329 newfstatat(1,"",0x00000040007ff658,0x1000) = 0
18329 getrandom(0x4000976a40,8,1) = 8
18329 brk(NULL) = 0x0000000000303000
18329 brk(0x0000000000324000) = 0x0000000000324000
18329 write(1,0x3032a0,12)Hello world = 12
18329 exit_group(0)

$ qemu-riscv64 -strace ../exe1
18369 brk(NULL) = 0x00000000003022e8
18369 faccessat(AT_FDCWD,"/etc/ld.so.preload",R_OK,0x3010d0) = -1 errno=2 (No such file or directory)
18369 openat(AT_FDCWD,"/etc/ld.so.cache",O_RDONLY|O_CLOEXEC) = 3
18369 newfstatat(3,"",0x00000040007fe8f0,0x1000) = 0
18369 mmap(NULL,8799,PROT_READ,MAP_PRIVATE,3,0) = 0x0000004000824000
18369 close(3) = 0
18369 openat(AT_FDCWD,"/lib64/lp64d/libc.so.6",O_RDONLY|O_CLOEXEC) = 3
18369 read(3,0x7fea60,832) = 832
18369 newfstatat(3,"",0x00000040007fe8e0,0x1000) = 0
18369 mmap(NULL,1405128,PROT_EXEC|PROT_READ,MAP_PRIVATE|MAP_DENYWRITE,3,0) = 0x0000004000827000
18369 mmap(0x000000400096d000,20480,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_DENYWRITE|MAP_FIXED,3,0x146000) = 0x000000400096d000
18369 mmap(0x0000004000972000,49352,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED,-1,0) = 0x0000004000972000
18369 close(3) = 0
18369 mmap(NULL,8192,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0x000000400097f000
18369 set_tid_address(0x400097f710) = 18369
18369 set_robust_list(0x400097f720,24) = -1 errno=38 (Function not implemented)
18369 mprotect(0x000000400096d000,12288,PROT_READ) = 0
18369 mprotect(0x0000004000820000,4096,PROT_READ) = 0
18369 prlimit64(0,RLIMIT_STACK,NULL,0x00000040007ff4e8) = 0 ({rlim_cur=8388608,rlim_max=-1})
18369 munmap(0x0000004000824000,8799) = 0
18369 newfstatat(1,"",0x00000040007ff648,0x1000) = 0
18369 getrandom(0x4000976a40,8,1) = 8
18369 brk(NULL) = 0x00000000003022e8
18369 brk(0x00000000003232e8)**
ERROR:../accel/tcg/cpu-exec.c:1028:cpu_exec_setjmp: assertion failed: (cpu == current_cpu)
Bail out! ERROR:../accel/tcg/cpu-exec.c:1028:cpu_exec_setjmp: assertion failed: (cpu == current_cpu)
**
ERROR:../accel/tcg/cpu-exec.c:1028:cpu_exec_setjmp: assertion failed: (cpu == current_cpu)
Bail out! ERROR:../accel/tcg/cpu-exec.c:1028:cpu_exec_setjmp: assertion failed: (cpu == current_cpu)
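
For reference, the alignment difference between the two brk(NULL) values in the traces can be checked with a small sketch like the one below. It assumes a 4 KiB guest page size, and the names EXAMPLE_PAGE_SIZE, example_is_page_aligned and example_page_align_up are made up for illustration; this is not QEMU code and not the patch under discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The real guest page size may differ. */
#define EXAMPLE_PAGE_SIZE UINT64_C(4096)

/* Hypothetical helper: true if addr sits on a page boundary. */
static bool example_is_page_aligned(uint64_t addr)
{
    return (addr & (EXAMPLE_PAGE_SIZE - 1)) == 0;
}

/* Hypothetical helper: round addr up to the next page boundary
 * (addr is returned unchanged if it is already aligned). */
static uint64_t example_page_align_up(uint64_t addr)
{
    return (addr + EXAMPLE_PAGE_SIZE - 1) & ~(EXAMPLE_PAGE_SIZE - 1);
}
```

Under that assumption, 0x303000 is page aligned while 0x3022e8 is not (its low 12 bits are 0x2e8), and rounding 0x3022e8 up gives 0x303000, the value brk(NULL) returns in the first trace.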

This reminds me of a failure I once saw on the hppa target. See commit bd4b7fd6ba98 ("linux-user/hppa: Fix segfaults on page zero").

Maybe the non-page-aligned brk address causes glibc or the application in the guest to jump somewhere else (see cpu_exec_setjmp)? The example in my commit message jumped to address 0, which isn't writable for applications on the target machine, and qemu failed to trigger/handle the correct target exception.

I think your patch to page-align the initial brk() is correct, but it probably just hides the real problem. Could you test what happens with exe1 on a physical RISC-V machine when the brk address isn't page aligned? Maybe some exception handling for RISC-V is missing in qemu too?

Helge