Public bug reported:

The Linux guest OS catches a page fault bug when running the dotnet application.

host = metal = x86_64
host OS = ubuntu 19.10
qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
guest emulation = x86_64
guest OS = ubuntu 19.10
guest app = dotnet, running any program

qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10, 2020)

qemu invocation is:

  qemu/build/x86_64-softmmu/qemu-system-x86_64 \
    -m size=4096 \
    -smp cpus=1 \
    -machine type=pc-i440fx-5.0,accel=tcg \
    -cpu Skylake-Server-v1 \
    -nographic \
    -bios OVMF-pure-efi.fd \
    -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
    -device virtio-blk,drive=hd0 \
    -drive if=none,id=cloud,file=linux_cloud_config.img \
    -device virtio-blk,drive=cloud \
    -netdev user,id=user0,hostfwd=tcp::2223-:22 \
    -device virtio-net,netdev=user0

Here's the guest kernel console output:

[ 2834.005449] BUG: unable to handle page fault for address: 00007fffffffc2c0
[ 2834.009895] #PF: supervisor read access in user mode
[ 2834.013872] #PF: error_code(0x0001) - permissions violation
[ 2834.018025] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
[ 2834.022242] LDTR: NULL
[ 2834.026306] TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
[ 2834.030395] PGD 80000000360d0067 P4D 80000000360d0067 PUD 36105067 PMD 36193067 PTE 8000000076d8e867
[ 2834.038672] Oops: 0001 [#4] SMP PTI
[ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G D 5.3.0-29-generic #31-Ubuntu
[ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 2834.054785] RIP: 0033:0x1555547eaeda
[ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
[ 2834.072103] RSP: 002b:00007fffffffc2c0 EFLAGS: 00000202
[ 2834.076507] RAX: 0000000000000000 RBX: 00001554b401af38 RCX: 0000000000000001
[ 2834.080832] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fffffffcfb0
[ 2834.085010] RBP: 00007fffffffd730 R08: 0000000000000000 R09: 00007fffffffd1b0
[ 2834.089184] R10: 0000155555331dd5 R11: 00001555553ad8d0 R12: 0000000000000002
[ 2834.093350] R13: 0000000000000001 R14: 0000000000000001 R15: 00001554b401d388
[ 2834.097309] FS: 0000155554fa5740 GS: 0000000000000000
[ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
[ 2834.122539] CR2: 00007fffffffc2c0
[ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
[ 2834.131239] RIP: 0033:0x14d793262eda
[ 2834.135715] Code: Bad RIP value.
[ 2834.140243] RSP: 002b:00007ffddb4e2980 EFLAGS: 00000202
[ 2834.144615] RAX: 0000000000000000 RBX: 000014d6f402acb8 RCX: 0000000000000002
[ 2834.148943] RDX: 0000000001cd6950 RSI: 0000000000000000 RDI: 00007ffddb4e3670
[ 2834.153335] RBP: 00007ffddb4e3df0 R08: 0000000000000001 R09: 00007ffddb4e3870
[ 2834.157774] R10: 000014d793da9dd5 R11: 000014d793e258d0 R12: 0000000000000002
[ 2834.162132] R13: 0000000000000001 R14: 0000000000000001 R15: 000014d6f402d040
[ 2834.166239] FS: 0000155554fa5740(0000) GS:ffff97213ba00000(0000) knlGS:0000000000000000
[ 2834.170529] CS: 0033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2834.174751] CR2: 000014d793262eb0 CR3: 0000000036130000 CR4: 00000000007406f0
[ 2834.178892] PKRU: 55555554

I run the application from a shell with `ulimit -s unlimited` (unlimited stack size).
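As a reading aid for the oops above, here is a minimal Python sketch that decodes the page-fault error code (0x0001) and the low flag bits of the reported PTE (8000000076d8e867). The function names decode_pf_error and decode_pte are made up for illustration and are not part of any kernel or qemu tooling; the bit positions are the architectural x86-64 ones as I understand them.

```python
def decode_pf_error(code):
    """Decode an x86-64 page-fault error code into named flags."""
    return {
        "protection_violation": bool(code & 0x01),  # 0 would mean page not present
        "write": bool(code & 0x02),                 # 0 means the access was a read
        "user": bool(code & 0x04),                  # 0 means a supervisor access
        "reserved_bit": bool(code & 0x08),
        "instruction_fetch": bool(code & 0x10),
        "protection_key": bool(code & 0x20),
    }


def decode_pte(pte):
    """Decode a few architectural flag bits of an x86-64 page table entry."""
    return {
        "present": bool(pte & (1 << 0)),
        "writable": bool(pte & (1 << 1)),
        "user": bool(pte & (1 << 2)),
        "accessed": bool(pte & (1 << 5)),
        "dirty": bool(pte & (1 << 6)),
        "no_execute": bool(pte & (1 << 63)),
    }


if __name__ == "__main__":
    # Values copied from the guest kernel console output in this report.
    print(decode_pf_error(0x0001))
    print(decode_pte(0x8000000076D8E867))
```

With the values from the log, the error code decodes to a supervisor read that hit a protection violation, which matches the two "#PF:" lines the kernel printed. The PTE decodes as present, writable, user-accessible and no-execute, so the page looks like an ordinary user stack page.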
The application creates a number of threads, and those threads make a lot of calls to sigaltstack() and mprotect(); see the relevant source for dotnet here:
https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

Using strace -f on the app shows that no alt stacks come anywhere near the failing address; all alt stacks are in the heap, as expected. None of the mmap/mprotect/munmap syscalls were given arguments in the high memory 0x7fffffff0000 and up.

gdb (with default signal stop/print/pass semantics) does not report any signals prior to the kernel bug being tripped, so I doubt the alternate signal stack is actually used.

When I run the same dotnet binary on the host (e.g., on "bare metal"), the host kernel seems happy and dotnet runs as expected.

I have not tried a different qemu, guest, or host OS.

** Affects: qemu
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1866892

Title:
  guest OS catches a page fault bug when running dotnet

Status in QEMU:
  New

Bug description:
  The linux guest OS catches a page fault bug when running the dotnet application.
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1866892/+subscriptions