Re: [PATCH 00/10] target/i386/tcg: fixes for seg_helper.c
I have only skimmed the diffs. Your knowledge of the deep semantics, gained by close differential reading of intel and amd docs, is truly amazing. Many thanks for pushing this through! I have 2 nits, perhaps stylistic only. For code like "sp -= 2" or "sp += 2" followed or preceded by a write to the stack pointer of a uint16_t variable 'x', would it be better/more robust to rewrite as: "sp -= sizeof(x)" ? There are a lot of masks constructed using -1. I think it would be clearer to use 0x (for 32-bit masks) as that reminds the reader that this is a bit mask. But it seems that using -1 is how the original code was written. On Tue, Jul 9, 2024 at 11:29 PM Paolo Bonzini wrote: > This includes bugfixes: > - allowing IRET from user mode to user mode with SMAP (do not use implicit > kernel accesses, which break if the stack is in userspace) > > - use DPL-level accesses for interrupts and call gates > > - various fixes for task switching > > And two related cleanups: computing MMU index once for far calls and > returns > (including task switches), and using X86Access for TSS access. > > Tested with a really ugly patch to kvm-unit-tests, included after > signature. 
> > Paolo Bonzini (7): > target/i386/tcg: Allow IRET from user mode to user mode with SMAP > target/i386/tcg: use PUSHL/PUSHW for error code > target/i386/tcg: Compute MMU index once > target/i386/tcg: Use DPL-level accesses for interrupts and call gates > target/i386/tcg: check for correct busy state before switching to a > new task > target/i386/tcg: use X86Access for TSS access > target/i386/tcg: save current task state before loading new one > > Richard Henderson (3): > target/i386/tcg: Remove SEG_ADDL > target/i386/tcg: Reorg push/pop within seg_helper.c > target/i386/tcg: Introduce x86_mmu_index_{kernel_,}pl > > target/i386/cpu.h| 11 +- > target/i386/cpu.c| 27 +- > target/i386/tcg/seg_helper.c | 606 +++ > 3 files changed, 354 insertions(+), 290 deletions(-) > > -- > 2.45.2 > > diff --git a/lib/x86/usermode.c b/lib/x86/usermode.c > index c3ec0ad7..0bf40c6d 100644 > --- a/lib/x86/usermode.c > +++ b/lib/x86/usermode.c > @@ -5,13 +5,15 @@ > #include "x86/desc.h" > #include "x86/isr.h" > #include "alloc.h" > +#include "alloc_page.h" > #include "setjmp.h" > #include "usermode.h" > > #include "libcflat.h" > #include > > -#define USERMODE_STACK_SIZE0x2000 > +#define USERMODE_STACK_ORDER 1 /* 8k */ > +#define USERMODE_STACK_SIZE(1 << (12 + USERMODE_STACK_ORDER)) > #define RET_TO_KERNEL_IRQ 0x20 > > static jmp_buf jmpbuf; > @@ -37,9 +39,14 @@ uint64_t run_in_user(usermode_func func, unsigned int > fault_vector, > { > extern char ret_to_kernel; > volatile uint64_t rax = 0; > - static unsigned char user_stack[USERMODE_STACK_SIZE]; > + static unsigned char *user_stack; > handler old_ex; > > + if (!user_stack) { > + user_stack = alloc_pages(USERMODE_STACK_ORDER); > + printf("%p\n", user_stack); > + } > + > *raised_vector = 0; > set_idt_entry(RET_TO_KERNEL_IRQ, &ret_to_kernel, 3); > old_ex = handle_exception(fault_vector, > @@ -51,6 +58,8 @@ uint64_t run_in_user(usermode_func func, unsigned int > fault_vector, > return 0; > } > > + memcpy(user_stack + USERMODE_STACK_SIZE - 
8, &func, 8); > + > asm volatile ( > /* Prepare kernel SP for exception handlers */ > "mov %%rsp, %[rsp0]\n\t" > @@ -63,12 +72,13 @@ uint64_t run_in_user(usermode_func func, unsigned int > fault_vector, > "pushq %[user_stack_top]\n\t" > "pushfq\n\t" > "pushq %[user_cs]\n\t" > - "lea user_mode(%%rip), %%rax\n\t" > + "lea user_mode+0x80(%%rip), %%rax\n\t" // > smap.flat places usermode addresses at 8MB-16MB > "pushq %%rax\n\t" > "iretq\n" > > "user_mode:\n\t" > /* Back up volatile registers before invoking func > */ > + "pop %%rax\n\t" > "push %%rcx\n\t" > "push %%rdx\n\t" > "push %%rdi\n\t" > @@ -78,11 +88,12 @@ uint64_t run_in_user(usermode_func func, unsigned int > fault_vector, > "push %%r10\n\t" > "push %%r11\n\t" > /* Call user mode function */ > + "add $0x80,%%rbp\n\t" > "mov %[arg1], %%rdi\n\t" > "mov %[arg2], %%rsi\n\t" > "mov %[arg3], %%rdx\n\t" > "mov %[arg4], %%rcx\n\t" > - "call *%[func]\n\t" > + "call *%%rax\n\t" >
Re: [PATCH 1/1] i386/tcg: Allow IRET from user mode to user mode for dotnet runtime
I do not think I will have the time or focus to work on improving this patch this summer, as I will retire in 2 weeks and need to make a clean break to focus on other things (health, for one) for a while. If anyone wants to put into place Richard's ideas, I will not be offended! I do not see any of this chatter in this email thread on the bug report https://gitlab.com/qemu-project/qemu/-/issues/249 Robert Henry On Sat, Jun 15, 2024 at 4:25 PM Richard Henderson < richard.hender...@linaro.org> wrote: > On 6/11/24 09:20, Robert R. Henry wrote: > > This fixes a bug wherein i386/tcg assumed an interrupt return using > > the IRET instruction was always returning from kernel mode to either > > kernel mode or user mode. This assumption is violated when IRET is used > > as a clever way to restore thread state, as for example in the dotnet > > runtime. There, IRET returns from user mode to user mode. > > > > This bug manifested itself as a page fault in the guest Linux kernel. > > > > This bug appears to have been in QEMU since the beginning. > > > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/249 > > Signed-off-by: Robert R. 
Henry > > --- > > target/i386/tcg/seg_helper.c | 78 ++-- > > 1 file changed, 47 insertions(+), 31 deletions(-) > > > > diff --git a/target/i386/tcg/seg_helper.c b/target/i386/tcg/seg_helper.c > > index 715db1f232..815d26e61d 100644 > > --- a/target/i386/tcg/seg_helper.c > > +++ b/target/i386/tcg/seg_helper.c > > @@ -843,20 +843,35 @@ static void do_interrupt_protected(CPUX86State > *env, int intno, int is_int, > > > > #ifdef TARGET_X86_64 > > > > -#define PUSHQ_RA(sp, val, ra) \ > > -{ \ > > -sp -= 8;\ > > -cpu_stq_kernel_ra(env, sp, (val), ra); \ > > -} > > - > > -#define POPQ_RA(sp, val, ra)\ > > -{ \ > > -val = cpu_ldq_kernel_ra(env, sp, ra); \ > > -sp += 8;\ > > -} > > +#define PUSHQ_RA(sp, val, ra, cpl, dpl) \ > > + FUNC_PUSHQ_RA(env, &sp, val, ra, cpl, dpl) > > + > > +static inline void FUNC_PUSHQ_RA( > > +CPUX86State *env, target_ulong *sp, > > +target_ulong val, target_ulong ra, int cpl, int dpl) { > > + *sp -= 8; > > + if (dpl == 0) { > > +cpu_stq_kernel_ra(env, *sp, val, ra); > > + } else { > > +cpu_stq_data_ra(env, *sp, val, ra); > > + } > > +} > > This doesn't seem quite right. > > I would be much happier if we were to resolve the proper mmu index > earlier, once, rather > than within each call to cpu_{ld,st}*_{kernel,data}_ra. With the mmu > index in hand, use > cpu_{ld,st}*_mmuidx_ra instead. > > I believe you will want to factor out a subroutine of x86_cpu_mmu_index > which passes in > the pl, rather than reading cpl from env->hflags. This will also allow > cpu_mmu_index_kernel to be eliminated or simplified, which is written to > assume pl=0. > > > r~ >
Re: [EXTERNAL] Plugins Not Reporting AArch64 SVE Memory Operations
I have not done anything on this front, alas. From: Aaron Lindsay Sent: Thursday, March 24, 2022 1:17 PM To: qemu-devel@nongnu.org ; qemu-...@nongnu.org Cc: Alex Bennée ; richard.hender...@linaro.org ; Robert Henry Subject: [EXTERNAL] Plugins Not Reporting AArch64 SVE Memory Operations Hi folks, I see there has been some previous discussion [1] about 1.5 years ago around the fact that AArch64 SVE instructions do not emit any memory operations via the plugin interface, as one might expect them to. I am interested in being able to more accurately trace the memory operations of SVE instructions using the plugin interface - has there been any further discussion or work on this topic off-list (or that escaped my searching)? In the previous discussion [1], Richard raised some interesting questions: > The plugin interface needs extension for this. How should I signal that 256 > consecutive byte loads have occurred? How should I signal that the > controlling > predicate was not all true, so only 250 of those 256 were actually active? > How > should I signal 59 non-consecutive (gather) loads have occurred? > > If the answer is simply that you want 256 or 250 or 59 plugin callbacks > respectively, then we might be able to force the memory operations into the > slow path, and hook the operation there. As if it were an i/o operation. My initial reaction is that simply sending individual callbacks for each access (only the ones which were active, in the case of predication) seems to fit reasonably well with the existing plugin interface. For instance, I think we already receive two callbacks for each AArch64 `LDP` instruction, right? If this is an agreeable solution that wouldn't take too much effort to implement (and no one else is doing it), would someone mind pointing me in the right direction to get started? Thanks! 
-Aaron [1] https://lists.nongnu.org/archive/html/qemu-discuss/2020-12/msg00015.html
Re: [EXTERNAL] Re: Range of vcpu_index to plugin callbacks
Yes, the value of the cpu_index seems to track the number of clone() syscalls. (I did strace the process, but forgot to look for clone() syscalls... I still have vfork()/execve() on the brain.) From: Qemu-discuss on behalf of Philippe Mathieu-Daudé Sent: Sunday, September 19, 2021 10:54 AM To: rrh.henry ; qemu-disc...@nongnu.org Cc: Alex Bennée ; qemu-devel Subject: [EXTERNAL] Re: Range of vcpu_index to plugin callbacks (Cc'ing qemu-devel@ mailing list since this is a development question). On 9/19/21 19:44, Robert Henry wrote: > What is the range of the values for vcpu_index given to callbacks, such as: > > typedef void (*qemu_plugin_vcpu_udata_cb_t)(unsigned int vcpu_index, > void *userdata); > > Empirically, when QEMU is in system mode, the maximum vcpu_index is 1 > less than the -smp cpus=$(NCPUS) value. > > Empirically, when QEMU is in user mode, the values for vcpu_index slowly > increase without an apparent upper bound known statically (or when the > plugin is loaded?). Isn't it related to clone() calls? I'd expect new threads use a new vCPU, incrementing vcpu_index. But that is just a guess without having looked at the code to corroborate... Regards, Phil.
Re: [EXTERNAL] Re: ia-32/ia-64 fxsave64 instruction behavior when saving mmx
On 7/31/20 1:34 PM, Eduardo Habkost wrote: > On Mon, Jun 01, 2020 at 08:19:51AM +0200, Philippe Mathieu-Daudé wrote: >> Hi Robert. >> >> Top-posting is difficult to read on technical lists, >> it's better to reply inline. >> >> Cc'ing the X86 FPU maintainers: >> >> ./scripts/get_maintainer.pl -f target/i386/fpu_helper.c >> Paolo Bonzini (maintainer:X86 TCG CPUs) >> Richard Henderson (maintainer:X86 TCG CPUs) >> Eduardo Habkost (maintainer:X86 TCG CPUs) >> >> On 6/1/20 1:22 AM, Robert Henry wrote: >>> Here's additional information. >>> >>> All of the remill tests of the legacy MMX instructions fail. These >>> instructions work on 64-bit registers aliased with the lower 64-bits of >>> the x87 fp80 registers. The tests fail because remill expects the >>> fxsave64 instruction to deliver 16 bits of 1's (infinity or nan prefix) >>> in the fp80 exponent, eg bits 79:64. Metal does this, but QEMU does not. >> Metal is what matters, QEMU should emulate it when possible. >> >>> Reading of Intel Software development manual, table 3.44 >>> (https://www.felixcloutier.com/x86/FXSAVE.html#tbl-3-44) >>> says these 16 >>> bits are reserved, but another version of the manual >>> (http://math-atlas.sourceforge.net/devel/arch/ia32_arch.pdf) >>> section >>> 9.6.2 "Transitions between x87 fpu and mmx code" says a write to an MMX >>> register sets those 16 bits to all 1s. >> You are [1] here answering [2] you asked below. 
>> >>> In digging through the code for the implementation of the SSE/mmx >>> instruction pavgb I see a nice clean implementation in the SSE_HELPER_B >>> macro which takes a MMXREG which is an MMREG_UNION which does not >>> provide, to the extent that I can figure this out, a handle to bits >>> 79:64 of the aliased-with x87 register. >>> >>> I find it hard to believe that an apparent bug like this has been here >>> "forever". Am I missing something? >> Likely the developer who implemented this code didn't have all the >> information you found, nor the test-suite, and eventually not even the >> hardware to compare. >> >> Since you have a good understanding of Intel FPU and hardware to >> compare, do you mind sending a patch to have QEMU emulate the correct >> hardware behavior? >> >> If possible add a test case to tests/tcg/i386/test-i386.c (see >> test_fxsave there). > Was this issue addressed, or does it remain unfixed? I remember > seeing x86 FPU patches merged recently, but I don't know if they > were related to this. > I haven't done anything to address this issue. >>> Robert Henry >>> >>> *From:* Robert Henry >>> *Sent:* Friday, May 29, 2020 10:38 AM >>> *To:* qemu-devel@nongnu.org >>> *Subject:* ia-32/ia-64 fxsave64 instruction behavior when saving mmx >>> >>> Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy >>> SSE mmx registers. The mmx registers are saved as if they were fp80 >>> values. 
>> >>> The lower 64 bits of the constructed fp80 value is the mmx >>> register. The upper 16 bits of the constructed fp80 value are reserved; >>> see the last row of table 3-44 >>> of >>> https://www.felixcloutier.com/x86/fxsave#tbl-3-44 >>> >>> The Intel core i9-9980XE Skylake metal I have puts 0x into these >>> reserved 16 bits when saving MMX. >>> >>> QEMU appears to put 0's there. >>> >>> Does anybody have insight as to what "reserved" really means, or must >>> be, in this case? >> You self-answered to this [2] in [1] earlier. >> >>> I take the verb "reserved" to mean something other >>> than "undefined". >>> >>> I came across this issue when running the remill instruction test >>> engine. See my >>> issue >>> https://github.com/lifting-bits/remill/issues/423 >>> For better or >>> worse, remill assumes that those bits are 0x, not 0x >> Regards, >> >> Phil. >>
Failure of test 'basic gdbstub support'
The newish test 'basic gdbstub support' fails for me on an out-of-the-box build on a host x86_64. (See below for the config.log head.) Is this failure expected? If so, where can I see that in the various CI engines you have running? In digging through the test driver python code in tests/tcg/multiarch/gdbstub/sha1.py, I see that the test assumes that a breakpoint on the function SHA1Init is a breakpoint at the 1st assignment statement; the 1st next executes the 1st assignment statement, etc. This is a very fragile assumption: it depends on the compiler used to compile sha1.c; it depends on the optimization level; it depends on the accuracy of the pc mapping in the debug info; it depends on gdb. Better would be to change SHA1Init to do its work, and then call another non-inlined function taking a context pointer, and then examine context->state[0] and context->state[1]. Thanks in advance

TEST    basic gdbstub support
make[2]: *** [/mnt/robhenry/qemu_robhenry_amd64/qemu/tests/tcg/multiarch/Makefile.target:51: run-gdbstub-sha1] Error 2

QEMU configure log Tue 09 Jun 2020 02:45:06 PM PDT
# Configured with: '../configure' '--disable-sdl' '--enable-gtk' '--extra-ldflags=-L/usr/lib' '--enable-plugins' '--target-list=x86_64-softmmu x86_64-linux-user'
Re: ia-32/ia-64 fxsave64 instruction behavior when saving mmx
Here's additional information. All of the remill tests of the legacy MMX instructions fail. These instructions work on 64-bit registers aliased with the lower 64-bits of the x87 fp80 registers. The tests fail because remill expects the fxsave64 instruction to deliver 16 bits of 1's (infinity or nan prefix) in the fp80 exponent, eg bits 79:64. Metal does this, but QEMU does not. Reading of Intel Software development manual, table 3.44 (https://www.felixcloutier.com/x86/FXSAVE.html#tbl-3-44) says these 16 bits are reserved, but another version of the manual (http://math-atlas.sourceforge.net/devel/arch/ia32_arch.pdf) section 9.6.2 "Transitions between x87 fpu and mmx code" says a write to an MMX register sets those 16 bits to all 1s. In digging through the code for the implementation of the SSE/mmx instruction pavgb I see a nice clean implementation in the SSE_HELPER_B macro which takes a MMXREG which is an MMREG_UNION which does not provide, to the extent that I can figure this out, a handle to bits 79:64 of the aliased-with x87 register. I find it hard to believe that an apparent bug like this has been here "forever". Am I missing something? Robert Henry ________ From: Robert Henry Sent: Friday, May 29, 2020 10:38 AM To: qemu-devel@nongnu.org Subject: ia-32/ia-64 fxsave64 instruction behavior when saving mmx Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy SSE mmx registers. The mmx registers are saved as if they were fp80 values. The lower 64 bits of the constructed fp80 value is the mmx register. The upper 16 bits of the constructed fp80 value are reserved; see the last row of table 3-44 of https://www.felixcloutier.com/x86/fxsave#tbl-3-44 The Intel core i9-9980XE Skylake metal I have puts 0x into these reserved 16 bits when saving MMX. QEMU appears to put 0's there. Does anybody have insight as to what "reserved" really means, or must be, in this case? I take the verb "reserved" to mean something other than "undefined". 
I came across this issue when running the remill instruction test engine. See my issue https://github.com/lifting-bits/remill/issues/423 For better or worse, remill assumes that those bits are 0x, not 0x
ia-32/ia-64 fxsave64 instruction behavior when saving mmx
Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy SSE mmx registers. The mmx registers are saved as if they were fp80 values. The lower 64 bits of the constructed fp80 value is the mmx register. The upper 16 bits of the constructed fp80 value are reserved; see the last row of table 3-44 of https://www.felixcloutier.com/x86/fxsave#tbl-3-44 The Intel core i9-9980XE Skylake metal I have puts 0x into these reserved 16 bits when saving MMX. QEMU appears to put 0's there. Does anybody have insight as to what "reserved" really means, or must be, in this case? I take the verb "reserved" to mean something other than "undefined". I came across this issue when running the remill instruction test engine. See my issue https://github.com/lifting-bits/remill/issues/423 For better or worse, remill assumes that those bits are 0x, not 0x
qemu plugin exposure of register addresses
There is now a qemu plugin interface function qemu_plugin_register_vcpu_mem_cb which registers a plugin-side callback. This callback is later invoked at the start of each emulated instruction, and it receives information about memory addresses and read/write indicators. I'm wondering how hard it is to add a similar callback to expose register addresses and read/write indicators. For example, executing `add r3, r1, $1` would generate two callbacks, one {write r3} and the other {read r1}. I'd like this for all kinds of registers such as simd regs, and, gulp, flags registers. With this information ISA simulators could examine the data flow graph and register dependencies. I'm not asking for register contents; we don't get memory contents either! I gather there is some concern about exposing too much functionality to the plugin API, as a plugin might then be used to subvert some aspects of the GPL. I don't understand the details of this concern, nor know where the "line in the sand" is. Robert Henry
[Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
The change should only be dynamically visible when doing an iretq from and to the same protection level, AFAICT. The code clearly[sic] works now for the interrupt return that is used by the linux kernel, presumably {from=kernel, to=kernel} or {from=kernel, to=user}. I would claim that to make this code correct it needs to work for all 4*4 state changes, as windows (notoriously) uses all 4 protection levels. I'm still digging, but your suggestion is certainly on the path forward. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1866892 Title: guest OS catches a page fault bug when running dotnet Status in QEMU: New Bug description: The linux guest OS catches a page fault bug when running the dotnet application. host = metal = x86_64 host OS = ubuntu 19.10 qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master guest emulation = x86_64 guest OS = ubuntu 19.10 guest app = dotnet, running any program qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10, 2020) qemu invocation is: qemu/build/x86_64-softmmu/qemu-system-x86_64 \ -m size=4096 \ -smp cpus=1 \ -machine type=pc-i440fx-5.0,accel=tcg \ -cpu Skylake-Server-v1 \ -nographic \ -bios OVMF-pure-efi.fd \ -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \ -device virtio-blk,drive=hd0 \ -drive if=none,id=cloud,file=linux_cloud_config.img \ -device virtio-blk,drive=cloud \ -netdev user,id=user0,hostfwd=tcp::2223-:22 \ -device virtio-net,netdev=user0 Here's the guest kernel console output: [ 2834.005449] BUG: unable to handle page fault for address: 7fffc2c0 [ 2834.009895] #PF: supervisor read access in user mode [ 2834.013872] #PF: error_code(0x0001) - permissions violation [ 2834.018025] IDT: 0xfe00 (limit=0xfff) GDT: 0xfe001000 (limit=0x7f) [ 2834.022242] LDTR: NULL [ 2834.026306] TR: 0x40 -- base=0xfe003000 limit=0x206f [ 2834.030395] PGD 8000360d0067 
P4D 8000360d0067 PUD 36105067 PMD 36193067 PTE 800076d8e867 [ 2834.038672] Oops: 0001 [#4] SMP PTI [ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G D 5.3.0-29-generic #31-Ubuntu [ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 [ 2834.054785] RIP: 0033:0x147eaeda [ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00 [ 2834.072103] RSP: 002b:7fffc2c0 EFLAGS: 0202 [ 2834.076507] RAX: RBX: 1554b401af38 RCX: 0001 [ 2834.080832] RDX: RSI: RDI: 7fffcfb0 [ 2834.085010] RBP: 7fffd730 R08: R09: 7fffd1b0 [ 2834.089184] R10: 15331dd5 R11: 153ad8d0 R12: 0002 [ 2834.093350] R13: 0001 R14: 0001 R15: 1554b401d388 [ 2834.097309] FS: 14fa5740 GS: [ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy [ 2834.122539] CR2: 7fffc2c0 [ 2834.126867] ---[ end trace dfae51f1d9432708 ]--- [ 2834.131239] RIP: 0033:0x14d793262eda [ 2834.135715] Code: Bad RIP value. 
[ 2834.140243] RSP: 002b:7ffddb4e2980 EFLAGS: 0202 [ 2834.144615] RAX: RBX: 14d6f402acb8 RCX: 0002 [ 2834.148943] RDX: 01cd6950 RSI: RDI: 7ffddb4e3670 [ 2834.153335] RBP: 7ffddb4e3df0 R08: 0001 R09: 7ffddb4e3870 [ 2834.157774] R10: 14d793da9dd5 R11: 14d793e258d0 R12: 0002 [ 2834.162132] R13: 0001 R14: 0001 R15: 14d6f402d040 [ 2834.166239] FS: 14fa5740() GS:97213ba0() knlGS: [ 2834.170529] CS: 0033 DS: ES: CR0: 80050033 [ 2834.174751] CR2: 14d793262eb0 CR3: 3613 CR4: 007406f0 [ 2834.178892] PKRU: 5554 I run the application from a shell with `ulimit -s unlimited` (unlimited stack to size). The application creates a number of threads, and those threads make a lot of calls to sigaltstack() and mprotect(); see the relevant source for dotnet here https://github.com/dotnet/runtime/b
[Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
Peter: I think your intuition is right. The POPQ_RA (pop quad, passing through return address handle) is only called from helper_ret_protected, and it suspiciously calls cpu_ldq_kernel_ra which calls cpu_mmu_index_kernel which only is prepared for kernel space iretq (and of course the substring _kernel in the function name tells us that too).

https://bugs.launchpad.net/bugs/1866892
[Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
I've stepped/nexted from helper_iret_protected, going deep into the bowels of the TLB, MMU and page table engine, none of which I understand. helper_ret_protected faults in the first POPQ_RA. I'll investigate the value of sp at the time of the POPQ_RA. Here's the POPQ_RA in i386/seg_helper.c:2140:

    sp = env->regs[R_ESP];
    ssp = env->segs[R_SS].base;
    new_eflags = 0; /* avoid warning */
#ifdef TARGET_X86_64
    if (shift == 2) {
        POPQ_RA(sp, new_eip, retaddr);
        POPQ_RA(sp, new_cs, retaddr);
        new_cs &= 0x;
        if (is_iret) {
            POPQ_RA(sp, new_eflags, retaddr);
        }

and here's the stack. Note some of the logical intermediate frames are optimized out due to -O3 and inlining. (The value of env->error_code is 1.)

#0 0x55a370c0 in raise_interrupt2 (env=env@entry=0x566ef200, intno=14, is_int=is_int@entry=0, error_code=1, next_eip_addend=next_eip_addend@entry=0, retaddr=retaddr@entry=140736367565663) at /mnt/robhenry/qemu_robhenry_amd64/qemu/include/exec/cpu-all.h:426
#1 0x55a377f9 in raise_exception_err_ra (env=env@entry=0x566ef200, exception_index=, error_code=, retaddr=retaddr@entry=140736367565663) at /mnt/robhenry/qemu_robhenry_amd64/qemu/target/i386/excp_helper.c:127
#2 0x55a37d69 in x86_cpu_tlb_fill (cs=0x566e69a0, addr=140727872411616, size=, access_type=MMU_DATA_LOAD, mmu_idx=0, probe=, retaddr=140736367565663) at /mnt/robhenry/qemu_robhenry_amd64/qemu/target/i386/excp_helper.c:697
#3 0x55952295 in tlb_fill (cpu=0x566e69a0, addr=140727872411616, size=8, access_type=MMU_DATA_LOAD, mmu_idx=0, retaddr=140736367565663) at /mnt/robhenry/qemu_robhenry_amd64/qemu/accel/tcg/cputlb.c:1017
#4 0x55956320 in load_helper (full_load=0x55956140 , code_read=false, op=MO_64, retaddr=93825010692608, oi=48, addr=140727872411616, env=0x566ef200) at /mnt/robhenry/qemu_robhenry_amd64/qemu/include/exec/cpu-all.h:426
#5 0x55956320 in helper_le_ldq_mmu (env=env@entry=0x566ef200, addr=addr@entry=140727872411616, oi=oi@entry=48, retaddr=retaddr@entry=140736367565663) at 
/mnt/robhenry/qemu_robhenry_amd64/qemu/accel/tcg/cputlb.c:1688
#6 0x55956dc0 in cpu_load_helper (full_load=0x55956140 , op=MO_64, retaddr=140736367565663, mmu_idx=, addr=140727872411616, env=0x566ef200) at /mnt/robhenry/qemu_robhenry_amd64/qemu/accel/tcg/cputlb.c:1752
#7 0x55956dc0 in cpu_ldq_mmuidx_ra (env=env@entry=0x566ef200, addr=addr@entry=140727872411616, mmu_idx=, ra=ra@entry=140736367565663) at /mnt/robhenry/qemu_robhenry_amd64/qemu/accel/tcg/cputlb.c:1799
#8 0x55a4ff09 in helper_ret_protected (env=env@entry=0x566ef200, shift=shift@entry=2, is_iret=is_iret@entry=1, addend=addend@entry=0, retaddr=140736367565663) at /mnt/robhenry/qemu_robhenry_amd64/qemu/target/i386/seg_helper.c:2140
#9 0x55a50ff5 in helper_iret_protected (env=0x566ef200, shift=2, next_eip=-999377888) at /mnt/robhenry/qemu_robhenry_amd64/qemu/target/i386/seg_helper.c:2363
#10 0x7fffbd321b5f in code_gen_buffer ()

https://bugs.launchpad.net/bugs/1866892
[Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
yes, it is intentional. I don't yet understand why, but am talking to those who do. https://github.com/dotnet/runtime/blob/1b02665be501b695b9c22c1ebd69148c07a225f6/src/coreclr/src/pal/src/arch/amd64/context2.S#L183

-- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.

https://bugs.launchpad.net/bugs/1866892
Title: guest OS catches a page fault bug when running dotnet
Status in QEMU: New
[Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
I have confirmed that the dotnet guest application is executing a "iretq" instruction when this guest kernel bug is hit. A first round of analysis shows nothing unreasonable at the point the iretq is executed. The $rsp points into the middle of a mapped-in page, the returned-to $rip looks reasonable, etc. We continue our analysis of qemu and the dotnet runtime.

-- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.

https://bugs.launchpad.net/bugs/1866892
Title: guest OS catches a page fault bug when running dotnet
Status in QEMU: New
[Bug 1824344] Re: x86: retf or iret pagefault sets wrong error code
This appears to be similar to https://bugs.launchpad.net/qemu/+bug/1866892 (and much simpler).

-- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.

https://bugs.launchpad.net/bugs/1824344
Title: x86: retf or iret pagefault sets wrong error code
Status in QEMU: New

Bug description: With an x86_64 or i386 guest, non-KVM, when trying to execute an "iret/iretq/retf" instruction in userspace with an invalid stack pointer (under a protected mode OS, like Linux), wrong bits are set in the pushed error code; bit 2 is not set, indicating the error comes from kernel space. If the guest OS is using this flag to decide whether this was a kernel or user page fault, it will mistakenly decide a kernel has irrecoverably faulted, possibly causing guest OS panic.

How to reproduce the problem in a guest (non-KVM) Linux (note: on recent Linux kernel versions, this needs a CPU with SMAP support, e.g. -cpu max):

$ cat tst.c
int main()
{
    __asm__ volatile (
        "mov $0,%esp\n"
        "retf"
    );
    return 0;
}
$ gcc tst.c
$ ./a.out
Killed

"dmesg" shows the kernel has in fact triggered a "BUG: unable to handle kernel NULL pointer dereference...", but it has "recovered" by killing the faulting process (see attached screenshot).

Using self-compiled qemu from git: commit 532cc6da74ec25b5ba6893b5757c977d54582949 (HEAD -> master, tag: v4.0.0-rc3, origin/master, origin/HEAD) Author: Peter Maydell Date: Wed Apr 10 15:38:59 2019 +0100 Update version for v4.0.0-rc3 release Signed-off-by: Peter Maydell

To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1824344/+subscriptions
[Bug 1866892] Re: guest OS catches a page fault bug when running dotnet
A simpler case seems to produce the same error. See https://bugs.launchpad.net/qemu/+bug/1824344

-- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.

https://bugs.launchpad.net/bugs/1866892
Title: guest OS catches a page fault bug when running dotnet
Status in QEMU: New
[Bug 1866892] [NEW] guest OS catches a page fault bug when running dotnet
Public bug reported: The linux guest OS catches a page fault bug when running the dotnet application.

host = metal = x86_64
host OS = ubuntu 19.10
qemu emulation, without KVM, with "tiny code generator" tcg; no plugins; built from head/master
guest emulation = x86_64
guest OS = ubuntu 19.10
guest app = dotnet, running any program
qemu sha=7bc4d1980f95387c4cc921d7a066217ff4e42b70 (head/master Mar 10, 2020)

qemu invocation is:

qemu/build/x86_64-softmmu/qemu-system-x86_64 \
  -m size=4096 \
  -smp cpus=1 \
  -machine type=pc-i440fx-5.0,accel=tcg \
  -cpu Skylake-Server-v1 \
  -nographic \
  -bios OVMF-pure-efi.fd \
  -drive if=none,id=hd0,file=ubuntu-19.10-server-cloudimg-amd64.img \
  -device virtio-blk,drive=hd0 \
  -drive if=none,id=cloud,file=linux_cloud_config.img \
  -device virtio-blk,drive=cloud \
  -netdev user,id=user0,hostfwd=tcp::2223-:22 \
  -device virtio-net,netdev=user0

Here's the guest kernel console output:

[ 2834.005449] BUG: unable to handle page fault for address: 7fffc2c0
[ 2834.009895] #PF: supervisor read access in user mode
[ 2834.013872] #PF: error_code(0x0001) - permissions violation
[ 2834.018025] IDT: 0xfe00 (limit=0xfff) GDT: 0xfe001000 (limit=0x7f)
[ 2834.022242] LDTR: NULL
[ 2834.026306] TR: 0x40 -- base=0xfe003000 limit=0x206f
[ 2834.030395] PGD 8000360d0067 P4D 8000360d0067 PUD 36105067 PMD 36193067 PTE 800076d8e867
[ 2834.038672] Oops: 0001 [#4] SMP PTI
[ 2834.042707] CPU: 0 PID: 13537 Comm: dotnet Tainted: G D 5.3.0-29-generic #31-Ubuntu
[ 2834.050591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 2834.054785] RIP: 0033:0x147eaeda
[ 2834.059017] Code: d0 00 00 00 4c 8b a7 d8 00 00 00 4c 8b af e0 00 00 00 4c 8b b7 e8 00 00 00 4c 8b bf f0 00 00 00 48 8b bf b0 00 00 00 9d 74 02 <48> cf 48 8d 64 24 30 5d c3 90 cc c3 66 90 55 4c 8b a7 d8 00 00 00
[ 2834.072103] RSP: 002b:7fffc2c0 EFLAGS: 0202
[ 2834.076507] RAX: RBX: 1554b401af38 RCX: 0001
[ 2834.080832] RDX: RSI: RDI: 7fffcfb0
[ 2834.085010] RBP: 7fffd730 R08: R09: 7fffd1b0
[ 2834.089184] R10: 15331dd5 R11: 153ad8d0 R12: 0002
[ 2834.093350] R13: 0001 R14: 0001 R15: 1554b401d388
[ 2834.097309] FS: 14fa5740 GS:
[ 2834.101131] Modules linked in: isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ppdev input_leds serio_raw parport_pc parport sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_net psmouse net_failover failover virtio_blk floppy
[ 2834.122539] CR2: 7fffc2c0
[ 2834.126867] ---[ end trace dfae51f1d9432708 ]---
[ 2834.131239] RIP: 0033:0x14d793262eda
[ 2834.135715] Code: Bad RIP value.
[ 2834.140243] RSP: 002b:7ffddb4e2980 EFLAGS: 0202
[ 2834.144615] RAX: RBX: 14d6f402acb8 RCX: 0002
[ 2834.148943] RDX: 01cd6950 RSI: RDI: 7ffddb4e3670
[ 2834.153335] RBP: 7ffddb4e3df0 R08: 0001 R09: 7ffddb4e3870
[ 2834.157774] R10: 14d793da9dd5 R11: 14d793e258d0 R12: 0002
[ 2834.162132] R13: 0001 R14: 0001 R15: 14d6f402d040
[ 2834.166239] FS: 14fa5740() GS:97213ba0() knlGS:
[ 2834.170529] CS: 0033 DS: ES: CR0: 80050033
[ 2834.174751] CR2: 14d793262eb0 CR3: 3613 CR4: 007406f0
[ 2834.178892] PKRU: 5554

I run the application from a shell with `ulimit -s unlimited` (unlimited stack size).

The application creates a number of threads, and those threads make a lot of calls to sigaltstack() and mprotect(); see the relevant source for dotnet here https://github.com/dotnet/runtime/blob/15ec69e47b4dc56098e6058a11ccb6ae4d5d4fa1/src/coreclr/src/pal/src/thread/thread.cpp#L2467

Using strace -f on the app shows that no alt stacks come anywhere near the failing address; all alt stacks are in the heap, as expected. None of the mmap/mprotect/munmap syscalls were given arguments in the high memory 0x7fff and up.
gdb (with default signal stop/print/pass semantics) does not report any signals prior to the kernel bug being tripped, so I doubt the alternate signal stack is actually used. When I run the same dotnet binary on the host (eg, on "bare metal"), the host kernel seems happy and dotnet runs as expected. I have not tried different qemu or guest or host O/S.

** Affects: qemu
   Importance: Undecided
   Status: New

-- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
[Bug 1860610] Re: cap_disas_plugin leaks memory
I ran git blame in the capstone repository, and cs_free has been around for at least 4 years in the capstone ABI. I cannot tell whether the need to call cs_free is a (new) requirement. Capstone's documentation is a little informal...

-- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.

https://bugs.launchpad.net/bugs/1860610
Title: cap_disas_plugin leaks memory
Status in QEMU: New

To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1860610/+subscriptions
Re: [EXTERNAL] Re: QEMU for aarch64 with plugins seems to fail basic consistency checks
This proposed text sounds good. Better English is to say "concepts" rather than "conceptions".

My plugin currently allocates its own unique user data structure on every call to the instrumentation-time callback. This piece of user data captures the transient data presented from qemu. What's missing from the interface is a callback when qemu can guarantee that the run-time callback will never be called again with this piece of user data; at that point I would free my piece of user data. I'm not too worried about this memory leak yet: I ran ubuntu for 3 days on qemu+plugins and only observed a tolerable growth in qemu's memory consumption.

From: Alex Bennée
Sent: Friday, January 24, 2020 11:44 AM
To: Robert Henry
Cc: Laurent Desnogues; qemu-devel@nongnu.org
Subject: Re: [EXTERNAL] Re: QEMU for aarch64 with plugins seems to fail basic consistency checks

Robert Henry writes:
> I found at least one problem with my plugin.
>
> I was assuming that the insn data from
>     struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i);
> could be passed into qemu_plugin_register_vcpu_insn_exec_cb both as the 1st
> argument AND as the user data last argument. This assumed that insn would
> persist and be unique from when qemu_plugin_register_vcpu_insn_exec_cb was
> called to when the execution-time callback (vcpu_insn_exec_before) was called.
>
> This assumption is not true.
>
> I now capture pieces of *insn into my own persistent data structure, and pass
> that in as void *udata, my problems went away.
>
> I think this lack of persistence of insn should be documented, or
> treated as a bug to be fixed.

I thought I had done this but it turns out I only mentioned it for hwaddr:

/*
 * qemu_plugin_get_hwaddr():
 * @vaddr: the virtual address of the memory operation
 *
 * For system emulation returns a qemu_plugin_hwaddr handle to query
 * details about the actual physical address backing the virtual
 * address. For linux-user guests it just returns NULL.
 *
 * This handle is *only* valid for the duration of the callback. Any
 * information about the handle should be recovered before the
 * callback returns.
 */

But the concept of handle lifetime is true for all the handles. I propose something like this for the docs:

--8<---cut here---start--->8---
docs/devel: document query handle lifetimes

I forgot to document the lifetime of handles in the developer
documentation. Do so now.

Signed-off-by: Alex Bennée

 docs/devel/tcg-plugins.rst | 13 ++++++++++---
 1 file changed, 11 insertions(+), 2 deletions(-)

modified   docs/devel/tcg-plugins.rst
@@ -51,8 +51,17 @@ about how QEMU's translation works to the plugins. While there are
 conceptions such as translation time and translation blocks the
 details are opaque to plugins. The plugin is able to query select
 details of instructions and system configuration only through the
-exported *qemu_plugin* functions. The types used to describe
-instructions and events are opaque to the plugins themselves.
+exported *qemu_plugin* functions.
+
+Query Handle Lifetime
+---------------------
+
+Each callback provides an opaque anonymous information handle which
+can usually be further queried to find out information about a
+translation, instruction or operation. The handles themselves are only
+valid during the lifetime of the callback so it is important that any
+information that is needed is extracted during the callback and saved
+by the plugin.

 Usage
 =====
--8<---cut here---end--->8---

--
Alex Bennée
Re: [EXTERNAL] Re: QEMU for aarch64 with plugins seems to fail basic consistency checks
I found at least one problem with my plugin. I was assuming that the insn data from struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i); could be passed into qemu_plugin_register_vcpu_insn_exec_cb both as the 1st argument AND as the user data last argument. This assumed that insn would persist and be unique from when qemu_plugin_register_vcpu_insn_exec_cb was called to when the execution-time callback (vcpu_insn_exec_before) was called. This assumption is not true. I now capture pieces of *insn into my own persistent data structure, and pass that in as void *udata, my problems went away. I think this lack of persistence of insn should be documented, or treated as a bug to be fixed. From: Alex Bennée Sent: Friday, January 24, 2020 8:36 AM To: Robert Henry Cc: qemu-devel@nongnu.org Subject: [EXTERNAL] Re: QEMU for aarch64 with plugins seems to fail basic consistency checks Robert Henry writes: > I wrote a QEMU plugin for aarch64 where the insn and mem callbacks > print out the specifics of the guest instructions as they are > "executed". I expect this trace stream to be well behaved but it is > not. Can you post your plugin? It's hard to diagnose what might be wrong without the actual code. > By well-behaved, I expect memory insns print out some memory details, > non-memory insns don't print anything, and the pc only changes after a > control flow instruction. Exactly how are you tracking the PC? You should have the correct PC as you instrument each instruction. Are you saying qemu_plugin_insn_vaddr() doesn't report a different PC for each instrumented instruction in a block? > I don't see that gross correctness about 2% > of the time. > > > 1. I'm using qemu at tag v4.2.0 (or master head; it doesn't matter), > running on a x86_64 host. > 2. I build qemu using ./configure --disable-sdl --enable-gtk > --enable-plugins --enable-debug --target-list=aarch64-softmmu > aarch64-linux-user > 3. 
I execute qemu from its build area > build/aarch64-linux-user/qemu-aarch64, with flags --cpu cortex-a72 and the > appropriate args to --plugin ... -d plugin -D . > 4. I'm emulating a simple C program in linux emulation mode. > 5. The resulting qemu execution is valgrind clean (eg, I run qemu under > valgrind) for my little program save for memory leaks I reported a few days > ago. > > Below is an example of my trace output (the first int printed is the > cpu_index, checked to be always 0). Note that the ldr instruction at 0x41a608 > sometimes reports a memop, but most of the time it doesn't. Note that > 0x41a608 is seen, by trace, running back to back. Note that (bottom of trace) > that the movz instruction reports a memop. (The executed code comes from > glibc _dl_aux_init, executed before main() is called.) > > How should this problem be tackled? I can't figure out how to make each tcg > block be exactly 1 guest (aarch64) insn, which is where I'd first start out. > > 0 0x0041a784 0x0041a784 0xf1000c3f cmp x1, #3 > 0 0x0041a788 0x0041a788 0x54fff401 b.ne #0xfe80 > 0 0x0041a78c 0x0041a78c 0x52800033 movz w19, #0x1 > 0 0x0041a790 0x0041a790 0xf9400416 ldr x22, [x0, #8] 0 mem > {3 0 0 0} 0x004000800618 > 0 0x0041a794 0x0041a794 0x179d b #0xfe74 > 0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]! 0 > mem {3 0 0 0} 0x004000800620 > 0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44 > 0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]! > 0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44 > 0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]! > 0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]! > 0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44 > 0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]! > 0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]! > 0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44 > 0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]! 
> 0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44 > 0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]! > 0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44 > 0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]! 0 > mem {3 0 0 0} 0x004000800630 > 0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44 > 0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]! > 0 0x0041a60c 0x0041a60c 0xb
QEMU for aarch64 with plugins seems to fail basic consistency checks
I wrote a QEMU plugin for aarch64 where the insn and mem callbacks print out the specifics of the guest instructions as they are "executed". I expect this trace stream to be well behaved but it is not. By well-behaved, I expect memory insns print out some memory details, non-memory insns don't print anything, and the pc only changes after a control flow instruction. I don't see that gross correctness about 2% of the time.

1. I'm using qemu at tag v4.2.0 (or master head; it doesn't matter), running on an x86_64 host.
2. I build qemu using ./configure --disable-sdl --enable-gtk --enable-plugins --enable-debug --target-list=aarch64-softmmu aarch64-linux-user
3. I execute qemu from its build area build/aarch64-linux-user/qemu-aarch64, with flags --cpu cortex-a72 and the appropriate args to --plugin ... -d plugin -D .
4. I'm emulating a simple C program in linux emulation mode.
5. The resulting qemu execution is valgrind clean (eg, I run qemu under valgrind) for my little program save for memory leaks I reported a few days ago.

Below is an example of my trace output (the first int printed is the cpu_index, checked to be always 0). Note that the ldr instruction at 0x41a608 sometimes reports a memop, but most of the time it doesn't. Note that 0x41a608 is seen, by trace, running back to back. Note that (bottom of trace) the movz instruction reports a memop. (The executed code comes from glibc _dl_aux_init, executed before main() is called.)

How should this problem be tackled? I can't figure out how to make each tcg block be exactly 1 guest (aarch64) insn, which is where I'd first start out.

0 0x0041a784 0x0041a784 0xf1000c3f cmp x1, #3
0 0x0041a788 0x0041a788 0x54fff401 b.ne #0xfe80
0 0x0041a78c 0x0041a78c 0x52800033 movz w19, #0x1
0 0x0041a790 0x0041a790 0xf9400416 ldr x22, [x0, #8] 0 mem {3 0 0 0} 0x004000800618
0 0x0041a794 0x0041a794 0x179d b #0xfe74
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]! 0 mem {3 0 0 0} 0x004000800620
0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]!
0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]!
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]!
0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]!
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]!
0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]!
0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]!
0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]! 0 mem {3 0 0 0} 0x004000800630
0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]!
0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]!
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]!
0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]!
0 0x0041a608 0x0041a608 0xf8410c01 ldr x1, [x0, #0x10]!
0 0x0041a60c 0x0041a60c 0xb4000221 cbz x1, #0x44
0 0x0041a7d8 0x0041a7d8 0x52800035 movz w21, #0x1
0 0x0041a7dc 0x0041a7dc 0xf9400418 ldr x24, [x0, #8] 0 mem {3 0 0 0} 0x004000800638
0 0x0041a7e0 0x0041a7e0 0x178a b #0xfe28
0 0x0041a7d8 0x0041a7d8 0x52800035 movz w21, #0x1 0 mem {3 0 0 0} 0x004000800640
[Bug 1860610] [NEW] cap_disas_plugin leaks memory
Public bug reported: Looking at origin/master head, the function cap_disas_plugin leaks memory. Per capstone's examples using their ABI, cs_free(insn, count); needs to be called just before cs_close. I discovered this running qemu under valgrind.

** Affects: qemu
   Importance: Undecided
   Status: New

-- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.

https://bugs.launchpad.net/bugs/1860610
Title: cap_disas_plugin leaks memory
Status in QEMU: New

To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1860610/+subscriptions
plugin interface function qemu_plugin_mem_size_shift
I don't understand what unsigned int qemu_plugin_mem_size_shift(qemu_plugin_meminfo_t info); does. The documentation in qemu-plugin.h is silent on this matter. It appears to expose more of the guts of qemu than I yet know.
plugin order of registration and order of callback
The documentation on the new plugin capabilities of qemu is silent about the order in which callback registration should be done, and is also silent on the order in which callbacks are fired. Case in point: the callback registered by qemu_plugin_register_vcpu_mem_cb is called after the callback registered by qemu_plugin_register_vcpu_insn_exec_cb, regardless of the order of registration. However, I'd like to have the insn_exec_cb called after the mem_cb so that I can save the mem information to be consumed by the insn callback.