Re: gdbstub and gbd segfaults on different instructions in user space emulation

Peter Maydell Mon, 30 Sep 2019 09:24:36 -0700

On Mon, 30 Sep 2019 at 16:57, Libo Zhou <zhl...@foxmail.com> wrote:
> I am encountering segmentation fault while porting my custom ISA to QEMU. My 
> custom ISA is VERY VERY simple, it only changes the [31:26] opcode field of 
> LW and SW instructions. The link has my very simple implementation: 
> https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg06976.html


> I have tried 2 ways of debugging it.
> Firstly, I connected gdb-multiarch to gdbstub, and I single-stepped the 
> instructions in my ELF. Immediately after the LW instruction, the segfault 
> was thrown. I observed the memory location using 'x' command and found that 
> at least my SW instruction was implemented correctly.
> Secondly, I used gdb to directly debug QEMU. I set a breakpoint at the 
> function decode_opc in translate.c. Pressing 'c' should have the same effect as 
> single-stepping instruction in gdbstub. However, the segmentation fault 
> wasn't thrown after LW. It was instead thrown after the 'nop' after 'jr r31' 
> in the objdump.

(1) If you're debugging the QEMU JIT itself, then you're probably
better off using QEMU's logging facilities (under the -d option)
rather than the gdbstub. The gdbstub is good if you're sure that
QEMU is basically functional and want to debug your guest, but
if you suspect bugs in QEMU itself then it can confuse you.
The -d debug logging is at a much lower level, which makes it
a better guide to what QEMU is really doing, though it is also
trickier to interpret.

(2) No, breakpointing on decode_opc is not the same as singlestepping
an instruction in gdb. This is a really important concept in QEMU
(and JITs in general) and if you don't understand it you're going
to be very confused. A JIT has two phases:
 (a) "translate time", when we take a block of guest instructions
and generate host machine code for them
 (b) "execution time", when we execute one or more of the blocks
of host machine code that we wrote at translate time
QEMU calls the blocks it works with "translation blocks", and
usually it will put multiple guest instructions into each TB;
a TB usually stops after a guest branch instruction. (You can
ask QEMU to put just one guest instruction into a TB using
the -singlestep command line option -- this is sometimes useful
when debugging.)

So if you put a breakpoint on decode_opc you'll see it is hit
for every instruction in the TB, which for the TB starting at
"00400090 <main>" will be every instruction up to and including
the 'nop' in the delay slot of the 'jr'. Once the whole TB is
translated, *then* we will execute it. It's only at execute time
that we perform the actual operations on the guest CPU that
the instructions require. If the segfault is because we think
the guest has made a bad memory access, we'll generate it at
execution time. If the segfault is an actual crash in QEMU itself,
it will also show up at execution time, provided the bug is one
that is triggered when running the generated code rather than
when translating.
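
To make the ordering concrete, here is a minimal toy model of the two phases
in Python. It is only a sketch and not QEMU code: every name in it
(translate_block, execute_block, run, PROGRAM, VALID_ADDRS, GuestFault) is
made up for illustration, the "guest" is a four-instruction list rather than
real MIPS, and the 'jr' does not actually jump. The only things it tries to
show are that a whole TB (up to and including the delay slot) is translated
before any of it executes, and that a bad guest memory access is only
noticed at execution time.

```python
# Toy model only: none of these names are QEMU APIs.

class GuestFault(Exception):
    """Stands in for the segfault delivered for a bad guest memory access."""

VALID_ADDRS = {0x10}   # hypothetical set of mapped guest addresses
BRANCHES = {"jr"}      # instructions that end a TB (after their delay slot)

PROGRAM = [
    ("sw", 0x10),      # store to a mapped address: fine
    ("lw", 0xdead),    # load from an unmapped address: faults when executed
    ("jr", None),      # branch: the TB ends after its delay slot
    ("nop", None),     # delay slot
]

def translate_block(program, pc, log, one_insn_per_tb=False):
    """Translate time: decode guest instructions into a TB, touching no guest state."""
    tb = []
    in_delay_slot = False
    while pc < len(program):
        name, arg = program[pc]
        log.append(("translate", pc, name))  # where a decode breakpoint would fire
        tb.append((pc, name, arg))
        pc += 1
        if in_delay_slot:
            break                            # delay slot translated, TB is complete
        if name in BRANCHES:
            in_delay_slot = True             # keep going for exactly one more insn
        elif one_insn_per_tb:
            break                            # toy version of one instruction per TB
    return tb, pc

def execute_block(tb, memory, log):
    """Execution time: perform the operations; faults can only happen here."""
    for pc, name, arg in tb:
        log.append(("exec", pc, name))
        if name in ("lw", "sw") and arg not in VALID_ADDRS:
            log.append(("fault", pc, name))
            raise GuestFault(f"bad access at guest pc {pc}")
        if name == "sw":
            memory[arg] = 1

def run(program, one_insn_per_tb=False):
    """Alternate translate and execute until the program ends or faults."""
    log, memory, pc = [], {}, 0
    try:
        while pc < len(program):
            tb, pc = translate_block(program, pc, log, one_insn_per_tb)
            execute_block(tb, memory, log)
    except GuestFault:
        pass
    return log

if __name__ == "__main__":
    for event in run(PROGRAM):
        print(event)
```

In the default mode the log contains four "translate" events (including the
'nop' in the delay slot) before the first "exec" event, and the fault is
recorded while executing the 'lw'. Calling run(PROGRAM, one_insn_per_tb=True)
interleaves translate and exec events instead, which is closer to what
single-stepping looks like.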

Note that the -d logging will distinguish between things that
happen at translate time (which is when the in_asm, op, out_asm etc
logging is printed) and things that happen at execution time
(which is when cpu, exec, int, etc logs are printed).

thanks
-- PMM
