On Mon, 30 Sep 2019 at 16:57, Libo Zhou <zhl...@foxmail.com> wrote:
> I am encountering segmentation fault while porting my custom ISA to QEMU. My
> custom ISA is VERY VERY simple, it only changes the [31:26] opcode field of
> LW and SW instructions. The link has my very simple implementation:
> https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg06976.html
> I have tried 2 ways of debugging it.
> Firstly, I connected gdb-multiarch to gdbstub, and I single-stepped the
> instructions in my ELF. Immediately after the LW instruction, the segfault
> was thrown. I observed the memory location using 'x' command and found that
> at least my SW instruction was implemented correctly.
> Secondly, I used gdb to directly debug QEMU. I set the breakpoint at function
> in translate.c:decode_opc. Pressing 'c' should have the same effect as
> single-stepping instruction in gdbstub. However, the segmentation fault
> wasn't thrown after LW. It was instead thrown after the 'nop' after 'jr r31'
> in the objdump.

(1) If you're debugging the QEMU JIT itself, then you're probably
better off using QEMU's logging facilities (under the -d option)
rather than the gdbstub. The gdbstub is good if you're sure that
QEMU is basically functional and want to debug your guest, but
if you suspect bugs in QEMU itself then it can confuse you. The
-d debug logging is at a much lower level, which makes it a better
guide to what QEMU is really doing, though it is also trickier
to interpret.

(2) No, breakpointing on decode_opc is not the same as singlestepping
an instruction in gdb. This is a really important concept in QEMU
(and JITs in general) and if you don't understand it you're going
to be very confused.

A JIT has two phases:
 (a) "translate time", when we take a block of guest instructions
     and generate host machine code for them
 (b) "execution time", when we execute one or more of the blocks
     of host machine code that we wrote at translate time

QEMU calls the blocks it works with "translation blocks", and
usually it will put multiple guest instructions into each TB;
a TB usually stops after a guest branch instruction. (You can
ask QEMU to put just one guest instruction into a TB using the
-singlestep command line option -- this is sometimes useful when
debugging.)
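As a rough illustration of the two phases above, here is a toy model in
Python. It is not QEMU code: the instruction strings, decode_one,
translate_block and the event log are all made up for this sketch, and
Python closures stand in for generated host code. It only shows the
ordering: every instruction in the block is decoded first, and the guest
operations (including a faulting load) happen later, when the block runs.

```python
# Toy model of a JIT's two phases. Not QEMU code: the instruction names,
# decode_one/translate_block and the event log are invented for illustration.

def decode_one(insn, log):
    """Translate time: turn one guest instruction into a host callable."""
    log.append(("translate", insn))
    op, *args = insn.split()

    def host_code(regs, mem):
        # Execution time: only now is the guest operation performed.
        log.append(("execute", insn))
        if op == "li":          # li rd imm
            regs[args[0]] = int(args[1])
        elif op == "sw":        # sw rs addr
            mem[int(args[1])] = regs[args[0]]
        elif op == "lw":        # lw rd addr
            regs[args[0]] = mem[int(args[1])]   # KeyError models a bad access
        # "jr" and "nop" do nothing in this toy

    return host_code


def translate_block(guest_code, pc, log):
    """Translate from pc until just after a branch and its delay slot."""
    tb = []
    while pc < len(guest_code):
        insn = guest_code[pc]
        tb.append(decode_one(insn, log))
        pc += 1
        if insn.startswith("jr"):
            # MIPS-style: the delay-slot instruction is part of the same block.
            if pc < len(guest_code):
                tb.append(decode_one(guest_code[pc], log))
            break
    return tb


def run(guest_code):
    log, regs, mem = [], {}, {}
    tb = translate_block(guest_code, 0, log)   # phase (a): translate time
    try:
        for host_code in tb:                   # phase (b): execution time
            host_code(regs, mem)
    except KeyError:
        log.append(("fault", "bad memory access"))
    return log


if __name__ == "__main__":
    for event in run(["li r1 7", "sw r1 16", "lw r2 32", "jr r31", "nop"]):
        print(event)
```

In this toy, a breakpoint inside decode_one would be hit for all five
instructions before any "execute" event appears, and the fault is only
recorded once the block is actually run.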
So if you put a breakpoint on decode_opc you'll see it is hit for
every instruction in the TB, which for the TB starting at
"00400090 <main>" will be every instruction up to and including
the 'nop' in the delay slot of the 'jr'. Once the whole TB is
translated, *then* we will execute it. It's only at execute time
that we perform the actual operations on the guest CPU that the
instructions require. If the segfault is because we think the
guest has made a bad memory access, we'll generate it here. If
the segfault is an actual crash in QEMU itself, it will happen
here if the bug is one that happens at execution time.

Note that the -d logging will distinguish between things that
happen at translate time (which is when the in_asm, op, out_asm
etc logging is printed) and things that happen at execution time
(which is when cpu, exec, int, etc logs are printed).

thanks
-- PMM