On Tue, 17 Jan 2023 at 19:21, Guenter Roeck <li...@roeck-us.net> wrote:
> Anyway - any idea what to do to help figuring out what is happening ?
> Add tracing support to pci interrupt handling, maybe ?

For intermittent bugs, I like recording the QEMU session under rr
(using its chaos mode to provoke the failure if necessary) to get a
recording that I can debug and re-debug at leisure. Usually you want
to turn on/add tracing to help with this, and if the failure doesn't
hit early in bootup then you might need to do a QEMU snapshot just
before point-of-failure so you can run rr only on the short
snapshot-to-failure segment.

https://translatedcode.wordpress.com/2015/05/30/tricks-for-debugging-qemu-rr/
https://translatedcode.wordpress.com/2015/07/06/tricks-for-debugging-qemu-savevm-snapshots/

This gives you a debugging session from the QEMU side's perspective,
of course -- assuming you know what the hardware is supposed to do
you hopefully wind up with either "the guest software did X,Y,Z and
we incorrectly did A" or else "the guest software did X,Y,Z, the spec
says A is the right/a permitted thing but the guest got confused".
If it's the latter then you have to look at the guest as a separate
code analysis/debug problem.

thanks
-- PMM