On 10.10.2022 18:32, Peter Maydell wrote:
On Mon, 10 Oct 2022 at 16:21, Jason A. Donenfeld <ja...@zx2c4.com> wrote:

On Mon, Oct 10, 2022 at 11:54:50AM +0100, Peter Maydell wrote:
The error is essentially the record-and-replay subsystem saying "the
replay just asked for a random number at a point when the recording
did not ask for one, and so there's no 'this is what the number was'
info in the record".

I have had a quick look, and I think the reason for this is that
load_snapshot() ("reset the VM state to the snapshot state stored in the
disk image or migration stream") does a system reset. The replay
process involves a lot of "load state from a snapshot and play
forwards from there" operations. It doesn't expect that load_snapshot()
would result in something reading random data, but now that we are
calling qemu_guest_getrandom() in a reset hook, that happens.

Hmm... so this seems like a bug in the replay code then? Shouldn't that
reset handler get hit during both passes, so the entry should be in
each?

No, because record is just
"reset the system, record all the way to the end stop",
but replay is
"set the system to the point we want to start at by using
load_snapshot, play from there", and depending on the actions
you do in the debugger like reverse-continue we might repeatedly
do "reload that snapshot (implying a system reset) and play from there"
multiple times.
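
To make the mismatch concrete, here is a toy model of the situation described above. It is only a sketch and not QEMU's replay code: the class ToyMachine, the helpers record_run and replay_from_snapshot, and the integer log positions are all hypothetical names invented for illustration. It assumes the record log is a flat list of random values and that a snapshot remembers how far into that log the recording had got.

```python
class ReplayLogExhausted(Exception):
    """Replay asked for an event that the recording never logged."""


class ToyMachine:
    def __init__(self, mode, log=None):
        self.mode = mode                      # "record" or "replay"
        self.log = [] if log is None else log
        self.pos = 0                          # replay cursor into the log

    def get_random(self):
        if self.mode == "record":
            value = 100 + len(self.log)       # deterministic stand-in for entropy
            self.log.append(value)
            return value
        if self.pos >= len(self.log):
            raise ReplayLogExhausted("no recorded random number at this point")
        value = self.log[self.pos]
        self.pos += 1
        return value

    def reset(self):
        # Reset hook that reads external data, like the fdt seed.
        self.seed = self.get_random()

    def load_snapshot(self, snapshot_pos):
        self.pos = snapshot_pos
        self.reset()                          # restoring a snapshot implies a reset


def record_run():
    m = ToyMachine("record")
    m.reset()          # boot: one reset, logged once
    m.get_random()     # guest reads one random number later on
    return m.log


def replay_from_snapshot(log, snapshot_pos):
    m = ToyMachine("replay", list(log))
    m.load_snapshot(snapshot_pos)  # extra reset consumes an entry meant for the guest
    return m.get_random()          # the guest's own read may now find nothing
```

In this model, replay_from_snapshot(record_run(), 0) behaves like the recording, because the only reset is the one at boot. With a snapshot taken after boot (position 1), the reset implied by load_snapshot reads a value that the recording logged for the guest, and the guest's own read then raises ReplayLogExhausted.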

The idea of the patches is fdt randomization during reset, right?
But reset is used not only for a real reboot, but also for restoring
snapshots. In the latter case it is more like "just clear the hw
registers to simplify the initialization". Until now, no other virtual
hardware has tried to read external data during reset. And random
numbers are external to the machine; they come from the outside world.

This means that this is a completely new reset case, and a new solution needs to be found for it.

Pavel Dovgalyuk
