On Tue, Oct 11, 2022 at 09:46:01AM +0300, Pavel Dovgalyuk wrote:
> On 10.10.2022 18:32, Peter Maydell wrote:
> > On Mon, 10 Oct 2022 at 16:21, Jason A. Donenfeld <ja...@zx2c4.com> wrote:
> >>
> >> On Mon, Oct 10, 2022 at 11:54:50AM +0100, Peter Maydell wrote:
> >>> The error is essentially the record-and-replay subsystem saying "the
> >>> replay just asked for a random number at point when the recording
> >>> did not ask for one, and so there's no 'this is what the number was'
> >>> info in the record".
> >>>
> >>> I have had a quick look, and I think the reason for this is that
> >>> load_snapshot() ("reset the VM state to the snapshot state stored in the
> >>> disk image or migration stream") does a system reset. The replay
> >>> process involves a lot of "load state from a snapshot and play
> >>> forwards from there" operations. It doesn't expect that load_snapshot()
> >>> would result in something reading random data, but now that we are
> >>> calling qemu_guest_getrandom() in a reset hook, that happens.
> >>
> >> Hmm... so this seems like a bug in the replay code then? Shouldn't that
> >> reset handler get hit during both passes, so the entry should be in
> >> each?
> > 
> > No, because record is just
> > "reset the system, record all the way to the end stop",
> > but replay is
> > "set the system to the point we want to start at by using
> > load_snapshot, play from there", and depending on the actions
> > you do in the debugger like reverse-continue we might repeatedly
> > do "reload that snapshot (implying a system reset) and play from there"
> > multiple times.
> 
> The idea of the patches is fdt randomization during reset, right?
> But reset is used not only for real reboot, but also for restoring the 
> snapshots.
> In the latter case it is like "just clear the hw registers to simplify 
> the initialization".
> Therefore no other virtual hardware tried to read external data yet. And 
> random numbers are external to the machine, they come from the outer world.
> 
> It means that this is completely new reset case and new solution should 
> be found for it.

Do you have any proposals for that?

Jason

Reply via email to