On Tue, Oct 11, 2022 at 2:06 PM Jason A. Donenfeld <ja...@zx2c4.com> wrote:
>
> On Tue, Oct 11, 2022 at 09:46:01AM +0300, Pavel Dovgalyuk wrote:
> > On 10.10.2022 18:32, Peter Maydell wrote:
> > > On Mon, 10 Oct 2022 at 16:21, Jason A. Donenfeld <ja...@zx2c4.com> wrote:
> > >>
> > >> On Mon, Oct 10, 2022 at 11:54:50AM +0100, Peter Maydell wrote:
> > >>> The error is essentially the record-and-replay subsystem saying "the
> > > >>> replay just asked for a random number at a point when the recording
> > >>> did not ask for one, and so there's no 'this is what the number was'
> > >>> info in the record".
> > >>>
> > >>> I have had a quick look, and I think the reason for this is that
> > >>> load_snapshot() ("reset the VM state to the snapshot state stored in the
> > >>> disk image or migration stream") does a system reset. The replay
> > >>> process involves a lot of "load state from a snapshot and play
> > >>> forwards from there" operations. It doesn't expect that load_snapshot()
> > >>> would result in something reading random data, but now that we are
> > >>> calling qemu_guest_getrandom() in a reset hook, that happens.
> > >>
> > >> Hmm... so this seems like a bug in the replay code then? Shouldn't that
> > >> reset handler get hit during both passes, so the entry should be in
> > >> each?
> > >
> > > No, because record is just
> > > "reset the system, record all the way to the end stop",
> > > but replay is
> > > "set the system to the point we want to start at by using
> > > load_snapshot, play from there", and depending on the actions
> > > you do in the debugger like reverse-continue we might repeatedly
> > > do "reload that snapshot (implying a system reset) and play from there"
> > > multiple times.
> >
> > The idea of the patches is fdt randomization during reset, right?
> > But reset is used not only for real reboot, but also for restoring the
> > snapshots.
> > In the latter case it is like "just clear the hw registers to simplify
> > the initialization".
> > Therefore no other virtual hardware tried to read external data yet. And
> > random numbers are external to the machine, they come from the outer world.
> >
> > It means that this is a completely new reset case and a new solution should
> > be found for it.
>
> Do you have any proposals for that?

Okay I've actually read your message like 6 times now and think I may
have come up with something. Initial testing indicates it works well.
I'll send a new series shortly.

Jason
