On Wed, 2024-05-01 at 13:20 -0400, Stefan Berger wrote:
> 
> 
> On 5/1/24 12:52, James Bottomley wrote:
> > On Wed, 2024-05-01 at 12:31 -0400, Stefan Berger wrote:
> > > 
> > > 
> > > On 5/1/24 12:21, James Bottomley wrote:
> > > > On Tue, 2024-04-30 at 17:12 -0400, Stefan Berger wrote:
> > > > > On 4/30/24 15:08, James Bottomley wrote:
> > > > [...]
> > > > > > +The mssim backend supports snapshotting and migration by
> > > > > > not resetting
> > > > > 
> > > > > > I don't think snapshotting is supported because snapshotting
> > > > > would require you to be able to set the state of the vTPM
> > > > > from the snapshot you started. I would remove the claim.
> > > > 
> > > > I thought we established last time that it can definitely do
> > > > both (and I've tested it because you asked me to). 
> > > > Snapshotting and migration are essentially the same thing, with
> > > > snapshotting being easier because it can be done on the same
> > > > host meaning the same command line parameters.  If you migrate
> > > > to a different host you need the socket to point back to the
> > > > host serving the vTPM.
> > > > 
> > > > To do this easily you simply keep the vTPM running while the VM
> > > > is undergoing snapshot and migration.  If you're thinking of
> > > > an extended down time for the snapshot, then it's up to the
> > > > vTPM implementation to store the state (or simply keep it
> > > > running for an extended time doing nothing).
> > > 
> > > Which part of the code injects the state into the vTPM so that it
> > > resumes with the state of the TPM (PCRs, NVRAM indices, keys,
> > > sessions etc.) from when the snapshot was taken?
> > 
> > We've had this conversation before too:
> > 
> > https://lore.kernel.org/qemu-devel/f928986fd4095b1f27c83ede96f3b0dd65ad965e.ca...@linux.ibm.com/T/#u
> > 
> > But the synopsis is nothing does.  The design is to be entirely
> > independent of vTPM implementation: it will actually work with any
> > TPM obeying the simulator IP protocol (MS reference, ibmswtpm2 or
> > even your swtpm) but the price of this is that the user has to
> > preserve the vTPM state, by whatever means they deem appropriate,
> > independently of the VM snapshot image.
> 
> Unless your backend can retrieve the state upon snapshot save and
> inject state into the vTPM upon snapshot resume, 'snapshotting' is
> not working (correctly).

That's too narrow a definition.  Snapshot is working if you can capture
and restore the machine state.  A QEMU snapshot can helpfully capture
some device state, but not all of it (devices that are passed through,
like accelerators and AI units, are particularly problematic).  How you
get the external device state (IBM cloud has an elaborate scripted
state capture for GPUs, for instance) is up to the person implementing
the snapshot.  For this case, the added doc specifically warns "... the
state of the Microsoft Simulator server must be preserved (or the
server kept running) outside of QEMU for restore to be successful", so
I don't think there's going to be an expectation mismatch.

James


