On 12/12/19 8:22 AM, Stefan Berger wrote:
On 12/12/19 6:00 AM, Marc-André Lureau wrote:
Hi

On Wed, Dec 11, 2019 at 8:27 PM Stefan Berger
<stef...@linux.vnet.ibm.com> wrote:
Extend the tpm_spapr frontend with VM suspend and resume support.

Signed-off-by: Stefan Berger <stef...@linux.vnet.ibm.com>
---
  hw/tpm/tpm_spapr.c | 42 +++++++++++++++++++++++++++++++++++++++++-
  1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/hw/tpm/tpm_spapr.c b/hw/tpm/tpm_spapr.c
index c4a67e2403..d9153bd95c 100644
--- a/hw/tpm/tpm_spapr.c
+++ b/hw/tpm/tpm_spapr.c
@@ -87,6 +87,8 @@ typedef struct {
      TPMVersion be_tpm_version;

      size_t be_buffer_size;
+
+    bool deliver_response; /* whether to deliver response after VM resume */
  } SPAPRvTPMState;

  static void tpm_spapr_show_buffer(const unsigned char *buffer,
@@ -339,9 +341,47 @@ static enum TPMVersion tpm_spapr_get_version(TPMIf *ti)
      return tpm_backend_get_tpm_version(s->be_driver);
  }

+/* persistent state handling */
+
+static int tpm_spapr_pre_save(void *opaque)
+{
+    SPAPRvTPMState *s = opaque;
+
+    s->deliver_response = tpm_backend_finish_sync(s->be_driver);
+    /*
+     * we cannot deliver the results to the VM since DMA would touch VM memory
+     */
+
+    return 0;
+}
+
+static int tpm_spapr_post_load(void *opaque, int version_id)
+{
+    SPAPRvTPMState *s = opaque;
+
+    if (s->deliver_response) {
+        /* deliver the results to the VM via DMA */
+        tpm_spapr_request_completed(TPM_IF(s), 0);
Why isn't it enough to rely on tpm_spapr_request_completed callback
being called during pre-save when tpm_backend_finish_sync() is called?
(like tis & crb)


When .pre_save is called the VM memory has been fully replicated and only the devices need to save their state, right? So TIS and CRB save the response in memory of the device for the OS driver to pick up after resume. The SPAPR device model is expected to write the response into VM memory using DMA but memory won't be marked dirty anymore and replicated (afaik). So we may have the mechanism of having tpm_spapr_request_completed() invoked but in addition we need to re-deliver a response after resume so that the OS driver reads the proper response then. I'll investigate, though...


Suspend/resume works fine for as long as we don't catch a 'delayed response', meaning a response received while the devices suspend (and .pre_save is called). With this device the troubles are starting when having to deliver such a 'delayed response' because the whole tpm_spapr_request_completed() has to be deferred until after resume, otherwise the OS driver gets stuck. So, in tpm_spapr_request_completed() we need to be able to query the backend whether it is suspending, meaning whether its .pre_save() has been invoked and we received the response due to polling. If so, we must not do anything in this function at this time but do it in post_load and we need to remember this in a field 'deliver_response'. I have this field now but the reason why I set it was not correct, at least not for the tpm_emulator backend. In case we do get a delayed response while in the tpm_spapr's pre_save function, we do need to set the 'deliver_response' as well in order to call the tpm_spapr_request_completed() in tpm_spapr_post_load() when resuming.

The easiest way to trigger all this is to read a TPM 2.0's PCRs inside a VM and have libtpms instrumented with a sleep(4) in PCRRead() function to cause enough delays to trigger the critical code when suspending the VM.

I'll repost with these fixes.

   Stefan






Reply via email to