Re: [libvirt] [PATCH v2] qemu_agent: Issue guest-sync prior to every command

2012-03-30 Thread Jiri Denemark
On Fri, Mar 16, 2012 at 17:35:09 +0100, Michal Privoznik wrote:
 If we issue guest command and GA is not running, the issuing thread
 will block endlessly. We can check for GA presence by issuing
 guest-sync with unique ID (timestamp). We don't want to issue real
 command as even if GA is not running, once it is started, it process
 all commands written to GA socket.
 ---
 diff to v1:
 - don't keep list of issued IDs because it's pointless
 
 Some background on this:
 I've intended to switch to new guest-sync-delimited and use
 older guest-sync for older GA. However, since we don't use
 stream base implementation but use new line as delimiter for
 GA responses we don't need GA to issue sentinel byte 0xFF for us:
 
   http://wiki.qemu.org/Features/QAPI/GuestAgent#QEMU_Guest_Agent_Protocol
 
 Moreover, since we are using guest-sync just for detecting GA, it is
 not necessary to use sentinel byte at all:
 
   http://lists.nongnu.org/archive/html/qemu-devel/2012-03/msg03278.html
 
  src/qemu/qemu_agent.c |  134 ++--
  1 files changed, 128 insertions(+), 6 deletions(-)

ACK, but don't forget to run make syntax-check before pushing to fix your
violation of recently added sizeof() rule.

Anyway, there is a small chance the guest-agent will crash after sending us a
response to guest-sync but before it can process the real command. Current
code will just hang waiting for the reply. We were discussing this issue with
Michal and came up with a possible solution:

- use timeouts for all ga commands
- any code sending ga command has to register a callback which will be used in
  case the command takes longer than timeout; in that case the caller already
  exited with error and the callback will make sure to undo any changes the
  command (which we thought was never acted upon) made; e.g., it will unfreeze
  guest's filesystem after we didn't get a reply to freeze in time
- when a new command is about to be issued, guest-sync is called (already
  done) and getting reply to it removes any callbacks since we know it won't
  be processed
- anytime the callback is present, we may report that guest agent is not
  responding to us using domcontrol (if possible) or using a new api

Jirka

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v2] qemu_agent: Issue guest-sync prior to every command

2012-03-30 Thread Michal Privoznik
On 30.03.2012 16:31, Jiri Denemark wrote:
 On Fri, Mar 16, 2012 at 17:35:09 +0100, Michal Privoznik wrote:
 If we issue guest command and GA is not running, the issuing thread
 will block endlessly. We can check for GA presence by issuing
 guest-sync with unique ID (timestamp). We don't want to issue real
 command as even if GA is not running, once it is started, it process
 all commands written to GA socket.
 ---
 diff to v1:
 - don't keep list of issued IDs because it's pointless

 Some background on this:
 I've intended to switch to new guest-sync-delimited and use
 older guest-sync for older GA. However, since we don't use
 stream base implementation but use new line as delimiter for
 GA responses we don't need GA to issue sentinel byte 0xFF for us:

   http://wiki.qemu.org/Features/QAPI/GuestAgent#QEMU_Guest_Agent_Protocol

 Moreover, since we are using guest-sync just for detecting GA, it is
 not necessary to use sentinel byte at all:

   http://lists.nongnu.org/archive/html/qemu-devel/2012-03/msg03278.html

  src/qemu/qemu_agent.c |  134 
 ++--
  1 files changed, 128 insertions(+), 6 deletions(-)
 
 ACK, but don't forget to run make syntax-check before pushing to fix your
 violation of recently added sizeof() rule.
 

Thanks pushed.
 Anyway, there is a small chance the guest-agent will crash after sending us a
 response to guest-sync but before it can process the real command. Current
 code will just hang waiting for the reply. We were discussing this issue with
 Michal and came up with a possible solution:
 
 - use timeouts for all ga commands
 - any code sending ga command has to register a callback which will be used in
   case the command takes longer than timeout; in that case the caller already
   exited with error and the callback will make sure to undo any changes the
   command (which we thought was never acted upon) made; e.g., it will unfreeze
   guest's filesystem after we didn't get a reply to freeze in time
 - when a new command is about to be issued, guest-sync is called (already
   done) and getting reply to it removes any callbacks since we know it won't
   be processed
 - anytime the callback is present, we may report that guest agent is not
   responding to us using domcontrol (if possible) or using a new api
 
 Jirka

Yeah, I'll post follow up patch once we're after release.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v2] qemu_agent: Issue guest-sync prior to every command

2012-03-16 Thread Michal Privoznik
If we issue guest command and GA is not running, the issuing thread
will block endlessly. We can check for GA presence by issuing
guest-sync with unique ID (timestamp). We don't want to issue real
command as even if GA is not running, once it is started, it process
all commands written to GA socket.
---
diff to v1:
- don't keep list of issued IDs because it's pointless

Some background on this:
I've intended to switch to new guest-sync-delimited and use
older guest-sync for older GA. However, since we don't use
stream base implementation but use new line as delimiter for
GA responses we don't need GA to issue sentinel byte 0xFF for us:

  http://wiki.qemu.org/Features/QAPI/GuestAgent#QEMU_Guest_Agent_Protocol

Moreover, since we are using guest-sync just for detecting GA, it is
not necessary to use sentinel byte at all:

  http://lists.nongnu.org/archive/html/qemu-devel/2012-03/msg03278.html

 src/qemu/qemu_agent.c |  134 ++--
 1 files changed, 128 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c
index a17d025..ee82fae 100644
--- a/src/qemu/qemu_agent.c
+++ b/src/qemu/qemu_agent.c
@@ -39,6 +39,7 @@
 #include virterror_internal.h
 #include json.h
 #include virfile.h
+#include virtime.h
 
 #define VIR_FROM_THIS VIR_FROM_QEMU
 
@@ -316,6 +317,7 @@ qemuAgentIOProcessLine(qemuAgentPtr mon,
 {
 virJSONValuePtr obj = NULL;
 int ret = -1;
+unsigned long long id;
 
 VIR_DEBUG(Line [%s], line);
 
@@ -340,6 +342,20 @@ qemuAgentIOProcessLine(qemuAgentPtr mon,
 obj = NULL;
 ret = 0;
 } else {
+/* If we've received something like:
+ *  {return: 1234}
+ * it is likely that somebody started GA
+ * which is now processing our previous
+ * guest-sync commands. Check if this is
+ * the case and don't report an error but
+ * return silently.
+ */
+if (virJSONValueObjectGetNumberUlong(obj, return, id) == 0) {
+VIR_DEBUG(Ignoring delayed reply to guest-sync: %llu, id);
+ret = 0;
+goto cleanup;
+}
+
 qemuReportError(VIR_ERR_INTERNAL_ERROR,
 _(Unexpected JSON reply '%s'), line);
 }
@@ -813,11 +829,28 @@ void qemuAgentClose(qemuAgentPtr mon)
 qemuAgentUnlock(mon);
 }
 
+#define QEMU_AGENT_WAIT_TIME (1000ull * 5)
 
+/**
+ * qemuAgentSend:
+ * @mon: Monitor
+ * @msg: Message
+ * @timeout: use timeout?
+ *
+ * Send @msg to agent @mon.
+ * Wait max QEMU_AGENT_WAIT_TIME for agent
+ * to reply.
+ *
+ * Returns: 0 on success,
+ *  -2 on timeout,
+ *  -1 otherwise
+ */
 static int qemuAgentSend(qemuAgentPtr mon,
- qemuAgentMessagePtr msg)
+ qemuAgentMessagePtr msg,
+ bool timeout)
 {
 int ret = -1;
+unsigned long long now, then;
 
 /* Check whether qemu quit unexpectedly */
 if (mon-lastError.code != VIR_ERR_OK) {
@@ -827,13 +860,26 @@ static int qemuAgentSend(qemuAgentPtr mon,
 return -1;
 }
 
+if (timeout) {
+if (virTimeMillisNow(now)  0)
+return -1;
+then = now + QEMU_AGENT_WAIT_TIME;
+}
+
 mon-msg = msg;
 qemuAgentUpdateWatch(mon);
 
 while (!mon-msg-finished) {
-if (virCondWait(mon-notify, mon-lock)  0) {
-qemuReportError(VIR_ERR_INTERNAL_ERROR, %s,
-_(Unable to wait on monitor condition));
+if ((timeout  virCondWaitUntil(mon-notify, mon-lock, then)  0) 
||
+(!timeout  virCondWait(mon-notify, mon-lock)  0)) {
+if (errno == ETIMEDOUT) {
+qemuReportError(VIR_ERR_INTERNAL_ERROR, %s,
+_(Guest agent not available for now));
+ret = -2;
+} else {
+qemuReportError(VIR_ERR_INTERNAL_ERROR, %s,
+_(Unable to wait on monitor condition));
+}
 goto cleanup;
 }
 }
@@ -855,6 +901,78 @@ cleanup:
 }
 
 
+/**
+ * qemuAgentGuestSync:
+ * @mon: Monitor
+ *
+ * Send guest-sync with unique ID
+ * and wait for reply. If we get one, check if
+ * received ID is equal to given.
+ *
+ * Returns: 0 on success,
+ *  -1 otherwise
+ */
+static int
+qemuAgentGuestSync(qemuAgentPtr mon)
+{
+int ret = -1;
+int send_ret;
+unsigned long long id, id_ret;
+qemuAgentMessage sync_msg;
+
+memset(sync_msg, 0, sizeof sync_msg);
+
+if (virTimeMillisNow(id)  0)
+return -1;
+
+if (virAsprintf(sync_msg.txBuffer,
+{\execute\:\guest-sync\, 
+\arguments\:{\id\:%llu}}, id)  0) {
+virReportOOMError();
+return -1;
+}
+
+sync_msg.txLength = strlen(sync_msg.txBuffer);
+
+VIR_DEBUG(Sending guest-sync command with ID: %llu, id);
+
+