Re: [libvirt] [PATCH v2] qemu_agent: Issue guest-sync prior to every command
On Fri, Mar 16, 2012 at 17:35:09 +0100, Michal Privoznik wrote: If we issue guest command and GA is not running, the issuing thread will block endlessly. We can check for GA presence by issuing guest-sync with unique ID (timestamp). We don't want to issue real command as even if GA is not running, once it is started, it process all commands written to GA socket. --- diff to v1: - don't keep list of issued IDs because it's pointless Some background on this: I've intended to switch to new guest-sync-delimited and use older guest-sync for older GA. However, since we don't use stream base implementation but use new line as delimiter for GA responses we don't need GA to issue sentinel byte 0xFF for us: http://wiki.qemu.org/Features/QAPI/GuestAgent#QEMU_Guest_Agent_Protocol Moreover, since we are using guest-sync just for detecting GA, it is not necessary to use sentinel byte at all: http://lists.nongnu.org/archive/html/qemu-devel/2012-03/msg03278.html src/qemu/qemu_agent.c | 134 ++-- 1 files changed, 128 insertions(+), 6 deletions(-) ACK, but don't forget to run make syntax-check before pushing to fix your violation of recently added sizeof() rule. Anyway, there is a small chance the guest-agent will crash after sending us a response to guest-sync but before it can process the real command. Current code will just hang waiting for the reply. We were discussing this issue with Michal and came up with a possible solution: - use timeouts for all ga commands - any code sending ga command has to register a callback which will be used in case the command takes longer than timeout; in that case the caller already exited with error and the callback will make sure to undo any changes the command (which we thought was never acted upon) made; e.g., it will unfreeze guest's filesystem after we didn't get a reply to freeze in time - when a new command is about to be issued, guest-sync is called (already done) and getting reply to it removes any callbacks since we know it won't be processed - anytime the callback is present, we may report that guest agent is not responding to us using domcontrol (if possible) or using a new api Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] qemu_agent: Issue guest-sync prior to every command
On 30.03.2012 16:31, Jiri Denemark wrote: On Fri, Mar 16, 2012 at 17:35:09 +0100, Michal Privoznik wrote: If we issue guest command and GA is not running, the issuing thread will block endlessly. We can check for GA presence by issuing guest-sync with unique ID (timestamp). We don't want to issue real command as even if GA is not running, once it is started, it process all commands written to GA socket. --- diff to v1: - don't keep list of issued IDs because it's pointless Some background on this: I've intended to switch to new guest-sync-delimited and use older guest-sync for older GA. However, since we don't use stream base implementation but use new line as delimiter for GA responses we don't need GA to issue sentinel byte 0xFF for us: http://wiki.qemu.org/Features/QAPI/GuestAgent#QEMU_Guest_Agent_Protocol Moreover, since we are using guest-sync just for detecting GA, it is not necessary to use sentinel byte at all: http://lists.nongnu.org/archive/html/qemu-devel/2012-03/msg03278.html src/qemu/qemu_agent.c | 134 ++-- 1 files changed, 128 insertions(+), 6 deletions(-) ACK, but don't forget to run make syntax-check before pushing to fix your violation of recently added sizeof() rule. Thanks pushed. Anyway, there is a small chance the guest-agent will crash after sending us a response to guest-sync but before it can process the real command. Current code will just hang waiting for the reply. We were discussing this issue with Michal and came up with a possible solution: - use timeouts for all ga commands - any code sending ga command has to register a callback which will be used in case the command takes longer than timeout; in that case the caller already exited with error and the callback will make sure to undo any changes the command (which we thought was never acted upon) made; e.g., it will unfreeze guest's filesystem after we didn't get a reply to freeze in time - when a new command is about to be issued, guest-sync is called (already done) and getting reply to it removes any callbacks since we know it won't be processed - anytime the callback is present, we may report that guest agent is not responding to us using domcontrol (if possible) or using a new api Jirka Yeah, I'll post follow up patch once we're after release. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2] qemu_agent: Issue guest-sync prior to every command
If we issue guest command and GA is not running, the issuing thread will block endlessly. We can check for GA presence by issuing guest-sync with unique ID (timestamp). We don't want to issue real command as even if GA is not running, once it is started, it process all commands written to GA socket. --- diff to v1: - don't keep list of issued IDs because it's pointless Some background on this: I've intended to switch to new guest-sync-delimited and use older guest-sync for older GA. However, since we don't use stream base implementation but use new line as delimiter for GA responses we don't need GA to issue sentinel byte 0xFF for us: http://wiki.qemu.org/Features/QAPI/GuestAgent#QEMU_Guest_Agent_Protocol Moreover, since we are using guest-sync just for detecting GA, it is not necessary to use sentinel byte at all: http://lists.nongnu.org/archive/html/qemu-devel/2012-03/msg03278.html src/qemu/qemu_agent.c | 134 ++-- 1 files changed, 128 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c index a17d025..ee82fae 100644 --- a/src/qemu/qemu_agent.c +++ b/src/qemu/qemu_agent.c @@ -39,6 +39,7 @@ #include virterror_internal.h #include json.h #include virfile.h +#include virtime.h #define VIR_FROM_THIS VIR_FROM_QEMU @@ -316,6 +317,7 @@ qemuAgentIOProcessLine(qemuAgentPtr mon, { virJSONValuePtr obj = NULL; int ret = -1; +unsigned long long id; VIR_DEBUG(Line [%s], line); @@ -340,6 +342,20 @@ qemuAgentIOProcessLine(qemuAgentPtr mon, obj = NULL; ret = 0; } else { +/* If we've received something like: + * {return: 1234} + * it is likely that somebody started GA + * which is now processing our previous + * guest-sync commands. Check if this is + * the case and don't report an error but + * return silently. + */ +if (virJSONValueObjectGetNumberUlong(obj, return, id) == 0) { +VIR_DEBUG(Ignoring delayed reply to guest-sync: %llu, id); +ret = 0; +goto cleanup; +} + qemuReportError(VIR_ERR_INTERNAL_ERROR, _(Unexpected JSON reply '%s'), line); } @@ -813,11 +829,28 @@ void qemuAgentClose(qemuAgentPtr mon) qemuAgentUnlock(mon); } +#define QEMU_AGENT_WAIT_TIME (1000ull * 5) +/** + * qemuAgentSend: + * @mon: Monitor + * @msg: Message + * @timeout: use timeout? + * + * Send @msg to agent @mon. + * Wait max QEMU_AGENT_WAIT_TIME for agent + * to reply. + * + * Returns: 0 on success, + * -2 on timeout, + * -1 otherwise + */ static int qemuAgentSend(qemuAgentPtr mon, - qemuAgentMessagePtr msg) + qemuAgentMessagePtr msg, + bool timeout) { int ret = -1; +unsigned long long now, then; /* Check whether qemu quit unexpectedly */ if (mon-lastError.code != VIR_ERR_OK) { @@ -827,13 +860,26 @@ static int qemuAgentSend(qemuAgentPtr mon, return -1; } +if (timeout) { +if (virTimeMillisNow(now) 0) +return -1; +then = now + QEMU_AGENT_WAIT_TIME; +} + mon-msg = msg; qemuAgentUpdateWatch(mon); while (!mon-msg-finished) { -if (virCondWait(mon-notify, mon-lock) 0) { -qemuReportError(VIR_ERR_INTERNAL_ERROR, %s, -_(Unable to wait on monitor condition)); +if ((timeout virCondWaitUntil(mon-notify, mon-lock, then) 0) || +(!timeout virCondWait(mon-notify, mon-lock) 0)) { +if (errno == ETIMEDOUT) { +qemuReportError(VIR_ERR_INTERNAL_ERROR, %s, +_(Guest agent not available for now)); +ret = -2; +} else { +qemuReportError(VIR_ERR_INTERNAL_ERROR, %s, +_(Unable to wait on monitor condition)); +} goto cleanup; } } @@ -855,6 +901,78 @@ cleanup: } +/** + * qemuAgentGuestSync: + * @mon: Monitor + * + * Send guest-sync with unique ID + * and wait for reply. If we get one, check if + * received ID is equal to given. + * + * Returns: 0 on success, + * -1 otherwise + */ +static int +qemuAgentGuestSync(qemuAgentPtr mon) +{ +int ret = -1; +int send_ret; +unsigned long long id, id_ret; +qemuAgentMessage sync_msg; + +memset(sync_msg, 0, sizeof sync_msg); + +if (virTimeMillisNow(id) 0) +return -1; + +if (virAsprintf(sync_msg.txBuffer, +{\execute\:\guest-sync\, +\arguments\:{\id\:%llu}}, id) 0) { +virReportOOMError(); +return -1; +} + +sync_msg.txLength = strlen(sync_msg.txBuffer); + +VIR_DEBUG(Sending guest-sync command with ID: %llu, id); + +