On 11.03.2020 12:38, Michal Privoznik wrote:
> On 3/5/20 3:47 PM, Nikolay Shirokovskiy wrote:
>> needReply added in [1] looks redundant. Indeed it is set to false only
>> when mon->await_event is set too (the only exception qemuAgentFSTrim
>> which is mistaken).
>>
>> However it fixes the issue when qemuAgentCommand exits on error path and
>> mon->await_event is not reset. Let's instead reset mon->await_event properly.
>>
>> Also remove "Woken up by event" debug message as it can be misleading.
>> We can get it also if monitor is closed due to serial changed event
>> currently. Anyway both qemuAgentClose and qemuAgentNotifyEvent log
>> itself.
>>
>> [1] qemu: make sure agent returns error when required data are missing
>>
>> Signed-off-by: Nikolay Shirokovskiy <nshirokovs...@virtuozzo.com>
>> ---
>> src/qemu/qemu_agent.c | 47 ++++++++++++++++++++-----------------------
>> 1 file changed, 22 insertions(+), 25 deletions(-)
>>
>> diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c
>> index 2de53efb7a..605c9f563e 100644
>> --- a/src/qemu/qemu_agent.c
>> +++ b/src/qemu/qemu_agent.c
>> @@ -1112,7 +1112,6 @@ static int
>> qemuAgentCommand(qemuAgentPtr mon,
>> virJSONValuePtr cmd,
>> virJSONValuePtr *reply,
>> - bool needReply,
>> int seconds)
>> {
>> int ret = -1;
>> @@ -1121,17 +1120,16 @@ qemuAgentCommand(qemuAgentPtr mon,
>> int await_event = mon->await_event;
>> *reply = NULL;
>> + memset(&msg, 0, sizeof(msg));
>> if (!mon->running) {
>> virReportError(VIR_ERR_AGENT_UNRESPONSIVE, "%s",
>> _("Guest agent disappeared while executing
>> command"));
>> - return -1;
>> + goto cleanup;
>> }
>> if (qemuAgentGuestSync(mon) < 0)
>> - return -1;
>> -
>> - memset(&msg, 0, sizeof(msg));
>> + goto cleanup;
>> if (!(cmdstr = virJSONValueToString(cmd, false)))
>> goto cleanup;
>> @@ -1149,9 +1147,7 @@ qemuAgentCommand(qemuAgentPtr mon,
>> /* If we haven't obtained any reply but we wait for an
>> * event, then don't report this as error */
>> if (!msg.rxObject) {
>> - if (await_event && !needReply) {
>> - VIR_DEBUG("Woken up by event %d", await_event);
>> - } else {
>
> Please keep this debug printing. It doesn't hurt anything and has potential
> of helping us to debug.
This debug line is not correct. I reflected it in the commit message.
I actually get this line during debug session when agent monitor is closed
during VM shutdown on serial event and not shutdown event. Serial event
just comes first.
>
>> + if (!await_event) {
>> if (mon->running)
>> virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>> _("Missing monitor reply object"));
>> @@ -1169,6 +1165,7 @@ qemuAgentCommand(qemuAgentPtr mon,
>> cleanup:
>> VIR_FREE(cmdstr);
>> VIR_FREE(msg.txBuffer);
>> + mon->await_event = QEMU_AGENT_EVENT_NONE;
>
> This is the part I don't like about this patch. The rest looks fine. Why is
> this needed? Either the mon->await_event is not set by the caller, or if set
> the the event handler will clear it out by calling qemuAgentNotifyEvent(). Or
> am I missing something?
The whole point of the patch is in this line)
Yeah, you are talking of success paths. But on error paths mon->await_event is
not cleared currently so next command which need reply
can be finish on await event without needReply which lead to SIGSGEV. This is
what patch [1] fixes by introducing needReply which don't need to be cleared
as it is on stack.
I guess scenario is next given the bug [2] mentioned in [1]:
- vm shutdown command is sent to GA
- it is finished with timeout (60 s), mon->await_event is not cleared
- guest ping is sent
- guest shutdowned and ping command get awakend by this event as
mon->await_event is set (bug)
So needReply argument is helpful but it is cleaner just to reset
mon->await_event properly.
[1] qemu: make sure agent returns error when required data are missing
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1058149
Nikolay
>
>> return ret;
>> }
>> @@ -1269,7 +1266,7 @@ int qemuAgentShutdown(qemuAgentPtr mon,
>> mon->await_event = QEMU_AGENT_EVENT_RESET;
>> else
>> mon->await_event = QEMU_AGENT_EVENT_SHUTDOWN;
>> - ret = qemuAgentCommand(mon, cmd, &reply, false,
>> + ret = qemuAgentCommand(mon, cmd, &reply,
>> VIR_DOMAIN_QEMU_AGENT_COMMAND_SHUTDOWN);
>> virJSONValueFree(cmd);
>> @@ -1312,7 +1309,7 @@ int qemuAgentFSFreeze(qemuAgentPtr mon, const char
>> **mountpoints,
>> if (!cmd)
>> goto cleanup;
>> - if (qemuAgentCommand(mon, cmd, &reply, true, mon->timeout) < 0)
>> + if (qemuAgentCommand(mon, cmd, &reply, mon->timeout) < 0)
>> goto cleanup;
>> if (virJSONValueObjectGetNumberInt(reply, "return", &ret) < 0) {
>> @@ -1349,7 +1346,7 @@ int qemuAgentFSThaw(qemuAgentPtr mon)
>> if (!cmd)
>> return -1;
>> - if (qemuAgentCommand(mon, cmd, &reply, true, mon->timeout) < 0)
>> + if (qemuAgentCommand(mon, cmd, &reply, mon->timeout) < 0)
>> goto cleanup;
>> if (virJSONValueObjectGetNumberInt(reply, "return", &ret) < 0) {
>> @@ -1386,7 +1383,7 @@ qemuAgentSuspend(qemuAgentPtr mon,
>> return -1;
>> mon->await_event = QEMU_AGENT_EVENT_SUSPEND;
>> - ret = qemuAgentCommand(mon, cmd, &reply, false, mon->timeout);
>> + ret = qemuAgentCommand(mon, cmd, &reply, mon->timeout);
>> virJSONValueFree(cmd);
>> virJSONValueFree(reply);
>> @@ -1415,7 +1412,7 @@ qemuAgentArbitraryCommand(qemuAgentPtr mon,
>> if (!(cmd = virJSONValueFromString(cmd_str)))
>> goto cleanup;
>> - if ((ret = qemuAgentCommand(mon, cmd, &reply, true, timeout)) < 0)
>> + if ((ret = qemuAgentCommand(mon, cmd, &reply, timeout)) < 0)
>> goto cleanup;
>> if (!(*result = virJSONValueToString(reply, false)))
>> @@ -1442,7 +1439,7 @@ qemuAgentFSTrim(qemuAgentPtr mon,
>> if (!cmd)
>> return ret;
>> - ret = qemuAgentCommand(mon, cmd, &reply, false, mon->timeout);
>> + ret = qemuAgentCommand(mon, cmd, &reply, mon->timeout);
>
> Oh this is fun. This 'false' you are removing should have been true as the
> command does have a reply that we need to wait for.
>
>> virJSONValueFree(cmd);
>> virJSONValueFree(reply);
>
> Michal
>