On 01/23/2014 05:07 AM, Fam Zheng wrote:
> On Wed, 01/22 17:53, Stratos Psomadakis wrote:
>> Hi,
>>
>> we've encountered a weird issue regarding monitor (qmp and hmp) behavior
>> with qemu-1.7 (and qemu-1.5). The following steps will reproduce the issue:
>>
>> 1) Client A connects to qmp socket with socat
>> 2) Client A gets greeting message {"QMP": {"version": ..}
>> 3) Client A waits (select on the socket's fd)
>> 4) Client B tries to connect to the *same* qmp socket with socat
>> 5) Client B does *NOT* get any greeting message
>> 6) Client B waits (select on the socket's fd)
>> 7) Client B closes connection (kill socat)
>> 8) Client A quits too
>> 9) Client C connects to qmp socket
>> 10) Client C gets *two* greeting messages!!!
> Hi Stratos, thank you for debugging and reporting this.
>
> I tested this sequence but can't fully reproduce this. What I see is 5) but no
> 10). Client C acts normally. And your patch below doesn't solve it for me.

Hm, which qemu version (or repo branch / tag) did you use? We did a quick
scan of the master branch code / commits, but we didn't find anything that
might fix the issue.

> To submit a patch, please follow instructions as described in
> http://wiki.qemu.org/Contribute/SubmitAPatch
> so it could be picked up by maintainers. Specifically, you need to format your
> patch email with "git format-patch" and add a "Signed-off-by:" line in your
> patch email.

Ok. If any dev can confirm that this is a bug (and that the patch below
is the correct way to fix it) I'll resubmit it properly.

Thanks,
Stratos

> Thanks,
>
> Fam
>
>> After some investigation, we traced it down to the monitor_flush()
>> function in monitor.c. Specifically, when a second client connects to
>> the qmp (client B), while another one is already using it (client A), we
>> get the following from stracing the second client (client B):
>>
>> connect(3, {sa_family=AF_FILE, path="foo.mon"}, 9) = 0
>> getsockname(3, {sa_family=AF_FILE, NULL}, [2]) = 0
>> select(4, [0 3], [1 3], [], NULL) = 2 (out [1 3])
>> select(4, [0 3], [], [], NULL
>>
>> So, the connect() syscall from client B succeeds, although client B
>> connection has not yet been accepted by the qmp server (it's still in
>> the backlog of the qmp listening socket).
>>
>> After killing client B and then client A, we see the following when
>> stracing the qemu proc:
>>
>> 22363 accept4(6, {sa_family=AF_FILE, NULL}, [2], SOCK_CLOEXEC) = 9
>> 22363 fcntl(9, F_GETFL) = 0x2 (flags O_RDWR)
>> 22363 fcntl(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0
>> 22363 fstat(9, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
>> 22363 fcntl(9, F_GETFL) = 0x802 (flags
>> O_RDWR|O_NONBLOCK)
>> 22363 write(9, "{\"QMP\": {\"version\": {\"qemu\": {\"m"..., 127) =
>> -1 EPIPE (Broken pipe)
>> 22363 --- SIGPIPE (Broken pipe) @ 0 (0) ---
>>
>> The qmp server / qemu accepts the connection from client B (who has now
>> closed the connection) and tries to write the greeting message to the
>> socket fd. This results in write returning an error (EPIPE).
>>
>> The monitor_flush() function doesn't seem to handle this case (write
>> error). Instead, it adds a watch / handler to retry the write operation.
>> Thus, mon->outbuf is not cleaned up properly, which results in duplicate
>> greeting messages for the next client to connect.
>>
>> The following seems to do the trick.
>>
>> diff --git a/monitor.c b/monitor.c
>> index 845f608..5622f20 100644
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -288,8 +288,8 @@ void monitor_flush(Monitor *mon)
>>
>>      if (len && !mon->mux_out) {
>>          rc = qemu_chr_fe_write(mon->chr, (const uint8_t *) buf, len);
>> -        if (rc == len) {
>> -            /* all flushed */
>> +        if ((rc < 0 && errno != EAGAIN) || (rc == len)) {
>> +            /* all flushed or error */
>>              QDECREF(mon->outbuf);
>>              mon->outbuf = qstring_new();
>>              return;
>>
>> Comments?
>>
>> Thanks,
>> Stratos
>>
>> --
>> Stratos Psomadakis
>> <[email protected]>
>>
>

-- 
Stratos Psomadakis
<[email protected]>
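
P.S. For anyone who wants to see the socket-level behaviour without
running qemu, here is a minimal standalone sketch in Python. It is not
qemu code: the function name, the socket path and the greeting payload
are made up for illustration. It only shows that connect() on a UNIX
stream socket returns while the connection is still in the listen
backlog, and that a write to a client which left before accept() fails.
On Linux I would expect the errno to be EPIPE, matching the strace above.

```python
import errno
import os
import socket
import tempfile


def greet_late_client():
    """Return the errno seen when greeting a client that left before accept()."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "foo.mon")  # hypothetical socket path
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.listen(1)

        # "Client B": connect() returns even though accept() was never called;
        # the connection simply waits in the listen backlog.
        client_b = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client_b.connect(path)
        client_b.close()  # client B gives up (like killing socat)

        # Only now does the server accept the stale connection and greet it.
        conn, _ = server.accept()
        send_errno = None
        try:
            conn.send(b'{"QMP": {}}\r\n')  # placeholder, not the real greeting
        except OSError as exc:
            send_errno = exc.errno
        finally:
            conn.close()
            server.close()
        return send_errno


if __name__ == "__main__":
    err = greet_late_client()
    print(errno.errorcode.get(err, err))
```

Python ignores SIGPIPE by default, so the failure shows up as an
exception instead of killing the process. The point is the same as in
the report: whoever writes the greeting has to treat this error as final
and drop the pending output, not queue it for a retry.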
