Re: [Qemu-devel] [PATCH v2] monitor: work around delayed CHR_EVENT_OPENED events

2013-05-30 Thread Stefan Priebe - Profihost AG
Am 30.05.2013 00:29, schrieb mdroth:
 On Wed, May 29, 2013 at 01:27:33PM -0400, Luiz Capitulino wrote:
 On Mon, 27 May 2013 12:59:25 -0500
 mdroth mdr...@linux.vnet.ibm.com wrote:

 On Mon, May 27, 2013 at 11:16:01AM -0500, Anthony Liguori wrote:
 Luiz Capitulino lcapitul...@redhat.com writes:

 On Sun, 26 May 2013 10:33:39 -0500
 Michael Roth mdr...@linux.vnet.ibm.com wrote:

 In the past, CHR_EVENT_OPENED events were emitted via a pre-expired
 QEMUTimer. Due to timers being processing at the tail end of each main
 loop iteration, this generally meant that such events would be emitted
 within the same main loop iteration, prior any client data being read
 by tcp/unix socket server backends.

 With 9f939df955a4152aad69a19a77e0898631bb2c18, this was switched to
 a bottom-half that is registered via g_idle_add(). This makes it
 likely that the event won't be sent until a subsequent iteration, and
 also adds the possibility that such events will race with the
 processing of client data.

 In cases where we rely on the CHR_EVENT_OPENED event being delivered
 prior to any client data being read, this may cause unexpected behavior.

 In the case of a QMP monitor session using a socket backend, the delayed
 processing of the CHR_EVENT_OPENED event can lead to a situation where
 a previous session, where 'qmp_capabilities' has already processed, can
 cause the rejection of 'qmp_capabilities' for a subsequent session,
 since we reset capabilities in response to CHR_EVENT_OPENED, which may
 not yet have been delivered. This can be reproduced with the following
 command, generally within 50 or so iterations:

   mdroth@loki:~$ cat cmds
   {'execute':'qmp_capabilities'}
   {'execute':'query-cpus'}
   mdroth@loki:~$ while true; do if socat stdio unix-connect:/tmp/qmp.sock
   cmds | grep CommandNotFound /dev/null; then echo failed; break; else
   echo ok; fi; done;
   ok
   ok
   failed
   mdroth@loki:~$

 As a work-around, we reset capabilities as part of the CHR_EVENT_CLOSED
 hook, which gets called as part of the GIOChannel cb associated with the
 client rather than a bottom-half, and is thus guaranteed to be delivered
 prior to accepting any subsequent client sessions.

 This does not address the more general problem of whether or not there
 are chardev users that rely on CHR_EVENT_OPENED being delivered prior to
 client data, and whether or not we should consider moving 
 CHR_EVENT_OPENED
 in-band with connection establishment as a general solution, but fixes
 QMP for the time being.

 Reported-by: Stefan Priebe s.pri...@profihost.ag
 Cc: qemu-sta...@nongnu.org
 Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com

 Thanks for debugging this Michael. I'm going to apply your patch to the 
 qmp
 branch because we can't let this broken (esp. in -stable) but this is 
 papering
 over the real bug...

 Do we really need OPENED to happen in a bottom half?  Shouldn't the open
 event handlers register the callback instead if they need it?

 I think that's the more general fix, but wasn't sure if there were
 historical reasons why we didn't do this in the first place.

 Looking at it more closely it does seem like we need a general fix
 sooner rather than later though, and I don't see any reason why we can't
 move BH registration into the actual OPENED hooks as you suggest:

 hw/usb/ccid-card-passthru.c
  - currently affected? no
 debug hook only
  - needs OPENED BH? no

 hw/usb/redirect.c
  - currently affected? yes
 can_read handler checks for dev-parser != NULL, which may be
 true if CLOSED BH has been executed yet. In the past, OPENED
 quiesced outstanding CLOSED events prior to us reading client data.
  - need OPENED BH? possibly
 seems to only be needed by CLOSED events, which can be issued by
 USBDevice, so we do teardown/usb_detach in BH. For OPENED, this
 may not apply. if we do issue a BH, we'd need to modify can_read
 handler to avoid race.

 hw/usb/dev-serial.c
  - currently affected? no
 can_read checks for dev.attached != NULL. set to NULL
 synchronously in CLOSED, will not accept reads until OPENED gets
 called and sets dev.attached
  - need opened BH?  no

 hw/char/sclpconsole.c
  - currently affected? no
 guarded by can_read handler
  - need opened BH? no

 hw/char/ipoctal232.c
  - currently affected? no
 debug hook only
  - need opened BH? no

 hw/char/virtio-console.c
  - currently affected? no
 OPENED/CLOSED map to vq events handled asynchronously. can_read
 checks for guest_connected state rather than anything set by OPENED
  - need opened BH? no

 qtest.c
  - currently affected? yes
 can_read handler doesn't check for qtest_opened == true, can read
 data before OPENED event is processed
  - need opened BH? no

 monitor.c
  - current affected? yes
 negotiated session state can temporarilly leak into a new session
  - need opened BH? no

 gdbstub.c
  - currently affected? yes
 can fail to pause machine prior to starting gdb 

Re: [Qemu-devel] [PATCH v2] monitor: work around delayed CHR_EVENT_OPENED events

2013-05-29 Thread Luiz Capitulino
On Mon, 27 May 2013 12:59:25 -0500
mdroth mdr...@linux.vnet.ibm.com wrote:

 On Mon, May 27, 2013 at 11:16:01AM -0500, Anthony Liguori wrote:
  Luiz Capitulino lcapitul...@redhat.com writes:
  
   On Sun, 26 May 2013 10:33:39 -0500
   Michael Roth mdr...@linux.vnet.ibm.com wrote:
  
   In the past, CHR_EVENT_OPENED events were emitted via a pre-expired
   QEMUTimer. Due to timers being processing at the tail end of each main
   loop iteration, this generally meant that such events would be emitted
   within the same main loop iteration, prior any client data being read
   by tcp/unix socket server backends.
   
   With 9f939df955a4152aad69a19a77e0898631bb2c18, this was switched to
   a bottom-half that is registered via g_idle_add(). This makes it
   likely that the event won't be sent until a subsequent iteration, and
   also adds the possibility that such events will race with the
   processing of client data.
   
   In cases where we rely on the CHR_EVENT_OPENED event being delivered
   prior to any client data being read, this may cause unexpected behavior.
   
   In the case of a QMP monitor session using a socket backend, the delayed
   processing of the CHR_EVENT_OPENED event can lead to a situation where
   a previous session, where 'qmp_capabilities' has already processed, can
   cause the rejection of 'qmp_capabilities' for a subsequent session,
   since we reset capabilities in response to CHR_EVENT_OPENED, which may
   not yet have been delivered. This can be reproduced with the following
   command, generally within 50 or so iterations:
   
 mdroth@loki:~$ cat cmds
 {'execute':'qmp_capabilities'}
 {'execute':'query-cpus'}
 mdroth@loki:~$ while true; do if socat stdio unix-connect:/tmp/qmp.sock
 cmds | grep CommandNotFound /dev/null; then echo failed; break; else
 echo ok; fi; done;
 ok
 ok
 failed
 mdroth@loki:~$
   
   As a work-around, we reset capabilities as part of the CHR_EVENT_CLOSED
   hook, which gets called as part of the GIOChannel cb associated with the
   client rather than a bottom-half, and is thus guaranteed to be delivered
   prior to accepting any subsequent client sessions.
   
   This does not address the more general problem of whether or not there
   are chardev users that rely on CHR_EVENT_OPENED being delivered prior to
   client data, and whether or not we should consider moving 
   CHR_EVENT_OPENED
   in-band with connection establishment as a general solution, but fixes
   QMP for the time being.
   
   Reported-by: Stefan Priebe s.pri...@profihost.ag
   Cc: qemu-sta...@nongnu.org
   Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
  
   Thanks for debugging this Michael. I'm going to apply your patch to the 
   qmp
   branch because we can't let this broken (esp. in -stable) but this is 
   papering
   over the real bug...
  
  Do we really need OPENED to happen in a bottom half?  Shouldn't the open
  event handlers register the callback instead if they need it?
 
 I think that's the more general fix, but wasn't sure if there were
 historical reasons why we didn't do this in the first place.
 
 Looking at it more closely it does seem like we need a general fix
 sooner rather than later though, and I don't see any reason why we can't
 move BH registration into the actual OPENED hooks as you suggest:
 
 hw/usb/ccid-card-passthru.c
  - currently affected? no
 debug hook only
  - needs OPENED BH? no
 
 hw/usb/redirect.c
  - currently affected? yes
 can_read handler checks for dev-parser != NULL, which may be
 true if CLOSED BH has been executed yet. In the past, OPENED
 quiesced outstanding CLOSED events prior to us reading client data.
  - need OPENED BH? possibly
 seems to only be needed by CLOSED events, which can be issued by
 USBDevice, so we do teardown/usb_detach in BH. For OPENED, this
 may not apply. if we do issue a BH, we'd need to modify can_read
 handler to avoid race.
 
 hw/usb/dev-serial.c
  - currently affected? no
 can_read checks for dev.attached != NULL. set to NULL
 synchronously in CLOSED, will not accept reads until OPENED gets
 called and sets dev.attached
  - need opened BH?  no
 
 hw/char/sclpconsole.c
  - currently affected? no
 guarded by can_read handler
  - need opened BH? no
 
 hw/char/ipoctal232.c
  - currently affected? no
 debug hook only
  - need opened BH? no
 
 hw/char/virtio-console.c
  - currently affected? no
 OPENED/CLOSED map to vq events handled asynchronously. can_read
 checks for guest_connected state rather than anything set by OPENED
  - need opened BH? no
 
 qtest.c
  - currently affected? yes
 can_read handler doesn't check for qtest_opened == true, can read
 data before OPENED event is processed
  - need opened BH? no
 
 monitor.c
  - current affected? yes
 negotiated session state can temporarilly leak into a new session
  - need opened BH? no
 
 gdbstub.c
  - currently affected? yes
 can fail to 

Re: [Qemu-devel] [PATCH v2] monitor: work around delayed CHR_EVENT_OPENED events

2013-05-29 Thread mdroth
On Wed, May 29, 2013 at 01:27:33PM -0400, Luiz Capitulino wrote:
 On Mon, 27 May 2013 12:59:25 -0500
 mdroth mdr...@linux.vnet.ibm.com wrote:
 
  On Mon, May 27, 2013 at 11:16:01AM -0500, Anthony Liguori wrote:
   Luiz Capitulino lcapitul...@redhat.com writes:
   
On Sun, 26 May 2013 10:33:39 -0500
Michael Roth mdr...@linux.vnet.ibm.com wrote:
   
In the past, CHR_EVENT_OPENED events were emitted via a pre-expired
QEMUTimer. Due to timers being processing at the tail end of each main
loop iteration, this generally meant that such events would be emitted
within the same main loop iteration, prior any client data being read
by tcp/unix socket server backends.

With 9f939df955a4152aad69a19a77e0898631bb2c18, this was switched to
a bottom-half that is registered via g_idle_add(). This makes it
likely that the event won't be sent until a subsequent iteration, and
also adds the possibility that such events will race with the
processing of client data.

In cases where we rely on the CHR_EVENT_OPENED event being delivered
prior to any client data being read, this may cause unexpected 
behavior.

In the case of a QMP monitor session using a socket backend, the 
delayed
processing of the CHR_EVENT_OPENED event can lead to a situation where
a previous session, where 'qmp_capabilities' has already processed, can
cause the rejection of 'qmp_capabilities' for a subsequent session,
since we reset capabilities in response to CHR_EVENT_OPENED, which may
not yet have been delivered. This can be reproduced with the following
command, generally within 50 or so iterations:

  mdroth@loki:~$ cat cmds
  {'execute':'qmp_capabilities'}
  {'execute':'query-cpus'}
  mdroth@loki:~$ while true; do if socat stdio 
unix-connect:/tmp/qmp.sock
  cmds | grep CommandNotFound /dev/null; then echo failed; break; 
else
  echo ok; fi; done;
  ok
  ok
  failed
  mdroth@loki:~$

As a work-around, we reset capabilities as part of the CHR_EVENT_CLOSED
hook, which gets called as part of the GIOChannel cb associated with 
the
client rather than a bottom-half, and is thus guaranteed to be 
delivered
prior to accepting any subsequent client sessions.

This does not address the more general problem of whether or not there
are chardev users that rely on CHR_EVENT_OPENED being delivered prior 
to
client data, and whether or not we should consider moving 
CHR_EVENT_OPENED
in-band with connection establishment as a general solution, but 
fixes
QMP for the time being.

Reported-by: Stefan Priebe s.pri...@profihost.ag
Cc: qemu-sta...@nongnu.org
Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
   
Thanks for debugging this Michael. I'm going to apply your patch to the 
qmp
branch because we can't let this broken (esp. in -stable) but this is 
papering
over the real bug...
   
   Do we really need OPENED to happen in a bottom half?  Shouldn't the open
   event handlers register the callback instead if they need it?
  
  I think that's the more general fix, but wasn't sure if there were
  historical reasons why we didn't do this in the first place.
  
  Looking at it more closely it does seem like we need a general fix
  sooner rather than later though, and I don't see any reason why we can't
  move BH registration into the actual OPENED hooks as you suggest:
  
  hw/usb/ccid-card-passthru.c
   - currently affected? no
  debug hook only
   - needs OPENED BH? no
  
  hw/usb/redirect.c
   - currently affected? yes
  can_read handler checks for dev-parser != NULL, which may be
  true if CLOSED BH has been executed yet. In the past, OPENED
  quiesced outstanding CLOSED events prior to us reading client data.
   - need OPENED BH? possibly
  seems to only be needed by CLOSED events, which can be issued by
  USBDevice, so we do teardown/usb_detach in BH. For OPENED, this
  may not apply. if we do issue a BH, we'd need to modify can_read
  handler to avoid race.
  
  hw/usb/dev-serial.c
   - currently affected? no
  can_read checks for dev.attached != NULL. set to NULL
  synchronously in CLOSED, will not accept reads until OPENED gets
  called and sets dev.attached
   - need opened BH?  no
  
  hw/char/sclpconsole.c
   - currently affected? no
  guarded by can_read handler
   - need opened BH? no
  
  hw/char/ipoctal232.c
   - currently affected? no
  debug hook only
   - need opened BH? no
  
  hw/char/virtio-console.c
   - currently affected? no
  OPENED/CLOSED map to vq events handled asynchronously. can_read
  checks for guest_connected state rather than anything set by OPENED
   - need opened BH? no
  
  qtest.c
   - currently affected? yes
  can_read handler doesn't check for qtest_opened == true, can read
  data before OPENED event is 

Re: [Qemu-devel] [PATCH v2] monitor: work around delayed CHR_EVENT_OPENED events

2013-05-29 Thread mdroth
On Wed, May 29, 2013 at 01:27:33PM -0400, Luiz Capitulino wrote:
 On Mon, 27 May 2013 12:59:25 -0500
 mdroth mdr...@linux.vnet.ibm.com wrote:
 
  On Mon, May 27, 2013 at 11:16:01AM -0500, Anthony Liguori wrote:
   Luiz Capitulino lcapitul...@redhat.com writes:
   
On Sun, 26 May 2013 10:33:39 -0500
Michael Roth mdr...@linux.vnet.ibm.com wrote:
   
In the past, CHR_EVENT_OPENED events were emitted via a pre-expired
QEMUTimer. Due to timers being processing at the tail end of each main
loop iteration, this generally meant that such events would be emitted
within the same main loop iteration, prior any client data being read
by tcp/unix socket server backends.

With 9f939df955a4152aad69a19a77e0898631bb2c18, this was switched to
a bottom-half that is registered via g_idle_add(). This makes it
likely that the event won't be sent until a subsequent iteration, and
also adds the possibility that such events will race with the
processing of client data.

In cases where we rely on the CHR_EVENT_OPENED event being delivered
prior to any client data being read, this may cause unexpected 
behavior.

In the case of a QMP monitor session using a socket backend, the 
delayed
processing of the CHR_EVENT_OPENED event can lead to a situation where
a previous session, where 'qmp_capabilities' has already processed, can
cause the rejection of 'qmp_capabilities' for a subsequent session,
since we reset capabilities in response to CHR_EVENT_OPENED, which may
not yet have been delivered. This can be reproduced with the following
command, generally within 50 or so iterations:

  mdroth@loki:~$ cat cmds
  {'execute':'qmp_capabilities'}
  {'execute':'query-cpus'}
  mdroth@loki:~$ while true; do if socat stdio 
unix-connect:/tmp/qmp.sock
  cmds | grep CommandNotFound /dev/null; then echo failed; break; 
else
  echo ok; fi; done;
  ok
  ok
  failed
  mdroth@loki:~$

As a work-around, we reset capabilities as part of the CHR_EVENT_CLOSED
hook, which gets called as part of the GIOChannel cb associated with 
the
client rather than a bottom-half, and is thus guaranteed to be 
delivered
prior to accepting any subsequent client sessions.

This does not address the more general problem of whether or not there
are chardev users that rely on CHR_EVENT_OPENED being delivered prior 
to
client data, and whether or not we should consider moving 
CHR_EVENT_OPENED
in-band with connection establishment as a general solution, but 
fixes
QMP for the time being.

Reported-by: Stefan Priebe s.pri...@profihost.ag
Cc: qemu-sta...@nongnu.org
Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
   
Thanks for debugging this Michael. I'm going to apply your patch to the 
qmp
branch because we can't let this broken (esp. in -stable) but this is 
papering
over the real bug...
   
   Do we really need OPENED to happen in a bottom half?  Shouldn't the open
   event handlers register the callback instead if they need it?
  
  I think that's the more general fix, but wasn't sure if there were
  historical reasons why we didn't do this in the first place.
  
  Looking at it more closely it does seem like we need a general fix
  sooner rather than later though, and I don't see any reason why we can't
  move BH registration into the actual OPENED hooks as you suggest:
  
  hw/usb/ccid-card-passthru.c
   - currently affected? no
  debug hook only
   - needs OPENED BH? no
  
  hw/usb/redirect.c
   - currently affected? yes
  can_read handler checks for dev-parser != NULL, which may be
  true if CLOSED BH has been executed yet. In the past, OPENED
  quiesced outstanding CLOSED events prior to us reading client data.
   - need OPENED BH? possibly
  seems to only be needed by CLOSED events, which can be issued by
  USBDevice, so we do teardown/usb_detach in BH. For OPENED, this
  may not apply. if we do issue a BH, we'd need to modify can_read
  handler to avoid race.
  
  hw/usb/dev-serial.c
   - currently affected? no
  can_read checks for dev.attached != NULL. set to NULL
  synchronously in CLOSED, will not accept reads until OPENED gets
  called and sets dev.attached
   - need opened BH?  no
  
  hw/char/sclpconsole.c
   - currently affected? no
  guarded by can_read handler
   - need opened BH? no
  
  hw/char/ipoctal232.c
   - currently affected? no
  debug hook only
   - need opened BH? no
  
  hw/char/virtio-console.c
   - currently affected? no
  OPENED/CLOSED map to vq events handled asynchronously. can_read
  checks for guest_connected state rather than anything set by OPENED
   - need opened BH? no
  
  qtest.c
   - currently affected? yes
  can_read handler doesn't check for qtest_opened == true, can read
  data before OPENED event is 

Re: [Qemu-devel] [PATCH v2] monitor: work around delayed CHR_EVENT_OPENED events

2013-05-27 Thread Luiz Capitulino
On Sun, 26 May 2013 10:33:39 -0500
Michael Roth mdr...@linux.vnet.ibm.com wrote:

 In the past, CHR_EVENT_OPENED events were emitted via a pre-expired
 QEMUTimer. Due to timers being processing at the tail end of each main
 loop iteration, this generally meant that such events would be emitted
 within the same main loop iteration, prior any client data being read
 by tcp/unix socket server backends.
 
 With 9f939df955a4152aad69a19a77e0898631bb2c18, this was switched to
 a bottom-half that is registered via g_idle_add(). This makes it
 likely that the event won't be sent until a subsequent iteration, and
 also adds the possibility that such events will race with the
 processing of client data.
 
 In cases where we rely on the CHR_EVENT_OPENED event being delivered
 prior to any client data being read, this may cause unexpected behavior.
 
 In the case of a QMP monitor session using a socket backend, the delayed
 processing of the CHR_EVENT_OPENED event can lead to a situation where
 a previous session, where 'qmp_capabilities' has already processed, can
 cause the rejection of 'qmp_capabilities' for a subsequent session,
 since we reset capabilities in response to CHR_EVENT_OPENED, which may
 not yet have been delivered. This can be reproduced with the following
 command, generally within 50 or so iterations:
 
   mdroth@loki:~$ cat cmds
   {'execute':'qmp_capabilities'}
   {'execute':'query-cpus'}
   mdroth@loki:~$ while true; do if socat stdio unix-connect:/tmp/qmp.sock
   cmds | grep CommandNotFound /dev/null; then echo failed; break; else
   echo ok; fi; done;
   ok
   ok
   failed
   mdroth@loki:~$
 
 As a work-around, we reset capabilities as part of the CHR_EVENT_CLOSED
 hook, which gets called as part of the GIOChannel cb associated with the
 client rather than a bottom-half, and is thus guaranteed to be delivered
 prior to accepting any subsequent client sessions.
 
 This does not address the more general problem of whether or not there
 are chardev users that rely on CHR_EVENT_OPENED being delivered prior to
 client data, and whether or not we should consider moving CHR_EVENT_OPENED
 in-band with connection establishment as a general solution, but fixes
 QMP for the time being.
 
 Reported-by: Stefan Priebe s.pri...@profihost.ag
 Cc: qemu-sta...@nongnu.org
 Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com

Thanks for debugging this Michael. I'm going to apply your patch to the qmp
branch because we can't let this broken (esp. in -stable) but this is papering
over the real bug...

 ---
 v1-v2:
 * remove command_mode reset from CHR_EVENT_OPENED case, since this
   might still cause a race
 
  monitor.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/monitor.c b/monitor.c
 index 6ce2a4e..f1953a0 100644
 --- a/monitor.c
 +++ b/monitor.c
 @@ -4649,7 +4649,6 @@ static void monitor_control_event(void *opaque, int 
 event)
  
  switch (event) {
  case CHR_EVENT_OPENED:
 -mon-mc-command_mode = 0;
  data = get_qmp_greeting();
  monitor_json_emitter(mon, data);
  qobject_decref(data);
 @@ -4660,6 +4659,7 @@ static void monitor_control_event(void *opaque, int 
 event)
  json_message_parser_init(mon-mc-parser, handle_qmp_command);
  mon_refcount--;
  monitor_fdsets_cleanup();
 +mon-mc-command_mode = 0;
  break;
  }
  }




Re: [Qemu-devel] [PATCH v2] monitor: work around delayed CHR_EVENT_OPENED events

2013-05-27 Thread Anthony Liguori
Luiz Capitulino lcapitul...@redhat.com writes:

 On Sun, 26 May 2013 10:33:39 -0500
 Michael Roth mdr...@linux.vnet.ibm.com wrote:

 In the past, CHR_EVENT_OPENED events were emitted via a pre-expired
 QEMUTimer. Due to timers being processing at the tail end of each main
 loop iteration, this generally meant that such events would be emitted
 within the same main loop iteration, prior any client data being read
 by tcp/unix socket server backends.
 
 With 9f939df955a4152aad69a19a77e0898631bb2c18, this was switched to
 a bottom-half that is registered via g_idle_add(). This makes it
 likely that the event won't be sent until a subsequent iteration, and
 also adds the possibility that such events will race with the
 processing of client data.
 
 In cases where we rely on the CHR_EVENT_OPENED event being delivered
 prior to any client data being read, this may cause unexpected behavior.
 
 In the case of a QMP monitor session using a socket backend, the delayed
 processing of the CHR_EVENT_OPENED event can lead to a situation where
 a previous session, where 'qmp_capabilities' has already processed, can
 cause the rejection of 'qmp_capabilities' for a subsequent session,
 since we reset capabilities in response to CHR_EVENT_OPENED, which may
 not yet have been delivered. This can be reproduced with the following
 command, generally within 50 or so iterations:
 
   mdroth@loki:~$ cat cmds
   {'execute':'qmp_capabilities'}
   {'execute':'query-cpus'}
   mdroth@loki:~$ while true; do if socat stdio unix-connect:/tmp/qmp.sock
   cmds | grep CommandNotFound /dev/null; then echo failed; break; else
   echo ok; fi; done;
   ok
   ok
   failed
   mdroth@loki:~$
 
 As a work-around, we reset capabilities as part of the CHR_EVENT_CLOSED
 hook, which gets called as part of the GIOChannel cb associated with the
 client rather than a bottom-half, and is thus guaranteed to be delivered
 prior to accepting any subsequent client sessions.
 
 This does not address the more general problem of whether or not there
 are chardev users that rely on CHR_EVENT_OPENED being delivered prior to
 client data, and whether or not we should consider moving CHR_EVENT_OPENED
 in-band with connection establishment as a general solution, but fixes
 QMP for the time being.
 
 Reported-by: Stefan Priebe s.pri...@profihost.ag
 Cc: qemu-sta...@nongnu.org
 Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com

 Thanks for debugging this Michael. I'm going to apply your patch to the qmp
 branch because we can't let this broken (esp. in -stable) but this is papering
 over the real bug...

Do we really need OPENED to happen in a bottom half?  Shouldn't the open
event handlers register the callback instead if they need it?

Regards,

Anthony Liguori


 ---
 v1-v2:
 * remove command_mode reset from CHR_EVENT_OPENED case, since this
   might still cause a race
 
  monitor.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/monitor.c b/monitor.c
 index 6ce2a4e..f1953a0 100644
 --- a/monitor.c
 +++ b/monitor.c
 @@ -4649,7 +4649,6 @@ static void monitor_control_event(void *opaque, int 
 event)
  
  switch (event) {
  case CHR_EVENT_OPENED:
 -mon-mc-command_mode = 0;
  data = get_qmp_greeting();
  monitor_json_emitter(mon, data);
  qobject_decref(data);
 @@ -4660,6 +4659,7 @@ static void monitor_control_event(void *opaque, int 
 event)
  json_message_parser_init(mon-mc-parser, handle_qmp_command);
  mon_refcount--;
  monitor_fdsets_cleanup();
 +mon-mc-command_mode = 0;
  break;
  }
  }



Re: [Qemu-devel] [PATCH v2] monitor: work around delayed CHR_EVENT_OPENED events

2013-05-27 Thread mdroth
On Mon, May 27, 2013 at 11:16:01AM -0500, Anthony Liguori wrote:
 Luiz Capitulino lcapitul...@redhat.com writes:
 
  On Sun, 26 May 2013 10:33:39 -0500
  Michael Roth mdr...@linux.vnet.ibm.com wrote:
 
  In the past, CHR_EVENT_OPENED events were emitted via a pre-expired
  QEMUTimer. Due to timers being processing at the tail end of each main
  loop iteration, this generally meant that such events would be emitted
  within the same main loop iteration, prior any client data being read
  by tcp/unix socket server backends.
  
  With 9f939df955a4152aad69a19a77e0898631bb2c18, this was switched to
  a bottom-half that is registered via g_idle_add(). This makes it
  likely that the event won't be sent until a subsequent iteration, and
  also adds the possibility that such events will race with the
  processing of client data.
  
  In cases where we rely on the CHR_EVENT_OPENED event being delivered
  prior to any client data being read, this may cause unexpected behavior.
  
  In the case of a QMP monitor session using a socket backend, the delayed
  processing of the CHR_EVENT_OPENED event can lead to a situation where
  a previous session, where 'qmp_capabilities' has already processed, can
  cause the rejection of 'qmp_capabilities' for a subsequent session,
  since we reset capabilities in response to CHR_EVENT_OPENED, which may
  not yet have been delivered. This can be reproduced with the following
  command, generally within 50 or so iterations:
  
mdroth@loki:~$ cat cmds
{'execute':'qmp_capabilities'}
{'execute':'query-cpus'}
mdroth@loki:~$ while true; do if socat stdio unix-connect:/tmp/qmp.sock
cmds | grep CommandNotFound /dev/null; then echo failed; break; else
echo ok; fi; done;
ok
ok
failed
mdroth@loki:~$
  
  As a work-around, we reset capabilities as part of the CHR_EVENT_CLOSED
  hook, which gets called as part of the GIOChannel cb associated with the
  client rather than a bottom-half, and is thus guaranteed to be delivered
  prior to accepting any subsequent client sessions.
  
  This does not address the more general problem of whether or not there
  are chardev users that rely on CHR_EVENT_OPENED being delivered prior to
  client data, and whether or not we should consider moving CHR_EVENT_OPENED
  in-band with connection establishment as a general solution, but fixes
  QMP for the time being.
  
  Reported-by: Stefan Priebe s.pri...@profihost.ag
  Cc: qemu-sta...@nongnu.org
  Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
 
  Thanks for debugging this Michael. I'm going to apply your patch to the qmp
  branch because we can't let this broken (esp. in -stable) but this is 
  papering
  over the real bug...
 
 Do we really need OPENED to happen in a bottom half?  Shouldn't the open
 event handlers register the callback instead if they need it?

I think that's the more general fix, but wasn't sure if there were
historical reasons why we didn't do this in the first place.

Looking at it more closely it does seem like we need a general fix
sooner rather than later though, and I don't see any reason why we can't
move BH registration into the actual OPENED hooks as you suggest:

hw/usb/ccid-card-passthru.c
 - currently affected? no
debug hook only
 - needs OPENED BH? no

hw/usb/redirect.c
 - currently affected? yes
can_read handler checks for dev-parser != NULL, which may be
true if CLOSED BH has been executed yet. In the past, OPENED
quiesced outstanding CLOSED events prior to us reading client data.
 - need OPENED BH? possibly
seems to only be needed by CLOSED events, which can be issued by
USBDevice, so we do teardown/usb_detach in BH. For OPENED, this
may not apply. if we do issue a BH, we'd need to modify can_read
handler to avoid race.

hw/usb/dev-serial.c
 - currently affected? no
can_read checks for dev.attached != NULL. set to NULL
synchronously in CLOSED, will not accept reads until OPENED gets
called and sets dev.attached
 - need opened BH?  no

hw/char/sclpconsole.c
 - currently affected? no
guarded by can_read handler
 - need opened BH? no

hw/char/ipoctal232.c
 - currently affected? no
debug hook only
 - need opened BH? no

hw/char/virtio-console.c
 - currently affected? no
OPENED/CLOSED map to vq events handled asynchronously. can_read
checks for guest_connected state rather than anything set by OPENED
 - need opened BH? no

qtest.c
 - currently affected? yes
can_read handler doesn't check for qtest_opened == true, can read
data before OPENED event is processed
 - need opened BH? no

monitor.c
 - current affected? yes
negotiated session state can temporarilly leak into a new session
 - need opened BH? no

gdbstub.c
 - currently affected? yes
can fail to pause machine prior to starting gdb session
 - need opened BH? no

So we have a number of cases that can be fixed by avoiding the use of
the generic BH, and only 1 possible cause where we might need a

[Qemu-devel] [PATCH v2] monitor: work around delayed CHR_EVENT_OPENED events

2013-05-26 Thread Michael Roth
In the past, CHR_EVENT_OPENED events were emitted via a pre-expired
QEMUTimer. Due to timers being processing at the tail end of each main
loop iteration, this generally meant that such events would be emitted
within the same main loop iteration, prior any client data being read
by tcp/unix socket server backends.

With 9f939df955a4152aad69a19a77e0898631bb2c18, this was switched to
a bottom-half that is registered via g_idle_add(). This makes it
likely that the event won't be sent until a subsequent iteration, and
also adds the possibility that such events will race with the
processing of client data.

In cases where we rely on the CHR_EVENT_OPENED event being delivered
prior to any client data being read, this may cause unexpected behavior.

In the case of a QMP monitor session using a socket backend, the delayed
processing of the CHR_EVENT_OPENED event can lead to a situation where
a previous session, where 'qmp_capabilities' has already processed, can
cause the rejection of 'qmp_capabilities' for a subsequent session,
since we reset capabilities in response to CHR_EVENT_OPENED, which may
not yet have been delivered. This can be reproduced with the following
command, generally within 50 or so iterations:

  mdroth@loki:~$ cat cmds
  {'execute':'qmp_capabilities'}
  {'execute':'query-cpus'}
  mdroth@loki:~$ while true; do if socat stdio unix-connect:/tmp/qmp.sock
  cmds | grep CommandNotFound /dev/null; then echo failed; break; else
  echo ok; fi; done;
  ok
  ok
  failed
  mdroth@loki:~$

As a work-around, we reset capabilities as part of the CHR_EVENT_CLOSED
hook, which gets called as part of the GIOChannel cb associated with the
client rather than a bottom-half, and is thus guaranteed to be delivered
prior to accepting any subsequent client sessions.

This does not address the more general problem of whether or not there
are chardev users that rely on CHR_EVENT_OPENED being delivered prior to
client data, and whether or not we should consider moving CHR_EVENT_OPENED
in-band with connection establishment as a general solution, but fixes
QMP for the time being.

Reported-by: Stefan Priebe s.pri...@profihost.ag
Cc: qemu-sta...@nongnu.org
Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
v1-v2:
* remove command_mode reset from CHR_EVENT_OPENED case, since this
  might still cause a race

 monitor.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/monitor.c b/monitor.c
index 6ce2a4e..f1953a0 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4649,7 +4649,6 @@ static void monitor_control_event(void *opaque, int event)
 
 switch (event) {
 case CHR_EVENT_OPENED:
-mon-mc-command_mode = 0;
 data = get_qmp_greeting();
 monitor_json_emitter(mon, data);
 qobject_decref(data);
@@ -4660,6 +4659,7 @@ static void monitor_control_event(void *opaque, int event)
 json_message_parser_init(mon-mc-parser, handle_qmp_command);
 mon_refcount--;
 monitor_fdsets_cleanup();
+mon-mc-command_mode = 0;
 break;
 }
 }
-- 
1.7.9.5