Re: [PATCH] xen-hvm: Avoid livelock while handling buffered ioreqs

2024-05-23 Thread Ross Lagerwall
On Tue, Apr 9, 2024 at 3:19 PM Ross Lagerwall  wrote:
>
> On Tue, Apr 9, 2024 at 11:20 AM Anthony PERARD  
> wrote:
> >
> > On Thu, Apr 04, 2024 at 03:08:33PM +0100, Ross Lagerwall wrote:
> > > diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
> > > index 1627da739822..1116b3978938 100644
> > > --- a/hw/xen/xen-hvm-common.c
> > > +++ b/hw/xen/xen-hvm-common.c
> > > @@ -521,22 +521,30 @@ static bool handle_buffered_iopage(XenIOState 
> > > *state)
> > [...]
> > >
> > >  static void handle_buffered_io(void *opaque)
> > >  {
> > > +unsigned int handled;
> > >  XenIOState *state = opaque;
> > >
> > > -if (handle_buffered_iopage(state)) {
> > > +handled = handle_buffered_iopage(state);
> > > +if (handled >= IOREQ_BUFFER_SLOT_NUM) {
> > > +/* We handled a full page of ioreqs. Schedule a timer to continue
> > > + * processing while giving other stuff a chance to run.
> > > + */
> >
> > ./scripts/checkpatch.pl report a style issue here:
> > WARNING: Block comments use a leading /* on a separate line
> >
> > I can try to remember to fix that on commit.
>
> I copied the comment style from a few lines above but I guess it was
> wrong.
>
> >
> > >  timer_mod(state->buffered_io_timer,
> > > -BUFFER_IO_MAX_DELAY + 
> > > qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
> > > -} else {
> > > +qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
> > > +} else if (handled == 0) {
> >
> > Just curious, why did you check for `handled == 0` here instead of
> > `handled != 0`? That would have avoided to invert the last 2 cases, and
> > the patch would just have introduce a new case without changing the
> > order of the existing ones. But not that important I guess.
> >
>
> In general I try to use conditionals with the least amount of negation
> since I think it is easier to read. I can change it if you would prefer?

It looks like this hasn't been committed anywhere. Were you expecting
an updated version with the style issue fixed or has it fallen through
the cracks?

Thanks,
Ross



Re: [PATCH] xen-hvm: Avoid livelock while handling buffered ioreqs

2024-04-09 Thread Peter Maydell
On Tue, 9 Apr 2024 at 15:20, Ross Lagerwall  wrote:
>
> On Tue, Apr 9, 2024 at 11:20 AM Anthony PERARD  
> wrote:
> >
> > On Thu, Apr 04, 2024 at 03:08:33PM +0100, Ross Lagerwall wrote:
> > > diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
> > > index 1627da739822..1116b3978938 100644
> > > --- a/hw/xen/xen-hvm-common.c
> > > +++ b/hw/xen/xen-hvm-common.c
> > > @@ -521,22 +521,30 @@ static bool handle_buffered_iopage(XenIOState 
> > > *state)
> > [...]
> > >
> > >  static void handle_buffered_io(void *opaque)
> > >  {
> > > +unsigned int handled;
> > >  XenIOState *state = opaque;
> > >
> > > -if (handle_buffered_iopage(state)) {
> > > +handled = handle_buffered_iopage(state);
> > > +if (handled >= IOREQ_BUFFER_SLOT_NUM) {
> > > +/* We handled a full page of ioreqs. Schedule a timer to continue
> > > + * processing while giving other stuff a chance to run.
> > > + */
> >
> > ./scripts/checkpatch.pl report a style issue here:
> > WARNING: Block comments use a leading /* on a separate line
> >
> > I can try to remember to fix that on commit.
>
> I copied the comment style from a few lines above but I guess it was
> wrong.

Yes, this is one of those cases where we've settled on a
style choice but there's still quite a lot of older code
in the codebase that doesn't adhere to it. Checkpatch
usually will catch this kind of nit for you.

thanks
-- PMM



Re: [PATCH] xen-hvm: Avoid livelock while handling buffered ioreqs

2024-04-09 Thread Ross Lagerwall
On Tue, Apr 9, 2024 at 11:20 AM Anthony PERARD  wrote:
>
> On Thu, Apr 04, 2024 at 03:08:33PM +0100, Ross Lagerwall wrote:
> > diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
> > index 1627da739822..1116b3978938 100644
> > --- a/hw/xen/xen-hvm-common.c
> > +++ b/hw/xen/xen-hvm-common.c
> > @@ -521,22 +521,30 @@ static bool handle_buffered_iopage(XenIOState *state)
> [...]
> >
> >  static void handle_buffered_io(void *opaque)
> >  {
> > +unsigned int handled;
> >  XenIOState *state = opaque;
> >
> > -if (handle_buffered_iopage(state)) {
> > +handled = handle_buffered_iopage(state);
> > +if (handled >= IOREQ_BUFFER_SLOT_NUM) {
> > +/* We handled a full page of ioreqs. Schedule a timer to continue
> > + * processing while giving other stuff a chance to run.
> > + */
>
> ./scripts/checkpatch.pl report a style issue here:
> WARNING: Block comments use a leading /* on a separate line
>
> I can try to remember to fix that on commit.

I copied the comment style from a few lines above but I guess it was
wrong.

>
> >  timer_mod(state->buffered_io_timer,
> > -BUFFER_IO_MAX_DELAY + 
> > qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
> > -} else {
> > +qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
> > +} else if (handled == 0) {
>
> Just curious, why did you check for `handled == 0` here instead of
> `handled != 0`? That would have avoided to invert the last 2 cases, and
> the patch would just have introduce a new case without changing the
> order of the existing ones. But not that important I guess.
>

In general I try to use conditionals with the least amount of negation
since I think it is easier to read. I can change it if you would prefer?

Ross



Re: [PATCH] xen-hvm: Avoid livelock while handling buffered ioreqs

2024-04-09 Thread Anthony PERARD
On Thu, Apr 04, 2024 at 03:08:33PM +0100, Ross Lagerwall wrote:
> diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
> index 1627da739822..1116b3978938 100644
> --- a/hw/xen/xen-hvm-common.c
> +++ b/hw/xen/xen-hvm-common.c
> @@ -521,22 +521,30 @@ static bool handle_buffered_iopage(XenIOState *state)
[...]
>  
>  static void handle_buffered_io(void *opaque)
>  {
> +unsigned int handled;
>  XenIOState *state = opaque;
>  
> -if (handle_buffered_iopage(state)) {
> +handled = handle_buffered_iopage(state);
> +if (handled >= IOREQ_BUFFER_SLOT_NUM) {
> +/* We handled a full page of ioreqs. Schedule a timer to continue
> + * processing while giving other stuff a chance to run.
> + */

./scripts/checkpatch.pl report a style issue here:
WARNING: Block comments use a leading /* on a separate line

I can try to remember to fix that on commit.

>  timer_mod(state->buffered_io_timer,
> -BUFFER_IO_MAX_DELAY + 
> qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
> -} else {
> +qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
> +} else if (handled == 0) {

Just curious, why did you check for `handled == 0` here instead of
`handled != 0`? That would have avoided to invert the last 2 cases, and
the patch would just have introduce a new case without changing the
order of the existing ones. But not that important I guess.

>  timer_del(state->buffered_io_timer);
>  qemu_xen_evtchn_unmask(state->xce_handle, 
> state->bufioreq_local_port);
> +} else {
> +timer_mod(state->buffered_io_timer,
> +BUFFER_IO_MAX_DELAY + 
> qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
>  }
>  }

Cheers,

-- 
Anthony PERARD



Re: [PATCH] xen-hvm: Avoid livelock while handling buffered ioreqs

2024-04-08 Thread Paul Durrant

On 04/04/2024 15:08, Ross Lagerwall wrote:

A malicious or buggy guest may generated buffered ioreqs faster than
QEMU can process them in handle_buffered_iopage(). The result is a
livelock - QEMU continuously processes ioreqs on the main thread without
iterating through the main loop which prevents handling other events,
processing timers, etc. Without QEMU handling other events, it often
results in the guest becoming unsable and makes it difficult to stop the
source of buffered ioreqs.

To avoid this, if we process a full page of buffered ioreqs, stop and
reschedule an immediate timer to continue processing them. This lets
QEMU go back to the main loop and catch up.

Signed-off-by: Ross Lagerwall 
---
  hw/xen/xen-hvm-common.c | 26 +-
  1 file changed, 17 insertions(+), 9 deletions(-)



Reviewed-by: Paul Durrant 




Re: [PATCH] xen-hvm: Avoid livelock while handling buffered ioreqs

2024-04-08 Thread Paul Durrant

On 08/04/2024 14:00, Ross Lagerwall wrote:

On Sat, Apr 6, 2024 at 11:58 AM Durrant, Paul  wrote:


On 04/04/2024 15:08, Ross Lagerwall wrote:

A malicious or buggy guest may generated buffered ioreqs faster than
QEMU can process them in handle_buffered_iopage(). The result is a
livelock - QEMU continuously processes ioreqs on the main thread without
iterating through the main loop which prevents handling other events,
processing timers, etc. Without QEMU handling other events, it often
results in the guest becoming unsable and makes it difficult to stop the
source of buffered ioreqs.

To avoid this, if we process a full page of buffered ioreqs, stop and
reschedule an immediate timer to continue processing them. This lets
QEMU go back to the main loop and catch up.



Do PV backends potentially cause the same scheduling issue (if not using
io threads)?



 From what I can tell:

xen-block: It reads req_prod / req_cons once before entering the loop
so it should be fine, I think.

xen_console: Same as xen-block

xen_nic: It reads req_prod / req_cons once before entering the loop.
However, once the loop ends it checks for more requests and if there
are more requests it restarts from the beginning. It seems like this
could be susceptible to the same issue.

(These PV backends generally aren't used by XenServer's system QEMU
so I didn't spend too much time looking into it.)

Thanks,


Ok. Thanks for checking.

  Paul




Re: [PATCH] xen-hvm: Avoid livelock while handling buffered ioreqs

2024-04-08 Thread Ross Lagerwall
On Sat, Apr 6, 2024 at 11:58 AM Durrant, Paul  wrote:
>
> On 04/04/2024 15:08, Ross Lagerwall wrote:
> > A malicious or buggy guest may generated buffered ioreqs faster than
> > QEMU can process them in handle_buffered_iopage(). The result is a
> > livelock - QEMU continuously processes ioreqs on the main thread without
> > iterating through the main loop which prevents handling other events,
> > processing timers, etc. Without QEMU handling other events, it often
> > results in the guest becoming unsable and makes it difficult to stop the
> > source of buffered ioreqs.
> >
> > To avoid this, if we process a full page of buffered ioreqs, stop and
> > reschedule an immediate timer to continue processing them. This lets
> > QEMU go back to the main loop and catch up.
> >
>
> Do PV backends potentially cause the same scheduling issue (if not using
> io threads)?
>

>From what I can tell:

xen-block: It reads req_prod / req_cons once before entering the loop
so it should be fine, I think.

xen_console: Same as xen-block

xen_nic: It reads req_prod / req_cons once before entering the loop.
However, once the loop ends it checks for more requests and if there
are more requests it restarts from the beginning. It seems like this
could be susceptible to the same issue.

(These PV backends generally aren't used by XenServer's system QEMU
so I didn't spend too much time looking into it.)

Thanks,
Ross



Re: [PATCH] xen-hvm: Avoid livelock while handling buffered ioreqs

2024-04-06 Thread Durrant, Paul

On 04/04/2024 15:08, Ross Lagerwall wrote:

A malicious or buggy guest may generated buffered ioreqs faster than
QEMU can process them in handle_buffered_iopage(). The result is a
livelock - QEMU continuously processes ioreqs on the main thread without
iterating through the main loop which prevents handling other events,
processing timers, etc. Without QEMU handling other events, it often
results in the guest becoming unsable and makes it difficult to stop the
source of buffered ioreqs.

To avoid this, if we process a full page of buffered ioreqs, stop and
reschedule an immediate timer to continue processing them. This lets
QEMU go back to the main loop and catch up.



Do PV backends potentially cause the same scheduling issue (if not using 
io threads)?



Signed-off-by: Ross Lagerwall 
---
  hw/xen/xen-hvm-common.c | 26 +-
  1 file changed, 17 insertions(+), 9 deletions(-)



Reviewed-by: Paul Durrant 




[PATCH] xen-hvm: Avoid livelock while handling buffered ioreqs

2024-04-04 Thread Ross Lagerwall
A malicious or buggy guest may generated buffered ioreqs faster than
QEMU can process them in handle_buffered_iopage(). The result is a
livelock - QEMU continuously processes ioreqs on the main thread without
iterating through the main loop which prevents handling other events,
processing timers, etc. Without QEMU handling other events, it often
results in the guest becoming unsable and makes it difficult to stop the
source of buffered ioreqs.

To avoid this, if we process a full page of buffered ioreqs, stop and
reschedule an immediate timer to continue processing them. This lets
QEMU go back to the main loop and catch up.

Signed-off-by: Ross Lagerwall 
---
 hw/xen/xen-hvm-common.c | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
index 1627da739822..1116b3978938 100644
--- a/hw/xen/xen-hvm-common.c
+++ b/hw/xen/xen-hvm-common.c
@@ -463,11 +463,11 @@ static void handle_ioreq(XenIOState *state, ioreq_t *req)
 }
 }
 
-static bool handle_buffered_iopage(XenIOState *state)
+static unsigned int handle_buffered_iopage(XenIOState *state)
 {
 buffered_iopage_t *buf_page = state->buffered_io_page;
 buf_ioreq_t *buf_req = NULL;
-bool handled_ioreq = false;
+unsigned int handled = 0;
 ioreq_t req;
 int qw;
 
@@ -480,7 +480,7 @@ static bool handle_buffered_iopage(XenIOState *state)
 req.count = 1;
 req.dir = IOREQ_WRITE;
 
-for (;;) {
+do {
 uint32_t rdptr = buf_page->read_pointer, wrptr;
 
 xen_rmb();
@@ -521,22 +521,30 @@ static bool handle_buffered_iopage(XenIOState *state)
 assert(!req.data_is_ptr);
 
 qatomic_add(&buf_page->read_pointer, qw + 1);
-handled_ioreq = true;
-}
+handled += qw + 1;
+} while (handled < IOREQ_BUFFER_SLOT_NUM);
 
-return handled_ioreq;
+return handled;
 }
 
 static void handle_buffered_io(void *opaque)
 {
+unsigned int handled;
 XenIOState *state = opaque;
 
-if (handle_buffered_iopage(state)) {
+handled = handle_buffered_iopage(state);
+if (handled >= IOREQ_BUFFER_SLOT_NUM) {
+/* We handled a full page of ioreqs. Schedule a timer to continue
+ * processing while giving other stuff a chance to run.
+ */
 timer_mod(state->buffered_io_timer,
-BUFFER_IO_MAX_DELAY + qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
-} else {
+qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
+} else if (handled == 0) {
 timer_del(state->buffered_io_timer);
 qemu_xen_evtchn_unmask(state->xce_handle, state->bufioreq_local_port);
+} else {
+timer_mod(state->buffered_io_timer,
+BUFFER_IO_MAX_DELAY + qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
 }
 }
 
-- 
2.43.0