RE: [PATCH] xen-blkback: fix compatibility bug with single page rings

2021-01-28 Thread Paul Durrant
Apologies; I missed the v2 from the subject line. I'll re-send.

  Paul

> -Original Message-
> From: Xen-devel  On Behalf Of Paul 
> Durrant
> Sent: 28 January 2021 12:55
> To: xen-de...@lists.xenproject.org; linux-bl...@vger.kernel.org; 
> linux-kernel@vger.kernel.org
> Cc: Paul Durrant ; Konrad Rzeszutek Wilk 
> ; Roger Pau
> Monné ; Jens Axboe ; Dongli Zhang 
> 
> Subject: [PATCH] xen-blkback: fix compatibility bug with single page rings
> 
> From: Paul Durrant 
> 
> Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
> inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
> behaviour of xen-blkback when connecting to a frontend was:
> 
> - read 'ring-page-order'
> - if not present then expect a single page ring specified by 'ring-ref'
> - else expect a ring specified by 'ring-refX' where X is between 0 and
>   1 << ring-page-order
> 
> This was correct behaviour, but was broken by the afforementioned commit to
> become:
> 
> - read 'ring-page-order'
> - if not present then expect a single page ring (i.e. ring-page-order = 0)
> - expect a ring specified by 'ring-refX' where X is between 0 and
>   1 << ring-page-order
> - if that didn't work then see if there's a single page ring specified by
>   'ring-ref'
> 
> This incorrect behaviour works most of the time but fails when a frontend
> that sets 'ring-page-order' is unloaded and replaced by one that does not
> because, instead of reading 'ring-ref', xen-blkback will read the stale
> 'ring-ref0' left around by the previous frontend will try to map the wrong
> grant reference.
> 
> This patch restores the original behaviour.
> 
> Fixes: 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid 
> inconsistent xenstore 'ring-page-
> order' set by malicious blkfront")
> Signed-off-by: Paul Durrant 
> ---
> Cc: Konrad Rzeszutek Wilk 
> Cc: "Roger Pau Monné" 
> Cc: Jens Axboe 
> Cc: Dongli Zhang 
> 
> v2:
>  - Remove now-spurious error path special-case when nr_grefs == 1
> ---
>  drivers/block/xen-blkback/common.h |  1 +
>  drivers/block/xen-blkback/xenbus.c | 38 +-
>  2 files changed, 17 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/block/xen-blkback/common.h 
> b/drivers/block/xen-blkback/common.h
> index b0c71d3a81a0..524a79f10de6 100644
> --- a/drivers/block/xen-blkback/common.h
> +++ b/drivers/block/xen-blkback/common.h
> @@ -313,6 +313,7 @@ struct xen_blkif {
> 
>   struct work_struct  free_work;
>   unsigned intnr_ring_pages;
> + boolmulti_ref;
>   /* All rings for this device. */
>   struct xen_blkif_ring   *rings;
>   unsigned intnr_rings;
> diff --git a/drivers/block/xen-blkback/xenbus.c 
> b/drivers/block/xen-blkback/xenbus.c
> index 9860d4842f36..6c5e9373e91c 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -998,14 +998,17 @@ static int read_per_ring_refs(struct xen_blkif_ring 
> *ring, const char *dir)
>   for (i = 0; i < nr_grefs; i++) {
>   char ring_ref_name[RINGREF_NAME_LEN];
> 
> - snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i);
> + if (blkif->multi_ref)
> + snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", 
> i);
> + else {
> + WARN_ON(i != 0);
> + snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref");
> + }
> +
>   err = xenbus_scanf(XBT_NIL, dir, ring_ref_name,
>  "%u", _ref[i]);
> 
>   if (err != 1) {
> - if (nr_grefs == 1)
> - break;
> -
>   err = -EINVAL;
>   xenbus_dev_fatal(dev, err, "reading %s/%s",
>dir, ring_ref_name);
> @@ -1013,18 +1016,6 @@ static int read_per_ring_refs(struct xen_blkif_ring 
> *ring, const char *dir)
>   }
>   }
> 
> - if (err != 1) {
> - WARN_ON(nr_grefs != 1);
> -
> - err = xenbus_scanf(XBT_NIL, dir, "ring-ref", "%u",
> -_ref[0]);
> - if (err != 1) {
> - err = -EINVAL;
> - xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir);
> - return err;
> - }
> - }
> -
>   err = -ENOMEM;
>   for (i = 0; i < nr_grefs * XEN_BLKIF_REQS_PER_PAGE; i++) {
>   req = kzalloc(sizeof(*req), GFP_KERNEL);
> @@ -1129,10 +1120,15 @@ static int connect_ring(struct backend_info *be)
>blkif->nr_rings, blkif->blk_protocol, protocol,
>blkif->vbd.feature_gnt_persistent ? "persistent grants" : "");
> 
> - ring_page_order = xenbus_read_unsigned(dev->otherend,
> -"ring-page-order", 0);
> -
> - if (ring_page_order > xen_blkif_max_ring_order) {
> +   

RE: [PATCH] xen-blkback: fix compatibility bug with single page rings

2021-01-28 Thread Paul Durrant
> -Original Message-
> From: Xen-devel  On Behalf Of Dongli 
> Zhang
> Sent: 27 January 2021 19:57
> To: Paul Durrant ; xen-de...@lists.xenproject.org; 
> linux-bl...@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Cc: Paul Durrant ; Konrad Rzeszutek Wilk 
> ; Roger Pau
> Monné ; Jens Axboe 
> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page rings
> 
> 
> 
> On 1/27/21 2:30 AM, Paul Durrant wrote:
> > From: Paul Durrant 
> >
> > Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
> > inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
> > behaviour of xen-blkback when connecting to a frontend was:
> >
> > - read 'ring-page-order'
> > - if not present then expect a single page ring specified by 'ring-ref'
> > - else expect a ring specified by 'ring-refX' where X is between 0 and
> >   1 << ring-page-order
> >
> > This was correct behaviour, but was broken by the afforementioned commit to
> > become:
> >
> > - read 'ring-page-order'
> > - if not present then expect a single page ring
> > - expect a ring specified by 'ring-refX' where X is between 0 and
> >   1 << ring-page-order
> > - if that didn't work then see if there's a single page ring specified by
> >   'ring-ref'
> >
> > This incorrect behaviour works most of the time but fails when a frontend
> > that sets 'ring-page-order' is unloaded and replaced by one that does not
> > because, instead of reading 'ring-ref', xen-blkback will read the stale
> > 'ring-ref0' left around by the previous frontend will try to map the wrong
> > grant reference.
> >
> > This patch restores the original behaviour.
> >
> > Fixes: 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid 
> > inconsistent xenstore 'ring-page-
> order' set by malicious blkfront")
> > Signed-off-by: Paul Durrant 
> > ---
> > Cc: Konrad Rzeszutek Wilk 
> > Cc: "Roger Pau Monné" 
> > Cc: Jens Axboe 
> > Cc: Dongli Zhang 
> > ---
> >  drivers/block/xen-blkback/common.h |  1 +
> >  drivers/block/xen-blkback/xenbus.c | 36 +-
> >  2 files changed, 17 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/block/xen-blkback/common.h 
> > b/drivers/block/xen-blkback/common.h
> > index b0c71d3a81a0..524a79f10de6 100644
> > --- a/drivers/block/xen-blkback/common.h
> > +++ b/drivers/block/xen-blkback/common.h
> > @@ -313,6 +313,7 @@ struct xen_blkif {
> >
> > struct work_struct  free_work;
> > unsigned intnr_ring_pages;
> > +   boolmulti_ref;
> > /* All rings for this device. */
> > struct xen_blkif_ring   *rings;
> > unsigned intnr_rings;
> > diff --git a/drivers/block/xen-blkback/xenbus.c 
> > b/drivers/block/xen-blkback/xenbus.c
> > index 9860d4842f36..4c1541cde68c 100644
> > --- a/drivers/block/xen-blkback/xenbus.c
> > +++ b/drivers/block/xen-blkback/xenbus.c
> > @@ -998,10 +998,15 @@ static int read_per_ring_refs(struct xen_blkif_ring 
> > *ring, const char *dir)
> > for (i = 0; i < nr_grefs; i++) {
> > char ring_ref_name[RINGREF_NAME_LEN];
> >
> > -   snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i);
> > +   if (blkif->multi_ref)
> > +   snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", 
> > i);
> > +   else {
> > +   WARN_ON(i != 0);
> > +   snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref");
> > +   }
> > +
> > err = xenbus_scanf(XBT_NIL, dir, ring_ref_name,
> >"%u", _ref[i]);
> > -
> > if (err != 1) {
> > if (nr_grefs == 1)
> > break;
> 
> I think we should not simply break here because the failure can be due to when
> (nr_grefs == 1) and reading from legacy "ring-ref".
> 

Yes, you're quite right. This special case is no longer correct.

> Should we do something as below?
> 
> err = -EINVAL;
> xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir);
> return err;
> 

I think simply removing the 'if (nr_grefs == 1)' will be sufficient.

  Paul

> Dongli Zhang
> 
> 
> > @@ -1013,18 +1018,6 @@ static int read_per_ring_refs(struct xen_blkif_ring 
> > *ring, const char *dir)
> > }
> >  

Re: [PATCH] xen-blkback: fix compatibility bug with single page rings

2021-01-27 Thread Dongli Zhang



On 1/27/21 2:30 AM, Paul Durrant wrote:
> From: Paul Durrant 
> 
> Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
> inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
> behaviour of xen-blkback when connecting to a frontend was:
> 
> - read 'ring-page-order'
> - if not present then expect a single page ring specified by 'ring-ref'
> - else expect a ring specified by 'ring-refX' where X is between 0 and
>   1 << ring-page-order
> 
> This was correct behaviour, but was broken by the afforementioned commit to
> become:
> 
> - read 'ring-page-order'
> - if not present then expect a single page ring
> - expect a ring specified by 'ring-refX' where X is between 0 and
>   1 << ring-page-order
> - if that didn't work then see if there's a single page ring specified by
>   'ring-ref'
> 
> This incorrect behaviour works most of the time but fails when a frontend
> that sets 'ring-page-order' is unloaded and replaced by one that does not
> because, instead of reading 'ring-ref', xen-blkback will read the stale
> 'ring-ref0' left around by the previous frontend will try to map the wrong
> grant reference.
> 
> This patch restores the original behaviour.
> 
> Fixes: 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid 
> inconsistent xenstore 'ring-page-order' set by malicious blkfront")
> Signed-off-by: Paul Durrant 
> ---
> Cc: Konrad Rzeszutek Wilk 
> Cc: "Roger Pau Monné" 
> Cc: Jens Axboe 
> Cc: Dongli Zhang 
> ---
>  drivers/block/xen-blkback/common.h |  1 +
>  drivers/block/xen-blkback/xenbus.c | 36 +-
>  2 files changed, 17 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/block/xen-blkback/common.h 
> b/drivers/block/xen-blkback/common.h
> index b0c71d3a81a0..524a79f10de6 100644
> --- a/drivers/block/xen-blkback/common.h
> +++ b/drivers/block/xen-blkback/common.h
> @@ -313,6 +313,7 @@ struct xen_blkif {
>  
>   struct work_struct  free_work;
>   unsigned intnr_ring_pages;
> + boolmulti_ref;
>   /* All rings for this device. */
>   struct xen_blkif_ring   *rings;
>   unsigned intnr_rings;
> diff --git a/drivers/block/xen-blkback/xenbus.c 
> b/drivers/block/xen-blkback/xenbus.c
> index 9860d4842f36..4c1541cde68c 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -998,10 +998,15 @@ static int read_per_ring_refs(struct xen_blkif_ring 
> *ring, const char *dir)
>   for (i = 0; i < nr_grefs; i++) {
>   char ring_ref_name[RINGREF_NAME_LEN];
>  
> - snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i);
> + if (blkif->multi_ref)
> + snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", 
> i);
> + else {
> + WARN_ON(i != 0);
> + snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref");
> + }
> +
>   err = xenbus_scanf(XBT_NIL, dir, ring_ref_name,
>  "%u", _ref[i]);
> -
>   if (err != 1) {
>   if (nr_grefs == 1)
>   break;

I think we should not simply break here because the failure can be due to when
(nr_grefs == 1) and reading from legacy "ring-ref".

Should we do something as below?

err = -EINVAL;
xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir);
return err;

Dongli Zhang


> @@ -1013,18 +1018,6 @@ static int read_per_ring_refs(struct xen_blkif_ring 
> *ring, const char *dir)
>   }
>   }
>  
> - if (err != 1) {
> - WARN_ON(nr_grefs != 1);
> -
> - err = xenbus_scanf(XBT_NIL, dir, "ring-ref", "%u",
> -_ref[0]);
> - if (err != 1) {
> - err = -EINVAL;
> - xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir);
> - return err;
> - }
> - }
> -
>   err = -ENOMEM;
>   for (i = 0; i < nr_grefs * XEN_BLKIF_REQS_PER_PAGE; i++) {
>   req = kzalloc(sizeof(*req), GFP_KERNEL);
> @@ -1129,10 +1122,15 @@ static int connect_ring(struct backend_info *be)
>blkif->nr_rings, blkif->blk_protocol, protocol,
>blkif->vbd.feature_gnt_persistent ? "persistent grants" : "");
>  
> - ring_page_order = xenbus_read_unsigned(dev->otherend,
> -"ring-page-order", 0);
> -
> - if (ring_page_order > xen_blkif_max_ring_order) {
> + err = xenbus_scanf(XBT_NIL, dev->otherend, "ring-page-order", "%u",
> +_page_order);
> + if (err != 1) {
> + blkif->nr_ring_pages = 1;
> + blkif->multi_ref = false;
> + } else if (ring_page_order <= xen_blkif_max_ring_order) {
> + blkif->nr_ring_pages = 1 << ring_page_order;
> + blkif->multi_ref = true;
> + } else {
>   

Re: [PATCH] xen-blkback: fix compatibility bug with single page rings

2021-01-27 Thread Jan Beulich
On 27.01.2021 12:33, Paul Durrant wrote:
>> -Original Message-
>> From: Jan Beulich 
>> Sent: 27 January 2021 11:21
>> To: p...@xen.org
>> Cc: 'Paul Durrant' ; 'Konrad Rzeszutek Wilk' 
>> ; 'Roger Pau
>> Monné' ; 'Jens Axboe' ; 'Dongli Zhang'
>> ; linux-kernel@vger.kernel.org; 
>> linux-bl...@vger.kernel.org; xen-
>> de...@lists.xenproject.org
>> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page 
>> rings
>>
>> On 27.01.2021 12:09, Paul Durrant wrote:
>>>> -Original Message-
>>>> From: Jan Beulich 
>>>> Sent: 27 January 2021 10:57
>>>> To: Paul Durrant 
>>>> Cc: Paul Durrant ; Konrad Rzeszutek Wilk 
>>>> ; Roger Pau
>>>> Monné ; Jens Axboe ; Dongli Zhang 
>>>> ;
>>>> linux-kernel@vger.kernel.org; linux-bl...@vger.kernel.org; 
>>>> xen-de...@lists.xenproject.org
>>>> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page 
>>>> rings
>>>>
>>>> On 27.01.2021 11:30, Paul Durrant wrote:
>>>>> From: Paul Durrant 
>>>>>
>>>>> Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
>>>>> inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
>>>>> behaviour of xen-blkback when connecting to a frontend was:
>>>>>
>>>>> - read 'ring-page-order'
>>>>> - if not present then expect a single page ring specified by 'ring-ref'
>>>>> - else expect a ring specified by 'ring-refX' where X is between 0 and
>>>>>   1 << ring-page-order
>>>>>
>>>>> This was correct behaviour, but was broken by the afforementioned commit 
>>>>> to
>>>>> become:
>>>>>
>>>>> - read 'ring-page-order'
>>>>> - if not present then expect a single page ring
>>>>> - expect a ring specified by 'ring-refX' where X is between 0 and
>>>>>   1 << ring-page-order
>>>>> - if that didn't work then see if there's a single page ring specified by
>>>>>   'ring-ref'
>>>>>
>>>>> This incorrect behaviour works most of the time but fails when a frontend
>>>>> that sets 'ring-page-order' is unloaded and replaced by one that does not
>>>>> because, instead of reading 'ring-ref', xen-blkback will read the stale
>>>>> 'ring-ref0' left around by the previous frontend will try to map the wrong
>>>>> grant reference.
>>>>>
>>>>> This patch restores the original behaviour.
>>>>
>>>> Isn't this only the 2nd of a pair of fixes that's needed, the
>>>> first being the drivers, upon being unloaded, to fully clean up
>>>> after itself? Any stale key left may lead to confusion upon
>>>> re-use of the containing directory.
>>>
>>> In a backend we shouldn't be relying on, nor really expect IMO, a frontend 
>>> to clean up after itself.
>> Any backend should know *exactly* what xenstore nodes it’s looking for from 
>> a frontend.
>>
>> But the backend can't know whether a node exists because the present
>> frontend has written it, or because an earlier instance forgot to
>> delete it. It can only honor what's there. (In fact the other day I
>> was wondering whether some of the writes of boolean "false" nodes
>> wouldn't better be xenbus_rm() instead.)
> 
> In the particular case this patch is fixing for me, the frontends are the 
> Windows XENVBD driver and the Windows crash version of the same driver 
> (actually built from different code). The 'normal' instance is multi-page 
> aware and the crash instance is not quite, i.e. it uses the old ring-ref but 
> knows to clean up 'ring-page-order'.
> Clearly, in a crash situation, we cannot rely on frontend to clean up

Ah, I see (and agree).

> so what you say does highlight that there indeed needs to be a second patch 
> to xen-blkback to make sure it removes 'ring-page-order' itself as 'state' 
> cycles through Closed and back to InitWait.

And not just this one node then, I suppose?

> I think this patch does still stand on its own though.

Perhaps, yes.

Jan


RE: [PATCH] xen-blkback: fix compatibility bug with single page rings

2021-01-27 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich 
> Sent: 27 January 2021 11:21
> To: p...@xen.org
> Cc: 'Paul Durrant' ; 'Konrad Rzeszutek Wilk' 
> ; 'Roger Pau
> Monné' ; 'Jens Axboe' ; 'Dongli Zhang'
> ; linux-kernel@vger.kernel.org; 
> linux-bl...@vger.kernel.org; xen-
> de...@lists.xenproject.org
> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page rings
> 
> On 27.01.2021 12:09, Paul Durrant wrote:
> >> -Original Message-
> >> From: Jan Beulich 
> >> Sent: 27 January 2021 10:57
> >> To: Paul Durrant 
> >> Cc: Paul Durrant ; Konrad Rzeszutek Wilk 
> >> ; Roger Pau
> >> Monné ; Jens Axboe ; Dongli Zhang 
> >> ;
> >> linux-kernel@vger.kernel.org; linux-bl...@vger.kernel.org; 
> >> xen-de...@lists.xenproject.org
> >> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page 
> >> rings
> >>
> >> On 27.01.2021 11:30, Paul Durrant wrote:
> >>> From: Paul Durrant 
> >>>
> >>> Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
> >>> inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
> >>> behaviour of xen-blkback when connecting to a frontend was:
> >>>
> >>> - read 'ring-page-order'
> >>> - if not present then expect a single page ring specified by 'ring-ref'
> >>> - else expect a ring specified by 'ring-refX' where X is between 0 and
> >>>   1 << ring-page-order
> >>>
> >>> This was correct behaviour, but was broken by the afforementioned commit 
> >>> to
> >>> become:
> >>>
> >>> - read 'ring-page-order'
> >>> - if not present then expect a single page ring
> >>> - expect a ring specified by 'ring-refX' where X is between 0 and
> >>>   1 << ring-page-order
> >>> - if that didn't work then see if there's a single page ring specified by
> >>>   'ring-ref'
> >>>
> >>> This incorrect behaviour works most of the time but fails when a frontend
> >>> that sets 'ring-page-order' is unloaded and replaced by one that does not
> >>> because, instead of reading 'ring-ref', xen-blkback will read the stale
> >>> 'ring-ref0' left around by the previous frontend will try to map the wrong
> >>> grant reference.
> >>>
> >>> This patch restores the original behaviour.
> >>
> >> Isn't this only the 2nd of a pair of fixes that's needed, the
> >> first being the drivers, upon being unloaded, to fully clean up
> >> after itself? Any stale key left may lead to confusion upon
> >> re-use of the containing directory.
> >
> > In a backend we shouldn't be relying on, nor really expect IMO, a frontend 
> > to clean up after itself.
> Any backend should know *exactly* what xenstore nodes it’s looking for from a 
> frontend.
> 
> But the backend can't know whether a node exists because the present
> frontend has written it, or because an earlier instance forgot to
> delete it. It can only honor what's there. (In fact the other day I
> was wondering whether some of the writes of boolean "false" nodes
> wouldn't better be xenbus_rm() instead.)

In the particular case this patch is fixing for me, the frontends are the 
Windows XENVBD driver and the Windows crash version of the same driver 
(actually built from different code). The 'normal' instance is multi-page aware 
and the crash instance is not quite, i.e. it uses the old ring-ref but knows to 
clean up 'ring-page-order'.
Clearly, in a crash situation, we cannot rely on frontend to clean up so what 
you say does highlight that there indeed needs to be a second patch to 
xen-blkback to make sure it removes 'ring-page-order' itself as 'state' cycles 
through Closed and back to InitWait. I think this patch does still stand on its 
own though.

  Paul

> 
> Jan



Re: [PATCH] xen-blkback: fix compatibility bug with single page rings

2021-01-27 Thread Jan Beulich
On 27.01.2021 12:09, Paul Durrant wrote:
>> -Original Message-
>> From: Jan Beulich 
>> Sent: 27 January 2021 10:57
>> To: Paul Durrant 
>> Cc: Paul Durrant ; Konrad Rzeszutek Wilk 
>> ; Roger Pau
>> Monné ; Jens Axboe ; Dongli Zhang 
>> ;
>> linux-kernel@vger.kernel.org; linux-bl...@vger.kernel.org; 
>> xen-de...@lists.xenproject.org
>> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page 
>> rings
>>
>> On 27.01.2021 11:30, Paul Durrant wrote:
>>> From: Paul Durrant 
>>>
>>> Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
>>> inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
>>> behaviour of xen-blkback when connecting to a frontend was:
>>>
>>> - read 'ring-page-order'
>>> - if not present then expect a single page ring specified by 'ring-ref'
>>> - else expect a ring specified by 'ring-refX' where X is between 0 and
>>>   1 << ring-page-order
>>>
>>> This was correct behaviour, but was broken by the afforementioned commit to
>>> become:
>>>
>>> - read 'ring-page-order'
>>> - if not present then expect a single page ring
>>> - expect a ring specified by 'ring-refX' where X is between 0 and
>>>   1 << ring-page-order
>>> - if that didn't work then see if there's a single page ring specified by
>>>   'ring-ref'
>>>
>>> This incorrect behaviour works most of the time but fails when a frontend
>>> that sets 'ring-page-order' is unloaded and replaced by one that does not
>>> because, instead of reading 'ring-ref', xen-blkback will read the stale
>>> 'ring-ref0' left around by the previous frontend will try to map the wrong
>>> grant reference.
>>>
>>> This patch restores the original behaviour.
>>
>> Isn't this only the 2nd of a pair of fixes that's needed, the
>> first being the drivers, upon being unloaded, to fully clean up
>> after itself? Any stale key left may lead to confusion upon
>> re-use of the containing directory.
> 
> In a backend we shouldn't be relying on, nor really expect IMO, a frontend to 
> clean up after itself. Any backend should know *exactly* what xenstore nodes 
> it’s looking for from a frontend.

But the backend can't know whether a node exists because the present
frontend has written it, or because an earlier instance forgot to
delete it. It can only honor what's there. (In fact the other day I
was wondering whether some of the writes of boolean "false" nodes
wouldn't better be xenbus_rm() instead.)

Jan


RE: [PATCH] xen-blkback: fix compatibility bug with single page rings

2021-01-27 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich 
> Sent: 27 January 2021 10:57
> To: Paul Durrant 
> Cc: Paul Durrant ; Konrad Rzeszutek Wilk 
> ; Roger Pau
> Monné ; Jens Axboe ; Dongli Zhang 
> ;
> linux-kernel@vger.kernel.org; linux-bl...@vger.kernel.org; 
> xen-de...@lists.xenproject.org
> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page rings
> 
> On 27.01.2021 11:30, Paul Durrant wrote:
> > From: Paul Durrant 
> >
> > Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
> > inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
> > behaviour of xen-blkback when connecting to a frontend was:
> >
> > - read 'ring-page-order'
> > - if not present then expect a single page ring specified by 'ring-ref'
> > - else expect a ring specified by 'ring-refX' where X is between 0 and
> >   1 << ring-page-order
> >
> > This was correct behaviour, but was broken by the afforementioned commit to
> > become:
> >
> > - read 'ring-page-order'
> > - if not present then expect a single page ring
> > - expect a ring specified by 'ring-refX' where X is between 0 and
> >   1 << ring-page-order
> > - if that didn't work then see if there's a single page ring specified by
> >   'ring-ref'
> >
> > This incorrect behaviour works most of the time but fails when a frontend
> > that sets 'ring-page-order' is unloaded and replaced by one that does not
> > because, instead of reading 'ring-ref', xen-blkback will read the stale
> > 'ring-ref0' left around by the previous frontend will try to map the wrong
> > grant reference.
> >
> > This patch restores the original behaviour.
> 
> Isn't this only the 2nd of a pair of fixes that's needed, the
> first being the drivers, upon being unloaded, to fully clean up
> after itself? Any stale key left may lead to confusion upon
> re-use of the containing directory.

In a backend we shouldn't be relying on, nor really expect IMO, a frontend to 
clean up after itself. Any backend should know *exactly* what xenstore nodes 
it’s looking for from a frontend.

  Paul

> 
> Jan



Re: [PATCH] xen-blkback: fix compatibility bug with single page rings

2021-01-27 Thread Jan Beulich
On 27.01.2021 11:30, Paul Durrant wrote:
> From: Paul Durrant 
> 
> Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
> inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
> behaviour of xen-blkback when connecting to a frontend was:
> 
> - read 'ring-page-order'
> - if not present then expect a single page ring specified by 'ring-ref'
> - else expect a ring specified by 'ring-refX' where X is between 0 and
>   1 << ring-page-order
> 
> This was correct behaviour, but was broken by the afforementioned commit to
> become:
> 
> - read 'ring-page-order'
> - if not present then expect a single page ring
> - expect a ring specified by 'ring-refX' where X is between 0 and
>   1 << ring-page-order
> - if that didn't work then see if there's a single page ring specified by
>   'ring-ref'
> 
> This incorrect behaviour works most of the time but fails when a frontend
> that sets 'ring-page-order' is unloaded and replaced by one that does not
> because, instead of reading 'ring-ref', xen-blkback will read the stale
> 'ring-ref0' left around by the previous frontend will try to map the wrong
> grant reference.
> 
> This patch restores the original behaviour.

Isn't this only the 2nd of a pair of fixes that's needed, the
first being the drivers, upon being unloaded, to fully clean up
after itself? Any stale key left may lead to confusion upon
re-use of the containing directory.

Jan