Re: [PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush

2020-04-22 Thread Jan Beulich
On 22.04.2020 18:31, Stefano Stabellini wrote:
> On Wed, 22 Apr 2020, Jan Beulich wrote:
>> On 22.04.2020 15:07, Juergen Gross wrote:
>>> --- a/xen/common/grant_table.c
>>> +++ b/xen/common/grant_table.c
>>> @@ -3626,12 +3626,12 @@ do_grant_table_op(
>>>  if ( unlikely(!guest_handle_okay(cflush, count)) )
>>>  goto out;
>>>  rc = gnttab_cache_flush(cflush, &opaque_in, count);
>>> -if ( rc > 0 )
>>> +if ( rc >= 0 )
>>>  {
>>>  guest_handle_add_offset(cflush, rc);
>>>  uop = guest_handle_cast(cflush, void);
>>> +opaque_out = opaque_in;
>>>  }
>>> -opaque_out = opaque_in;
>>>  break;
>>>  }
>>>  
>>> @@ -3641,7 +3641,7 @@ do_grant_table_op(
>>>  }
>>>  
>>>out:
>>> -if ( rc > 0 || opaque_out != 0 )
>>> +if ( rc > 0 || (opaque_out != 0 && rc == 0) )
>>
>> I disagree with this part - opaque_out shouldn't end up non-zero
>> when rc < 0, and it won't anymore with the change in the earlier
>> hunk.
> 
> But opaque_out could end up being non-zero when rc == 0.

Which is the case the original code meant to deal with. (I still
think it is unfortunate behavior of the cache-flush implementation
to assign meaning other than "success, done" to rc == 0.)

> I think it is a
> good improvement that we explicitly prevent rc < 0 from entering this
> if condition. This is why this new version of the patch is my favorite:
> it is the one that leads to the code most robust. 
> 
> Reviewed-by: Stefano Stabellini 

Well, looks like I'm outvoted then. Nevertheless thanks ...

> That said, as I mentioned before, I would be OK even without the last
> part because it would still be enough to fix the bug.

.. for also clarifying this.

Jan



Re: [PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush

2020-04-22 Thread Stefano Stabellini
On Wed, 22 Apr 2020, Jan Beulich wrote:
> On 22.04.2020 15:07, Juergen Gross wrote:
> > --- a/xen/common/grant_table.c
> > +++ b/xen/common/grant_table.c
> > @@ -3626,12 +3626,12 @@ do_grant_table_op(
> >  if ( unlikely(!guest_handle_okay(cflush, count)) )
> >  goto out;
> >  rc = gnttab_cache_flush(cflush, &opaque_in, count);
> > -if ( rc > 0 )
> > +if ( rc >= 0 )
> >  {
> >  guest_handle_add_offset(cflush, rc);
> >  uop = guest_handle_cast(cflush, void);
> > +opaque_out = opaque_in;
> >  }
> > -opaque_out = opaque_in;
> >  break;
> >  }
> >  
> > @@ -3641,7 +3641,7 @@ do_grant_table_op(
> >  }
> >  
> >out:
> > -if ( rc > 0 || opaque_out != 0 )
> > +if ( rc > 0 || (opaque_out != 0 && rc == 0) )
> 
> I disagree with this part - opaque_out shouldn't end up non-zero
> when rc < 0, and it won't anymore with the change in the earlier
> hunk.

But opaque_out could end up being non-zero when rc == 0. I think it is a
good improvement that we explicitly prevent rc < 0 from entering this
if condition. This is why this new version of the patch is my favorite:
it is the one that leads to the code most robust. 

Reviewed-by: Stefano Stabellini 


That said, as I mentioned before, I would be OK even without the last
part because it would still be enough to fix the bug.



Re: [PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush

2020-04-22 Thread Jan Beulich
On 22.04.2020 15:07, Juergen Gross wrote:
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -3626,12 +3626,12 @@ do_grant_table_op(
>  if ( unlikely(!guest_handle_okay(cflush, count)) )
>  goto out;
>  rc = gnttab_cache_flush(cflush, &opaque_in, count);
> -if ( rc > 0 )
> +if ( rc >= 0 )
>  {
>  guest_handle_add_offset(cflush, rc);
>  uop = guest_handle_cast(cflush, void);
> +opaque_out = opaque_in;
>  }
> -opaque_out = opaque_in;
>  break;
>  }
>  
> @@ -3641,7 +3641,7 @@ do_grant_table_op(
>  }
>  
>out:
> -if ( rc > 0 || opaque_out != 0 )
> +if ( rc > 0 || (opaque_out != 0 && rc == 0) )

I disagree with this part - opaque_out shouldn't end up non-zero
when rc < 0, and it won't anymore with the change in the earlier
hunk.

Jan



Re: [PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush

2020-04-22 Thread Jürgen Groß

On 22.04.20 15:43, Julien Grall wrote:

Hi Juergen,

On 22/04/2020 14:07, Juergen Gross wrote:

The GNTTABOP_cache_flush hypercall has a wrong test for hypercall
continuation, the test today is:

 if ( rc > 0 || opaque_out != 0 )

Unfortunately this will be true even in case of an error (rc < 0),
possibly leading to very long lasting hypercalls (times of more
than an hour have been observed in a test case).

Correct the test condition to result in false with rc < 0 and set
opaque_out only if no error occurred, to be on the safe side.

Partially-suggested-by: Jan Beulich 
Signed-off-by: Juergen Gross 
---
  xen/common/grant_table.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 96080b3dec..5ef7ff940d 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3626,12 +3626,12 @@ do_grant_table_op(
  if ( unlikely(!guest_handle_okay(cflush, count)) )
  goto out;
  rc = gnttab_cache_flush(cflush, &opaque_in, count);
-    if ( rc > 0 )
+    if ( rc >= 0 )
  {
  guest_handle_add_offset(cflush, rc);
  uop = guest_handle_cast(cflush, void);
+    opaque_out = opaque_in;
  }
-    opaque_out = opaque_in;
  break;
  }
@@ -3641,7 +3641,7 @@ do_grant_table_op(
  }
    out:
-    if ( rc > 0 || opaque_out != 0 )
+    if ( rc > 0 || (opaque_out != 0 && rc == 0) )


I don't understand this change. If you look at the implementation of 
gnttab_flush() it is not possible to have opaque_out non-zero with rc = 0.


Why not?

In gnttab_cache_flush() we have:

  if ( hypercall_preempt_check() )
  return i;

i will be 0 in the first loop iteration.


Juergen



Re: [PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush

2020-04-22 Thread Julien Grall




On 22/04/2020 14:43, Julien Grall wrote:

Hi Juergen,

On 22/04/2020 14:07, Juergen Gross wrote:

The GNTTABOP_cache_flush hypercall has a wrong test for hypercall
continuation, the test today is:

 if ( rc > 0 || opaque_out != 0 )

Unfortunately this will be true even in case of an error (rc < 0),
possibly leading to very long lasting hypercalls (times of more
than an hour have been observed in a test case).

Correct the test condition to result in false with rc < 0 and set
opaque_out only if no error occurred, to be on the safe side.

Partially-suggested-by: Jan Beulich 
Signed-off-by: Juergen Gross 
---
  xen/common/grant_table.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 96080b3dec..5ef7ff940d 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3626,12 +3626,12 @@ do_grant_table_op(
  if ( unlikely(!guest_handle_okay(cflush, count)) )
  goto out;
  rc = gnttab_cache_flush(cflush, &opaque_in, count);
-    if ( rc > 0 )
+    if ( rc >= 0 )
  {
  guest_handle_add_offset(cflush, rc);
  uop = guest_handle_cast(cflush, void);
+    opaque_out = opaque_in;
  }
-    opaque_out = opaque_in;
  break;
  }
@@ -3641,7 +3641,7 @@ do_grant_table_op(
  }
    out:
-    if ( rc > 0 || opaque_out != 0 )
+    if ( rc > 0 || (opaque_out != 0 && rc == 0) )


I don't understand this change. If you look at the implementation of 
gnttab_flush() it is not possible to have opaque_out non-zero with rc = 0.


Hmmm... I misread the code and missed the:

if ( hypercall_preempt_check() )
  return i;

Sorry for the noise.

I am also assuming this want a Fixes tag on 
18e8d22fe750c8c7b2830fa37961352693425cf1 "introduce GNTTABOP_cache_flush".


Reviewed-by: Julien Grall 

Cheers,

--
Julien Grall



Re: [PATCH] xen/grants: fix hypercall continuation for GNTTABOP_cache_flush

2020-04-22 Thread Julien Grall

Hi Juergen,

On 22/04/2020 14:07, Juergen Gross wrote:

The GNTTABOP_cache_flush hypercall has a wrong test for hypercall
continuation, the test today is:

 if ( rc > 0 || opaque_out != 0 )

Unfortunately this will be true even in case of an error (rc < 0),
possibly leading to very long lasting hypercalls (times of more
than an hour have been observed in a test case).

Correct the test condition to result in false with rc < 0 and set
opaque_out only if no error occurred, to be on the safe side.

Partially-suggested-by: Jan Beulich 
Signed-off-by: Juergen Gross 
---
  xen/common/grant_table.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 96080b3dec..5ef7ff940d 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3626,12 +3626,12 @@ do_grant_table_op(
  if ( unlikely(!guest_handle_okay(cflush, count)) )
  goto out;
  rc = gnttab_cache_flush(cflush, &opaque_in, count);
-if ( rc > 0 )
+if ( rc >= 0 )
  {
  guest_handle_add_offset(cflush, rc);
  uop = guest_handle_cast(cflush, void);
+opaque_out = opaque_in;
  }
-opaque_out = opaque_in;
  break;
  }
  
@@ -3641,7 +3641,7 @@ do_grant_table_op(

  }
  
out:

-if ( rc > 0 || opaque_out != 0 )
+if ( rc > 0 || (opaque_out != 0 && rc == 0) )


I don't understand this change. If you look at the implementation of 
gnttab_flush() it is not possible to have opaque_out non-zero with rc = 0.


So what's the point of the second condition?


  {
  /* Adjust rc, see gnttab_copy() for why this is needed. */
  if ( cmd == GNTTABOP_copy )



Cheers,

--
Julien Grall