[Xen-devel] [PATCH v2 for-4.6 3/3] tools/libxl: Only continue stream operations if the stream is still in progress

2015-07-27 Thread Andrew Cooper
Part of the callback contract with check_all_finished() is that each
running parallel task shall call it exactly once.

Previously, it was possible for stream_continue() or
write_toolstack_record() to fail and call into check_all_finished().  As
the save helpers callback has fired, it no longer counts as in use,
which causes check_all_finished() to fire the stream callback.  Then,
unwinding the stack back and calling check_all_finished() a second time
results in the same conditions being observed, and the stream callback
being fired a second time.

To avoid this, check_all_finished() is called before any other actions
which continue the stream functionality, and the stream is only
continued if it has not been torn down.  This guarantees not to continue
stream operations if the stream does not owe a callback to
check_all_finished().

Signed-off-by: Andrew Cooper 
---
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_stream_read.c  |   14 --
 tools/libxl/libxl_stream_write.c |5 +++--
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 65100f4..3432933 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -738,14 +738,16 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void 
*dcs_void,
 goto err;
 }
 
-/*
- * Libxc has indicated that it is done with the stream.  Resume reading
- * libxl records from it.
- */
-stream_continue(egc, stream);
-
  err:
 check_all_finished(egc, stream, rc);
+
+if (libxl__stream_read_inuse(stream)) {
+/*
+ * Libxc has indicated that it is done with the stream.  Resume reading
+ * libxl records from it.
+ */
+stream_continue(egc, stream);
+}
 }
 
 static void conversion_done(libxl__egc *egc,
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
index 676ad0a..44f8b2f 100644
--- a/tools/libxl/libxl_stream_write.c
+++ b/tools/libxl/libxl_stream_write.c
@@ -275,10 +275,11 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void 
*dss_void,
 goto err;
 }
 
-write_toolstack_record(egc, stream);
-
  err:
 check_all_finished(egc, stream, rc);
+
+if (libxl__stream_write_inuse(stream))
+write_toolstack_record(egc, stream);
 }
 
 static void write_toolstack_record(libxl__egc *egc,
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for-4.6 3/3] tools/libxl: Only continue stream operations if the stream is still in progress

2015-07-28 Thread Ian Jackson
Andrew Cooper writes ("[PATCH v2 for-4.6 3/3] tools/libxl: Only continue stream 
operations if the stream is still in progress"):
> Part of the callback contract with check_all_finished() is that each
> running parallel task shall call it exactly once.
...
> @@ -738,14 +738,16 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, 
> void *dcs_void,
>  goto err;
>  }
>  
> -/*
> - * Libxc has indicated that it is done with the stream.  Resume reading
> - * libxl records from it.
> - */
> -stream_continue(egc, stream);
> -
>   err:
>  check_all_finished(egc, stream, rc);
> +
> +if (libxl__stream_read_inuse(stream)) {
> +/*
> + * Libxc has indicated that it is done with the stream.  Resume re\
ading
> + * libxl records from it.
> + */
> +stream_continue(egc, stream);
> +}

This doesn't look convincing to me.  (Also, wrap damage in the
comment.)  It may be possible to prove that it's not buggy, but it's
certainly confusing.  In general the the which sets up the next
callback should be at the end of functions, because that way
reentrancy hazards are more easily seen to be avoided.

Why not just do this:

   /*
* Libxc has indicated that it is done with the stream.  Resume reading
* libxl records from it.
*/
   stream_continue(egc, stream);
  +return;

err:
   check_all_finished(egc, stream, rc);

?

And the same in the next one.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for-4.6 3/3] tools/libxl: Only continue stream operations if the stream is still in progress

2015-07-28 Thread Andrew Cooper
On 28/07/15 14:41, Ian Jackson wrote:
> Andrew Cooper writes ("[PATCH v2 for-4.6 3/3] tools/libxl: Only continue 
> stream operations if the stream is still in progress"):
>> Part of the callback contract with check_all_finished() is that each
>> running parallel task shall call it exactly once.
> ...
>> @@ -738,14 +738,16 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, 
>> void *dcs_void,
>>  goto err;
>>  }
>>  
>> -/*
>> - * Libxc has indicated that it is done with the stream.  Resume reading
>> - * libxl records from it.
>> - */
>> -stream_continue(egc, stream);
>> -
>>   err:
>>  check_all_finished(egc, stream, rc);
>> +
>> +if (libxl__stream_read_inuse(stream)) {
>> +/*
>> + * Libxc has indicated that it is done with the stream.  Resume re\
> ading
>> + * libxl records from it.
>> + */
>> +stream_continue(egc, stream);
>> +}
> This doesn't look convincing to me.  (Also, wrap damage in the
> comment.)  It may be possible to prove that it's not buggy, but it's
> certainly confusing.  In general the the which sets up the next
> callback should be at the end of functions, because that way
> reentrancy hazards are more easily seen to be avoided.
>
> Why not just do this:
>
>/*
> * Libxc has indicated that it is done with the stream.  Resume reading
> * libxl records from it.
> */
>stream_continue(egc, stream);
>   +return;
>
> err:
>check_all_finished(egc, stream, rc);
>
> ?
>
> And the same in the next one.

That was my first attempt to fix this issue (as observed in v1), but it
is buggy.

libxl__xc_domain_restore_done() is required to call check_all_finished()
exactly once, as part of its callback contract.

Imagine a scenario whereby some error has occured and
check_all_finished() has _abort()'ed the tasks, but the save helper was
already on the way out, signalling success.

In such a case, we would both omit the check_all_finished() call, thus
omitting the eventual stream callback, and start a further action on a
now-defunct stream.

It is only save to stream_continue() if the stream is currently in use,
which is not a guaranteed situation in this function even if rc is 0.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for-4.6 3/3] tools/libxl: Only continue stream operations if the stream is still in progress

2015-07-28 Thread Ian Jackson
Andrew Cooper writes ("Re: [PATCH v2 for-4.6 3/3] tools/libxl: Only continue 
stream operations if the stream is still in progress"):
> Imagine a scenario whereby some error has occured and
> check_all_finished() has _abort()'ed the tasks, but the save helper was
> already on the way out, signalling success.
...
> It is only save to stream_continue() if the stream is currently in use,
> which is not a guaranteed situation in this function even if rc is 0.

Hrm.  Yes.

What do you think about putting the inuse check in stream_continue ?

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for-4.6 3/3] tools/libxl: Only continue stream operations if the stream is still in progress

2015-07-28 Thread Andrew Cooper
On 28/07/15 16:12, Ian Jackson wrote:
> Andrew Cooper writes ("Re: [PATCH v2 for-4.6 3/3] tools/libxl: Only continue 
> stream operations if the stream is still in progress"):
>> Imagine a scenario whereby some error has occured and
>> check_all_finished() has _abort()'ed the tasks, but the save helper was
>> already on the way out, signalling success.
> ...
>> It is only save to stream_continue() if the stream is currently in use,
>> which is not a guaranteed situation in this function even if rc is 0.

erm s/save/safe/

> Hrm.  Yes.
>
> What do you think about putting the inuse check in stream_continue ?

That would work on the stream_read side but not the stream_write side,
but is not really correct IMO.

The _inuse() check is needed because the save helper callback is not
sure whether the stream is in use or not.  This is a property of the
save helper callback, rather than the stream.

Pushing the _inuse() check into the next layer would function, but it
adds extra _inuse() checks to other codepaths which should be fatal if
they failed in other contexts.

Would resubmitting with extra comments explaining this suffice?

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for-4.6 3/3] tools/libxl: Only continue stream operations if the stream is still in progress

2015-07-28 Thread Ian Jackson
Andrew Cooper writes ("Re: [PATCH v2 for-4.6 3/3] tools/libxl: Only continue 
stream operations if the stream is still in progress"):
> On 28/07/15 16:12, Ian Jackson wrote:
> > What do you think about putting the inuse check in stream_continue ?
> 
> That would work on the stream_read side but not the stream_write side,
> but is not really correct IMO.

I have reread the code on the write side and I think I agree with you
there.  The next function to be called is write_toolstack_record so
things are a lot less abstract.

On the read side I looked at a lot of functions which contained:

  stream_continue(egc, stream);
  return;

   err:
  assert(rc);
  stream_complete(egc, stream, rc);

and ISTM that if the analysis leading to these call sites for
stream_continue is wrong, a similar hazard would exist.


> The _inuse() check is needed because the save helper callback is not
> sure whether the stream is in use or not.  This is a property of the
> save helper callback, rather than the stream.

In this event-driven code, I prefer to have a situation where the next
thing to do is obvious from the recorded state, and is calculated
explicitly (see also the way that check_all_finished is now).

This is much less confusing and error prone than code where the next
thing to do depends on _both_ the recorded state, _and_ the call path
(representing the most recent thing that happened).

> Pushing the _inuse() check into the next layer would function, but it
> adds extra _inuse() checks to other codepaths which should be fatal if
> they failed in other contexts.

I don't understand how an _inuse check in stream_continue could ever
do the wrong thing.  If the code is reorganised later, it seems more
likely to me that there will introduced new situations where we need
to do "see if we are still running, and if so do stream_continue,
otherwise go to check_all_finished".

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for-4.6 3/3] tools/libxl: Only continue stream operations if the stream is still in progress

2015-07-28 Thread Andrew Cooper
On 28/07/15 16:59, Ian Jackson wrote:
> Andrew Cooper writes ("Re: [PATCH v2 for-4.6 3/3] tools/libxl: Only continue 
> stream operations if the stream is still in progress"):
>> On 28/07/15 16:12, Ian Jackson wrote:
>>> What do you think about putting the inuse check in stream_continue ?
>> That would work on the stream_read side but not the stream_write side,
>> but is not really correct IMO.
> I have reread the code on the write side and I think I agree with you
> there.  The next function to be called is write_toolstack_record so
> things are a lot less abstract.
>
> On the read side I looked at a lot of functions which contained:
>
>   stream_continue(egc, stream);
>   return;
>
>err:
>   assert(rc);
>   stream_complete(egc, stream, rc);
>
> and ISTM that if the analysis leading to these call sites for
> stream_continue is wrong, a similar hazard would exist.

The vast majority of these cases are from internal callbacks.  i.e. they
are within the logical context of the current running stream.

The cases which come in from some external context are
libxl__stream_read_start_checkpoint() and
libxl__xc_domain_restore_done().  The latter is the problem case, while
the former asserts that the stream is in use.

>
>> The _inuse() check is needed because the save helper callback is not
>> sure whether the stream is in use or not.  This is a property of the
>> save helper callback, rather than the stream.
> In this event-driven code, I prefer to have a situation where the next
> thing to do is obvious from the recorded state, and is calculated
> explicitly (see also the way that check_all_finished is now).
>
> This is much less confusing and error prone than code where the next
> thing to do depends on _both_ the recorded state, _and_ the call path
> (representing the most recent thing that happened).

This point is specifically a join point of the end of one asynchronous
task, to another which is expected to be active and running.

It is logically different to the other cases where stream_continue() is
used.

>
>> Pushing the _inuse() check into the next layer would function, but it
>> adds extra _inuse() checks to other codepaths which should be fatal if
>> they failed in other contexts.
> I don't understand how an _inuse check in stream_continue could ever
> do the wrong thing.  If the code is reorganised later, it seems more
> likely to me that there will introduced new situations where we need
> to do "see if we are still running, and if so do stream_continue,
> otherwise go to check_all_finished".

check_all_finished() must only ever be called once for each started
task.  Having

if ( inuse )
continue()
else
check_all_finished()

will make it far more likely to accidentally fire the overall stream
callback twice.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for-4.6 3/3] tools/libxl: Only continue stream operations if the stream is still in progress

2015-07-28 Thread Ian Jackson
Andrew Cooper writes ("Re: [PATCH v2 for-4.6 3/3] tools/libxl: Only continue 
stream operations if the stream is still in progress"):
> check_all_finished() must only ever be called once for each started
> task.  Having
> 
> if ( inuse )
> continue()
> else
> check_all_finished()
> 
> will make it far more likely to accidentally fire the overall stream
> callback twice.

I think extra calls to check_all_finished are harmless.

But anyway, I can see that this conversation isn't really converging.
I think there is no functional difference between the two approaches
we are debating.  So this comes down to a matter of taste.

I don't want to put my foot down as maintainer, especially since you
wrote most of this code.  So, I think the bikeshed might as well be
your favoured shade of orange:

Acked-by: Ian Jackson 

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for-4.6 3/3] tools/libxl: Only continue stream operations if the stream is still in progress

2015-07-29 Thread Ian Campbell
On Wed, 2015-07-29 at 15:17 +0100, Ian Campbell wrote:
> On Tue, 2015-07-28 at 18:07 +0100, Ian Jackson wrote:
> > Andrew Cooper writes ("Re: [PATCH v2 for-4.6 3/3] tools/libxl: Only 
> > continue stream operations if the stream is still in progress"):
> > > check_all_finished() must only ever be called once for each started
> > > task.  Having
> > > 
> > > if ( inuse )
> > > continue()
> > > else
> > > check_all_finished()
> > > 
> > > will make it far more likely to accidentally fire the overall stream
> > > callback twice.
> > 
> > I think extra calls to check_all_finished are harmless.
> > 
> > But anyway, I can see that this conversation isn't really converging.
> > I think there is no functional difference between the two approaches
> > we are debating.  So this comes down to a matter of taste.
> > 
> > I don't want to put my foot down as maintainer, especially since you
> > wrote most of this code.  So, I think the bikeshed might as well be
> > your favoured shade of orange:
> > 
> > Acked-by: Ian Jackson 
> 
> This is an ack to the patch in <
> 1438015647-25377-4-git-send-email-andrew.coop...@citrix.com> i.e. the
> original 3/3 of this posting, right?
> 
> (and not, for example, an ack to some other proposal made during this
> thread)

Nevermind, I see now that this came back in "Fix libxl TOOLSTACK records
for migration v2" with the ack in place.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for-4.6 3/3] tools/libxl: Only continue stream operations if the stream is still in progress

2015-07-29 Thread Ian Campbell
On Tue, 2015-07-28 at 18:07 +0100, Ian Jackson wrote:
> Andrew Cooper writes ("Re: [PATCH v2 for-4.6 3/3] tools/libxl: Only 
> continue stream operations if the stream is still in progress"):
> > check_all_finished() must only ever be called once for each started
> > task.  Having
> > 
> > if ( inuse )
> > continue()
> > else
> > check_all_finished()
> > 
> > will make it far more likely to accidentally fire the overall stream
> > callback twice.
> 
> I think extra calls to check_all_finished are harmless.
> 
> But anyway, I can see that this conversation isn't really converging.
> I think there is no functional difference between the two approaches
> we are debating.  So this comes down to a matter of taste.
> 
> I don't want to put my foot down as maintainer, especially since you
> wrote most of this code.  So, I think the bikeshed might as well be
> your favoured shade of orange:
> 
> Acked-by: Ian Jackson 

This is an ack to the patch in <
1438015647-25377-4-git-send-email-andrew.coop...@citrix.com> i.e. the
original 3/3 of this posting, right?

(and not, for example, an ack to some other proposal made during this
thread)

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel