Re: [Qemu-block] [Qemu-devel] [PATCH v2 02/11] blockjob: centralize QMP event emissions

2016-10-11 Thread Kashyap Chamarthy
On Mon, Oct 10, 2016 at 02:28:52PM -0500, Eric Blake wrote:
> On 10/10/2016 01:36 PM, John Snow wrote:

[...]

> > I'll be honest that I don't know; this is related to Replication which I
> > know reasonably little about overall. It got added in the 2.8 timeframe,
> > so the behavior it currently exhibits is not a good or meaningful
> > reference for maintaining compatibility.
> > 
> > We can /change/ the behavior before releases with no love lost.
> 
> And if Replication is the only way to trigger internal use of jobs, then
> we aren't breaking libvirt (which doesn't know how to drive replication
> yet) by changing anything on that front.

Exactly.

> > Or, what if you just didn't get events for internal jobs? Are events for
> > un-managed jobs useful in any sense beyond understanding the stateful
> > availability of the drive to participate in a new job?
> 
> If libvirt isn't driving replication, then it's a moot point. And even
> though replication in libvirt is not supported yet, I suspect that down
> the road when support is added, the easiest thing for libvirt will be to
> state that replication and libvirt-controlled block jobs are mutually
> exclusive; there's probably enough lurking dragons that if your system
> MUST be high-reliance by replication, you probably don't want to be
> confusing things by changing the backing file depth manually with
> streams, pulls, or other manual actions at the same time as replication
> is managing the system, because how can you guarantee that both primary
> and secondary see the same manual actions at all the right times?

Very nice argument for making them mutually exclusive, from a libvirt
POV.

> At any rate, not seeing internal-only jobs is probably the most
> reasonable, even if it means an internal-only job can block the attempt
> to do a manual job.

FWIW, I agree, if only as a user / observer of these events during
debugging.

[...]

-- 
/kashyap



Re: [Qemu-block] [Qemu-devel] [PATCH v2 02/11] blockjob: centralize QMP event emissions

2016-10-11 Thread Markus Armbruster
John Snow writes:

> On 10/05/2016 09:43 AM, Kevin Wolf wrote:
[...]
>> Here we have an additional caller in block/replication.c and qemu-img,
>> so the parameters must stay. For qemu-img, nothing changes. For
>> replication, the block job events are added as a side effect.
>>
>> Not sure if we want to emit such events for an internal block job, but
>> if we do want the change, it should be explicit.
>>
>
> Hmm, do we want to make it so some jobs are invisible and others are
> not? Because as it stands right now, neither case is strictly true. We
> only emit cancelled/completed events if it was started via QMP,
> however we do emit events for error and ready regardless of who
> started the job.
>
> That didn't seem particularly consistent to me; either all events
> should be controlled by the job layer itself or none of them should
> be.
>
> I opted for "all."
>
> For "internal" jobs that did not previously emit any events, is it not
> true that these jobs still appear in the block job list and are
> effectively public regardless? I'd argue that these messages may be of
> value for management utilities that are still blocked by these jobs,
> whether or not they are 'internal'.
>
> I'll push for keeping it mandatory and explicit. If it becomes a
> problem, we can always add a 'silent' job property that silences ALL
> qmp events, including all completion, error, and ready notices.
>
> I've CC'd Wen Congyang and Eric Blake to talk me down if they wish.

Having read the thread so far, I have two high-level remarks:

1. We should expose a job externally either completely (all queries show
it, all events get sent, any non-query command works normally as far as
it makes sense) or not at all.

2. Exposing internal jobs risks making them ABI.  Implementation details
need to be kept out of ABI.  So the question is whether the job is truly
internal detail, or a bona fide part of the external interface.



Re: [Qemu-block] [Qemu-devel] [PATCH v2 02/11] blockjob: centralize QMP event emissions

2016-10-10 Thread John Snow



On 10/10/2016 12:45 PM, Kashyap Chamarthy wrote:
> On Wed, Oct 05, 2016 at 05:00:29PM -0400, John Snow wrote:
>
> [Arbitrarily chiming in here, and still catching up with the details of the
> thread.]
>
> > On 10/05/2016 03:24 PM, Eric Blake wrote:
> > > On 10/05/2016 01:49 PM, John Snow wrote:
>
> [...]
>
> > > > Hmm, do we want to make it so some jobs are invisible and others are
> > > > not? Because as it stands right now, neither case is strictly true. We
> > > > only emit cancelled/completed events if it was started via QMP, however
> > > > we do emit events for error and ready regardless of who started the job.
> > >
> > > Libvirt tries to mirror any block job event it receives to upper layers.
> > > But if it cannot figure out which upper-layer disk the event is
> > > associated with, it just drops the event.  So I think that from the
> > > libvirt perspective, you are okay if you always report job events,
> > > even for internal jobs.  (Do we have a quick and easy way to set up an
> > > internal job event, to double-check if I can produce some sort of
> > > libvirt scenario to trigger the event and see that it gets safely ignored?)
> >
> > Not in a QEMU release yet, I think.
>
> If not from an official QEMU release, it'd still be useful to work out a
> way to reproduce what Eric asked even from upstream Git master.

I'll be honest that I don't know; this is related to Replication which I
know reasonably little about overall. It got added in the 2.8 timeframe,
so the behavior it currently exhibits is not a good or meaningful
reference for maintaining compatibility.

We can /change/ the behavior before releases with no love lost.

> > > > That didn't seem particularly consistent to me; either all events should
> > > > be controlled by the job layer itself or none of them should be.
> > > >
> > > > I opted for "all."
> > > >
> > > > For "internal" jobs that did not previously emit any events, is it not
> > > > true that these jobs still appear in the block job list and are
> > > > effectively public regardless? I'd argue that these messages may be of
> > > > value for management utilities that are still blocked by these jobs,
> > > > whether or not they are 'internal'.
>
> It'd certainly be useful during debugging (for the said management
> utilities), if it's possible to distinguish an event that was triggered
> by an internal block job vs. an event emitted by a job explicitly
> triggered by a user action.

Or, what if you just didn't get events for internal jobs? Are events for
un-managed jobs useful in any sense beyond understanding the stateful
availability of the drive to participate in a new job?

> For example, OpenStack Nova calls libvirt API virDomainBlockRebase(),
> which internally calls QMP `drive-mirror` that emits events.  An "event
> origin classification" system, if one were to exist, would allow one to pay
> attention to only those events that are emitted due to an explicit
> action and ignore all the rest ('internal').
>
> But I'm not quite sure if it's desirable to have this event
> classification for cleanliness reasons as Eric points out below.
>
> > > > I'll push for keeping it mandatory and explicit. If it becomes a
> > > > problem, we can always add a 'silent' job property that silences ALL qmp
> > > > events, including all completion, error, and ready notices.
> > >
> > > Completely silencing an internal job seems a little cleaner than having
> > > events for the job but not being able to query it.  But if nothing
> > > breaks by exposing the internal jobs, that seems even easier than trying
> > > to decide which jobs are internal and hidden vs. user-requested and public.
> >
> > Well, at the moment anything requested directly via blockdev.c is "public."
> > Before 2.8, all jobs were public ones, with the exception of those in
> > qemu-img which is a bit of a different/special case.
> >
> > We have this block/replication.c beast now, though, and it uses backup_start
> > and commit_active_start as it sees fit without direct user intervention.
> >
> > As it stands, I believe the jobs that replication creates are user-visible
> > via query, will not issue cancellation or completion events, but WILL emit
> > error events. It may emit ready events for the mirror job it uses, but I
> > haven't traced that. (It could, at least.)
>
> Thanks, the above is useful to know.






Re: [Qemu-block] [Qemu-devel] [PATCH v2 02/11] blockjob: centralize QMP event emissions

2016-10-10 Thread Kashyap Chamarthy
On Wed, Oct 05, 2016 at 05:00:29PM -0400, John Snow wrote:

[Arbitrarily chiming in here, and still catching up with the details of the
thread.]
 
> On 10/05/2016 03:24 PM, Eric Blake wrote:
> > On 10/05/2016 01:49 PM, John Snow wrote:

[...]

> > > Hmm, do we want to make it so some jobs are invisible and others are
> > > not? Because as it stands right now, neither case is strictly true. We
> > > only emit cancelled/completed events if it was started via QMP, however
> > > we do emit events for error and ready regardless of who started the job.
> > 
> > Libvirt tries to mirror any block job event it receives to upper layers.
> >  But if it cannot figure out which upper-layer disk the event is
> > associated with, it just drops the event.  So I think that from the
> > libvirt perspective, you are okay if you always report job events,
> > even for internal jobs.  (Do we have a quick and easy way to set up an
> > internal job event, to double-check if I can produce some sort of
> > libvirt scenario to trigger the event and see that it gets safely ignored?)
> > 
> 
> Not in a QEMU release yet, I think.

If not from an official QEMU release, it'd still be useful to work out a
way to reproduce what Eric asked even from upstream Git master.

> > > That didn't seem particularly consistent to me; either all events should
> > > be controlled by the job layer itself or none of them should be.
> > > 
> > > I opted for "all."
> > > 
> > > For "internal" jobs that did not previously emit any events, is it not
> > > true that these jobs still appear in the block job list and are
> > > effectively public regardless? I'd argue that these messages may be of
> > > > value for management utilities that are still blocked by these jobs,
> > > > whether or not they are 'internal'.

It'd certainly be useful during debugging (for the said management
utilities), if it's possible to distinguish an event that was triggered
by an internal block job vs. an event emitted by a job explicitly
triggered by a user action.

For example, OpenStack Nova calls libvirt API virDomainBlockRebase(),
which internally calls QMP `drive-mirror` that emits events.  An "event
origin classification" system, if one were to exist, would allow one to pay
attention to only those events that are emitted due to an explicit
action and ignore all the rest ('internal').

But I'm not quite sure if it's desirable to have this event
classification for cleanliness reasons as Eric points out below.

> > > I'll push for keeping it mandatory and explicit. If it becomes a
> > > problem, we can always add a 'silent' job property that silences ALL qmp
> > > events, including all completion, error, and ready notices.
> > 
> > Completely silencing an internal job seems a little cleaner than having
> > events for the job but not being able to query it.  But if nothing
> > breaks by exposing the internal jobs, that seems even easier than trying
> > to decide which jobs are internal and hidden vs. user-requested and public.
> >
> 
> Well, at the moment anything requested directly via blockdev.c is "public."
> Before 2.8, all jobs were public ones, with the exception of those in
> qemu-img which is a bit of a different/special case.
> 
> We have this block/replication.c beast now, though, and it uses backup_start
> and commit_active_start as it sees fit without direct user intervention.
> 
> As it stands, I believe the jobs that replication creates are user-visible
> via query, will not issue cancellation or completion events, but WILL emit
> error events. It may emit ready events for the mirror job it uses, but I
> haven't traced that. (It could, at least.)

Thanks, the above is useful to know. 

-- 
/kashyap



Re: [Qemu-block] [Qemu-devel] [PATCH v2 02/11] blockjob: centralize QMP event emissions

2016-10-06 Thread John Snow



On 10/06/2016 03:44 AM, Kevin Wolf wrote:
> On 05.10.2016 at 20:49, John Snow wrote:
> > On 10/05/2016 09:43 AM, Kevin Wolf wrote:
> > > On 01.10.2016 at 00:00, John Snow wrote:
> > > > @@ -3136,10 +3111,10 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
> > > >  goto out;
> > > >  }
> > > >  commit_active_start(has_job_id ? job_id : NULL, bs, base_bs, speed,
> > > > -on_error, block_job_cb, bs, &local_err, false);
> > > > +on_error, NULL, bs, &local_err, false);
> > >
> > > Here we have an additional caller in block/replication.c and qemu-img,
> > > so the parameters must stay. For qemu-img, nothing changes. For
> > > replication, the block job events are added as a side effect.
> > >
> > > Not sure if we want to emit such events for an internal block job, but
> > > if we do want the change, it should be explicit.
> >
> > Hmm, do we want to make it so some jobs are invisible and others are
> > not? Because as it stands right now, neither case is strictly true.
> > We only emit cancelled/completed events if it was started via QMP,
> > however we do emit events for error and ready regardless of who
> > started the job.
> >
> > That didn't seem particularly consistent to me; either all events
> > should be controlled by the job layer itself or none of them should
> > be.
>
> Yes, I agree. The use of block jobs in replication is rather broken and
> we should change it one way or another. But I'd prefer to do so
> explicitly instead of doing it as a side-effect of a patch like this
> one.

I can always split this patch out and CC Wen, Eric, Markus et al and
adjust the commit message to be explicit.

> > I opted for "all."
> >
> > For "internal" jobs that did not previously emit any events, is it
> > not true that these jobs still appear in the block job list and are
> > effectively public regardless? I'd argue that these messages may be
> > of value for management utilities that are still blocked by these
> > jobs, whether or not they are 'internal'.
> >
> > I'll push for keeping it mandatory and explicit. If it becomes a
> > problem, we can always add a 'silent' job property that silences ALL
> > qmp events, including all completion, error, and ready notices.
>
> Actually, there is at least one other reason why the block jobs in
> replication are a bad idea as they are today: Job naming. Currently
> they use a fixed string, conflicting with the user-controlled job
> namespace and with itself (i.e. restricting replication to a single
> disk).
>
> And are we really prepared to handle cases where the user decides to
> pause, complete or cancel an internal job?
>
> I think we should really hide them from the user. And maybe the way to
> do so isn't a bool job->user flag, but actually job->id = NULL. Then it
> would work the same way as named/internal BlockBackends do and we would
> get rid of the naming problem, too.
>
> Kevin

Mirrors "internal bitmaps," too.

I can rig it such that if a job has no ID, it will cease to show up via 
query and no longer emit events.


Downside: Whether or not a device is busy or can accept another job 
becomes opaque to the management layer.
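
To make that downside concrete, here is a minimal C sketch. It assumes a simplified, hypothetical model in which each device holds at most one job pointer (loosely mirroring the bs->job field seen in the patch); the names Device, device_start_job and device_query_job are invented for illustration and are not QEMU functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not QEMU's real types. */
typedef struct Job {
    const char *id;   /* NULL marks an internal (hidden) job */
} Job;

typedef struct Device {
    Job *job;         /* at most one job per device */
} Device;

/* Attach a job to a device; fails with -EBUSY if one is already running. */
int device_start_job(Device *dev, Job *job)
{
    assert(dev != NULL && job != NULL);
    if (dev->job != NULL) {
        return -EBUSY;
    }
    dev->job = job;
    return 0;
}

/* What a query would report: the job ID, or NULL if nothing is visible. */
const char *device_query_job(const Device *dev)
{
    assert(dev != NULL);
    if (dev->job != NULL && dev->job->id != NULL) {
        return dev->job->id;
    }
    return NULL;
}
```

In this model, once an ID-less job is attached, device_query_job() reports nothing, yet a later device_start_job() for a user job still fails with -EBUSY. The management layer has no way to learn why from the query alone.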


--js



Re: [Qemu-block] [Qemu-devel] [PATCH v2 02/11] blockjob: centralize QMP event emissions

2016-10-06 Thread Kevin Wolf
On 05.10.2016 at 20:49, John Snow wrote:
> On 10/05/2016 09:43 AM, Kevin Wolf wrote:
> >On 01.10.2016 at 00:00, John Snow wrote:
> >>@@ -3136,10 +3111,10 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
> >> goto out;
> >> }
> >> commit_active_start(has_job_id ? job_id : NULL, bs, base_bs, speed,
> >>-on_error, block_job_cb, bs, &local_err, false);
> >>+on_error, NULL, bs, &local_err, false);
> >
> >Here we have an additional caller in block/replication.c and qemu-img,
> >so the parameters must stay. For qemu-img, nothing changes. For
> >replication, the block job events are added as a side effect.
> >
> >Not sure if we want to emit such events for an internal block job, but
> >if we do want the change, it should be explicit.
> >
> 
> Hmm, do we want to make it so some jobs are invisible and others are
> not? Because as it stands right now, neither case is strictly true.
> We only emit cancelled/completed events if it was started via QMP,
> however we do emit events for error and ready regardless of who
> started the job.
> 
> That didn't seem particularly consistent to me; either all events
> should be controlled by the job layer itself or none of them should
> be.

Yes, I agree. The use of block jobs in replication is rather broken and
we should change it one way or another. But I'd prefer to do so
explicitly instead of doing it as a side-effect of a patch like this
one.

> I opted for "all."
> 
> For "internal" jobs that did not previously emit any events, is it
> not true that these jobs still appear in the block job list and are
> effectively public regardless? I'd argue that these messages may be
> of value for management utilities that are still blocked by these
> jobs, whether or not they are 'internal'.
> 
> I'll push for keeping it mandatory and explicit. If it becomes a
> problem, we can always add a 'silent' job property that silences ALL
> qmp events, including all completion, error, and ready notices.

Actually, there is at least one other reason why the block jobs in
replication are a bad idea as they are today: Job naming. Currently
they use a fixed string, conflicting with the user-controlled job
namespace and with itself (i.e. restricting replication to a single
disk).

And are we really prepared to handle cases where the user decides to
pause, complete or cancel an internal job?

I think we should really hide them from the user. And maybe the way to
do so isn't a bool job->user flag, but actually job->id = NULL. Then it
would work the same way as named/internal BlockBackends do and we would
get rid of the naming problem, too.
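
A rough C sketch of that idea follows. It is only an illustration under stated assumptions: the Job struct, the fixed-size table and the functions job_create, job_is_internal, query_visible_jobs and job_should_emit_event are all hypothetical and do not correspond to actual QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_JOBS 8

/* Hypothetical, simplified stand-in for a block job. */
typedef struct Job {
    const char *id;   /* NULL marks an internal job */
    bool in_use;
} Job;

static Job jobs[MAX_JOBS];

/* Create a job. Named jobs must be unique; ID-less jobs never conflict.
 * Returns NULL if the name is taken or the table is full. */
Job *job_create(const char *id)
{
    Job *slot = NULL;
    for (size_t i = 0; i < MAX_JOBS; i++) {
        if (jobs[i].in_use) {
            if (id && jobs[i].id && strcmp(jobs[i].id, id) == 0) {
                return NULL;   /* name clash in the user-visible namespace */
            }
        } else if (!slot) {
            slot = &jobs[i];
        }
    }
    if (slot) {
        slot->id = id;
        slot->in_use = true;
    }
    return slot;
}

bool job_is_internal(const Job *job)
{
    assert(job != NULL);
    return job->id == NULL;
}

/* Count the jobs a query command would list: internal jobs are skipped. */
size_t query_visible_jobs(void)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_JOBS; i++) {
        if (jobs[i].in_use && !job_is_internal(&jobs[i])) {
            n++;
        }
    }
    return n;
}

/* Events are suppressed for internal jobs, matching their query visibility. */
bool job_should_emit_event(const Job *job)
{
    return !job_is_internal(job);
}
```

With a fixed string, a second job_create("replication") fails, which is the single-disk restriction described above. With a NULL ID, any number of internal jobs can coexist, and none of them show up in the query or emit events.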

Kevin



Re: [Qemu-block] [Qemu-devel] [PATCH v2 02/11] blockjob: centralize QMP event emissions

2016-10-05 Thread John Snow



On 10/05/2016 03:24 PM, Eric Blake wrote:
> On 10/05/2016 01:49 PM, John Snow wrote:
>
> > > Here we have an additional caller in block/replication.c and qemu-img,
> > > so the parameters must stay. For qemu-img, nothing changes. For
> > > replication, the block job events are added as a side effect.
> > >
> > > Not sure if we want to emit such events for an internal block job, but
> > > if we do want the change, it should be explicit.
> >
> > Hmm, do we want to make it so some jobs are invisible and others are
> > not? Because as it stands right now, neither case is strictly true. We
> > only emit cancelled/completed events if it was started via QMP, however
> > we do emit events for error and ready regardless of who started the job.
>
> Libvirt tries to mirror any block job event it receives to upper layers.
> But if it cannot figure out which upper-layer disk the event is
> associated with, it just drops the event.  So I think that from the
> libvirt perspective, you are okay if you always report job events,
> even for internal jobs.  (Do we have a quick and easy way to set up an
> internal job event, to double-check if I can produce some sort of
> libvirt scenario to trigger the event and see that it gets safely ignored?)

Not in a QEMU release yet, I think.

> > That didn't seem particularly consistent to me; either all events should
> > be controlled by the job layer itself or none of them should be.
> >
> > I opted for "all."
> >
> > For "internal" jobs that did not previously emit any events, is it not
> > true that these jobs still appear in the block job list and are
> > effectively public regardless? I'd argue that these messages may be of
> > value for management utilities that are still blocked by these jobs,
> > whether or not they are 'internal'.
> >
> > I'll push for keeping it mandatory and explicit. If it becomes a
> > problem, we can always add a 'silent' job property that silences ALL qmp
> > events, including all completion, error, and ready notices.
>
> Completely silencing an internal job seems a little cleaner than having
> events for the job but not being able to query it.  But if nothing
> breaks by exposing the internal jobs, that seems even easier than trying
> to decide which jobs are internal and hidden vs. user-requested and public.

Well, at the moment anything requested directly via blockdev.c is 
"public." Before 2.8, all jobs were public ones, with the exception of 
those in qemu-img which is a bit of a different/special case.


We have this block/replication.c beast now, though, and it uses 
backup_start and commit_active_start as it sees fit without direct user 
intervention.


As it stands, I believe the jobs that replication creates are 
user-visible via query, will not issue cancellation or completion 
events, but WILL emit error events. It may emit ready events for the 
mirror job it uses, but I haven't traced that. (It could, at least.)









Re: [Qemu-block] [Qemu-devel] [PATCH v2 02/11] blockjob: centralize QMP event emissions

2016-10-05 Thread Eric Blake
On 10/05/2016 01:49 PM, John Snow wrote:

>> Here we have an additional caller in block/replication.c and qemu-img,
>> so the parameters must stay. For qemu-img, nothing changes. For
>> replication, the block job events are added as a side effect.
>>
>> Not sure if we want to emit such events for an internal block job, but
>> if we do want the change, it should be explicit.
>>
> 
> Hmm, do we want to make it so some jobs are invisible and others are
> not? Because as it stands right now, neither case is strictly true. We
> only emit cancelled/completed events if it was started via QMP, however
> we do emit events for error and ready regardless of who started the job.

Libvirt tries to mirror any block job event it receives to upper layers.
 But if it cannot figure out which upper-layer disk the event is
associated with, it just drops the event.  So I think that from the
libvirt perspective, you are okay if you always report job events,
even for internal jobs.  (Do we have a quick and easy way to set up an
internal job event, to double-check if I can produce some sort of
libvirt scenario to trigger the event and see that it gets safely ignored?)
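
As a rough illustration of the drop-if-unknown behavior described above, here is a minimal C sketch. It is not libvirt's actual code: the Disk struct, find_disk and handle_block_job_event are hypothetical names, and the only assumption is that an event carries a device name that is looked up in a table of known disks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of a disk known to the management layer. */
typedef struct Disk {
    const char *device;   /* device name as it appears in job events */
    int events_seen;      /* number of events forwarded for this disk */
} Disk;

/* Look up a disk by device name; returns NULL if it is not known. */
Disk *find_disk(Disk *disks, size_t n, const char *device)
{
    assert(device != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(disks[i].device, device) == 0) {
            return &disks[i];
        }
    }
    return NULL;
}

/* Returns true if the event was forwarded, false if it was dropped
 * because no upper-layer disk matches the device name. */
bool handle_block_job_event(Disk *disks, size_t n, const char *device)
{
    Disk *disk = find_disk(disks, n, device);
    if (disk == NULL) {
        return false;   /* unknown origin, e.g. an internal job: ignore */
    }
    disk->events_seen++;
    return true;
}
```

Under this model, an event from a job the management layer did not start is harmless as long as its device name does not map to a managed disk.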

> 
> That didn't seem particularly consistent to me; either all events should
> be controlled by the job layer itself or none of them should be.
> 
> I opted for "all."
> 
> For "internal" jobs that did not previously emit any events, is it not
> true that these jobs still appear in the block job list and are
> effectively public regardless? I'd argue that these messages may be of
> value for management utilities that are still blocked by these jobs,
> whether or not they are 'internal'.
> 
> I'll push for keeping it mandatory and explicit. If it becomes a
> problem, we can always add a 'silent' job property that silences ALL qmp
> events, including all completion, error, and ready notices.

Completely silencing an internal job seems a little cleaner than having
events for the job but not being able to query it.  But if nothing
breaks by exposing the internal jobs, that seems even easier than trying
to decide which jobs are internal and hidden vs. user-requested and public.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-block] [Qemu-devel] [PATCH v2 02/11] blockjob: centralize QMP event emissions

2016-10-05 Thread John Snow



On 10/05/2016 09:43 AM, Kevin Wolf wrote:
> On 01.10.2016 at 00:00, John Snow wrote:
> > There's no reason to leave this to blockdev; we can do it in blockjobs
> > directly and get rid of an extra callback for most users.
> >
> > Signed-off-by: John Snow
> > ---
> >  blockdev.c | 37 ++---
> >  blockjob.c | 16 ++--
> >  2 files changed, 20 insertions(+), 33 deletions(-)
> >
> > diff --git a/blockdev.c b/blockdev.c
> > index 29c6561..03200e7 100644
> > --- a/blockdev.c
> > +++ b/blockdev.c
> > @@ -2957,31 +2957,6 @@ out:
> >  aio_context_release(aio_context);
> >  }
> >
> > -static void block_job_cb(void *opaque, int ret)
> > -{
> > -/* Note that this function may be executed from another AioContext besides
> > - * the QEMU main loop.  If you need to access anything that assumes the
> > - * QEMU global mutex, use a BH or introduce a mutex.
> > - */
> > -
> > -BlockDriverState *bs = opaque;
> > -const char *msg = NULL;
> > -
> > -trace_block_job_cb(bs, bs->job, ret);
>
> This trace event is removed from the code, but not from trace-events.
>
> > -
> > -assert(bs->job);
> > -
> > -if (ret < 0) {
> > -msg = strerror(-ret);
> > -}
> > -
> > -if (block_job_is_cancelled(bs->job)) {
> > -block_job_event_cancelled(bs->job);
> > -} else {
> > -block_job_event_completed(bs->job, msg);
>
> block_job_event_cancelled/completed can become static now.
>
> > -}
> > -}
> > -
> >  void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
> >    bool has_base, const char *base,
> >    bool has_backing_file, const char *backing_file,
> > @@ -3033,7 +3008,7 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
> >  base_name = has_backing_file ? backing_file : base_name;
> >
> >  stream_start(has_job_id ? job_id : NULL, bs, base_bs, base_name,
> > - has_speed ? speed : 0, on_error, block_job_cb, bs, &local_err);
> > + has_speed ? speed : 0, on_error, NULL, bs, &local_err);
>
> Passing cb == NULL, but opaque != NULL is harmless, but feels odd.

Yes.

> And actually this is the only caller of stream_start, so the parameters
> could just be dropped.

OK. I left the parameters in on purpose, but they can be re-added in the
future if desired.

(Hm, maybe as part of a CommonJobOpts parameter someday?)

> >  if (local_err) {
> >  error_propagate(errp, local_err);
> >  goto out;
> > @@ -3136,10 +3111,10 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
> >  goto out;
> >  }
> >  commit_active_start(has_job_id ? job_id : NULL, bs, base_bs, speed,
> > -on_error, block_job_cb, bs, &local_err, false);
> > +on_error, NULL, bs, &local_err, false);
>
> Here we have an additional caller in block/replication.c and qemu-img,
> so the parameters must stay. For qemu-img, nothing changes. For
> replication, the block job events are added as a side effect.
>
> Not sure if we want to emit such events for an internal block job, but
> if we do want the change, it should be explicit.



Hmm, do we want to make it so some jobs are invisible and others are 
not? Because as it stands right now, neither case is strictly true. We 
only emit cancelled/completed events if it was started via QMP, however 
we do emit events for error and ready regardless of who started the job.


That didn't seem particularly consistent to me; either all events should 
be controlled by the job layer itself or none of them should be.


I opted for "all."
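
To spell out the difference, the following C sketch encodes the two emission policies as described in this message. It is only a model: the JobEvent enum and the functions emits_before_patch and emits_after_patch are hypothetical names, not QEMU functions, and "after" simply reflects the "all" choice stated above.

```c
#include <stdbool.h>

/* Hypothetical labels for the four block job events discussed here. */
typedef enum {
    JOB_EVENT_CANCELLED,
    JOB_EVENT_COMPLETED,
    JOB_EVENT_ERROR,
    JOB_EVENT_READY
} JobEvent;

/* Before the patch: cancelled/completed come from the blockdev.c callback,
 * so only QMP-started jobs emit them; error/ready are always emitted. */
bool emits_before_patch(bool started_via_qmp, JobEvent ev)
{
    switch (ev) {
    case JOB_EVENT_CANCELLED:
    case JOB_EVENT_COMPLETED:
        return started_via_qmp;
    case JOB_EVENT_ERROR:
    case JOB_EVENT_READY:
        return true;
    }
    return false;
}

/* After the patch: the job layer emits every event for every job. */
bool emits_after_patch(bool started_via_qmp, JobEvent ev)
{
    (void)started_via_qmp;
    (void)ev;
    return true;
}
```

The only inputs where the two functions differ are cancelled/completed for jobs not started via QMP, which is exactly the replication side effect Kevin pointed out.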

For "internal" jobs that did not previously emit any events, is it not 
true that these jobs still appear in the block job list and are 
effectively public regardless? I'd argue that these messages may be of 
value for management utilities that are still blocked by these jobs,
whether or not they are 'internal'.


I'll push for keeping it mandatory and explicit. If it becomes a 
problem, we can always add a 'silent' job property that silences ALL qmp 
events, including all completion, error, and ready notices.


I've CC'd Wen Congyang and Eric Blake to talk me down if they wish.


> >  } else {
> >  commit_start(has_job_id ? job_id : NULL, bs, base_bs, top_bs, speed,
> > - on_error, block_job_cb, bs,
> > + on_error, NULL, bs,
> >   has_backing_file ? backing_file : NULL, &local_err);
>
> Like stream_start, drop the parameters.
>
> >  }
> >  if (local_err != NULL) {
> > @@ -3260,7 +3235,7 @@ static void do_drive_backup(DriveBackup *backup, BlockJobTxn *txn, Error **errp)
> >
> >  backup_start(backup->job_id, bs, target_bs, backup->speed, backup->sync,
> >   bmap, backup->compress, backup->on_source_error,
> > - backup->on_target_error, block_job_cb, bs, txn, &local_err);
> > + backup->on_target_error, NULL, bs, txn, &local_err);
> >  bdrv_unref(target_bs);
> >  if (local_err != NULL) {
> >  error_propagate(errp, local_err);
> > @@ -3330,7 +3305,7 @@ void do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn,