Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Luiz Capitulino
On Mon, 27 Feb 2012 10:42:31 -0600
Anthony Liguori  wrote:

> On 02/27/2012 10:33 AM, Federico Simoncelli wrote:
> > I'm all for the modularity of the commands (I suggested it since the 
> > beginning),
> > but all this infrastructure goes way beyond what I'd need for oVirt now.
> >
> > When I submitted my patches we knew that my work wasn't the definitive 
> > solution
> > (eg: the future implementation of -blockdev). So I'd suggest to try and 
> > settle
> > with something that is good enough and that is not preventing a future 
> > improvement.
> >
> > What about having a (temporary) flag in drive-mirror to accept also a 
> > new-image-file
> > until we will have the optimal solution?
> 
> Unless there are extenuating circumstances (like the absence of core 
> infrastructure in QEMU), then we should not add commands that we know are not 
> the right command.
> 
> We have to support our commands forever.

Agreed. Worst case we have vendor extension support that downstream can use to
add its own commands.



Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Paolo Bonzini
On 02/27/2012 05:58 PM, Anthony Liguori wrote:
>>
>> Jeff could rework his patches to work with transaction begin/commit
>> commands, and Federico can then add drive-reopen and drive-migrate on
>> top.
> 
> Yes, maybe I lack imagination but I fail to see how it generalizes
> easily/nicely.

> From what I can tell, all of the rollback logic is very specific to the
> commands being used, right?

The rollback logic is just "close the new devices".

The commit logic is specific to the commands being used, but reopen
should be easier than snapshot (and basically the same except for
backing_hd handling).

Migrate is really syntactic sugar around reopen, so no surprises there.

Paolo



Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Anthony Liguori

On 02/27/2012 10:54 AM, Paolo Bonzini wrote:

On 02/27/2012 05:53 PM, Anthony Liguori wrote:

It looks like this is exactly
the case where the core infrastructure (transactions) is missing.


Batch requests are incredibly easy to add.  I'm stuck in meetings for
the next couple days but I'm sure Luiz throw it together in no time at all.


Batch requests are unnecessary.  They should be a convenience, not a
correctness feature.


I think this discussion has gotten a bit unwieldy as I'm having trouble keeping 
up with the various proposals.  Can we move to something more formal and written 
on the wiki?


Can we start with a clear description of what the various requirements are?

Regards,

Anthony Liguroi



Paolo






Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Anthony Liguori

On 02/27/2012 10:51 AM, Paolo Bonzini wrote:

On 02/27/2012 04:24 PM, Anthony Liguori wrote:


Then you get an error with the block devices still frozen.  You can
execute another command to reopen back to the old image to roll back the
transaction.

Pushing the rollback logic to the client does make the client interface
a bit more complicated and adds latency to the error path but it's much
easier than building a complex transaction infrastructure.

And there are examples of this in the wild too.  LDAP uses a similar
mechanism.


Actually, have you seen Jeff's atomic snapshot patch?  It really
implements all that is needed to do transactions, and gets 100% ok
error-recovery unlike the existing blockdev_snapshot_sync.  It really
looks like we can do better than client-side error management, and there
is not that much complexity at all.

Jeff could rework his patches to work with transaction begin/commit
commands, and Federico can then add drive-reopen and drive-migrate on top.


Yes, maybe I lack imagination but I fail to see how it generalizes 
easily/nicely.

From what I can tell, all of the rollback logic is very specific to the 
commands being used, right?


Regards,

Anthony Liguori



Paolo





Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Paolo Bonzini
On 02/27/2012 05:53 PM, Anthony Liguori wrote:
>> It looks like this is exactly
>> the case where the core infrastructure (transactions) is missing.
> 
> Batch requests are incredibly easy to add.  I'm stuck in meetings for
> the next couple days but I'm sure Luiz throw it together in no time at all.

Batch requests are unnecessary.  They should be a convenience, not a
correctness feature.

Paolo



Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Anthony Liguori

On 02/27/2012 10:33 AM, Federico Simoncelli wrote:

I'm all for the modularity of the commands (I suggested it since
the beginning),
but all this infrastructure goes way beyond what I'd need for oVirt
now.

When I submitted my patches we knew that my work wasn't the
definitive solution
(eg: the future implementation of -blockdev). So I'd suggest to try
and settle
with something that is good enough and that is not preventing a
future improvement.

What about having a (temporary) flag in drive-mirror to accept also
a new-image-file
until we will have the optimal solution?


Unless there are extenuating circumstances (like the absence of core
infrastructure in QEMU), then we should not add commands that we know
are not
the right command.


So are you in favor or against my suggestion?


Against.  I think we have everything we need.


It looks like this is exactly
the case where the core infrastructure (transactions) is missing.


Batch requests are incredibly easy to add.  I'm stuck in meetings for the next 
couple days but I'm sure Luiz throw it together in no time at all.


Regards,

Anthony Liguori








Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Paolo Bonzini
On 02/27/2012 04:24 PM, Anthony Liguori wrote:
> 
> Then you get an error with the block devices still frozen.  You can
> execute another command to reopen back to the old image to roll back the
> transaction.
> 
> Pushing the rollback logic to the client does make the client interface
> a bit more complicated and adds latency to the error path but it's much
> easier than building a complex transaction infrastructure.
> 
> And there are examples of this in the wild too.  LDAP uses a similar
> mechanism.

Actually, have you seen Jeff's atomic snapshot patch?  It really
implements all that is needed to do transactions, and gets 100% ok
error-recovery unlike the existing blockdev_snapshot_sync.  It really
looks like we can do better than client-side error management, and there
is not that much complexity at all.

Jeff could rework his patches to work with transaction begin/commit
commands, and Federico can then add drive-reopen and drive-migrate on top.

Paolo



Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Federico Simoncelli
- Original Message -
> From: "Anthony Liguori" 
> To: "Federico Simoncelli" 
> Cc: "Paolo Bonzini" , kw...@redhat.com, 
> arm...@redhat.com, "Jeff Cody" ,
> mtosa...@redhat.com, qemu-devel@nongnu.org, "Luiz Capitulino" 
> 
> Sent: Monday, February 27, 2012 5:42:31 PM
> Subject: Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the 
> blockdev-reopen and blockdev-migrate
> commands)
> 
> On 02/27/2012 10:33 AM, Federico Simoncelli wrote:
> > I'm all for the modularity of the commands (I suggested it since
> > the beginning),
> > but all this infrastructure goes way beyond what I'd need for oVirt
> > now.
> >
> > When I submitted my patches we knew that my work wasn't the
> > definitive solution
> > (eg: the future implementation of -blockdev). So I'd suggest to try
> > and settle
> > with something that is good enough and that is not preventing a
> > future improvement.
> >
> > What about having a (temporary) flag in drive-mirror to accept also
> > a new-image-file
> > until we will have the optimal solution?
> 
> Unless there are extenuating circumstances (like the absence of core
> infrastructure in QEMU), then we should not add commands that we know
> are not
> the right command.

So are you in favor or against my suggestion? It looks like this is exactly
the case where the core infrastructure (transactions) is missing.

-- 
Federico



Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Anthony Liguori

On 02/27/2012 10:33 AM, Federico Simoncelli wrote:

I'm all for the modularity of the commands (I suggested it since the beginning),
but all this infrastructure goes way beyond what I'd need for oVirt now.

When I submitted my patches we knew that my work wasn't the definitive solution
(eg: the future implementation of -blockdev). So I'd suggest to try and settle
with something that is good enough and that is not preventing a future 
improvement.

What about having a (temporary) flag in drive-mirror to accept also a 
new-image-file
until we will have the optimal solution?


Unless there are extenuating circumstances (like the absence of core 
infrastructure in QEMU), then we should not add commands that we know are not 
the right command.


We have to support our commands forever.

Regards,

Anthony Liguori








Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Paolo Bonzini
On 02/27/2012 05:33 PM, Federico Simoncelli wrote:
>> > 
>> > blockdev-begin-transaction
>> > drive-reopen device new-image-file
>> > drive-mirror streaming=false device dest
>> > blockdev-commit-transaction
>> > 
>> > No strange optional arguments, no proliferation of commands, etc.
>> >  The
>> > only downside is that if someone tries to do (4) without transactions
>> > (or without stopping the VM) they'll get corruption because atomicity
>> > is
>> > required.
> I'm all for the modularity of the commands (I suggested it since the 
> beginning),
> but all this infrastructure goes way beyond what I'd need for oVirt now.
> 
> When I submitted my patches we knew that my work wasn't the definitive 
> solution
> (eg: the future implementation of -blockdev). So I'd suggest to try and settle
> with something that is good enough and that is not preventing a future 
> improvement.
> 
> What about having a (temporary) flag in drive-mirror to accept also a 
> new-image-file
> until we will have the optimal solution?

What if libvirt could simply try blockdev-freeze and, if it's not there,
try passing a new-image-file argument.  Add the new-image-file
downstream if you have time constraints, and leave it out upstream.  Too
ugly?

Paolo



Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Federico Simoncelli
- Original Message -
> From: "Paolo Bonzini" 
> To: "Luiz Capitulino" 
> Cc: "Federico Simoncelli" , kw...@redhat.com, 
> mtosa...@redhat.com, qemu-devel@nongnu.org,
> arm...@redhat.com, "Jeff Cody" 
> Sent: Monday, February 27, 2012 3:39:33 PM
> Subject: drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen 
> and blockdev-migrate commands)
> 
> > > 4) incremental=true, new_image_file passed:
> > >- no images will be created
> > >- writes will be mirrored to new_image_file and dest
> 
> No need to provide this from within QEMU, because libvirt/oVirt can
> do
> the dance using elementary operations:
> 
> blockdev-begin-transaction
> drive-reopen device new-image-file
> drive-mirror streaming=false device dest
> blockdev-commit-transaction
> 
> No strange optional arguments, no proliferation of commands, etc.
>  The
> only downside is that if someone tries to do (4) without transactions
> (or without stopping the VM) they'll get corruption because atomicity
> is
> required.

I'm all for the modularity of the commands (I suggested it since the beginning),
but all this infrastructure goes way beyond what I'd need for oVirt now.

When I submitted my patches we knew that my work wasn't the definitive solution
(eg: the future implementation of -blockdev). So I'd suggest to try and settle
with something that is good enough and that is not preventing a future 
improvement.

What about having a (temporary) flag in drive-mirror to accept also a 
new-image-file
until we will have the optimal solution?

-- 
Federico



Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Anthony Liguori

On 02/27/2012 09:17 AM, Kevin Wolf wrote:

Am 27.02.2012 15:59, schrieb Anthony Liguori:

On 02/27/2012 08:54 AM, Paolo Bonzini wrote:

On 02/27/2012 03:46 PM, Anthony Liguori wrote:

I think a better way to think of this is as a batch submission.  It
would be relatively easy to model in QMP too (just have a batch-command
that has a list of commands as it's argument).

The difference between batch submission and a transaction is atomic
rollback. But I don't think atomic rollback is really needed here.


A transaction enforces atomicity at the block layer level.  It's
different from batch commands in two ways:

* bdrv_drain_all/bdrv_flush needs to be called at the beginning of the
commit.  This may not be the case with batch commands.

* with batch commands, atomicity happens by chance because VCPUs cannot
send I/O while the monitor is holding the global mutex.


If the commands are designed correctly, then this works well.  The problem is
that the current commands are not designed well.  For instance, multi-snapshot
could look like:

block-freeze ide0-hd0
block-freeze ide1-hd1
block-reopen ide0-hd0 my-new-file0.qcow2
block-reopen ide1-hd1 my-new-file1.qcow2
block-unfreeze ide1-hd1
block-unfreeze ide1-hd0

This would work regardless of whether the commands were implemented
asynchronously within QEMU too.


What if block-reopen ide1-hd1 fails? In case of failure, you want both
drives to point to their old image, not the first one to the new image
and the second one to the old image.


Then you get an error with the block devices still frozen.  You can execute 
another command to reopen back to the old image to roll back the transaction.


Pushing the rollback logic to the client does make the client interface a bit 
more complicated and adds latency to the error path but it's much easier than 
building a complex transaction infrastructure.


And there are examples of this in the wild too.  LDAP uses a similar mechanism.

Regards,

Anthony Liguori



Kevin





Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Kevin Wolf
Am 27.02.2012 15:59, schrieb Anthony Liguori:
> On 02/27/2012 08:54 AM, Paolo Bonzini wrote:
>> On 02/27/2012 03:46 PM, Anthony Liguori wrote:
>>> I think a better way to think of this is as a batch submission.  It
>>> would be relatively easy to model in QMP too (just have a batch-command
>>> that has a list of commands as it's argument).
>>>
>>> The difference between batch submission and a transaction is atomic
>>> rollback. But I don't think atomic rollback is really needed here.
>>
>> A transaction enforces atomicity at the block layer level.  It's
>> different from batch commands in two ways:
>>
>> * bdrv_drain_all/bdrv_flush needs to be called at the beginning of the
>> commit.  This may not be the case with batch commands.
>>
>> * with batch commands, atomicity happens by chance because VCPUs cannot
>> send I/O while the monitor is holding the global mutex.
> 
> If the commands are designed correctly, then this works well.  The problem is 
> that the current commands are not designed well.  For instance, 
> multi-snapshot 
> could look like:
> 
> block-freeze ide0-hd0
> block-freeze ide1-hd1
> block-reopen ide0-hd0 my-new-file0.qcow2
> block-reopen ide1-hd1 my-new-file1.qcow2
> block-unfreeze ide1-hd1
> block-unfreeze ide1-hd0
> 
> This would work regardless of whether the commands were implemented 
> asynchronously within QEMU too.

What if block-reopen ide1-hd1 fails? In case of failure, you want both
drives to point to their old image, not the first one to the new image
and the second one to the old image.

Kevin



Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Anthony Liguori

On 02/27/2012 09:03 AM, Paolo Bonzini wrote:

On 02/27/2012 03:59 PM, Anthony Liguori wrote:

The problem is that the current commands are not designed well.  For
instance, multi-snapshot could look like:

block-freeze ide0-hd0
block-freeze ide1-hd1
block-reopen ide0-hd0 my-new-file0.qcow2
block-reopen ide1-hd1 my-new-file1.qcow2
block-unfreeze ide1-hd1
block-unfreeze ide1-hd0

This would work regardless of whether the commands were implemented
asynchronously within QEMU too.


This looks good, too.  Positive: maps well to fsfreeze/thaw with help
from the guest agent.  Negative: you have to specify the devices three
times.  Overall, I think I like it.

However, you need to add freeze/unfreeze capabilities to the block
layer.  Not hard, but one more thing to do.


Right.  But it also generalizes to other QMP operations which is potentially 
interesting.


And providing mechanisms like this gives more flexibility to management tools to 
implement interesting features without constantly chancing new QMP commands.


Regards,

Anthony Liguori



Paolo





Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Paolo Bonzini
On 02/27/2012 03:59 PM, Anthony Liguori wrote:
> The problem is that the current commands are not designed well.  For
> instance, multi-snapshot could look like:
> 
> block-freeze ide0-hd0
> block-freeze ide1-hd1
> block-reopen ide0-hd0 my-new-file0.qcow2
> block-reopen ide1-hd1 my-new-file1.qcow2
> block-unfreeze ide1-hd1
> block-unfreeze ide1-hd0
> 
> This would work regardless of whether the commands were implemented
> asynchronously within QEMU too.

This looks good, too.  Positive: maps well to fsfreeze/thaw with help
from the guest agent.  Negative: you have to specify the devices three
times.  Overall, I think I like it.

However, you need to add freeze/unfreeze capabilities to the block
layer.  Not hard, but one more thing to do.

Paolo



Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Anthony Liguori

On 02/27/2012 08:54 AM, Paolo Bonzini wrote:

On 02/27/2012 03:46 PM, Anthony Liguori wrote:

I think a better way to think of this is as a batch submission.  It
would be relatively easy to model in QMP too (just have a batch-command
that has a list of commands as it's argument).

The difference between batch submission and a transaction is atomic
rollback. But I don't think atomic rollback is really needed here.


A transaction enforces atomicity at the block layer level.  It's
different from batch commands in two ways:

* bdrv_drain_all/bdrv_flush needs to be called at the beginning of the
commit.  This may not be the case with batch commands.

* with batch commands, atomicity happens by chance because VCPUs cannot
send I/O while the monitor is holding the global mutex.


If the commands are designed correctly, then this works well.  The problem is 
that the current commands are not designed well.  For instance, multi-snapshot 
could look like:


block-freeze ide0-hd0
block-freeze ide1-hd1
block-reopen ide0-hd0 my-new-file0.qcow2
block-reopen ide1-hd1 my-new-file1.qcow2
block-unfreeze ide1-hd1
block-unfreeze ide1-hd0

This would work regardless of whether the commands were implemented 
asynchronously within QEMU too.


Regards,

Anthony Liguori



Paolo





Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Paolo Bonzini
On 02/27/2012 03:46 PM, Anthony Liguori wrote:
> I think a better way to think of this is as a batch submission.  It
> would be relatively easy to model in QMP too (just have a batch-command
> that has a list of commands as it's argument).
> 
> The difference between batch submission and a transaction is atomic
> rollback. But I don't think atomic rollback is really needed here.

A transaction enforces atomicity at the block layer level.  It's
different from batch commands in two ways:

* bdrv_drain_all/bdrv_flush needs to be called at the beginning of the
commit.  This may not be the case with batch commands.

* with batch commands, atomicity happens by chance because VCPUs cannot
send I/O while the monitor is holding the global mutex.

Paolo



Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)

2012-02-27 Thread Anthony Liguori

On 02/27/2012 08:39 AM, Paolo Bonzini wrote:

On 02/27/2012 02:06 PM, Luiz Capitulino wrote:

IMHO, this is asking for two commands, where cases 1&  2 is one of them
and cases 3&  4 is the other one. Note how 'incremental' goes away and
'new_image_file' really becomes an optional.


I really would have no idea on naming except perhaps "drive_migrate" and
"funny_drive_migrate_for_ovirt".  But actually I must admit that it's a
rat's nest.

First, there's no reason why live-snapshotting to new_image_file
shouldn't be handled within QEMU.  That would make the above table much
more orthogonal.  However, oVirt likes to create its own snapshots.

Perhaps we do need to rethink this together with group snapshots.

There are four kinds of operations that we do on block devices:
(a) create an image.  This is part of what blockdev-snapshot does.
(b) switch a block device to a new image.  drive-reopen does this.
(c) add mirroring to a new destination.
(d) activate streaming from a base image

drive-migrate does (b) and (c), will do (a) and (d) when we add
pre-copy, and should do (a) right now if Federico wasn't an oVirt
developer. :)

Thinking more about it, the commands we _do_ need are:
- start a transaction
- switch to a new image
- add mirroring to a new destination
- commit a transaction
- rollback a transaction
- query transaction errors


I think a better way to think of this is as a batch submission.  It would be 
relatively easy to model in QMP too (just have a batch-command that has a list 
of commands as it's argument).


The difference between batch submission and a transaction is atomic rollback. 
But I don't think atomic rollback is really needed here.


Regards,

Anthony Liguori