Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/29/11 15:45, Anthony Liguori wrote: On 04/29/2011 08:38 AM, Jes Sorensen wrote: It is exactly the same for the management tool: - Creation of the new image either succeeds or fails - Switchover either succeeds or fails Creating an image can be treated as an atomic operation. It either fully succeeds or you assume it failed and throw the image away since you can always try again. When you combine creating an image with another operation, you create a a single operation that cannot be treated as atomic which makes recovery from failure much more difficult. As explained in previous emails, you still have the option to do this with the current implementation, if you so desire. It's a bad idea, but for those who insist on doing the wrong thing, sure go ahead. Just create the image first, and then pass it in as the snapshot file rather than specifying a snapshot file that doesn't exist already. 3) Begin switch over to new image 4) Switch over image. 5) Receive notification when it's complete Sorry but this isn't an accurate description of the process. Once you have a new image ready, you need to halt writes, flush buffers, and then you can do the switch, which has to be atomic. You need to queue writes, you can still let the guest run while the remaining writes are sent to disk. But if this is a background task, then as a management tool, don't I want a notification when it's been completed? Don't I want to have a little spinning box in my GUI or something like that while this is happening? I agree that having an interface that can run it in the background asynchronously isn't a bad thing, and it would be a nice feature. However it really doesn't qualify as anything but an enhancement in my book. You already have the option to pre-create the snapshot image if you so desire. But combining 1-5 in a single interface creates a command that while convenient on the command line, is not usable for a robust management tool. As I explained you can already use an externally created image with the current interface, the only issue that may be worth doing async is the buffer flushing. However once you do that, guest writes are going to stall anyway, and eventually the guest applications will all stall. The interface shouldn't have the option to create an image... because if you have that, the interface is difficult to use. Why present an option to do something that we know is broken? We can't blame management tools for not doing a good job managing KVM if we're giving them poor interfaces to work with. Sorry this is utterly bogus. This is *not* broken, it is the right way to do things. It is clearly a matter of taste whether you prefer it one way or another, but it is not broken. There is nothing wrong with it being an option, especially since you don't have to do anything magic for creating an image in advance. I realize it doesn't match your image of how you would like the command to work, but that is not the same as it being broken. I really see nothing in your above arguments that backs up the argument that the current implementation is broken in any way. I am totally in favor or having an async implementation as well, but I really don't see it being as big an issue as you are making it. Regards, Jes
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/28/11 17:10, Anthony Liguori wrote: On 04/28/2011 09:57 AM, Jes Sorensen wrote: On 04/28/11 16:46, Anthony Liguori wrote: Sorry this is inherently broken. The management tool should not be keeping state in this process. I agree an async interface would be nice, but the above process is plain wrong. The async snapshot process needs to be doing the exact same as in the current implementation, the main difference is that it would be running asynchronously and that QMP would be able to query the state of it. No, the command does too many things and as such, makes it impossible for a management tool to gracefully recover. It is exactly the same for the management tool: - Creation of the new image either succeeds or fails - Switchover either succeeds or fails If you have a crash in the process, you still can only know by checking the consistency of the new image after rebooting. There is no issue here, you have the exact same problem if you get a crash during d) in your example. It is the same with the existing command, the crash is only an issue if it happens right in the middle of the switch over. Until then, only the original image remains valid. But the new image is always valid once it's been created as pending writes are lost no matter what in a hard power failure. That suggests the following flow: 1) Create new image with a backing file 2) Completion ACK At this stage, the management tool should update it's internal state to point to the new image. This can still be done with the existing code - it's not the ideal setup, but it is possible due to evil things such as selinux labeling. 3) Begin switch over to new image 4) Switch over image. 5) Receive notification when it's complete Sorry but this isn't an accurate description of the process. Once you have a new image ready, you need to halt writes, flush buffers, and then you can do the switch, which has to be atomic. The only thing that can really be async in all of this is the buffer flushing. There isn't any long switch-over process. If (4) is happening asynchronously, things get kind of complicated. You can either wait for things to quiesce on their own or you can queue pending requests. It's unclear to me what the right strategy is or if there's a mixed strategy that's needed. I think it makes sense to treat 3/4 as a command with (5) being an event notification. The actual guest application freeze (what some strange people use the quiicannotspell word for) is done prior to the switchover, you cannot do it in parallel with it. But combining 1-5 in a single interface creates a command that while convenient on the command line, is not usable for a robust management tool. As I explained you can already use an externally created image with the current interface, the only issue that may be worth doing async is the buffer flushing. However once you do that, guest writes are going to stall anyway, and eventually the guest applications will all stall. The single command is both better from a consistency perspective, and it will do the right job. Things are not going to get any easier from the management tool's perspective than they are with the current interface. Cheers, Jes
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/29/2011 08:38 AM, Jes Sorensen wrote: On 04/28/11 17:10, Anthony Liguori wrote: No, the command does too many things and as such, makes it impossible for a management tool to gracefully recover. It is exactly the same for the management tool: - Creation of the new image either succeeds or fails - Switchover either succeeds or fails Creating an image can be treated as an atomic operation. It either fully succeeds or you assume it failed and throw the image away since you can always try again. When you combine creating an image with another operation, you create a a single operation that cannot be treated as atomic which makes recovery from failure much more difficult. But the new image is always valid once it's been created as pending writes are lost no matter what in a hard power failure. That suggests the following flow: 1) Create new image with a backing file 2) Completion ACK At this stage, the management tool should update it's internal state to point to the new image. This can still be done with the existing code - it's not the ideal setup, but it is possible due to evil things such as selinux labeling I don't follow. 3) Begin switch over to new image 4) Switch over image. 5) Receive notification when it's complete Sorry but this isn't an accurate description of the process. Once you have a new image ready, you need to halt writes, flush buffers, and then you can do the switch, which has to be atomic. You need to queue writes, you can still let the guest run while the remaining writes are sent to disk. But if this is a background task, then as a management tool, don't I want a notification when it's been completed? Don't I want to have a little spinning box in my GUI or something like that while this is happening? If (4) is happening asynchronously, things get kind of complicated. You can either wait for things to quiesce on their own or you can queue pending requests. It's unclear to me what the right strategy is or if there's a mixed strategy that's needed. I think it makes sense to treat 3/4 as a command with (5) being an event notification. The actual guest application freeze (what some strange people use the quiicannotspell word for) is done prior to the switchover, you cannot do it in parallel with it. I'm not even talking about application quiescing here. And yeah, I have to frequently google that word because Thunderbird doesn't have it in it's dictionary :-) But combining 1-5 in a single interface creates a command that while convenient on the command line, is not usable for a robust management tool. As I explained you can already use an externally created image with the current interface, the only issue that may be worth doing async is the buffer flushing. However once you do that, guest writes are going to stall anyway, and eventually the guest applications will all stall. The interface shouldn't have the option to create an image... because if you have that, the interface is difficult to use. Why present an option to do something that we know is broken? We can't blame management tools for not doing a good job managing KVM if we're giving them poor interfaces to work with. Regards, Anthony Liguori The single command is both better from a consistency perspective, and it will do the right job. Things are not going to get any easier from the management tool's perspective than they are with the current interface. Cheers, Jes
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/27/11 17:05, Jes Sorensen wrote: On 04/27/11 17:05, Luiz Capitulino wrote: On Mon, 18 Apr 2011 16:27:01 +0200 jes.soren...@redhat.com wrote: From: Jes Sorensen jes.soren...@redhat.com This is quivalent to snapshot_blkdev in the human monitor, with _sync added to the command name to make it explicit that the command is synchronous and leave space for a future async version. I'm not sure appending _sync is such a good convention, most commands are sync today and they don't have it. I'd prefer to call it snapshot_blkdev and note in the documentation how it works. On the other hand, I'm not sure how Anthony is going to model async commands, so maybe he has a better suggestion. The _sync prefix is on purpose to leave space for a possible async implementation of the snapshot command in the future. This isn't related to it being a sync vs async qmp command though. If people are more comfortable with the QMP command being blockdev-snapshot and then using {-,_}async for the async comment in the monitor and QMP later, that is fine with me. I'd just like to move on this and get it upstream. Cheers, Jes
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/27/11 17:05, Luiz Capitulino wrote: +Synchronous snapshot of block device, using snapshot file as target +if provided. It's not optional in HMP: (qemu) snapshot_blkdev ide0-hd0 Parameter 'snapshot_file' is missing (qemu) The parameter is optional in HMP, however it will fail if there is no snapshot filename or there is no flag for internal snapshots (which is currently unsupported). Jes
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/27/11 17:05, Luiz Capitulino wrote: +If a new image file is specified, the new image file will become the +new root image. If format is specified, the snapshot file will be +created in that format. Otherwise the snapshot will be internal! +(currently unsupported). Sorry for the stupid question, but what's a new root image? Also, all these assumptions seem human features to me, as it can save some typing and I can poke around to see where the snapshots are stored. All arguments should be mandatory in QMP, IMO. Sorry, but there is absolutely no reason to make all arguments mandatory. Sure it can be done, but the only result is a separate handling function for it, so we got more almost identical, but still different code to maintain. Finally, what's the expect behavior when -snapshot is used? I'm getting this: (qemu) snapshot_blkdev ide0-hd0 snap-test Could not open '/tmp/vl.6w8YXA' (qemu) What type of file system is your /tmp? You need to provide full path to the snapshot file if you don't want it created next to where your qemu binary is being executed. At first, I don't see why we shouldn't generate the live snapshot, but anyway, any special behavior like this should be noted in the section called Notes in the command's documentation. I don't follow this at all, please elaborate. Jes
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
Am 28.04.2011 15:21, schrieb Jes Sorensen: On 04/27/11 17:05, Luiz Capitulino wrote: +If a new image file is specified, the new image file will become the +new root image. If format is specified, the snapshot file will be +created in that format. Otherwise the snapshot will be internal! +(currently unsupported). Sorry for the stupid question, but what's a new root image? Also, all these assumptions seem human features to me, as it can save some typing and I can poke around to see where the snapshots are stored. All arguments should be mandatory in QMP, IMO. Sorry, but there is absolutely no reason to make all arguments mandatory. Sure it can be done, but the only result is a separate handling function for it, so we got more almost identical, but still different code to maintain. Finally, what's the expect behavior when -snapshot is used? I'm getting this: (qemu) snapshot_blkdev ide0-hd0 snap-test Could not open '/tmp/vl.6w8YXA' (qemu) What type of file system is your /tmp? You need to provide full path to the snapshot file if you don't want it created next to where your qemu binary is being executed. I think the problem is that this is a temporary file, i.e. unlinked directly after it has been opened. Trying to reopen a deleted file is a bad idea. Kevin
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/28/11 15:41, Kevin Wolf wrote: Finally, what's the expect behavior when -snapshot is used? I'm getting this: (qemu) snapshot_blkdev ide0-hd0 snap-test Could not open '/tmp/vl.6w8YXA' (qemu) What type of file system is your /tmp? You need to provide full path to the snapshot file if you don't want it created next to where your qemu binary is being executed. I think the problem is that this is a temporary file, i.e. unlinked directly after it has been opened. Trying to reopen a deleted file is a bad idea. True, but if /tmp is tmpfs, it isn't possible to open things O_DIRECT, which could also be the source of the problem here. Jes
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On Thu, 28 Apr 2011 15:21:41 +0200 Jes Sorensen jes.soren...@redhat.com wrote: On 04/27/11 17:05, Luiz Capitulino wrote: +If a new image file is specified, the new image file will become the +new root image. If format is specified, the snapshot file will be +created in that format. Otherwise the snapshot will be internal! +(currently unsupported). Sorry for the stupid question, but what's a new root image? Also, all these assumptions seem human features to me, as it can save some typing and I can poke around to see where the snapshots are stored. All arguments should be mandatory in QMP, IMO. Sorry, but there is absolutely no reason to make all arguments mandatory. Sure it can be done, but the only result is a separate handling function for it, so we got more almost identical, but still different code to maintain. We shouldn't compromise our external interface quality because of implementation details. What I'm really asking here is whether this is a good command for our management tools. For example, I've just realized that the new root image is going to be automatically created after the first call to this command, and subsequent calls w/o the snapshot file name will re-use that file. Is that correct? Also note the optional format usage, the command (randomly) picks qcow2 if the format is not given. What happens if I pass a raw image and don't specify the format? Will it work as it works for qcow2? I'm not exactly asking for mandatory arguments. For the format argument for example, we could try to auto-detect the format (is it possible)? And then we could fail with a meaningful error message. And, I also would like to hear from Anthony, as he's picking up QMP maintenance. Finally, what's the expect behavior when -snapshot is used? I'm getting this: (qemu) snapshot_blkdev ide0-hd0 snap-test Could not open '/tmp/vl.6w8YXA' (qemu) What type of file system is your /tmp? ext4 You need to provide full path to the snapshot file if you don't want it created next to where your qemu binary is being executed. I'm not running in /tmp. At first, I don't see why we shouldn't generate the live snapshot, but anyway, any special behavior like this should be noted in the section called Notes in the command's documentation. I don't follow this at all, please elaborate. Any kind of limitation should be noted in the documentation.
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/28/11 16:21, Luiz Capitulino wrote: On Thu, 28 Apr 2011 15:21:41 +0200 Jes Sorensen jes.soren...@redhat.com wrote: On 04/27/11 17:05, Luiz Capitulino wrote: All arguments should be mandatory in QMP, IMO. Sorry, but there is absolutely no reason to make all arguments mandatory. Sure it can be done, but the only result is a separate handling function for it, so we got more almost identical, but still different code to maintain. We shouldn't compromise our external interface quality because of implementation details. What I'm really asking here is whether this is a good command for our management tools. It has been discussed repeatedly for months, so yes I will argue it is. For example, I've just realized that the new root image is going to be automatically created after the first call to this command, and subsequent calls w/o the snapshot file name will re-use that file. Is that correct? No of course not. Every call to snapshot-blockdev will create a new snapshot. If you don't specify a filename, the new snapshot will be internal, except you will get an error as we don't currently support that. snapshots can be chained, so you can end up with a snapshot pointing to the previous snapshot, which points to the previous snapshot, which points to the original image . Also note the optional format usage, the command (randomly) picks qcow2 if the format is not given. What happens if I pass a raw image and don't specify the format? Will it work as it works for qcow2? The command doesn't pick randomly, it picks the default cow format for qemu. You cannot pass a raw image as it is not cow compatible. I'm not exactly asking for mandatory arguments. For the format argument for example, we could try to auto-detect the format (is it possible)? And then we could fail with a meaningful error message. The code is in there now, and there hasn't been requests for this in the past. The introduction of the qmp wrapper is not the place to discuss this. And, I also would like to hear from Anthony, as he's picking up QMP maintenance. Anthony already stated to me that he was fairly happy with it. However you are the QMP maintainer, so it needs to go in via the QMP tree. Finally, what's the expect behavior when -snapshot is used? I'm getting this: (qemu) snapshot_blkdev ide0-hd0 snap-test Could not open '/tmp/vl.6w8YXA' (qemu) What type of file system is your /tmp? ext4 You need to provide full path to the snapshot file if you don't want it created next to where your qemu binary is being executed. I'm not running in /tmp. Well something is funny with your /tmp then, as the above isn't normal behavior. At first, I don't see why we shouldn't generate the live snapshot, but anyway, any special behavior like this should be noted in the section called Notes in the command's documentation. I don't follow this at all, please elaborate. Any kind of limitation should be noted in the documentation. We cannot document a users choice of /tmp, when /tmp isn't part of what the command does. Jes
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/27/2011 10:05 AM, Luiz Capitulino wrote: On Mon, 18 Apr 2011 16:27:01 +0200 jes.soren...@redhat.com wrote: From: Jes Sorensenjes.soren...@redhat.com This is quivalent to snapshot_blkdev in the human monitor, with _sync added to the command name to make it explicit that the command is synchronous and leave space for a future async version. I'm not sure appending _sync is such a good convention, most commands are sync today and they don't have it. I'd prefer to call it snapshot_blkdev and note in the documentation how it works. It probably should be called snapshot_blkdev_broken because that's what it really is. The '_sync' is there to indicate that this command doesn't properly implement asynchronous logic and can break a guest. I'd actually prefer that we not expose this command through QMP at all and instead implement a proper snapshot command. Regards, Anthony Liguori On the other hand, I'm not sure how Anthony is going to model async commands, so maybe he has a better suggestion. Signed-off-by: Jes Sorensenjes.soren...@redhat.com --- qmp-commands.hx | 26 ++ 1 files changed, 26 insertions(+), 0 deletions(-) diff --git a/qmp-commands.hx b/qmp-commands.hx index fbd98ee..b8f537c 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -667,6 +667,32 @@ Example: EQMP { +.name = blockdev-snapshot-sync, +.args_type = device:B,snapshot_file:s?,format:s?, +.params = device [new-image-file] [format], +.user_print = monitor_user_noop, +.mhandler.cmd_new = do_snapshot_blkdev, +}, + +SQMP This doesn't follow QMP doc convention, which is: command name (Explain how the command works, like you do below) Arguments Example +Synchronous snapshot of block device, using snapshot file as target +if provided. It's not optional in HMP: (qemu) snapshot_blkdev ide0-hd0 Parameter 'snapshot_file' is missing (qemu) And the command argument is called snapshot_file not new-image-file as written in the HMP help text. + +If a new image file is specified, the new image file will become the +new root image. If format is specified, the snapshot file will be +created in that format. Otherwise the snapshot will be internal! +(currently unsupported). Sorry for the stupid question, but what's a new root image? Also, all these assumptions seem human features to me, as it can save some typing and I can poke around to see where the snapshots are stored. All arguments should be mandatory in QMP, IMO. Finally, what's the expect behavior when -snapshot is used? I'm getting this: (qemu) snapshot_blkdev ide0-hd0 snap-test Could not open '/tmp/vl.6w8YXA' (qemu) At first, I don't see why we shouldn't generate the live snapshot, but anyway, any special behavior like this should be noted in the section called Notes in the command's documentation. + +Errors: +If creating the new snapshot image fails, QEMU will continue running +on the original image. If switching to the newly created image fails, +it will be attempted to fall back to the original image and return +QERR_OPEN_FILE_FAILED with the snapshot filename. If re-opening +the original image fails, QERR_OPEN_FILE_FAILED will be returned with +the original image filename. +EQMP + +{ .name = balloon, .args_type = value:M, .params = target,
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/28/2011 09:21 AM, Luiz Capitulino wrote: On Thu, 28 Apr 2011 15:21:41 +0200 Jes Sorensenjes.soren...@redhat.com wrote: On 04/27/11 17:05, Luiz Capitulino wrote: +If a new image file is specified, the new image file will become the +new root image. If format is specified, the snapshot file will be +created in that format. Otherwise the snapshot will be internal! +(currently unsupported). Sorry for the stupid question, but what's a new root image? Also, all these assumptions seem human features to me, as it can save some typing and I can poke around to see where the snapshots are stored. All arguments should be mandatory in QMP, IMO. Sorry, but there is absolutely no reason to make all arguments mandatory. Sure it can be done, but the only result is a separate handling function for it, so we got more almost identical, but still different code to maintain. We shouldn't compromise our external interface quality because of implementation details. What I'm really asking here is whether this is a good command for our management tools. For example, I've just realized that the new root image is going to be automatically created after the first call to this command, and subsequent calls w/o the snapshot file name will re-use that file. Is that correct? Also note the optional format usage, the command (randomly) picks qcow2 if the format is not given. What happens if I pass a raw image and don't specify the format? Will it work as it works for qcow2? I'm not exactly asking for mandatory arguments. For the format argument for example, we could try to auto-detect the format (is it possible)? And then we could fail with a meaningful error message. And, I also would like to hear from Anthony, as he's picking up QMP maintenance. I've been ignoring this interface because it's fundamentally broken. Maybe we should not expose this via QMP and instead focus on making a proper interface for this operation. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/28/11 16:36, Anthony Liguori wrote: On 04/27/2011 10:05 AM, Luiz Capitulino wrote: On Mon, 18 Apr 2011 16:27:01 +0200 jes.soren...@redhat.com wrote: From: Jes Sorensenjes.soren...@redhat.com This is quivalent to snapshot_blkdev in the human monitor, with _sync added to the command name to make it explicit that the command is synchronous and leave space for a future async version. I'm not sure appending _sync is such a good convention, most commands are sync today and they don't have it. I'd prefer to call it snapshot_blkdev and note in the documentation how it works. It probably should be called snapshot_blkdev_broken because that's what it really is. The '_sync' is there to indicate that this command doesn't properly implement asynchronous logic and can break a guest. I'd actually prefer that we not expose this command through QMP at all and instead implement a proper snapshot command. Sorry but this is utterly bogus. The snapshot support as is works fine, and the command is in the monitor. We should expose it in QMP as well. If we eventually get a different implementation, then we can rename it or replace it then. Jes
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
Am 28.04.2011 15:46, schrieb Jes Sorensen: On 04/28/11 15:41, Kevin Wolf wrote: Finally, what's the expect behavior when -snapshot is used? I'm getting this: (qemu) snapshot_blkdev ide0-hd0 snap-test Could not open '/tmp/vl.6w8YXA' (qemu) What type of file system is your /tmp? You need to provide full path to the snapshot file if you don't want it created next to where your qemu binary is being executed. I think the problem is that this is a temporary file, i.e. unlinked directly after it has been opened. Trying to reopen a deleted file is a bad idea. True, but if /tmp is tmpfs, it isn't possible to open things O_DIRECT, which could also be the source of the problem here. Not really, -snapshot is very clearly the problem here. Note that what's failing here is not opening the new snapshot but the old temporary image created by -snapshot. Kevin
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/28/11 16:42, Kevin Wolf wrote: What type of file system is your /tmp? You need to provide full path to the snapshot file if you don't want it created next to where your qemu binary is being executed. I think the problem is that this is a temporary file, i.e. unlinked directly after it has been opened. Trying to reopen a deleted file is a bad idea. True, but if /tmp is tmpfs, it isn't possible to open things O_DIRECT, which could also be the source of the problem here. Not really, -snapshot is very clearly the problem here. Note that what's failing here is not opening the new snapshot but the old temporary image created by -snapshot. Sorry you are losing me here - why would reopening the old image fail? And if it fails, why does it have such a bizarre name in Luiz's case? Jes
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/28/2011 09:38 AM, Jes Sorensen wrote: Sorry but this is utterly bogus. The snapshot support as is works fine, and the command is in the monitor. We should expose it in QMP as well. It went in for the monitor because it was considered an imperfect command so we held up the QMP side because we wanted a better interface. The current command does: 1) Create new image backing to current image 2) Flush outstanding I/O to old image 3) Close current image 4) Reopen newly created image 5) Go Operations (1) and (2) are very synchronous operations. (4) can be too. We really should have a bdrv_aio_snapshot() function that implements the logic for at least (2) in an asynchronous fashion. That sort of interface is going to affect how we expose things in QMP. As from a QMP perspective, we're going to do something like: a) start snapshot b) query snapshot progress c) receive notification of snapshot completion d) flip over image And of course, this needs to be carefully thought through for race conditions. In the current command, what happens if you get a crash between (2) and (3)? There's no way for the management tools to know that we didn't finish flushing writes. How does the management tool know that (1) didn't fail mid way through resulting in a corrupted image? Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/28/11 16:46, Anthony Liguori wrote: On 04/28/2011 09:38 AM, Jes Sorensen wrote: Sorry but this is utterly bogus. The snapshot support as is works fine, and the command is in the monitor. We should expose it in QMP as well. It went in for the monitor because it was considered an imperfect command so we held up the QMP side because we wanted a better interface. I am not sure who is included in the 'we' here. The current command does: 1) Create new image backing to current image 2) Flush outstanding I/O to old image 3) Close current image 4) Reopen newly created image 5) Go Operations (1) and (2) are very synchronous operations. (4) can be too. We really should have a bdrv_aio_snapshot() function that implements the logic for at least (2) in an asynchronous fashion. That sort of interface is going to affect how we expose things in QMP. As from a QMP perspective, we're going to do something like: a) start snapshot b) query snapshot progress c) receive notification of snapshot completion d) flip over image Sorry this is inherently broken. The management tool should not be keeping state in this process. I agree an async interface would be nice, but the above process is plain wrong. The async snapshot process needs to be doing the exact same as in the current implementation, the main difference is that it would be running asynchronously and that QMP would be able to query the state of it. You definitely do *not* want the management tool to launch the snapshot process, and then do the flip by the management tool later. It should all happen as part of the original command. Being able to query it while it is progressing is a different story. The one reason why this isn't all that big an issue in the first place, is that in the normal usage case, a user will run a guest agent which freezes the guest file systems during the snapshot process, so the fact that the existing command is synchronous pretty much disappears in the noise. And of course, this needs to be carefully thought through for race conditions. In the current command, what happens if you get a crash between (2) and (3)? There's no way for the management tools to know that we didn't finish flushing writes. How does the management tool know that (1) didn't fail mid way through resulting in a corrupted image? There is no issue here, you have the exact same problem if you get a crash during d) in your example. It is the same with the existing command, the crash is only an issue if it happens right in the middle of the switch over. Until then, only the original image remains valid. Cheers, Jes
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/28/2011 09:57 AM, Jes Sorensen wrote: On 04/28/11 16:46, Anthony Liguori wrote: On 04/28/2011 09:38 AM, Jes Sorensen wrote: Sorry but this is utterly bogus. The snapshot support as is works fine, and the command is in the monitor. We should expose it in QMP as well. It went in for the monitor because it was considered an imperfect command so we held up the QMP side because we wanted a better interface. I am not sure who is included in the 'we' here. The current command does: 1) Create new image backing to current image 2) Flush outstanding I/O to old image 3) Close current image 4) Reopen newly created image 5) Go Operations (1) and (2) are very synchronous operations. (4) can be too. We really should have a bdrv_aio_snapshot() function that implements the logic for at least (2) in an asynchronous fashion. That sort of interface is going to affect how we expose things in QMP. As from a QMP perspective, we're going to do something like: a) start snapshot b) query snapshot progress c) receive notification of snapshot completion d) flip over image Sorry this is inherently broken. The management tool should not be keeping state in this process. I agree an async interface would be nice, but the above process is plain wrong. The async snapshot process needs to be doing the exact same as in the current implementation, the main difference is that it would be running asynchronously and that QMP would be able to query the state of it. No, the command does too many things and as such, makes it impossible for a management tool to gracefully recover. And of course, this needs to be carefully thought through for race conditions. In the current command, what happens if you get a crash between (2) and (3)? There's no way for the management tools to know that we didn't finish flushing writes. How does the management tool know that (1) didn't fail mid way through resulting in a corrupted image? There is no issue here, you have the exact same problem if you get a crash during d) in your example. It is the same with the existing command, the crash is only an issue if it happens right in the middle of the switch over. Until then, only the original image remains valid. But the new image is always valid once it's been created as pending writes are lost no matter what in a hard power failure. That suggests the following flow: 1) Create new image with a backing file 2) Completion ACK At this stage, the management tool should update it's internal state to point to the new image. 3) Begin switch over to new image 4) Switch over image. 5) Receive notification when it's complete If (4) is happening asynchronously, things get kind of complicated. You can either wait for things to quiesce on their own or you can queue pending requests. It's unclear to me what the right strategy is or if there's a mixed strategy that's needed. I think it makes sense to treat 3/4 as a command with (5) being an event notification. But combining 1-5 in a single interface creates a command that while convenient on the command line, is not usable for a robust management tool. Regards, Anthony Liguori Cheers, Jes
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On Mon, 18 Apr 2011 16:27:01 +0200 jes.soren...@redhat.com wrote: From: Jes Sorensen jes.soren...@redhat.com This is quivalent to snapshot_blkdev in the human monitor, with _sync added to the command name to make it explicit that the command is synchronous and leave space for a future async version. I'm not sure appending _sync is such a good convention, most commands are sync today and they don't have it. I'd prefer to call it snapshot_blkdev and note in the documentation how it works. On the other hand, I'm not sure how Anthony is going to model async commands, so maybe he has a better suggestion. Signed-off-by: Jes Sorensen jes.soren...@redhat.com --- qmp-commands.hx | 26 ++ 1 files changed, 26 insertions(+), 0 deletions(-) diff --git a/qmp-commands.hx b/qmp-commands.hx index fbd98ee..b8f537c 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -667,6 +667,32 @@ Example: EQMP { +.name = blockdev-snapshot-sync, +.args_type = device:B,snapshot_file:s?,format:s?, +.params = device [new-image-file] [format], +.user_print = monitor_user_noop, +.mhandler.cmd_new = do_snapshot_blkdev, +}, + +SQMP This doesn't follow QMP doc convention, which is: command name (Explain how the command works, like you do below) Arguments Example +Synchronous snapshot of block device, using snapshot file as target +if provided. It's not optional in HMP: (qemu) snapshot_blkdev ide0-hd0 Parameter 'snapshot_file' is missing (qemu) And the command argument is called snapshot_file not new-image-file as written in the HMP help text. + +If a new image file is specified, the new image file will become the +new root image. If format is specified, the snapshot file will be +created in that format. Otherwise the snapshot will be internal! +(currently unsupported). Sorry for the stupid question, but what's a new root image? Also, all these assumptions seem human features to me, as it can save some typing and I can poke around to see where the snapshots are stored. All arguments should be mandatory in QMP, IMO. Finally, what's the expect behavior when -snapshot is used? I'm getting this: (qemu) snapshot_blkdev ide0-hd0 snap-test Could not open '/tmp/vl.6w8YXA' (qemu) At first, I don't see why we shouldn't generate the live snapshot, but anyway, any special behavior like this should be noted in the section called Notes in the command's documentation. + +Errors: +If creating the new snapshot image fails, QEMU will continue running +on the original image. If switching to the newly created image fails, +it will be attempted to fall back to the original image and return +QERR_OPEN_FILE_FAILED with the snapshot filename. If re-opening +the original image fails, QERR_OPEN_FILE_FAILED will be returned with +the original image filename. +EQMP + +{ .name = balloon, .args_type = value:M, .params = target,
Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
On 04/27/11 17:05, Luiz Capitulino wrote: On Mon, 18 Apr 2011 16:27:01 +0200 jes.soren...@redhat.com wrote: From: Jes Sorensen jes.soren...@redhat.com This is quivalent to snapshot_blkdev in the human monitor, with _sync added to the command name to make it explicit that the command is synchronous and leave space for a future async version. I'm not sure appending _sync is such a good convention, most commands are sync today and they don't have it. I'd prefer to call it snapshot_blkdev and note in the documentation how it works. On the other hand, I'm not sure how Anthony is going to model async commands, so maybe he has a better suggestion. The _sync prefix is on purpose to leave space for a possible async implementation of the snapshot command in the future. This isn't related to it being a sync vs async qmp command though. Jes
[Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
From: Jes Sorensen jes.soren...@redhat.com This is quivalent to snapshot_blkdev in the human monitor, with _sync added to the command name to make it explicit that the command is synchronous and leave space for a future async version. Signed-off-by: Jes Sorensen jes.soren...@redhat.com --- qmp-commands.hx | 26 ++ 1 files changed, 26 insertions(+), 0 deletions(-) diff --git a/qmp-commands.hx b/qmp-commands.hx index fbd98ee..b8f537c 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -667,6 +667,32 @@ Example: EQMP { +.name = blockdev-snapshot-sync, +.args_type = device:B,snapshot_file:s?,format:s?, +.params = device [new-image-file] [format], +.user_print = monitor_user_noop, +.mhandler.cmd_new = do_snapshot_blkdev, +}, + +SQMP +Synchronous snapshot of block device, using snapshot file as target +if provided. + +If a new image file is specified, the new image file will become the +new root image. If format is specified, the snapshot file will be +created in that format. Otherwise the snapshot will be internal! +(currently unsupported). + +Errors: +If creating the new snapshot image fails, QEMU will continue running +on the original image. If switching to the newly created image fails, +it will be attempted to fall back to the original image and return +QERR_OPEN_FILE_FAILED with the snapshot filename. If re-opening +the original image fails, QERR_OPEN_FILE_FAILED will be returned with +the original image filename. +EQMP + +{ .name = balloon, .args_type = value:M, .params = target, -- 1.7.4.4