Re: [Qemu-devel] [v2 Patch 2/9]block: raw-posix image file reopen
On 08/10/2012 07:15 PM, Corey Bryant wrote: > > > On 07/30/2012 05:34 PM, Supriya Kannery wrote: >> +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState >> **prs, >> + int flags) >> +{ >> + BDRVRawReopenState *raw_rs = g_malloc0(sizeof(BDRVRawReopenState)); >> + BDRVRawState *s = bs->opaque; >> + int ret = 0; >> + >> + raw_rs->reopen_state.bs = bs; >> + >> + /* stash state before reopen */ >> + raw_rs->stash_s = g_malloc0(sizeof(BDRVRawState)); >> + raw_stash_state(raw_rs->stash_s, s); >> + s->fd = dup3(raw_rs->stash_s->fd, s->fd, O_CLOEXEC); >> + >> + *prs = &(raw_rs->reopen_state); >> + >> + /* Flags that can be set using fcntl */ >> + int fcntl_flags = BDRV_O_NOCACHE; >> + >> + if ((bs->open_flags & ~fcntl_flags) == (flags & ~fcntl_flags)) { >> + if ((flags & BDRV_O_NOCACHE)) { >> + s->open_flags |= O_DIRECT; >> + } else { >> + s->open_flags &= ~O_DIRECT; >> + } >> + ret = fcntl_setfl(s->fd, s->open_flags); >> + } else { >> + >> + /* close and reopen using new flags */ >> + bs->drv->bdrv_close(bs); >> + ret = bs->drv->bdrv_file_open(bs, bs->filename, flags); > > Will this allow the fdset refcount to get to zero? I was hoping your > patches would prevent that from happening. Perhaps Kevin or Eric can > weigh in. qemu_open() increments the refcount for an fdset when an fd > from it is used, and qemu_close() decrements it. I think if you were > able to perform the open before the close here that refcount wouldn't > get to zero. > Since we are duping the file descriptor before reaching this bdrv_close(), refcount for fd won't become zero. - thanks, Supriya
Re: [Qemu-devel] [v2 Patch 2/9]block: raw-posix image file reopen
On 07/31/2012 10:47 PM, Eric Blake wrote: > On 07/30/2012 03:34 PM, Supriya Kannery wrote: >> +s->fd = dup3(raw_rs->stash_s->fd, s->fd, O_CLOEXEC); > > You called it out in your cover letter, but just pointing out that this > is one of the locations that needs a conditional fallback to > dup2/qemu_set_cloexec if dup3 and O_CLOEXEC are missing. > > More importantly, though, you want this to be fcntl(F_DUP_CLOEXEC) and > NOT dup3, so that you are duplicating to the first available fd rather > than accidentally overwriting whatever s->fd happened to be on entry. > Also, where is your error checking that you didn't just assign s->fd to > -1? It looks like your goal here is to stash a copy of the fd, so that > on failure you can then pivot over to your copy. > Will use fcntl(F_DUP_CLOEXEC) in updated patch. >> + >> +*prs =&(raw_rs->reopen_state); >> + >> +/* Flags that can be set using fcntl */ >> +int fcntl_flags = BDRV_O_NOCACHE; > > This variable name is misleading; fcntl_flags in my mind implies O_* not > BDRV_O_*, as we are not guaranteed that they are the same values. Maybe > bdrv_flags is a better name. Or are you trying to list the subset of > BDRV_O flags that can be changed via mapping to the appropriate O_ flag > during fcntl? > From the list of flags in bdrv->openflags (BDRV_O*), only few of them can be changed using fcntl. So to identify those allowed subset, I am using fcntl_flags. Excerpt from man fcntl for F_SETFL: On Linux this command can only change the O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME, and O_NONBLOCK flags. May be naming it fcntl_supported_flags would be better? - thanks, Supriya
Re: [Qemu-devel] [v2 Patch 1/9]block: Framework for reopening image files safely
On 08/09/2012 06:32 PM, Jeff Cody wrote: > On 08/09/2012 05:20 AM, Kevin Wolf wrote: >> Am 09.08.2012 06:26, schrieb Jeff Cody: >>> On 07/30/2012 05:34 PM, Supriya Kannery wrote: >>>> + >>>> +void bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp) >>>> +{ >>>> +BlockDriver *drv = bs->drv; >>>> +int ret = 0; >>>> +BDRVReopenState *reopen_state = NULL; >>>> + >>> >>> We should assert if drv is NULL: >>> >>> assert(drv != NULL); >>> >>> >>>> +/* Quiesce IO for the given block device */ >>>> +bdrv_drain_all(); >>>> +ret = bdrv_flush(bs); >>>> +if (ret != 0) { >>>> +error_set(errp, QERR_IO_ERROR); >>>> +return; >>>> +} >>>> + >>> >>> We also need to reopen bs->file, so maybe something like this: >>> >>> /* open any file images */ >>> if (bs->file) { >>> bdrv_reopen(bs->file, bdrv_flags, errp); >>> if (errp&& *errp) { >>> goto exit; >>> } >>> } >>> >>> This will necessitate making the handlers in the raw.c file just stubs >>> (I'll respond to that patch as well). >> >> Doesn't this break the transactional semantics? I think you should only >> prepare the bs->file reopen here and commit it when committing this one. >> > > Yes, it would, so if everything stayed as it is now, that would be the > right way to do it... but one thing that would be nice (I also mention > this in my comments on patch 0/9) is that it become transactional for > multiple BlockDriverStates to be reopened. That would obviously change > the interface a bit, so that multiple BDS entries could be queued, but > then it shouldn't be any different to queue the bs->file as well as the > bs. > > All prepare() stages would happen on queued items, and only > progress to commit() if all prepare() stages passed, as you mentioned. Yes, will work on to get the transactional semantics applied to bs->file reopen as well. > > One other thing, and perhaps this is nit-picking some - but the > prepare() functions all modify the 'live' structures, after backing them > up into stashes. So, on abort, the structures are restored from the > stashes, and on commit the stashes are discarded. Conceptually, it > seems cleaner to this the opposite way - prepare() does it work into > temporary stashes, and the commit() then copies the data over from the > stash to the live structure, and abort() would just discard the stashes. > > prepare()/commit() and abort() are from the perspective of new changes to be tried on respective block driver state. Once block driver is ready for the state change, then commit() means we are commiting these new changes to driver state. Similarly abort() means we are aborting these new changes half way and getting back to old stashed state. This is the intended logic. - thanks, Supriya
Re: [Qemu-devel] [v2 Patch 1/9]block: Framework for reopening image files safely
On 08/09/2012 02:43 AM, Jeff Cody wrote: On 07/30/2012 05:34 PM, Supriya Kannery wrote: Struct BDRVReopenState along with three reopen related functions introduced for handling reopening of images safely. This can be extended by each of the block drivers to reopen respective image files. + +bdrv_reopen_commit(bs, reopen_state); +bs->open_flags = bdrv_flags; You also need to set bs->read_only here, like so: bs->read_only = !(bdrv_flags& BDRV_O_RDWR); Sure..will include in updated patch. - thanks, Supriya
Re: [Qemu-devel] [v2 Patch 1/9]block: Framework for reopening image files safely
On 08/03/2012 01:49 AM, Luiz Capitulino wrote: > On Tue, 31 Jul 2012 03:04:22 +0530 > Supriya Kannery wrote: > >> + >> +void bdrv_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) >> +{ >> +BlockDriver *drv = bs->drv; >> + >> +drv->bdrv_reopen_commit(bs, rs); >> +} >> + >> +void bdrv_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) >> +{ >> +BlockDriver *drv = bs->drv; >> + >> +drv->bdrv_reopen_abort(bs, rs); >> +} >> + >> +void bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp) >> +{ >> +BlockDriver *drv = bs->drv; >> +int ret = 0; >> +BDRVReopenState *reopen_state = NULL; >> + >> +/* Quiesce IO for the given block device */ >> +bdrv_drain_all(); >> +ret = bdrv_flush(bs); >> +if (ret != 0) { >> +error_set(errp, QERR_IO_ERROR); >> +return; >> +} >> + >> +/* Use driver specific reopen() if available */ >> +if (drv->bdrv_reopen_prepare) { >> +ret = bdrv_reopen_prepare(bs,&reopen_state, bdrv_flags); >> + if (ret< 0) { >> +bdrv_reopen_abort(bs, reopen_state); > > Why do you have to call bdrv_reopen_abort()? I'd expect bdrv_reopen_prepare() > (to be able) to undo anything it has done. > Having separate abort function avoids cluttering of reopen- prepare(). We wanted to logically separate out preparation, commit and abort. Same format is followed in implementations at block driver level as well. -thanks, Supriya
[Qemu-devel] [v2 Patch 8/9]block: Cmd "block_set_hostcache" for dynamic cache change
New command "block_set_hostcache" added for dynamically changing host pagecache setting of a block device. Usage: block_set_hostcache = block device = on/off Example: (qemu) block_set_hostcache ide0-hd0 off Signed-off-by: Supriya Kannery --- Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -970,6 +970,30 @@ void bdrv_close_all(void) } } +void bdrv_change_hostcache(BlockDriverState *bs, bool enable, Error **errp) +{ +int bdrv_flags = bs->open_flags; + +/* set hostcache flags (without changing WCE/flush bits) */ +if (enable) { +bdrv_flags &= ~BDRV_O_NOCACHE; +} else { +bdrv_flags |= BDRV_O_NOCACHE; +} + +/* If no change in flags, no need to reopen */ +if (bdrv_flags != bs->open_flags) { +if (bdrv_is_inserted(bs)) { +/* Reopen file with changed set of flags */ +bdrv_flags &= ~BDRV_O_CACHE_WB; +bdrv_reopen(bs, bdrv_flags, errp); +} else { +/* Save hostcache change for future use */ +bs->open_flags = bdrv_flags; +} +} +} + /* * Wait for pending requests to complete across all BlockDriverStates * Index: qemu/blockdev.c === --- qemu.orig/blockdev.c +++ qemu/blockdev.c @@ -1191,3 +1191,22 @@ BlockJobInfoList *qmp_query_block_jobs(E bdrv_iterate(do_qmp_query_block_jobs_one, &prev); return dummy.next; } + +/* + * Change host page cache setting while guest is running. +*/ +void qmp_block_set_hostcache(const char *device, bool enable, + Error **errp) +{ +BlockDriverState *bs = NULL; + +/* Validate device */ +bs = bdrv_find(device); +if (!bs) { +error_set(errp, QERR_DEVICE_NOT_FOUND, device); +return; +} + +bdrv_change_hostcache(bs, enable, errp); +return; +} Index: qemu/hmp-commands.hx === --- qemu.orig/hmp-commands.hx +++ qemu/hmp-commands.hx @@ -1344,6 +1344,21 @@ passed since 1970, i.e. unix epoch. ETEXI { +.name = "block_set_hostcache", +.args_type = "device:B,option:b", +.params = "device on|off", +.help = "Change setting of host pagecache", +.mhandler.cmd = hmp_block_set_hostcache, +}, + +STEXI +@item block_set_hostcache @var{device} @var{option} +@findex block_set_hostcache +Change host pagecache setting of a block device while guest is running. +ETEXI + + +{ .name = "info", .args_type = "item:s?", .params = "[subcommand]", Index: qemu/qmp-commands.hx === --- qemu.orig/qmp-commands.hx +++ qemu/qmp-commands.hx @@ -987,6 +987,30 @@ Example: EQMP { +.name = "block_set_hostcache", +.args_type = "device:B,option:b", +.mhandler.cmd_new = qmp_marshal_input_block_set_hostcache, +}, + +SQMP +block_set_hostcache +--- + +Change host pagecache setting of a block device + +Arguments: + +- "device": the device's ID, must be unique (json-string) +- "option": hostcache setting (json-bool) + +Example: +-> { +"execute": "block_set_hostcache", "arguments": { "device": "ide0-hd0", "option": false } } +<- { "return": {} } + +EQMP + + +{ .name = "set_password", .args_type = "protocol:s,password:s,connected:s?", .mhandler.cmd_new = qmp_marshal_input_set_password, Index: qemu/qapi-schema.json === --- qemu.orig/qapi-schema.json +++ qemu/qapi-schema.json @@ -1604,6 +1604,23 @@ 'iops': 'int', 'iops_rd': 'int', 'iops_wr': 'int' } } ## +# @block_set_hostcache: +# +# Change host pagecache setting of a block device +# +# @device: name of the block device +# +# @option: hostcache setting (true/false) +# +# Returns: Nothing on success +# If @device is not a valid block device, DeviceNotFound +# +# Since: 1.2 +## +{ 'command': 'block_set_hostcache', + 'data': { 'device': 'str', 'option': 'bool' } } + +## # @block-stream: # # Copy data from a backing file into a block device. Index: qemu/hmp.c === --- qemu.orig/hmp.c +++ qemu/hmp.c @@ -837,6 +837,15 @@ void hmp_block_set_io_throttle(Monitor * hmp_handle_error(mon, &err); } +void hmp_block_set_hostcache(Monitor *mon, const QDict *qdict) +{ +E
Re: [Qemu-devel] [v2 Patch 6/9]block: qcow image file reopen
qcow driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Shrinidhi Joshi Index: qemu/block/qcow.c === --- qemu.orig/block/qcow.c +++ qemu/block/qcow.c @@ -78,7 +78,14 @@ typedef struct BDRVQcowState { Error *migration_blocker; } BDRVQcowState; +typedef struct BDRVQcowReopenState { +BDRVReopenState reopen_state; +BDRVQcowState *stash_s; +} BDRVQcowReopenState; + static int decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset); +static void qcow_stash_state(BDRVQcowState *stash_s, BDRVQcowState *s); +static void qcow_revert_state(BDRVQcowState *s, BDRVQcowState *stash_s); static int qcow_probe(const uint8_t *buf, int buf_size, const char *filename) { @@ -197,6 +204,103 @@ static int qcow_open(BlockDriverState *b return ret; } +static int qcow_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVQcowReopenState *qcow_rs = g_malloc0(sizeof(BDRVQcowReopenState)); +int ret = 0; +BDRVQcowState *s = bs->opaque; + +qcow_rs->reopen_state.bs = bs; +qcow_rs->stash_s = g_malloc0(sizeof(BDRVQcowState)); +qcow_stash_state(qcow_rs->stash_s, s); +*prs = &(qcow_rs->reopen_state); + +ret = qcow_open(bs, flags); +return ret; +} + +static void qcow_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVQcowReopenState *qcow_rs; + +qcow_rs = container_of(rs, BDRVQcowReopenState, reopen_state); +g_free(qcow_rs->stash_s); +g_free(qcow_rs); +} + +static void qcow_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVQcowReopenState *qcow_rs; +BDRVQcowState *s = bs->opaque; + +qcow_rs = container_of(rs, BDRVQcowReopenState, reopen_state); + +/* revert to stashed state */ +qcow_revert_state(s, qcow_rs->stash_s); + +g_free(qcow_rs->stash_s); +g_free(qcow_rs); +} + +static void qcow_stash_state(BDRVQcowState *stash_s, BDRVQcowState *s) +{ +int i; + +stash_s->cluster_bits = s->cluster_bits; +stash_s->cluster_size = s->cluster_size; +stash_s->cluster_sectors = s->cluster_sectors; +stash_s->l2_bits = s->l2_bits; +stash_s->l2_size = s->l2_size; +stash_s->l1_size = s->l1_size; +stash_s->cluster_offset_mask = s->cluster_offset_mask; +stash_s->l1_table_offset = s->l1_table_offset; +stash_s->l1_table = s->l1_table; +stash_s->l2_cache = s->l2_cache; +for (i = 0; i < L2_CACHE_SIZE; i++) { +stash_s->l2_cache_offsets[i] = s->l2_cache_offsets[i]; +stash_s->l2_cache_counts[i] = s->l2_cache_counts[i]; +} +stash_s->cluster_cache = s->cluster_cache; +stash_s->cluster_data = s->cluster_data; +stash_s->cluster_cache_offset = s->cluster_cache_offset; +stash_s->crypt_method = s->crypt_method; +stash_s->crypt_method_header = s->crypt_method_header; +stash_s->aes_encrypt_key = s->aes_encrypt_key; +stash_s->aes_decrypt_key = s->aes_decrypt_key; +stash_s->lock = s->lock; +stash_s->migration_blocker = s->migration_blocker; +} + +static void qcow_revert_state(BDRVQcowState *s, BDRVQcowState *stash_s) +{ +int i; + +s->cluster_bits = stash_s->cluster_bits; +s->cluster_size = stash_s->cluster_size; +s->cluster_sectors = stash_s->cluster_sectors; +s->l2_bits = stash_s->l2_bits; +s->l2_size = stash_s->l2_size; +s->l1_size = stash_s->l1_size; +s->cluster_offset_mask = stash_s->cluster_offset_mask; +s->l1_table_offset = stash_s->l1_table_offset; +s->l1_table = stash_s->l1_table; +s->l2_cache = stash_s->l2_cache; +for (i = 0; i < L2_CACHE_SIZE; i++) { +s->l2_cache_offsets[i] = s->l2_cache_offsets[i]; +s->l2_cache_counts[i] = stash_s->l2_cache_counts[i]; +} +s->cluster_cache = stash_s->cluster_cache; +s->cluster_data = stash_s->cluster_data; +s->cluster_cache_offset = stash_s->cluster_cache_offset; +s->crypt_method = stash_s->crypt_method; +s->crypt_method_header = stash_s->crypt_method_header; +s->aes_encrypt_key = stash_s->aes_encrypt_key; +s->aes_decrypt_key = stash_s->aes_decrypt_key; +s->lock = stash_s->lock; +s->migration_blocker = stash_s->migration_blocker; +} + static int qcow_set_key(BlockDriverState *bs, const char *key) { BDRVQcowState *s = bs->opaque; @@ -870,6 +974,10 @@ static BlockDriver bdrv_qcow = { .bdrv_close= qcow_close, .bdrv_create = qcow_create, +.bdrv_reopen_prepare= qcow_reopen_prepare, +.bdrv_reopen_commit = qcow_reopen_commit, +.bdrv_reopen_abort = qcow_reopen_abort, + .bdrv_co_readv = qcow_co_readv, .bdrv_co_writev = qcow_co_writev, .bdrv_co_is_allocated = qcow_co_is_allocated,
Re: [Qemu-devel] [v2 Patch 0/9]block: Image file reopen and dynamic host pagecache change
On 08/01/2012 02:03 AM, Jeff Cody wrote: On 07/30/2012 05:34 PM, Supriya Kannery wrote: For changing host pagecache setting of a running VM, it is important to have a safe way of reopening its image file. Hi Supriya, I never received patches 6 or 8, either through the list or at my direct email address - looking at the qemu-devel archives, they are also missing from there: http://lists.gnu.org/archive/html/qemu-devel/2012-07/threads.html Did those get lost somehow, or perhaps they didn't get sent? Thanks, Jeff Thanks! Jeff, for pointing this out. Looks like they got lost somehow, because the I had tried to send the patchset to myself as a trial before posting to qemu-devel and there I can see 6 and 8 as well. Will resend the lost patches now. -Rgds, Supriya
[Qemu-devel] [v2 Patch 7/9]block: qed image file reopen
qed driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery --- Index: qemu/block/qed.c === --- qemu.orig/block/qed.c +++ qemu/block/qed.c @@ -18,6 +18,14 @@ #include "qerror.h" #include "migration.h" +typedef struct BDRVQEDReopenState { +BDRVReopenState reopen_state; +BDRVQEDState *stash_s; +} BDRVQEDReopenState; + +static void qed_stash_state(BDRVQEDState *stashed_state, BDRVQEDState *s); +static void qed_revert_state(BDRVQEDState *s, BDRVQEDState *stashed_state); + static void qed_aio_cancel(BlockDriverAIOCB *blockacb) { QEDAIOCB *acb = (QEDAIOCB *)blockacb; @@ -512,6 +520,98 @@ out: return ret; } +static int bdrv_qed_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVQEDReopenState *qed_rs = g_malloc0(sizeof(BDRVQEDReopenState)); +int ret = 0; +BDRVQEDState *s = bs->opaque; + +qed_rs->reopen_state.bs = bs; + +/* save state before reopen */ +qed_rs->stash_s = g_malloc0(sizeof(BDRVQEDState)); +qed_stash_state(qed_rs->stash_s, s); +*prs = &(qed_rs->reopen_state); + +/* Reopen file with new flags */ + ret = bdrv_qed_open(bs, flags); + return ret; +} + +static void bdrv_qed_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVQEDReopenState *qed_rs; +BDRVQEDState *stashed_s; + +qed_rs = container_of(rs, BDRVQEDReopenState, reopen_state); +stashed_s = qed_rs->stash_s; + +qed_cancel_need_check_timer(stashed_s); +qemu_free_timer(stashed_s->need_check_timer); + +qed_free_l2_cache(&stashed_s->l2_cache); +qemu_vfree(stashed_s->l1_table); + +g_free(stashed_s); +g_free(qed_rs); +} + +static void bdrv_qed_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVQEDReopenState *qed_rs; +BDRVQEDState *s = bs->opaque; +BDRVQEDState *stashed_s; + +qed_rs = container_of(rs, BDRVQEDReopenState, reopen_state); + +/* Revert to stashed state */ +qed_revert_state(s, qed_rs->stash_s); +stashed_s = qed_rs->stash_s; + +g_free(stashed_s); +g_free(qed_rs); +} + +static void qed_stash_state(BDRVQEDState *stashed_state, BDRVQEDState *s) +{ +stashed_state->bs = s->bs; +stashed_state->file_size = s->file_size; + +stashed_state->header = s->header; +stashed_state->l1_table = s->l1_table; +stashed_state->l2_cache = s->l2_cache; +stashed_state->table_nelems = s->table_nelems; +stashed_state->l1_shift = s->l1_shift; +stashed_state->l2_shift = s->l2_shift; +stashed_state->l2_mask = s->l2_mask; + +stashed_state->allocating_write_reqs = s->allocating_write_reqs; +stashed_state->allocating_write_reqs_plugged = + s->allocating_write_reqs_plugged; + +stashed_state->need_check_timer = s->need_check_timer; +} + +static void qed_revert_state(BDRVQEDState *s, BDRVQEDState *stashed_state) +{ +s->bs = stashed_state->bs; +s->file_size = stashed_state->file_size; + +s->header = stashed_state->header; +s->l1_table = stashed_state->l1_table; +s->l2_cache = stashed_state->l2_cache; +s->table_nelems = stashed_state->table_nelems; +s->l1_shift = stashed_state->l1_shift; +s->l2_shift = stashed_state->l2_shift; +s->l2_mask = stashed_state->l2_mask; + +s->allocating_write_reqs = s->allocating_write_reqs; +s->allocating_write_reqs_plugged = s->allocating_write_reqs_plugged; + +s->need_check_timer = s->need_check_timer; +} + static void bdrv_qed_close(BlockDriverState *bs) { BDRVQEDState *s = bs->opaque; @@ -1560,6 +1660,9 @@ static BlockDriver bdrv_qed = { .bdrv_rebind = bdrv_qed_rebind, .bdrv_open= bdrv_qed_open, .bdrv_close = bdrv_qed_close, +.bdrv_reopen_prepare = bdrv_qed_reopen_prepare, +.bdrv_reopen_commit = bdrv_qed_reopen_commit, +.bdrv_reopen_abort= bdrv_qed_reopen_abort, .bdrv_create = bdrv_qed_create, .bdrv_co_is_allocated = bdrv_qed_co_is_allocated, .bdrv_make_empty = bdrv_qed_make_empty,
[Qemu-devel] [v2 Patch 9/9]block: Enhance "info block" to display host cache setting
Enhance "info block" to display hostcache setting for each block device. Example: (qemu) info block ide0-hd0: removable=0 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Enhanced to display "hostcache" setting: (qemu) info block ide0-hd0: removable=0 hostcache=1 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Signed-off-by: Supriya Kannery --- Index: qemu/qapi-schema.json === --- qemu.orig/qapi-schema.json +++ qemu/qapi-schema.json @@ -453,6 +453,8 @@ # @locked: True if the guest has locked this device from having its media # removed # +# @hostcache: True if host pagecache is enabled. +# # @tray_open: #optional True if the device has a tray and it is open # (only present if removable is true) # @@ -466,7 +468,7 @@ ## { 'type': 'BlockInfo', 'data': {'device': 'str', 'type': 'str', 'removable': 'bool', - 'locked': 'bool', '*inserted': 'BlockDeviceInfo', + 'locked': 'bool','hostcache': 'bool','*inserted': 'BlockDeviceInfo', '*tray_open': 'bool', '*io-status': 'BlockDeviceIoStatus'} } ## Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -2504,6 +2504,7 @@ BlockInfoList *qmp_query_block(Error **e info->value->device = g_strdup(bs->device_name); info->value->type = g_strdup("unknown"); info->value->locked = bdrv_dev_is_medium_locked(bs); +info->value->hostcache = !(bs->open_flags & BDRV_O_NOCACHE); info->value->removable = bdrv_dev_has_removable_media(bs); if (bdrv_dev_has_removable_media(bs)) { Index: qemu/hmp.c === --- qemu.orig/hmp.c +++ qemu/hmp.c @@ -215,6 +215,8 @@ void hmp_info_block(Monitor *mon) monitor_printf(mon, " tray-open=%d", info->value->tray_open); } +monitor_printf(mon, " hostcache=%d", info->value->hostcache); + if (info->value->has_io_status) { monitor_printf(mon, " io-status=%s", BlockDeviceIoStatus_lookup[info->value->io_status]);
[Qemu-devel] [v2 Patch 0/9]block: Image file reopen and dynamic host pagecache change
For changing host pagecache setting of a running VM, it is important to have a safe way of reopening its image file. V1 introduced: * a generic way to reopen image files safely. In this approach, before reopening an image, for each block driver, its state will be stashed. Incase preparation, (bdrv_reopen_prepare) for reopening returns success, the stashed state will be cleared (bdrv_reopen_commit) and reopened state will be used further. Incase preparation of reopening returns failure, the state of the driver will be rolled back (bdrv_reopen_abort) to the stashed state. This approach is implemented for raw-posix, raw-win32, vmdk, qcow, qcow2 and qed block drivers. * qmp and hmp command 'block_set_hostcache' using which host pagecache setting for a block device can be changed when the VM is running. * BDRVReopenState, a generic structure which can be extended by each of the block drivers to reopen respective image files. V2: * Changed ordering of patches such that code changes related to generic framework for safely reopening images gets applied first. * For block drivers not having bdrv_reopen_xx functions implemented, return "feature not supported" error. Testing: === [Thanks! to Yoganananth Subramanian for helping out with testing] Steps: 1) boot up guest image of different formats qed, raw, qcow2, vmdk 2) run iozone in these guests command: iozone -a 3) view cache setting of image file through qemu monitor command: info block "hostcache =" 0/1 should be displayed 4) Toggle hostcache value using block_set_hostcache command: block_set_hostcache virtio0 on 5) Disable and enable hostcache at randon intervals while iozone is running inside guest. command: block_set_hostcache virtio0 on/off 6) Info block should reflect toggled hostcache value and iozone should complete without any issue Results: Verified above steps for raw-posix, qcow2, qed and vmdk images. raw-posix, qed and vmdk images (split files) worked fine. With qcow2 image getting an error of double free after iozone running for a while. To Do: == * Debug the issue with qcow2 image and resolve asap. * Enhance code around dup3 in raw-posix to fall back to dup2/dup when dup3 is not supported by OS. * Do some more extensive testing, especially with qcow2 and qed drivers. New block command added: "block_set_hostcache" -- Sets hostcache parameter for block device while guest is running. Usage: block_set_hostcache = block device = on/off qemu/block.c | 79 ++ qemu/block.h |5 + qemu/block/qcow.c | 108 ++ qemu/block/qcow2.c | 175 + qemu/block/qed.c | 103 qemu/block/raw-posix.c | 121 + qemu/block/raw-win32.c | 96 ++ qemu/block/raw.c | 20 + qemu/block/vmdk.c | 103 qemu/block_int.h | 12 +++ qemu/blockdev.c| 19 + qemu/hmp-commands.hx | 15 qemu/hmp.c | 11 ++ qemu/hmp.h |1 qemu/qapi-schema.json | 21 - qemu/qemu-common.h |1 qemu/qmp-commands.hx | 24 ++ 17 files changed, 912 insertions(+), 2 deletions(-)
[Qemu-devel] [v2 Patch 5/9]block: qcow2 image file reopen
qcow2 driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery --- Index: qemu/block/qcow2.c === --- qemu.orig/block/qcow2.c +++ qemu/block/qcow2.c @@ -52,10 +52,19 @@ typedef struct { uint32_t magic; uint32_t len; } QCowExtension; + +typedef struct BDRVQcowReopenState { +BDRVReopenState reopen_state; +BDRVQcowState *stash_s; +} BDRVQcowReopenState; + #define QCOW2_EXT_MAGIC_END 0 #define QCOW2_EXT_MAGIC_BACKING_FORMAT 0xE2792ACA #define QCOW2_EXT_MAGIC_FEATURE_TABLE 0x6803f857 +static void qcow2_stash_state(BDRVQcowState *stashed_state, BDRVQcowState *s); +static void qcow2_revert_state(BDRVQcowState *s, BDRVQcowState *stashed_state); + static int qcow2_probe(const uint8_t *buf, int buf_size, const char *filename) { const QCowHeader *cow_header = (const void *)buf; @@ -434,6 +443,169 @@ static int qcow2_open(BlockDriverState * return ret; } +static int qcow2_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVQcowReopenState *qcow2_rs = g_malloc0(sizeof(BDRVQcowReopenState)); +int ret = 0; +BDRVQcowState *s = bs->opaque; + +qcow2_rs->reopen_state.bs = bs; + +/* save state before reopen */ +qcow2_rs->stash_s = g_malloc0(sizeof(BDRVQcowState)); +qcow2_stash_state(qcow2_rs->stash_s, s); +*prs = &(qcow2_rs->reopen_state); + +/* Reopen file with new flags */ + ret = qcow2_open(bs, flags); + return ret; +} + +static void qcow2_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVQcowReopenState *qcow2_rs; +BDRVQcowState *stashed_s; + +qcow2_rs = container_of(rs, BDRVQcowReopenState, reopen_state); +stashed_s = qcow2_rs->stash_s; + +qcow2_cache_flush(bs, stashed_s->l2_table_cache); +qcow2_cache_flush(bs, stashed_s->refcount_block_cache); + +qcow2_cache_destroy(bs, stashed_s->l2_table_cache); +qcow2_cache_destroy(bs, stashed_s->refcount_block_cache); + +g_free(stashed_s->unknown_header_fields); +cleanup_unknown_header_ext(bs); + +g_free(stashed_s->cluster_cache); +qemu_vfree(stashed_s->cluster_data); +qcow2_refcount_close(bs); +qcow2_free_snapshots(bs); + +g_free(stashed_s); +g_free(qcow2_rs); +} + +static void qcow2_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVQcowReopenState *qcow2_rs; +BDRVQcowState *s = bs->opaque; +BDRVQcowState *stashed_s; + +qcow2_rs = container_of(rs, BDRVQcowReopenState, reopen_state); + +/* Revert to stashed state */ +qcow2_revert_state(s, qcow2_rs->stash_s); +stashed_s = qcow2_rs->stash_s; + +g_free(stashed_s); +g_free(qcow2_rs); +} + +static void qcow2_stash_state(BDRVQcowState *stashed_s, BDRVQcowState *s) +{ +stashed_s->cluster_bits = s->cluster_bits; +stashed_s->cluster_size = s->cluster_size; +stashed_s->cluster_sectors = s->cluster_sectors; +stashed_s->l2_bits = s->l2_bits; +stashed_s->l2_size = s->l2_size; +stashed_s->l1_size = s->l1_size; +stashed_s->l1_vm_state_index = s->l1_vm_state_index; +stashed_s->csize_shift = s->csize_shift; +stashed_s->csize_mask = s->csize_mask; +stashed_s->cluster_offset_mask = s->cluster_offset_mask; +stashed_s->l1_table_offset = s->l1_table_offset; +stashed_s->l1_table = s->l1_table; + +stashed_s->l2_table_cache = s->l2_table_cache; +stashed_s->refcount_block_cache = s->refcount_block_cache; + +stashed_s->cluster_cache = s->cluster_cache; +stashed_s->cluster_data = s->cluster_data; +stashed_s->cluster_cache_offset = s->cluster_cache_offset; +stashed_s->cluster_allocs = s->cluster_allocs; + +stashed_s->refcount_table = s->refcount_table; +stashed_s->refcount_table_offset = s->refcount_table_offset; +stashed_s->refcount_table_size = s->refcount_table_size; +stashed_s->free_cluster_index = s->free_cluster_index; +stashed_s->free_byte_offset = s->free_byte_offset; + +stashed_s->lock = s->lock; + +stashed_s->crypt_method = s->crypt_method; +stashed_s->crypt_method_header = s->crypt_method_header; +stashed_s->aes_encrypt_key = s->aes_encrypt_key; +stashed_s->aes_decrypt_key = s->aes_decrypt_key; +stashed_s->snapshots_offset = s->snapshots_offset; +stashed_s->snapshots_size = s->snapshots_size; +stashed_s->nb_snapshots = s->nb_snapshots; +stashed_s->snapshots = s->snapshots; + +stashed_s->flags = s->flags; +stashed_s->qcow_version = s->qcow_version; + +stashed_s->incompatible_features = s->incom
[Qemu-devel] [v2 Patch 2/9]block: raw-posix image file reopen
raw-posix driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery --- Index: qemu/block/raw.c === --- qemu.orig/block/raw.c +++ qemu/block/raw.c @@ -9,6 +9,22 @@ static int raw_open(BlockDriverState *bs return 0; } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +return bdrv_reopen_prepare(bs->file, prs, flags); +} + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +bdrv_reopen_commit(bs->file, rs); +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +bdrv_reopen_abort(bs->file, rs); +} + static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov) { @@ -113,6 +129,10 @@ static BlockDriver bdrv_raw = { .instance_size = 1, .bdrv_open = raw_open, +.bdrv_reopen_prepare += raw_reopen_prepare, +.bdrv_reopen_commit = raw_reopen_commit, +.bdrv_reopen_abort = raw_reopen_abort, .bdrv_close = raw_close, .bdrv_co_readv = raw_co_readv, Index: qemu/block/raw-posix.c === --- qemu.orig/block/raw-posix.c +++ qemu/block/raw-posix.c @@ -140,8 +140,15 @@ typedef struct BDRVRawState { #endif } BDRVRawState; +typedef struct BDRVRawReopenState { +BDRVReopenState reopen_state; +BDRVRawState *stash_s; +} BDRVRawReopenState; + static int fd_open(BlockDriverState *bs); static int64_t raw_getlength(BlockDriverState *bs); +static void raw_stash_state(BDRVRawState *stashed_state, BDRVRawState *s); +static void raw_revert_state(BDRVRawState *s, BDRVRawState *stashed_state); #if defined(__FreeBSD__) || defined(__FreeBSD_kernel__) static int cdrom_reopen(BlockDriverState *bs); @@ -283,6 +290,117 @@ static int raw_open(BlockDriverState *bs return raw_open_common(bs, filename, flags, 0); } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVRawReopenState *raw_rs = g_malloc0(sizeof(BDRVRawReopenState)); +BDRVRawState *s = bs->opaque; +int ret = 0; + +raw_rs->reopen_state.bs = bs; + +/* stash state before reopen */ +raw_rs->stash_s = g_malloc0(sizeof(BDRVRawState)); +raw_stash_state(raw_rs->stash_s, s); +s->fd = dup3(raw_rs->stash_s->fd, s->fd, O_CLOEXEC); + +*prs = &(raw_rs->reopen_state); + +/* Flags that can be set using fcntl */ +int fcntl_flags = BDRV_O_NOCACHE; + +if ((bs->open_flags & ~fcntl_flags) == (flags & ~fcntl_flags)) { +if ((flags & BDRV_O_NOCACHE)) { +s->open_flags |= O_DIRECT; +} else { +s->open_flags &= ~O_DIRECT; +} +ret = fcntl_setfl(s->fd, s->open_flags); +} else { + +/* close and reopen using new flags */ +bs->drv->bdrv_close(bs); +ret = bs->drv->bdrv_file_open(bs, bs->filename, flags); +} +return ret; +} + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVRawReopenState *raw_rs; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +/* clean up stashed state */ +close(raw_rs->stash_s->fd); +g_free(raw_rs->stash_s); +g_free(raw_rs); +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVRawReopenState *raw_rs; +BDRVRawState *s = bs->opaque; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +/* revert to stashed state */ +if (s->fd != -1) { +close(s->fd); +} +raw_revert_state(s, raw_rs->stash_s); +g_free(raw_rs->stash_s); +g_free(raw_rs); +} + +static void raw_stash_state(BDRVRawState *stashed_s, BDRVRawState *s) +{ +stashed_s->fd = -1; +stashed_s->type = s->type; +stashed_s->open_flags = s->open_flags; +#if defined(__linux__) +/* linux floppy specific */ +stashed_s->fd_open_time = s->fd_open_time; +stashed_s->fd_error_time = s->fd_error_time; +stashed_s->fd_got_error = s->fd_got_error; +stashed_s->fd_media_changed = s->fd_media_changed; +#endif +#ifdef CONFIG_LINUX_AIO +stashed_s->use_aio = s->use_aio; +stashed_s->aio_ctx = s->aio_ctx; +#endif +stashed_s->aligned_buf = s->aligned_buf; +stashed_s->aligned_buf_size = s->aligned_buf_size; +#ifdef CONFIG_XFS +stashed_s->is_xfs = s->is_xfs; +#endif + +} + +static void raw_revert_state(BDRVRawState *s, BDRVRawState *stashed_s) +{ + +s->fd = stashed_s->fd; +
[Qemu-devel] [v2 Patch 3/9]block: raw-win32 image file reopen
raw-win32 driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery Signed-off-by: Shrinidhi Joshi --- Index: qemu/block/raw-win32.c === --- qemu.orig/block/raw-win32.c +++ qemu/block/raw-win32.c @@ -26,18 +26,27 @@ #include "block_int.h" #include "module.h" #include +#include #include #define FTYPE_FILE 0 #define FTYPE_CD 1 #define FTYPE_HARDDISK 2 +#define WINDOWS_VISTA 6 typedef struct BDRVRawState { HANDLE hfile; int type; char drive_path[16]; /* format: "d:\" */ +DWORD overlapped; } BDRVRawState; +typedef struct BDRVRawReopenState { +BDRVReopenState reopen_state; +HANDLE stash_hfile; +DWORD stash_overlapped; +} BDRVRawReopenState; + int qemu_ftruncate64(int fd, int64_t length) { LARGE_INTEGER li; @@ -106,9 +115,96 @@ static int raw_open(BlockDriverState *bs return -EACCES; return -1; } +s->overlapped = overlapped; return 0; } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVRawReopenState *raw_rs = g_malloc0(sizeof(BDRVRawReopenState)); +BDRVRawState *s = bs->opaque; +int ret = 0; +OSVERSIONINFO osvi; +BOOL bIsWindowsVistaorLater; + +raw_rs->bs = bs; +raw_rs->stash_hfile = s->hfile; +raw_rs->stash_overlapped = s->overlapped; +*prs = raw_rs; + +if (flags & BDRV_O_NOCACHE) { +s->overlapped |= FILE_FLAG_NO_BUFFERING; +} else { +s->overlapped &= ~FILE_FLAG_NO_BUFFERING; +} + +if (!(flags & BDRV_O_CACHE_WB)) { +s->overlapped |= FILE_FLAG_WRITE_THROUGH; +} else { +s->overlapped &= ~FILE_FLAG_WRITE_THROUGH; +} + +ZeroMemory(&osvi, sizeof(OSVERSIONINFO)); +osvi.dwOSVersionInfoSize = sizeof(OSVERSIONINFO); + +GetVersionEx(&osvi); + +if (osvi.dwMajorVersion >= WINDOWS_VISTA) { +s->hfile = ReOpenFile(raw_rs->stash_hfile, 0, FILE_SHARE_READ, + overlapped); +if (s->hfile == INVALID_HANDLE_VALUE) { +int err = GetLastError(); +if (err == ERROR_ACCESS_DENIED) { +ret = -EACCES; +} else { +ret = -1; +} +} +} else { + +DuplicateHandle(GetCurrentProcess(), +raw_rs->stash_hfile, +GetCurrentProcess(), +&s->hfile, +0, +FALSE, +DUPLICATE_SAME_ACCESS); +bs->drv->bdrv_close(bs); +ret = bs->drv->bdrv_open(bs, bs->filename, flags); +} +return ret; +} + + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVRawReopenState *raw_rs; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +/* clean up stashed handle */ +CloseHandle(raw_rs->stash_hfile); +g_free(raw_rs); + +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ + +BDRVRawReopenState *raw_rs; +BDRVRawState *s = bs->opaque; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +if (s->hfile && (s->hfile != INVALID_HANDLE_VALUE)) { +CloseHandle(s->hfile); +} +s->hfile = raw_rs->stash_hfile; +s->overlapped = raw_rs->stash_overlapped; +g_free(raw_rs); +} + static int raw_read(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors) {
[Qemu-devel] [v2 Patch 4/9]block: vmdk image file reopen
vmdk driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery --- Index: qemu/block/vmdk.c === --- qemu.orig/block/vmdk.c +++ qemu/block/vmdk.c @@ -115,6 +115,14 @@ typedef struct VmdkGrainMarker { uint8_t data[0]; } VmdkGrainMarker; +typedef struct BDRVVmdkReopenState { +BDRVReopenState reopen_state; +BDRVVmdkState *stash_s; +} BDRVVmdkReopenState; + +static void vmdk_stash_state(BDRVVmdkState *stashed_state, BDRVVmdkState *s); +static void vmdk_revert_state(BDRVVmdkState *s, BDRVVmdkState *stashed_state); + static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename) { uint32_t magic; @@ -588,7 +596,6 @@ static int vmdk_parse_extents(const char if (!strcmp(type, "FLAT")) { /* FLAT extent */ VmdkExtent *extent; - extent = vmdk_add_extent(bs, extent_file, true, sectors, 0, 0, 0, 0, sectors); extent->flat_start_offset = flat_offset << 9; @@ -675,6 +682,94 @@ fail: return ret; } +static int vmdk_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVVmdkReopenState *vmdk_rs = g_malloc0(sizeof(BDRVVmdkReopenState)); +int ret = 0; +BDRVVmdkState *s = bs->opaque; + +vmdk_rs->reopen_state.bs = bs; + +/* save state before reopen */ +vmdk_rs->stash_s = g_malloc0(sizeof(BDRVVmdkState)); +vmdk_stash_state(vmdk_rs->stash_s, s); +s->num_extents = 0; +s->extents = NULL; +s->migration_blocker = NULL; +*prs = &(vmdk_rs->reopen_state); + +/* create extents afresh with new flags */ + ret = vmdk_open(bs, flags); + return ret; +} + +static void vmdk_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVVmdkReopenState *vmdk_rs; +BDRVVmdkState *stashed_s; +VmdkExtent *e; +int i; + +vmdk_rs = container_of(rs, BDRVVmdkReopenState, reopen_state); +stashed_s = vmdk_rs->stash_s; + +/* clean up stashed state */ +for (i = 0; i < stashed_s->num_extents; i++) { +e = &stashed_s->extents[i]; +g_free(e->l1_table); +g_free(e->l2_cache); +g_free(e->l1_backup_table); +} +g_free(stashed_s->extents); +g_free(vmdk_rs->stash_s); +g_free(vmdk_rs); +} + +static void vmdk_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVVmdkReopenState *vmdk_rs; +BDRVVmdkState *s = bs->opaque; +VmdkExtent *e; +int i; + +vmdk_rs = container_of(rs, BDRVVmdkReopenState, reopen_state); + +/* revert to stashed state */ +for (i = 0; i < s->num_extents; i++) { +e = &s->extents[i]; +g_free(e->l1_table); +g_free(e->l2_cache); +g_free(e->l1_backup_table); +} +g_free(s->extents); +vmdk_revert_state(s, vmdk_rs->stash_s); +g_free(vmdk_rs->stash_s); +g_free(vmdk_rs); +} + +static void vmdk_stash_state(BDRVVmdkState *stashed_state, BDRVVmdkState *s) +{ + stashed_state->lock = s->lock; + stashed_state->desc_offset = s->desc_offset; + stashed_state->cid_updated = s->cid_updated; + stashed_state->parent_cid = s->parent_cid; + stashed_state->num_extents = s->num_extents; + stashed_state->extents = s->extents; + stashed_state->migration_blocker = s->migration_blocker; +} + +static void vmdk_revert_state(BDRVVmdkState *s, BDRVVmdkState *stashed_state) +{ + s->lock = stashed_state->lock; + s->desc_offset = stashed_state->desc_offset; + s->cid_updated = stashed_state->cid_updated; + s->parent_cid = stashed_state->parent_cid; + s->num_extents = stashed_state->num_extents; + s->extents = stashed_state->extents; + s->migration_blocker = stashed_state->migration_blocker; +} + static int get_whole_cluster(BlockDriverState *bs, VmdkExtent *extent, uint64_t cluster_offset, @@ -1593,6 +1688,12 @@ static BlockDriver bdrv_vmdk = { .instance_size = sizeof(BDRVVmdkState), .bdrv_probe = vmdk_probe, .bdrv_open = vmdk_open, +.bdrv_reopen_prepare += vmdk_reopen_prepare, +.bdrv_reopen_commit += vmdk_reopen_commit, +.bdrv_reopen_abort += vmdk_reopen_abort, .bdrv_read = vmdk_co_read, .bdrv_write = vmdk_co_write, .bdrv_close = vmdk_close,
[Qemu-devel] [v2 Patch 1/9]block: Framework for reopening image files safely
Struct BDRVReopenState along with three reopen related functions introduced for handling reopening of images safely. This can be extended by each of the block drivers to reopen respective image files. Signed-off-by: Supriya Kannery --- Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -859,6 +859,60 @@ unlink_and_fail: return ret; } +int bdrv_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, int flags) +{ + BlockDriver *drv = bs->drv; + + return drv->bdrv_reopen_prepare(bs, prs, flags); +} + +void bdrv_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BlockDriver *drv = bs->drv; + +drv->bdrv_reopen_commit(bs, rs); +} + +void bdrv_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BlockDriver *drv = bs->drv; + +drv->bdrv_reopen_abort(bs, rs); +} + +void bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp) +{ +BlockDriver *drv = bs->drv; +int ret = 0; +BDRVReopenState *reopen_state = NULL; + +/* Quiesce IO for the given block device */ +bdrv_drain_all(); +ret = bdrv_flush(bs); +if (ret != 0) { +error_set(errp, QERR_IO_ERROR); +return; +} + +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen_prepare) { +ret = bdrv_reopen_prepare(bs, &reopen_state, bdrv_flags); + if (ret < 0) { +bdrv_reopen_abort(bs, reopen_state); +error_set(errp, QERR_OPEN_FILE_FAILED, bs->filename); +return; +} + +bdrv_reopen_commit(bs, reopen_state); +bs->open_flags = bdrv_flags; +} else { +error_set(errp, QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED, + drv->format_name, bs->device_name, + "reopening of file"); +return; +} +} + void bdrv_close(BlockDriverState *bs) { bdrv_flush(bs); Index: qemu/block_int.h === --- qemu.orig/block_int.h +++ qemu/block_int.h @@ -136,6 +136,14 @@ struct BlockDriver { int instance_size; int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename); int (*bdrv_probe_device)(const char *filename); + +/* For handling image reopen for split or non-split files */ +int (*bdrv_reopen_prepare)(BlockDriverState *bs, + BDRVReopenState **prs, + int flags); +void (*bdrv_reopen_commit)(BlockDriverState *bs, BDRVReopenState *rs); +void (*bdrv_reopen_abort)(BlockDriverState *bs, BDRVReopenState *rs); + int (*bdrv_open)(BlockDriverState *bs, int flags); int (*bdrv_file_open)(BlockDriverState *bs, const char *filename, int flags); int (*bdrv_read)(BlockDriverState *bs, int64_t sector_num, @@ -335,6 +343,10 @@ struct BlockDriverState { BlockJob *job; }; +struct BDRVReopenState { +BlockDriverState *bs; +}; + int get_tmp_filename(char *filename, int size); void bdrv_set_io_limits(BlockDriverState *bs, Index: qemu/block.h === --- qemu.orig/block.h +++ qemu/block.h @@ -129,6 +129,10 @@ int bdrv_parse_cache_flags(const char *m int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags); int bdrv_open(BlockDriverState *bs, const char *filename, int flags, BlockDriver *drv); +void bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp); +int bdrv_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, int flags); +void bdrv_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs); +void bdrv_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs); void bdrv_close(BlockDriverState *bs); int bdrv_attach_dev(BlockDriverState *bs, void *dev); void bdrv_attach_dev_nofail(BlockDriverState *bs, void *dev); Index: qemu/qemu-common.h === --- qemu.orig/qemu-common.h +++ qemu/qemu-common.h @@ -223,6 +223,7 @@ typedef struct NICInfo NICInfo; typedef struct HCIInfo HCIInfo; typedef struct AudioState AudioState; typedef struct BlockDriverState BlockDriverState; +typedef struct BDRVReopenState BDRVReopenState; typedef struct DriveInfo DriveInfo; typedef struct DisplayState DisplayState; typedef struct DisplayChangeListener DisplayChangeListener;
Re: [Qemu-devel] [v1 Patch 3/10]Qemu: Cmd "block_set_hostcache" for dynamic cache change
On 07/11/2012 07:46 PM, Luiz Capitulino wrote: On Sat, 16 Jun 2012 02:17:30 +0530 Supriya Kannery wrote: New command "block_set_hostcache" added for dynamically changing host pagecache setting of a block device. Usage: block_set_hostcache = block device = on/off Example: (qemu) block_set_hostcache ide0-hd0 off Signed-off-by: Supriya Kannery --- block.c | 54 ++ block.h |2 ++ blockdev.c | 26 ++ blockdev.h |2 ++ hmp-commands.hx | 14 ++ qmp-commands.hx | 27 +++ 6 files changed, 125 insertions(+) Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -858,6 +858,35 @@ unlink_and_fail: return ret; } +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags) +{ +BlockDriver *drv = bs->drv; +int ret = 0, open_flags; + +/* Quiesce IO for the given block device */ +qemu_aio_flush(); +ret = bdrv_flush(bs); +if (ret != 0) { +qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); Please, use error_set() instead of qerror_report() and as Kevin said IOError seems enough here. Sure, will replace qemu_aio_flush() with bdrv_drain_all() and qerror_report() with error_set(). Combination of bdrv_drain_all() and bdrv_flush(), should help in flushing out the specific disk. -thanks, Supriya
Re: [Qemu-devel] [v1 Patch 3/10]Qemu: Cmd "block_set_hostcache" for dynamic cache change
On 06/16/2012 03:26 AM, Eric Blake wrote: On 06/15/2012 02:47 PM, Supriya Kannery wrote: New command "block_set_hostcache" added for dynamically changing host pagecache setting of a block device. Usage: block_set_hostcache = block device = on/off Example: (qemu) block_set_hostcache ide0-hd0 off Signed-off-by: Supriya Kannery --- block.c | 54 ++ block.h |2 ++ blockdev.c | 26 ++ blockdev.h |2 ++ hmp-commands.hx | 14 ++ qmp-commands.hx | 27 +++ 6 files changed, 125 insertions(+) Doesn't this also need to touch qapi-schema.json? [/me reads full patch] Oh, you did - but your diffstat is stale. It might be worth figuring out what in your workflow leads to stale diffstats. Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -858,6 +858,35 @@ unlink_and_fail: return ret; } +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags) +{ +BlockDriver *drv = bs->drv; +int ret = 0, open_flags; + +/* Quiesce IO for the given block device */ +qemu_aio_flush(); +ret = bdrv_flush(bs); +if (ret != 0) { +qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); +return ret; +} +open_flags = bs->open_flags; +bdrv_close(bs); + +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); Yuck. This is bad, and why 'transaction' was invented. Any time you close() before open() you risk completely losing the file... +if (ret< 0) { +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_OPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret< 0) { +/* Reopen failed with orig and modified flags */ +abort(); and an abort() is not a nice reaction to that. I think we should rebase the series to do the safe reopen prior to adding this command (at least, just judging by the title of 4/10), to avoid intermediate bad code. Yes, will reorder the patches to have safe reopen done first and then block_set_hostcache use it. -thanks, Supriya
Re: [Qemu-devel] [v1 Patch 2/10]Qemu: Error classes for hostcache setting and data sync failures
On 07/09/2012 08:17 PM, Kevin Wolf wrote: Am 15.06.2012 22:47, schrieb Supriya Kannery: New error classes defined for hostcache setting and data sync error Signed-off-by: Supriya Kannery --- qerror.c |8 qerror.h |6 ++ 2 files changed, 14 insertions(+) Index: qemu/qerror.c === --- qemu.orig/qerror.c +++ qemu/qerror.c @@ -80,6 +80,10 @@ static const QErrorStringTable qerror_ta .desc = "The command %(name) has not been found", }, { +.error_fmt = QERR_DATA_SYNC_FAILED, +.desc = "Syncing of data failed for device '%(device)'", +}, +{ .error_fmt = QERR_DEVICE_ENCRYPTED, .desc = "Device '%(device)' is encrypted", }, @@ -152,6 +156,10 @@ static const QErrorStringTable qerror_ta .desc = "The feature '%(name)' is not enabled", }, { +.error_fmt = QERR_HOSTCACHE_NOT_CHANGED, +.desc = "Could not change hostcache setting for '%(device)'", +}, +{ .error_fmt = QERR_INVALID_BLOCK_FORMAT, .desc = "Invalid block format '%(name)'", }, Index: qemu/qerror.h === --- qemu.orig/qerror.h +++ qemu/qerror.h @@ -82,6 +82,9 @@ QError *qobject_to_qerror(const QObject #define QERR_COMMAND_NOT_FOUND \ "{ 'class': 'CommandNotFound', 'data': { 'name': %s } }" +#define QERR_DATA_SYNC_FAILED \ +"{ 'class': 'DataSyncFailed', 'data': { 'device': %s } }" + #define QERR_DEVICE_ENCRYPTED \ "{ 'class': 'DeviceEncrypted', 'data': { 'device': %s, 'filename': %s } }" @@ -136,6 +139,9 @@ QError *qobject_to_qerror(const QObject #define QERR_FEATURE_DISABLED \ "{ 'class': 'FeatureDisabled', 'data': { 'name': %s } }" +#define QERR_HOSTCACHE_NOT_CHANGED \ +"{ 'class': 'HostcacheNotChanged', 'data': { 'device': %s } }" + #define QERR_INVALID_BLOCK_FORMAT \ "{ 'class': 'InvalidBlockFormat', 'data': { 'name': %s } }" In the light of the recent error handling discussion: Do we really need two separate errors? Can we just reuse an existing one? Just QERR_IO_ERROR could be good enough. Kevin Yes, will use QERR_IO_ERROR instead of these two new errors. -thanks, Supriya
Re: [Qemu-devel] [v1 Patch 1/10]Qemu: Enhance "info block" to display host cache setting
On 07/05/2012 10:08 PM, Jeff Cody wrote: On 06/15/2012 04:47 PM, Supriya Kannery wrote: Enhance "info block" to display hostcache setting for each block device. Example: (qemu) info block ide0-hd0: removable=0 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Enhanced to display "hostcache" setting: (qemu) info block ide0-hd0: removable=0 hostcache=1 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Signed-off-by: Supriya Kannery This email is not about any changes per se, but just noting some conflicts (see below) with Paolo's blkmirror series (from his branch blkmirror-job-1.2 in git://github.com/bonzini/qemu.git). This is just for future reference, I don't know which will go in first. I have been using code from git://git.qemu.org/qemu.git for posting patches. Will use the same for next version of this patchset as well, so that patches are prepared over the latest code changes, including Paolo's patchset if they are upstream. -thanks, Supriya
Re: [Qemu-devel] [v1 Patch 1/10]Qemu: Enhance "info block" to display host cache setting
On 07/11/2012 07:33 PM, Luiz Capitulino wrote: On Mon, 09 Jul 2012 16:43:40 +0200 Kevin Wolf wrote: Am 15.06.2012 23:07, schrieb Eric Blake: On 06/15/2012 02:47 PM, Supriya Kannery wrote: Enhance "info block" to display hostcache setting for each block device. ## { 'type': 'BlockInfo', 'data': {'device': 'str', 'type': 'str', 'removable': 'bool', - 'locked': 'bool', '*inserted': 'BlockDeviceInfo', + 'locked': 'bool','hostcache': 'bool', '*inserted': 'BlockDeviceInfo', space after comma Since 'hostcache' was not present when talking to older qemu, should we mark it optional? What does "optional" really mean? I always understood that it means that whether the field exists or not depends on some runtime condition, not on the qemu version. I would specify something like this, that always exists in new qemu versions, in the "Since" section. Or maybe a separate "Since" specification like in SpiceInfo for mouse-mode. Yes, Kevin is right. Will add a comment " Since: 1.x" -thanks, Supriya
[Qemu-devel] [v1 Patch 6/10]Qemu: raw-win32 image file reopen
raw-win32 driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery Signed-off-by: Shrinidhi Joshi Index: qemu/block/raw-win32.c === --- qemu.orig/block/raw-win32.c +++ qemu/block/raw-win32.c @@ -26,18 +26,27 @@ #include "block_int.h" #include "module.h" #include +#include #include #define FTYPE_FILE 0 #define FTYPE_CD 1 #define FTYPE_HARDDISK 2 +#define WINDOWS_VISTA 6 typedef struct BDRVRawState { HANDLE hfile; int type; char drive_path[16]; /* format: "d:\" */ +DWORD overlapped; } BDRVRawState; +typedef struct BDRVRawReopenState { +BDRVReopenState reopen_state; +HANDLE stash_hfile; +DWORD stash_overlapped; +} BDRVRawReopenState; + int qemu_ftruncate64(int fd, int64_t length) { LARGE_INTEGER li; @@ -106,9 +115,96 @@ static int raw_open(BlockDriverState *bs return -EACCES; return -1; } +s->overlapped = overlapped; return 0; } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVRawReopenState *raw_rs = g_malloc0(sizeof(BDRVRawReopenState)); +BDRVRawState *s = bs->opaque; +int ret = 0; +OSVERSIONINFO osvi; +BOOL bIsWindowsVistaorLater; + +raw_rs->bs = bs; +raw_rs->stash_hfile = s->hfile; +raw_rs->stash_overlapped = s->overlapped; +*prs = raw_rs; + +if (flags & BDRV_O_NOCACHE) { +s->overlapped |= FILE_FLAG_NO_BUFFERING; +} else { +s->overlapped &= ~FILE_FLAG_NO_BUFFERING; +} + +if (!(flags & BDRV_O_CACHE_WB)) { +s->overlapped |= FILE_FLAG_WRITE_THROUGH; +} else { +s->overlapped &= ~FILE_FLAG_WRITE_THROUGH; +} + +ZeroMemory(&osvi, sizeof(OSVERSIONINFO)); +osvi.dwOSVersionInfoSize = sizeof(OSVERSIONINFO); + +GetVersionEx(&osvi); + +if (osvi.dwMajorVersion >= WINDOWS_VISTA) { +s->hfile = ReOpenFile(raw_rs->stash_hfile, 0, FILE_SHARE_READ, + overlapped); +if (s->hfile == INVALID_HANDLE_VALUE) { +int err = GetLastError(); +if (err == ERROR_ACCESS_DENIED) { +ret = -EACCES; +} else { +ret = -1; +} +} +} else { + +DuplicateHandle(GetCurrentProcess(), +raw_rs->stash_hfile, +GetCurrentProcess(), +&s->hfile, +0, +FALSE, +DUPLICATE_SAME_ACCESS); +bs->drv->bdrv_close(bs); +ret = bs->drv->bdrv_open(bs, bs->filename, flags); +} +return ret; +} + + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVRawReopenState *raw_rs; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +/* clean up stashed handle */ +CloseHandle(raw_rs->stash_hfile); +g_free(raw_rs); + +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ + +BDRVRawReopenState *raw_rs; +BDRVRawState *s = bs->opaque; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +if (s->hfile && (s->hfile != INVALID_HANDLE_VALUE)) { +CloseHandle(s->hfile); +} +s->hfile = raw_rs->stash_hfile; +s->overlapped = raw_rs->stash_overlapped; +g_free(raw_rs); +} + static int raw_read(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors) {
[Qemu-devel] [v1 Patch 5/10]Qemu: raw-posix image file reopen
raw-posix driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery Index: qemu/block/raw.c === --- qemu.orig/block/raw.c +++ qemu/block/raw.c @@ -9,6 +9,22 @@ static int raw_open(BlockDriverState *bs return 0; } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +return bdrv_reopen_prepare(bs->file, prs, flags); +} + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +bdrv_reopen_commit(bs->file, rs); +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +bdrv_reopen_abort(bs->file, rs); +} + static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov) { @@ -104,6 +120,10 @@ static BlockDriver bdrv_raw = { .instance_size = 1, .bdrv_open = raw_open, +.bdrv_reopen_prepare += raw_reopen_prepare, +.bdrv_reopen_commit = raw_reopen_commit, +.bdrv_reopen_abort = raw_reopen_abort, .bdrv_close = raw_close, .bdrv_co_readv = raw_co_readv, Index: qemu/block/raw-posix.c === --- qemu.orig/block/raw-posix.c +++ qemu/block/raw-posix.c @@ -136,8 +136,15 @@ typedef struct BDRVRawState { #endif } BDRVRawState; +typedef struct BDRVRawReopenState { +BDRVReopenState reopen_state; +BDRVRawState *stash_s; +} BDRVRawReopenState; + static int fd_open(BlockDriverState *bs); static int64_t raw_getlength(BlockDriverState *bs); +static void raw_stash_state(BDRVRawState *stashed_state, BDRVRawState *s); +static void raw_revert_state(BDRVRawState *s, BDRVRawState *stashed_state); #if defined(__FreeBSD__) || defined(__FreeBSD_kernel__) static int cdrom_reopen(BlockDriverState *bs); @@ -279,6 +286,119 @@ static int raw_open(BlockDriverState *bs return raw_open_common(bs, filename, flags, 0); } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVRawReopenState *raw_rs = g_malloc0(sizeof(BDRVRawReopenState)); +BDRVRawState *s = bs->opaque; +int ret = 0; + +raw_rs->reopen_state.bs = bs; + +/* stash state before reopen */ +raw_rs->stash_s = g_malloc0(sizeof(BDRVRawState)); +/*memcpy(raw_rs->stash_s, s, sizeof(BDRVRawState)); */ +raw_stash_state(raw_rs->stash_s, s); +s->fd = dup(raw_rs->stash_s->fd); + +*prs = &(raw_rs->reopen_state); + +/* Flags that can be set using fcntl */ +int fcntl_flags = BDRV_O_NOCACHE; + +if ((bs->open_flags & ~fcntl_flags) == (flags & ~fcntl_flags)) { +if ((flags & BDRV_O_NOCACHE)) { +s->open_flags |= O_DIRECT; +} else { +s->open_flags &= ~O_DIRECT; +} +ret = fcntl_setfl(s->fd, s->open_flags); +} else { + +/* close and reopen using new flags */ +bs->drv->bdrv_close(bs); +ret = bs->drv->bdrv_file_open(bs, bs->filename, flags); +} +return ret; +} + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVRawReopenState *raw_rs; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +/* clean up stashed state */ +close(raw_rs->stash_s->fd); +g_free(raw_rs->stash_s); +g_free(raw_rs); +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVRawReopenState *raw_rs; +BDRVRawState *s = bs->opaque; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +/* revert to stashed state */ +if (s->fd != -1) { +close(s->fd); +} +/* memcpy(s, raw_rs->stash_s, sizeof(BDRVRawState)); */ +raw_revert_state(s, raw_rs->stash_s); +g_free(raw_rs->stash_s); +g_free(raw_rs); +} + +static void raw_stash_state(BDRVRawState *stashed_s, BDRVRawState *s) +{ +stashed_s->fd = -1; +stashed_s->type = s->type; +stashed_s->open_flags = s->open_flags; +#if defined(__linux__) +/* linux floppy specific */ +stashed_s->fd_open_time = s->fd_open_time; +stashed_s->fd_error_time = s->fd_error_time; +stashed_s->fd_got_error = s->fd_got_error; +stashed_s->fd_media_changed = s->fd_media_changed; +#endif +#ifdef CONFIG_LINUX_AIO +stashed_s->use_aio = s->use_aio; +stashed_s->aio_ctx = s->aio_ctx; +#endif +stashed_s->aligned_buf = s->aligned_buf; +stashed_s->aligned_buf_size = s->aligned_buf_size; +#ifdef CONFIG_XFS +stashed_s->is_xfs = s->is_xfs; +#endif + +} + +static voi
[Qemu-devel] [v1 Patch 2/10]Qemu: Error classes for hostcache setting and data sync failures
New error classes defined for hostcache setting and data sync error Signed-off-by: Supriya Kannery --- qerror.c |8 qerror.h |6 ++ 2 files changed, 14 insertions(+) Index: qemu/qerror.c === --- qemu.orig/qerror.c +++ qemu/qerror.c @@ -80,6 +80,10 @@ static const QErrorStringTable qerror_ta .desc = "The command %(name) has not been found", }, { +.error_fmt = QERR_DATA_SYNC_FAILED, +.desc = "Syncing of data failed for device '%(device)'", +}, +{ .error_fmt = QERR_DEVICE_ENCRYPTED, .desc = "Device '%(device)' is encrypted", }, @@ -152,6 +156,10 @@ static const QErrorStringTable qerror_ta .desc = "The feature '%(name)' is not enabled", }, { +.error_fmt = QERR_HOSTCACHE_NOT_CHANGED, +.desc = "Could not change hostcache setting for '%(device)'", +}, +{ .error_fmt = QERR_INVALID_BLOCK_FORMAT, .desc = "Invalid block format '%(name)'", }, Index: qemu/qerror.h === --- qemu.orig/qerror.h +++ qemu/qerror.h @@ -82,6 +82,9 @@ QError *qobject_to_qerror(const QObject #define QERR_COMMAND_NOT_FOUND \ "{ 'class': 'CommandNotFound', 'data': { 'name': %s } }" +#define QERR_DATA_SYNC_FAILED \ +"{ 'class': 'DataSyncFailed', 'data': { 'device': %s } }" + #define QERR_DEVICE_ENCRYPTED \ "{ 'class': 'DeviceEncrypted', 'data': { 'device': %s, 'filename': %s } }" @@ -136,6 +139,9 @@ QError *qobject_to_qerror(const QObject #define QERR_FEATURE_DISABLED \ "{ 'class': 'FeatureDisabled', 'data': { 'name': %s } }" +#define QERR_HOSTCACHE_NOT_CHANGED \ +"{ 'class': 'HostcacheNotChanged', 'data': { 'device': %s } }" + #define QERR_INVALID_BLOCK_FORMAT \ "{ 'class': 'InvalidBlockFormat', 'data': { 'name': %s } }"
[Qemu-devel] [v1 Patch 4/10]Qemu: Framework for reopening image files safely
Struct BDRVReopenState along with three reopen related functions introduced for handling reopening of images safely. This can be extended by each of the block drivers to reopen respective image files. Signed-off-by: Supriya Kannery Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -858,10 +858,32 @@ unlink_and_fail: return ret; } +int bdrv_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, int flags) +{ + BlockDriver *drv = bs->drv; + + return drv->bdrv_reopen_prepare(bs, prs, flags); +} + +void bdrv_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BlockDriver *drv = bs->drv; + +drv->bdrv_reopen_commit(bs, rs); +} + +void bdrv_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BlockDriver *drv = bs->drv; + +drv->bdrv_reopen_abort(bs, rs); +} + int bdrv_reopen(BlockDriverState *bs, int bdrv_flags) { BlockDriver *drv = bs->drv; int ret = 0, open_flags; +BDRVReopenState *reopen_state = NULL; /* Quiesce IO for the given block device */ qemu_aio_flush(); @@ -870,17 +892,44 @@ int bdrv_reopen(BlockDriverState *bs, in qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); return ret; } -open_flags = bs->open_flags; -bdrv_close(bs); -ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); -if (ret < 0) { -/* Reopen failed. Try to open with original flags */ -qerror_report(QERR_OPEN_FILE_FAILED, bs->filename); -ret = bdrv_open(bs, bs->filename, open_flags, drv); -if (ret < 0) { -/* Reopen failed with orig and modified flags */ -abort(); +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen_prepare) { +ret = bdrv_reopen_prepare(bs, &reopen_state, bdrv_flags); + if (ret < 0) { +bdrv_reopen_abort(bs, reopen_state); +qerror_report(QERR_OPEN_FILE_FAILED, bs->filename); +return ret; +} + +bdrv_reopen_commit(bs, reopen_state); +bs->open_flags = bdrv_flags; + +} else { + +/* Try reopening the image using protocol or directly */ +if (bs->file) { +open_flags = bs->open_flags; +drv->bdrv_close(bs); +if (drv->bdrv_file_open) { +ret = drv->bdrv_file_open(bs, bs->filename, bdrv_flags); +} else { +ret = bdrv_file_open(&bs->file, bs->filename, bdrv_flags); +} +if (ret < 0) { +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_OPEN_FILE_FAILED, bs->filename); +if (drv->bdrv_file_open) { +ret = drv->bdrv_file_open(bs, bs->filename, open_flags); +} else { +ret = bdrv_file_open(&bs->file, bs->filename, open_flags); +} +if (ret < 0) { +/* Reopen failed with orig and modified flags */ +bdrv_delete(bs->file); +bs->drv = NULL; +} +} } } Index: qemu/block_int.h === --- qemu.orig/block_int.h +++ qemu/block_int.h @@ -137,6 +137,13 @@ struct BlockDriver { int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename); int (*bdrv_probe_device)(const char *filename); int (*bdrv_open)(BlockDriverState *bs, int flags); + +/* For handling image reopen for split or non-split files */ +int (*bdrv_reopen_prepare)(BlockDriverState *bs, + BDRVReopenState **prs, + int flags); +void (*bdrv_reopen_commit)(BlockDriverState *bs, BDRVReopenState *rs); +void (*bdrv_reopen_abort)(BlockDriverState *bs, BDRVReopenState *rs); int (*bdrv_file_open)(BlockDriverState *bs, const char *filename, int flags); int (*bdrv_read)(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors); @@ -335,6 +342,10 @@ struct BlockDriverState { BlockJob *job; }; +struct BDRVReopenState { +BlockDriverState *bs; +}; + int get_tmp_filename(char *filename, int size); void bdrv_set_io_limits(BlockDriverState *bs, Index: qemu/qemu-common.h === --- qemu.orig/qemu-common.h +++ qemu/qemu-common.h @@ -225,6 +225,7 @@ typedef struct NICInfo NICInfo; typedef struct HCIInfo HCIInfo; typedef struct AudioState AudioState; typedef struct BlockDriverState BlockDriverState; +typedef struct BDRVReopenState BDRVReopenState; typedef struct DriveInfo DriveInfo; typedef struct DisplayState DisplayState; typedef struct DisplayChangeListener DisplayChang
[Qemu-devel] [v1 Patch 9/10]Qemu: qcow image file reopen
qcow driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Shrinidhi Joshi Index: qemu/block/qcow.c === --- qemu.orig/block/qcow.c +++ qemu/block/qcow.c @@ -78,7 +78,14 @@ typedef struct BDRVQcowState { Error *migration_blocker; } BDRVQcowState; +typedef struct BDRVQcowReopenState { +BDRVReopenState reopen_state; +BDRVQcowState *stash_s; +} BDRVQcowReopenState; + static int decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset); +static void qcow_stash_state(BDRVQcowState *stash_s, BDRVQcowState *s); +static void qcow_revert_state(BDRVQcowState *s, BDRVQcowState *stash_s); static int qcow_probe(const uint8_t *buf, int buf_size, const char *filename) { @@ -197,6 +204,103 @@ static int qcow_open(BlockDriverState *b return ret; } +static int qcow_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVQcowReopenState *qcow_rs = g_malloc0(sizeof(BDRVQcowReopenState)); +int ret = 0; +BDRVQcowState *s = bs->opaque; + +qcow_rs->reopen_state.bs = bs; +qcow_rs->stash_s = g_malloc0(sizeof(BDRVQcowState)); +qcow_stash_state(qcow_rs->stash_s, s); +*prs = &(qcow_rs->reopen_state); + +ret = qcow_open(bs, flags); +return ret; +} + +static void qcow_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVQcowReopenState *qcow_rs; + +qcow_rs = container_of(rs, BDRVQcowReopenState, reopen_state); +g_free(qcow_rs->stash_s); +g_free(qcow_rs); +} + +static void qcow_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVQcowReopenState *qcow_rs; +BDRVQcowState *s = bs->opaque; + +qcow_rs = container_of(rs, BDRVQcowReopenState, reopen_state); + +/* revert to stashed state */ +qcow_revert_state(s, qcow_rs->stash_s); + +g_free(qcow_rs->stash_s); +g_free(qcow_rs); +} + +static void qcow_stash_state(BDRVQcowState *stash_s, BDRVQcowState *s) +{ +int i; + +stash_s->cluster_bits = s->cluster_bits; +stash_s->cluster_size = s->cluster_size; +stash_s->cluster_sectors = s->cluster_sectors; +stash_s->l2_bits = s->l2_bits; +stash_s->l2_size = s->l2_size; +stash_s->l1_size = s->l1_size; +stash_s->cluster_offset_mask = s->cluster_offset_mask; +stash_s->l1_table_offset = s->l1_table_offset; +stash_s->l1_table = s->l1_table; +stash_s->l2_cache = s->l2_cache; +for (i = 0; i < L2_CACHE_SIZE; i++) { +stash_s->l2_cache_offsets[i] = s->l2_cache_offsets[i]; +stash_s->l2_cache_counts[i] = s->l2_cache_counts[i]; +} +stash_s->cluster_cache = s->cluster_cache; +stash_s->cluster_data = s->cluster_data; +stash_s->cluster_cache_offset = s->cluster_cache_offset; +stash_s->crypt_method = s->crypt_method; +stash_s->crypt_method_header = s->crypt_method_header; +stash_s->aes_encrypt_key = s->aes_encrypt_key; +stash_s->aes_decrypt_key = s->aes_decrypt_key; +stash_s->lock = s->lock; +stash_s->migration_blocker = s->migration_blocker; +} + +static void qcow_revert_state(BDRVQcowState *s, BDRVQcowState *stash_s) +{ +int i; + +s->cluster_bits = stash_s->cluster_bits; +s->cluster_size = stash_s->cluster_size; +s->cluster_sectors = stash_s->cluster_sectors; +s->l2_bits = stash_s->l2_bits; +s->l2_size = stash_s->l2_size; +s->l1_size = stash_s->l1_size; +s->cluster_offset_mask = stash_s->cluster_offset_mask; +s->l1_table_offset = stash_s->l1_table_offset; +s->l1_table = stash_s->l1_table; +s->l2_cache = stash_s->l2_cache; +for (i = 0; i < L2_CACHE_SIZE; i++) { +s->l2_cache_offsets[i] = s->l2_cache_offsets[i]; +s->l2_cache_counts[i] = stash_s->l2_cache_counts[i]; +} +s->cluster_cache = stash_s->cluster_cache; +s->cluster_data = stash_s->cluster_data; +s->cluster_cache_offset = stash_s->cluster_cache_offset; +s->crypt_method = stash_s->crypt_method; +s->crypt_method_header = stash_s->crypt_method_header; +s->aes_encrypt_key = stash_s->aes_encrypt_key; +s->aes_decrypt_key = stash_s->aes_decrypt_key; +s->lock = stash_s->lock; +s->migration_blocker = stash_s->migration_blocker; +} + static int qcow_set_key(BlockDriverState *bs, const char *key) { BDRVQcowState *s = bs->opaque; @@ -870,6 +974,10 @@ static BlockDriver bdrv_qcow = { .bdrv_close= qcow_close, .bdrv_create = qcow_create, +.bdrv_reopen_prepare= qcow_reopen_prepare, +.bdrv_reopen_commit = qcow_reopen_commit, +.bdrv_reopen_abort = qcow_reopen_abort, + .bdrv_co_readv = qcow_co_readv, .bdrv_co_writev = qcow_co_writev, .bdrv_co_is_allocated = qcow_co_is_allocated,
Re: [Qemu-devel] CoW image commit+shrink(= make_empty) support
On 06/14/2012 11:58 PM, Supriya Kannery wrote: On 06/14/2012 07:59 PM, Jeff Cody wrote: On 06/14/2012 10:23 AM, Zhi Hui Li wrote: On 2012年06月11日 23:37, Jeff Cody wrote: On 06/11/2012 10:24 AM, Stefan Hajnoczi wrote: On Mon, Jun 11, 2012 at 1:50 PM, Kevin Wolf wrote: Am 11.06.2012 14:09, schrieb Stefan Hajnoczi: On Fri, Jun 8, 2012 at 6:46 PM, Jeff Cody wrote: On 06/08/2012 12:11 PM, Kevin Wolf wrote: Am 08.06.2012 16:32, schrieb Jeff Cody: On 06/08/2012 09:53 AM, Stefan Hajnoczi wrote: On Fri, Jun 8, 2012 at 2:19 PM, Jeff Cody wrote: On 06/08/2012 08:42 AM, Stefan Hajnoczi wrote: Let's figure out how to specify block-commit so we're all happy, that way we can avoid duplicating work. Any comments on my notes above? I think we are almost completely on the same page - devil is in the details, of course (for instance, on how to convert the destination base from r/o to r/w). Great. The atomic r/o -> r/w transition and the commit coroutine can be implemented on in parallel. Are you happy to implement the atomic r/o -> r/w transition since you wrote bdrv_append()? Then Zhi Hui can assume that part already works and focus on implementing the commit coroutine in the meantime. I'm just suggesting a way to split up the work, please let me know if you think this is good. I am happy to do it that way. I'll shift my focus to the atomic image reopen in r/w mode. I'll go ahead and post my diagrams and other info for block-commit on the wiki, because I don't think it conflicts with we discussed above (although I will modify my diagrams to not show commit from the top-level image). Of course, feel free to change it as necessary. I may have mentioned it before, but just in case: I think Supriya's bdrv_reopen() patches are a good starting point. I don't know why they were never completed, but I think we all agreed on the general design, so it should be possible to pick that up. Though if you have already started with your own work on it, Jeff, I expect that it won't be much different because it's basically the same transactional approach that you know and that we already used for group snapshots. I will definitely use parts of Supriya's as it makes sense - what I started work on is similar to bdrv_append() and the current transaction approach, so there will be plenty in common to reuse, even with some differences. I have CCed Supriya who has been working on the reopen patch series. She is close to posting a new version. It's just a bit disappointing that it takes several months between each two versions of the patch series. We'd like to have this in qemu 1.2, not only in qemu 1.14. I can understand if someone can't find the time, but then allow at least someone else to pick it up. Hey, don't shoot the messenger :). I just wanted add Supriya to CC so she can join the discussion and see how much overlap there is with what she's doing. We all contribute to QEMU from different angles and with different priorities. If there is a time critical dependency on the reopen code then discuss it here and the result will be that someone officially drives the feature on. I am more than happy to take the previous reopen() patches, and drive those forward, and also do whatever else is needed for live block commit. It sounds like Zhi Hui is working on live block commit patches, and Supriya is working on the bdrv_reopen() portion - I don't want to duplicate any effort, but if there is anything I can do to help with either of those areas, just let me know. Thanks, Jeff Jeff please go ahead with block-commit, I am finishing up my fdc async conversion patch series first. I will reply when I'm ready to work on block-commit. Thank you very much! Hi Zhi, I will do that. Thanks! Jeff Jeff, Need another day's time for posting bdrv_reopen() patchset. Getting few hmp related build errors in hostcache toggling code when built on latest git. Will correct those and post the patchset soon. - Supriya Jeff, bdrv_reopen v1 patchset posted. Title: "Dynamic host pagecache change and image file reopen". Thanks, Supriya
[Qemu-devel] [v1 Patch 8/10]Qemu: qcow2 image file reopen
qcow2 driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery Index: qemu/block/qcow2.c === --- qemu.orig/block/qcow2.c +++ qemu/block/qcow2.c @@ -52,10 +52,19 @@ typedef struct { uint32_t magic; uint32_t len; } QCowExtension; + +typedef struct BDRVQcowReopenState { +BDRVReopenState reopen_state; +BDRVQcowState *stash_s; +} BDRVQcowReopenState; + #define QCOW2_EXT_MAGIC_END 0 #define QCOW2_EXT_MAGIC_BACKING_FORMAT 0xE2792ACA #define QCOW2_EXT_MAGIC_FEATURE_TABLE 0x6803f857 +static void qcow2_stash_state(BDRVQcowState *stashed_state, BDRVQcowState *s); +static void qcow2_revert_state(BDRVQcowState *s, BDRVQcowState *stashed_state); + static int qcow2_probe(const uint8_t *buf, int buf_size, const char *filename) { const QCowHeader *cow_header = (const void *)buf; @@ -436,6 +445,171 @@ static int qcow2_open(BlockDriverState * return ret; } +static int qcow2_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVQcowReopenState *qcow2_rs = g_malloc0(sizeof(BDRVQcowReopenState)); +int ret = 0; +BDRVQcowState *s = bs->opaque; + +qcow2_rs->reopen_state.bs = bs; + +/* save state before reopen */ +qcow2_rs->stash_s = g_malloc0(sizeof(BDRVQcowState)); +qcow2_stash_state(qcow2_rs->stash_s, s); +*prs = &(qcow2_rs->reopen_state); + +/* Reopen file with new flags */ + ret = qcow2_open(bs, flags); + return ret; +} + +static void qcow2_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVQcowReopenState *qcow2_rs; +BDRVQcowState *stashed_s; + +qcow2_rs = container_of(rs, BDRVQcowReopenState, reopen_state); +stashed_s = qcow2_rs->stash_s; + +g_free(stashed_s->l1_table); + +qcow2_cache_flush(bs, stashed_s->l2_table_cache); +qcow2_cache_flush(bs, stashed_s->refcount_block_cache); + +qcow2_cache_destroy(bs, stashed_s->l2_table_cache); +qcow2_cache_destroy(bs, stashed_s->refcount_block_cache); + +g_free(stashed_s->unknown_header_fields); +cleanup_unknown_header_ext(bs); + +g_free(stashed_s->cluster_cache); +qemu_vfree(stashed_s->cluster_data); +qcow2_refcount_close(bs); +qcow2_free_snapshots(bs); + +g_free(stashed_s); +g_free(qcow2_rs); +} + +static void qcow2_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVQcowReopenState *qcow2_rs; +BDRVQcowState *s = bs->opaque; +BDRVQcowState *stashed_s; + +qcow2_rs = container_of(rs, BDRVQcowReopenState, reopen_state); + +/* Revert to stashed state */ +qcow2_revert_state(s, qcow2_rs->stash_s); +stashed_s = qcow2_rs->stash_s; + +g_free(stashed_s); +g_free(qcow2_rs); +} + +static void qcow2_stash_state(BDRVQcowState *stashed_s, BDRVQcowState *s) +{ +stashed_s->cluster_bits = s->cluster_bits; +stashed_s->cluster_size = s->cluster_size; +stashed_s->cluster_sectors = s->cluster_sectors; +stashed_s->l2_bits = s->l2_bits; +stashed_s->l2_size = s->l2_size; +stashed_s->l1_size = s->l1_size; +stashed_s->l1_vm_state_index = s->l1_vm_state_index; +stashed_s->csize_shift = s->csize_shift; +stashed_s->csize_mask = s->csize_mask; +stashed_s->cluster_offset_mask = s->cluster_offset_mask; +stashed_s->l1_table_offset = s->l1_table_offset; +stashed_s->l1_table = s->l1_table; + +stashed_s->l2_table_cache = s->l2_table_cache; +stashed_s->refcount_block_cache = s->refcount_block_cache; + +stashed_s->cluster_cache = s->cluster_cache; +stashed_s->cluster_data = s->cluster_data; +stashed_s->cluster_cache_offset = s->cluster_cache_offset; +stashed_s->cluster_allocs = s->cluster_allocs; + +stashed_s->refcount_table = s->refcount_table; +stashed_s->refcount_table_offset = s->refcount_table_offset; +stashed_s->refcount_table_size = s->refcount_table_size; +stashed_s->free_cluster_index = s->free_cluster_index; +stashed_s->free_byte_offset = s->free_byte_offset; + +stashed_s->lock = s->lock; + +stashed_s->crypt_method = s->crypt_method; +stashed_s->crypt_method_header = s->crypt_method_header; +stashed_s->aes_encrypt_key = s->aes_encrypt_key; +stashed_s->aes_decrypt_key = s->aes_decrypt_key; +stashed_s->snapshots_offset = s->snapshots_offset; +stashed_s->snapshots_size = s->snapshots_size; +stashed_s->nb_snapshots = s->nb_snapshots; +stashed_s->snapshots = s->snapshots; + +stashed_s->flags = s->flags; +stashed_s->qcow_version = s->qcow_version; + +stashe
[Qemu-devel] [v1 Patch 10/10]Qemu: qed image file reopen
qed driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery Index: qemu/block/qed.c === --- qemu.orig/block/qed.c +++ qemu/block/qed.c @@ -18,6 +18,14 @@ #include "qerror.h" #include "migration.h" +typedef struct BDRVQEDReopenState { +BDRVReopenState reopen_state; +BDRVQEDState *stash_s; +} BDRVQEDReopenState; + +static void qed_stash_state(BDRVQEDState *stashed_state, BDRVQEDState *s); +static void qed_revert_state(BDRVQEDState *s, BDRVQEDState *stashed_state); + static void qed_aio_cancel(BlockDriverAIOCB *blockacb) { QEDAIOCB *acb = (QEDAIOCB *)blockacb; @@ -512,6 +520,98 @@ out: return ret; } +static int bdrv_qed_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVQEDReopenState *qed_rs = g_malloc0(sizeof(BDRVQEDReopenState)); +int ret = 0; +BDRVQEDState *s = bs->opaque; + +qed_rs->reopen_state.bs = bs; + +/* save state before reopen */ +qed_rs->stash_s = g_malloc0(sizeof(BDRVQEDState)); +qed_stash_state(qed_rs->stash_s, s); +*prs = &(qed_rs->reopen_state); + +/* Reopen file with new flags */ + ret = bdrv_qed_open(bs, flags); + return ret; +} + +static void bdrv_qed_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVQEDReopenState *qed_rs; +BDRVQEDState *stashed_s; + +qed_rs = container_of(rs, BDRVQEDReopenState, reopen_state); +stashed_s = qed_rs->stash_s; + +qed_cancel_need_check_timer(stashed_s); +qemu_free_timer(stashed_s->need_check_timer); + +qed_free_l2_cache(&stashed_s->l2_cache); +qemu_vfree(stashed_s->l1_table); + +g_free(stashed_s); +g_free(qed_rs); +} + +static void bdrv_qed_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVQEDReopenState *qed_rs; +BDRVQEDState *s = bs->opaque; +BDRVQEDState *stashed_s; + +qed_rs = container_of(rs, BDRVQEDReopenState, reopen_state); + +/* Revert to stashed state */ +qed_revert_state(s, qed_rs->stash_s); +stashed_s = qed_rs->stash_s; + +g_free(stashed_s); +g_free(qed_rs); +} + +static void qed_stash_state(BDRVQEDState *stashed_state, BDRVQEDState *s) +{ +stashed_state->bs = s->bs; +stashed_state->file_size = s->file_size; + +stashed_state->header = s->header; +stashed_state->l1_table = s->l1_table; +stashed_state->l2_cache = s->l2_cache; +stashed_state->table_nelems = s->table_nelems; +stashed_state->l1_shift = s->l1_shift; +stashed_state->l2_shift = s->l2_shift; +stashed_state->l2_mask = s->l2_mask; + +stashed_state->allocating_write_reqs = s->allocating_write_reqs; +stashed_state->allocating_write_reqs_plugged = + s->allocating_write_reqs_plugged; + +stashed_state->need_check_timer = s->need_check_timer; +} + +static void qed_revert_state(BDRVQEDState *s, BDRVQEDState *stashed_state) +{ +s->bs = stashed_state->bs; +s->file_size = stashed_state->file_size; + +s->header = stashed_state->header; +s->l1_table = stashed_state->l1_table; +s->l2_cache = stashed_state->l2_cache; +s->table_nelems = stashed_state->table_nelems; +s->l1_shift = stashed_state->l1_shift; +s->l2_shift = stashed_state->l2_shift; +s->l2_mask = stashed_state->l2_mask; + +s->allocating_write_reqs = s->allocating_write_reqs; +s->allocating_write_reqs_plugged = s->allocating_write_reqs_plugged; + +s->need_check_timer = s->need_check_timer; +} + static void bdrv_qed_close(BlockDriverState *bs) { BDRVQEDState *s = bs->opaque; @@ -1559,6 +1659,9 @@ static BlockDriver bdrv_qed = { .bdrv_rebind = bdrv_qed_rebind, .bdrv_open= bdrv_qed_open, .bdrv_close = bdrv_qed_close, +.bdrv_reopen_prepare = bdrv_qed_reopen_prepare, +.bdrv_reopen_commit = bdrv_qed_reopen_commit, +.bdrv_reopen_abort= bdrv_qed_reopen_abort, .bdrv_create = bdrv_qed_create, .bdrv_co_is_allocated = bdrv_qed_co_is_allocated, .bdrv_make_empty = bdrv_qed_make_empty,
[Qemu-devel] [v1 Patch 7/10]Qemu: vmdk image file reopen
vmdk driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery Index: qemu/block/vmdk.c === --- qemu.orig/block/vmdk.c +++ qemu/block/vmdk.c @@ -115,6 +115,14 @@ typedef struct VmdkGrainMarker { uint8_t data[0]; } VmdkGrainMarker; +typedef struct BDRVVmdkReopenState { +BDRVReopenState reopen_state; +BDRVVmdkState *stash_s; +} BDRVVmdkReopenState; + +static void vmdk_stash_state(BDRVVmdkState *stashed_state, BDRVVmdkState *s); +static void vmdk_revert_state(BDRVVmdkState *s, BDRVVmdkState *stashed_state); + static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename) { uint32_t magic; @@ -588,7 +596,6 @@ static int vmdk_parse_extents(const char if (!strcmp(type, "FLAT")) { /* FLAT extent */ VmdkExtent *extent; - extent = vmdk_add_extent(bs, extent_file, true, sectors, 0, 0, 0, 0, sectors); extent->flat_start_offset = flat_offset << 9; @@ -675,6 +682,94 @@ fail: return ret; } +static int vmdk_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVVmdkReopenState *vmdk_rs = g_malloc0(sizeof(BDRVVmdkReopenState)); +int ret = 0; +BDRVVmdkState *s = bs->opaque; + +vmdk_rs->reopen_state.bs = bs; + +/* save state before reopen */ +vmdk_rs->stash_s = g_malloc0(sizeof(BDRVVmdkState)); +vmdk_stash_state(vmdk_rs->stash_s, s); +s->num_extents = 0; +s->extents = NULL; +s->migration_blocker = NULL; +*prs = &(vmdk_rs->reopen_state); + +/* create extents afresh with new flags */ + ret = vmdk_open(bs, flags); + return ret; +} + +static void vmdk_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVVmdkReopenState *vmdk_rs; +BDRVVmdkState *stashed_s; +VmdkExtent *e; +int i; + +vmdk_rs = container_of(rs, BDRVVmdkReopenState, reopen_state); +stashed_s = vmdk_rs->stash_s; + +/* clean up stashed state */ +for (i = 0; i < stashed_s->num_extents; i++) { +e = &stashed_s->extents[i]; +g_free(e->l1_table); +g_free(e->l2_cache); +g_free(e->l1_backup_table); +} +g_free(stashed_s->extents); +g_free(vmdk_rs->stash_s); +g_free(vmdk_rs); +} + +static void vmdk_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVVmdkReopenState *vmdk_rs; +BDRVVmdkState *s = bs->opaque; +VmdkExtent *e; +int i; + +vmdk_rs = container_of(rs, BDRVVmdkReopenState, reopen_state); + +/* revert to stashed state */ +for (i = 0; i < s->num_extents; i++) { +e = &s->extents[i]; +g_free(e->l1_table); +g_free(e->l2_cache); +g_free(e->l1_backup_table); +} +g_free(s->extents); +vmdk_revert_state(s, vmdk_rs->stash_s); +g_free(vmdk_rs->stash_s); +g_free(vmdk_rs); +} + +static void vmdk_stash_state(BDRVVmdkState *stashed_state, BDRVVmdkState *s) +{ + stashed_state->lock = s->lock; + stashed_state->desc_offset = s->desc_offset; + stashed_state->cid_updated = s->cid_updated; + stashed_state->parent_cid = s->parent_cid; + stashed_state->num_extents = s->num_extents; + stashed_state->extents = s->extents; + stashed_state->migration_blocker = s->migration_blocker; +} + +static void vmdk_revert_state(BDRVVmdkState *s, BDRVVmdkState *stashed_state) +{ + s->lock = stashed_state->lock; + s->desc_offset = stashed_state->desc_offset; + s->cid_updated = stashed_state->cid_updated; + s->parent_cid = stashed_state->parent_cid; + s->num_extents = stashed_state->num_extents; + s->extents = stashed_state->extents; + s->migration_blocker = stashed_state->migration_blocker; +} + static int get_whole_cluster(BlockDriverState *bs, VmdkExtent *extent, uint64_t cluster_offset, @@ -1593,6 +1688,12 @@ static BlockDriver bdrv_vmdk = { .instance_size = sizeof(BDRVVmdkState), .bdrv_probe = vmdk_probe, .bdrv_open = vmdk_open, +.bdrv_reopen_prepare += vmdk_reopen_prepare, +.bdrv_reopen_commit += vmdk_reopen_commit, +.bdrv_reopen_abort += vmdk_reopen_abort, .bdrv_read = vmdk_co_read, .bdrv_write = vmdk_co_write, .bdrv_close = vmdk_close,
[Qemu-devel] [v1 Patch 3/10]Qemu: Cmd "block_set_hostcache" for dynamic cache change
New command "block_set_hostcache" added for dynamically changing host pagecache setting of a block device. Usage: block_set_hostcache = block device = on/off Example: (qemu) block_set_hostcache ide0-hd0 off Signed-off-by: Supriya Kannery --- block.c | 54 ++ block.h |2 ++ blockdev.c | 26 ++ blockdev.h |2 ++ hmp-commands.hx | 14 ++ qmp-commands.hx | 27 +++ 6 files changed, 125 insertions(+) Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -858,6 +858,35 @@ unlink_and_fail: return ret; } +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags) +{ +BlockDriver *drv = bs->drv; +int ret = 0, open_flags; + +/* Quiesce IO for the given block device */ +qemu_aio_flush(); +ret = bdrv_flush(bs); +if (ret != 0) { +qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); +return ret; +} +open_flags = bs->open_flags; +bdrv_close(bs); + +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); +if (ret < 0) { +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_OPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret < 0) { +/* Reopen failed with orig and modified flags */ +abort(); +} +} + +return ret; +} + void bdrv_close(BlockDriverState *bs) { bdrv_flush(bs); @@ -953,6 +982,34 @@ void bdrv_drain_all(void) } } +int bdrv_change_hostcache(BlockDriverState *bs, bool enable) +{ +int bdrv_flags = bs->open_flags; + +/* set hostcache flags (without changing WCE/flush bits) */ +if (enable) { +bdrv_flags &= ~BDRV_O_NOCACHE; +} else { +bdrv_flags |= BDRV_O_NOCACHE; +} + +/* If no change in flags, no need to reopen */ +if (bdrv_flags == bs->open_flags) { +printf("no need to change\n"); +return 0; +} + +if (bdrv_is_inserted(bs)) { +/* Reopen file with changed set of flags */ +bdrv_flags &= ~BDRV_O_CACHE_WB; +return bdrv_reopen(bs, bdrv_flags); +} else { +/* Save hostcache change for future use */ +bs->open_flags = bdrv_flags; +return 0; +} +} + /* make a BlockDriverState anonymous by removing from bdrv_state list. Also, NULL terminate the device_name to prevent double remove */ void bdrv_make_anon(BlockDriverState *bs) Index: qemu/block.h === --- qemu.orig/block.h +++ qemu/block.h @@ -128,6 +128,7 @@ int bdrv_parse_cache_flags(const char *m int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags); int bdrv_open(BlockDriverState *bs, const char *filename, int flags, BlockDriver *drv); +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags); void bdrv_close(BlockDriverState *bs); int bdrv_attach_dev(BlockDriverState *bs, void *dev); void bdrv_attach_dev_nofail(BlockDriverState *bs, void *dev); @@ -177,6 +178,7 @@ int bdrv_commit_all(void); int bdrv_change_backing_file(BlockDriverState *bs, const char *backing_file, const char *backing_fmt); void bdrv_register(BlockDriver *bdrv); +int bdrv_change_hostcache(BlockDriverState *bs, bool enable_host_cache); typedef struct BdrvCheckResult { Index: qemu/blockdev.c === --- qemu.orig/blockdev.c +++ qemu/blockdev.c @@ -1198,3 +1198,27 @@ BlockJobInfoList *qmp_query_block_jobs(E bdrv_iterate(do_qmp_query_block_jobs_one, &prev); return dummy.next; } + + +/* + * Change host page cache setting while guest is running. +*/ +void qmp_block_set_hostcache(const char *device, bool enable, + Error **errp) +{ +BlockDriverState *bs = NULL; + +/* Validate device */ +bs = bdrv_find(device); +if (!bs) { +error_set(errp, QERR_DEVICE_NOT_FOUND, device); +return; +} + +if (bdrv_change_hostcache(bs, enable)) { +error_set(errp, QERR_HOSTCACHE_NOT_CHANGED, device); +} + +return; +} + Index: qemu/hmp-commands.hx === --- qemu.orig/hmp-commands.hx +++ qemu/hmp-commands.hx @@ -808,7 +808,6 @@ ETEXI .mhandler.cmd = hmp_migrate, }, - STEXI @item migrate [-d] [-b] [-i] @var{uri} @findex migrate @@ -1314,6 +1313,21 @@ client. @var{keep} changes the password ETEXI { +.name = "block_set_hostcache", +.args_type = "device:B,option:b", +.params = "device on|off", +.help = "Change setting of host pagecache", +.mhandler.cmd
[Qemu-devel] [v1 Patch 1/10]Qemu: Enhance "info block" to display host cache setting
Enhance "info block" to display hostcache setting for each block device. Example: (qemu) info block ide0-hd0: removable=0 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Enhanced to display "hostcache" setting: (qemu) info block ide0-hd0: removable=0 hostcache=1 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Signed-off-by: Supriya Kannery --- block.c | 20 qmp-commands.hx |2 ++ 2 files changed, 18 insertions(+), 4 deletions(-) Index: qemu/qapi-schema.json === --- qemu.orig/qapi-schema.json +++ qemu/qapi-schema.json @@ -447,6 +447,8 @@ # @locked: True if the guest has locked this device from having its media # removed # +# @hostcache: True if host pagecache is enabled. +# # @tray_open: #optional True if the device has a tray and it is open # (only present if removable is true) # @@ -460,7 +462,7 @@ ## { 'type': 'BlockInfo', 'data': {'device': 'str', 'type': 'str', 'removable': 'bool', - 'locked': 'bool', '*inserted': 'BlockDeviceInfo', + 'locked': 'bool','hostcache': 'bool', '*inserted': 'BlockDeviceInfo', '*tray_open': 'bool', '*io-status': 'BlockDeviceIoStatus'} } ## Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -2581,6 +2581,7 @@ BlockInfoList *qmp_query_block(Error **e info->value->device = g_strdup(bs->device_name); info->value->type = g_strdup("unknown"); info->value->locked = bdrv_dev_is_medium_locked(bs); +info->value->hostcache = !(bs->open_flags & BDRV_O_NOCACHE); info->value->removable = bdrv_dev_has_removable_media(bs); if (bdrv_dev_has_removable_media(bs)) { Index: qemu/hmp.c === --- qemu.orig/hmp.c +++ qemu/hmp.c @@ -212,6 +212,8 @@ void hmp_info_block(Monitor *mon) monitor_printf(mon, " tray-open=%d", info->value->tray_open); } +monitor_printf(mon, " hostcache=%d", info->value->hostcache); + if (info->value->has_io_status) { monitor_printf(mon, " io-status=%s", BlockDeviceIoStatus_lookup[info->value->io_status]);
[Qemu-devel] [v1 Patch 0/10]Qemu: Dynamic host pagecache change and image file reopen
For changing host pagecache setting of a running VM, it is important to have a safe way of reopening its image file. Following patchset introduces: * a generic way to reopen image files safely. In this approach, before reopening an image, for each block driver, its state will be stashed. Incase preparation, (bdrv_reopen_prepare) for reopening returns success, the stashed state will be cleared (bdrv_reopen_commit) and reopened state will be used further. Incase preparation of reopening returns failure, the state of the driver will be rolled back (bdrv_reopen_abort) to the stashed state. This approach is implemented for raw-posix, raw-win32, vmdk, qcow, qcow2 and qed block drivers. * qmp and hmp command 'block_set_hostcache' using which host pagecache setting for a block device can be changed when the VM is running. * BDRVReopenState, a generic structure which can be extended by each of the block drivers to reopen respective image files. ToDo: * Currently driver state is stashed by field-by-field copy. Optimise this by stashing only the required fields. * Find out other drivers needing bdrv_reopen implementation. * Do some more extensive testing, especially with drivers like floppy, cdrom etc.. Earlier discussions on RFC for dynamic change of host pagecache can be found at: http://lists.gnu.org/archive/html/qemu-devel/2012-01/msg04139.html New block command added: "block_set_hostcache" -- Sets hostcache parameter for block device while guest is running. Usage: block_set_hostcache = block device = on/off qemu/block.c | 127 -- qemu/block.h |5 + qemu/block/qcow.c | 108 + qemu/block/qcow2.c | 177 + qemu/block/qed.c | 103 qemu/block/raw-posix.c | 123 ++ qemu/block/raw-win32.c | 96 ++ qemu/block/raw.c | 20 + qemu/block/vmdk.c | 103 qemu/block_int.h | 11 +++ qemu/blockdev.c| 24 ++ qemu/hmp-commands.hx | 16 qemu/hmp.c | 11 ++ qemu/hmp.h |1 qemu/qapi-schema.json | 20 - qemu/qemu-common.h |1 qemu/qerror.c |8 ++ qemu/qerror.h |6 + qemu/qmp-commands.hx | 24 ++ 19 files changed, 971 insertions(+), 13 deletions(-)
Re: [Qemu-devel] CoW image commit+shrink(= make_empty) support
On 06/14/2012 07:59 PM, Jeff Cody wrote: On 06/14/2012 10:23 AM, Zhi Hui Li wrote: On 2012年06月11日 23:37, Jeff Cody wrote: On 06/11/2012 10:24 AM, Stefan Hajnoczi wrote: On Mon, Jun 11, 2012 at 1:50 PM, Kevin Wolf wrote: Am 11.06.2012 14:09, schrieb Stefan Hajnoczi: On Fri, Jun 8, 2012 at 6:46 PM, Jeff Cody wrote: On 06/08/2012 12:11 PM, Kevin Wolf wrote: Am 08.06.2012 16:32, schrieb Jeff Cody: On 06/08/2012 09:53 AM, Stefan Hajnoczi wrote: On Fri, Jun 8, 2012 at 2:19 PM, Jeff Cody wrote: On 06/08/2012 08:42 AM, Stefan Hajnoczi wrote: Let's figure out how to specify block-commit so we're all happy, that way we can avoid duplicating work. Any comments on my notes above? I think we are almost completely on the same page - devil is in the details, of course (for instance, on how to convert the destination base from r/o to r/w). Great. The atomic r/o -> r/w transition and the commit coroutine can be implemented on in parallel. Are you happy to implement the atomic r/o -> r/w transition since you wrote bdrv_append()? Then Zhi Hui can assume that part already works and focus on implementing the commit coroutine in the meantime. I'm just suggesting a way to split up the work, please let me know if you think this is good. I am happy to do it that way. I'll shift my focus to the atomic image reopen in r/w mode. I'll go ahead and post my diagrams and other info for block-commit on the wiki, because I don't think it conflicts with we discussed above (although I will modify my diagrams to not show commit from the top-level image). Of course, feel free to change it as necessary. I may have mentioned it before, but just in case: I think Supriya's bdrv_reopen() patches are a good starting point. I don't know why they were never completed, but I think we all agreed on the general design, so it should be possible to pick that up. Though if you have already started with your own work on it, Jeff, I expect that it won't be much different because it's basically the same transactional approach that you know and that we already used for group snapshots. I will definitely use parts of Supriya's as it makes sense - what I started work on is similar to bdrv_append() and the current transaction approach, so there will be plenty in common to reuse, even with some differences. I have CCed Supriya who has been working on the reopen patch series. She is close to posting a new version. It's just a bit disappointing that it takes several months between each two versions of the patch series. We'd like to have this in qemu 1.2, not only in qemu 1.14. I can understand if someone can't find the time, but then allow at least someone else to pick it up. Hey, don't shoot the messenger :). I just wanted add Supriya to CC so she can join the discussion and see how much overlap there is with what she's doing. We all contribute to QEMU from different angles and with different priorities. If there is a time critical dependency on the reopen code then discuss it here and the result will be that someone officially drives the feature on. I am more than happy to take the previous reopen() patches, and drive those forward, and also do whatever else is needed for live block commit. It sounds like Zhi Hui is working on live block commit patches, and Supriya is working on the bdrv_reopen() portion - I don't want to duplicate any effort, but if there is anything I can do to help with either of those areas, just let me know. Thanks, Jeff Jeff please go ahead with block-commit, I am finishing up my fdc async conversion patch series first. I will reply when I'm ready to work on block-commit. Thank you very much! Hi Zhi, I will do that. Thanks! Jeff Jeff, Need another day's time for posting bdrv_reopen() patchset. Getting few hmp related build errors in hostcache toggling code when built on latest git. Will correct those and post the patchset soon. - Supriya
Re: [Qemu-devel] CoW image commit+shrink(= make_empty) support
On 06/12/2012 04:26 PM, Stefan Hajnoczi wrote: On Mon, Jun 11, 2012 at 4:37 PM, Jeff Cody wrote: On 06/11/2012 10:24 AM, Stefan Hajnoczi wrote: On Mon, Jun 11, 2012 at 1:50 PM, Kevin Wolf wrote: Am 11.06.2012 14:09, schrieb Stefan Hajnoczi: On Fri, Jun 8, 2012 at 6:46 PM, Jeff Cody wrote: On 06/08/2012 12:11 PM, Kevin Wolf wrote: Am 08.06.2012 16:32, schrieb Jeff Cody: On 06/08/2012 09:53 AM, Stefan Hajnoczi wrote: On Fri, Jun 8, 2012 at 2:19 PM, Jeff Cody wrote: On 06/08/2012 08:42 AM, Stefan Hajnoczi wrote: Let's figure out how to specify block-commit so we're all happy, that way we can avoid duplicating work. Any comments on my notes above? I think we are almost completely on the same page - devil is in the details, of course (for instance, on how to convert the destination base from r/o to r/w). Great. The atomic r/o -> r/w transition and the commit coroutine can be implemented on in parallel. Are you happy to implement the atomic r/o -> r/w transition since you wrote bdrv_append()? Then Zhi Hui can assume that part already works and focus on implementing the commit coroutine in the meantime. I'm just suggesting a way to split up the work, please let me know if you think this is good. I am happy to do it that way. I'll shift my focus to the atomic image reopen in r/w mode. I'll go ahead and post my diagrams and other info for block-commit on the wiki, because I don't think it conflicts with we discussed above (although I will modify my diagrams to not show commit from the top-level image). Of course, feel free to change it as necessary. I may have mentioned it before, but just in case: I think Supriya's bdrv_reopen() patches are a good starting point. I don't know why they were never completed, but I think we all agreed on the general design, so it should be possible to pick that up. Though if you have already started with your own work on it, Jeff, I expect that it won't be much different because it's basically the same transactional approach that you know and that we already used for group snapshots. I will definitely use parts of Supriya's as it makes sense - what I started work on is similar to bdrv_append() and the current transaction approach, so there will be plenty in common to reuse, even with some differences. I have CCed Supriya who has been working on the reopen patch series. She is close to posting a new version. It's just a bit disappointing that it takes several months between each two versions of the patch series. We'd like to have this in qemu 1.2, not only in qemu 1.14. I can understand if someone can't find the time, but then allow at least someone else to pick it up. Hey, don't shoot the messenger :). I just wanted add Supriya to CC so she can join the discussion and see how much overlap there is with what she's doing. We all contribute to QEMU from different angles and with different priorities. If there is a time critical dependency on the reopen code then discuss it here and the result will be that someone officially drives the feature on. I am more than happy to take the previous reopen() patches, and drive those forward, and also do whatever else is needed for live block commit. Supriya, Can you share with us whether you have enough time to complete the reopen() patches you've been working on? This functionality is a dependency for the new block-commit command. Jeff is willing to take on the reopen() work if you do not have time. Please let us know. Stefan All, Sorry! for not following up these discussions and hence the delay in responding. After few month's gap, I had resumed working on bdrv_reopen() patchset but was not aware of this dependency and urgency. Will be posting whatever I have by tomorrow so that Jeff can use that code as needed for block-commit. - Supriya
Re: [Qemu-devel] [RFC Patch 5/7]Qemu: raw-posix image file reopen
On 02/07/2012 03:47 PM, Stefan Hajnoczi wrote: On Wed, Feb 01, 2012 at 08:37:12AM +0530, Supriya Kannery wrote: +/* stash state before reopen */ +raw_rs->stash_s = g_malloc0(sizeof(BDRVRawState)); +memcpy(raw_rs->stash_s, s, sizeof(BDRVRawState)); Copying a struct is fragile, Mike Roth pointed out the potential issue with aligned_buf. If raw-posix could open from a given file descriptor as an alternative to opening a filename, then it would be clean and natural to simply re-initialize from the dup'd file descriptor in the abort case. That's the approach I would try instead of stashing the whole struct. Stefan fine, will get V1 with stashing of only required fields wherever possible instead of stashing the full struct.
Re: [Qemu-devel] [RFC Patch 4/7]Qemu: Framework for reopening image files safely
On 02/07/2012 03:38 PM, Stefan Hajnoczi wrote: On Wed, Feb 01, 2012 at 08:36:58AM +0530, Supriya Kannery wrote: Struct BDRVReopenState along with three reopen related functions introduced for handling reopening of images safely. This can be extended by each of the block drivers to reopen respective image files. +int bdrv_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, int flags) +{ + BlockDriver *drv = bs->drv; + + return drv->bdrv_reopen_prepare(bs, prs, flags); Indentation should be 4 spaces. I suggest configuring your editor so this is always correct and done automatically. ok -ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); -if (ret< 0) { -/* Reopen failed. Try to open with original flags */ -qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); -ret = bdrv_open(bs, bs->filename, open_flags, drv); +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen_prepare) { +ret = bdrv_reopen_prepare(bs,&reopen_state, bdrv_flags); + if (ret< 0) { Indentation +bdrv_reopen_abort(bs, reopen_state); +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +return ret; +} + +bdrv_reopen_commit(bs, reopen_state); +bs->open_flags = bdrv_flags; Is bs->open_flags not assigned inside bdrv_reopen_*()? No, Since it is generic for all the drivers, placed it here. + +} else { + open_flags = bs->open_flags; + bdrv_close(bs); + + ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); if (ret< 0) { -/* Reopen failed with orig and modified flags */ -abort(); +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret< 0) { +/* Reopen failed with orig and modified flags */ +bs->drv = NULL; This should be a post-condition of bdrv_open(). If you have found a case where bs->drv != NULL after bdrv_open() fails, then please fix that code path instead of assigning NULL here. ok, will check on this -thanks for reviewing, Supriya
Re: [Qemu-devel] [RFC Patch 4/7]Qemu: Framework for reopening image files safely
On 02/08/2012 08:37 PM, Kevin Wolf wrote: Am 01.02.2012 04:06, schrieb Supriya Kannery: Struct BDRVReopenState along with three reopen related functions introduced for handling reopening of images safely. This can be extended by each of the block drivers to reopen respective image files. +} else { + open_flags = bs->open_flags; + bdrv_close(bs); + + ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); if (ret< 0) { -/* Reopen failed with orig and modified flags */ -abort(); +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret< 0) { +/* Reopen failed with orig and modified flags */ +bs->drv = NULL; +} } Most image formats don't have a bdrv_reopen_* implementation after this series, so usually you'll have something like qcow2 on top of file. This code uses bdrv_close/open for the whole stack, even though the file layer could actually make use of a bdrv_reopen_* implementation and the qcow2 open isn't likely to fail if the image file could be opened. I think we can use drv->bdrv_close/open to reopen only one layer and try using bdrv_reopen_* for the lower layer again. This is an improvement that can be done in a separate patch, though. What I understood is, in the enhancement patch, we will have something like (taking qcow2 as an example) Implement bdrv_reopen_qcow2(image file) which reopens only the qcow2 image file Then, drv->bdrv_open(qcow2 driver) will reopen qcow2 driver => calls bdrv_reopen_qcow2(qcow2 image file) if image file has to be reopen Can you please explain a bit more, it this is not what you meant. -thanks, Supriya
Re: [Qemu-devel] [RFC Patch 6/7]Qemu: raw-win32 image file reopen
On 02/08/2012 08:32 PM, Kevin Wolf wrote: Am 01.02.2012 04:07, schrieb Supriya Kannery: win32 driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. + +if (osvi.dwMajorVersion>= WINDOWS_VISTA) { +s->hfile = ReOpenFile(raw_rs->stash_hfile, 0, FILE_SHARE_READ, + overlapped); +if (s->hfile == INVALID_HANDLE_VALUE) { +int err = GetLastError(); +if (err == ERROR_ACCESS_DENIED) { +ret = -EACCES; +} else { +ret = -1; Returning -1 where -errno is expected is bad (turns out as -EPERM on Linux, which is misleading). Maybe -EIO here. ok -thanks, Supriya
Re: [Qemu-devel] [RFC Patch 5/7]Qemu: raw-posix image file reopen
On 02/08/2012 08:24 PM, Kevin Wolf wrote: Am 01.02.2012 04:07, schrieb Supriya Kannery: raw-posix driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. +typedef struct BDRVRawReopenState { +BDRVReopenState reopen_state; +BDRVRawState *stash_s; +} BDRVRawReopenState; See Stefan's comment. If it's possible to save only the fd and maybe one or two other fields, then we should do that. Yes, for V1 of this patchset, will look for stashing only those relevant fields of a driver state wherever possible + +if ((bs->open_flags& ~fcntl_flags) == (flags& ~fcntl_flags)) { +if ((flags& BDRV_O_NOCACHE)) { +s->open_flags |= O_DIRECT; +} else { +s->open_flags&= ~O_DIRECT; +} +printf("O_DIRECT flag\n"); Debugging leftover? yes :-), didn't do a proper cleanup as this is RFC for the stashing approach. +ret = fcntl_setfl(s->fd, s->open_flags); +} else { + +printf("close and open with new flags\n"); Same here. V1 will be a clean one ! Kevin
Re: [Qemu-devel] [RFC Patch 3/7]Qemu: Cmd "block_set_hostcache" for dynamic cache change
On 02/08/2012 05:37 PM, Luiz Capitulino wrote: On Wed, 01 Feb 2012 08:36:41 +0530 Supriya Kannery wrote: +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); +if (ret< 0) { +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); OPEN_FILE_FAILED is fine. ok +/* + * Change host page cache setting while guest is running. +*/ +int do_block_set_hostcache(Monitor *mon, const QDict *qdict, + QObject **ret_data) This is not a QAPI command, please read docs/writing-qmp-commands.txt to know how to write QMP commands using the QAPI. fine, will check the doc -thanks, Supriya
Re: [Qemu-devel] [RFC Patch 1/7]Qemu: Enhance "info block" to display host cache setting
On 02/08/2012 05:30 PM, Luiz Capitulino wrote: On Wed, 01 Feb 2012 08:36:14 +0530 Supriya Kannery wrote: Enhance "info block" to display hostcache setting for each block device. Example: (qemu) info block ide0-hd0: removable=0 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Enhanced to display "hostcache" setting: (qemu) info block ide0-hd0: removable=0 hostcache=1 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 The day we'll want to refactor 'info block' output is coming... ok :-)
Re: [Qemu-devel] [RFC Patch 2/7]Qemu: Error classes for file reopen and data sync failure
On 02/07/2012 01:26 PM, Stefan Hajnoczi wrote: On Wed, Feb 01, 2012 at 08:36:28AM +0530, Supriya Kannery wrote: Index: qemu/qerror.c === --- qemu.orig/qerror.c +++ qemu/qerror.c @@ -108,6 +108,14 @@ static const QErrorStringTable qerror_ta .desc = "Device '%(device)' has multiple child busses", }, { +.error_fmt = QERR_DATA_SYNC_FAILED, +.desc = "Syncing of data failed for device '%(device)'", +}, +{ +.error_fmt = QERR_REOPEN_FILE_FAILED, +.desc = "Could not reopen '%(filename)'", +}, The comment in qerror.c says: "Please keep the entries in alphabetical order. Use scripts/check-qerror.sh to check." ok +{ .error_fmt = QERR_DEVICE_NO_BUS, .desc = "Device '%(device)' has no child bus", }, Index: qemu/qerror.h === --- qemu.orig/qerror.h +++ qemu/qerror.h @@ -117,6 +117,9 @@ QError *qobject_to_qerror(const QObject #define QERR_DEVICE_NOT_FOUND \ "{ 'class': 'DeviceNotFound', 'data': { 'device': %s } }" +#define QERR_DATA_SYNC_FAILED \ +"{ 'class': 'DataSyncFailed', 'data': { 'device': %s } }" + Same here: /* * QError class list * Please keep the definitions in alphabetical order. * Use scripts/check-qerror.sh to check. */ ok
Re: [Qemu-devel] [RFC Patch 5/7]Qemu: raw-posix image file reopen
On 02/02/2012 05:45 AM, Michael Roth wrote: On 01/31/2012 09:07 PM, Supriya Kannery wrote: raw-posix driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. + + /* Flags that can be set using fcntl */ + int fcntl_flags = BDRV_O_NOCACHE; + + if ((bs->open_flags& ~fcntl_flags) == (flags& ~fcntl_flags)) { + if ((flags& BDRV_O_NOCACHE)) { + s->open_flags |= O_DIRECT; + } else { + s->open_flags&= ~O_DIRECT; + } + printf("O_DIRECT flag\n"); + ret = fcntl_setfl(s->fd, s->open_flags); raw-posix.c:raw_aio_submit() does some extra alignment work if s->aligned_buf was set due to the image being opened O_DIRECT initially, not sure what the impact is but probably want to clean that up here. ok, will check on this thanks! for reviewing Supriya
[Qemu-devel] [RFC Patch 5/7]Qemu: raw-posix image file reopen
raw-posix driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery Index: qemu/block/raw.c === --- qemu.orig/block/raw.c +++ qemu/block/raw.c @@ -9,6 +9,22 @@ static int raw_open(BlockDriverState *bs return 0; } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +return bdrv_reopen_prepare(bs->file, prs, flags); +} + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +bdrv_reopen_commit(bs->file, rs); +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +bdrv_reopen_abort(bs->file, rs); +} + static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov) { @@ -109,6 +125,10 @@ static BlockDriver bdrv_raw = { .instance_size = 1, .bdrv_open = raw_open, +.bdrv_reopen_prepare += raw_reopen_prepare, +.bdrv_reopen_commit = raw_reopen_commit, +.bdrv_reopen_abort = raw_reopen_abort, .bdrv_close = raw_close, .bdrv_co_readv = raw_co_readv, Index: qemu/block/raw-posix.c === --- qemu.orig/block/raw-posix.c +++ qemu/block/raw-posix.c @@ -136,6 +136,11 @@ typedef struct BDRVRawState { #endif } BDRVRawState; +typedef struct BDRVRawReopenState { +BDRVReopenState reopen_state; +BDRVRawState *stash_s; +} BDRVRawReopenState; + static int fd_open(BlockDriverState *bs); static int64_t raw_getlength(BlockDriverState *bs); @@ -279,6 +284,71 @@ static int raw_open(BlockDriverState *bs return raw_open_common(bs, filename, flags, 0); } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVRawReopenState *raw_rs = g_malloc0(sizeof(BDRVRawReopenState)); +BDRVRawState *s = bs->opaque; +int ret = 0; + +raw_rs->reopen_state.bs = bs; + +/* stash state before reopen */ +raw_rs->stash_s = g_malloc0(sizeof(BDRVRawState)); +memcpy(raw_rs->stash_s, s, sizeof(BDRVRawState)); +s->fd = dup(raw_rs->stash_s->fd); + +*prs = &(raw_rs->reopen_state); + +/* Flags that can be set using fcntl */ +int fcntl_flags = BDRV_O_NOCACHE; + +if ((bs->open_flags & ~fcntl_flags) == (flags & ~fcntl_flags)) { +if ((flags & BDRV_O_NOCACHE)) { +s->open_flags |= O_DIRECT; +} else { +s->open_flags &= ~O_DIRECT; +} +printf("O_DIRECT flag\n"); +ret = fcntl_setfl(s->fd, s->open_flags); +} else { + +printf("close and open with new flags\n"); +/* close and reopen using new flags */ +bs->drv->bdrv_close(bs); +ret = bs->drv->bdrv_file_open(bs, bs->filename, flags); +} +return ret; +} + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVRawReopenState *raw_rs; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +/* clean up stashed state */ +close(raw_rs->stash_s->fd); +g_free(raw_rs->stash_s); +g_free(raw_rs); +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVRawReopenState *raw_rs; +BDRVRawState *s = bs->opaque; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +/* revert to stashed state */ +if (s->fd != -1) { +close(s->fd); +} +memcpy(s, raw_rs->stash_s, sizeof(BDRVRawState)); +g_free(raw_rs->stash_s); +g_free(raw_rs); +} + /* XXX: use host sector size if necessary with: #ifdef DIOCGSECTORSIZE { @@ -631,6 +701,9 @@ static BlockDriver bdrv_file = { .instance_size = sizeof(BDRVRawState), .bdrv_probe = NULL, /* no probe for protocols */ .bdrv_file_open = raw_open, +.bdrv_reopen_prepare = raw_reopen_prepare, +.bdrv_reopen_commit = raw_reopen_commit, +.bdrv_reopen_abort = raw_reopen_abort, .bdrv_close = raw_close, .bdrv_create = raw_create, .bdrv_co_discard = raw_co_discard,
[Qemu-devel] [RFC Patch 7/7]Qemu: vmdk image file reopen
vmdk driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache flag dynamically is handled here. Signed-off-by: Supriya Kannery Index: qemu/block/vmdk.c === --- qemu.orig/block/vmdk.c +++ qemu/block/vmdk.c @@ -115,6 +115,11 @@ typedef struct VmdkGrainMarker { uint8_t data[0]; } VmdkGrainMarker; +typedef struct BDRVVmdkReopenState { +BDRVReopenState reopen_state; +BDRVVmdkState *stash_s; +} BDRVVmdkReopenState; + static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename) { uint32_t magic; @@ -588,7 +593,6 @@ static int vmdk_parse_extents(const char if (!strcmp(type, "FLAT")) { /* FLAT extent */ VmdkExtent *extent; - extent = vmdk_add_extent(bs, extent_file, true, sectors, 0, 0, 0, 0, sectors); extent->flat_start_offset = flat_offset << 9; @@ -675,6 +679,71 @@ fail: return ret; } +static int vmdk_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVVmdkReopenState *vmdk_rs = g_malloc0(sizeof(BDRVVmdkReopenState)); +int ret = 0; +BDRVVmdkState *s = bs->opaque; + +vmdk_rs->reopen_state.bs = bs; + +/* save state before reopen */ +vmdk_rs->stash_s = g_malloc0(sizeof(BDRVVmdkState)); +memcpy(vmdk_rs->stash_s, s, sizeof(BDRVVmdkState)); +s->num_extents = 0; +s->extents = NULL ; +*prs = &(vmdk_rs->reopen_state); + +/* create extents afresh with new flags */ + ret = vmdk_open(bs, flags); + return ret; +} + +static void vmdk_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVVmdkReopenState *vmdk_rs; +BDRVVmdkState *stashed_s; +VmdkExtent *e; +int i; + +vmdk_rs = container_of(rs, BDRVVmdkReopenState, reopen_state); +stashed_s = vmdk_rs->stash_s; + +/* clean up stashed state */ +for (i = 0; i < stashed_s->num_extents; i++) { +e = &stashed_s->extents[i]; +g_free(e->l1_table); +g_free(e->l2_cache); +g_free(e->l1_backup_table); +} +g_free(stashed_s->extents); +g_free(vmdk_rs->stash_s); +g_free(vmdk_rs); +} + +static void vmdk_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVVmdkReopenState *vmdk_rs; +BDRVVmdkState *s = bs->opaque; +VmdkExtent *e; +int i; + +vmdk_rs = container_of(rs, BDRVVmdkReopenState, reopen_state); + +/* revert to stashed state */ +for (i = 0; i < s->num_extents; i++) { +e = &s->extents[i]; +g_free(e->l1_table); +g_free(e->l2_cache); +g_free(e->l1_backup_table); +} +g_free(s->extents); +memcpy(s, vmdk_rs->stash_s, sizeof(BDRVVmdkState)); +g_free(vmdk_rs->stash_s); +g_free(vmdk_rs); +} + static int get_whole_cluster(BlockDriverState *bs, VmdkExtent *extent, uint64_t cluster_offset, @@ -1382,7 +1451,6 @@ static int vmdk_create(const char *filen if (filename_decompose(filename, path, prefix, postfix, PATH_MAX)) { return -EINVAL; } -printf("vmdk_create\n"); /* Read out options */ while (options && options->name) { if (!strcmp(options->name, BLOCK_OPT_SIZE)) { @@ -1517,8 +1585,8 @@ exit: static void vmdk_close(BlockDriverState *bs) { BDRVVmdkState *s = bs->opaque; - printf("vmdk_close\n"); + vmdk_free_extents(bs); migrate_del_blocker(s->migration_blocker); @@ -1595,6 +1663,12 @@ static BlockDriver bdrv_vmdk = { .instance_size = sizeof(BDRVVmdkState), .bdrv_probe = vmdk_probe, .bdrv_open = vmdk_open, +.bdrv_reopen_prepare += vmdk_reopen_prepare, +.bdrv_reopen_commit += vmdk_reopen_commit, +.bdrv_reopen_abort += vmdk_reopen_abort, .bdrv_read = vmdk_co_read, .bdrv_write = vmdk_co_write, .bdrv_close = vmdk_close,
[Qemu-devel] [RFC Patch 6/7]Qemu: raw-win32 image file reopen
win32 driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery Index: qemu/block/raw-win32.c === --- qemu.orig/block/raw-win32.c +++ qemu/block/raw-win32.c @@ -26,18 +26,27 @@ #include "block_int.h" #include "module.h" #include +#include #include #define FTYPE_FILE 0 #define FTYPE_CD 1 #define FTYPE_HARDDISK 2 +#define WINDOWS_VISTA 6 typedef struct BDRVRawState { HANDLE hfile; int type; char drive_path[16]; /* format: "d:\" */ +DWORD overlapped; } BDRVRawState; +typedef struct BDRVRawReopenState { +BDRVReopenState reopen_state; +HANDLE stash_hfile; +DWORD stash_overlapped; +} BDRVRawReopenState; + int qemu_ftruncate64(int fd, int64_t length) { LARGE_INTEGER li; @@ -106,9 +115,96 @@ static int raw_open(BlockDriverState *bs return -EACCES; return -1; } +s->overlapped = overlapped; return 0; } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVRawReopenState *raw_rs = g_malloc0(sizeof(BDRVRawReopenState)); +BDRVRawState *s = bs->opaque; +int ret = 0; +OSVERSIONINFO osvi; +BOOL bIsWindowsVistaorLater; + +raw_rs->bs = bs; +raw_rs->stash_hfile = s->hfile; +raw_rs->stash_overlapped = s->overlapped; +*prs = raw_rs; + +if (flags & BDRV_O_NOCACHE) { +s->overlapped |= FILE_FLAG_NO_BUFFERING; +} else { +s->overlapped &= ~FILE_FLAG_NO_BUFFERING; +} + +if (!(flags & BDRV_O_CACHE_WB)) { +s->overlapped |= FILE_FLAG_WRITE_THROUGH; +} else { +s->overlapped &= ~FILE_FLAG_WRITE_THROUGH; +} + +ZeroMemory(&osvi, sizeof(OSVERSIONINFO)); +osvi.dwOSVersionInfoSize = sizeof(OSVERSIONINFO); + +GetVersionEx(&osvi); + +if (osvi.dwMajorVersion >= WINDOWS_VISTA) { +s->hfile = ReOpenFile(raw_rs->stash_hfile, 0, FILE_SHARE_READ, + overlapped); +if (s->hfile == INVALID_HANDLE_VALUE) { +int err = GetLastError(); +if (err == ERROR_ACCESS_DENIED) { +ret = -EACCES; +} else { +ret = -1; +} +} +} else { + +DuplicateHandle(GetCurrentProcess(), +raw_rs->stash_hfile, +GetCurrentProcess(), +&s->hfile, +0, +FALSE, +DUPLICATE_SAME_ACCESS); +bs->drv->bdrv_close(bs); +ret = bs->drv->bdrv_open(bs, bs->filename, flags); +} +return ret; +} + + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVRawReopenState *raw_rs; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +/* clean up stashed handle */ +CloseHandle(raw_rs->stash_hfile); +g_free(raw_rs); + +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ + +BDRVRawReopenState *raw_rs; +BDRVRawState *s = bs->opaque; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +if (s->hfile && (s->hfile != INVALID_HANDLE_VALUE)) { +CloseHandle(s->hfile); +} +s->hfile = raw_rs->stash_hfile; +s->overlapped = raw_rs->stash_overlapped; +g_free(raw_rs); +} + static int raw_read(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors) {
[Qemu-devel] [RFC Patch 4/7]Qemu: Framework for reopening image files safely
Struct BDRVReopenState along with three reopen related functions introduced for handling reopening of images safely. This can be extended by each of the block drivers to reopen respective image files. Signed-off-by: Supriya Kannery Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -808,10 +808,32 @@ unlink_and_fail: return ret; } +int bdrv_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, int flags) +{ + BlockDriver *drv = bs->drv; + + return drv->bdrv_reopen_prepare(bs, prs, flags); +} + +void bdrv_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs) +{ +BlockDriver *drv = bs->drv; + +drv->bdrv_reopen_commit(bs, rs); +} + +void bdrv_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BlockDriver *drv = bs->drv; + +drv->bdrv_reopen_abort(bs, rs); +} + int bdrv_reopen(BlockDriverState *bs, int bdrv_flags) { BlockDriver *drv = bs->drv; int ret = 0, open_flags; +BDRVReopenState *reopen_state = NULL; /* Quiesce IO for the given block device */ qemu_aio_flush(); @@ -820,17 +842,32 @@ int bdrv_reopen(BlockDriverState *bs, in qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); return ret; } -open_flags = bs->open_flags; -bdrv_close(bs); -ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); -if (ret < 0) { -/* Reopen failed. Try to open with original flags */ -qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); -ret = bdrv_open(bs, bs->filename, open_flags, drv); +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen_prepare) { +ret = bdrv_reopen_prepare(bs, &reopen_state, bdrv_flags); + if (ret < 0) { +bdrv_reopen_abort(bs, reopen_state); +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +return ret; +} + +bdrv_reopen_commit(bs, reopen_state); +bs->open_flags = bdrv_flags; + +} else { + open_flags = bs->open_flags; + bdrv_close(bs); + + ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); if (ret < 0) { -/* Reopen failed with orig and modified flags */ -abort(); +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret < 0) { +/* Reopen failed with orig and modified flags */ +bs->drv = NULL; +} } } Index: qemu/block_int.h === --- qemu.orig/block_int.h +++ qemu/block_int.h @@ -105,6 +105,13 @@ struct BlockDriver { int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename); int (*bdrv_probe_device)(const char *filename); int (*bdrv_open)(BlockDriverState *bs, int flags); + +/* For handling image reopen for split or non-split files */ +int (*bdrv_reopen_prepare)(BlockDriverState *bs, + BDRVReopenState **prs, + int flags); +void (*bdrv_reopen_commit)(BlockDriverState *bs, BDRVReopenState *rs); +void (*bdrv_reopen_abort)(BlockDriverState *bs, BDRVReopenState *rs); int (*bdrv_file_open)(BlockDriverState *bs, const char *filename, int flags); int (*bdrv_read)(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors); @@ -299,6 +306,10 @@ struct BlockDriverState { BlockJob *job; }; +struct BDRVReopenState { +BlockDriverState *bs; +}; + struct BlockDriverAIOCB { AIOPool *pool; BlockDriverState *bs; Index: qemu/qemu-common.h === --- qemu.orig/qemu-common.h +++ qemu/qemu-common.h @@ -210,6 +210,7 @@ typedef struct NICInfo NICInfo; typedef struct HCIInfo HCIInfo; typedef struct AudioState AudioState; typedef struct BlockDriverState BlockDriverState; +typedef struct BDRVReopenState BDRVReopenState; typedef struct DriveInfo DriveInfo; typedef struct DisplayState DisplayState; typedef struct DisplayChangeListener DisplayChangeListener; Index: qemu/block.h === --- qemu.orig/block.h +++ qemu/block.h @@ -120,6 +120,9 @@ int bdrv_file_open(BlockDriverState **pb int bdrv_open(BlockDriverState *bs, const char *filename, int flags, BlockDriver *drv); int bdrv_reopen(BlockDriverState *bs, int bdrv_flags); +int bdrv_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, int flags); +void bdrv_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs); +void bdrv_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs); void bdrv_close(BlockDriverState *bs); int bd
[Qemu-devel] [RFC Patch 3/7]Qemu: Cmd "block_set_hostcache" for dynamic cache change
New command "block_set_hostcache" added for dynamically changing host pagecache setting of a block device. Usage: block_set_hostcache = block device = on/off Example: (qemu) block_set_hostcache ide0-hd0 off Signed-off-by: Supriya Kannery --- block.c | 54 ++ block.h |2 ++ blockdev.c | 26 ++ blockdev.h |2 ++ hmp-commands.hx | 14 ++ qmp-commands.hx | 27 +++ 6 files changed, 125 insertions(+) Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -808,6 +808,35 @@ unlink_and_fail: return ret; } +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags) +{ +BlockDriver *drv = bs->drv; +int ret = 0, open_flags; + +/* Quiesce IO for the given block device */ +qemu_aio_flush(); +ret = bdrv_flush(bs); +if (ret != 0) { +qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); +return ret; +} +open_flags = bs->open_flags; +bdrv_close(bs); + +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); +if (ret < 0) { +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret < 0) { +/* Reopen failed with orig and modified flags */ +abort(); +} +} + +return ret; +} + void bdrv_close(BlockDriverState *bs) { if (bs->drv) { @@ -870,6 +899,33 @@ void bdrv_drain_all(void) } } +int bdrv_change_hostcache(BlockDriverState *bs, bool enable_host_cache) +{ +int bdrv_flags = bs->open_flags; + +/* set hostcache flags (without changing WCE/flush bits) */ +if (enable_host_cache) { +bdrv_flags &= ~BDRV_O_NOCACHE; +} else { +bdrv_flags |= BDRV_O_NOCACHE; +} + +/* If no change in flags, no need to reopen */ +if (bdrv_flags == bs->open_flags) { +return 0; +} + +if (bdrv_is_inserted(bs)) { +/* Reopen file with changed set of flags */ +bdrv_flags &= ~BDRV_O_CACHE_WB; +return bdrv_reopen(bs, bdrv_flags); +} else { +/* Save hostcache change for future use */ +bs->open_flags = bdrv_flags; +return 0; +} +} + /* make a BlockDriverState anonymous by removing from bdrv_state list. Also, NULL terminate the device_name to prevent double remove */ void bdrv_make_anon(BlockDriverState *bs) Index: qemu/block.h === --- qemu.orig/block.h +++ qemu/block.h @@ -119,6 +119,7 @@ int bdrv_parse_cache_flags(const char *m int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags); int bdrv_open(BlockDriverState *bs, const char *filename, int flags, BlockDriver *drv); +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags); void bdrv_close(BlockDriverState *bs); int bdrv_attach_dev(BlockDriverState *bs, void *dev); void bdrv_attach_dev_nofail(BlockDriverState *bs, void *dev); @@ -160,6 +161,7 @@ void bdrv_commit_all(void); int bdrv_change_backing_file(BlockDriverState *bs, const char *backing_file, const char *backing_fmt); void bdrv_register(BlockDriver *bdrv); +int bdrv_change_hostcache(BlockDriverState *bs, bool enable_host_cache); typedef struct BdrvCheckResult { Index: qemu/blockdev.c === --- qemu.orig/blockdev.c +++ qemu/blockdev.c @@ -1080,3 +1080,29 @@ BlockJobInfoList *qmp_query_block_jobs(E bdrv_iterate(do_qmp_query_block_jobs_one, &prev); return dummy.next; } + + +/* + * Change host page cache setting while guest is running. +*/ +int do_block_set_hostcache(Monitor *mon, const QDict *qdict, + QObject **ret_data) +{ +BlockDriverState *bs = NULL; +int enable; +const char *device; + +/* Validate device */ +device = qdict_get_str(qdict, "device"); +bs = bdrv_find(device); +if (!bs) { +qerror_report(QERR_DEVICE_NOT_FOUND, device); +return -1; +} + +/* Read hostcache setting */ +enable = qdict_get_bool(qdict, "option"); +return bdrv_change_hostcache(bs, enable); + +} + Index: qemu/blockdev.h === --- qemu.orig/blockdev.h +++ qemu/blockdev.h @@ -62,4 +62,6 @@ void qmp_change_blockdev(const char *dev bool has_format, const char *format, Error **errp); void do_commit(Monitor *mon, const QDict *qdict); int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data); +int do_block_set_hostcache(Monitor *mon, const QDict *qdict, +
[Qemu-devel] [RFC Patch 1/7]Qemu: Enhance "info block" to display host cache setting
Enhance "info block" to display hostcache setting for each block device. Example: (qemu) info block ide0-hd0: removable=0 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Enhanced to display "hostcache" setting: (qemu) info block ide0-hd0: removable=0 hostcache=1 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Signed-off-by: Supriya Kannery --- block.c | 20 qmp-commands.hx |2 ++ 2 files changed, 18 insertions(+), 4 deletions(-) Index: qemu/qapi-schema.json === --- qemu.orig/qapi-schema.json +++ qemu/qapi-schema.json @@ -423,6 +423,8 @@ # @locked: True if the guest has locked this device from having its media # removed # +# @hostcache: True if host pagecache is enabled. +# # @tray_open: #optional True if the device has a tray and it is open # (only present if removable is true) # @@ -436,7 +438,7 @@ ## { 'type': 'BlockInfo', 'data': {'device': 'str', 'type': 'str', 'removable': 'bool', - 'locked': 'bool', '*inserted': 'BlockDeviceInfo', + 'locked': 'bool','hostcache': 'bool', '*inserted': 'BlockDeviceInfo', '*tray_open': 'bool', '*io-status': 'BlockDeviceIoStatus'} } ## Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -2285,6 +2285,7 @@ BlockInfoList *qmp_query_block(Error **e info->value->device = g_strdup(bs->device_name); info->value->type = g_strdup("unknown"); info->value->locked = bdrv_dev_is_medium_locked(bs); +info->value->hostcache = !(bs->open_flags & BDRV_O_NOCACHE); info->value->removable = bdrv_dev_has_removable_media(bs); if (bdrv_dev_has_removable_media(bs)) { Index: qemu/hmp.c === --- qemu.orig/hmp.c +++ qemu/hmp.c @@ -209,6 +209,8 @@ void hmp_info_block(Monitor *mon) monitor_printf(mon, " tray-open=%d", info->value->tray_open); } +monitor_printf(mon, " hostcache=%d", info->value->hostcache); + if (info->value->has_io_status) { monitor_printf(mon, " io-status=%s", BlockDeviceIoStatus_lookup[info->value->io_status]);
[Qemu-devel] [RFC Patch 2/7]Qemu: Error classes for file reopen and data sync failure
New error classes defined for file reopen failure and data sync error Signed-off-by: Supriya Kannery --- qerror.c |8 qerror.h |6 ++ 2 files changed, 14 insertions(+) Index: qemu/qerror.c === --- qemu.orig/qerror.c +++ qemu/qerror.c @@ -108,6 +108,14 @@ static const QErrorStringTable qerror_ta .desc = "Device '%(device)' has multiple child busses", }, { +.error_fmt = QERR_DATA_SYNC_FAILED, +.desc = "Syncing of data failed for device '%(device)'", +}, +{ +.error_fmt = QERR_REOPEN_FILE_FAILED, +.desc = "Could not reopen '%(filename)'", +}, +{ .error_fmt = QERR_DEVICE_NO_BUS, .desc = "Device '%(device)' has no child bus", }, Index: qemu/qerror.h === --- qemu.orig/qerror.h +++ qemu/qerror.h @@ -117,6 +117,9 @@ QError *qobject_to_qerror(const QObject #define QERR_DEVICE_NOT_FOUND \ "{ 'class': 'DeviceNotFound', 'data': { 'device': %s } }" +#define QERR_DATA_SYNC_FAILED \ +"{ 'class': 'DataSyncFailed', 'data': { 'device': %s } }" + #define QERR_DEVICE_NOT_REMOVABLE \ "{ 'class': 'DeviceNotRemovable', 'data': { 'device': %s } }" @@ -180,6 +183,9 @@ QError *qobject_to_qerror(const QObject #define QERR_PERMISSION_DENIED \ "{ 'class': 'PermissionDenied', 'data': {} }" +#define QERR_REOPEN_FILE_FAILED \ +"{ 'class': 'ReopenFileFailed', 'data': { 'filename': %s } }" + #define QERR_PROPERTY_NOT_FOUND \ "{ 'class': 'PropertyNotFound', 'data': { 'device': %s, 'property': %s } }"
[Qemu-devel] [RFC Patch 0/7]Qemu: Dynamic host pagecache change
For changing host pagecache setting of a running VM, it is important to have a safe way of reopening its image file. Following patchset introduces: * a generic way to reopen image files safely. In this approach, before reopening an image, for each block driver, its state will be stashed. Incase preparation (bdrv_reopen_prepare) for reopening returns success, the stashed state will be cleared (bdrv_reopen_commit) and reopened state will be used further. Incase preparation of reopening returns failure, the state of the driver will be rolled back (bdrv_reopen_abort) to the stashed state. This approach is extended to raw-posix, raw-win32 and vmdk block drivers in this patchset. Once this is reviewed and finalised, I will extend the implementation to other drivers like qcow2, qed etc.. * qmp and hmp command 'block_set_hostcache' using which host pagecache setting for a block device can be changed when the VM is running. * BDRVReopenState, a generic structure which can be extended by each of the block drivers to reopen respective image files. ToDo: * memcpy is used to save driver state. Replace this with copying individual fields of driver state (?) * Extend this implementation to other block drivers. * Build and verify raw-win32 driver changes in windows Earlier discussions related to dynamic change of host pagecache can be found at: http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg01482.html New block command added: "block_set_hostcache" -- Sets hostcache parameter for block device while guest is running. Usage: block_set_hostcache = block device = on/off qemu/block.c | 112 + qemu/block.h |5 + qemu/block/raw-posix.c | 74 qemu/block/raw-win32.c | 95 + qemu/block/raw.c | 20 qemu/block/vmdk.c | 80 +-- qemu/block_int.h | 11 qemu/blockdev.c| 26 +++ qemu/blockdev.h|2 qemu/hmp-commands.hx | 14 ++ qemu/hmp.c |2 qemu/qapi-schema.json |4 + qemu/qemu-common.h |1 qemu/qerror.c |8 +++ qemu/qerror.h |6 ++ qemu/qmp-commands.hx | 27 +++ 18 files changed, 474 insertions(+), 13 deletions(-)
Re: [Qemu-devel] [v9 Patch 5/6]Qemu: Framework for reopening images safely
On 11/22/2011 05:19 PM, Stefan Hajnoczi wrote: On Tue, Nov 22, 2011 at 11:16 AM, supriya kannery wrote: Kevin Wolf wrote: Am 22.11.2011 11:24, schrieb supriya kannery: How does VMDK implement its prepare/commit/abort? It needs to use the "public" bdrv_reopen_prepare() function on its image files. bdrv_reopen() is the public interface which gets called by any of the image formats. So VMDK or any image format has to call bdrv_reopen which decides to call driver specific prepare/commit/abort or simply close and reopen the file. No, that doesn't work. In order to get all-or-nothing semantics, you need to explicitly prepare all child images and only when you know the results of all preparations, you can decide whether to commit or abort all. bdrv_reopen_prepare/commit/abort will be implemented specific to VMDK in vmdk.c. Then for vmdk, drv->bdrv_reopen_prepare() will handle preparing child images and return success to bdrv_reopen () only if all of them get prepared successfully. The prepare/commit/abort concept we took up considering vmdk's special case of multiple files. So it is bdrv_reopen() which is public and called by hostcache change request for any of the image formats. It then routes the processing to respective prepare/commit/abort implemented by the drivers, including VMDK. In cases where drivers don't have their own implementation, default route is taken which is simply closing and opening the file. VMDK must call bdrv_reopen_prepare()/bdrv_reopen_commit()/bdrv_reopen_abort() on its child images in order to support aborting when there is a failure half-way through. If it used bdrv_reopen() on its child images then it could not roll back later when there is a failure on the next child. I got both your's and Kevin's point now, after looking into vmdk.c code related to extents. My initial understanding was bdrv_reopen_prepare() implemented inside vmdk.c can be called for handling reopening of multiple child images. But observing how vmdk_open() is implemented for child images, I should follow that method. So will make bdrv_reopen_prepare()/commit/abort in block.c to be used by vmdk as well as other required image formats (like how raw.c is using them). - thanks, Supriya
Re: [Qemu-devel] [v9 Patch 6/6]Qemu: raw posix implementation of reopen functions
Stefan Hajnoczi wrote: On Tue, Nov 22, 2011 at 9:45 AM, supriya kannery wrote: supriya kannery wrote: Stefan Hajnoczi wrote: On Fri, Nov 11, 2011 at 6:48 AM, Supriya Kannery wrote: +} +if ((flags & BDRV_O_NOCACHE)) { +raw_rs->reopen_state.reopen_flags |= O_DIRECT; +} else { +raw_rs->reopen_state.reopen_flags &= ~O_DIRECT; +} +ret = fcntl_setfl(raw_rs->reopen_fd, raw_rs->reopen_state.reopen_flags); I wonder if this works on Solaris, FreeBSD, etc? Perhaps there needs to be a fallback to the missing "else" case below... ok. Will look into whether this will work on Solaris, FreeBSD etc.. This should work for all non-win Oses. I have tested only in x86. #ifndef _WIN32 /* Sets a specific flag */ int fcntl_setfl(int fd, int flag) { int flags; flags = fcntl(fd, F_GETFL); Are you sure POSIX guarantees that O_DIRECT can be changed with F_SETFL? I didn't find any statement in the specification. It is possible that this code compiles but does not actually work on non-Linux OSes. Did you run tests? I don't have FreeBSD and Solaris systems to use. Referred the following man page links to verify that O_DIRECT in these OSes can be changed using fcntl. http://fuse4bsd.creo.hu/localcgi/man-cgi.cgi?fcntl+2 http://man-wiki.net/index.php/2:fcntl If anybody with these systems can confirm, that would be very helpful. -thanks, Supriya Stefan
Re: [Qemu-devel] [v9 Patch 5/6]Qemu: Framework for reopening images safely
Kevin Wolf wrote: Am 22.11.2011 11:24, schrieb supriya kannery: Stefan Hajnoczi wrote: On Mon, Nov 21, 2011 at 12:13 PM, supriya kannery wrote: Stefan Hajnoczi wrote: On Fri, Nov 11, 2011 at 6:48 AM, Supriya Kannery wrote: @@ -708,17 +731,31 @@ int bdrv_reopen(BlockDriverState *bs, in qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); return ret; } -open_flags = bs->open_flags; -bdrv_close(bs); -ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); -if (ret < 0) { -/* Reopen failed. Try to open with original flags */ -qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); -ret = bdrv_open(bs, bs->filename, open_flags, drv); +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen_prepare) { This seems weird to me because we're saying a driver may have drv->bdrv_reopen_prepare == NULL but the public bdrv_reopen_prepare() function doesn't check and return -ENOTSUP. If drv->bdrv_reopen_prepare == NULL , then we are not calling the publick bdrv_reopen_prepare at all. Unless we later call public bdrv_reopen_prepare from elsewhere without checking drv->bdrv_reopen_prepare, a check for -ENOTSUP inside the public one is not needed right? Also, we are handling reopening for even those drivers which don't have its own bdrv_reopen_prepare defined, by taking the "else" control path. So condition for reporting "ENOTSUP" shouldn't come up as of now. Please let me know your thoughts. How does VMDK implement its prepare/commit/abort? It needs to use the "public" bdrv_reopen_prepare() function on its image files. bdrv_reopen() is the public interface which gets called by any of the image formats. So VMDK or any image format has to call bdrv_reopen which decides to call driver specific prepare/commit/abort or simply close and reopen the file. No, that doesn't work. In order to get all-or-nothing semantics, you need to explicitly prepare all child images and only when you know the results of all preparations, you can decide whether to commit or abort all. bdrv_reopen_prepare/commit/abort will be implemented specific to VMDK in vmdk.c. Then for vmdk, drv->bdrv_reopen_prepare() will handle preparing child images and return success to bdrv_reopen () only if all of them get prepared successfully. The prepare/commit/abort concept we took up considering vmdk's special case of multiple files. So it is bdrv_reopen() which is public and called by hostcache change request for any of the image formats. It then routes the processing to respective prepare/commit/abort implemented by the drivers, including VMDK. In cases where drivers don't have their own implementation, default route is taken which is simply closing and opening the file. - thanks, Supriya Kevin
Re: [Qemu-devel] [v9 Patch 5/6]Qemu: Framework for reopening images safely
Stefan Hajnoczi wrote: On Mon, Nov 21, 2011 at 12:13 PM, supriya kannery wrote: Stefan Hajnoczi wrote: On Fri, Nov 11, 2011 at 6:48 AM, Supriya Kannery wrote: @@ -708,17 +731,31 @@ int bdrv_reopen(BlockDriverState *bs, in qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); return ret; } -open_flags = bs->open_flags; -bdrv_close(bs); -ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); -if (ret < 0) { -/* Reopen failed. Try to open with original flags */ -qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); -ret = bdrv_open(bs, bs->filename, open_flags, drv); +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen_prepare) { This seems weird to me because we're saying a driver may have drv->bdrv_reopen_prepare == NULL but the public bdrv_reopen_prepare() function doesn't check and return -ENOTSUP. If drv->bdrv_reopen_prepare == NULL , then we are not calling the publick bdrv_reopen_prepare at all. Unless we later call public bdrv_reopen_prepare from elsewhere without checking drv->bdrv_reopen_prepare, a check for -ENOTSUP inside the public one is not needed right? Also, we are handling reopening for even those drivers which don't have its own bdrv_reopen_prepare defined, by taking the "else" control path. So condition for reporting "ENOTSUP" shouldn't come up as of now. Please let me know your thoughts. How does VMDK implement its prepare/commit/abort? It needs to use the "public" bdrv_reopen_prepare() function on its image files. bdrv_reopen() is the public interface which gets called by any of the image formats. So VMDK or any image format has to call bdrv_reopen which decides to call driver specific prepare/commit/abort or simply close and reopen the file. BTW I think the bdrv_reopen_*() functions should go in block_int.h and not block.h. They are visible to the block layer but not public to the rest of QEMU, which must use the bdrv_reopen() interface only. I think what's really missing is a way to tie this all together. You have posted raw format and raw-posix protocol patches. But we need to cover image formats, where VMDK is the multi-file special case and qcow2/qed/etc are simpler but also need to be supported. Right now anything but raw-posix is still closing and reopening. By adding support for image formats I think you'll find the right way to structure this code. Since only bdrv_reopen() is public, it is declared in block.h and structure of code done in similar way how bdrv_open() is done. The else part in bdrv_reopen() will handle reopen requests for images other than raw for now (simply close and reopen). -thanks, Supriya Stefan
Re: [Qemu-devel] [v9 Patch 6/6]Qemu: raw posix implementation of reopen functions
supriya kannery wrote: Stefan Hajnoczi wrote: On Fri, Nov 11, 2011 at 6:48 AM, Supriya Kannery wrote: +} +if ((flags & BDRV_O_NOCACHE)) { +raw_rs->reopen_state.reopen_flags |= O_DIRECT; +} else { +raw_rs->reopen_state.reopen_flags &= ~O_DIRECT; +} +ret = fcntl_setfl(raw_rs->reopen_fd, raw_rs->reopen_state.reopen_flags); I wonder if this works on Solaris, FreeBSD, etc? Perhaps there needs to be a fallback to the missing "else" case below... ok. Will look into whether this will work on Solaris, FreeBSD etc.. This should work for all non-win Oses. I have tested only in x86. #ifndef _WIN32 /* Sets a specific flag */ int fcntl_setfl(int fd, int flag) { int flags; flags = fcntl(fd, F_GETFL); - thanks, Supriya
Re: [Qemu-devel] [v9 Patch 4/6]Qemu: Add commandline -drive option 'hostcache'
Stefan Hajnoczi wrote: On Mon, Nov 21, 2011 at 12:28 PM, supriya kannery wrote: Stefan Hajnoczi wrote: On Thu, Nov 17, 2011 at 5:18 AM, Supriya Kannery wrote: On 11/17/2011 01:36 AM, Stefan Hajnoczi wrote: On Fri, Nov 11, 2011 at 6:48 AM, Supriya Kannery wrote: +if ((hostcache = qemu_opt_get_bool(opts, "hostcache", -1)) != -1) { This does not work. qemu_opt_get_bool() takes a bool default argument and returns a bool. (bool)-1 == true. But (int)true == 1 and you cannot expect it to ever equal -1. Try this: if (qemu_opt_get(opts, "hostcache")&& !qemu_opt_get_bool(opts, "hostcache", false)) { bdrv_flags |= BDRV_O_NOCACHE; } Stefan Thanks! for pointing this. Does the following look ok? if ((hostcache = qemu_opt_get_bool(opts, "hostcache", 1) == 0) { bdrv_flags |= BDRV_O_NOCACHE; } If either "hostcache" is not at all specified or it is specified as "on", qemu_opt_get_bool will return 1, which can be ignored as bdrv_flags is initialized to 0. It depends on the overall way this should work. I think this captures all the cases: 1. cache= and hostcache= may not be used together. 2. cache= sets bdrv_flags. 3. hostcache= may |= BDRV_O_NOCACHE. 4. No option defaults to cache=writethrough (bdrv_flags &= ~BDRV_O_CACHE_MASK). The code you posted will work although I find it a bit weird how it also includes case #4. IMO it's cleanest to just do case #3 by testing whether or not the hostcache= option is set. BTW, is there a check for case #1 in your patch series. I thought I saw one earlier but now I can't find it. Following is what I tried to accomplish: 1. cache= and hostcache= may be used together. Cache= will override hostcache= if specified together This condition can be retained till cache-xx can be completely replaced by combinations of separate cmdline options defined for setting hostcache, flush, and WCE [I think I can change the current implementation to have Cache=xx ORed with hostcache=yy if they are specified together instead of just ignoring hostcache= ] Okay, I can see that line of reasoning but then hostcache= does not provide much value on the command-line. Perhaps it's better to drop this patch and not merge a new -drive option until the guest-visible write cache enable support is also merged? Let us have the implementation for hostcache= as command line option, with the condition that if both cache= and hostcache= are specified together, then depending upon enable/disable value specified for hostcache, corresponding bit in cache flags can be set/reset. That way, there is a value add on specifying hostcache in cmdline as well as user will be able to control hostcache value from cmdline itself. The interesting feature that this series adds is changing of hostcache at run-time. Yes.. will resume with dynamic hostcache change part to make it usable by qemu - thanks, Supriya Stefan
Re: [Qemu-devel] [v9 Patch 6/6]Qemu: raw posix implementation of reopen functions
Stefan Hajnoczi wrote: On Fri, Nov 11, 2011 at 6:48 AM, Supriya Kannery wrote: +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVRawReopenState *raw_rs = g_malloc0(sizeof(BDRVRawReopenState)); +BDRVRawState *s = bs->opaque; +int ret = 0; + +raw_rs->reopen_state.reopen_flags = s->open_flags; +raw_rs->reopen_state.bs = bs; +raw_rs->reopen_fd = -1; +*prs = &(raw_rs->reopen_state); + +/* Flags that can be set using fcntl */ +int fcntl_flags = BDRV_O_NOCACHE; + +if ((bs->open_flags & ~fcntl_flags) == (flags & ~fcntl_flags)) { +raw_rs->reopen_fd = dup(s->fd); +if (raw_rs->reopen_fd <= 0) { +return -1; return -errno; +} +if ((flags & BDRV_O_NOCACHE)) { +raw_rs->reopen_state.reopen_flags |= O_DIRECT; +} else { +raw_rs->reopen_state.reopen_flags &= ~O_DIRECT; +} +ret = fcntl_setfl(raw_rs->reopen_fd, raw_rs->reopen_state.reopen_flags); I wonder if this works on Solaris, FreeBSD, etc? Perhaps there needs to be a fallback to the missing "else" case below... ok. Will look into whether this will work on Solaris, FreeBSD etc.. +} else { + +/* TBD: Handle O_DSYNC and other flags. For now return error */ +ret = -1; ...and this needs to be implemented. ok +} +return ret; +} Stefan
Re: [Qemu-devel] [v9 Patch 4/6]Qemu: Add commandline -drive option 'hostcache'
Stefan Hajnoczi wrote: On Thu, Nov 17, 2011 at 5:18 AM, Supriya Kannery wrote: On 11/17/2011 01:36 AM, Stefan Hajnoczi wrote: On Fri, Nov 11, 2011 at 6:48 AM, Supriya Kannery wrote: +if ((hostcache = qemu_opt_get_bool(opts, "hostcache", -1)) != -1) { This does not work. qemu_opt_get_bool() takes a bool default argument and returns a bool. (bool)-1 == true. But (int)true == 1 and you cannot expect it to ever equal -1. Try this: if (qemu_opt_get(opts, "hostcache")&& !qemu_opt_get_bool(opts, "hostcache", false)) { bdrv_flags |= BDRV_O_NOCACHE; } Stefan Thanks! for pointing this. Does the following look ok? if ((hostcache = qemu_opt_get_bool(opts, "hostcache", 1) == 0) { bdrv_flags |= BDRV_O_NOCACHE; } If either "hostcache" is not at all specified or it is specified as "on", qemu_opt_get_bool will return 1, which can be ignored as bdrv_flags is initialized to 0. It depends on the overall way this should work. I think this captures all the cases: 1. cache= and hostcache= may not be used together. 2. cache= sets bdrv_flags. 3. hostcache= may |= BDRV_O_NOCACHE. 4. No option defaults to cache=writethrough (bdrv_flags &= ~BDRV_O_CACHE_MASK). The code you posted will work although I find it a bit weird how it also includes case #4. IMO it's cleanest to just do case #3 by testing whether or not the hostcache= option is set. BTW, is there a check for case #1 in your patch series. I thought I saw one earlier but now I can't find it. Following is what I tried to accomplish: 1. cache= and hostcache= may be used together. Cache= will override hostcache= if specified together This condition can be retained till cache-xx can be completely replaced by combinations of separate cmdline options defined for setting hostcache, flush, and WCE [I think I can change the current implementation to have Cache=xx ORed with hostcache=yy if they are specified together instead of just ignoring hostcache= ] 2. If only cache= specified sets bdrv_flags. 3. If only hostcache= specified, bdrv_flags |= BDRV_O_NOCACHE 4. No option defaults to cache=writethrough (bdrv_flags &= ~BDRV_O_CACHE_MASK). So the check I had earlier for case #1 in your list, I changed that to allow both the options to co-exist. -thanks, Supriya Stefan
Re: [Qemu-devel] [v9 Patch 5/6]Qemu: Framework for reopening images safely
Stefan Hajnoczi wrote: On Fri, Nov 11, 2011 at 6:48 AM, Supriya Kannery wrote: @@ -708,17 +731,31 @@ int bdrv_reopen(BlockDriverState *bs, in qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); return ret; } -open_flags = bs->open_flags; -bdrv_close(bs); -ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); -if (ret < 0) { -/* Reopen failed. Try to open with original flags */ -qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); -ret = bdrv_open(bs, bs->filename, open_flags, drv); +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen_prepare) { This seems weird to me because we're saying a driver may have drv->bdrv_reopen_prepare == NULL but the public bdrv_reopen_prepare() function doesn't check and return -ENOTSUP. If drv->bdrv_reopen_prepare == NULL , then we are not calling the publick bdrv_reopen_prepare at all. Unless we later call public bdrv_reopen_prepare from elsewhere without checking drv->bdrv_reopen_prepare, a check for -ENOTSUP inside the public one is not needed right? Also, we are handling reopening for even those drivers which don't have its own bdrv_reopen_prepare defined, by taking the "else" control path. So condition for reporting "ENOTSUP" shouldn't come up as of now. Please let me know your thoughts. This check can be moved into bdrv_reopen_prepare(). We can test for the -ENOTSUP return value here instead. +ret = bdrv_reopen_prepare(bs, &reopen_state, bdrv_flags); + if (ret < 0) { Indentation is off here. sure..will take care in next version. Stefan
Re: [Qemu-devel] [v9 Patch 3/6]Qemu: Cmd "block_set_hostcache" for dynamic cache change
Luiz Capitulino wrote: On Fri, 11 Nov 2011 12:17:48 +0530 Supriya Kannery wrote: + +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); +if (ret < 0) { +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret < 0) { +/* Reopen failed with orig and modified flags */ +abort(); +} +} + +return ret; +} In this thread: http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg01271.html Juan uses a similar method (well, at least it looks similar to me) to fix a problem with migration. However, it was said that that method can cause problems with -snapshot and encrypted images. Won't we have the same sort of problems with this series? Thanks! for this link. Yes, this series also need to look at such issues. Will go through the discussions and see what needs to be looked at. +if (bdrv_is_inserted(bs)) { +/* Reopen file with changed set of flags */ +return bdrv_reopen(bs, bdrv_flags); +} else { +/* Save hostcache change for future use */ +bs->open_flags = bdrv_flags; +return 0; +} I'm wondering if the simplest (and best) thing to do here is to fail if the drive is not inserted. Just wondering, not exactly asking you to change it. But it should at least be clearly documented if you keep it. I initially started the patchset with returning error message on detecting uninserted drive. But later thought that, user should be given the flexibility of changing hostcache setting irrespective of whether drive is full or not. To what I understand, hostcache setting is independent of what user puts in to the drive. Need to make these saved status flags used when opening a newly inserted device. +/* Read hostcache setting */ +enable = qdict_get_bool(qdict, "option"); +return bdrv_change_hostcache(bs, enable); + +} This is not using the QAPI. will change to use QAPI. + Index: qemu/blockdev.h === --- qemu.orig/blockdev.h +++ qemu/blockdev.h @@ -65,5 +65,7 @@ int do_change_block(Monitor *mon, const int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_block_resize(Monitor *mon, const QDict *qdict, QObject **ret_data); +int do_block_set_hostcache(Monitor *mon, const QDict *qdict, + QObject **ret_data); #endif Index: qemu/hmp-commands.hx === --- qemu.orig/hmp-commands.hx +++ qemu/hmp-commands.hx @@ -70,6 +70,20 @@ but should be used with extreme caution. resizes image files, it can not resize block devices like LVM volumes. ETEXI +{ +.name = "block_set_hostcache", +.args_type = "device:B,option:b", "option" is not good enough. I'd call it "enable" or "disable". ok
Re: [Qemu-devel] [v9 Patch 2/6]Qemu: Error classes for file reopen and data sync failure
Luiz Capitulino wrote: On Fri, 11 Nov 2011 12:17:35 +0530 Supriya Kannery wrote: New error classes defined for file reopen failure and data sync error Signed-off-by: Supriya Kannery --- qerror.c |8 qerror.h |6 ++ 2 files changed, 14 insertions(+) Index: qemu/qerror.c === --- qemu.orig/qerror.c +++ qemu/qerror.c @@ -97,6 +97,14 @@ static const QErrorStringTable qerror_ta .desc = "Device '%(device)' is not removable", }, { +.error_fmt = QERR_DATA_SYNC_FAILED, +.desc = "Syncing of data failed for device '%(device)'", +}, +{ +.error_fmt = QERR_REOPEN_FILE_FAILED, +.desc = "Could not reopen '%(filename)'", +}, Is this really needed? I think you could use QERR_OPEN_FILE_FAILED. This could confuse the user, as to what was tried on an already opened file by qemu. So I feel, QERR_REOPEN_FILE_FAILED can bring in better clarity to the message passed to the user. - thanks Supriya
Re: [Qemu-devel] [v9 Patch 1/6 - updated]Qemu: Enhance "info block" to display host cache setting
Luiz Capitulino wrote: On Fri, 11 Nov 2011 15:39:29 +0530 Supriya Kannery wrote: On 11/11/2011 12:17 PM, Supriya Kannery wrote: hostcache gets added to qapi-types.h from the change done in qapi-schema.json. Hence above change has to be ignored. Pls find updated patch. git am says this patch is corrupted. Otherwise the QAPI changes look ok to me. ok, will recheck with git am while sending updated patchset. - thanks, Supriya
Re: [Qemu-devel] [v9 Patch 3/6]Qemu: Cmd "block_set_hostcache" for dynamic cache change
On 11/17/2011 12:04 AM, Stefan Hajnoczi wrote: On Fri, Nov 11, 2011 at 6:47 AM, Supriya Kannery wrote: +{ +.name = "block_set_hostcache", +.args_type = "device:B,option:b", +.params = "device on|off", +.help = "Change setting of host pagecache", +.user_print = monitor_user_noop, +.mhandler.cmd_new = do_block_set_hostcache, +}, +STEXI +@item block_set_hostcache @var{device} @var{setting} @var{option} Will send updated patch +@findex block_set_hostcache +Change host pagecache setting of a block device while guest is running. +ETEXI + { .name = "eject", Index: qemu/qmp-commands.hx === --- qemu.orig/qmp-commands.hx +++ qemu/qmp-commands.hx @@ -716,7 +716,34 @@ Example: EQMP + { +.name = "block_set_hostcache", +.args_type = "device:B,option:b", +.params = "device on|off", +.help = "Change setting of host pagecache (true|false)", It would be more consistent to use "on|off" instead of "true|false". Or eliminate it entirely by saying "Enable or disable host pagecache usage". Stefan Followed similar way how set_link is done. Specified 'true/false' in brackets as 'on' or 'off' are not accepted as bool parameter in qmp prompt.
Re: [Qemu-devel] [v9 Patch 4/6]Qemu: Add commandline -drive option 'hostcache'
On 11/17/2011 01:36 AM, Stefan Hajnoczi wrote: On Fri, Nov 11, 2011 at 6:48 AM, Supriya Kannery wrote: +if ((hostcache = qemu_opt_get_bool(opts, "hostcache", -1)) != -1) { This does not work. qemu_opt_get_bool() takes a bool default argument and returns a bool. (bool)-1 == true. But (int)true == 1 and you cannot expect it to ever equal -1. Try this: if (qemu_opt_get(opts, "hostcache")&& !qemu_opt_get_bool(opts, "hostcache", false)) { bdrv_flags |= BDRV_O_NOCACHE; } Stefan Thanks! for pointing this. Does the following look ok? if ((hostcache = qemu_opt_get_bool(opts, "hostcache", 1) == 0) { bdrv_flags |= BDRV_O_NOCACHE; } If either "hostcache" is not at all specified or it is specified as "on", qemu_opt_get_bool will return 1, which can be ignored as bdrv_flags is initialized to 0.
Re: [Qemu-devel] [v9 Patch 1/6 - updated]Qemu: Enhance "info block" to display host cache setting
On 11/11/2011 12:17 PM, Supriya Kannery wrote: > Enhance "info block" to display hostcache setting for each > block device. > > > ## > Index: qemu/qapi-types.h > === > --- qemu.orig/qapi-types.h > +++ qemu/qapi-types.h > @@ -383,6 +383,7 @@ struct BlockInfo > { > char * device; > char * type; > +bool hostcache; > bool removable; > bool locked; > bool has_inserted; > Index: qemu/block.c hostcache gets added to qapi-types.h from the change done in qapi-schema.json. Hence above change has to be ignored. Pls find updated patch. * Enhance "info block" to display hostcache setting for each block device. Example: (qemu) info block ide0-hd0: removable=0 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Enhanced to display "hostcache" setting: (qemu) info block ide0-hd0: removable=0 hostcache=1 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Signed-off-by: Supriya Kannery --- block.c | 20 qmp-commands.hx |2 ++ 2 files changed, 18 insertions(+), 4 deletions(-) Index: qemu/qapi-schema.json === --- qemu.orig/qapi-schema.json +++ qemu/qapi-schema.json @@ -409,6 +409,8 @@ # @locked: True if the guest has locked this device from having its media # removed # +# @hostcache: True if host pagecache is enabled. +# # @tray_open: #optional True if the device has a tray and it is open # (only present if removable is true) # @@ -422,7 +424,7 @@ ## { 'type': 'BlockInfo', 'data': {'device': 'str', 'type': 'str', 'removable': 'bool', - 'locked': 'bool', '*inserted': 'BlockDeviceInfo', + 'locked': 'bool','hostcache': 'bool', '*inserted': 'BlockDeviceInfo', '*tray_open': 'bool', '*io-status': 'BlockDeviceIoStatus'} } ## Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -1839,6 +1839,7 @@ BlockInfoList *qmp_query_block(Error **e info->value->device = g_strdup(bs->device_name); info->value->type = g_strdup("unknown"); info->value->locked = bdrv_dev_is_medium_locked(bs); +info->value->hostcache = !(bs->open_flags & BDRV_O_NOCACHE); info->value->removable = bdrv_dev_has_removable_media(bs); if (bdrv_dev_has_removable_media(bs)) { Index: qemu/hmp.c === --- qemu.orig/hmp.c +++ qemu/hmp.c @@ -199,6 +199,8 @@ void hmp_info_block(Monitor *mon) monitor_printf(mon, " tray-open=%d", info->value->tray_open); } +monitor_printf(mon, " hostcache=%d", info->value->hostcache); + if (info->value->has_io_status) { monitor_printf(mon, " io-status=%s", BlockDeviceIoStatus_lookup[info->value->io_status]);
[Qemu-devel] [v9 Patch 3/6]Qemu: Cmd "block_set_hostcache" for dynamic cache change
New command "block_set_hostcache" added for dynamically changing host pagecache setting of a block device. Usage: block_set_hostcache = block device = on/off Example: (qemu) block_set_hostcache ide0-hd0 off Signed-off-by: Supriya Kannery --- block.c | 54 ++ block.h |2 ++ blockdev.c | 26 ++ blockdev.h |2 ++ hmp-commands.hx | 14 ++ qmp-commands.hx | 27 +++ 6 files changed, 125 insertions(+) Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -696,6 +696,35 @@ unlink_and_fail: return ret; } +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags) +{ +BlockDriver *drv = bs->drv; +int ret = 0, open_flags; + +/* Quiesce IO for the given block device */ +qemu_aio_flush(); +ret = bdrv_flush(bs); +if (ret != 0) { +qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); +return ret; +} +open_flags = bs->open_flags; +bdrv_close(bs); + +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); +if (ret < 0) { +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret < 0) { +/* Reopen failed with orig and modified flags */ +abort(); +} +} + +return ret; +} + void bdrv_close(BlockDriverState *bs) { if (bs->drv) { @@ -733,6 +762,32 @@ void bdrv_close_all(void) } } +int bdrv_change_hostcache(BlockDriverState *bs, bool enable_host_cache) +{ +int bdrv_flags = bs->open_flags; + +/* set hostcache flags (without changing WCE/flush bits) */ +if (enable_host_cache) { +bdrv_flags &= ~BDRV_O_NOCACHE; +} else { +bdrv_flags |= BDRV_O_NOCACHE; +} + +/* If no change in flags, no need to reopen */ +if (bdrv_flags == bs->open_flags) { +return 0; +} + +if (bdrv_is_inserted(bs)) { +/* Reopen file with changed set of flags */ +return bdrv_reopen(bs, bdrv_flags); +} else { +/* Save hostcache change for future use */ +bs->open_flags = bdrv_flags; +return 0; +} +} + /* make a BlockDriverState anonymous by removing from bdrv_state list. Also, NULL terminate the device_name to prevent double remove */ void bdrv_make_anon(BlockDriverState *bs) Index: qemu/block.h === --- qemu.orig/block.h +++ qemu/block.h @@ -104,6 +104,7 @@ int bdrv_parse_cache_flags(const char *m int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags); int bdrv_open(BlockDriverState *bs, const char *filename, int flags, BlockDriver *drv); +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags); void bdrv_close(BlockDriverState *bs); int bdrv_attach_dev(BlockDriverState *bs, void *dev); void bdrv_attach_dev_nofail(BlockDriverState *bs, void *dev); @@ -138,6 +139,7 @@ void bdrv_commit_all(void); int bdrv_change_backing_file(BlockDriverState *bs, const char *backing_file, const char *backing_fmt); void bdrv_register(BlockDriver *bdrv); +int bdrv_change_hostcache(BlockDriverState *bs, bool enable_host_cache); typedef struct BdrvCheckResult { Index: qemu/blockdev.c === --- qemu.orig/blockdev.c +++ qemu/blockdev.c @@ -776,3 +776,29 @@ int do_block_resize(Monitor *mon, const return 0; } + + +/* + * Change host page cache setting while guest is running. +*/ +int do_block_set_hostcache(Monitor *mon, const QDict *qdict, + QObject **ret_data) +{ +BlockDriverState *bs = NULL; +int enable; +const char *device; + +/* Validate device */ +device = qdict_get_str(qdict, "device"); +bs = bdrv_find(device); +if (!bs) { +qerror_report(QERR_DEVICE_NOT_FOUND, device); +return -1; +} + +/* Read hostcache setting */ +enable = qdict_get_bool(qdict, "option"); +return bdrv_change_hostcache(bs, enable); + +} + Index: qemu/blockdev.h === --- qemu.orig/blockdev.h +++ qemu/blockdev.h @@ -65,5 +65,7 @@ int do_change_block(Monitor *mon, const int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_block_resize(Monitor *mon, const QDict *qdict, QObject **ret_data); +int do_block_set_hostcache(Monitor *mon, const QDict *qdict, + QObject **ret_data); #endif Index: qemu/hmp-commands.hx === --- qem
[Qemu-devel] [v9 Patch 6/6]Qemu: raw posix implementation of reopen functions
raw-posix driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically is handled here. Signed-off-by: Supriya Kannery Index: qemu/block/raw.c === --- qemu.orig/block/raw.c +++ qemu/block/raw.c @@ -9,6 +9,23 @@ static int raw_open(BlockDriverState *bs return 0; } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +return bdrv_reopen_prepare(bs->file, prs, flags); +} + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs, + int flags) +{ +bdrv_reopen_commit(bs->file, rs, flags); +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +bdrv_reopen_abort(bs->file, rs); +} + static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov) { @@ -107,10 +124,12 @@ static BlockDriver bdrv_raw = { /* It's really 0, but we need to make g_malloc() happy */ .instance_size = 1, - .bdrv_open = raw_open, +.bdrv_reopen_prepare += raw_reopen_prepare, +.bdrv_reopen_commit = raw_reopen_commit, +.bdrv_reopen_abort = raw_reopen_abort, .bdrv_close = raw_close, - .bdrv_co_readv = raw_co_readv, .bdrv_co_writev = raw_co_writev, .bdrv_co_flush = raw_co_flush, Index: qemu/block/raw-posix.c === --- qemu.orig/block/raw-posix.c +++ qemu/block/raw-posix.c @@ -136,6 +136,11 @@ typedef struct BDRVRawState { #endif } BDRVRawState; +typedef struct BDRVRawReopenState { +int reopen_fd; +BDRVReopenState reopen_state; +} BDRVRawReopenState; + static int fd_open(BlockDriverState *bs); static int64_t raw_getlength(BlockDriverState *bs); @@ -279,6 +284,70 @@ static int raw_open(BlockDriverState *bs return raw_open_common(bs, filename, flags, 0); } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVRawReopenState *raw_rs = g_malloc0(sizeof(BDRVRawReopenState)); +BDRVRawState *s = bs->opaque; +int ret = 0; + +raw_rs->reopen_state.reopen_flags = s->open_flags; +raw_rs->reopen_state.bs = bs; +raw_rs->reopen_fd = -1; +*prs = &(raw_rs->reopen_state); + +/* Flags that can be set using fcntl */ +int fcntl_flags = BDRV_O_NOCACHE; + +if ((bs->open_flags & ~fcntl_flags) == (flags & ~fcntl_flags)) { +raw_rs->reopen_fd = dup(s->fd); +if (raw_rs->reopen_fd <= 0) { +return -1; +} +if ((flags & BDRV_O_NOCACHE)) { +raw_rs->reopen_state.reopen_flags |= O_DIRECT; +} else { +raw_rs->reopen_state.reopen_flags &= ~O_DIRECT; +} +ret = fcntl_setfl(raw_rs->reopen_fd, raw_rs->reopen_state.reopen_flags); +} else { + +/* TBD: Handle O_DSYNC and other flags. For now return error */ +ret = -1; +} +return ret; +} + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs, + int flags) +{ +BDRVRawReopenState *raw_rs; +BDRVRawState *s = bs->opaque; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +/* Set new flags, so replace old fd with new one */ +close(s->fd); +s->fd = raw_rs->reopen_fd; +s->open_flags = rs->reopen_flags; +bs->open_flags = flags; + +g_free(raw_rs); + +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BDRVRawReopenState *raw_rs; + +raw_rs = container_of(rs, BDRVRawReopenState, reopen_state); + +if (raw_rs->reopen_fd != -1) { +close(raw_rs->reopen_fd); +} +g_free(raw_rs); +} + /* XXX: use host sector size if necessary with: #ifdef DIOCGSECTORSIZE { @@ -631,6 +700,9 @@ static BlockDriver bdrv_file = { .instance_size = sizeof(BDRVRawState), .bdrv_probe = NULL, /* no probe for protocols */ .bdrv_file_open = raw_open, +.bdrv_reopen_prepare = raw_reopen_prepare, +.bdrv_reopen_commit = raw_reopen_commit, +.bdrv_reopen_abort = raw_reopen_abort, .bdrv_close = raw_close, .bdrv_create = raw_create, .bdrv_co_discard = raw_co_discard,
[Qemu-devel] [v9 Patch 5/6]Qemu: Framework for reopening images safely
Struct BDRVReopenState along with three reopen related functions introduced for handling reopen state of images safely. This can be extended by each of the block drivers to reopen respective image files. Signed-off-by: Supriya Kannery Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -696,10 +696,33 @@ unlink_and_fail: return ret; } +int bdrv_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, int flags) +{ + BlockDriver *drv = bs->drv; + + return drv->bdrv_reopen_prepare(bs, prs, flags); +} + +void bdrv_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs, int flags) +{ +BlockDriver *drv = bs->drv; + +drv->bdrv_reopen_commit(bs, rs, flags); +bs->open_flags = flags; +} + +void bdrv_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BlockDriver *drv = bs->drv; + +drv->bdrv_reopen_abort(bs, rs); +} + int bdrv_reopen(BlockDriverState *bs, int bdrv_flags) { BlockDriver *drv = bs->drv; int ret = 0, open_flags; +BDRVReopenState *reopen_state = NULL; /* Quiesce IO for the given block device */ qemu_aio_flush(); @@ -708,17 +731,31 @@ int bdrv_reopen(BlockDriverState *bs, in qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); return ret; } -open_flags = bs->open_flags; -bdrv_close(bs); -ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); -if (ret < 0) { -/* Reopen failed. Try to open with original flags */ -qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); -ret = bdrv_open(bs, bs->filename, open_flags, drv); +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen_prepare) { +ret = bdrv_reopen_prepare(bs, &reopen_state, bdrv_flags); + if (ret < 0) { +bdrv_reopen_abort(bs, reopen_state); +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +return ret; +} + +bdrv_reopen_commit(bs, reopen_state, bdrv_flags); + +} else { + open_flags = bs->open_flags; + bdrv_close(bs); + + ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); if (ret < 0) { -/* Reopen failed with orig and modified flags */ -abort(); +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret < 0) { +/* Reopen failed with orig and modified flags */ +bs->drv = NULL; +} } } Index: qemu/block_int.h === --- qemu.orig/block_int.h +++ qemu/block_int.h @@ -56,6 +56,14 @@ struct BlockDriver { int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename); int (*bdrv_probe_device)(const char *filename); int (*bdrv_open)(BlockDriverState *bs, int flags); + +/* For handling image reopen for split or non-split files */ +int (*bdrv_reopen_prepare)(BlockDriverState *bs, + BDRVReopenState **prs, + int flags); +void (*bdrv_reopen_commit)(BlockDriverState *bs, BDRVReopenState *rs, + int flags); +void (*bdrv_reopen_abort)(BlockDriverState *bs, BDRVReopenState *rs); int (*bdrv_file_open)(BlockDriverState *bs, const char *filename, int flags); int (*bdrv_read)(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors); @@ -213,6 +221,11 @@ struct BlockDriverState { void *private; }; +struct BDRVReopenState { +BlockDriverState *bs; +int reopen_flags; +}; + struct BlockDriverAIOCB { AIOPool *pool; BlockDriverState *bs; Index: qemu/qemu-common.h === --- qemu.orig/qemu-common.h +++ qemu/qemu-common.h @@ -203,6 +203,7 @@ typedef struct NICInfo NICInfo; typedef struct HCIInfo HCIInfo; typedef struct AudioState AudioState; typedef struct BlockDriverState BlockDriverState; +typedef struct BDRVReopenState BDRVReopenState; typedef struct DriveInfo DriveInfo; typedef struct DisplayState DisplayState; typedef struct DisplayChangeListener DisplayChangeListener; Index: qemu/block.h === --- qemu.orig/block.h +++ qemu/block.h @@ -105,6 +105,9 @@ int bdrv_file_open(BlockDriverState **pb int bdrv_open(BlockDriverState *bs, const char *filename, int flags, BlockDriver *drv); int bdrv_reopen(BlockDriverState *bs, int bdrv_flags); +int bdrv_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, int flags); +void bdrv_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs, int flags); +void bd
[Qemu-devel] [v9 Patch 4/6]Qemu: Add commandline -drive option 'hostcache'
qemu command option 'hostcache' added to -drive for block devices. While starting a VM from qemu commandline, this option can be used for setting host cache usage for block data access. Signed-off-by: Supriya Kannery --- blockdev.c | 13 + qemu-config.c |4 qemu-options.hx |2 +- 3 files changed, 18 insertions(+), 1 deletion(-) Index: qemu/blockdev.c === --- qemu.orig/blockdev.c +++ qemu/blockdev.c @@ -237,6 +237,7 @@ DriveInfo *drive_init(QemuOpts *opts, in DriveInfo *dinfo; int snapshot = 0; int ret; +int hostcache = 0; translation = BIOS_ATA_TRANSLATION_AUTO; media = MEDIA_DISK; @@ -324,6 +325,12 @@ DriveInfo *drive_init(QemuOpts *opts, in error_report("invalid cache option"); return NULL; } +} else { +if ((hostcache = qemu_opt_get_bool(opts, "hostcache", -1)) != -1) { +if (!hostcache) { +bdrv_flags |= BDRV_O_NOCACHE; +} +} } #ifdef CONFIG_LINUX_AIO Index: qemu/qemu-options.hx === --- qemu.orig/qemu-options.hx +++ qemu/qemu-options.hx @@ -135,7 +135,7 @@ DEF("drive", HAS_ARG, QEMU_OPTION_drive, " [,cyls=c,heads=h,secs=s[,trans=t]][,snapshot=on|off]\n" " [,cache=writethrough|writeback|none|directsync|unsafe][,format=f]\n" " [,serial=s][,addr=A][,id=name][,aio=threads|native]\n" -" [,readonly=on|off]\n" +" [,readonly=on|off][,hostcache=on|off]\n" "use 'file' as a drive image\n", QEMU_ARCH_ALL) STEXI @item -drive @var{option}[,@var{option}[,@var{option}[,...]]] Index: qemu/qemu-config.c === --- qemu.orig/qemu-config.c +++ qemu/qemu-config.c @@ -85,6 +85,10 @@ static QemuOptsList qemu_drive_opts = { .name = "readonly", .type = QEMU_OPT_BOOL, .help = "open drive file as read-only", +},{ +.name = "hostcache", +.type = QEMU_OPT_BOOL, +.help = "set or reset hostcache (on/off)" }, { /* end of list */ } },
[Qemu-devel] [v9 Patch 2/6]Qemu: Error classes for file reopen and data sync failure
New error classes defined for file reopen failure and data sync error Signed-off-by: Supriya Kannery --- qerror.c |8 qerror.h |6 ++ 2 files changed, 14 insertions(+) Index: qemu/qerror.c === --- qemu.orig/qerror.c +++ qemu/qerror.c @@ -97,6 +97,14 @@ static const QErrorStringTable qerror_ta .desc = "Device '%(device)' is not removable", }, { +.error_fmt = QERR_DATA_SYNC_FAILED, +.desc = "Syncing of data failed for device '%(device)'", +}, +{ +.error_fmt = QERR_REOPEN_FILE_FAILED, +.desc = "Could not reopen '%(filename)'", +}, +{ .error_fmt = QERR_DEVICE_NO_BUS, .desc = "Device '%(device)' has no child bus", }, Index: qemu/qerror.h === --- qemu.orig/qerror.h +++ qemu/qerror.h @@ -87,6 +87,9 @@ QError *qobject_to_qerror(const QObject #define QERR_DEVICE_NOT_FOUND \ "{ 'class': 'DeviceNotFound', 'data': { 'device': %s } }" +#define QERR_DATA_SYNC_FAILED \ +"{ 'class': 'DataSyncFailed', 'data': { 'device': %s } }" + #define QERR_DEVICE_NOT_REMOVABLE \ "{ 'class': 'DeviceNotRemovable', 'data': { 'device': %s } }" @@ -144,6 +147,9 @@ QError *qobject_to_qerror(const QObject #define QERR_OPEN_FILE_FAILED \ "{ 'class': 'OpenFileFailed', 'data': { 'filename': %s } }" +#define QERR_REOPEN_FILE_FAILED \ +"{ 'class': 'ReopenFileFailed', 'data': { 'filename': %s } }" + #define QERR_PROPERTY_NOT_FOUND \ "{ 'class': 'PropertyNotFound', 'data': { 'device': %s, 'property': %s } }"
[Qemu-devel] [v9 Patch 1/6]Qemu: Enhance "info block" to display host cache setting
Enhance "info block" to display hostcache setting for each block device. Example: (qemu) info block ide0-hd0: removable=0 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Enhanced to display "hostcache" setting: (qemu) info block ide0-hd0: removable=0 hostcache=1 file=../rhel6-32.raw ro=0 drv=raw encrypted=0 Signed-off-by: Supriya Kannery --- block.c | 20 qmp-commands.hx |2 ++ 2 files changed, 18 insertions(+), 4 deletions(-) Index: qemu/qapi-schema.json === --- qemu.orig/qapi-schema.json +++ qemu/qapi-schema.json @@ -409,6 +409,8 @@ # @locked: True if the guest has locked this device from having its media # removed # +# @hostcache: True if host pagecache is enabled. +# # @tray_open: #optional True if the device has a tray and it is open # (only present if removable is true) # @@ -422,7 +424,7 @@ ## { 'type': 'BlockInfo', 'data': {'device': 'str', 'type': 'str', 'removable': 'bool', - 'locked': 'bool', '*inserted': 'BlockDeviceInfo', + 'locked': 'bool','hostcache': 'bool', '*inserted': 'BlockDeviceInfo', '*tray_open': 'bool', '*io-status': 'BlockDeviceIoStatus'} } ## Index: qemu/qapi-types.h === --- qemu.orig/qapi-types.h +++ qemu/qapi-types.h @@ -383,6 +383,7 @@ struct BlockInfo { char * device; char * type; +bool hostcache; bool removable; bool locked; bool has_inserted; Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -1839,6 +1839,7 @@ BlockInfoList *qmp_query_block(Error **e info->value->device = g_strdup(bs->device_name); info->value->type = g_strdup("unknown"); info->value->locked = bdrv_dev_is_medium_locked(bs); +info->value->hostcache = !(bs->open_flags & BDRV_O_NOCACHE); info->value->removable = bdrv_dev_has_removable_media(bs); if (bdrv_dev_has_removable_media(bs)) { Index: qemu/hmp.c === --- qemu.orig/hmp.c +++ qemu/hmp.c @@ -199,6 +199,8 @@ void hmp_info_block(Monitor *mon) monitor_printf(mon, " tray-open=%d", info->value->tray_open); } +monitor_printf(mon, " hostcache=%d", info->value->hostcache); + if (info->value->has_io_status) { monitor_printf(mon, " io-status=%s", BlockDeviceIoStatus_lookup[info->value->io_status]);
[Qemu-devel] [v9 Patch 0/6]Qemu: Host pagecache setting from cmdline and monitor
Following patchset is for enabling dynamic change of host pagecache setting of block devices through qemu monitor. This patchset introduces a. monitor command 'block_set_hostcache' using which host pagecache setting for a block device can be changed dynamically. b. a new option for setting host cache from qemu commandline option -drive "hostcache=on/off". c. BDRVReopenState, a generic structure which can be extended by each of the block drivers to reopen respective image files. Extension of this structure for drivers raw-posix is done here. Note: 'hostcache and 'cache' options when used together, cache=xx will override hostcache=yy. v9: 1. Rebased patchset to use qapi interfaces 2. Approach of extending BDRVReopenState changed to use container_of() v8: 1. Mandate implementation of all three reopen related functions by block drivers. 2. If 'cache=xx' and 'hostcache=yy' specified in cmdline, 'cache=' overrides 'hostcache='. v7: 1. Added structure BDRVReopenState to support safe reopening of image files. 2. Implemented reopen functions for raw-posix driver v6: 1. "block_set_hostcache" to replace "block_set" command v5: 1. Defined qerror class for incorrect command syntax. 2. Changed error_report() calls to qerror_report() v4: Added 'hostcache' option to '-drive' commandline option. v3: 1. Command "block_set" for changing various block params 2. Enhanced info-block to display hostcache setting 3. Added qmp interfaces for setting and querying hostcache v2: 1. Support of dynamic cache change only for hostcache. 2. Monitor command "hostcache_get" added to display cache setting 3. Backed off the changes for display of cache setting in "info block" v1: Dynamic cache change through monitor New block command added: "block_set_hostcache" -- Sets hostcache parameter for block device while guest is running. Usage: block_set_hostcache = block device = on/off New 'hostcache' option added to -drive: -drive [file=file][,if=type][,bus=n][,unit=m][,media=d][,index=i]\n" " [,readonly=on|off][,hostcache=on|off]\n" qemu/block.c | 111 + qemu/block.h |5 + qemu/block/raw-posix.c | 72 qemu/block/raw.c | 23 +- qemu/block_int.h | 13 + qemu/blockdev.c| 33 ++ qemu/blockdev.h|2 qemu/hmp-commands.hx | 14 ++ qemu/hmp.c |2 qemu/qapi-schema.json |4 + qemu/qapi-types.h |1 qemu/qemu-common.h |1 qemu/qemu-config.c |4 + qemu/qemu-options.hx |2 qemu/qerror.c |8 +++ qemu/qerror.h |6 ++ qemu/qmp-commands.hx | 27 20 files changed, 315 insertions(+), 13 deletions(-) ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ "txt" 13L, 574C
Re: [Qemu-devel] [v8 Patch 0/6]Qemu: Host pagecache setting from cmdline and monitor
On 11/07/2011 02:19 PM, Zhi Yong Wu wrote: On Mon, Nov 7, 2011 at 4:38 PM, Supriya Kannery wrote: On 11/04/2011 07:59 AM, Zhi Yong Wu wrote: On Sun, Oct 30, 2011 at 6:33 PM, Supriya Kannery wrote: Currently cache setting of a block device cannot be changed without restarting a running VM. Following patchset is for enabling dynamic change of cache setting for block devices through qemu monitor. Code changes are based on patches from Christoph Hellwig and Prerna Saxena. This patchset introduces a. monitor command 'block_set_hostcache' using which host pagecache setting for a block device can be changed dynamically. I got a bit confusion. Is it used to change host pagecache setting on hyperviser or on guest? This block device said by you is for guest, right? This command is used for changing pagecache setting of the image files in host which are acting as block devices for guest. So what is the difference between it and cache option? Cache=xx sets a combination of status flags for image files among which host pagecache is just one. Intention here is to gradually replace cache=xx with more explicit settings like hostcache, flush, WCE.
Re: [Qemu-devel] [v8 Patch 0/6]Qemu: Host pagecache setting from cmdline and monitor
On 11/04/2011 07:59 AM, Zhi Yong Wu wrote: On Sun, Oct 30, 2011 at 6:33 PM, Supriya Kannery wrote: Currently cache setting of a block device cannot be changed without restarting a running VM. Following patchset is for enabling dynamic change of cache setting for block devices through qemu monitor. Code changes are based on patches from Christoph Hellwig and Prerna Saxena. This patchset introduces a. monitor command 'block_set_hostcache' using which host pagecache setting for a block device can be changed dynamically. I got a bit confusion. Is it used to change host pagecache setting on hyperviser or on guest? This block device said by you is for guest, right? This command is used for changing pagecache setting of the image files in host which are acting as block devices for guest. -thanks, Supriya
Re: [Qemu-devel] [v8 Patch 5/6]Qemu: Framework for reopening images safely
On 11/04/2011 03:35 PM, Kevin Wolf wrote: Am 30.10.2011 11:35, schrieb Supriya Kannery: +struct BDRVReopenState { +BlockDriverState *bs; +int reopen_flags; + +/* For raw-posix */ +int reopen_fd; +}; I think I commented the same on the previous version: BDRVReopenState shouldn't contain any format specific fields. raw-posix must extend the struct like this and use container_of() to get it from a BDRVReopenState pointer: struct BDRVRawReopenState { BDRVReopenState common; int reopen_fd; }; I don't recall this was suggested in prev version or may be I missed to notice.. ok, will have raw extending common BDRVReopenState struct. Kevin
Re: [Qemu-devel] [v8 Patch 3/6]Qemu: Cmd "block_set_hostcache" for dynamic cache change
On 11/04/2011 03:30 PM, Kevin Wolf wrote: Am 30.10.2011 11:34, schrieb Supriya Kannery: + +if (bdrv_is_inserted(bs)) { +/* Reopen file with changed set of flags */ +return bdrv_reopen(bs, bdrv_flags); +} else { +/* Save hostcache change for future use */ +bs->open_flags = bdrv_flags; +return 0; +} Maybe the !bdrv_is_inserted() case should be handled in bdrv_reopen(). I think it would be the same for changing other flags. I am yet to look at the conditions specific to other flags. So I can move this check to bdrv_reopen when implementing one more flag I guess. Kevin
Re: [Qemu-devel] [v8 Patch 1/6]Qemu: Enhance "info block" to display host cache setting
On 11/03/2011 07:25 PM, Luiz Capitulino wrote: On Sun, 30 Oct 2011 16:03:54 +0530 Supriya Kannery wrote: +if (qdict_haskey(bs_dict, "hostcache")) { +monitor_printf(mon, " hostcache=%d", + qdict_get_bool(bs_dict, "hostcache")); +} This series needs to be rebased, as the info block command has been converted to the QAPI. Please, see the following commit for details: b202381800d81fbff9978abbdea95760dd363bb6. Also note that if you're adding new commands (I haven't reviewed the series) you should use the QAPI. A document on how to use it is coming soon. yes, will rebase and use QAPI if (qdict_haskey(bs_dict, "io-status")) { monitor_printf(mon, " io-status=%s", qdict_get_str(bs_dict, "io-status")); }
[Qemu-devel] [v8 Patch 3/6]Qemu: Cmd "block_set_hostcache" for dynamic cache change
New command "block_set_hostcache" added for dynamically changing host pagecache setting of a block device. Usage: block_set_hostcache = block device = on/off Example: (qemu) block_set_hostcache ide0-hd0 off Signed-off-by: Supriya Kannery --- block.c | 54 ++ block.h |2 ++ blockdev.c | 26 ++ blockdev.h |2 ++ hmp-commands.hx | 14 ++ qmp-commands.hx | 27 +++ 6 files changed, 125 insertions(+) Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -693,6 +693,34 @@ unlink_and_fail: return ret; } +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags) +{ +BlockDriver *drv = bs->drv; +int ret = 0, open_flags; + +/* Quiesce IO for the given block device */ +qemu_aio_flush(); +if (bdrv_flush(bs)) { +qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); +return ret; +} +open_flags = bs->open_flags; +bdrv_close(bs); + +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); +if (ret < 0) { +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret < 0) { +/* Reopen failed with orig and modified flags */ +abort(); +} +} + +return ret; +} + void bdrv_close(BlockDriverState *bs) { if (bs->drv) { @@ -730,6 +758,32 @@ void bdrv_close_all(void) } } +int bdrv_change_hostcache(BlockDriverState *bs, bool enable_host_cache) +{ +int bdrv_flags = bs->open_flags; + +/* set hostcache flags (without changing WCE/flush bits) */ +if (enable_host_cache) { +bdrv_flags &= ~BDRV_O_NOCACHE; +} else { +bdrv_flags |= BDRV_O_NOCACHE; +} + +/* If no change in flags, no need to reopen */ +if (bdrv_flags == bs->open_flags) { +return 0; +} + +if (bdrv_is_inserted(bs)) { +/* Reopen file with changed set of flags */ +return bdrv_reopen(bs, bdrv_flags); +} else { +/* Save hostcache change for future use */ +bs->open_flags = bdrv_flags; +return 0; +} +} + /* make a BlockDriverState anonymous by removing from bdrv_state list. Also, NULL terminate the device_name to prevent double remove */ void bdrv_make_anon(BlockDriverState *bs) Index: qemu/block.h === --- qemu.orig/block.h +++ qemu/block.h @@ -109,6 +109,7 @@ int bdrv_parse_cache_flags(const char *m int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags); int bdrv_open(BlockDriverState *bs, const char *filename, int flags, BlockDriver *drv); +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags); void bdrv_close(BlockDriverState *bs); int bdrv_attach_dev(BlockDriverState *bs, void *dev); void bdrv_attach_dev_nofail(BlockDriverState *bs, void *dev); @@ -143,6 +144,7 @@ void bdrv_commit_all(void); int bdrv_change_backing_file(BlockDriverState *bs, const char *backing_file, const char *backing_fmt); void bdrv_register(BlockDriver *bdrv); +int bdrv_change_hostcache(BlockDriverState *bs, bool enable_host_cache); typedef struct BdrvCheckResult { Index: qemu/blockdev.c === --- qemu.orig/blockdev.c +++ qemu/blockdev.c @@ -776,3 +776,29 @@ int do_block_resize(Monitor *mon, const return 0; } + + +/* + * Change host page cache setting while guest is running. +*/ +int do_block_set_hostcache(Monitor *mon, const QDict *qdict, + QObject **ret_data) +{ +BlockDriverState *bs = NULL; +int enable; +const char *device; + +/* Validate device */ +device = qdict_get_str(qdict, "device"); +bs = bdrv_find(device); +if (!bs) { +qerror_report(QERR_DEVICE_NOT_FOUND, device); +return -1; +} + +/* Read hostcache setting */ +enable = qdict_get_bool(qdict, "option"); +return bdrv_change_hostcache(bs, enable); + +} + Index: qemu/blockdev.h === --- qemu.orig/blockdev.h +++ qemu/blockdev.h @@ -65,5 +65,7 @@ int do_change_block(Monitor *mon, const int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_block_resize(Monitor *mon, const QDict *qdict, QObject **ret_data); +int do_block_set_hostcache(Monitor *mon, const QDict *qdict, + QObject **ret_data); #endif Index: qemu/hmp-commands.hx === --- qemu.orig/hmp-commands.
[Qemu-devel] [v8 Patch 5/6]Qemu: Framework for reopening images safely
Struct BDRVReopenState along with three reopen related functions introduced for handling reopen state of images safely. This can be extended by each of the block drivers to reopen respective image files. Signed-off-by: Supriya Kannery Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -693,10 +693,32 @@ unlink_and_fail: return ret; } +int bdrv_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, int flags) +{ + BlockDriver *drv = bs->drv; + + return drv->bdrv_reopen_prepare(bs, prs, flags); +} + +void bdrv_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs, int flags) +{ +BlockDriver *drv = bs->drv; + +drv->bdrv_reopen_commit(bs, rs, flags); +} + +void bdrv_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +BlockDriver *drv = bs->drv; + +drv->bdrv_reopen_abort(bs, rs); +} + int bdrv_reopen(BlockDriverState *bs, int bdrv_flags) { BlockDriver *drv = bs->drv; int ret = 0, open_flags; +BDRVReopenState *rs = NULL; /* Quiesce IO for the given block device */ qemu_aio_flush(); @@ -704,20 +726,35 @@ int bdrv_reopen(BlockDriverState *bs, in qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); return ret; } -open_flags = bs->open_flags; -bdrv_close(bs); -ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); -if (ret < 0) { -/* Reopen failed. Try to open with original flags */ -qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); -ret = bdrv_open(bs, bs->filename, open_flags, drv); +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen_prepare) { +ret = bdrv_reopen_prepare(bs, &rs, bdrv_flags); if (ret < 0) { -/* Reopen failed with orig and modified flags */ -abort(); +bdrv_reopen_abort(bs, rs); +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +return ret; } -} +bdrv_reopen_commit(bs, rs, bdrv_flags); + +} else { +open_flags = bs->open_flags; +bdrv_close(bs); +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); +if (ret < 0) { +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret < 0) { +/* + * Reopen failed with orig and modified flags + * For next access to fail, set drv to NULL +*/ +bs->drv = NULL; +} +} +} return ret; } Index: qemu/block_int.h === --- qemu.orig/block_int.h +++ qemu/block_int.h @@ -55,6 +55,14 @@ struct BlockDriver { int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename); int (*bdrv_probe_device)(const char *filename); int (*bdrv_open)(BlockDriverState *bs, int flags); + +/* For handling image reopen for split or non-split files */ +int (*bdrv_reopen_prepare)(BlockDriverState *bs, BDRVReopenState **rs, + int flags); +void (*bdrv_reopen_commit)(BlockDriverState *bs, BDRVReopenState *rs, + int flags); +void (*bdrv_reopen_abort)(BlockDriverState *bs, BDRVReopenState *rs); + int (*bdrv_file_open)(BlockDriverState *bs, const char *filename, int flags); int (*bdrv_read)(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors); @@ -211,6 +219,14 @@ struct BlockDriverState { void *private; }; +struct BDRVReopenState { +BlockDriverState *bs; +int reopen_flags; + +/* For raw-posix */ +int reopen_fd; +}; + struct BlockDriverAIOCB { AIOPool *pool; BlockDriverState *bs; Index: qemu/qemu-common.h === --- qemu.orig/qemu-common.h +++ qemu/qemu-common.h @@ -202,6 +202,7 @@ typedef struct NICInfo NICInfo; typedef struct HCIInfo HCIInfo; typedef struct AudioState AudioState; typedef struct BlockDriverState BlockDriverState; +typedef struct BDRVReopenState BDRVReopenState; typedef struct DriveInfo DriveInfo; typedef struct DisplayState DisplayState; typedef struct DisplayChangeListener DisplayChangeListener; Index: qemu/block.h === --- qemu.orig/block.h +++ qemu/block.h @@ -110,6 +110,9 @@ int bdrv_file_open(BlockDriverState **pb int bdrv_open(BlockDriverState *bs, const char *filename, int flags, BlockDriver *drv); int bdrv_reopen(BlockDriverState *bs, int bdrv_flags); +int bdrv_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, int flags); +void bdrv_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs, int
[Qemu-devel] [v8 Patch 6/6]Qemu: raw posix implementation of reopen functions
raw-posix driver changes for bdrv_reopen_xx functions to safely reopen image files. Reopening of image files while changing hostcache dynamically, is handled here. Signed-off-by: Supriya Kannery Index: qemu/block/raw.c === --- qemu.orig/block/raw.c +++ qemu/block/raw.c @@ -9,6 +9,24 @@ static int raw_open(BlockDriverState *bs return 0; } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +return bdrv_reopen_prepare(bs->file, prs, flags); +} + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs, + int flags) +{ +bdrv_reopen_commit(bs->file, rs, flags); +bs->open_flags = bs->file->open_flags; +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +bdrv_reopen_abort(bs->file, rs); +} + static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov) { @@ -107,7 +125,10 @@ static BlockDriver bdrv_raw = { /* It's really 0, but we need to make g_malloc() happy */ .instance_size = 1, - +.bdrv_reopen_prepare += raw_reopen_prepare, +.bdrv_reopen_commit = raw_reopen_commit, +.bdrv_reopen_abort = raw_reopen_abort, .bdrv_open = raw_open, .bdrv_close = raw_close, Index: qemu/block/raw-posix.c === --- qemu.orig/block/raw-posix.c +++ qemu/block/raw-posix.c @@ -279,6 +279,60 @@ static int raw_open(BlockDriverState *bs return raw_open_common(bs, filename, flags, 0); } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **prs, + int flags) +{ +BDRVReopenState *raw_rs = g_malloc0(sizeof(BDRVReopenState)); +BDRVRawState *s = bs->opaque; + +raw_rs->bs = bs; +raw_rs->reopen_flags = s->open_flags; +raw_rs->reopen_fd = -1; +*prs = raw_rs; +int ret = 0; + +/* If only O_DIRECT to be toggled, use fcntl */ +if (!((bs->open_flags & ~BDRV_O_NOCACHE) ^ +(flags & ~BDRV_O_NOCACHE))) { +raw_rs->reopen_fd = dup(s->fd); +if (raw_rs->reopen_fd <= 0) { +return -1; +} +if ((flags & BDRV_O_NOCACHE)) { +raw_rs->reopen_flags |= O_DIRECT; +} else { +raw_rs->reopen_flags &= ~O_DIRECT; +} +ret = fcntl_setfl(raw_rs->reopen_fd, raw_rs->reopen_flags); +} + +/* TBD: Handle O_DSYNC and other flags */ + +return ret; +} + +static void raw_reopen_commit(BlockDriverState *bs, BDRVReopenState *rs, + int flags) +{ +BDRVRawState *s = bs->opaque; + +/* Set new flags, so replace old fd with new one */ +close(s->fd); +s->fd = rs->reopen_fd; +s->open_flags = rs->reopen_flags; +bs->open_flags = flags; +g_free(rs); + +} + +static void raw_reopen_abort(BlockDriverState *bs, BDRVReopenState *rs) +{ +if (rs->reopen_fd != -1) { +close(rs->reopen_fd); +} +g_free(rs); +} + /* XXX: use host sector size if necessary with: #ifdef DIOCGSECTORSIZE { @@ -631,6 +685,9 @@ static BlockDriver bdrv_file = { .instance_size = sizeof(BDRVRawState), .bdrv_probe = NULL, /* no probe for protocols */ .bdrv_file_open = raw_open, +.bdrv_reopen_prepare = raw_reopen_prepare, +.bdrv_reopen_commit = raw_reopen_commit, +.bdrv_reopen_abort = raw_reopen_abort, .bdrv_close = raw_close, .bdrv_create = raw_create, .bdrv_co_discard = raw_co_discard,
[Qemu-devel] [v8 Patch 4/6]Qemu: Add commandline -drive option 'hostcache'
qemu command option 'hostcache' added to -drive for block devices. While starting a VM from qemu commandline, this option can be used for setting host cache usage for block data access. 'cache=xx' overrides 'hostcache=yy' when specified simultaneously. Signed-off-by: Supriya Kannery --- blockdev.c | 13 + qemu-config.c |4 qemu-options.hx |2 +- 3 files changed, 18 insertions(+), 1 deletion(-) Index: qemu/blockdev.c === --- qemu.orig/blockdev.c +++ qemu/blockdev.c @@ -237,6 +237,7 @@ DriveInfo *drive_init(QemuOpts *opts, in DriveInfo *dinfo; int snapshot = 0; int ret; +int hostcache = 0; translation = BIOS_ATA_TRANSLATION_AUTO; media = MEDIA_DISK; @@ -324,6 +325,12 @@ DriveInfo *drive_init(QemuOpts *opts, in error_report("invalid cache option"); return NULL; } +} else { +if ((hostcache = qemu_opt_get_bool(opts, "hostcache", -1)) != -1) { +if (!hostcache) { +bdrv_flags |= BDRV_O_NOCACHE; +} +} } #ifdef CONFIG_LINUX_AIO Index: qemu/qemu-options.hx === --- qemu.orig/qemu-options.hx +++ qemu/qemu-options.hx @@ -135,7 +135,7 @@ DEF("drive", HAS_ARG, QEMU_OPTION_drive, " [,cyls=c,heads=h,secs=s[,trans=t]][,snapshot=on|off]\n" " [,cache=writethrough|writeback|none|directsync|unsafe][,format=f]\n" " [,serial=s][,addr=A][,id=name][,aio=threads|native]\n" -" [,readonly=on|off]\n" +" [,readonly=on|off][,hostcache=on|off]\n" "use 'file' as a drive image\n", QEMU_ARCH_ALL) STEXI @item -drive @var{option}[,@var{option}[,@var{option}[,...]]] Index: qemu/qemu-config.c === --- qemu.orig/qemu-config.c +++ qemu/qemu-config.c @@ -85,6 +85,10 @@ static QemuOptsList qemu_drive_opts = { .name = "readonly", .type = QEMU_OPT_BOOL, .help = "open drive file as read-only", +},{ +.name = "hostcache", +.type = QEMU_OPT_BOOL, +.help = "set or reset hostcache (on/off)" }, { /* end of list */ } },
[Qemu-devel] [v8 Patch 2/6]Qemu: Error classes for file reopen and data sync failure
New error classes defined for file reopen failure and data sync error Signed-off-by: Supriya Kannery --- qerror.c |8 qerror.h |6 ++ 2 files changed, 14 insertions(+) Index: qemu/qerror.c === --- qemu.orig/qerror.c +++ qemu/qerror.c @@ -97,6 +97,14 @@ static const QErrorStringTable qerror_ta .desc = "Device '%(device)' is not removable", }, { +.error_fmt = QERR_DATA_SYNC_FAILED, +.desc = "Syncing of data failed for device '%(device)'", +}, +{ +.error_fmt = QERR_REOPEN_FILE_FAILED, +.desc = "Could not reopen '%(filename)'", +}, +{ .error_fmt = QERR_DEVICE_NO_BUS, .desc = "Device '%(device)' has no child bus", }, Index: qemu/qerror.h === --- qemu.orig/qerror.h +++ qemu/qerror.h @@ -87,6 +87,9 @@ QError *qobject_to_qerror(const QObject #define QERR_DEVICE_NOT_FOUND \ "{ 'class': 'DeviceNotFound', 'data': { 'device': %s } }" +#define QERR_DATA_SYNC_FAILED \ +"{ 'class': 'DataSyncFailed', 'data': { 'device': %s } }" + #define QERR_DEVICE_NOT_REMOVABLE \ "{ 'class': 'DeviceNotRemovable', 'data': { 'device': %s } }" @@ -144,6 +147,9 @@ QError *qobject_to_qerror(const QObject #define QERR_OPEN_FILE_FAILED \ "{ 'class': 'OpenFileFailed', 'data': { 'filename': %s } }" +#define QERR_REOPEN_FILE_FAILED \ +"{ 'class': 'ReopenFileFailed', 'data': { 'filename': %s } }" + #define QERR_PROPERTY_NOT_FOUND \ "{ 'class': 'PropertyNotFound', 'data': { 'device': %s, 'property': %s } }"
[Qemu-devel] [v8 Patch 1/6]Qemu: Enhance "info block" to display host cache setting
Enhance "info block" to display hostcache setting for each block device. Example: (qemu) info block ide0-hd0: removable=0 file=../sles11-32.raw ro=0 drv=raw encrypted=0 Enhanced to display "hostcache" setting: (qemu) info block ide0-hd0: removable=0 hostcache=1 file=../sles11-32.raw ro=0 drv=raw encrypted=0 Signed-off-by: Supriya Kannery --- block.c | 20 qmp-commands.hx |2 ++ 2 files changed, 18 insertions(+), 4 deletions(-) Index: qemu/qmp-commands.hx === --- qemu.orig/qmp-commands.hx +++ qemu/qmp-commands.hx @@ -1142,6 +1142,7 @@ Each json-object contain the following: - "locked": true if the device is locked, false otherwise (json-bool) - "tray-open": only present if removable, true if the device has a tray, and it is open (json-bool) +- "hostcache": true if host pagecache enabled, false otherwise (json-bool) - "inserted": only present if the device is inserted, it is a json-object containing the following: - "file": device file name (json-string) @@ -1168,6 +1169,7 @@ Example: "io-status": "ok", "device":"ide0-hd0", "locked":false, +"hostcache":false, "removable":false, "inserted":{ "ro":false, Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -1841,6 +1841,11 @@ static void bdrv_print_dict(QObject *obj qdict_get_bool(bs_dict, "tray-open")); } +if (qdict_haskey(bs_dict, "hostcache")) { +monitor_printf(mon, " hostcache=%d", + qdict_get_bool(bs_dict, "hostcache")); +} + if (qdict_haskey(bs_dict, "io-status")) { monitor_printf(mon, " io-status=%s", qdict_get_str(bs_dict, "io-status")); } @@ -1888,10 +1893,12 @@ void bdrv_info(Monitor *mon, QObject **r QDict *bs_dict; bs_obj = qobject_from_jsonf("{ 'device': %s, 'type': 'unknown', " -"'removable': %i, 'locked': %i }", +"'removable': %i, 'locked': %i, " +"'hostcache': %i }", bs->device_name, bdrv_dev_has_removable_media(bs), -bdrv_dev_is_medium_locked(bs)); +bdrv_dev_is_medium_locked(bs), +!(bs->open_flags & BDRV_O_NOCACHE)); bs_dict = qobject_to_qdict(bs_obj); if (bdrv_dev_has_removable_media(bs)) {
[Qemu-devel] [v8 Patch 0/6]Qemu: Host pagecache setting from cmdline and monitor
Currently cache setting of a block device cannot be changed without restarting a running VM. Following patchset is for enabling dynamic change of cache setting for block devices through qemu monitor. Code changes are based on patches from Christoph Hellwig and Prerna Saxena. This patchset introduces a. monitor command 'block_set_hostcache' using which host pagecache setting for a block device can be changed dynamically. b. a new option for setting host cache from qemu commandline option -drive "hostcache=on/off". c. BDRVReopenState, a generic structure which can be extended by each of the block drivers to reopen respective image files. Extension of this structure for drivers raw-posix is done here. d. 'hostcache and 'cache' options when used together, cache=xx will override hostcache=yy. v8: 1. Mandate implementation of all three reopen related functions by block drivers. 2. If 'cache=xx' and 'hostcache=yy' specified in cmdline, 'cache=' overrides 'hostcache='. v7: 1. Added structure BDRVReopenState to support safe reopening of image files. 2. Implemented reopen functions for raw-posix driver v6: 1. "block_set_hostcache" to replace "block_set" command v5: 1. Defined qerror class for incorrect command syntax. 2. Changed error_report() calls to qerror_report() v4: Added 'hostcache' option to '-drive' commandline option. v3: 1. Command "block_set" for changing various block params 2. Enhanced info-block to display hostcache setting 3. Added qmp interfaces for setting and querying hostcache v2: 1. Support of dynamic cache change only for hostcache. 2. Monitor command "hostcache_get" added to display cache setting 3. Backed off the changes for display of cache setting in "info block" v1: Dynamic cache change through monitor New block command added: "block_set_hostcache" -- Sets hostcache parameter for block device while guest is running. Usage: block_set_hostcache = block device = on/off New 'hostcache' option added to -drive: -drive [file=file][,if=type][,bus=n][,unit=m][,media=d][,index=i]\n" " [,readonly=on|off][,hostcache=on|off]\n" qemu/block.c | 79 +++ qemu/block.h |3 ++ qemu/block/raw-posix.c | 57 + qemu/block/raw.c | 23 +++- qemu/block_int.h | 16 +++ qemu/blockdev.c|7 + qemu/qemu-common.h |1 qemu/qemu-config.c |4 ++ qemu/qemu-options.hx |2 - qemu/qerror.c |8 + qemu/qerror.h |6 qemu/qmp-commands.hx |4 ++ 14 files changed, 194 insertions(+), 16 deletions(-) ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ "txt" 13L, 574C
Re: [Qemu-devel] [v7 Patch 5/5]Qemu: New struct 'BDRVReopenState' for image files reopen
On 10/14/2011 04:43 PM, Stefan Hajnoczi wrote: On Tue, Oct 11, 2011 at 08:41:59AM +0530, Supriya Kannery wrote: Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -706,6 +706,7 @@ int bdrv_reopen(BlockDriverState *bs, in { BlockDriver *drv = bs->drv; int ret = 0, open_flags; +BDRVReopenState *rs; BDRVReopenState *rs = NULL; If the abort path is taken we need to make sure rs has a defined value. Note that the abort path currently doesn't handle rs == NULL and will segfault in raw_reopen_abort(). sure..will check on this. thanks, Supriya
Re: [Qemu-devel] [v7 Patch 5/5]Qemu: New struct 'BDRVReopenState' for image files reopen
On 10/12/2011 08:25 PM, Kevin Wolf wrote: Am 11.10.2011 05:11, schrieb Supriya Kannery: Struct BDRVReopenState introduced for handling reopen state of images. This can be extended by each of the block drivers to reopen respective image files. Implementation for raw-posix is done here. Signed-off-by: Supriya Kannery Maybe it would make sense to split this into two patches, one for the block.c infrastructure and another one for adding the callbacks in drivers. ok..will split in v8. + +/* If only O_DIRECT to be toggled, use fcntl */ +if (!((bs->open_flags& ~BDRV_O_NOCACHE) ^ +(flags& ~BDRV_O_NOCACHE))) { +raw_rs->reopen_fd = dup(s->fd); +if (raw_rs->reopen_fd<= 0) { +return -1; This leaks raw_rs. will handle +} +} + +/* TBD: Handle O_DSYNC and other flags */ +*rs = raw_rs; +return 0; +} + +static int raw_reopen_commit(BDRVReopenState *rs) bdrv_reopen_commit must never fail. Any error that can happen must already be handled in bdrv_reopen_prepare. The commit function should really only do s->fd = rs->reopen_fd (besides cleanup), everything else should already be prepared. will move call to fcntl to bdrv_reopen_prepare. +{ +BlockDriverState *bs = rs->bs; +BDRVRawState *s = bs->opaque; + +if (!rs->reopen_fd) { +return -1; +} + +int ret = fcntl_setfl(rs->reopen_fd, rs->reopen_flags); reopen_flags is BDRV_O_*, not O_*, so it needs to be translated. ok +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen_prepare) { +ret = drv->bdrv_reopen_prepare(bs,&rs, bdrv_flags); if (ret< 0) { -/* Reopen failed with orig and modified flags */ -abort(); +goto fail; } -} +if (drv->bdrv_reopen_commit) { +ret = drv->bdrv_reopen_commit(rs); +if (ret< 0) { +goto fail; +} +return 0; +} Pull the return 0; out one level. It would be really strange if we turned a successful prepare into reopen_abort just because the driver doesn't need a commit function. (The other consistent way would be to require that if a driver implements one reopen function, it has to implement all three of them. I'm fine either way.) Will give flexibility to drivers, not mandating all the three functions. +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret< 0) { +/* + * Reopen with orig and modified flags failed + */ +abort(); Make this bs->drv = NULL, so that trying to access to image will fail, but at least not the whole VM crashes. ok static int raw_read(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors) { @@ -128,6 +137,8 @@ static BlockDriver bdrv_raw = { .instance_size = 1, .bdrv_open = raw_open, +.bdrv_reopen_prepare += raw_reopen, .bdrv_close = raw_close, .bdrv_read = raw_read, .bdrv_write = raw_write, I think raw must pass through all three functions. Otherwise it could happen that we need to abort, but the image has already been reopened. ok..got it..will have three separate functions to avoid unnecessary dependencies. Got a question.. In raw-posix, the three functions are implemented for file_reopen for now. Should this be extended to hdev, cdrom and floppy? Kevin
Re: [Qemu-devel] [v7 Patch 4/5]Qemu: Add commandline -drive option 'hostcache'
On 10/12/2011 08:00 PM, Kevin Wolf wrote: Am 11.10.2011 05:11, schrieb Supriya Kannery: qemu command option 'hostcache' added to -drive for block devices. While starting a VM from qemu commandline, this option can be used for setting host cache usage for block data access. Simultaneous use of 'hostcache' and 'cache' options not allowed. Signed-off-by: Supriya Kannery I'm not sure if introducing this alone makes sense. I think I would only do it when we introduce more options that allow replacing all cache=... options by other parameters. Can we do transition to alternatives for 'cache=' in a phased manner? Until all other params are ready, we can allow hostcache (as well as other params as and when they are ready) in cmdline with the condition that 'cache=x', if specified, overrides these params. Once we have all other params ready, 'cache=' can be replaced completely. thanks, Supriya
Re: [Qemu-devel] [v7 Patch 1/5]Qemu: Enhance "info block" to display host cache setting
On 10/12/2011 07:47 PM, Kevin Wolf wrote: Am 11.10.2011 05:10, schrieb Supriya Kannery: Enhance "info block" to display hostcache setting for each block device. +if (qdict_haskey(bs_dict, "open_flags")) { +int open_flags = qdict_get_int(bs_dict, "open_flags"); +if (open_flags& BDRV_O_NOCACHE) +monitor_printf(mon, " hostcache=0"); +else +monitor_printf(mon, " hostcache=1"); Coding style requires braces. ok..will add. checkpatch.pl didn't catch this! bs_obj = qobject_from_jsonf("{ 'device': %s, 'type': 'unknown', " -"'removable': %i, 'locked': %i }", +"'removable': %i, 'locked': %i, " +"'hostcache': %i }", bs->device_name, bdrv_dev_has_removable_media(bs), -bdrv_dev_is_medium_locked(bs)); +bdrv_dev_is_medium_locked(bs), +!(bs->open_flags& BDRV_O_NOCACHE)); bs_dict = qobject_to_qdict(bs_obj); +qdict_put(bs_dict, "open_flags", qint_from_int(bs->open_flags)); No. This adds a open_flags field to the QMP structure that is transferred to clients. This is wrong, open_flags is an internal thing that should never be visible on an interface. In bdrv_print_dict, access the hostcache field that you introduced, it provides the same information. Will replace "open_flags" with "hostcache" field. thanks, Supriya
Re: [Qemu-devel] Safely reopening image files by stashing fds
On 10/10/2011 11:58 PM, Kevin Wolf wrote: Am 09.08.2011 11:22, schrieb supriya kannery: Kevin Wolf wrote: Am 08.08.2011 09:02, schrieb Supriya Kannery: On 08/05/2011 09:19 PM, Anthony Liguori wrote: On 08/05/2011 10:43 AM, Kevin Wolf wrote: Am 05.08.2011 17:24, schrieb Stefan Hajnoczi: On Fri, Aug 5, 2011 at 3:28 PM, Christoph Hellwig wrote: On Fri, Aug 05, 2011 at 02:12:48PM +0100, Daniel P. Berrange wrote: Because you cannot change O_DIRECT on an open fd :(. This is why we're going through this pain. Hmm, I remember hearing that before, but looking at the current fcntl() manpage, it claims you *can* change O_DIRECT using SET_FL. Perhaps this is a newish feature, but it'd be nicer to use it if possible ? It's been there since day 1 of O_DIRECT support. Sorry, my bad. So for Linux we could just use fcntl for block_set_hostcache and not bother with reopening. However, we will need to reopen should we wish to support changing O_DSYNC. We do wish to support that. Anthony thinks that allowing the guest to toggle WCE is a prerequisite for making cache=writeback the default. And this is something that I definitely want to do for 1.0. Indeed. We discussed the following so far... 1. How to safely reopen image files 2. Dynamic hostcache change 3. Support for dynamic change of O_DSYNC Since 2 is independent of 1, shall I go ahead implementing hostcache change using fcntl. Implementation for safely reopening image files using "BDRVReopenState" can be done separately as a pre-requisite before implementing 3 Doing it separately means that we would introduce yet another callback that is used just to change O_DIRECT. In the end we want it to use bdrv_reopen(), too, so I'm not sure if there is a need for a temporary solution. Could you please explain "In the end we want it to use bdrv_reopen" at bit more. When fcntl() can change O_DIRECT on open fd , is there a need to "re-open" the image file? Considering the current way of having separate high level commands for changing block parameters (block_set_hostcache, and may be block_set_flush in furture), these dynamic requests will be sequential. So wouldn't it be better to avoid re-opening of image if possible for individual flag change request that comes in? Actually, once we know what we really want (I haven't seen many comments on the BDRVReopenState suggestion yet), it should be pretty easy to implement. Kevin Will work on to get an RFC patch with this proposed BDRVReopenState to get more inputs. Are you still going to prepare an RFC patch implementing bdrv_reopen_prepare/commit/abort using a BDRVReopenState? Or has it even been posted and I just missed it? Kevin Pls review the patchset posted at http://lists.gnu.org/archive/html/qemu-devel/2011-10/msg01014.html Working on further to similarly use BDRVReopenState for raw-win32 images. thanks, Supriya
[Qemu-devel] [v7 Patch 5/5]Qemu: New struct 'BDRVReopenState' for image files reopen
Struct BDRVReopenState introduced for handling reopen state of images. This can be extended by each of the block drivers to reopen respective image files. Implementation for raw-posix is done here. Signed-off-by: Supriya Kannery --- block.c | 46 block/raw-posix.c | 62 ++ block_int.h | 15 + 3 files changed, 114 insertions(+), 9 deletions(-) Index: qemu/block/raw-posix.c === --- qemu.orig/block/raw-posix.c +++ qemu/block/raw-posix.c @@ -279,6 +279,66 @@ static int raw_open(BlockDriverState *bs return raw_open_common(bs, filename, flags, 0); } +static int raw_reopen_prepare(BlockDriverState *bs, BDRVReopenState **rs, + int flags) +{ +BDRVReopenState *raw_rs = g_malloc0(sizeof(BDRVReopenState)); +BDRVRawState *s = bs->opaque; + +raw_rs->bs = bs; +raw_rs->reopen_flags = flags; +raw_rs->reopen_fd = -1; + +/* If only O_DIRECT to be toggled, use fcntl */ +if (!((bs->open_flags & ~BDRV_O_NOCACHE) ^ +(flags & ~BDRV_O_NOCACHE))) { +raw_rs->reopen_fd = dup(s->fd); +if (raw_rs->reopen_fd <= 0) { +return -1; +} +} + +/* TBD: Handle O_DSYNC and other flags */ +*rs = raw_rs; +return 0; +} + +static int raw_reopen_commit(BDRVReopenState *rs) +{ +BlockDriverState *bs = rs->bs; +BDRVRawState *s = bs->opaque; + +if (!rs->reopen_fd) { +return -1; +} + +int ret = fcntl_setfl(rs->reopen_fd, rs->reopen_flags); +if (ret < 0) { +return ret; +} + +/* Set new flags, so replace old fd with new one */ +close(s->fd); +s->fd = rs->reopen_fd; +s->open_flags = bs->open_flags = rs->reopen_flags; +g_free(rs); +rs = NULL; + +return 0; +} + +static int raw_reopen_abort(BDRVReopenState *rs) +{ +if (rs->reopen_fd != -1) { +close(rs->reopen_fd); +rs->reopen_fd = -1; +rs->reopen_flags = 0; +} +g_free(rs); +rs = NULL; +return 0; +} + /* XXX: use host sector size if necessary with: #ifdef DIOCGSECTORSIZE { @@ -896,6 +956,9 @@ static BlockDriver bdrv_file = { .instance_size = sizeof(BDRVRawState), .bdrv_probe = NULL, /* no probe for protocols */ .bdrv_file_open = raw_open, +.bdrv_reopen_prepare = raw_reopen_prepare, +.bdrv_reopen_commit = raw_reopen_commit, +.bdrv_reopen_abort = raw_reopen_abort, .bdrv_read = raw_read, .bdrv_write = raw_write, .bdrv_close = raw_close, @@ -1166,6 +1229,10 @@ static BlockDriver bdrv_host_device = { .instance_size = sizeof(BDRVRawState), .bdrv_probe_device = hdev_probe_device, .bdrv_file_open = hdev_open, +.bdrv_reopen_prepare += raw_reopen_prepare, +.bdrv_reopen_commit = raw_reopen_commit, +.bdrv_reopen_abort = raw_reopen_abort, .bdrv_close = raw_close, .bdrv_create= hdev_create, .create_options = raw_create_options, Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -706,6 +706,7 @@ int bdrv_reopen(BlockDriverState *bs, in { BlockDriver *drv = bs->drv; int ret = 0, open_flags; +BDRVReopenState *rs; /* Quiesce IO for the given block device */ qemu_aio_flush(); @@ -713,20 +714,48 @@ int bdrv_reopen(BlockDriverState *bs, in qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); return ret; } -open_flags = bs->open_flags; -bdrv_close(bs); -ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); -if (ret < 0) { -/* Reopen failed. Try to open with original flags */ -qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); -ret = bdrv_open(bs, bs->filename, open_flags, drv); +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen_prepare) { +ret = drv->bdrv_reopen_prepare(bs, &rs, bdrv_flags); if (ret < 0) { -/* Reopen failed with orig and modified flags */ -abort(); +goto fail; } -} +if (drv->bdrv_reopen_commit) { +ret = drv->bdrv_reopen_commit(rs); +if (ret < 0) { +goto fail; +} +return 0; +} +fail: +if (drv->bdrv_reopen_abort) { +ret = drv->bdrv_reopen_abort(rs); +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +return ret; +} + +} else { + +/* Use reopen procedure common for all drivers */ +open_flags = bs->open_flags; +bdrv_close(bs); +ret = bdrv_open(
[Qemu-devel] [v7 Patch 4/5]Qemu: Add commandline -drive option 'hostcache'
qemu command option 'hostcache' added to -drive for block devices. While starting a VM from qemu commandline, this option can be used for setting host cache usage for block data access. Simultaneous use of 'hostcache' and 'cache' options not allowed. Signed-off-by: Supriya Kannery --- blockdev.c | 13 + qemu-config.c |4 qemu-options.hx |2 +- 3 files changed, 18 insertions(+), 1 deletion(-) Index: qemu/blockdev.c === --- qemu.orig/blockdev.c +++ qemu/blockdev.c @@ -237,6 +237,7 @@ DriveInfo *drive_init(QemuOpts *opts, in DriveInfo *dinfo; int snapshot = 0; int ret; +int hostcache = 0; translation = BIOS_ATA_TRANSLATION_AUTO; media = MEDIA_DISK; @@ -319,7 +320,19 @@ DriveInfo *drive_init(QemuOpts *opts, in } } +if ((hostcache = qemu_opt_get_bool(opts, "hostcache", -1)) != -1) { +if (!hostcache) { +bdrv_flags |= BDRV_O_NOCACHE; +} else { +bdrv_flags &= ~BDRV_O_NOCACHE; +} +} + if ((buf = qemu_opt_get(opts, "cache")) != NULL) { +if (hostcache != -1) { +error_report("'hostcache' and 'cache' cannot co-exist"); +return NULL; +} if (bdrv_parse_cache_flags(buf, &bdrv_flags) != 0) { error_report("invalid cache option"); return NULL; Index: qemu/qemu-options.hx === --- qemu.orig/qemu-options.hx +++ qemu/qemu-options.hx @@ -135,7 +135,7 @@ DEF("drive", HAS_ARG, QEMU_OPTION_drive, " [,cyls=c,heads=h,secs=s[,trans=t]][,snapshot=on|off]\n" " [,cache=writethrough|writeback|none|directsync|unsafe][,format=f]\n" " [,serial=s][,addr=A][,id=name][,aio=threads|native]\n" -" [,readonly=on|off]\n" +" [,readonly=on|off][,hostcache=on|off]\n" "use 'file' as a drive image\n", QEMU_ARCH_ALL) STEXI @item -drive @var{option}[,@var{option}[,@var{option}[,...]]] Index: qemu/qemu-config.c === --- qemu.orig/qemu-config.c +++ qemu/qemu-config.c @@ -85,6 +85,10 @@ static QemuOptsList qemu_drive_opts = { .name = "readonly", .type = QEMU_OPT_BOOL, .help = "open drive file as read-only", +},{ +.name = "hostcache", +.type = QEMU_OPT_BOOL, +.help = "set or reset hostcache (on/off)" }, { /* end of list */ } },
[Qemu-devel] [v7 Patch 3/5]Qemu: Cmd "block_set_hostcache" for dynamic cache change
New command "block_set_hostcache" added for dynamically changing host pagecache setting of a block device. Usage: block_set_hostcache = block device = on/off Example: (qemu) block_set_hostcache ide0-hd0 off Signed-off-by: Supriya Kannery --- block.c | 54 ++ block.h |2 ++ blockdev.c | 26 ++ blockdev.h |2 ++ hmp-commands.hx | 14 ++ qmp-commands.hx | 27 +++ 6 files changed, 125 insertions(+) Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -702,6 +702,34 @@ unlink_and_fail: return ret; } +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags) +{ +BlockDriver *drv = bs->drv; +int ret = 0, open_flags; + +/* Quiesce IO for the given block device */ +qemu_aio_flush(); +if (bdrv_flush(bs)) { +qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); +return ret; +} +open_flags = bs->open_flags; +bdrv_close(bs); + +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); +if (ret < 0) { +/* Reopen failed. Try to open with original flags */ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret < 0) { +/* Reopen failed with orig and modified flags */ +abort(); +} +} + +return ret; +} + void bdrv_close(BlockDriverState *bs) { if (bs->drv) { @@ -739,6 +767,32 @@ void bdrv_close_all(void) } } +int bdrv_change_hostcache(BlockDriverState *bs, bool enable_host_cache) +{ +int bdrv_flags = bs->open_flags; + +/* set hostcache flags (without changing WCE/flush bits) */ +if (enable_host_cache) { +bdrv_flags &= ~BDRV_O_NOCACHE; +} else { +bdrv_flags |= BDRV_O_NOCACHE; +} + +/* If no change in flags, no need to reopen */ +if (bdrv_flags == bs->open_flags) { +return 0; +} + +if (bdrv_is_inserted(bs)) { +/* Reopen file with changed set of flags */ +return bdrv_reopen(bs, bdrv_flags); +} else { +/* Save hostcache change for future use */ +bs->open_flags = bdrv_flags; +return 0; +} +} + /* make a BlockDriverState anonymous by removing from bdrv_state list. Also, NULL terminate the device_name to prevent double remove */ void bdrv_make_anon(BlockDriverState *bs) Index: qemu/block.h === --- qemu.orig/block.h +++ qemu/block.h @@ -99,6 +99,7 @@ int bdrv_parse_cache_flags(const char *m int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags); int bdrv_open(BlockDriverState *bs, const char *filename, int flags, BlockDriver *drv); +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags); void bdrv_close(BlockDriverState *bs); int bdrv_attach_dev(BlockDriverState *bs, void *dev); void bdrv_attach_dev_nofail(BlockDriverState *bs, void *dev); @@ -133,6 +134,7 @@ void bdrv_commit_all(void); int bdrv_change_backing_file(BlockDriverState *bs, const char *backing_file, const char *backing_fmt); void bdrv_register(BlockDriver *bdrv); +int bdrv_change_hostcache(BlockDriverState *bs, bool enable_host_cache); typedef struct BdrvCheckResult { Index: qemu/blockdev.c === --- qemu.orig/blockdev.c +++ qemu/blockdev.c @@ -776,3 +776,29 @@ int do_block_resize(Monitor *mon, const return 0; } + + +/* + * Change host page cache setting while guest is running. +*/ +int do_block_set_hostcache(Monitor *mon, const QDict *qdict, + QObject **ret_data) +{ +BlockDriverState *bs = NULL; +int enable; +const char *device; + +/* Validate device */ +device = qdict_get_str(qdict, "device"); +bs = bdrv_find(device); +if (!bs) { +qerror_report(QERR_DEVICE_NOT_FOUND, device); +return -1; +} + +/* Read hostcache setting */ +enable = qdict_get_bool(qdict, "option"); +return bdrv_change_hostcache(bs, enable); + +} + Index: qemu/blockdev.h === --- qemu.orig/blockdev.h +++ qemu/blockdev.h @@ -65,5 +65,7 @@ int do_change_block(Monitor *mon, const int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_block_resize(Monitor *mon, const QDict *qdict, QObject **ret_data); +int do_block_set_hostcache(Monitor *mon, const QDict *qdict, + QObject **ret_data); #endif Index: qemu/hmp-commands.hx === --- qemu.orig/hmp-commands.
[Qemu-devel] [v7 Patch 2/5]Qemu: Error classes for file reopen and data sync failure
New error classes defined for file reopen failure and data sync error Signed-off-by: Supriya Kannery --- qerror.c |8 qerror.h |6 ++ 2 files changed, 14 insertions(+) Index: qemu/qerror.c === --- qemu.orig/qerror.c +++ qemu/qerror.c @@ -97,6 +97,14 @@ static const QErrorStringTable qerror_ta .desc = "Device '%(device)' is not removable", }, { +.error_fmt = QERR_DATA_SYNC_FAILED, +.desc = "Syncing of data failed for device '%(device)'", +}, +{ +.error_fmt = QERR_REOPEN_FILE_FAILED, +.desc = "Could not reopen '%(filename)'", +}, +{ .error_fmt = QERR_DEVICE_NO_BUS, .desc = "Device '%(device)' has no child bus", }, Index: qemu/qerror.h === --- qemu.orig/qerror.h +++ qemu/qerror.h @@ -87,6 +87,9 @@ QError *qobject_to_qerror(const QObject #define QERR_DEVICE_NOT_FOUND \ "{ 'class': 'DeviceNotFound', 'data': { 'device': %s } }" +#define QERR_DATA_SYNC_FAILED \ +"{ 'class': 'DataSyncFailed', 'data': { 'device': %s } }" + #define QERR_DEVICE_NOT_REMOVABLE \ "{ 'class': 'DeviceNotRemovable', 'data': { 'device': %s } }" @@ -144,6 +147,9 @@ QError *qobject_to_qerror(const QObject #define QERR_OPEN_FILE_FAILED \ "{ 'class': 'OpenFileFailed', 'data': { 'filename': %s } }" +#define QERR_REOPEN_FILE_FAILED \ +"{ 'class': 'ReopenFileFailed', 'data': { 'filename': %s } }" + #define QERR_PROPERTY_NOT_FOUND \ "{ 'class': 'PropertyNotFound', 'data': { 'device': %s, 'property': %s } }"
[Qemu-devel] [v7 Patch 1/5]Qemu: Enhance "info block" to display host cache setting
Enhance "info block" to display hostcache setting for each block device. Example: (qemu) info block ide0-hd0: removable=0 file=../sles11-32.raw ro=0 drv=raw encrypted=0 Enhanced to display "hostcache" setting: (qemu) info block ide0-hd0: removable=0 hostcache=1 file=../sles11-32.raw ro=0 drv=raw encrypted=0 Signed-off-by: Supriya Kannery --- block.c | 20 qmp-commands.hx |2 ++ 2 files changed, 18 insertions(+), 4 deletions(-) Index: qemu/qmp-commands.hx === --- qemu.orig/qmp-commands.hx +++ qemu/qmp-commands.hx @@ -1142,6 +1142,7 @@ Each json-object contain the following: - "locked": true if the device is locked, false otherwise (json-bool) - "tray-open": only present if removable, true if the device has a tray, and it is open (json-bool) +- "hostcache": true if host pagecache enabled, false otherwise (json-bool) - "inserted": only present if the device is inserted, it is a json-object containing the following: - "file": device file name (json-string) @@ -1163,6 +1164,7 @@ Example: { "device":"ide0-hd0", "locked":false, +"hostcache":false, "removable":false, "inserted":{ "ro":false, Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -1866,6 +1866,15 @@ static void bdrv_print_dict(QObject *obj monitor_printf(mon, " tray-open=%d", qdict_get_bool(bs_dict, "tray-open")); } + +if (qdict_haskey(bs_dict, "open_flags")) { +int open_flags = qdict_get_int(bs_dict, "open_flags"); +if (open_flags & BDRV_O_NOCACHE) +monitor_printf(mon, " hostcache=0"); +else +monitor_printf(mon, " hostcache=1"); +} + if (qdict_haskey(bs_dict, "inserted")) { QDict *qdict = qobject_to_qdict(qdict_get(bs_dict, "inserted")); @@ -1903,11 +1912,14 @@ void bdrv_info(Monitor *mon, QObject **r QDict *bs_dict; bs_obj = qobject_from_jsonf("{ 'device': %s, 'type': 'unknown', " -"'removable': %i, 'locked': %i }", +"'removable': %i, 'locked': %i, " +"'hostcache': %i }", bs->device_name, bdrv_dev_has_removable_media(bs), -bdrv_dev_is_medium_locked(bs)); +bdrv_dev_is_medium_locked(bs), +!(bs->open_flags & BDRV_O_NOCACHE)); bs_dict = qobject_to_qdict(bs_obj); +qdict_put(bs_dict, "open_flags", qint_from_int(bs->open_flags)); if (bdrv_dev_has_removable_media(bs)) { qdict_put(bs_dict, "tray-open",
[Qemu-devel] [v7 Patch 0/5]Qemu: Host pagecache setting from cmdline and monitor
Currently cache setting of a block device cannot be changed without restarting a running VM. Following patchset is for enabling dynamic change of cache setting for block devices through qemu monitor and qmp. Code changes are based on patches from Christoph Hellwig and Prerna Saxena. This patchset introduces a. monitor command 'block_set_hostcache' using which host pagecache setting for a block device can be changed dynamically while guest VM is running. b. 'hostcache' - a new option for setting host cache from qemu command -drive "hostcache=on/off". c. BDRVReopenState, a generic structure which can be extended by each of the block drivers to reopen respective image files. Extension of this structure for raw-posix is done for now. I am working on to extend the same for raw-win32 image files as well. Note: 'Hostcache and 'cache' options cannot be used simultaneously from commandline. v7: 1. Added structure BDRVReopenState to support safe reopening of image files. v6: 1. "block_set_hostcache" to replace "block_set" command v5: 1. Defined qerror class for incorrect command syntax. 2. Changed error_report() calls to qerror_report() v4: Added 'hostcache' option to '-drive' commandline option. v3: 1. Command "block_set" for changing various block params 2. Enhanced info-block to display hostcache setting 3. Added qmp interfaces for setting and querying hostcache v2: 1. Support of dynamic cache change only for hostcache. 2. Monitor command "hostcache_get" added to display cache setting 3. Backed off the changes for display of cache setting in "info block" v1: Dynamic cache change through monitor New block command added: "block_set_hostcache" -- Sets hostcache parameter for block device while guest is running. Usage: block_set_hostcache = block device = on/off New 'hostcache' option added to -drive: -drive [file=file][,if=type][,bus=n][,unit=m][,media=d][,index=i]\n" " [,readonly=on|off][,hostcache=on|off]\n" qemu/block.c | 119 ++--- qemu/block.h |2 + qemu/block/raw-posix.c | 67 ++ qemu/block/raw.c | 11 +++ qemu/block_int.h | 15 ++ qemu/blockdev.c| 39 + qemu/blockdev.h|2 + qemu/hmp-commands.hx | 14 + qemu/qemu-common.h |1 qemu/qemu-config.c |4 ++ qemu/qemu-options.hx |2 - qemu/qerror.c |8 + qemu/qerror.h |6 qemu/qmp-commands.hx | 29 +++ 17 files changed, 306 insertions(+), 13 deletions(-)
Re: [Qemu-devel] [RFC] Safely reopening image files by stashing fds
On 08/09/2011 03:02 PM, supriya kannery wrote: > Kevin Wolf wrote: >> Am 09.08.2011 11:22, schrieb supriya kannery: >>> Kevin Wolf wrote: >> >> What I meant is that in the end, with a generic bdrv_reopen(), we can >> have raw-posix only call dup() and fcntl() instead of doing a >> close()/open() sequence if it can satisfy the new flags this way. But >> this would be an implementation detail and not be visible in the >> interface. >> >> Kevin > > ok > - thanks, Supriya > Though I started the RFC patch with defining BDRVReopenState, ended up in enhancing struct BlockDriver with .bdrv_reopen. bdrv_reopen mplementation specific to respective driver is assigned to this function pointer. Please find the implementation of O_DIRECT flag change, done in raw-posix.c. Similar implementation can be done for vmdk (with bdrv_reopen_commit and bdrv_reopen_abort as internal functions in vmdk.c for opening split files), sheepdog, nbd etc.. Request for comments on this approach. Note: Following patch is for demonstrating the approach using raw-posix as sample driver. This patch is not complete. - thanks, Supriya --- block.c | 33 +++-- block/raw-posix.c | 34 ++ block_int.h |1 + 3 files changed, 58 insertions(+), 10 deletions(-) Index: qemu/block/raw-posix.c === --- qemu.orig/block/raw-posix.c +++ qemu/block/raw-posix.c @@ -276,6 +276,39 @@ static int raw_open(BlockDriverState *bs return raw_open_common(bs, filename, flags, 0); } +static int raw_reopen(BlockDriverState *bs, int flags) +{ +BDRVRawState *s = bs->opaque; +int new_fd; +int ret = 0; + +/* Handle request for toggling O_DIRECT */ +if ((bs->open_flags | BDRV_O_NOCACHE) ^ +(flags | BDRV_O_NOCACHE)) +{ +if ((new_fd = dup(s->fd)) <= 0) { +return -1; +} + +if ((ret = fcntl_setfl(new_fd, flags)) < 0) { +close(new_fd); +return ret; +} + +/* Set new flags, so replace old fd with new one */ +close(s->fd); +s->fd = new_fd; +s->open_flags = flags; +bs->open_flags = flags; +} + +/* + * TBD: handle O_DSYNC and other flags for which image + * file has to be reopened + */ +return 0; +} + /* XXX: use host sector size if necessary with: #ifdef DIOCGSECTORSIZE { @@ -886,6 +919,7 @@ static BlockDriver bdrv_file = { .instance_size = sizeof(BDRVRawState), .bdrv_probe = NULL, /* no probe for protocols */ .bdrv_file_open = raw_open, +.bdrv_reopen = raw_reopen, .bdrv_read = raw_read, .bdrv_write = raw_write, .bdrv_close = raw_close, Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -687,20 +687,33 @@ int bdrv_reopen(BlockDriverState *bs, in qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); return ret; } -open_flags = bs->open_flags; -bdrv_close(bs); -ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); -if (ret < 0) { -/* Reopen failed. Try to open with original flags */ -qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); -ret = bdrv_open(bs, bs->filename, open_flags, drv); +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen) { +if ((ret = drv->bdrv_reopen(bs, bdrv_flags)) < 0) { +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +return ret; +} +} else { +/* Use reopen procedure common for drivers */ +open_flags = bs->open_flags; +bdrv_close(bs); + +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); if (ret < 0) { -/* Reopen failed with orig and modified flags */ -abort(); +/* + * Reopen failed. Try to open with original flags +*/ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret < 0) { +/* + * Reopen with orig and modified flags failed + */ +abort(); +} } } - return ret; } Index: qemu/block_int.h === --- qemu.orig/block_int.h +++ qemu/block_int.h @@ -54,6 +54,7 @@ struct BlockDriver { int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename); int (*bdrv_probe_device)(const char *filename); int (*bdrv_open)(BlockDriverState *bs, int flags); +int (*bdrv_reopen)(BlockDriverState *bs, int flags); int (*bdrv_file_open)(BlockDriverState *bs, const char *filename, int flags); int (*bdrv_read)(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors);
Re: [Qemu-devel] [RFC] Safely reopening image files by stashing fds
On 08/09/2011 03:02 PM, supriya kannery wrote: > Kevin Wolf wrote: >> Am 09.08.2011 11:22, schrieb supriya kannery: >>> Kevin Wolf wrote: >> >> What I meant is that in the end, with a generic bdrv_reopen(), we can >> have raw-posix only call dup() and fcntl() instead of doing a >> close()/open() sequence if it can satisfy the new flags this way. But >> this would be an implementation detail and not be visible in the >> interface. >> >> Kevin > > ok > - thanks, Supriya > Though I started the RFC patch with defining BDRVReopenState, ended up in enhancing struct BlockDriver with .bdrv_reopen. bdrv_reopen mplementation specific to respective driver is assigned to this function pointer. Please find the implementation of O_DIRECT flag change, done in raw-posix.c. Similar implementation can be done for vmdk (with bdrv_reopen_commit and bdrv_reopen_abort as internal functions in vmdk.c for opening split files), sheepdog, nbd etc.. Request for comments on this approach. Note: Following patch is for demonstrating the approach using raw-posix as sample driver. This patch is not complete. --- block.c | 33 +++-- block/raw-posix.c | 34 ++ block_int.h |1 + 3 files changed, 58 insertions(+), 10 deletions(-) Index: qemu/block/raw-posix.c === --- qemu.orig/block/raw-posix.c +++ qemu/block/raw-posix.c @@ -276,6 +276,39 @@ static int raw_open(BlockDriverState *bs return raw_open_common(bs, filename, flags, 0); } +static int raw_reopen(BlockDriverState *bs, int flags) +{ +BDRVRawState *s = bs->opaque; +int new_fd; +int ret = 0; + +/* Handle request for toggling O_DIRECT */ +if ((bs->open_flags | BDRV_O_NOCACHE) ^ +(flags | BDRV_O_NOCACHE)) +{ +if ((new_fd = dup(s->fd)) <= 0) { +return -1; +} + +if ((ret = fcntl_setfl(new_fd, flags)) < 0) { +close(new_fd); +return ret; +} + +/* Set new flags, so replace old fd with new one */ +close(s->fd); +s->fd = new_fd; +s->open_flags = flags; +bs->open_flags = flags; +} + +/* + * TBD: handle O_DSYNC and other flags for which image + * file has to be reopened + */ +return 0; +} + /* XXX: use host sector size if necessary with: #ifdef DIOCGSECTORSIZE { @@ -886,6 +919,7 @@ static BlockDriver bdrv_file = { .instance_size = sizeof(BDRVRawState), .bdrv_probe = NULL, /* no probe for protocols */ .bdrv_file_open = raw_open, +.bdrv_reopen = raw_reopen, .bdrv_read = raw_read, .bdrv_write = raw_write, .bdrv_close = raw_close, Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -687,20 +687,33 @@ int bdrv_reopen(BlockDriverState *bs, in qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name); return ret; } -open_flags = bs->open_flags; -bdrv_close(bs); -ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); -if (ret < 0) { -/* Reopen failed. Try to open with original flags */ -qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); -ret = bdrv_open(bs, bs->filename, open_flags, drv); +/* Use driver specific reopen() if available */ +if (drv->bdrv_reopen) { +if ((ret = drv->bdrv_reopen(bs, bdrv_flags)) < 0) { +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +return ret; +} +} else { +/* Use reopen procedure common for drivers */ +open_flags = bs->open_flags; +bdrv_close(bs); + +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv); if (ret < 0) { -/* Reopen failed with orig and modified flags */ -abort(); +/* + * Reopen failed. Try to open with original flags +*/ +qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename); +ret = bdrv_open(bs, bs->filename, open_flags, drv); +if (ret < 0) { +/* + * Reopen with orig and modified flags failed + */ +abort(); +} } } - return ret; } Index: qemu/block_int.h === --- qemu.orig/block_int.h +++ qemu/block_int.h @@ -54,6 +54,7 @@ struct BlockDriver { int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename); int (*bdrv_probe_device)(const char *filename); int (*bdrv_open)(BlockDriverState *bs, int flags); +int (*bdrv_reopen)(BlockDriverState *bs, int flags); int (*bdrv_file_open)(BlockDriverState *bs, const char *filename, int flags); int (*bdrv_read)(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors);
Re: [Qemu-devel] Safely reopening image files by stashing fds
Kevin Wolf wrote: Am 09.08.2011 11:22, schrieb supriya kannery: Kevin Wolf wrote: Am 08.08.2011 09:02, schrieb Supriya Kannery: On 08/05/2011 09:19 PM, Anthony Liguori wrote: On 08/05/2011 10:43 AM, Kevin Wolf wrote: Am 05.08.2011 17:24, schrieb Stefan Hajnoczi: On Fri, Aug 5, 2011 at 3:28 PM, Christoph Hellwig wrote: On Fri, Aug 05, 2011 at 02:12:48PM +0100, Daniel P. Berrange wrote: Because you cannot change O_DIRECT on an open fd :(. This is why we're going through this pain. Hmm, I remember hearing that before, but looking at the current fcntl() manpage, it claims you *can* change O_DIRECT using SET_FL. Perhaps this is a newish feature, but it'd be nicer to use it if possible ? It's been there since day 1 of O_DIRECT support. Sorry, my bad. So for Linux we could just use fcntl for block_set_hostcache and not bother with reopening. However, we will need to reopen should we wish to support changing O_DSYNC. We do wish to support that. Anthony thinks that allowing the guest to toggle WCE is a prerequisite for making cache=writeback the default. And this is something that I definitely want to do for 1.0. Indeed. We discussed the following so far... 1. How to safely reopen image files 2. Dynamic hostcache change 3. Support for dynamic change of O_DSYNC Since 2 is independent of 1, shall I go ahead implementing hostcache change using fcntl. Implementation for safely reopening image files using "BDRVReopenState" can be done separately as a pre-requisite before implementing 3 Doing it separately means that we would introduce yet another callback that is used just to change O_DIRECT. In the end we want it to use bdrv_reopen(), too, so I'm not sure if there is a need for a temporary solution. Could you please explain "In the end we want it to use bdrv_reopen" at bit more. When fcntl() can change O_DIRECT on open fd , is there a need to "re-open" the image file? What I meant is that in the end, with a generic bdrv_reopen(), we can have raw-posix only call dup() and fcntl() instead of doing a close()/open() sequence if it can satisfy the new flags this way. But this would be an implementation detail and not be visible in the interface. Kevin ok - thanks, Supriya