On 2014-10-20 at 12:03, Peter Lieven wrote:
On 20.10.2014 11:27, Max Reitz wrote:
On 2014-10-20 at 11:14, Peter Lieven wrote:
On 20.10.2014 10:59, Max Reitz wrote:
On 2014-10-20 at 08:14, Peter Lieven wrote:
the block layer silently merges write requests since
s/^t/T/
commit 40b4f539. This patch adds a knob to disable
this feature as there has been some discussion lately
if multiwrite is a good idea at all and as it falsifies
benchmarks.
Signed-off-by: Peter Lieven <p...@kamp.de>
---
block.c | 4 ++++
block/qapi.c | 1 +
blockdev.c | 7 +++++++
hmp.c | 4 ++++
include/block/block_int.h | 1 +
qapi/block-core.json | 10 +++++++++-
qemu-options.hx | 1 +
qmp-commands.hx | 2 ++
8 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/block.c b/block.c
index 27533f3..1658a72 100644
--- a/block.c
+++ b/block.c
@@ -4531,6 +4531,10 @@ static int
multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs,
{
int i, outidx;
+ if (!bs->write_merging) {
+ return num_reqs;
+ }
+
// Sort requests by start sector
qsort(reqs, num_reqs, sizeof(*reqs), &multiwrite_req_compare);
diff --git a/block/qapi.c b/block/qapi.c
index 9733ebd..02251dd 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -58,6 +58,7 @@ BlockDeviceInfo
*bdrv_block_device_info(BlockDriverState *bs)
info->backing_file_depth = bdrv_get_backing_file_depth(bs);
info->detect_zeroes = bs->detect_zeroes;
+ info->write_merging = bs->write_merging;
if (bs->io_limits_enabled) {
ThrottleConfig cfg;
diff --git a/blockdev.c b/blockdev.c
index e595910..13e47b8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char
*file, QDict *bs_opts,
const char *id;
bool has_driver_specific_opts;
BlockdevDetectZeroesOptions detect_zeroes;
+ bool write_merging;
BlockDriver *drv = NULL;
/* Check common options by copying from bs_opts to opts,
all other options
@@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char
*file, QDict *bs_opts,
snapshot = qemu_opt_get_bool(opts, "snapshot", 0);
ro = qemu_opt_get_bool(opts, "read-only", 0);
copy_on_read = qemu_opt_get_bool(opts, "copy-on-read", false);
+ write_merging = qemu_opt_get_bool(opts, "write-merging", true);
Using this option in blockdev_init() means that you can only enable
or disable merging for the top layer (the root BDS). Furthermore,
since you don't set bs->write_merging in bdrv_new() (or at least
bdrv_open()), it actually defaults to false and only for the top
layer it defaults to true.
Therefore, if after this patch a format block driver issues a
multiwrite to its file, the write will not be merged and the user
can do nothing about it. I don't suppose this is intentional...?
I am not sure if a block driver actually can do this at all? The
only way to enter multiwrite is from virtio_blk_handle_request in
virtio-blk.c.
Well, there's also qemu-io -c multiwrite (which only accesses the
root BDS as well). But other than that, yes, you're right. So, in
practice it shouldn't matter.
I propose evaluating the option in bdrv_open() and setting
bs->write_merging there.
I wasn't aware actually. I remember that someone asked me to
implement discard_zeroes in blockdev_init. I think it was something
related to QMP. So we still might
need to check parameters at 2 positions? It is quite confusing which
paramter has to be parsed where.
As for me, I don't know why some options are parsed in
blockdev_init() at all. I guess all the options currently parsed in
blockdev_init() should later be moved to the BlockBackend, at least
that would be the idea. In practice, we cannot do that: Things like
caching will stay in the BlockDriverState.
I think it's just broken. IMHO, everything related to the BB should
be in blockdev_init() and everything related to the BDS should be in
bdrv_open(). So the question is now whether you want write_merging to
be in the BDS or in the BB. Considering BB is in Kevin's block branch
as of last Friday, you might actually want to work on that branch and
move the field into the BB if you decide that that's the place it
should be in.
Actually I there a pros and cons for both BDS and BB. As of now my
intention was to be able to turn it off. As there are People who would
like to see it completely disappear I would not spent too much effort
in that switch today.
Looking at BB it is a BDS thing and thus belongs to bdrv_open. But
this is true for discard_zeroes (and others) as well. Kevin, Stefan,
ultimatively where should it be parsed?
Yes, and for cache, too. That's what I meant with "it's just broken".
Max
I have on my todo list the following to give you a figure what might
happen. All this is for 2.3+ except for the accounting maybe:
- add accounting for merged requests (to have a metric for
modifications)
- evaluate if sorting the requests really helps
- simplify the merge conditions and do not keep a fixed list of
requests just merge if the Nth and N-1 requests are mergable (without
sorting).
- think about handling the complete merging in virtio-blk with the
above simplifications. currently it is a virtio-blk only feature anyway.
- add read merging
Peter