v2 was posted with the RFC tag. Since the last review round I dropped
the tag because I think all the open points are now addressed.
v1 was: "add watermark reporting for block devices", but
"watermark" is incorrectly used. Hence the change in subject.

Changes since v2:
-----------------

Addressed reviewers' comments:
- use node name to find the block driver state to be checked
- report node name in the event notification
- fixed signed vs unsigned integer: use uint64 everywhere, to deal
  with integer overflows (more) gracefully.
- fixed pending style issues
- renamed and made public a few functions to make them testable
- add very basic initial unit tests
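
To illustrate the signed vs unsigned point above, here is a minimal
sketch of an overflow-safe comparison using uint64 only. This is not
the code in the patch: the helper name write_exceeds_threshold() and
its parameters are made up for the example, and clamping to UINT64_MAX
is just one possible way to handle the wrap-around.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): returns true if a write of
 * `bytes` bytes starting at `offset` ends beyond `threshold`.
 * All quantities are unsigned 64-bit byte counts.
 */
static bool write_exceeds_threshold(uint64_t offset, uint64_t bytes,
                                    uint64_t threshold)
{
    /*
     * offset + bytes may wrap around for very large values; clamp the
     * end of the write to UINT64_MAX instead of adding blindly.
     */
    uint64_t end = (bytes > UINT64_MAX - offset) ? UINT64_MAX
                                                 : offset + bytes;

    return end > threshold;
}
```

With signed integers the same addition could overflow into undefined
behaviour; with uint64 the wrap-around is well defined and can be
detected before it happens.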

Changes since v1:
-----------------

Addressed reviewers' comments. Highlights:
- fixed terminology: "watermark" -> "usage threshold"
- threshold is expressed in bytes
- make the event trigger only once when the threshold is crossed
- configured threshold visible in 'query-block' output
- fixed bugs
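
The "trigger only once" behaviour listed above can be sketched roughly
as follows. This is only an illustration under my own assumptions, not
the code in the patch: the UsageThresholdState type, the function names
and the convention that 0 means "no threshold set" are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-node state; 0 is assumed to mean "not armed". */
typedef struct {
    uint64_t threshold_bytes;
} UsageThresholdState;

/* Arm (or re-arm) the threshold, expressed in bytes. */
static void usage_threshold_set(UsageThresholdState *s, uint64_t bytes)
{
    s->threshold_bytes = bytes;
}

/*
 * Called with the end offset of a write. Returns true if the event
 * should be emitted. The threshold is disarmed on the first crossing,
 * so later writes do not emit the event again until it is re-armed.
 */
static bool usage_threshold_check(UsageThresholdState *s, uint64_t write_end)
{
    if (s->threshold_bytes == 0 || write_end <= s->threshold_bytes) {
        return false;
    }
    s->threshold_bytes = 0;
    return true;
}
```

In this sketch the management application would have to set the
threshold again after handling the event, for example once it has
extended the underlying volume.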

Open issues:
------------

Not all node names show up in the 'query-block' output, but I'll
start a different thread to discuss this.

Followup:
---------

Patches I have in my queue and would like to post as followup:
- some more unit testing
- add support to set the threshold at device creation


Rationale and context from v1
-----------------------------

I'm one of the oVirt developers (http://www.ovirt.org);
oVirt is a virtualization management application built
around qemu/kvm, so it is nice to get in touch :)

We have begun a big scalability improvement effort, aiming to
support hundreds of VMs per host without problems, with plans
to support thousands in the not-so-distant future.
In doing so, we are reviewing our usage flows.

One of them is thin-provisioned storage, which is used
quite extensively, with block devices (iSCSI for example)
and COW images.
When using thin provisioning, oVirt tries hard to hide this
fact from the guest OS. To do so, it closely watches
the usage of the device and resizes it when its usage exceeds
a configured threshold (the "high water mark"), in order
to avoid the guest OS being paused because space is exhausted.

To do the watching, we poll the devices using libvirt [1],
which in turn uses query-blockstats.
This is suboptimal with just one VM, but with hundreds of them,
let alone thousands, it doesn't scale and it is quite a resource
hog.

It would be great to have this watermark concept supported in qemu,
with a new event raised when the limit is crossed.

To track this RFE I filed https://bugs.launchpad.net/qemu/+bug/1338957

Moreover, I had the chance to take a look at the QEMU sources
and came up with this tentative patch, which I'd also like
to submit.

Notes
-----

[0]: https://lists.gnu.org/archive/html/qemu-devel/2014-07/msg01348.html
[1]: http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=ebb0c19c48690f0598de954f8e0e9d4d29d48b85


Francesco Romani (1):
  block: add event when disk usage exceeds threshold

 block/Makefile.objs             |   1 +
 block/qapi.c                    |   3 +
 block/usage-threshold.c         | 124 ++++++++++++++++++++++++++++++++++++++++
 include/block/block_int.h       |   4 ++
 include/block/usage-threshold.h |  62 ++++++++++++++++++++
 qapi/block-core.json            |  48 +++++++++++++++-
 qmp-commands.hx                 |  28 +++++++++
 tests/Makefile                  |   3 +
 tests/test-usage-threshold.c    | 101 ++++++++++++++++++++++++++++++++
 9 files changed, 373 insertions(+), 1 deletion(-)
 create mode 100644 block/usage-threshold.c
 create mode 100644 include/block/usage-threshold.h
 create mode 100644 tests/test-usage-threshold.c

-- 
1.9.3

