Re: [libvirt] [PATCH] docs: improve virsh man page synopses
On Thu, Jul 14, 2011 at 11:38:54 -0600, Eric Blake wrote: optional is not a very good meta-syntactic construct in our man page. I scrubbed this, and additionally improved some documentation on mutually exclusive options. For example, {[--live] [--config] | --current} implies that the set must be satisfied ({}), and within the set, you either have a mandatory --current, or an optional combination of 0, 1, or both --live and --config. Hmm, why not [[--live] [--config] | --current] to make it more obvious that none of the options is in fact required? Otherwise I like it. * tools/virsh.pod: Use [name] rather than optional name for optional arguments. --- I finally did something to address a pet peeve of mine. tools/virsh.pod | 199 ++- 1 files changed, 108 insertions(+), 91 deletions(-) diff --git a/tools/virsh.pod b/tools/virsh.pod index 1a98ec1..c6549f1 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod ... +=item Bmigrate [I--live] [I--direct] [I--p2p [I--tunnelled]] +[I--persistent] [I--undefinesource] [I--suspend] [I--copy-storage-all] +[I--copy-storage-inc] [I--verbose] Idomain-id Idesturi [Imigrateuri] +[Idname] [Itimeout] Shouldn't [Itimeout] be specified as [I--timeout Bseconds]? Migrate domain to another host. Add I--live for live migration; I--p2p for peer-2-peer migration; I--direct for direct migration; or I--tunnelled @@ -544,7 +545,8 @@ Imigrateuri is the migration URI, which usually can be omitted. Idname is used for renaming the domain to new name during migration, which also usually can be omitted. -I--timeout forces guest to suspend when live migration exceeds timeout, and +I--timeout number forces guest to suspend when live migration exceeds +Inumber seconds, and then the migration will complete offline. It can only be used with I--live. ... -=item Bschedinfo optional I--set Bparameter=value Idomain-id I--config -I--live I--current +=item Bschedinfo [I--set Bparameter=value] Idomain-id {[I--config] +[I--live] | I--current} Make it more obvious that neither or --config, --live, --current need to be specified as mentioned above. ... -=item Bsetmem Idomain-id Bkilobytes optional I--config I--live -I--current +=item Bsetmem Idomain-id Bkilobytes {[I--config] [I--live] | +I--current} The same here. ... -=item Bsetmaxmem Idomain-id Bkilobytes optional I--config I--live -I--current +=item Bsetmaxmem Idomain-id Bkilobytes {[I--config] [I--live] | +I--current} And here. ... -=item Bvcpucount Idomain-id optional I--maximum I--current -I--config I--live +=item Bvcpucount Idomain-id [I--maximum] {I--current | +[I--config] [I--live]} And here. Print information about the virtual cpu counts of the given Idomain-id. If no flags are specified, all possible counts are @@ -830,8 +832,8 @@ values; these two flags cannot both be specified. Returns basic information about the domain virtual CPUs, like the number of vCPUs, the running time, the affinity to physical processors. -=item Bvcpupin Idomain-id optional Ivcpu Icpulist I--live I--config -I--current +=item Bvcpupin Idomain-id [Ivcpu] [Icpulist] {[I--live] +[I--config] | I--current} Ditto. ... Query or change the pinning of domain VCPUs to host physical CPUs. To pin a single Ivcpu, specify Icpulist; otherwise, you can query one @@ -876,9 +878,9 @@ See the documentation to learn about libvirt XML format for a device. For cdrom and floppy devices, this command only replaces the media within the single existing device; consider using Bupdate-device for this usage. -=item Battach-disk Idomain-id Isource Itarget optional -I--driver driver I--subdriver subdriver I--type type -I--mode mode I--persistent I--sourcetype soucetype +=item Battach-disk Idomain-id Isource Itarget +[I--driver driver] [I--subdriver subdriver] [I--type type] +[I--mode mode] [I--persistent] [I--sourcetype soucetype] Probably [I--driver Bdriver] instead of [I--driver driver] to be more consistent with with other places, but this would probably better fit in a follow-up patch. ACK with those nits fixed. Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH RFC v3 1/6] Introduce the function virCgroupForVcpu
Introduce the function virCgroupForVcpu() to create sub directory for each vcpu. --- src/libvirt_private.syms |1 + src/util/cgroup.c| 72 ++--- src/util/cgroup.h|5 +++ 3 files changed, 73 insertions(+), 5 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 3e3b1dd..30804eb 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -67,6 +67,7 @@ virCgroupDenyAllDevices; virCgroupDenyDevicePath; virCgroupForDomain; virCgroupForDriver; +virCgroupForVcpu; virCgroupFree; virCgroupGetBlkioWeight; virCgroupGetCpuShares; diff --git a/src/util/cgroup.c b/src/util/cgroup.c index 740cedf..8994aca 100644 --- a/src/util/cgroup.c +++ b/src/util/cgroup.c @@ -52,6 +52,16 @@ struct virCgroup { struct virCgroupController controllers[VIR_CGROUP_CONTROLLER_LAST]; }; +typedef enum { +VIR_CGROUP_NONE = 0, /* create subdir under each cgroup if possible. */ +VIR_CGROUP_MEM_HIERACHY = 1 0, /* call virCgroupSetMemoryUseHierarchy + * before creating subcgroups and + * attaching tasks + */ +VIR_CGROUP_VCPU = 1 1, /* create subdir only under the cgroup cpu, + * cpuacct and cpuset if possible. */ +} virCgroupFlags; + /** * virCgroupFree: * @@ -503,7 +513,7 @@ static int virCgroupSetMemoryUseHierarchy(virCgroupPtr group) } static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group, - int create, bool memory_hierarchy) + int create, unsigned int flags) { int i; int rc = 0; @@ -516,6 +526,13 @@ static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group, if (!group-controllers[i].mountPoint) continue; +/* We need to control cpu bandwidth for each vcpu now */ +if ((flags VIR_CGROUP_VCPU) i != VIR_CGROUP_CONTROLLER_CPU) { +/* treat it as unmounted and we can use virCgroupAddTask */ +VIR_FREE(group-controllers[i].mountPoint); +continue; +} + rc = virCgroupPathOfController(group, i, , path); if (rc 0) return rc; @@ -555,7 +572,7 @@ static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group, * Note that virCgroupSetMemoryUseHierarchy should always be * called prior to creating subcgroups and attaching tasks. */ -if (memory_hierarchy +if ((flags VIR_CGROUP_MEM_HIERACHY) group-controllers[VIR_CGROUP_CONTROLLER_MEMORY].mountPoint != NULL (i == VIR_CGROUP_CONTROLLER_MEMORY || STREQ(group-controllers[i].mountPoint, group-controllers[VIR_CGROUP_CONTROLLER_MEMORY].mountPoint))) { @@ -641,7 +658,7 @@ static int virCgroupAppRoot(int privileged, if (rc != 0) goto cleanup; -rc = virCgroupMakeGroup(rootgrp, *group, create, false); +rc = virCgroupMakeGroup(rootgrp, *group, create, VIR_CGROUP_NONE); cleanup: virCgroupFree(rootgrp); @@ -801,7 +818,7 @@ int virCgroupForDriver(const char *name, VIR_FREE(path); if (rc == 0) { -rc = virCgroupMakeGroup(rootgrp, *group, create, false); +rc = virCgroupMakeGroup(rootgrp, *group, create, VIR_CGROUP_NONE); if (rc != 0) virCgroupFree(group); } @@ -861,7 +878,7 @@ int virCgroupForDomain(virCgroupPtr driver, * a group for driver, is to avoid overhead to track * cumulative usage that we don't need. */ -rc = virCgroupMakeGroup(driver, *group, create, true); +rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_MEM_HIERACHY); if (rc != 0) virCgroupFree(group); } @@ -879,6 +896,51 @@ int virCgroupForDomain(virCgroupPtr driver ATTRIBUTE_UNUSED, #endif /** + * virCgroupForVcpu: + * + * @driver: group for the domain + * @vcpuid: id of the vcpu + * @group: Pointer to returned virCgroupPtr + * + * Returns 0 on success + */ +#if defined HAVE_MNTENT_H defined HAVE_GETMNTENT_R +int virCgroupForVcpu(virCgroupPtr driver, + int vcpuid, + virCgroupPtr *group, + int create) +{ +int rc; +char *path; + +if (driver == NULL) +return -EINVAL; + +if (virAsprintf(path, %s/vcpu%d, driver-path, vcpuid) 0) +return -ENOMEM; + +rc = virCgroupNew(path, group); +VIR_FREE(path); + +if (rc == 0) { +rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_VCPU); +if (rc != 0) +virCgroupFree(group); +} + +return rc; +} +#else +int virCgroupForVcpu(virCgroupPtr driver ATTRIBUTE_UNUSED, + int vcpuid ATTRIBUTE_UNUSED, + virCgroupPtr *group ATTRIBUTE_UNUSED, +
[libvirt] [PATCH RFC v3 5/6] qemu: Implement cfs_period and cfs_quota's modification
--- src/qemu/qemu_driver.c | 259 +++- 1 files changed, 234 insertions(+), 25 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 8d54e58..c5d0e05 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -5111,6 +5111,7 @@ static char *qemuGetSchedulerType(virDomainPtr dom, { struct qemud_driver *driver = dom-conn-privateData; char *ret = NULL; +char *cfs_period_path = NULL; qemuDriverLock(driver); if (!qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { @@ -5119,14 +5120,29 @@ static char *qemuGetSchedulerType(virDomainPtr dom, goto cleanup; } -if (nparams) -*nparams = 1; +/* check whether the host supports CFS bandwidth */ +if (virCgroupPathOfController(driver-cgroup, VIR_CGROUP_CONTROLLER_CPU, + cpu.cfs_period_us, cfs_period_path) 0) { +qemuReportError(VIR_ERR_INTERNAL_ERROR, +%s, +_(cannot get the path of cgroup CPU controller)); +goto cleanup; +} + +if (nparams) { +if (access(cfs_period_path, F_OK) 0) { +*nparams = 1; +} else { +*nparams = 3; +} +} ret = strdup(posix); if (!ret) virReportOOMError(); cleanup: +VIR_FREE(cfs_period_path); qemuDriverUnlock(driver); return ret; } @@ -5753,6 +5769,48 @@ cleanup: return ret; } +static int +qemuSetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, + unsigned long long period, long long quota) +{ +int i; +qemuDomainObjPrivatePtr priv = vm-privateData; +virCgroupPtr cgroup_vcpu = NULL; +int rc; + +if (period == 0 quota == 0) +return 0; + +if (priv-nvcpupids == 0 || priv-vcpupids[0] == vm-pid) { +/* If we does not know VCPU-PID mapping or all vcpu runs in the same + * thread, we can not control each vcpu. + */ +return qemuSetupCgroupVcpuBW(cgroup, period, quota); +} + +for (i = 0; i priv-nvcpupids; i++) { +rc = virCgroupForVcpu(cgroup, i, cgroup_vcpu, 0); +if (rc 0) { +virReportSystemError(-rc, + _(Unable to find vcpu cgroup for %s(vcpu: +%d)), + vm-def-name, i); +goto cleanup; +} + +if (qemuSetupCgroupVcpuBW(cgroup_vcpu, period, quota) 0) +goto cleanup; + +virCgroupFree(cgroup_vcpu); +} + +return 0; + +cleanup: +virCgroupFree(cgroup_vcpu); +return -1; +} + static int qemuSetSchedulerParametersFlags(virDomainPtr dom, virTypedParameterPtr params, int nparams, @@ -5762,9 +5820,10 @@ static int qemuSetSchedulerParametersFlags(virDomainPtr dom, int i; virCgroupPtr group = NULL; virDomainObjPtr vm = NULL; -virDomainDefPtr persistentDef = NULL; +virDomainDefPtr vmdef = NULL; int ret = -1; bool isActive; +int rc; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG, -1); @@ -5788,10 +5847,17 @@ static int qemuSetSchedulerParametersFlags(virDomainPtr dom, flags = VIR_DOMAIN_AFFECT_CONFIG; } -if ((flags VIR_DOMAIN_AFFECT_CONFIG) !vm-persistent) { -qemuReportError(VIR_ERR_OPERATION_INVALID, %s, -_(cannot change persistent config of a transient domain)); -goto cleanup; +if (flags VIR_DOMAIN_AFFECT_CONFIG) { +if (!vm-persistent) { +qemuReportError(VIR_ERR_OPERATION_INVALID, %s, +_(cannot change persistent config of a transient domain)); +goto cleanup; +} + +/* Make a copy for updated domain. */ +vmdef = virDomainObjCopyPersistentDef(driver-caps, vm); +if (!vmdef) +goto cleanup; } if (flags VIR_DOMAIN_AFFECT_LIVE) { @@ -5818,7 +5884,6 @@ static int qemuSetSchedulerParametersFlags(virDomainPtr dom, virTypedParameterPtr param = params[i]; if (STREQ(param-field, cpu_shares)) { -int rc; if (param-type != VIR_TYPED_PARAM_ULLONG) { qemuReportError(VIR_ERR_INVALID_ARG, %s, _(invalid type for cpu_shares tunable, expected a 'ullong')); @@ -5837,19 +5902,47 @@ static int qemuSetSchedulerParametersFlags(virDomainPtr dom, } if (flags VIR_DOMAIN_AFFECT_CONFIG) { -persistentDef = virDomainObjGetPersistentDef(driver-caps, vm); -if (!persistentDef) { -qemuReportError(VIR_ERR_INTERNAL_ERROR, %s, -_(can't get persistentDef)); +vmdef-cputune.shares = params[i].value.ul; +
[libvirt] [PATCH v3 4/6] qemu: Implement period and quota tunable XML configuration and parsing
--- src/conf/domain_conf.c | 20 +++- src/conf/domain_conf.h |2 + src/qemu/qemu_cgroup.c | 127 +++ src/qemu/qemu_cgroup.h |4 + src/qemu/qemu_process.c |4 + tests/qemuxml2argvdata/qemuxml2argv-cputune.xml |2 + 6 files changed, 157 insertions(+), 2 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 8c3e44e..f959a48 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -6021,6 +6021,14 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps, def-cputune.shares) 0) def-cputune.shares = 0; +if (virXPathULongLong(string(./cputune/period[1]), ctxt, + def-cputune.period) 0) +def-cputune.period = 0; + +if (virXPathLongLong(string(./cputune/quota[1]), ctxt, + def-cputune.quota) 0) +def-cputune.quota = 0; + if ((n = virXPathNodeSet(./cputune/vcpupin, ctxt, nodes)) 0) { goto error; } @@ -9721,12 +9729,19 @@ virDomainDefFormatInternal(virDomainDefPtr def, virBufferAsprintf(buf, current='%u', def-vcpus); virBufferAsprintf(buf, %u/vcpu\n, def-maxvcpus); -if (def-cputune.shares || def-cputune.vcpupin) +if (def-cputune.shares || def-cputune.vcpupin || +def-cputune.period || def-cputune.quota) virBufferAddLit(buf, cputune\n); if (def-cputune.shares) virBufferAsprintf(buf, shares%lu/shares\n, def-cputune.shares); +if (def-cputune.period) +virBufferAsprintf(buf, period%llu/period\n, + def-cputune.period); +if (def-cputune.quota) +virBufferAsprintf(buf, quota%lld/quota\n, + def-cputune.quota); if (def-cputune.vcpupin) { int i; for (i = 0; i def-cputune.nvcpupin; i++) { @@ -9748,7 +9763,8 @@ virDomainDefFormatInternal(virDomainDefPtr def, } } -if (def-cputune.shares || def-cputune.vcpupin) +if (def-cputune.shares || def-cputune.vcpupin || +def-cputune.period || def-cputune.quota) virBufferAddLit(buf, /cputune\n); if (def-numatune.memory.nodemask) diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 172d3c2..02e5ae4 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1195,6 +1195,8 @@ struct _virDomainDef { struct { unsigned long shares; +unsigned long long period; +long long quota; int nvcpupin; virDomainVcpuPinDefPtr *vcpupin; } cputune; diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index b357852..ab87fab 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -24,6 +24,7 @@ #include config.h #include qemu_cgroup.h +#include qemu_domain.h #include cgroup.h #include logging.h #include memory.h @@ -376,6 +377,132 @@ cleanup: return -1; } +int qemuSetupCgroupVcpuBW(virCgroupPtr cgroup, unsigned long long period, + long long quota) +{ +int rc; +unsigned long long old_period; + +if (period == 0 quota == 0) +return 0; + +if (period) { +/* get old period, and we can rollback if set quota failed */ +rc = virCgroupGetCpuCfsPeriod(cgroup, old_period); +if (rc 0) { +virReportSystemError(-rc, + _(%s), Unable to get cpu bandwidth period); +return -1; +} + +rc = virCgroupSetCpuCfsPeriod(cgroup, period); +if (rc 0) { +virReportSystemError(-rc, + _(%s), Unable to set cpu bandwidth period); +return -1; +} +} + +if (quota) { +rc = virCgroupSetCpuCfsQuota(cgroup, quota); +if (rc 0) { +virReportSystemError(-rc, + _(%s), Unable to set cpu bandwidth quota); +goto cleanup; +} +} + +return 0; + +cleanup: +if (period) { +rc = virCgroupSetCpuCfsPeriod(cgroup, old_period); +if (rc 0) +virReportSystemError(-rc, + _(%s), + Unable to rollback cpu bandwidth period); +} + +return -1; +} + +int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm) +{ +virCgroupPtr cgroup = NULL; +virCgroupPtr cgroup_vcpu = NULL; +qemuDomainObjPrivatePtr priv = vm-privateData; +int rc; +unsigned int i; +unsigned long long period = vm-def-cputune.period; +long long quota = vm-def-cputune.quota; + +if (driver-cgroup == NULL) +return 0; /* Not supported, so claim success */ + +rc = virCgroupForDomain(driver-cgroup, vm-def-name, cgroup, 0); +if (rc != 0) { +
[libvirt] [PATCH RFC v3 6/6] doc: Add documentation for new cputune elements period and quota
--- docs/formatdomain.html.in | 19 +++ 1 files changed, 19 insertions(+), 0 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 269fc30..d388332 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -307,6 +307,8 @@ lt;vcpupin vcpu=2 cpuset=2,3/gt; lt;vcpupin vcpu=3 cpuset=0,4/gt; lt;sharesgt;2048lt;/sharesgt; +lt;periodgt;100lt;/periodgt; +lt;quotagt;-1lt;/quotagt; lt;/cputunegt; lt;numatunegt; lt;memory mode=strict nodeset=1-4,^3/gt; @@ -400,6 +402,23 @@ 2048 will get twice as much CPU time as a VM configured with value 1024. span class=sinceSince 0.9.0/span /dd + dtcodeperiod/code/dt + dd +The optional codeperiod/code element specifies the enforcement +interval(unit: microseconds). Within codeperiod/code, the domain +will not be allowed to consume more than codequota/code worth of +runtime. The value should be in range [1000, 100]. +span class=sinceSince 0.9.4/span + /dd + dtcodequota/code/dt + dd +The optional codequota/code element specifies the maximum allowed +bandwidth(unit: microseconds). A domain with codequota/code as any +negative value indicates that the domain has infinite bandwidth, which +means that it is not bandwidth controlled. The value should be in range +[1000, 18446744073709551] or less than 0. +span class=sinceSince 0.9.4/span + /dd dtcodenumatune/code/dt dd The optional codenumatune/code element provides details of -- 1.7.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] docs: Fix spice documentation typo
于 2011年07月18日 17:33, Michal Privoznik 写道: We missed ending tag for paragraph element --- docs/formatdomain.html.in |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 269fc30..a54ee6a 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -1989,7 +1989,7 @@ qemu-kvm -net nic,model=? /dev/null thecodecopypaste/code property tocodeno/code,span class=sincesince 0.9.3/span. -/ +/p /dd dtcoderdp/code/dt dd ACK Osier -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] docs: Fix spice documentation typo
On 18.07.2011 12:00, Osier Yang wrote: 于 2011年07月18日 17:33, Michal Privoznik 写道: We missed ending tag for paragraph element --- docs/formatdomain.html.in | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 269fc30..a54ee6a 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -1989,7 +1989,7 @@ qemu-kvm -net nic,model=? /dev/null thecodecopypaste/code property tocodeno/code,span class=sincesince 0.9.3/span. -/ +/p /dd dtcoderdp/code/dt dd ACK Osier Thanks, pushed. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC PATCH] qemu: Fail APIs not allowed during async job
On Fri, Jul 15, 2011 at 16:18:54 +0100, Daniel P. Berrange wrote: On Fri, Jul 15, 2011 at 04:41:38PM +0200, Jiri Denemark wrote: When an asynchronous job is running while another API that is incompatible with that job is called, we now try to wait until the job finishes and either run the API or fail with timeout. I guess nicer solution is to just fail such API immediately and let the application retry once the asynchronous job ends. I'm not entirely convinced this is a good idea, because IIUC, what this is in effect doing is having a zero second timeout. Previously we would wait forever, currently we wait upto 30 seconds IIRC, and now we'll wait 0 seconds. I think this will create lots of spurious timeouts for applications to needlessly deal with. I'm not sure how far in the past are you referring to by saying previously. Anyway since few years age we've been waiting up to 30 seconds. Now we would wait for 0 seconds but only in case current job is migration/save/dump. If the job doesn't finish in 30 seconds, we would just fail with timeout providing quite unhelpful error message. This way we would provide a better error message. Removing the timeout should be fine in this case since in most cases (migration/save) the domain won't exist after the job finishes anyway so the current API would fail anyway. However, if we still want to have the timeout there, we could just make the error message better in case we're waiting for incompatible async job to finish. Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC PATCH] qemu: Fail APIs not allowed during async job
On Fri, Jul 15, 2011 at 09:09:52 -0600, Eric Blake wrote: On 07/15/2011 08:41 AM, Jiri Denemark wrote: When an asynchronous job is running while another API that is incompatible with that job is called, we now try to wait until the job finishes and either run the API or fail with timeout. I guess nicer solution is to just fail such API immediately and let the application retry once the asynchronous job ends. --- src/qemu/THREADS.txt |5 ++--- src/qemu/qemu_domain.c | 28 +++- 2 files changed, 17 insertions(+), 16 deletions(-) If all such APIs have a flag argument, then we could make the behavior configurable - passing 0 blocks until possible, and passing VIR_DOMAIN_OPERATION_NONBLOCK as a flag returns immediately. But we probably don't have uniform flags arguments. Which APIs are affected? All APIs that modify state, currently the following list (qemu driver names): Resume Shutdown Reboot SetMemoryFlags InjectNMI SetVcpusFlags Suspend Restore ModifyDeviceFlags SnapshotCreateActive RevertToSnapshot SnapshotDelete MonitorCommand If we can't control things by a flag, then returning a specific failure seems like the best way to go (it is easier to write an app that can retry than it is to write an app that doesn't suffer from unintended blocking). So this patch seems sane to me, although I'd still like a list of all affected APIs before giving ack. BTW, the patch is not complete so acking or not would only affect the idea. Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 1/5] make libvirt_driver_storage use pthread function
On Sun, Jul 17, 2011 at 06:44:57PM +0800, Guannan Ren wrote: --- src/Makefile.am |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/Makefile.am b/src/Makefile.am index 54b1ca0..90c4393 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -878,7 +878,7 @@ libvirt_driver_storage_la_SOURCES = libvirt_driver_storage_la_CFLAGS = \ -I@top_srcdir@/src/conf $(AM_CFLAGS) libvirt_driver_storage_la_LDFLAGS = $(AM_LDFLAGS) -libvirt_driver_storage_la_LIBADD = +libvirt_driver_storage_la_LIBADD = libvirt_util.la if WITH_SECDRIVER_SELINUX libvirt_driver_storage_la_LIBADD += $(SELINUX_LIBS) endif NACK, this is not allowed. It sounds like you need to add another export to src/libvirt_private.syms Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 4/5] In storageVolumeCreateXML, spawn a new thread for volbuilding, in storageVolumeDelete, generate the signal
On Sun, Jul 17, 2011 at 06:45:00PM +0800, Guannan Ren wrote: --- src/storage/storage_backend.c |9 src/storage/storage_driver.c | 83 - 2 files changed, 82 insertions(+), 10 deletions(-) diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index f632edd..bc10933 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -1632,3 +1632,12 @@ virStorageBackendRunProgNul(virConnectPtr conn, return -1; } #endif /* WIN32 */ + +void virStorageBackendVoluCleanup(void *arg) +{ + +volBuildThreadPtr data = arg; + +data-buildret = 0; +data-threadEnd = 1; +} diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c index 997b876..d8ac648 100644 --- a/src/storage/storage_driver.c +++ b/src/storage/storage_driver.c @@ -48,6 +48,7 @@ #include files.h #include fdstream.h #include configmake.h +#include threads.h #define VIR_FROM_THIS VIR_FROM_STORAGE @@ -1276,6 +1277,29 @@ cleanup: static int storageVolumeDelete(virStorageVolPtr obj, unsigned int flags); +static void virStorageBuildVol(void *arg) +{ +int ret = -1; +volBuildThreadPtr data = arg; +virStoragePoolObjPtr pool = data-pool; +virStorageVolDefPtr vol = data-vol; + +pthread_cleanup_push(virStorageBackendVoluCleanup, data); + +ret = data-buildvol(data-obj-conn, pool, vol); + +pthread_cleanup_pop(0); NACK, all use of pthread specific APIs must be in src/util/threads-pthread.c Independantly of this, IMHO pthread cancellation handlers are a recipe for trouble because it is incredibly hard to make sure you correctly cleanup all resources in the thread, even with use of cleanup handlers. IMHO, threads should be made to monitor some external quit boolean variable (eg see threadpool.c thread termination). Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 4/5] In storageVolumeCreateXML, spawn a new thread for volbuilding, in storageVolumeDelete, generate the signal
On 07/18/2011 06:57 PM, Daniel P. Berrange wrote: On Sun, Jul 17, 2011 at 06:45:00PM +0800, Guannan Ren wrote: --- src/storage/storage_backend.c |9 src/storage/storage_driver.c | 83 - 2 files changed, 82 insertions(+), 10 deletions(-) diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index f632edd..bc10933 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -1632,3 +1632,12 @@ virStorageBackendRunProgNul(virConnectPtr conn, return -1; } #endif /* WIN32 */ + +void virStorageBackendVoluCleanup(void *arg) +{ + +volBuildThreadPtr data = arg; + +data-buildret = 0; +data-threadEnd = 1; +} diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c index 997b876..d8ac648 100644 --- a/src/storage/storage_driver.c +++ b/src/storage/storage_driver.c @@ -48,6 +48,7 @@ #include files.h #include fdstream.h #include configmake.h +#include threads.h #define VIR_FROM_THIS VIR_FROM_STORAGE @@ -1276,6 +1277,29 @@ cleanup: static int storageVolumeDelete(virStorageVolPtr obj, unsigned int flags); +static void virStorageBuildVol(void *arg) +{ +int ret = -1; +volBuildThreadPtr data = arg; +virStoragePoolObjPtr pool = data-pool; +virStorageVolDefPtr vol = data-vol; + +pthread_cleanup_push(virStorageBackendVoluCleanup, data); + +ret = data-buildvol(data-obj-conn, pool, vol); + +pthread_cleanup_pop(0); NACK, all use of pthread specific APIs must be in src/util/threads-pthread.c Independantly of this, IMHO pthread cancellation handlers are a recipe for trouble because it is incredibly hard to make sure you correctly cleanup all resources in the thread, even with use of cleanup handlers. IMHO, threads should be made to monitor some external quit boolean variable (eg see threadpool.c thread termination). Daniel yep but the push and pop function couldn't be in different function, according to man page as follows. It is dangerous to use thread handler, I agree with you. Thanks. These functions may be implemented as macros. The application shall ensure that they appear as statements, and in pairs within the same lexical scope (that is, the pthread_cleanup_push() macro may be thought to expand to a token list whose first token is ’{’ with pthread_cleanup_pop() expanding to a token list whose last token is the corresponding ’}’ ). -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] doc: Correct docs for iface commands
duplicate documents for iface-name, lacks of document for iface-mac, inconsistent option names with virsh help strings. --- tools/virsh.pod | 16 ++-- 1 files changed, 10 insertions(+), 6 deletions(-) diff --git a/tools/virsh.pod b/tools/virsh.pod index 377fac0..4f218d3 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -1081,20 +1081,24 @@ Returns the list of active host interfaces. If I--all is specified this will also include defined but inactive interfaces. If I--inactive is specified only the inactive ones will be listed. -=item Biface-name Iiface-MAC +=item Biface-name Iinterface -Convert a host interface MAC to interface name, if the Iiface-MAC is -unique among the host's interfaces. +Convert a host interface MAC to interface name, if the MAC address is unique +among the host's interfaces. -=item Biface-name Iiface-MAC +Iinterface specifies the interface MAC address. + +=item Biface-mac Iinterface Convert a host interface name to MAC address. -=item Biface-start Iiface +Iinterface specifies the interface name. + +=item Biface-start Iinterface Start a (previously defined) host interface, such as by running if-up. -=item Biface-undefine Iiface +=item Biface-undefine Iinterface Undefine the configuration for an inactive host interface. -- 1.7.6 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 1/7] bandwidth: Define schema and create documentation
On Tue, Jul 12, 2011 at 13:57:07 +0200, Michal Privoznik wrote: Define new 'bandwidth' element with possible child element 'inbound' and 'outbound' addressing incoming and outgoing traffic respectively: bandwidth inbound average='1mbit' peak='2mbit' burst='5m'/ outbound average='0.5mbit'/ /bandwidth Leaving any element out means not to shape traffic in that direction. Accepted units are the same as 'tc' accepts. tc is an internal implementation which we shouldn't expose externally. Instead of allowing units, we can just select one fixed unit and have plain numbers in the attributes. Kilobytes (per second) seems like a good unit for me since it's small enough that I doubt anyone would need less than that and we use kilobytes in other places as well (memory size, e.g.). This element can be inserted into domain's 'interface' and 'network'. --- docs/formatdomain.html.in | 32 docs/formatnetwork.html.in | 30 ++ docs/schemas/domain.rng| 50 docs/schemas/network.rng | 50 4 files changed, 162 insertions(+), 0 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 269fc30..766b2bc 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -1825,6 +1825,38 @@ qemu-kvm -net nic,model=? /dev/null span class=sinceSince 0.8.8/span /p +h5a name=elementQoSQuality of service/a/h5 + +pre + ... + lt;devicesgt; +lt;interface type='network'gt; + lt;source network='default'/gt; + lt;target dev='vnet0'/gt; + blt;bandwidthgt; +lt;inbound average='1mbit' peak='0.5mbps' burst='1m'/gt; +lt;outbound average='128kbit' peak='256kbit' burst='256kbit'/gt; + lt;/bandwidthgt;/b +lt;/interfacegt; + lt;devicesgt; + .../pre + + p +This part of interface XML provides setting quality of service. Incoming +and outgoing traffic can be shaped independently. The +codebandwidth/code element can have at most one codeinbound/code +and at most one codeoutbound/code child elements. Leaving any of these +children element out result in no QoS applied on that traffic direction. +So, when you want to shape only domain's incoming traffic, use +codeinbound/code only, and vice versa. Each of these elements have one +mandatory attribute codeaverage/code. It specifies average bit rate on +interface being shaped. Then there are two optional attributes: +codepeak/code, which specifies maximum rate at which interface can send +data, and codeburst/code, amount of bytes that can be burst at +codepeak/code speed. Accepted values for attributes are decimal +numbers, optionally followed by unit. See codeman tc/code for more +details. + /p We shouldn't mention tc in XML documentation. The semantics should be documented in a self-contained way. h4a name=elementsInputInput devices/a/h4 p diff --git a/docs/formatnetwork.html.in b/docs/formatnetwork.html.in index f9421c3..607e1aa 100644 --- a/docs/formatnetwork.html.in +++ b/docs/formatnetwork.html.in @@ -102,6 +102,36 @@ 0.4.2/span/dd /dl +h5a name=elementQoSQuality of service/a/h5 + +pre + ... + lt;forward mode='nat' dev='eth0'/gt; + blt;bandwidthgt; +lt;inbound average='1mbit' peak='0.5mbps' burst='1m'/gt; +lt;outbound average='128kbit' peak='256kbit' burst='256kbit'/gt; + lt;/bandwidthgt;/b + lt;mac address='00:16:3E:5D:C7:9E'/gt; + .../pre + + p +This part of network XML provides setting quality of service. Incoming +and outgoing traffic can be shaped independently. The +codebandwidth/code element can have at most one codeinbound/code +and at most one codeoutbound/code child elements. Leaving any of these +children element out result in no QoS applied on that traffic direction. +So, when you want to shape only network's incoming traffic, use +codeinbound/code only, and vice versa. Each of these elements have one +mandatory attribute codeaverage/code. It specifies average bit rate on +interface being shaped. Then there are two optional attributes: +codepeak/code, which specifies maximum rate at which bridge can send +data, and codeburst/code, amount of bytes that can be burst at +codepeak/code speed. Accepted values for attributes are decimal +numbers, optionally followed by unit. See codeman tc/code for +more details. The rate is shared equally within domains connected +to the network. + /p + The same applies here. ... diff --git a/docs/schemas/domain.rng b/docs/schemas/domain.rng index 8a4e3fe..e12f5b7 100644 --- a/docs/schemas/domain.rng +++ b/docs/schemas/domain.rng @@ -2349,6 +2352,43 @@ /element /define + define
Re: [libvirt] [PATCH v2 2/7] bandwidth: Declare internal structures
On Tue, Jul 12, 2011 at 13:57:08 +0200, Michal Privoznik wrote: --- src/util/network.h | 16 1 files changed, 16 insertions(+), 0 deletions(-) diff --git a/src/util/network.h b/src/util/network.h index ed0b78c..568bca1 100644 --- a/src/util/network.h +++ b/src/util/network.h @@ -45,6 +45,22 @@ typedef struct { typedef virSocketAddr *virSocketAddrPtr; +typedef struct { +/* Even if we let user to input rates + * in various units, we store them in bps */ Let's not allow users to use various units :-) +unsigned long average; +unsigned long peak; +unsigned long burst; +} virRate; + +typedef virRate *virRatePtr; + +typedef struct { +virRate in, out; +} virBandwidth; + +typedef virBandwidth *virBandwidthPtr; + int virSocketParseAddr(const char *val, virSocketAddrPtr addr, int hint); Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 3/7] bandwidth: Add format parsing functions
On Tue, Jul 12, 2011 at 13:57:09 +0200, Michal Privoznik wrote: These functions take on input decimal numbers optionally followed by unit. Units are exactly the same as 'tc' accepts. --- src/conf/domain_conf.c |3 + src/conf/domain_conf.h |1 + src/conf/network_conf.c |5 + src/conf/network_conf.h |1 + src/libvirt_private.syms |1 + src/util/network.c | 203 ++ src/util/network.h |2 + 7 files changed, 216 insertions(+), 0 deletions(-) This can be greatly simplified once we forbid using units in bandwidth attributes. ... diff --git a/src/util/network.c b/src/util/network.c index eb16e0c..476ecde 100644 --- a/src/util/network.c +++ b/src/util/network.c @@ -10,6 +10,7 @@ #include config.h #include arpa/inet.h +#include math.h BTW, why did you need to include math.h? Getting rid of it is another good reason for removing units. #include memory.h #include network.h @@ -21,6 +22,9 @@ ... +/** + * virBandwidthParseXML: The name here doesn't match the function this is supposed to document. + * @node: XML node + * @def: where to store the parsed result + * + * Parse bandwidth XML and store into given pointer + * + * Returns 0 on success, -1 on error. + */ +int +virBandwidthDefParseNode(xmlNodePtr node, virBandwidthPtr def) +{ +int ret = -1; +xmlNodePtr cur = node-children; +xmlNodePtr in = NULL, out = NULL; + +if (!node || !def || +!xmlStrEqual(node-name, BAD_CAST bandwidth)) +return -1; + +memset(def, 0, sizeof(virBandwidth)); Using sizeof(*def) is better. +while (cur) { +if (cur-type == XML_ELEMENT_NODE) { +if (xmlStrEqual(cur-name, BAD_CAST inbound)) { +if (in) { +virBandwidthError(VIR_ERR_XML_DETAIL, %s, + _(Only one child inbound +element allowed)); +goto cleanup; +} +in = cur; +} else if (xmlStrEqual(cur-name, BAD_CAST outbound)) { +if (out) { +virBandwidthError(VIR_ERR_XML_DETAIL, %s, + _(Only one child outbound +element allowed)); +goto cleanup; +} +out = cur; +} else { +virBandwidthError(VIR_ERR_CONFIG_UNSUPPORTED, + _(Unknown element %s), + cur-name); +goto cleanup; AFAIK we ignore unknown XML elements in other XML parsing code, shouldn't we do the same here? ... Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 4/7] bandwidth: Create format functions
On Tue, Jul 12, 2011 at 13:57:10 +0200, Michal Privoznik wrote: --- src/conf/domain_conf.c |3 + src/conf/network_conf.c |3 + src/libvirt_private.syms |3 + src/util/network.c | 153 ++ src/util/network.h |7 ++ 5 files changed, 169 insertions(+), 0 deletions(-) Removing units makes this simpler as well. Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH RFC v3 0/6] support cpu bandwidth in libvirt
On Mon, 2011-07-18 at 17:34 +0800, Wen Congyang wrote: TODO: 1. We create sub directory for each vcpu in cpu subsystem. So we should recalculate cpu.shares for each vcpu. Is the per vcpu cgroup optional? I.e., is is possible to set the period and quota for the entire domain and let the host scheduler deal with it? Caveat: Domain level CFS shares seems to work well--work well here means behaves as I expected ;-). I have no experience with the period/quota facility and libvirt domains, so maybe it doesn't make sense to cap cpu utilization at the domain level. Regards, Lee Changelog: v3: fix some small bugs implement the simple way v2: almost rewrite the patchset to support to control each vcpu's bandwidth. Limit quota to [-1, 2^64/1000] at the schemas level. We will check it at cgroup level. Wen Congyang (6): Introduce the function virCgroupForVcpu cgroup: Implement cpu.cfs_period_us and cpu.cfs_quota_us tuning API Update XML Schema for new entries qemu: Implement period and quota tunable XML configuration and parsing qemu: Implement cfs_period and cfs_quota's modification doc: Add documentation for new cputune elements period and quota docs/formatdomain.html.in | 19 ++ docs/schemas/domain.rng | 26 +++- src/conf/domain_conf.c | 20 ++- src/conf/domain_conf.h |2 + src/libvirt_private.syms|5 + src/qemu/qemu_cgroup.c | 127 +++ src/qemu/qemu_cgroup.h |4 + src/qemu/qemu_driver.c | 259 --- src/qemu/qemu_process.c |4 + src/util/cgroup.c | 153 +- src/util/cgroup.h | 11 + tests/qemuxml2argvdata/qemuxml2argv-cputune.xml |2 + 12 files changed, 596 insertions(+), 36 deletions(-) -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 5/7] bandwitdh: Implement functions to enable and disable QoS
On Tue, Jul 12, 2011 at 13:57:11 +0200, Michal Privoznik wrote: These function executes 'tc' with appropriate arguments to set desired QoS setting on interface or bridge during its creation. --- configure.ac|4 + src/libvirt_private.syms|2 + src/network/bridge_driver.c | 12 +++ src/qemu/qemu_command.c | 10 +++- src/util/macvtap.c | 12 +++- src/util/macvtap.h |5 +- src/util/network.c | 161 +++ src/util/network.h |3 + 8 files changed, 205 insertions(+), 4 deletions(-) I didn't spot any issues in this patch but I have a general question. Does QoS state need to be cleaned up in case virBandwidthEnable succeeds but something further in the process fails? That is do we need to call virBandwidthDisable in the cleanup part? Similarly, if is any cleanup needed when some parts of virBandwidthEnable succeeds but not all of them? Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 6/7] bandwidth: Add test cases for network
On Tue, Jul 12, 2011 at 13:57:12 +0200, Michal Privoznik wrote: --- tests/networkxml2xmlin/bandwidth-network.xml | 16 tests/networkxml2xmlout/bandwidth-network.xml | 16 tests/networkxml2xmltest.c|1 + 3 files changed, 33 insertions(+), 0 deletions(-) create mode 100644 tests/networkxml2xmlin/bandwidth-network.xml create mode 100644 tests/networkxml2xmlout/bandwidth-network.xml Looks good except for the units. Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 7/7] bandwidth: Add domain schema test suite
On Tue, Jul 12, 2011 at 13:57:13 +0200, Michal Privoznik wrote: --- tests/domainschemadata/domain-bandwidth.xml | 72 +++ 1 files changed, 72 insertions(+), 0 deletions(-) create mode 100644 tests/domainschemadata/domain-bandwidth.xml No units, please :-) Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/7] Add support for setting QoS
On Tue, Jul 12, 2011 at 13:57:06 +0200, Michal Privoznik wrote: This patch series add support for setting traffic shaping and policing on both domain's interface and network's virtual bridge. Basically, this is done via 'tc' from iproute2 package. For shaping is HTB used, for policing we need u32 match selector. Both should be available in RHEL-6 kernel. How this works: On an virtual interface which has limits defined a root qdisc are replaced. Ingress root for outbound traffic shaping and egress for inbound. Basically, in inbound traffic policing is applied, on outbound shaping. New qdiscs are set to limit the traffic to rate set in XML. For shaping it is possible to set the size of buffer. Accepted values for rate, peak and burst have same format as 'tc' command and are documented. Supported devices are VIR_DOMAIN_NET_TYPE_NETWORK, VIR_DOMAIN_NET_TYPE_BRIDGE and VIR_DOMAIN_NET_TYPE_DIRECT. One more thing. Would it be possible to also implement new APIs to manage network bandwidth on-the-fly? It's not for this patchset but is does sound like a good idea to have the possibility to, for example, restrict bandwidth of a running domain since more domains need to use the network or increase the bandwidth because host's owner got more money from domain's owner :-) Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] RFC New virDomainBlockPull API family to libvirt
On Fri, Jul 15, 2011 at 3:09 PM, Adam Litke a...@us.ibm.com wrote: On 07/15/2011 05:39 AM, Stefan Hajnoczi wrote: On Thu, Jul 14, 2011 at 7:47 PM, Adam Litke a...@us.ibm.com wrote: On 07/13/2011 08:04 PM, Daniel Veillard wrote: On Wed, Jul 13, 2011 at 03:46:30PM -0500, Adam Litke wrote: Unfortunately, after committing the blockPull API to libvirt, the qemu community decided to change the API somewhat and we needed to revert the libvirt implementation. Now that the qemu API is settling out again, I would like to propose an updated libvirt public API which has required only a minor set of changes to sync back up to the qemu API. Summary of changes: - Qemu dropped incremental streaming so remove libvirt incremental BlockPull() API - Rename virDomainBlockPullAll() to virDomainBlockPull() - Changes required to qemu monitor handlers for changed command names Okay. snip On the other hand I suspect that we are missing the mechanism to control the rate of the transfer in the new API, which is something which could be implemented in the old incremental mechanism, but not anymore. So I wonder if the virDomainBlockPull() shouldn't get an unsigned long bandwidth (to stay coherent with migration APIs) before the flags. I don't know if the support is ready in QEmu but I suspect that will be an important point, so if not present will be added :-) Hopefully Stefan can comment on any throttling mechanisms that might be added to the qemu implementation. It would be nice to have this feature built into the API to match the migration APIs, but if that is not feasible then we could always add it later. If we follow the live migration API then a block_stream_set_speed command would be used. libvirt would issue it as a separate command immediately after starting the streaming operation. There is the window of time after streaming has started but before block_stream_set_speed has been called where no throttling takes place, but I don't think this is a practical problem. It also opens the possibility of allowing the user to change the rate limit value while the block streaming operation is active. In light of our decision to provide a generic BlockJobAbort/BlockJobInfo interface, should SetSpeed be generic too? virDomainBlockJobSetSpeed() or virDomainBlockPullSetSpeed() ? The block_stream_set_speed semantics allow the command to be used when no streaming operation is active. This can be used to set the speed without a window of time after creation and before set_speed is called. If we switch to block_job_set_speed then it is cleaner to restrict the command to work on active jobs only. This is because there is only one speed variable kept but there might be multiple job types (streaming, compaction, etc) which would need different settings. What do you think about this? block_job_set_speed --- Set maximum speed for a background block operation. This is a per-block device command that can only be issued when there is an active block job. Throttling can be disabled by setting the speed to 0. Arguments: - device: device name (json-string) - value: maximum speed, in bytes per second (json-int) Errors: NotSupported: job type does not support speed setting Example: - { execute: block_job_set_speed, arguments: { device: virtio0, value: 1024 } } Stefan -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 17/8] save: support direct autostart in qemu.conf
When auto-starting a domain on libvirtd startup, let the user configure whether to have the VIR_DOMAIN_START_DIRECT flag effect. * src/qemu/qemu.conf (auto_start_direct): Document new variable. * src/qemu/libvirtd_qemu.aug (vnc_entry): Let augeas parse it. * src/qemu/qemu_conf.h (qemud_driver): Store new preference. * src/qemu/qemu_conf.c (qemudLoadDriverConfig): Parse it. * src/qemu/qemu_driver.c (qemuAutostartDomain): Honor it. --- Might be worth combining with 9/8. src/qemu/libvirtd_qemu.aug |1 + src/qemu/qemu.conf |8 src/qemu/qemu_conf.c |4 src/qemu/qemu_conf.h |2 ++ src/qemu/qemu_driver.c |4 ++-- 5 files changed, 17 insertions(+), 2 deletions(-) diff --git a/src/qemu/libvirtd_qemu.aug b/src/qemu/libvirtd_qemu.aug index dea6770..a78cd10 100644 --- a/src/qemu/libvirtd_qemu.aug +++ b/src/qemu/libvirtd_qemu.aug @@ -42,6 +42,7 @@ module Libvirtd_qemu = | str_entry dump_image_format | str_entry auto_dump_path | bool_entry auto_dump_direct + | bool_entry auto_start_direct | str_entry hugetlbfs_mount | bool_entry relaxed_acs_check | bool_entry vnc_allow_host_audio diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf index 2a0664d..48ae781 100644 --- a/src/qemu/qemu.conf +++ b/src/qemu/qemu.conf @@ -216,6 +216,14 @@ # # auto_dump_direct = 0 +# When a domain is configured to be auto-started, enabling this flag +# has the same effect as using the VIR_DOMAIN_START_DIRECT flag with the +# virDomainCreateWithFlags API. That is, the system uses O_DIRECT if +# possible, which puts less pressure on the file system cache but may +# cause slower operation. +# +# auto_start_direct = 0 + # If provided by the host and a hugetlbfs mount point is configured, # a guest may request huge page backing. When this mount point is # unspecified here, determination of a host mount point in /proc/mounts diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index cf6cb6b..144dbda 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -382,6 +382,10 @@ int qemudLoadDriverConfig(struct qemud_driver *driver, CHECK_TYPE (auto_dump_direct, VIR_CONF_LONG); if (p) driver-autoDumpDirect = true; +p = virConfGetValue (conf, auto_start_direct); +CHECK_TYPE (auto_start_direct, VIR_CONF_LONG); +if (p) driver-autoStartDirect = true; + p = virConfGetValue (conf, hugetlbfs_mount); CHECK_TYPE (hugetlbfs_mount, VIR_CONF_STRING); if (p p-str) { diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h index bc025af..afc3ef4 100644 --- a/src/qemu/qemu_conf.h +++ b/src/qemu/qemu_conf.h @@ -121,6 +121,8 @@ struct qemud_driver { char *autoDumpPath; bool autoDumpDirect; +bool autoStartDirect; + pciDeviceList *activePciHostdevs; virBitmapPtr reservedVNCPorts; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 27b971f..64fe3b9 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -150,11 +150,11 @@ qemuAutostartDomain(void *payload, const void *name ATTRIBUTE_UNUSED, void *opaq vm-def-name, err ? err-message : _(unknown error)); } else { -/* XXX need to wire direct autostart into qemu.conf */ if (vm-autostart !virDomainObjIsActive(vm) qemuDomainObjStart(data-conn, data-driver, vm, - false, false, false) 0) { + false, false, + data-driver-autoStartDirect) 0) { err = virGetLastError(); VIR_ERROR(_(Failed to autostart VM '%s': %s), vm-def-name, -- 1.7.4.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 18/8] save: support direct flag in libvirt-guests init script
* tools/libvirt-guests.init.sh (start): Use SAVE_DIRECT. --- Patch 10/8 has a bug fixed here - the code must use $direct rather than $direct so as to elide the argument rather than inject an empty string argument when the option is not in use. The more I look at this series, the more I like the idea of adding both save and restore direct support at the same time, so I'll post a v2 that rebases things accordingly. tools/libvirt-guests.init.sh |6 -- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/tools/libvirt-guests.init.sh b/tools/libvirt-guests.init.sh index 4e383c4..7e1dd51 100644 --- a/tools/libvirt-guests.init.sh +++ b/tools/libvirt-guests.init.sh @@ -144,6 +144,8 @@ start() { fi isfirst=true +direct= +test x$SAVE_DIRECT = x0 || direct=--direct while read uri list; do configured=false set -f @@ -173,7 +175,7 @@ start() { else sleep $START_DELAY fi -retval run_virsh $uri start $name /dev/null \ +retval run_virsh $uri start $direct $name /dev/null \ gettext done; echo fi fi @@ -194,7 +196,7 @@ suspend_guest() direct= test x$SAVE_DIRECT = x0 || direct=--direct printf %s $label -run_virsh $uri managedsave $direct $guest /dev/null +run_virsh $uri managedsave $direct $guest /dev/null virsh_pid=$! while true; do sleep 1 -- 1.7.4.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] virsh: avoid double free of domain
On 07/17/2011 07:17 PM, Wen Congyang wrote: At 07/17/2011 11:29 PM, a...@redhat.com Write: From: Alex Jiaa...@redhat.com * tools/virsh.c: avoid double free of domain, when weight value of blkiotune less than 0, codes will free domain and jump to cleanup section, however, cleanup will free domain again. } ACK Pushed. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] interface: Check for interface (in-)activity on some operations
On 07/15/2011 10:36 AM, Michal Privoznik wrote: On 15.07.2011 16:29, Eric Blake wrote: On 07/15/2011 07:58 AM, Michal Privoznik wrote: Right now it is possible to undefine an active interface, or destroy inactive. This patch add some checking to these operations to prevent this. Also fix test driver. I'm inclined to NACK this on design principles (I haven't read the patch itself, though). Given the discussion about domains and undefine, the ability to undefine an active interface is a feature, provided we support the concept of a transient interface like we do for transient domains. That is, we have the following transitions: nothing - transient/running via Create nothing - persistent/inactive via Define persistent/inactive - persistent/active via Start persistent/inactive - gone via Undefine persistent/running - persistent/inactive via Destroy persistent/running - transient/running via Undefine transient/running - gone via Destroy transient/running - persistent/running via Define and rejecting Undefine on a running interface would prevent the ability to transistion a persistent over to a transient interface. On the other hand, if we don't support transient interfaces, then the above analysis which works for domains would have to be adjusted for interfaces, so you may have something to patch after all. Well, although we have function interfaceCreate, it is actually (from semantic POV) interfaceStart, because it just start inactive but defined interface. So we do not support transient interfaces. Therefore transitions for interfaces are slightly different from transitions for domains. That's why I think we do need this patch. Since I was the original author of this file, I guess I'd better get into the conversation :-) The odd part of this is that I recall having a conversation about allowing/not allowing undefine of an interface that is active, and I *thought* that it didn't allow it (but obviously it does). A couple of points: 1) The fact that we don't support transient interfaces now doesn't preclude supporting them in the future (although we may use some method other than netcf to do it). 2) We should be careful changing this, in case it has any effects on virt-manager's use of the API. -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 8/8] virsh: Extend virsh dominfo to display if managed state exists
On 07/15/2011 03:06 AM, Osier Yang wrote: --- tools/virsh.c |8 1 files changed, 8 insertions(+), 0 deletions(-) Your rebuttals to my arguments about positioning are sound, so I'm okay with leaving the positioning of this output prior to Security strings. However, on re-reading it, I see one more issue: diff --git a/tools/virsh.c b/tools/virsh.c index 8a62612..120f3c8 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -2366,6 +2366,7 @@ cmdDominfo(vshControl *ctl, const vshCmd *cmd) int autostart; unsigned int id; char *str, uuid[VIR_UUID_STRING_BUFLEN]; +int has_managed_state = 0; if (!vshConnectionUsability(ctl, ctl-conn)) return false; @@ -2430,6 +2431,13 @@ cmdDominfo(vshControl *ctl, const vshCmd *cmd) autostart ? _(enable) : _(disable) ); } +has_managed_state = virDomainHasManagedSaveImage(dom, 0); +if (has_managed_state 0) +vshPrint(ctl, %-15s %s\n, _(Has managed state:), _(unknown)); Here, we print managed state, but the command that creates that state is called managedsave, and the API we call is virDomainHasManagedSaveImage. Not to mention that it surpasses 15 columns, which makes the output not lined up with all the other rows. Also, the Has is a bit redundant. I'd prefer to see this output as: Managed save: yes|no|unknown ACK with that nit fixed, and this can be pushed independently of the rest of the series. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] libvirt, virsh set migrate speed
HI, Eric, I download the libvirt-0.9.3.tar.gz, compile, and install it. Then when I tried the command virsh migrate-setspeed vm --bandwidth 1000. It gives me the error error: unknown procedure: 207 Do you have any solution to this? Thanks. - Hui On 7/15/11 2:57 PM, Eric Blake ebl...@redhat.com wrote: On 07/15/2011 12:34 PM, Hui Kang wrote: Hi, I checked the libvirt git tree. It seems that this patch has not been applied. I think to use virsh command line to control the migration bandwidth is useful. Thanks. - Hui API virDomainMigrateSetMaxSpeed was introduced since 0.9.0, but no command in virsh yet. --- tools/virsh.c | 46 ++ tools/virsh.pod |5 + 2 files changed, 51 insertions(+), 0 deletions(-) Thanks, but this was indeed applied as commit b73f1f8d5, and in the 0.9.3 release. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 8/8] virsh: Extend virsh dominfo to display if managed state exists
On 07/18/2011 11:34 AM, Eric Blake wrote: Here, we print managed state, but the command that creates that state is called managedsave, and the API we call is virDomainHasManagedSaveImage. Not to mention that it surpasses 15 columns, which makes the output not lined up with all the other rows. Also, the Has is a bit redundant. I'd prefer to see this output as: Managed save: yes|no|unknown ACK with that nit fixed, and this can be pushed independently of the rest of the series. Also, squash in some documentation: diff --git i/tools/virsh.pod w/tools/virsh.pod index 5afa1f2..312ddf0 100644 --- i/tools/virsh.pod +++ w/tools/virsh.pod @@ -514,6 +514,9 @@ Save and destroy (stop) a running domain, so it can be restarted from the same state at a later time. When the virsh Bstart command is next run for the domain, it will automatically be started from this saved state. +The Bdominfo command can be used to query whether a domain currently +has any managed save state. + =item Bmanagedsave-remove Idomain-id Remove the Bmanagedsave state file for a domain, if it exists. This -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] libvirt, virsh set migrate speed
On 07/18/2011 11:34 AM, Hui Kang wrote: HI, Eric, I download the libvirt-0.9.3.tar.gz, compile, and install it. Then when I tried the command virsh migrate-setspeed vm --bandwidth 1000. It gives me the error error: unknown procedure: 207 Do you have any solution to this? Thanks. That means that your libvirtd daemon is too old. Your client virsh knows a new API that is not yet supported by the server. The solution is to upgrade the running libvirtd to 0.9.3, so that it can honor the newer client's requests. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 4/5] In storageVolumeCreateXML, spawn a new thread for volbuilding, in storageVolumeDelete, generate the signal
On 07/18/2011 05:13 AM, Guannan Ren wrote: Independantly of this, IMHO pthread cancellation handlers are a recipe for trouble because it is incredibly hard to make sure you correctly cleanup all resources in the thread, even with use of cleanup handlers. IMHO, threads should be made to monitor some external quit boolean variable (eg see threadpool.c thread termination). Daniel yep but the push and pop function couldn't be in different function, according to man page as follows. The concept of pushing and popping handlers isn't the bottleneck here, rather, it is that _all_ functions in the callstack must reliably use the push/pop handlers correctly, before _any_ function can call pthread_cancel on the entire callstack. Right now, none of our code uses any push/pop handlers. Therefore, it becomes a much larger patch to properly insert push/pop handlers into the entire call chain than it is to just write the worker thread to periodically check if any other thread has requested an early quit. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] doc: Correct docs for iface commands
On 07/18/2011 05:45 AM, Osier Yang wrote: duplicate documents for iface-name, lacks of document for iface-mac, inconsistent option names with virsh help strings. --- tools/virsh.pod | 16 ++-- 1 files changed, 10 insertions(+), 6 deletions(-) diff --git a/tools/virsh.pod b/tools/virsh.pod index 377fac0..4f218d3 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -1081,20 +1081,24 @@ Returns the list of active host interfaces. If I--all is specified this will also include defined but inactive interfaces. If I--inactive is specified only the inactive ones will be listed. -=item Biface-name Iiface-MAC +=item Biface-name Iinterface We have other cases of man page names differing from virsh.c names. For example, 'virsh help domid' documents 'domain' along with later mentioning domain name or uuid under OPTIONS, but 'man virsh' merely documents 'domid domain-name-or-uuid' in its synopsis. I'm okay with fixing the synopsis to match virsh, but we should probably scrub things to fix all commands to use consistent names. At any rate, this is strict improvement, so: ACK. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCHv4] libvirt: do not mix internal flags into public API
On 07/15/2011 05:12 PM, Eric Blake wrote: There were two API in driver.c that were silently masking flags bits prior to calling out to the drivers, and several others that were explicitly masking flags bits. This is not forward-compatible - if we ever have that many flags in the future, then talking to an old server that masks out the flags would be indistinguishable from talking to a new server that can honor the flag. In general, libvirt.c should forward _all_ flags on to drivers, and only the drivers should reject unknown flags. In the case of virDrvSecretGetValue, the solution is to separate the internal driver callback function to have two parameters instead of one, with only one parameter affected by the public API. On second thought, I don't like this. It's nicer to have all the driver callbacks match the public API as much as possible, and to instead create a new entry point in libvirt_private.syms that qemu can use for the case where it needs to call the internal API instead of the public virSecretGetValue. v5 coming up. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 0/7] Add support for setting QoS
This patch series add support for setting traffic shaping and policing on both domain's interface and network's virtual bridge. Basically, this is done via 'tc' from iproute2 package. For shaping is HTB used, for policing we need u32 match selector. Both should be available in RHEL-6 kernel. How this works: On an virtual interface which has limits defined a root qdisc are replaced. Ingress root for outbound traffic shaping and egress for inbound. Basically, in inbound traffic policing is applied, on outbound shaping. New qdiscs are set to limit the traffic to rate set in XML. For shaping it is possible to set the size of buffer. Accepted values for rate, peak and burst are integer numbers. Units are kilobytes per second for rate or kilobytes for size. Supported devices are VIR_DOMAIN_NET_TYPE_NETWORK, VIR_DOMAIN_NET_TYPE_BRIDGE and VIR_DOMAIN_NET_TYPE_DIRECT. diff to v2: -Jirka's review included diff to v1: -rebase to current HEAD -add support for macvtap devices Michal Privoznik (7): bandwidth: Define schema and create documentation bandwidth: Declare internal structures bandwidth: Add parsing functions bandwidth: Create format functions bandwitdh: Implement functions to enable and disable QoS bandwidth: Add test cases for network bandwidth: Add domain schema test suite configure.ac |4 + docs/formatdomain.html.in | 33 +++ docs/formatnetwork.html.in| 30 +++ docs/schemas/bandwidth.rng| 52 docs/schemas/domain.rng |4 + docs/schemas/network.rng |4 + src/conf/domain_conf.c|6 + src/conf/domain_conf.h|1 + src/conf/network_conf.c |8 + src/conf/network_conf.h |1 + src/libvirt_private.syms |4 + src/network/bridge_driver.c | 12 + src/qemu/qemu_command.c | 10 +- src/util/macvtap.c| 12 +- src/util/macvtap.h|5 +- src/util/network.c| 345 + src/util/network.h| 23 ++ tests/domainschemadata/domain-bandwidth.xml | 72 + tests/networkxml2xmlin/bandwidth-network.xml | 16 ++ tests/networkxml2xmlout/bandwidth-network.xml | 16 ++ tests/networkxml2xmltest.c|1 + 21 files changed, 655 insertions(+), 4 deletions(-) create mode 100644 docs/schemas/bandwidth.rng create mode 100644 tests/domainschemadata/domain-bandwidth.xml create mode 100644 tests/networkxml2xmlin/bandwidth-network.xml create mode 100644 tests/networkxml2xmlout/bandwidth-network.xml -- 1.7.5.rc3 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 2/7] bandwidth: Declare internal structures
--- src/util/network.h | 14 ++ 1 files changed, 14 insertions(+), 0 deletions(-) diff --git a/src/util/network.h b/src/util/network.h index ed0b78c..af32558 100644 --- a/src/util/network.h +++ b/src/util/network.h @@ -45,6 +45,20 @@ typedef struct { typedef virSocketAddr *virSocketAddrPtr; +typedef struct { +unsigned long average; /* kbytes/s */ +unsigned long peak; /* kbytes/s */ +unsigned long burst;/* kbytes */ +} virRate; + +typedef virRate *virRatePtr; + +typedef struct { +virRate in, out; +} virBandwidth; + +typedef virBandwidth *virBandwidthPtr; + int virSocketParseAddr(const char *val, virSocketAddrPtr addr, int hint); -- 1.7.5.rc3 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 7/7] bandwidth: Add domain schema test suite
--- tests/domainschemadata/domain-bandwidth.xml | 72 +++ 1 files changed, 72 insertions(+), 0 deletions(-) create mode 100644 tests/domainschemadata/domain-bandwidth.xml diff --git a/tests/domainschemadata/domain-bandwidth.xml b/tests/domainschemadata/domain-bandwidth.xml new file mode 100644 index 000..852c97b --- /dev/null +++ b/tests/domainschemadata/domain-bandwidth.xml @@ -0,0 +1,72 @@ +domain type='kvm' + namef14-60/name + uuid38644c45-5227-a936-3b38-bc4f72c355bb/uuid + memory1048576/memory + currentMemory1048576/currentMemory + vcpu2/vcpu + os +type arch='x86_64' machine='pc-0.13'hvm/type +boot dev='cdrom'/ +boot dev='hd'/ +bootmenu enable='yes'/ + /os + features +acpi/ +apic/ +pae/ + /features + clock offset='utc'/ + on_poweroffdestroy/on_poweroff + on_rebootrestart/on_reboot + on_crashrestart/on_crash + devices +emulator/usr/bin/qemu-kvm/emulator +disk type='file' device='disk' + driver name='qemu' type='qcow2'/ + source file='/var/lib/libvirt/images/f14-6.img'/ + target dev='vda' bus='virtio'/ + address type='pci' domain='0x' bus='0x00' slot='0x04' function='0x0'/ +/disk +disk type='file' device='cdrom' + driver name='qemu' type='raw'/ + source file='/home/zippy/tmp/Fedora-14-x86_64-Live-KDE.iso'/ + target dev='hdc' bus='ide'/ + readonly/ + address type='drive' controller='0' bus='1' unit='0'/ +/disk +controller type='ide' index='0' + address type='pci' domain='0x' bus='0x00' slot='0x01' function='0x1'/ +/controller +controller type='virtio-serial' index='0' + address type='pci' domain='0x' bus='0x00' slot='0x06' function='0x0'/ +/controller +interface type='network' + mac address='52:54:00:24:a5:9f'/ + source network='default'/ + bandwidth +inbound average='1000' peak='4000' burst='1024'/ +outbound average='128' peak='256' burst='32768'/ + /bandwidth + address type='pci' domain='0x' bus='0x00' slot='0x07' function='0x0'/ +/interface +serial type='pty' + target port='0'/ +/serial +console type='pty' + target type='serial' port='0'/ +/console +input type='tablet' bus='usb'/ +input type='mouse' bus='ps2'/ +graphics type='vnc' port='-1' autoport='yes'/ +sound model='ac97' + address type='pci' domain='0x' bus='0x00' slot='0x03' function='0x0'/ +/sound +video + model type='vga' vram='9216' heads='1'/ + address type='pci' domain='0x' bus='0x00' slot='0x02' function='0x0'/ +/video +memballoon model='virtio' + address type='pci' domain='0x' bus='0x00' slot='0x05' function='0x0'/ +/memballoon + /devices +/domain -- 1.7.5.rc3 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 4/7] bandwidth: Create format functions
--- src/conf/domain_conf.c |3 ++ src/conf/network_conf.c |3 ++ src/libvirt_private.syms |1 + src/util/network.c | 72 ++ src/util/network.h |4 ++ 5 files changed, 83 insertions(+), 0 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 0d8c7e7..7bc6c1a 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -8839,6 +8839,9 @@ virDomainNetDefFormat(virBufferPtr buf, virBufferAddLit(buf, /tune\n); } +if (virBandwidrhDefFormat(buf, def-bandwidth, ) 0) +return -1; + if (virDomainDeviceInfoFormat(buf, def-info, flags) 0) return -1; diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index c9929e2..43145b1 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1093,6 +1093,9 @@ char *virNetworkDefFormat(const virNetworkDefPtr def) if (virNetworkDNSDefFormat(buf, def-dns) 0) goto error; +if (virBandwidrhDefFormat(buf, def-bandwidth, ) 0) +goto error; + for (ii = 0; ii def-nips; ii++) { if (virNetworkIpDefFormat(buf, def-ips[ii]) 0) goto error; diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 12db3d7..1cc9bca 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -700,6 +700,7 @@ nlComm; # network.h +virBandwidrhDefFormat; virBandwidthDefParseNode; virSocketAddrBroadcast; virSocketAddrBroadcastByPrefix; diff --git a/src/util/network.c b/src/util/network.c index ce949c7..58c0492 100644 --- a/src/util/network.c +++ b/src/util/network.c @@ -786,3 +786,75 @@ virBandwidthDefParseNode(xmlNodePtr node, virBandwidthPtr def) cleanup: return ret; } + +static int +virBandwidthChildDefFormat(virBufferPtr buf, + virRatePtr def, + const char *elem_name) +{ +if (!buf || !def || !elem_name) +return -1; + +if (def-average) { +virBufferAsprintf(buf, %s average='%lu', elem_name, def-average); + +if (def-peak) +virBufferAsprintf(buf, peak='%lu', def-peak); + +if (def-burst) +virBufferAsprintf(buf, burst='%lu', def-burst); +virBufferAddLit(buf, /\n); +} + +return 0; +} + +/** + * virBandwidrhDefFormat: + * @buf: Buffer to print to + * @def: Data source + * @indent: prepend all lines printed with this + * + * Formats bandwidth and prepend each line with @indent. + * Passing NULL to @indent is equivalent passing . + * + * Returns 0 on success, else -1. + */ +int +virBandwidrhDefFormat(virBufferPtr buf, + virBandwidthPtr def, + const char *indent) +{ +int ret = -1; + +if (!buf || !def) +goto cleanup; + +if (!indent) +indent = ; + +if (!def-in.average !def-out.average) { +ret = 0; +goto cleanup; +} + +virBufferAsprintf(buf, %sbandwidth\n, indent); +if (def-in.average) { +virBufferAsprintf(buf, %s , indent); +if (virBandwidthChildDefFormat(buf, def-in, inbound) 0) +goto cleanup; +} + +if (def-out.average) { +virBufferAsprintf(buf, %s , indent); +if (virBandwidthChildDefFormat(buf, def-out, outbound) 0) +goto cleanup; +} + +virBufferAsprintf(buf, %s/bandwidth\n, indent); + +ret = 0; + +cleanup: +return ret; +} diff --git a/src/util/network.h b/src/util/network.h index 54f7aad..98e3082 100644 --- a/src/util/network.h +++ b/src/util/network.h @@ -21,6 +21,7 @@ # include netdb.h # include netinet/in.h # include xml.h +# include buf.h typedef struct { union { @@ -106,4 +107,7 @@ int virSocketAddrPrefixToNetmask(unsigned int prefix, int family); int virBandwidthDefParseNode(xmlNodePtr node, virBandwidthPtr def); +int virBandwidrhDefFormat(virBufferPtr buf, + virBandwidthPtr def, + const char *indent); #endif /* __VIR_NETWORK_H__ */ -- 1.7.5.rc3 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 5/7] bandwitdh: Implement functions to enable and disable QoS
These function executes 'tc' with appropriate arguments to set desired QoS setting on interface or bridge during its creation. --- configure.ac|4 + src/libvirt_private.syms|2 + src/network/bridge_driver.c | 12 +++ src/qemu/qemu_command.c | 10 +++- src/util/macvtap.c | 12 +++- src/util/macvtap.h |5 +- src/util/network.c | 161 +++ src/util/network.h |3 + 8 files changed, 205 insertions(+), 4 deletions(-) diff --git a/configure.ac b/configure.ac index e9d5be4..2b864cf 100644 --- a/configure.ac +++ b/configure.ac @@ -165,6 +165,8 @@ AC_PATH_PROG([RADVD], [radvd], [radvd], [/sbin:/usr/sbin:/usr/local/sbin:$PATH]) AC_PATH_PROG([BRCTL], [brctl], [brctl], [/sbin:/usr/sbin:/usr/local/sbin:$PATH]) +AC_PATH_PROG([TC], [tc], [tc], +[/sbin:/usr/sbin:/usr/local/sbin:$PATH]) AC_PATH_PROG([UDEVADM], [udevadm], [], [/sbin:/usr/sbin:/usr/local/sbin:$PATH]) AC_PATH_PROG([UDEVSETTLE], [udevsettle], [], @@ -178,6 +180,8 @@ AC_DEFINE_UNQUOTED([RADVD],[$RADVD], [Location or name of the radvd program]) AC_DEFINE_UNQUOTED([BRCTL],[$BRCTL], [Location or name of the brctl program (see bridge-utils)]) +AC_DEFINE_UNQUOTED([TC],[$TC], +[Location or name of the tc profram (see iproute2)]) if test -n $UDEVADM; then AC_DEFINE_UNQUOTED([UDEVADM],[$UDEVADM], [Location or name of the udevadm program]) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 1cc9bca..ff398d7 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -702,6 +702,8 @@ nlComm; # network.h virBandwidrhDefFormat; virBandwidthDefParseNode; +virBandwidthDisable; +virBandwidthEnable; virSocketAddrBroadcast; virSocketAddrBroadcastByPrefix; virSocketAddrIsNetmask; diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 0a12bc0..47f9799 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -1804,6 +1804,13 @@ networkStartNetworkDaemon(struct network_driver *driver, if (v6present networkStartRadvd(network) 0) goto err4; +if (virBandwidthEnable(network-def-bandwidth, network-def-bridge) 0) { +networkReportError(VIR_ERR_INTERNAL_ERROR, + _(cannot set bandwidth limits on %s), + network-def-bridge); +goto err5; +} + /* Persist the live configuration now we have bridge info */ if (virNetworkSaveConfig(NETWORK_STATE_DIR, network-def) 0) { goto err5; @@ -1895,6 +1902,11 @@ static int networkShutdownNetworkDaemon(struct network_driver *driver, unlink(stateFile); VIR_FREE(stateFile); +if (virBandwidthDisable(network-def-bridge, true) 0) { +VIR_WARN(Failed to disable QoS on %s, + network-def-name); +} + if (network-radvdPid 0) { char *radvdpidbase; diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index a3bce4a..78cb865 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -128,7 +128,7 @@ qemuPhysIfaceConnect(virDomainDefPtr def, rc = openMacvtapTap(net-ifname, net-mac, net-data.direct.linkdev, net-data.direct.mode, vnet_hdr, def-uuid, net-data.direct.virtPortProfile, res_ifname, -vmop, driver-stateDir); +vmop, driver-stateDir, net-bandwidth); if (rc = 0) { virDomainAuditNetDevice(def, net, res_ifname, true); VIR_FREE(net-ifname); @@ -291,6 +291,14 @@ qemuNetworkIfaceConnect(virDomainDefPtr def, } } +if (virBandwidthEnable(net-bandwidth, net-ifname) 0) { +qemuReportError(VIR_ERR_INTERNAL_ERROR, +_(cannot set bandwidth limits on %s), +net-ifname); +VIR_FORCE_CLOSE(tapfd); +goto cleanup; +} + if (tapfd = 0) { if ((net-filter) (net-ifname)) { err = virDomainConfNWFilterInstantiate(conn, net); diff --git a/src/util/macvtap.c b/src/util/macvtap.c index 30343c8..bb7784c 100644 --- a/src/util/macvtap.c +++ b/src/util/macvtap.c @@ -266,7 +266,8 @@ openMacvtapTap(const char *tgifname, virVirtualPortProfileParamsPtr virtPortProfile, char **res_ifname, enum virVMOperationType vmOp, - char *stateDir) + char *stateDir, + virBandwidthPtr bandwidth) { const char *type = macvtap; int c, rc; @@ -360,6 +361,15 @@ create_name: } else goto disassociate_exit; +if (virBandwidthEnable(bandwidth, cr_ifname) 0) { +macvtapError(VIR_ERR_INTERNAL_ERROR, + _(cannot set bandwidth limits on %s), + cr_ifname); +rc = -1; +goto disassociate_exit; +} + + return rc; disassociate_exit: diff
[libvirt] [PATCH 3/7] bandwidth: Add parsing functions
These functions parse given XML node and store the output at given pointer. Unknown elements are silently ignored. Attributes must be integer and must fit in unsigned long. --- src/conf/domain_conf.c |3 + src/conf/domain_conf.h |1 + src/conf/network_conf.c |5 ++ src/conf/network_conf.h |1 + src/libvirt_private.syms |1 + src/util/network.c | 112 ++ src/util/network.h |2 + 7 files changed, 125 insertions(+), 0 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 8c3e44e..0d8c7e7 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2854,6 +2854,9 @@ virDomainNetDefParseXML(virCapsPtr caps, if (virDomainDeviceBootParseXML(cur, def-bootIndex, bootMap)) goto error; +} else if (xmlStrEqual(cur-name, BAD_CAST bandwidth)) { +if (virBandwidthDefParseNode(cur, def-bandwidth) 0) +goto error; } } cur = cur-next; diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 172d3c2..06cea02 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -398,6 +398,7 @@ struct _virDomainNetDef { virDomainDeviceInfo info; char *filter; virNWFilterHashTablePtr filterparams; +virBandwidth bandwidth; }; enum virDomainChrDeviceType { diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index ae479bf..c9929e2 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -742,6 +742,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) char *tmp; xmlNodePtr *ipNodes = NULL; xmlNodePtr dnsNode = NULL; +xmlNodePtr bandwidthNode = NULL; int nIps; if (VIR_ALLOC(def) 0) { @@ -777,6 +778,10 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) /* Parse network domain information */ def-domain = virXPathString(string(./domain[1]/@name), ctxt); +if ((bandwidthNode = virXPathNode(./bandwidth, ctxt)) != NULL +virBandwidthDefParseNode(bandwidthNode, def-bandwidth) 0) +goto error; + /* Parse bridge information */ def-bridge = virXPathString(string(./bridge[1]/@name), ctxt); tmp = virXPathString(string(./bridge[1]/@stp), ctxt); diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index 5edcf27..565a464 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h @@ -127,6 +127,7 @@ struct _virNetworkDef { virNetworkIpDefPtr ips; /* ptr to array of IP addresses on this network */ virNetworkDNSDefPtr dns; /* ptr to dns related configuration */ +virBandwidth bandwidth; }; typedef struct _virNetworkObj virNetworkObj; diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 3e3b1dd..12db3d7 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -700,6 +700,7 @@ nlComm; # network.h +virBandwidthDefParseNode; virSocketAddrBroadcast; virSocketAddrBroadcastByPrefix; virSocketAddrIsNetmask; diff --git a/src/util/network.c b/src/util/network.c index eb16e0c..ce949c7 100644 --- a/src/util/network.c +++ b/src/util/network.c @@ -21,6 +21,9 @@ virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \ __FUNCTION__, __LINE__, __VA_ARGS__) +#define virBandwidthError(code, ...)\ +virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \ + __FUNCTION__, __LINE__, __VA_ARGS__) /* * Helpers to extract the IP arrays from the virSocketAddrPtr * That part is the less portable of the module @@ -674,3 +677,112 @@ virSocketAddrPrefixToNetmask(unsigned int prefix, error: return result; } + +static int +virBandwidthParseChildDefNode(xmlNodePtr node, virRatePtr rate) +{ +int ret = -1; +char *average = NULL; +char *peak = NULL; +char *burst = NULL; + +if (!node || !rate) +return -1; + +average = virXMLPropString(node, average); +peak = virXMLPropString(node, peak); +burst = virXMLPropString(node, burst); + +if (average) { +if (virStrToLong_ul(average, NULL, 10, rate-average) 0) { +virBandwidthError(VIR_ERR_CONFIG_UNSUPPORTED, + _(could not convert %s), + average); +goto cleanup; +} +} else { +virBandwidthError(VIR_ERR_XML_DETAIL, %s, + _(Missing mandatory average attribute)); +goto cleanup; +} + +if (peak virStrToLong_ul(peak, NULL, 10, rate-peak) 0) { +virBandwidthError(VIR_ERR_CONFIG_UNSUPPORTED, + _(could not convert %s), + peak); +goto cleanup; +} + +if (burst virStrToLong_ul(burst, NULL, 10, rate-burst) 0) { +virBandwidthError(VIR_ERR_CONFIG_UNSUPPORTED, +
[libvirt] [PATCH 6/7] bandwidth: Add test cases for network
--- tests/networkxml2xmlin/bandwidth-network.xml | 16 tests/networkxml2xmlout/bandwidth-network.xml | 16 tests/networkxml2xmltest.c|1 + 3 files changed, 33 insertions(+), 0 deletions(-) create mode 100644 tests/networkxml2xmlin/bandwidth-network.xml create mode 100644 tests/networkxml2xmlout/bandwidth-network.xml diff --git a/tests/networkxml2xmlin/bandwidth-network.xml b/tests/networkxml2xmlin/bandwidth-network.xml new file mode 100644 index 000..555ee18 --- /dev/null +++ b/tests/networkxml2xmlin/bandwidth-network.xml @@ -0,0 +1,16 @@ +network + nametest-net/name + uuid986fed9e-a488-186d-ef2d-17ebfd1993f8/uuid + forward mode='nat'/ + bridge name='virbr1' stp='on' delay='0' / + mac address='52:54:00:E6:A2:C9'/ + bandwidth +inbound average='1000' peak='2000' burst='1024'/ +outbound average='2000'/ + /bandwidth + ip address='192.168.120.1' netmask='255.255.255.0' +dhcp + range start='192.168.120.2' end='192.168.120.254' / +/dhcp + /ip +/network diff --git a/tests/networkxml2xmlout/bandwidth-network.xml b/tests/networkxml2xmlout/bandwidth-network.xml new file mode 100644 index 000..555ee18 --- /dev/null +++ b/tests/networkxml2xmlout/bandwidth-network.xml @@ -0,0 +1,16 @@ +network + nametest-net/name + uuid986fed9e-a488-186d-ef2d-17ebfd1993f8/uuid + forward mode='nat'/ + bridge name='virbr1' stp='on' delay='0' / + mac address='52:54:00:E6:A2:C9'/ + bandwidth +inbound average='1000' peak='2000' burst='1024'/ +outbound average='2000'/ + /bandwidth + ip address='192.168.120.1' netmask='255.255.255.0' +dhcp + range start='192.168.120.2' end='192.168.120.254' / +/dhcp + /ip +/network diff --git a/tests/networkxml2xmltest.c b/tests/networkxml2xmltest.c index 065166d..9dac7b4 100644 --- a/tests/networkxml2xmltest.c +++ b/tests/networkxml2xmltest.c @@ -88,6 +88,7 @@ mymain(void) DO_TEST(netboot-proxy-network); DO_TEST(nat-network-dns-txt-record); DO_TEST(nat-network-dns-hosts); +DO_TEST(bandwidth-network); return (ret==0 ? EXIT_SUCCESS : EXIT_FAILURE); } -- 1.7.5.rc3 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] RFC New virDomainBlockPull API family to libvirt
On 07/18/2011 09:35 AM, Stefan Hajnoczi wrote: On the other hand I suspect that we are missing the mechanism to control the rate of the transfer in the new API, which is something which could be implemented in the old incremental mechanism, but not anymore. So I wonder if the virDomainBlockPull() shouldn't get an unsigned long bandwidth (to stay coherent with migration APIs) before the flags. I don't know if the support is ready in QEmu but I suspect that will be an important point, so if not present will be added :-) Hopefully Stefan can comment on any throttling mechanisms that might be added to the qemu implementation. It would be nice to have this feature built into the API to match the migration APIs, but if that is not feasible then we could always add it later. If we follow the live migration API then a block_stream_set_speed command would be used. libvirt would issue it as a separate command immediately after starting the streaming operation. There is the window of time after streaming has started but before block_stream_set_speed has been called where no throttling takes place, but I don't think this is a practical problem. It also opens the possibility of allowing the user to change the rate limit value while the block streaming operation is active. In light of our decision to provide a generic BlockJobAbort/BlockJobInfo interface, should SetSpeed be generic too? virDomainBlockJobSetSpeed() or virDomainBlockPullSetSpeed() ? The block_stream_set_speed semantics allow the command to be used when no streaming operation is active. This can be used to set the speed without a window of time after creation and before set_speed is called. If we switch to block_job_set_speed then it is cleaner to restrict the command to work on active jobs only. This is because there is only one speed variable kept but there might be multiple job types (streaming, compaction, etc) which would need different settings. What do you think about this? I think the block_job_set_speed semantics seem reasonable to me. What is the method for querying the current speed setting? I would suggest extending the query-block-job command to include this information. In this case, I would extend virDomainBlockJobInfo to contain: /* The maximum allowable bandwidth for this job (MB/s) */ unsigned long bandwidth; block_job_set_speed --- Set maximum speed for a background block operation. This is a per-block device command that can only be issued when there is an active block job. Throttling can be disabled by setting the speed to 0. Arguments: - device: device name (json-string) - value: maximum speed, in bytes per second (json-int) Errors: NotSupported: job type does not support speed setting Example: - { execute: block_job_set_speed, arguments: { device: virtio0, value: 1024 } } Stefan -- Adam Litke IBM Linux Technology Center -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH RFC v3 3/6] Update XML Schema for new entries
On 07/18/2011 04:41 AM, Wen Congyang wrote: --- docs/schemas/domain.rng | 26 +- 1 files changed, 25 insertions(+), 1 deletions(-) diff --git a/docs/schemas/domain.rng b/docs/schemas/domain.rng index 8a4e3fe..5f8151d 100644 --- a/docs/schemas/domain.rng +++ b/docs/schemas/domain.rng @@ -388,6 +388,16 @@ ref name=cpushares/ /element /optional + optional +element name=period + ref name=cpuperiod/ +/element + /optional + optional +element name=quota + ref name=cpuquota/ +/element + /optional zeroOrMore element name=vcpupin attribute name=vcpu Yes, this is exactly the interface we are looking for. @@ -2401,7 +2411,21 @@ data type=unsignedInt param name=pattern[0-9]+/param /data - /define + /define + define name=cpuperiod +data type=unsignedLong + param name=pattern[0-9]+/param + param name=minInclusive1000/param + param name=maxInclusive100/param +/data + /define + define name=cpuquota +data type=long + param name=pattern-?[0-9]+/param + param name=maxInclusive18446744073709551/param + param name='minInclusive'-1/param +/data + /define define name=PortNumber data type=short param name=minInclusive-1/param -- Adam Litke IBM Linux Technology Center -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCHv4] libvirt: do not mix internal flags into public API
On 07/18/2011 01:27 PM, Eric Blake wrote: On 07/15/2011 05:12 PM, Eric Blake wrote: There were two API in driver.c that were silently masking flags bits prior to calling out to the drivers, and several others that were explicitly masking flags bits. This is not forward-compatible - if we ever have that many flags in the future, then talking to an old server that masks out the flags would be indistinguishable from talking to a new server that can honor the flag. In general, libvirt.c should forward _all_ flags on to drivers, and only the drivers should reject unknown flags. In the case of virDrvSecretGetValue, the solution is to separate the internal driver callback function to have two parameters instead of one, with only one parameter affected by the public API. On second thought, I don't like this. It's nicer to have all the driver callbacks match the public API as much as possible, and to instead create a new entry point in libvirt_private.syms that qemu can use for the case where it needs to call the internal API instead of the public virSecretGetValue. v5 coming up. Here's the interdiff for v5: diff --git i/src/driver.h w/src/driver.h index 9d0d3de..759150d 100644 --- i/src/driver.h +++ w/src/driver.h @@ -1258,6 +1258,10 @@ typedef int typedef unsigned char * (*virDrvSecretGetValue) (virSecretPtr secret, size_t *value_size, + unsigned int flags); +typedef unsigned char * +(*virDrvSecretGetValueInternal) (virSecretPtr secret, + size_t *value_size, unsigned int flags, unsigned int internalFlags); typedef int @@ -1295,6 +1299,7 @@ struct _virSecretDriver { virDrvSecretGetXMLDesc getXMLDesc; virDrvSecretSetValue setValue; virDrvSecretGetValue getValue; +virDrvSecretGetValueInternal getValueInternal; virDrvSecretUndefine undefine; }; diff --git i/src/libvirt.c w/src/libvirt.c index 39e2041..34acede 100644 --- i/src/libvirt.c +++ w/src/libvirt.c @@ -12680,7 +12680,7 @@ virSecretGetValue(virSecretPtr secret, size_t *value_size, unsigned int flags) if (conn-secretDriver != NULL conn-secretDriver-getValue != NULL) { unsigned char *ret; -ret = conn-secretDriver-getValue(secret, value_size, flags, 0); +ret = conn-secretDriver-getValue(secret, value_size, flags); if (ret == NULL) goto error; return ret; diff --git i/src/qemu/qemu_process.c w/src/qemu/qemu_process.c index 448b06e..cb7575f 100644 --- i/src/qemu/qemu_process.c +++ w/src/qemu/qemu_process.c @@ -257,7 +257,7 @@ qemuProcessGetVolumeQcowPassphrase(virConnectPtr conn, if (conn-secretDriver == NULL || conn-secretDriver-lookupByUUID == NULL || -conn-secretDriver-getValue == NULL) { +conn-secretDriver-getValueInternal == NULL) { qemuReportError(VIR_ERR_NO_SUPPORT, %s, _(secret storage not supported)); goto cleanup; @@ -276,8 +276,8 @@ qemuProcessGetVolumeQcowPassphrase(virConnectPtr conn, enc-secrets[0]-uuid); if (secret == NULL) goto cleanup; -data = conn-secretDriver-getValue(secret, size, 0, - VIR_SECRET_GET_VALUE_INTERNAL_CALL); +data = conn-secretDriver-getValueInternal(secret, size, 0, + VIR_SECRET_GET_VALUE_INTERNAL_CALL); virUnrefSecret(secret); if (data == NULL) goto cleanup; diff --git i/src/remote/remote_driver.c w/src/remote/remote_driver.c index c2f8bbd..d3b5df9 100644 --- i/src/remote/remote_driver.c +++ w/src/remote/remote_driver.c @@ -3174,7 +3174,7 @@ remoteSecretClose (virConnectPtr conn) static unsigned char * remoteSecretGetValue (virSecretPtr secret, size_t *value_size, - unsigned int flags, unsigned int internalFlags) + unsigned int flags) { unsigned char *rv = NULL; remote_secret_get_value_args args; @@ -3183,12 +3183,6 @@ remoteSecretGetValue (virSecretPtr secret, size_t *value_size, remoteDriverLock (priv); -/* internalFlags intentionally do not go over the wire */ -if (internalFlags) { -remoteError(VIR_ERR_NO_SUPPORT, %s, _(no internalFlags support)); -goto done; -} - make_nonnull_secret (args.secret, secret); args.flags = flags; diff --git i/src/secret/secret_driver.c w/src/secret/secret_driver.c index 02cdbb9..87e2b83 100644 --- i/src/secret/secret_driver.c +++ w/src/secret/secret_driver.c @@ -873,8 +873,9 @@ cleanup: } static unsigned char * -secretGetValue(virSecretPtr obj, size_t *value_size, unsigned int flags, - unsigned int internalFlags) +secretGetValueInternal(virSecretPtr obj, size_t *value_size, + unsigned int flags, + unsigned
Re: [libvirt] [PATCHv4] libvirt: do not mix internal flags into public API
On 07/18/2011 02:39 PM, Eric Blake wrote: On second thought, I don't like this. It's nicer to have all the driver callbacks match the public API as much as possible, and to instead create a new entry point in libvirt_private.syms that qemu can use for the case where it needs to call the internal API instead of the public virSecretGetValue. v5 coming up. Here's the interdiff for v5: Except that this break docs/hvsupport.pl, since that file doesn't know that the new function is for internal use only. At this point, I'm leaning back towards just using v4 as is, with a single driver.h callback for secretGetValue, instead of trying to have two different driver callbacks, one public and one for internal only. If we change our minds later, then further touching up of secretGetValue can be an independent patch. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH RFC v3 5/6] qemu: Implement cfs_period and cfs_quota's modification
On 07/18/2011 04:42 AM, Wen Congyang wrote: @@ -5983,7 +6169,30 @@ out: goto cleanup; } -*nparams = 1; +if (*nparams 1) { +params[1].value.ul = period; +params[1].type = VIR_TYPED_PARAM_ULLONG; +if (virStrcpyStatic(params[1].field, cfs_period) == NULL) { +qemuReportError(VIR_ERR_INTERNAL_ERROR, +%s, +_(Field cfs_period too long for destination)); +goto cleanup; +} + +params[2].value.ul = quota; Possible buffer overflow if *nparams == 2 ... +params[2].type = VIR_TYPED_PARAM_LLONG; +if (virStrcpyStatic(params[2].field, cfs_quota) == NULL) { +qemuReportError(VIR_ERR_INTERNAL_ERROR, +%s, +_(Field cfs_quota too long for destination)); +goto cleanup; +} + +*nparams = 3; +} else { +*nparams = 1; +} + ret = 0; cleanup: -- Adam Litke IBM Linux Technology Center -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v3 4/6] qemu: Implement period and quota tunable XML configuration and parsing
On 07/18/2011 04:42 AM, Wen Congyang wrote: +int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm) +{ +virCgroupPtr cgroup = NULL; +virCgroupPtr cgroup_vcpu = NULL; +qemuDomainObjPrivatePtr priv = vm-privateData; +int rc; +unsigned int i; +unsigned long long period = vm-def-cputune.period; +long long quota = vm-def-cputune.quota; + +if (driver-cgroup == NULL) +return 0; /* Not supported, so claim success */ + +rc = virCgroupForDomain(driver-cgroup, vm-def-name, cgroup, 0); +if (rc != 0) { +virReportSystemError(-rc, + _(Unable to find cgroup for %s), + vm-def-name); +goto cleanup; +} + +if (priv-nvcpupids == 0 || priv-vcpupids[0] == vm-pid) { +/* If we does not know VCPU-PID mapping or all vcpu runs in the same + * thread, we can not control each vcpu. + */ +if (period || quota) { +if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { +if (qemuSetupCgroupVcpuBW(cgroup, period, quota) 0) +goto cleanup; +} +} +return 0; +} I found a problem above. In the case where we are controlling quota at the domain level cgroup we must multiply the user-specified quota by the number of vcpus in the domain in order to get the same performance as we would with per-vcpu cgroups. As written, the vm will be essentially capped at 1 vcpu worth of quota regardless of the number of vcpus. You will also have to apply this logic in reverse when reporting the scheduler statistics so that the quota number is a per-vcpu quantity. -- Adam Litke IBM Linux Technology Center -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] libvirt: clean up virSecretGetValue driver callbacks
A bit of refactoring so that the public APIs are easier to match to at least one driver callback, by making the internal flags go through a new driver callback. * docs/hvsupport.pl: Allow for internal-only callbacks. * src/driver.h (virDrvSecretGetValue): Revert previous change. (virDrvSecretGetValueInternal): New driver callback. (struct _virSecretDriver): Add new callback. * src/secret/secret_driver.c (secretGetValue): Split... (secretGetValueInternal): ...into new function. (secretDriver): Register internal getValue handler. * src/libvirt.c (virSecretGetValue): Update clients. * src/remote/remote_driver.c (remoteSecretGetValue): Likewise. * src/qemu/qemu_process.c (qemuProcessGetVolumeQcowPassphrase): Likewise. --- I'm not convinced this makes life any better, but I finally got things into shape to work if we want to go this route. Meanwhile, I'm pushing the v4 patch as-is, so this becomes an independent separate patch. docs/hvsupport.pl |4 +++- src/driver.h |5 + src/libvirt.c |2 +- src/qemu/qemu_process.c|6 +++--- src/remote/remote_driver.c |8 +--- src/secret/secret_driver.c | 12 ++-- 6 files changed, 23 insertions(+), 14 deletions(-) diff --git a/docs/hvsupport.pl b/docs/hvsupport.pl index b0d1f0f..e1049e6 100755 --- a/docs/hvsupport.pl +++ b/docs/hvsupport.pl @@ -161,6 +161,8 @@ while (defined($line = FILE)) { $api = virConnect$name; } elsif (exists $apis{virNode$name}) { $api = virNode$name; + } elsif ($name =~ /Internal$/) { + next; } else { die driver $name does not have a public API; } @@ -211,7 +213,7 @@ foreach my $src (@srcs) { my $meth = $2; my $vers = $3; - next if $api eq no || $api eq name; + next if $api eq no || $api eq name || $api =~ /Internal$/; die Method $meth in $src is missing version unless defined $vers; diff --git a/src/driver.h b/src/driver.h index 9d0d3de..759150d 100644 --- a/src/driver.h +++ b/src/driver.h @@ -1258,6 +1258,10 @@ typedef int typedef unsigned char * (*virDrvSecretGetValue) (virSecretPtr secret, size_t *value_size, + unsigned int flags); +typedef unsigned char * +(*virDrvSecretGetValueInternal) (virSecretPtr secret, + size_t *value_size, unsigned int flags, unsigned int internalFlags); typedef int @@ -1295,6 +1299,7 @@ struct _virSecretDriver { virDrvSecretGetXMLDesc getXMLDesc; virDrvSecretSetValue setValue; virDrvSecretGetValue getValue; +virDrvSecretGetValueInternal getValueInternal; virDrvSecretUndefine undefine; }; diff --git a/src/libvirt.c b/src/libvirt.c index 39e2041..34acede 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -12680,7 +12680,7 @@ virSecretGetValue(virSecretPtr secret, size_t *value_size, unsigned int flags) if (conn-secretDriver != NULL conn-secretDriver-getValue != NULL) { unsigned char *ret; -ret = conn-secretDriver-getValue(secret, value_size, flags, 0); +ret = conn-secretDriver-getValue(secret, value_size, flags); if (ret == NULL) goto error; return ret; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 448b06e..cb7575f 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -257,7 +257,7 @@ qemuProcessGetVolumeQcowPassphrase(virConnectPtr conn, if (conn-secretDriver == NULL || conn-secretDriver-lookupByUUID == NULL || -conn-secretDriver-getValue == NULL) { +conn-secretDriver-getValueInternal == NULL) { qemuReportError(VIR_ERR_NO_SUPPORT, %s, _(secret storage not supported)); goto cleanup; @@ -276,8 +276,8 @@ qemuProcessGetVolumeQcowPassphrase(virConnectPtr conn, enc-secrets[0]-uuid); if (secret == NULL) goto cleanup; -data = conn-secretDriver-getValue(secret, size, 0, -VIR_SECRET_GET_VALUE_INTERNAL_CALL); +data = conn-secretDriver-getValueInternal(secret, size, 0, + VIR_SECRET_GET_VALUE_INTERNAL_CALL); virUnrefSecret(secret); if (data == NULL) goto cleanup; diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index c2f8bbd..d3b5df9 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -3174,7 +3174,7 @@ remoteSecretClose (virConnectPtr conn) static unsigned char * remoteSecretGetValue (virSecretPtr secret, size_t *value_size, - unsigned int flags, unsigned int
Re: [libvirt] [libvirt-users] cannot perform tunnelled migration without using peer2peer flag
On Mon, Jul 18, 2011 at 13:11:42 -0600, Eric Blake wrote: On 07/18/2011 04:11 AM, Osier Yang wrote: 于 2011年07月18日 10:07, zhang xintao 写道: Dear All I try to migration a kvm guest os to another host failed server: ubuntu 11.04 server virsh:migrate --live --tunnelled vm1 qemu+ssh://192.168.10.3/system error:Requested operation is not valid:cannot perform tunnelled migration without using peer2peer flag The error tells you all, you need to use --p2p. That said, why can't virsh be smarter, and automatically request the right underlying flags without making the user also type --p2p? Any problems with this patch? No problem at all, I think it's a very good idea. From db318d41c2a70189c21303c824aa4a862815874a Mon Sep 17 00:00:00 2001 From: Eric Blake ebl...@redhat.com Date: Mon, 18 Jul 2011 13:10:29 -0600 Subject: [PATCH] virsh: make migrate --tunnelled imply --p2p We can make the virsh migrate UI friendlier by supplying the missing bit automatically instead of erroring out when requesting --tunnelled without --p2p. * tools/virsh.c (doMigrate): Make --p2p optional when using --tunnelled. * tools/virsh.pod (migrate): Tweak wording accordingly. --- tools/virsh.c |2 +- tools/virsh.pod |5 +++-- 2 files changed, 4 insertions(+), 3 deletions(-) So, here is my ACK Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCHv4] libvirt: do not mix internal flags into public API
On 07/18/2011 02:53 PM, Laine Stump wrote: On 07/15/2011 07:12 PM, Eric Blake wrote: There were two API in driver.c that were silently masking flags bits prior to calling out to the drivers, and several others that were explicitly masking flags bits. This is not forward-compatible - if we ever have that many flags in the future, then talking to an old server that masks out the flags would be indistinguishable from talking to a new server that can honor the flag. In general, libvirt.c should forward _all_ flags on to drivers, and only the drivers should reject unknown flags. +++ b/src/interface/netcf_driver.c @@ -344,6 +344,8 @@ static char *interfaceGetXMLDesc(virInterfacePtr ifinfo, virInterfaceDefPtr ifacedef = NULL; char *ret = NULL; + virCheckFlags(VIR_INTERFACE_XML_INACTIVE, NULL); + If I understand correctly, this is here to put the burden on the driver to check/reject flags, rather than letting the client do it. Correct. Older drivers didn't reject unknown flags, so there is a situation where a newer client talking to a pre-0.9.4 server will misbehave, but the only way to fix that would be to teach 0.9.4 client codes to reject any flags that did not exist prior to 0.9.4 before making the RPC. But from 0.9.4 onwards, all drivers will all reject unknown flags, so the RPC can blindly pass all flags through to let the remote driver decide whether or not a flag is known, instead of rejecting the flag prematurely. - ret = conn-secretDriver-getValue(secret, value_size, flags); + ret = conn-secretDriver-getValue(secret, value_size, flags, 0); So the getValue() callback has an extra internalFlags arg, but that is only non-0 if get Value is called from the driver side (e.g. from qemu), and never sent over the wire. A bit odd, but I can live with it (or you can add the v5 interdiff - either way is fine with me). I ended up pushing v4 as is, and saving the separation of driver functions for a separate patch, which I'm still not convinced whether to keep or drop. ACK, either with or without the v5 interdiff added. Having a separate callback seems cleaner, but may have other implications I'm not aware of. Now pushed. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [libvirt-users] cannot perform tunnelled migration without using peer2peer flag
On 07/18/2011 03:17 PM, Jiri Denemark wrote: That said, why can't virsh be smarter, and automatically request the right underlying flags without making the user also type --p2p? Any problems with this patch? No problem at all, I think it's a very good idea. From db318d41c2a70189c21303c824aa4a862815874a Mon Sep 17 00:00:00 2001 From: Eric Blakeebl...@redhat.com Date: Mon, 18 Jul 2011 13:10:29 -0600 Subject: [PATCH] virsh: make migrate --tunnelled imply --p2p We can make the virsh migrate UI friendlier by supplying the missing bit automatically instead of erroring out when requesting --tunnelled without --p2p. * tools/virsh.c (doMigrate): Make --p2p optional when using --tunnelled. * tools/virsh.pod (migrate): Tweak wording accordingly. --- tools/virsh.c |2 +- tools/virsh.pod |5 +++-- 2 files changed, 4 insertions(+), 3 deletions(-) So, here is my ACK Pushed. -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] virsh: make vcpucount use --current consistently
Rename the existing --current flag to the new name --active, while adding a new flag --current to expose the new VIR_DOMAIN_AFFECT_CURRENT flag of virDomainGetVcpusFlags. For backwards compability, the output does not change (even though the label current no longer matches the spelling of the option that would trigger that number in isolation), and we accept --current --live as an undocumented synonym for --active --live to avoid breaking any existing clients. * tools/virsh.c (cmdVcpucount): Add --active flag, and rearrange existing flag handling to expose VIR_DOMAIN_AFFECT_CURRENT support. * tools/virsh.pod (vcpucount): Document this. --- Incorporating my proposal from: https://www.redhat.com/archives/libvir-list/2011-July/msg01099.html tools/virsh.c | 77 +++--- tools/virsh.pod | 19 - 2 files changed, 67 insertions(+), 29 deletions(-) diff --git a/tools/virsh.c b/tools/virsh.c index 2eb218a..65fba0f 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -2738,9 +2738,11 @@ static const vshCmdInfo info_vcpucount[] = { static const vshCmdOptDef opts_vcpucount[] = { {domain, VSH_OT_DATA, VSH_OFLAG_REQ, N_(domain name, id or uuid)}, {maximum, VSH_OT_BOOL, 0, N_(get maximum cap on vcpus)}, -{current, VSH_OT_BOOL, 0, N_(get current vcpu usage)}, -{config, VSH_OT_BOOL, 0, N_(get value to be used on next boot)}, +{active, VSH_OT_BOOL, 0, N_(get number of currently active vcpus)}, {live, VSH_OT_BOOL, 0, N_(get value from running domain)}, +{config, VSH_OT_BOOL, 0, N_(get value to be used on next boot)}, +{current, VSH_OT_BOOL, 0, + N_(get value according to current domain state)}, {NULL, 0, 0, NULL} }; @@ -2750,31 +2752,50 @@ cmdVcpucount(vshControl *ctl, const vshCmd *cmd) virDomainPtr dom; bool ret = true; int maximum = vshCommandOptBool(cmd, maximum); -int current = vshCommandOptBool(cmd, current); +int active = vshCommandOptBool(cmd, active); int config = vshCommandOptBool(cmd, config); int live = vshCommandOptBool(cmd, live); -bool all = maximum + current + config + live == 0; +int current = vshCommandOptBool(cmd, current); +bool all = maximum + active + current + config + live == 0; int count; -if (maximum current) { -vshError(ctl, %s, - _(--maximum and --current cannot both be specified)); +/* We want one of each pair of mutually exclusive options; that + * is, use of flags requires exactly two options. We reject the + * use of more than 2 flags later on. */ +if (maximum + active + current + config + live == 1) { +if (maximum || active) { +vshError(ctl, + _(when using --%s, one of --config, --live, or --current + must be specified), + maximum ? maximum : active); +} else { +vshError(ctl, + _(when using --%s, either --maximum or --active must be + specified), + (current ? current : config ? config : live)); +} return false; } -if (config live) { + +/* Backwards compatibility: prior to 0.9.4, + * VIR_DOMAIN_AFFECT_CURRENT was unsupported, and --current meant + * the opposite of --maximum. Translate the old '--current + * --live' into the new '--active --live', while treating the new + * '--maximum --current' correctly rather than rejecting it as + * '--maximum --active'. */ +if (!maximum !active current) { +current = false; +active = true; +} + +if (maximum active) { vshError(ctl, %s, - _(--config and --live cannot both be specified)); + _(--maximum and --active cannot both be specified)); return false; } -/* We want one of each pair of mutually exclusive options; that - * is, use of flags requires exactly two options. */ -if (maximum + current + config + live == 1) { -vshError(ctl, - _(when using --%s, either --%s or --%s must be specified), - (maximum ? maximum : current ? current - : config ? config : live), - maximum + current ? config : maximum, - maximum + current ? live : current); +if (current + config + live 1) { +vshError(ctl, %s, + _(--config, --live, and --current are mutually exclusive)); return false; } @@ -2785,8 +2806,20 @@ cmdVcpucount(vshControl *ctl, const vshCmd *cmd) return false; /* In all cases, try the new API first; if it fails because we are - * talking to an older client, try a fallback API before giving - * up. */ + * talking to an older client, generally we try a fallback API + * before giving up. --current requires the new API, since we + * don't know whether the domain is running or
[libvirt] [PATCH 00/10] Rollback migration when libvirtd restarts
This is the rest of the original 19 patch series updated with some bugfixes and rebased on current master, which is also available at https://gitorious.org/~jirka/libvirt/jirka-staging/commits/migration-recovery I didn't manage to run this through the libvirt-tck migration test but I did some additional testing with the following combinations of libvirt (current contains this series): source destination client --- current current current current current 0.9.2-1.el6 current 0.9.2-2.fc15current 0.9.2-2.fc15current 0.9.2-1.el6 In all combinations I did - normal migration from src to dst - normal migration back - p2p migration from src to dst - p2p migration back To test failure recovery, I aborted the migration or libvirtd deamons at various places. All this was done with a single running domain without restarting it. Jiri Denemark (10): qemu: Implement migration job phases qemu: Migration job on destination daemon qemu: Migration job on source daemon qemu: Recover from interrupted migration qemu: Remove special case for virDomainGetBlockInfo qemu: Remove special case for virDomainBlockStats qemu: Remove special case for virDomainMigrateSetMaxSpeed qemu: Remove special case for virDomainMigrateSetMaxDowntime qemu: Remove special case for virDomainSuspend qemu: Remove special case for virDomainAbortJob include/libvirt/libvirt.h.in |3 + src/libvirt.c| 27 ++- src/libvirt_internal.h |6 + src/qemu/MIGRATION.txt | 55 src/qemu/qemu_domain.c | 18 +- src/qemu/qemu_domain.h | 31 +-- src/qemu/qemu_driver.c | 285 - src/qemu/qemu_migration.c| 574 -- src/qemu/qemu_migration.h| 40 +++- src/qemu/qemu_process.c | 121 +- 10 files changed, 750 insertions(+), 410 deletions(-) create mode 100644 src/qemu/MIGRATION.txt -- 1.7.6 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 01/10] qemu: Implement migration job phases
This patch introduces several helper methods to deal with jobs and phases during migration in a simpler manner. --- src/qemu/MIGRATION.txt| 55 +++ src/qemu/qemu_domain.c|5 ++ src/qemu/qemu_migration.c | 91 + src/qemu/qemu_migration.h | 37 ++ 4 files changed, 188 insertions(+), 0 deletions(-) create mode 100644 src/qemu/MIGRATION.txt diff --git a/src/qemu/MIGRATION.txt b/src/qemu/MIGRATION.txt new file mode 100644 index 000..6c32998 --- /dev/null +++ b/src/qemu/MIGRATION.txt @@ -0,0 +1,55 @@ +QEMU Migration Locking Rules + + +Migration is a complicated beast which may span across several APIs on both +source and destination side and we need to keep the domain we are migrating in +a consistent state during the whole process. + +To avoid anyone from changing the domain in the middle of migration we need to +keep MIGRATION_OUT job active during migration from Begin to Confirm on the +source side and MIGRATION_IN job has to be active from Prepare to Finish on +the destination side. + +For this purpose we introduce several helper methods to deal with locking +primitives (described in THREADS.txt) in the right way: + +* qemuMigrationJobStart + +* qemuMigrationJobContinue + +* qemuMigrationJobStartPhase + +* qemuMigrationJobSetPhase + +* qemuMigrationJobFinish + +The sequence of calling qemuMigrationJob* helper methods is as follows: + +- The first API of a migration protocol (Prepare or Perform/Begin depending on + migration type and version) has to start migration job and keep it active: + + qemuMigrationJobStart(driver, vm, QEMU_JOB_MIGRATION_{IN,OUT}); + qemuMigrationJobSetPhase(driver, vm, QEMU_MIGRATION_PHASE_*); + ...do work... + qemuMigrationJobContinue(vm); + +- All consequent phases except for the last one have to keep the job active: + + if (!qemuMigrationJobIsActive(vm, QEMU_JOB_MIGRATION_{IN,OUT})) + return; + qemuMigrationJobStartPhase(driver, vm, QEMU_MIGRATION_PHASE_*); + ...do work... + qemuMigrationJobContinue(vm); + +- The last migration phase finally finishes the migration job: + + if (!qemuMigrationJobIsActive(vm, QEMU_JOB_MIGRATION_{IN,OUT})) + return; + qemuMigrationJobStartPhase(driver, vm, QEMU_MIGRATION_PHASE_*); + ...do work... + qemuMigrationJobFinish(driver, vm); + +While migration job is running (i.e., after qemuMigrationJobStart* but before +qemuMigrationJob{Continue,Finish}), migration phase can be advanced using + + qemuMigrationJobSetPhase(driver, vm, QEMU_MIGRATION_PHASE_*); diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index f9755a4..fee562d 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -26,6 +26,7 @@ #include qemu_domain.h #include qemu_command.h #include qemu_capabilities.h +#include qemu_migration.h #include memory.h #include logging.h #include virterror_internal.h @@ -72,6 +73,8 @@ qemuDomainAsyncJobPhaseToString(enum qemuDomainAsyncJob job, switch (job) { case QEMU_ASYNC_JOB_MIGRATION_OUT: case QEMU_ASYNC_JOB_MIGRATION_IN: +return qemuMigrationJobPhaseTypeToString(phase); + case QEMU_ASYNC_JOB_SAVE: case QEMU_ASYNC_JOB_DUMP: case QEMU_ASYNC_JOB_NONE: @@ -92,6 +95,8 @@ qemuDomainAsyncJobPhaseFromString(enum qemuDomainAsyncJob job, switch (job) { case QEMU_ASYNC_JOB_MIGRATION_OUT: case QEMU_ASYNC_JOB_MIGRATION_IN: +return qemuMigrationJobPhaseTypeFromString(phase); + case QEMU_ASYNC_JOB_SAVE: case QEMU_ASYNC_JOB_DUMP: case QEMU_ASYNC_JOB_NONE: diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index dfa80e3..9659e8d 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -46,6 +46,19 @@ #define VIR_FROM_THIS VIR_FROM_QEMU +VIR_ENUM_IMPL(qemuMigrationJobPhase, QEMU_MIGRATION_PHASE_LAST, + none, + preform2, + begin3, + perform3, + perform3_done, + confirm3_cancelled, + confirm3, + prepare, + finish2, + finish3, +); + enum qemuMigrationCookieFlags { QEMU_MIGRATION_COOKIE_FLAG_GRAPHICS, QEMU_MIGRATION_COOKIE_FLAG_LOCKSTATE, @@ -2770,3 +2783,81 @@ cleanup: } return ret; } + +int +qemuMigrationJobStart(struct qemud_driver *driver, + virDomainObjPtr vm, + enum qemuDomainAsyncJob job) +{ +qemuDomainObjPrivatePtr priv = vm-privateData; + +if (qemuDomainObjBeginAsyncJobWithDriver(driver, vm, job) 0) +return -1; + +if (job == QEMU_ASYNC_JOB_MIGRATION_IN) +qemuDomainObjSetAsyncJobMask(vm, QEMU_JOB_NONE); +else +qemuDomainObjSetAsyncJobMask(vm, DEFAULT_JOB_MASK); + +priv-job.info.type = VIR_DOMAIN_JOB_UNBOUNDED; + +return 0; +} + +void
[libvirt] [PATCH 05/10] qemu: Remove special case for virDomainGetBlockInfo
Like other query commands, this can now be called directly during migration. --- src/qemu/qemu_domain.h|4 src/qemu/qemu_driver.c| 42 -- src/qemu/qemu_migration.c | 14 -- 3 files changed, 12 insertions(+), 48 deletions(-) diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 45fae55..37c9515 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -77,7 +77,6 @@ enum qemuDomainJobSignals { QEMU_JOB_SIGNAL_MIGRATE_DOWNTIME = 1 2, /* Request migration downtime change */ QEMU_JOB_SIGNAL_MIGRATE_SPEED = 1 3, /* Request migration speed change */ QEMU_JOB_SIGNAL_BLKSTAT = 1 4, /* Request blkstat during migration */ -QEMU_JOB_SIGNAL_BLKINFO = 1 5, /* Request blkinfo during migration */ }; struct qemuDomainJobSignalsData { @@ -86,9 +85,6 @@ struct qemuDomainJobSignalsData { char *statDevName; /* Device name used by blkstat calls */ virDomainBlockStatsPtr blockStat; /* Block statistics for QEMU_JOB_SIGNAL_BLKSTAT */ int *statRetCode; /* Return code for the blkstat calls */ -char *infoDevName; /* Device name used by blkinfo calls */ -virDomainBlockInfoPtr blockInfo; /* Block information for QEMU_JOB_SIGNAL_BLKINFO */ -int *infoRetCode; /* Return code for the blkinfo calls */ }; struct qemuDomainJobObj { diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index b378cb7..407f52f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6510,39 +6510,21 @@ static int qemuDomainGetBlockInfo(virDomainPtr dom, virDomainObjIsActive(vm)) { qemuDomainObjPrivatePtr priv = vm-privateData; -if ((priv-job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) -|| (priv-job.asyncJob == QEMU_ASYNC_JOB_SAVE)) { -virDomainObjRef(vm); -while (priv-job.signals QEMU_JOB_SIGNAL_BLKINFO) -ignore_value(virCondWait(priv-job.signalCond, vm-lock)); - -priv-job.signalsData.infoDevName = disk-info.alias; -priv-job.signalsData.blockInfo = info; -priv-job.signalsData.infoRetCode = ret; -priv-job.signals |= QEMU_JOB_SIGNAL_BLKINFO; - -while (priv-job.signals QEMU_JOB_SIGNAL_BLKINFO) -ignore_value(virCondWait(priv-job.signalCond, vm-lock)); +if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) 0) +goto cleanup; -if (virDomainObjUnref(vm) == 0) -vm = NULL; +if (virDomainObjIsActive(vm)) { +ignore_value(qemuDomainObjEnterMonitor(driver, vm)); +ret = qemuMonitorGetBlockExtent(priv-mon, +disk-info.alias, +info-allocation); +qemuDomainObjExitMonitor(driver, vm); } else { -if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) 0) -goto cleanup; - -if (virDomainObjIsActive(vm)) { -ignore_value(qemuDomainObjEnterMonitor(driver, vm)); -ret = qemuMonitorGetBlockExtent(priv-mon, -disk-info.alias, -info-allocation); -qemuDomainObjExitMonitor(driver, vm); -} else { -ret = 0; -} - -if (qemuDomainObjEndJob(driver, vm) == 0) -vm = NULL; +ret = 0; } + +if (qemuDomainObjEndJob(driver, vm) == 0) +vm = NULL; } else { ret = 0; } diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 3eeb67f..d90219c 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -819,20 +819,6 @@ qemuMigrationProcessJobSignals(struct qemud_driver *driver, if (ret 0) VIR_WARN(Unable to get block statistics); -} else if (priv-job.signals QEMU_JOB_SIGNAL_BLKINFO) { -ret = qemuDomainObjEnterMonitorWithDriver(driver, vm); -if (ret == 0) { -ret = qemuMonitorGetBlockExtent(priv-mon, - priv-job.signalsData.infoDevName, - priv-job.signalsData.blockInfo-allocation); -qemuDomainObjExitMonitorWithDriver(driver, vm); -} - -*priv-job.signalsData.infoRetCode = ret; -priv-job.signals ^= QEMU_JOB_SIGNAL_BLKINFO; - -if (ret 0) -VIR_WARN(Unable to get block information); } else { ret = 0; } -- 1.7.6 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 02/10] qemu: Migration job on destination daemon
Make MIGRATION_IN use the new helper methods. --- src/qemu/qemu_domain.c|2 +- src/qemu/qemu_domain.h|1 - src/qemu/qemu_migration.c | 97 ++-- 3 files changed, 41 insertions(+), 59 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index fee562d..d2f03dd 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -645,7 +645,7 @@ void qemuDomainSetNamespaceHooks(virCapsPtr caps) caps-ns.href = qemuDomainDefNamespaceHref; } -void +static void qemuDomainObjSaveJob(struct qemud_driver *driver, virDomainObjPtr obj) { if (!virDomainObjIsActive(obj)) { diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 2ba6007..45fae55 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -187,7 +187,6 @@ int qemuDomainObjEndAsyncJob(struct qemud_driver *driver, void qemuDomainObjEndNestedJob(struct qemud_driver *driver, virDomainObjPtr obj); -void qemuDomainObjSaveJob(struct qemud_driver *driver, virDomainObjPtr obj); void qemuDomainObjSetJobPhase(struct qemud_driver *driver, virDomainObjPtr obj, int phase); diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 9659e8d..6e7117b 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1130,9 +1130,9 @@ qemuMigrationPrepareAny(struct qemud_driver *driver, QEMU_MIGRATION_COOKIE_LOCKSTATE))) goto cleanup; -if (qemuDomainObjBeginAsyncJobWithDriver(driver, vm, - QEMU_ASYNC_JOB_MIGRATION_IN) 0) +if (qemuMigrationJobStart(driver, vm, QEMU_ASYNC_JOB_MIGRATION_IN) 0) goto cleanup; +qemuMigrationJobSetPhase(driver, vm, QEMU_MIGRATION_PHASE_PREPARE); /* Domain starts inactive, even if the domain XML had an id field. */ vm-def-id = -1; @@ -1190,28 +1190,19 @@ qemuMigrationPrepareAny(struct qemud_driver *driver, event = virDomainEventNewFromObj(vm, VIR_DOMAIN_EVENT_STARTED, VIR_DOMAIN_EVENT_STARTED_MIGRATED); -ret = 0; -endjob: -if (qemuDomainObjEndAsyncJob(driver, vm) == 0) { -vm = NULL; -} else if (!vm-persistent !virDomainObjIsActive(vm)) { -virDomainRemoveInactive(driver-domains, vm); +/* We keep the job active across API calls until the finish() call. + * This prevents any other APIs being invoked while incoming + * migration is taking place. + */ +if (qemuMigrationJobContinue(vm) == 0) { vm = NULL; +qemuReportError(VIR_ERR_OPERATION_FAILED, +%s, _(domain disappeared)); +goto cleanup; } -/* We set a fake job active which is held across - * API calls until the finish() call. This prevents - * any other APIs being invoked while incoming - * migration is taking place - */ -if (vm -virDomainObjIsActive(vm)) { -priv-job.asyncJob = QEMU_ASYNC_JOB_MIGRATION_IN; -qemuDomainObjSaveJob(driver, vm); -priv-job.info.type = VIR_DOMAIN_JOB_UNBOUNDED; -priv-job.start = now; -} +ret = 0; cleanup: virDomainDefFree(def); @@ -1223,6 +1214,15 @@ cleanup: qemuDomainEventQueue(driver, event); qemuMigrationCookieFree(mig); return ret; + +endjob: +if (qemuMigrationJobFinish(driver, vm) == 0) { +vm = NULL; +} else if (!vm-persistent) { +virDomainRemoveInactive(driver-domains, vm); +vm = NULL; +} +goto cleanup; } @@ -2397,27 +2397,23 @@ qemuMigrationFinish(struct qemud_driver *driver, virDomainPtr dom = NULL; virDomainEventPtr event = NULL; int newVM = 1; -qemuDomainObjPrivatePtr priv = NULL; qemuMigrationCookiePtr mig = NULL; +virErrorPtr orig_err = NULL; + VIR_DEBUG(driver=%p, dconn=%p, vm=%p, cookiein=%s, cookieinlen=%d, cookieout=%p, cookieoutlen=%p, flags=%lx, retcode=%d, driver, dconn, vm, NULLSTR(cookiein), cookieinlen, cookieout, cookieoutlen, flags, retcode); -virErrorPtr orig_err = NULL; -priv = vm-privateData; -if (priv-job.asyncJob != QEMU_ASYNC_JOB_MIGRATION_IN) { -qemuReportError(VIR_ERR_NO_DOMAIN, -_(domain '%s' is not processing incoming migration), vm-def-name); +if (!qemuMigrationJobIsActive(vm, QEMU_ASYNC_JOB_MIGRATION_IN)) goto cleanup; -} -qemuDomainObjDiscardAsyncJob(driver, vm); -if (!(mig = qemuMigrationEatCookie(driver, vm, cookiein, cookieinlen, 0))) -goto cleanup; +qemuMigrationJobStartPhase(driver, vm, + v3proto ? QEMU_MIGRATION_PHASE_FINISH3 + : QEMU_MIGRATION_PHASE_FINISH2); -if (qemuDomainObjBeginJobWithDriver(driver, vm,
[libvirt] [PATCH 06/10] qemu: Remove special case for virDomainBlockStats
Like other query commands, this can now be called directly during migration. --- src/qemu/qemu_domain.h|4 --- src/qemu/qemu_driver.c| 54 +++-- src/qemu/qemu_migration.c | 18 --- 3 files changed, 18 insertions(+), 58 deletions(-) diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 37c9515..a3acaf5 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -76,15 +76,11 @@ enum qemuDomainJobSignals { QEMU_JOB_SIGNAL_SUSPEND = 1 1, /* Request VM suspend to finish live migration offline */ QEMU_JOB_SIGNAL_MIGRATE_DOWNTIME = 1 2, /* Request migration downtime change */ QEMU_JOB_SIGNAL_MIGRATE_SPEED = 1 3, /* Request migration speed change */ -QEMU_JOB_SIGNAL_BLKSTAT = 1 4, /* Request blkstat during migration */ }; struct qemuDomainJobSignalsData { unsigned long long migrateDowntime; /* Data for QEMU_JOB_SIGNAL_MIGRATE_DOWNTIME */ unsigned long migrateBandwidth; /* Data for QEMU_JOB_SIGNAL_MIGRATE_SPEED */ -char *statDevName; /* Device name used by blkstat calls */ -virDomainBlockStatsPtr blockStat; /* Block statistics for QEMU_JOB_SIGNAL_BLKSTAT */ -int *statRetCode; /* Return code for the blkstat calls */ }; struct qemuDomainJobObj { diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 407f52f..f6cd433 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6056,46 +6056,28 @@ qemudDomainBlockStats (virDomainPtr dom, } priv = vm-privateData; -if ((priv-job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) -|| (priv-job.asyncJob == QEMU_ASYNC_JOB_SAVE)) { -virDomainObjRef(vm); -while (priv-job.signals QEMU_JOB_SIGNAL_BLKSTAT) -ignore_value(virCondWait(priv-job.signalCond, vm-lock)); - -priv-job.signalsData.statDevName = disk-info.alias; -priv-job.signalsData.blockStat = stats; -priv-job.signalsData.statRetCode = ret; -priv-job.signals |= QEMU_JOB_SIGNAL_BLKSTAT; - -while (priv-job.signals QEMU_JOB_SIGNAL_BLKSTAT) -ignore_value(virCondWait(priv-job.signalCond, vm-lock)); - -if (virDomainObjUnref(vm) == 0) -vm = NULL; -} else { -if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) 0) -goto cleanup; +if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) 0) +goto cleanup; -if (!virDomainObjIsActive(vm)) { -qemuReportError(VIR_ERR_OPERATION_INVALID, -%s, _(domain is not running)); -goto endjob; -} +if (!virDomainObjIsActive(vm)) { +qemuReportError(VIR_ERR_OPERATION_INVALID, +%s, _(domain is not running)); +goto endjob; +} -ignore_value(qemuDomainObjEnterMonitor(driver, vm)); -ret = qemuMonitorGetBlockStatsInfo(priv-mon, - disk-info.alias, - stats-rd_req, - stats-rd_bytes, - stats-wr_req, - stats-wr_bytes, - stats-errs); -qemuDomainObjExitMonitor(driver, vm); +ignore_value(qemuDomainObjEnterMonitor(driver, vm)); +ret = qemuMonitorGetBlockStatsInfo(priv-mon, + disk-info.alias, + stats-rd_req, + stats-rd_bytes, + stats-wr_req, + stats-wr_bytes, + stats-errs); +qemuDomainObjExitMonitor(driver, vm); endjob: -if (qemuDomainObjEndJob(driver, vm) == 0) -vm = NULL; -} +if (qemuDomainObjEndJob(driver, vm) == 0) +vm = NULL; cleanup: if (vm) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index d90219c..52262e2 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -801,24 +801,6 @@ qemuMigrationProcessJobSignals(struct qemud_driver *driver, } if (ret 0) VIR_WARN(Unable to set migration speed); -} else if (priv-job.signals QEMU_JOB_SIGNAL_BLKSTAT) { -ret = qemuDomainObjEnterMonitorWithDriver(driver, vm); -if (ret == 0) { -ret = qemuMonitorGetBlockStatsInfo(priv-mon, - priv-job.signalsData.statDevName, - priv-job.signalsData.blockStat-rd_req, - priv-job.signalsData.blockStat-rd_bytes, - priv-job.signalsData.blockStat-wr_req, - priv-job.signalsData.blockStat-wr_bytes, - priv-job.signalsData.blockStat-errs); -
[libvirt] [PATCH 04/10] qemu: Recover from interrupted migration
--- src/qemu/qemu_process.c | 110 ++- 1 files changed, 109 insertions(+), 1 deletions(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 448b06e..48bd435 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -37,6 +37,7 @@ #include qemu_hostdev.h #include qemu_hotplug.h #include qemu_bridge_filter.h +#include qemu_migration.h #if HAVE_NUMACTL # define NUMA_VERSION1_COMPATIBILITY 1 @@ -2236,6 +2237,111 @@ qemuProcessUpdateState(struct qemud_driver *driver, virDomainObjPtr vm) } static int +qemuProcessRecoverMigration(struct qemud_driver *driver, +virDomainObjPtr vm, +virConnectPtr conn, +enum qemuDomainAsyncJob job, +enum qemuMigrationJobPhase phase, +virDomainState state, +int reason) +{ +if (job == QEMU_ASYNC_JOB_MIGRATION_IN) { +switch (phase) { +case QEMU_MIGRATION_PHASE_NONE: +case QEMU_MIGRATION_PHASE_PERFORM2: +case QEMU_MIGRATION_PHASE_BEGIN3: +case QEMU_MIGRATION_PHASE_PERFORM3: +case QEMU_MIGRATION_PHASE_PERFORM3_DONE: +case QEMU_MIGRATION_PHASE_CONFIRM3_CANCELLED: +case QEMU_MIGRATION_PHASE_CONFIRM3: +case QEMU_MIGRATION_PHASE_LAST: +break; + +case QEMU_MIGRATION_PHASE_PREPARE: +VIR_DEBUG(Killing unfinished incoming migration for domain %s, + vm-def-name); +return -1; + +case QEMU_MIGRATION_PHASE_FINISH2: +/* source domain is already killed so let's just resume the domain + * and hope we are all set */ +VIR_DEBUG(Incoming migration finished, resuming domain %s, + vm-def-name); +if (qemuProcessStartCPUs(driver, vm, conn, + VIR_DOMAIN_RUNNING_UNPAUSED) 0) { +VIR_WARN(Could not resume domain %s, vm-def-name); +} +break; + +case QEMU_MIGRATION_PHASE_FINISH3: +/* migration finished, we started resuming the domain but didn't + * confirm success or failure yet; killing it seems safest */ +VIR_DEBUG(Killing migrated domain %s, vm-def-name); +return -1; +} +} else if (job == QEMU_ASYNC_JOB_MIGRATION_OUT) { +switch (phase) { +case QEMU_MIGRATION_PHASE_NONE: +case QEMU_MIGRATION_PHASE_PREPARE: +case QEMU_MIGRATION_PHASE_FINISH2: +case QEMU_MIGRATION_PHASE_FINISH3: +case QEMU_MIGRATION_PHASE_LAST: +break; + +case QEMU_MIGRATION_PHASE_BEGIN3: +/* nothing happen so far, just forget we were about to migrate the + * domain */ +break; + +case QEMU_MIGRATION_PHASE_PERFORM2: +case QEMU_MIGRATION_PHASE_PERFORM3: +/* migration is still in progress, let's cancel it and resume the + * domain */ +VIR_DEBUG(Canceling unfinished outgoing migration of domain %s, + vm-def-name); +/* TODO cancel possibly running migrate operation */ +/* resume the domain but only if it was paused as a result of + * migration */ +if (state == VIR_DOMAIN_PAUSED +(reason == VIR_DOMAIN_PAUSED_MIGRATION || + reason == VIR_DOMAIN_PAUSED_UNKNOWN)) { +if (qemuProcessStartCPUs(driver, vm, conn, + VIR_DOMAIN_RUNNING_UNPAUSED) 0) { +VIR_WARN(Could not resume domain %s, vm-def-name); +} +} +break; + +case QEMU_MIGRATION_PHASE_PERFORM3_DONE: +/* migration finished but we didn't have a chance to get the result + * of Finish3 step; third party needs to check what to do next + */ +break; + +case QEMU_MIGRATION_PHASE_CONFIRM3_CANCELLED: +/* Finish3 failed, we need to resume the domain */ +VIR_DEBUG(Resuming domain %s after failed migration, + vm-def-name); +if (state == VIR_DOMAIN_PAUSED +(reason == VIR_DOMAIN_PAUSED_MIGRATION || + reason == VIR_DOMAIN_PAUSED_UNKNOWN)) { +if (qemuProcessStartCPUs(driver, vm, conn, + VIR_DOMAIN_RUNNING_UNPAUSED) 0) { +VIR_WARN(Could not resume domain %s, vm-def-name); +} +} +break; + +case QEMU_MIGRATION_PHASE_CONFIRM3: +/* migration completed, we need to kill the domain here */ +return -1; +} +} + +return 0; +} + +static int qemuProcessRecoverJob(struct qemud_driver *driver, virDomainObjPtr vm,
[libvirt] [PATCH 07/10] qemu: Remove special case for virDomainMigrateSetMaxSpeed
Call qemu monitor command directly within a special job that is only allowed during outgoing migration. --- src/qemu/qemu_domain.c|1 + src/qemu/qemu_domain.h|3 +-- src/qemu/qemu_driver.c| 23 +++ src/qemu/qemu_migration.c | 21 + src/qemu/qemu_process.c |1 + 5 files changed, 23 insertions(+), 26 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index d2f03dd..7748592 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -52,6 +52,7 @@ VIR_ENUM_IMPL(qemuDomainJob, QEMU_JOB_LAST, destroy, suspend, modify, + migration operation, none, /* async job is never stored in job.active */ async nested, ); diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index a3acaf5..fa4e182 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -49,6 +49,7 @@ enum qemuDomainJob { QEMU_JOB_DESTROY, /* Destroys the domain (cannot be masked out) */ QEMU_JOB_SUSPEND, /* Suspends (stops vCPUs) the domain */ QEMU_JOB_MODIFY,/* May change state */ +QEMU_JOB_MIGRATION_OP, /* Operation influencing outgoing migration */ /* The following two items must always be the last items before JOB_LAST */ QEMU_JOB_ASYNC, /* Asynchronous job */ @@ -75,12 +76,10 @@ enum qemuDomainJobSignals { QEMU_JOB_SIGNAL_CANCEL = 1 0, /* Request job cancellation */ QEMU_JOB_SIGNAL_SUSPEND = 1 1, /* Request VM suspend to finish live migration offline */ QEMU_JOB_SIGNAL_MIGRATE_DOWNTIME = 1 2, /* Request migration downtime change */ -QEMU_JOB_SIGNAL_MIGRATE_SPEED = 1 3, /* Request migration speed change */ }; struct qemuDomainJobSignalsData { unsigned long long migrateDowntime; /* Data for QEMU_JOB_SIGNAL_MIGRATE_DOWNTIME */ -unsigned long migrateBandwidth; /* Data for QEMU_JOB_SIGNAL_MIGRATE_SPEED */ }; struct qemuDomainJobObj { diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index f6cd433..f0c6489 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -7436,19 +7436,23 @@ qemuDomainMigrateSetMaxSpeed(virDomainPtr dom, qemuDriverLock(driver); vm = virDomainFindByUUID(driver-domains, dom-uuid); +qemuDriverUnlock(driver); if (!vm) { char uuidstr[VIR_UUID_STRING_BUFLEN]; virUUIDFormat(dom-uuid, uuidstr); qemuReportError(VIR_ERR_NO_DOMAIN, _(no domain with matching uuid '%s'), uuidstr); -goto cleanup; +return -1; } +if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MIGRATION_OP) 0) +goto cleanup; + if (!virDomainObjIsActive(vm)) { qemuReportError(VIR_ERR_OPERATION_INVALID, %s, _(domain is not running)); -goto cleanup; +goto endjob; } priv = vm-privateData; @@ -7456,18 +7460,21 @@ qemuDomainMigrateSetMaxSpeed(virDomainPtr dom, if (priv-job.asyncJob != QEMU_ASYNC_JOB_MIGRATION_OUT) { qemuReportError(VIR_ERR_OPERATION_INVALID, %s, _(domain is not being migrated)); -goto cleanup; +goto endjob; } -VIR_DEBUG(Requesting migration speed change to %luMbs, bandwidth); -priv-job.signalsData.migrateBandwidth = bandwidth; -priv-job.signals |= QEMU_JOB_SIGNAL_MIGRATE_SPEED; -ret = 0; +VIR_DEBUG(Setting migration bandwidth to %luMbs, bandwidth); +ignore_value(qemuDomainObjEnterMonitor(driver, vm)); +ret = qemuMonitorSetMigrationSpeed(priv-mon, bandwidth); +qemuDomainObjExitMonitor(driver, vm); + +endjob: +if (qemuDomainObjEndJob(driver, vm) == 0) +vm = NULL; cleanup: if (vm) virDomainObjUnlock(vm); -qemuDriverUnlock(driver); return ret; } diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 52262e2..7c5583b 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -788,19 +788,6 @@ qemuMigrationProcessJobSignals(struct qemud_driver *driver, } if (ret 0) VIR_WARN(Unable to set migration downtime); -} else if (priv-job.signals QEMU_JOB_SIGNAL_MIGRATE_SPEED) { -unsigned long bandwidth = priv-job.signalsData.migrateBandwidth; - -priv-job.signals ^= QEMU_JOB_SIGNAL_MIGRATE_SPEED; -priv-job.signalsData.migrateBandwidth = 0; -VIR_DEBUG(Setting migration bandwidth to %luMbs, bandwidth); -ret = qemuDomainObjEnterMonitorWithDriver(driver, vm); -if (ret == 0) { -ret = qemuMonitorSetMigrationSpeed(priv-mon, bandwidth); -qemuDomainObjExitMonitorWithDriver(driver, vm); -} -if (ret 0) -VIR_WARN(Unable to set migration speed); } else { ret = 0; } @@ -2865,10 +2852,12 @@ qemuMigrationJobStart(struct qemud_driver *driver, if
[libvirt] [PATCH 09/10] qemu: Remove special case for virDomainSuspend
--- src/qemu/qemu_domain.h|1 - src/qemu/qemu_driver.c| 46 ++-- src/qemu/qemu_migration.c |6 + 3 files changed, 24 insertions(+), 29 deletions(-) diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 1593257..503b9ad 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -74,7 +74,6 @@ enum qemuDomainAsyncJob { enum qemuDomainJobSignals { QEMU_JOB_SIGNAL_CANCEL = 1 0, /* Request job cancellation */ -QEMU_JOB_SIGNAL_SUSPEND = 1 1, /* Request VM suspend to finish live migration offline */ }; struct qemuDomainJobObj { diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 99fab1a..6b8cbc9 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1327,6 +1327,8 @@ static int qemudDomainSuspend(virDomainPtr dom) { int ret = -1; virDomainEventPtr event = NULL; qemuDomainObjPrivatePtr priv; +virDomainPausedReason reason; +int eventDetail; qemuDriverLock(driver); vm = virDomainFindByUUID(driver-domains, dom-uuid); @@ -1353,34 +1355,32 @@ static int qemudDomainSuspend(virDomainPtr dom) { priv = vm-privateData; if (priv-job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) { -if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_PAUSED) { -VIR_DEBUG(Requesting domain pause on %s, - vm-def-name); -priv-job.signals |= QEMU_JOB_SIGNAL_SUSPEND; -} -ret = 0; -goto cleanup; +reason = VIR_DOMAIN_PAUSED_MIGRATION; +eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED; } else { -if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_SUSPEND) 0) -goto cleanup; +reason = VIR_DOMAIN_PAUSED_USER; +eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_PAUSED; +} -if (!virDomainObjIsActive(vm)) { -qemuReportError(VIR_ERR_OPERATION_INVALID, -%s, _(domain is not running)); +if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_SUSPEND) 0) +goto cleanup; + +if (!virDomainObjIsActive(vm)) { +qemuReportError(VIR_ERR_OPERATION_INVALID, +%s, _(domain is not running)); +goto endjob; +} +if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_PAUSED) { +if (qemuProcessStopCPUs(driver, vm, reason) 0) { goto endjob; } -if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_PAUSED) { -if (qemuProcessStopCPUs(driver, vm, VIR_DOMAIN_PAUSED_USER) 0) { -goto endjob; -} -event = virDomainEventNewFromObj(vm, - VIR_DOMAIN_EVENT_SUSPENDED, - VIR_DOMAIN_EVENT_SUSPENDED_PAUSED); -} -if (virDomainSaveStatus(driver-caps, driver-stateDir, vm) 0) -goto endjob; -ret = 0; +event = virDomainEventNewFromObj(vm, + VIR_DOMAIN_EVENT_SUSPENDED, + eventDetail); } +if (virDomainSaveStatus(driver-caps, driver-stateDir, vm) 0) +goto endjob; +ret = 0; endjob: if (qemuDomainObjEndJob(driver, vm) == 0) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index dbf0412..bcd020f 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -770,11 +770,6 @@ qemuMigrationProcessJobSignals(struct qemud_driver *driver, if (ret 0) { VIR_WARN(Unable to cancel job); } -} else if (priv-job.signals QEMU_JOB_SIGNAL_SUSPEND) { -priv-job.signals ^= QEMU_JOB_SIGNAL_SUSPEND; -VIR_DEBUG(Pausing domain for non-live migration); -if (qemuMigrationSetOffline(driver, vm) 0) -VIR_WARN(Unable to pause domain); } else { ret = 0; } @@ -2843,6 +2838,7 @@ qemuMigrationJobStart(struct qemud_driver *driver, qemuDomainObjSetAsyncJobMask(vm, QEMU_JOB_NONE); } else { qemuDomainObjSetAsyncJobMask(vm, DEFAULT_JOB_MASK | + JOB_MASK(QEMU_JOB_SUSPEND) | JOB_MASK(QEMU_JOB_MIGRATION_OP)); } -- 1.7.6 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 08/10] qemu: Remove special case for virDomainMigrateSetMaxDowntime
Call qemu monitor command directly within a special job that is only allowed during outgoing migration. --- src/qemu/qemu_domain.c|1 - src/qemu/qemu_domain.h|6 -- src/qemu/qemu_driver.c| 23 +++ src/qemu/qemu_migration.c | 13 - 4 files changed, 15 insertions(+), 28 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 7748592..4c43e8b 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -186,7 +186,6 @@ qemuDomainObjResetAsyncJob(qemuDomainObjPrivatePtr priv) job-start = 0; memset(job-info, 0, sizeof(job-info)); job-signals = 0; -memset(job-signalsData, 0, sizeof(job-signalsData)); } void diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index fa4e182..1593257 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -75,11 +75,6 @@ enum qemuDomainAsyncJob { enum qemuDomainJobSignals { QEMU_JOB_SIGNAL_CANCEL = 1 0, /* Request job cancellation */ QEMU_JOB_SIGNAL_SUSPEND = 1 1, /* Request VM suspend to finish live migration offline */ -QEMU_JOB_SIGNAL_MIGRATE_DOWNTIME = 1 2, /* Request migration downtime change */ -}; - -struct qemuDomainJobSignalsData { -unsigned long long migrateDowntime; /* Data for QEMU_JOB_SIGNAL_MIGRATE_DOWNTIME */ }; struct qemuDomainJobObj { @@ -95,7 +90,6 @@ struct qemuDomainJobObj { virCond signalCond; /* Use to coordinate the safe queries during migration */ unsigned int signals; /* Signals for running job */ -struct qemuDomainJobSignalsData signalsData;/* Signal specific data */ }; typedef struct _qemuDomainPCIAddressSet qemuDomainPCIAddressSet; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index f0c6489..99fab1a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -7387,19 +7387,23 @@ qemuDomainMigrateSetMaxDowntime(virDomainPtr dom, qemuDriverLock(driver); vm = virDomainFindByUUID(driver-domains, dom-uuid); +qemuDriverUnlock(driver); if (!vm) { char uuidstr[VIR_UUID_STRING_BUFLEN]; virUUIDFormat(dom-uuid, uuidstr); qemuReportError(VIR_ERR_NO_DOMAIN, _(no domain with matching uuid '%s'), uuidstr); -goto cleanup; +return -1; } +if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MIGRATION_OP) 0) +goto cleanup; + if (!virDomainObjIsActive(vm)) { qemuReportError(VIR_ERR_OPERATION_INVALID, %s, _(domain is not running)); -goto cleanup; +goto endjob; } priv = vm-privateData; @@ -7407,18 +7411,21 @@ qemuDomainMigrateSetMaxDowntime(virDomainPtr dom, if (priv-job.asyncJob != QEMU_ASYNC_JOB_MIGRATION_OUT) { qemuReportError(VIR_ERR_OPERATION_INVALID, %s, _(domain is not being migrated)); -goto cleanup; +goto endjob; } -VIR_DEBUG(Requesting migration downtime change to %llums, downtime); -priv-job.signalsData.migrateDowntime = downtime; -priv-job.signals |= QEMU_JOB_SIGNAL_MIGRATE_DOWNTIME; -ret = 0; +VIR_DEBUG(Setting migration downtime to %llums, downtime); +ignore_value(qemuDomainObjEnterMonitor(driver, vm)); +ret = qemuMonitorSetMigrationDowntime(priv-mon, downtime); +qemuDomainObjExitMonitor(driver, vm); + +endjob: +if (qemuDomainObjEndJob(driver, vm) == 0) +vm = NULL; cleanup: if (vm) virDomainObjUnlock(vm); -qemuDriverUnlock(driver); return ret; } diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 7c5583b..dbf0412 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -775,19 +775,6 @@ qemuMigrationProcessJobSignals(struct qemud_driver *driver, VIR_DEBUG(Pausing domain for non-live migration); if (qemuMigrationSetOffline(driver, vm) 0) VIR_WARN(Unable to pause domain); -} else if (priv-job.signals QEMU_JOB_SIGNAL_MIGRATE_DOWNTIME) { -unsigned long long ms = priv-job.signalsData.migrateDowntime; - -priv-job.signals ^= QEMU_JOB_SIGNAL_MIGRATE_DOWNTIME; -priv-job.signalsData.migrateDowntime = 0; -VIR_DEBUG(Setting migration downtime to %llums, ms); -ret = qemuDomainObjEnterMonitorWithDriver(driver, vm); -if (ret == 0) { -ret = qemuMonitorSetMigrationDowntime(priv-mon, ms); -qemuDomainObjExitMonitorWithDriver(driver, vm); -} -if (ret 0) -VIR_WARN(Unable to set migration downtime); } else { ret = 0; } -- 1.7.6 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 10/10] qemu: Remove special case for virDomainAbortJob
This doesn't abort migration job in any phase, yet. --- src/qemu/qemu_domain.c|9 +--- src/qemu/qemu_domain.h| 12 +++--- src/qemu/qemu_driver.c| 36 ++--- src/qemu/qemu_migration.c | 48 - src/qemu/qemu_process.c | 12 +- 5 files changed, 39 insertions(+), 78 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 4c43e8b..0afa8db 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -52,6 +52,7 @@ VIR_ENUM_IMPL(qemuDomainJob, QEMU_JOB_LAST, destroy, suspend, modify, + abort, migration operation, none, /* async job is never stored in job.active */ async nested, @@ -158,12 +159,6 @@ qemuDomainObjInitJob(qemuDomainObjPrivatePtr priv) return -1; } -if (virCondInit(priv-job.signalCond) 0) { -ignore_value(virCondDestroy(priv-job.cond)); -ignore_value(virCondDestroy(priv-job.asyncCond)); -return -1; -} - return 0; } @@ -185,7 +180,6 @@ qemuDomainObjResetAsyncJob(qemuDomainObjPrivatePtr priv) job-mask = DEFAULT_JOB_MASK; job-start = 0; memset(job-info, 0, sizeof(job-info)); -job-signals = 0; } void @@ -208,7 +202,6 @@ qemuDomainObjFreeJob(qemuDomainObjPrivatePtr priv) { ignore_value(virCondDestroy(priv-job.cond)); ignore_value(virCondDestroy(priv-job.asyncCond)); -ignore_value(virCondDestroy(priv-job.signalCond)); } diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 503b9ad..679259f 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -38,7 +38,9 @@ # define JOB_MASK(job) (1 (job - 1)) # define DEFAULT_JOB_MASK \ -(JOB_MASK(QEMU_JOB_QUERY) | JOB_MASK(QEMU_JOB_DESTROY)) +(JOB_MASK(QEMU_JOB_QUERY) | \ + JOB_MASK(QEMU_JOB_DESTROY) | \ + JOB_MASK(QEMU_JOB_ABORT)) /* Only 1 job is allowed at any time * A job includes *all* monitor commands, even those just querying @@ -49,6 +51,7 @@ enum qemuDomainJob { QEMU_JOB_DESTROY, /* Destroys the domain (cannot be masked out) */ QEMU_JOB_SUSPEND, /* Suspends (stops vCPUs) the domain */ QEMU_JOB_MODIFY,/* May change state */ +QEMU_JOB_ABORT, /* Abort current async job */ QEMU_JOB_MIGRATION_OP, /* Operation influencing outgoing migration */ /* The following two items must always be the last items before JOB_LAST */ @@ -72,10 +75,6 @@ enum qemuDomainAsyncJob { QEMU_ASYNC_JOB_LAST }; -enum qemuDomainJobSignals { -QEMU_JOB_SIGNAL_CANCEL = 1 0, /* Request job cancellation */ -}; - struct qemuDomainJobObj { virCond cond; /* Use to coordinate jobs */ enum qemuDomainJob active; /* Currently running job */ @@ -86,9 +85,6 @@ struct qemuDomainJobObj { unsigned long long mask;/* Jobs allowed during async job */ unsigned long long start; /* When the async job started */ virDomainJobInfo info; /* Async job progress data */ - -virCond signalCond; /* Use to coordinate the safe queries during migration */ -unsigned int signals; /* Signals for running job */ }; typedef struct _qemuDomainPCIAddressSet qemuDomainPCIAddressSet; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 6b8cbc9..31748f3 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -7347,24 +7347,36 @@ static int qemuDomainAbortJob(virDomainPtr dom) { goto cleanup; } -priv = vm-privateData; +if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_ABORT) 0) +goto cleanup; if (virDomainObjIsActive(vm)) { -if (priv-job.asyncJob) { -VIR_DEBUG(Requesting cancellation of job on vm %s, vm-def-name); -priv-job.signals |= QEMU_JOB_SIGNAL_CANCEL; -} else { -qemuReportError(VIR_ERR_OPERATION_INVALID, -%s, _(no job is active on the domain)); -goto cleanup; -} -} else { qemuReportError(VIR_ERR_OPERATION_INVALID, %s, _(domain is not running)); -goto cleanup; +goto endjob; } -ret = 0; +priv = vm-privateData; + +if (!priv-job.asyncJob) { +qemuReportError(VIR_ERR_OPERATION_INVALID, +%s, _(no job is active on the domain)); +goto endjob; +} else if (priv-job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_IN) { +qemuReportError(VIR_ERR_OPERATION_INVALID, %s, +_(cannot abort incoming migration; + use virDomainDestroy instead)); +goto endjob; +} + +VIR_DEBUG(Cancelling job at client request); +ignore_value(qemuDomainObjEnterMonitor(driver, vm)); +ret =
Re: [libvirt] [PATCH v3 4/6] qemu: Implement period and quota tunable XML configuration and parsing
At 07/19/2011 04:35 AM, Adam Litke Write: This is looking good to me. I am pleased to see that the global case is handled as expected when per-vcpu threads are not active. On 07/18/2011 04:42 AM, Wen Congyang wrote: snip +int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm) +{ +virCgroupPtr cgroup = NULL; +virCgroupPtr cgroup_vcpu = NULL; +qemuDomainObjPrivatePtr priv = vm-privateData; +int rc; +unsigned int i; +unsigned long long period = vm-def-cputune.period; +long long quota = vm-def-cputune.quota; + +if (driver-cgroup == NULL) +return 0; /* Not supported, so claim success */ I just want to check: Is the above logic still valid? Failure to apply This logic is the same as the logic in the similar function qemuSetupCgroup(). a quota setting (0) could have fairly substantial implications for system performance. I am not convinced either way but I do think this point merits some discussion. + +rc = virCgroupForDomain(driver-cgroup, vm-def-name, cgroup, 0); +if (rc != 0) { +virReportSystemError(-rc, + _(Unable to find cgroup for %s), + vm-def-name); +goto cleanup; +} + +if (priv-nvcpupids == 0 || priv-vcpupids[0] == vm-pid) { +/* If we does not know VCPU-PID mapping or all vcpu runs in the same + * thread, we can not control each vcpu. + */ +if (period || quota) { +if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { +if (qemuSetupCgroupVcpuBW(cgroup, period, quota) 0) +goto cleanup; +} +} +return 0; +} + +for (i = 0; i priv-nvcpupids; i++) { +rc = virCgroupForVcpu(cgroup, i, cgroup_vcpu, 1); +if (rc 0) { +virReportSystemError(-rc, + _(Unable to create vcpu cgroup for %s(vcpu: +%d)), + vm-def-name, i); +goto cleanup; +} + +/* move the thread for vcpu to sub dir */ +rc = virCgroupAddTask(cgroup_vcpu, priv-vcpupids[i]); +if (rc 0) { +virReportSystemError(-rc, + _(unable to add vcpu %d task %d to cgroup), + i, priv-vcpupids[i]); +goto cleanup; +} + +if (period || quota) { +if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { +if (qemuSetupCgroupVcpuBW(cgroup_vcpu, period, quota) 0) +goto cleanup; +} +} + +virCgroupFree(cgroup_vcpu); +} + +virCgroupFree(cgroup_vcpu); +virCgroupFree(cgroup); +return 0; + +cleanup: +virCgroupFree(cgroup_vcpu); +if (cgroup) { +virCgroupRemove(cgroup); +virCgroupFree(cgroup); +} + +return -1; +} + int qemuRemoveCgroup(struct qemud_driver *driver, virDomainObjPtr vm, -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v3 4/6] qemu: Implement period and quota tunable XML configuration and parsing
At 07/19/2011 04:59 AM, Adam Litke Write: On 07/18/2011 04:42 AM, Wen Congyang wrote: +int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm) +{ +virCgroupPtr cgroup = NULL; +virCgroupPtr cgroup_vcpu = NULL; +qemuDomainObjPrivatePtr priv = vm-privateData; +int rc; +unsigned int i; +unsigned long long period = vm-def-cputune.period; +long long quota = vm-def-cputune.quota; + +if (driver-cgroup == NULL) +return 0; /* Not supported, so claim success */ + +rc = virCgroupForDomain(driver-cgroup, vm-def-name, cgroup, 0); +if (rc != 0) { +virReportSystemError(-rc, + _(Unable to find cgroup for %s), + vm-def-name); +goto cleanup; +} + +if (priv-nvcpupids == 0 || priv-vcpupids[0] == vm-pid) { +/* If we does not know VCPU-PID mapping or all vcpu runs in the same + * thread, we can not control each vcpu. + */ +if (period || quota) { +if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPU)) { +if (qemuSetupCgroupVcpuBW(cgroup, period, quota) 0) +goto cleanup; +} +} +return 0; +} I found a problem above. In the case where we are controlling quota at the domain level cgroup we must multiply the user-specified quota by the number of vcpus in the domain in order to get the same performance as we would with per-vcpu cgroups. As written, the vm will be essentially capped at 1 vcpu worth of quota regardless of the number of vcpus. You will also have to apply this logic in reverse when reporting the scheduler statistics so that the quota number is a per-vcpu quantity. When quota is 1000, and per-vcpu thread is not active, we can start vm successfully. When the per-vcpu thread is active, and the num of vcpu is more than 1, we can not start vm if we multiply the user-specified quota. It will confuse the user: sometimes the vm can be started, but sometimes the vm can not be started with the same configuration. -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH RFC v3 5/6] qemu: Implement cfs_period and cfs_quota's modification
At 07/19/2011 04:44 AM, Adam Litke Write: On 07/18/2011 04:42 AM, Wen Congyang wrote: @@ -5983,7 +6169,30 @@ out: goto cleanup; } -*nparams = 1; +if (*nparams 1) { +params[1].value.ul = period; +params[1].type = VIR_TYPED_PARAM_ULLONG; +if (virStrcpyStatic(params[1].field, cfs_period) == NULL) { +qemuReportError(VIR_ERR_INTERNAL_ERROR, +%s, +_(Field cfs_period too long for destination)); +goto cleanup; +} + +params[2].value.ul = quota; Possible buffer overflow if *nparams == 2 ... Yes, I forgot check the value :( +params[2].type = VIR_TYPED_PARAM_LLONG; +if (virStrcpyStatic(params[2].field, cfs_quota) == NULL) { +qemuReportError(VIR_ERR_INTERNAL_ERROR, +%s, +_(Field cfs_quota too long for destination)); +goto cleanup; +} + +*nparams = 3; +} else { +*nparams = 1; +} + ret = 0; cleanup: -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH RFC v3 0/6] support cpu bandwidth in libvirt
At 07/18/2011 09:36 PM, Lee Schermerhorn Write: On Mon, 2011-07-18 at 17:34 +0800, Wen Congyang wrote: TODO: 1. We create sub directory for each vcpu in cpu subsystem. So we should recalculate cpu.shares for each vcpu. Is the per vcpu cgroup optional? I.e., is is possible to set the period and quota for the entire domain and let the host scheduler deal with it? Caveat: Domain level CFS shares seems to work well--work well here means behaves as I expected ;-). I have no experience with the period/quota facility and libvirt domains, so maybe it doesn't make The quota's value means that all tasks in this task group as a whole will not be allowedto consume more than quota(unit: us) worth of runtime within a period of period(unit: us). If per-vcpu thread is active, each vcpu has a thread. If one vcpu consume quota worth of runtime, the other vcpu will be hunger. sense to cap cpu utilization at the domain level. Regards, Lee Changelog: v3: fix some small bugs implement the simple way v2: almost rewrite the patchset to support to control each vcpu's bandwidth. Limit quota to [-1, 2^64/1000] at the schemas level. We will check it at cgroup level. Wen Congyang (6): Introduce the function virCgroupForVcpu cgroup: Implement cpu.cfs_period_us and cpu.cfs_quota_us tuning API Update XML Schema for new entries qemu: Implement period and quota tunable XML configuration and parsing qemu: Implement cfs_period and cfs_quota's modification doc: Add documentation for new cputune elements period and quota docs/formatdomain.html.in | 19 ++ docs/schemas/domain.rng | 26 +++- src/conf/domain_conf.c | 20 ++- src/conf/domain_conf.h |2 + src/libvirt_private.syms|5 + src/qemu/qemu_cgroup.c | 127 +++ src/qemu/qemu_cgroup.h |4 + src/qemu/qemu_driver.c | 259 --- src/qemu/qemu_process.c |4 + src/util/cgroup.c | 153 +- src/util/cgroup.h | 11 + tests/qemuxml2argvdata/qemuxml2argv-cputune.xml |2 + 12 files changed, 596 insertions(+), 36 deletions(-) -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] doc: Correct docs for iface commands
于 2011年07月19日 03:24, Eric Blake 写道: On 07/18/2011 05:45 AM, Osier Yang wrote: duplicate documents for iface-name, lacks of document for iface-mac, inconsistent option names with virsh help strings. --- tools/virsh.pod | 16 ++-- 1 files changed, 10 insertions(+), 6 deletions(-) diff --git a/tools/virsh.pod b/tools/virsh.pod index 377fac0..4f218d3 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -1081,20 +1081,24 @@ Returns the list of active host interfaces. If I--all is specified this will also include defined but inactive interfaces. If I--inactive is specified only the inactive ones will be listed. -=item Biface-name Iiface-MAC +=item Biface-name Iinterface We have other cases of man page names differing from virsh.c names. For example, 'virsh help domid' documents 'domain' along with later mentioning domain name or uuid under OPTIONS, but 'man virsh' merely documents 'domid domain-name-or-uuid' in its synopsis. I'm okay with fixing the synopsis to match virsh, but we should probably scrub things to fix all commands to use consistent names. At any rate, this is strict improvement, so: ACK. Pushed, thanks, and I'm looking at the whole virsh manual to work out a patch fixes the all. Osier -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v3 5/8] undefine: Implement internal API for libxl driver
于 2011年07月19日 02:43, Eric Blake 写道: On 07/18/2011 01:53 AM, Osier Yang wrote: This is just similiar as changes on qemu driver. * src/libxl/libxl_driver.c: New callback for libxl_driver, new function libxlDomainUndefineFlags, and changes libxlDomainUndefine as a wrapper of libxlDomainUndefineFlags. --- src/libxl/libxl_driver.c | 34 +- 1 files changed, 33 insertions(+), 1 deletions(-) ACK. +char *name = NULL; int ret = -1; +virCheckFlags(VIR_DOMAIN_UNDEFINE_MANAGED_SAVE, -1); Oh, here you went with MANAGED_SAVE, instead of MANAGED_STATE in patch 4/8 v3. Double check that each patch in your series compiles cleanly before pushing. I'm creating a whole v4 series (without the patch for virsh dominfo), in case of I miss something again, I'm always careless. :( Osier -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list