Re: [libvirt] [PATCH] libvirt: add daemon itself as shutdown reason

2018-11-07 Thread Nikolay Shirokovskiy



On 07.11.2018 17:29, John Ferlan wrote:
> 
> 
> On 10/8/18 7:21 AM, Nikolay Shirokovskiy wrote:
>> Let's introduce shutdown reason "daemon" which means we have to
>> kill running domain ourselves as the best action we can take at
>> that moment. Failure to pick up domain on daemon restart is
>> one example of such case. Using reason "crashed" is a bit misleading
>> as it is used when qemu is actually crashed or qemu destroyed on
>> guest panic as result of user choice that is the problem was
>> in qemu/guest itself. So I propose to use "crashed" only for
>> qemu side issues and introduce "daemon" when we have to kill the qemu
>> for good.
>>
>> There is another example where "daemon" will be useful. If we can
>> not reboot domain we kill it and got "crashed" reason now. Looks
>> like good candidate for "daemon" reason.
>>
>> Signed-off-by: Nikolay Shirokovskiy 
>> ---
>>  include/libvirt/libvirt-domain.h |  1 +
>>  src/conf/domain_conf.c   |  3 ++-
>>  src/qemu/qemu_process.c  | 11 ---
>>  tools/virsh-domain-monitor.c |  3 ++-
>>  4 files changed, 9 insertions(+), 9 deletions(-)
>>
> 
> [...]
> 
> Now that I've pushed commits 296e05b54 and 8f0f8425d, let's revisit
> this...  Merging in that set of changes with this patch means that
> instead of:
> 
>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>> index e9c7618..c4bc9ca 100644
>> --- a/src/qemu/qemu_process.c
>> +++ b/src/qemu/qemu_process.c
>> @@ -7988,15 +7988,12 @@ qemuProcessReconnect(void *opaque)
>>  /* We can't get the monitor back, so must kill the VM
>>   * to remove danger of it ending up running twice if
>>   * user tries to start it again later
>> - * If we couldn't get the monitor since QEMU supports
>> - * no-shutdown, we can safely say that the domain
>> - * crashed ... */
>> -state = VIR_DOMAIN_SHUTOFF_CRASHED;
>> -/* If BeginJob failed, we jumped here without a job, let's hope 
>> another
>> + * If BeginJob failed, we jumped here without a job, let's hope 
>> another
>>   * thread didn't have a chance to start playing with the domain yet
>>   * (it's all we can do anyway).
>>   */
>> -qemuProcessStop(driver, obj, state, QEMU_ASYNC_JOB_NONE, stopFlags);
>> +qemuProcessStop(driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
>> +QEMU_ASYNC_JOB_NONE, stopFlags);
>>  }
>>  goto cleanup;
>>  }
>> @@ -8035,7 +8032,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
>>   * is no thread that could be doing anything else with the same 
>> domain
>>   * object.
>>   */
>> -qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
>> +qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
>>  QEMU_ASYNC_JOB_NONE, 0);
>>  qemuDomainRemoveInactiveJobLocked(src->driver, obj);
>>  
> 
> We'd have:
> 
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index 3bf84834ec..023e187729 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -7995,10 +7995,14 @@ qemuProcessReconnect(void *opaque)
>   *
>   * If we cannot get to the monitor when the QEMU command
>   * line used -no-shutdown, then we can safely say that the
> - * domain crashed; otherwise, we don't really know. */
> + * domain crashed; otherwise, if the monitor was started,
> + * then we can blame ourselves, else we failed before the
> + * monitor started so we don't really know. */
>  if (!priv->mon && tryMonReconn &&
>  qemuDomainIsUsingNoShutdown(priv))
>  state = VIR_DOMAIN_SHUTOFF_CRASHED;
> +else if (priv->mon)
> +state = VIR_DOMAIN_SHUTOFF_DAEMON;
>  else
>  state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
> 
> @@ -8045,7 +8049,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
>   * is no thread that could be doing anything else with the same
> domain
>   * object.
>   */
> -qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
> +qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
>  QEMU_ASYNC_JOB_NONE, 0);

I think we need to report VIR_DOMAIN_SHUTOFF_UNKNOWN here. This is similar to 
qemuProcessReconnect before we try to reconnect. May be this hunk need to be 
placed in distinct patch therefore.

>  qemuDomainRemoveInactiveJobLocked(src->driver, obj);
> 
> [...]
> 
> So does this essentially meet/handle the condition you were trying to
> alter?  That is - once we get beyond the monitor reconnection, then if
> we end up in error that means the daemon has decided to stop things.
> Leaving the UNKNOWN to be the period prior to attempting to reconnect
> and CRASHED to the period during which we try to connect to the monitor.
> 
> This also captures failures after connection is successful (when

Re: [libvirt] [PATCH] virNetworkDefUpdateDNSHost: Require both IP and a hostname to match

2018-11-07 Thread W. Trevor King
On Wed, Nov 07, 2018 at 11:17:51AM -0500, Laine Stump wrote:
> On 11/6/18 12:10 PM, W. Trevor King wrote:
> > But HOST records can have the same hostname for multiple records
> > (similar to TXT records with the same value).  The value that
> > needs to be distinct for HOST records is the IP address.
> 
> You're going to force me to go dig out the decades-old DNS RFCs,
> aren't you?

The relevant RFCs are probably [1,2].  But as far as they're concened
IP <-> domain-name mappings are many to many.  That doesn't really
impact libvirt's  entries.  For example:

  IP   hostname
  192.168.1.1  alice
  192.168.1.1  bob
  192.168.1.2  alice

could be represented with unique IPs:

  

  alice
  bob


  alice

  

or with repeated IPs:

  

  alice


  bob


  alice

  

Depending on how generous you were feeling, you could also support
overlaps:

  

  alice
  bob


  bob


  alice

  

I know libvirt supports the unique-IP form.  I haven't looked to see
whether it supports the repeated IP form (with or without overlaps).

> (but in the meantime, if something was working before (e.g. removing
> an entry by hostname alone) then that needs to continue working,
> otherwise some application depending on that behavior will now be
> broken, and we've made promises about that not happening with
> libvirt.

I think continuing to support the IP/hostname-union logic makes things
too complicated.  Would you have to detect the "we match too many
things and would ordinarily error out" case so you could switch on the
"they must expect this to be an IP/hostname-intersection match" logic?
How would you feel about a v2 function (a la dup2), which callers
switch on with VIR_NETWORK_SECTION_DNS_HOST2 or something to opt in to
the new logic?  This would also give you a way to notice the old
section variable and log deprecation notices (does the libary have
hooks for logging?) to help get the word out, and help set the stage
for possibly removing the old logic after a few years.

> > 1. You can now delete entries from an existing network like:
> >
> >  
> >
> >  example
> >
> >
> >  example
> >
> >  
> >
> >   with input like:
> >
> >
> >
> >
> >   or:
> >
> >
> >  example
> >
> >
> >   Previously, only the former would work (the latter used to raise
> >   "multiple matching DNS HOST records were found in network").
> 
> 
> Without thinking too much about it, that (the "multiple matching
> "  error) sounds like a bug that can/should be fixed - since
> both hostname and ip find only a single record, there shouldn't be
> an error.

Both  elements in the original network are "example", so
they are not unique.  The current logic's IP/hostname-union logic
matches both, and errors out.  I don't think this is a bug we can fix
without something ugly like "if IP/hostname-union gives multiple
matches, implicitly switch to IP/hostname-intersection" or a way to
explicitly select the more-specific logic.

> > 3. You can now add multiple entries with a common hostname (as long as
> >they have distinct IP addresses).  Previously, adding:
> >
> >  
> >example
> >  
> >
> >to an existing:
> >
> >  
> >example
> >  
> >
> >would have raised "there is already at least one DNS HOST record
> >with a matching field in network".
> > ---
> >
> > I'm actually not clear on whether the 'ip' attribute is required
> > to be unique or not.
> 
> Well, *something* needs to be unique. Either one of the fields, or
> the combination of some fields. If possible, decisions about that
> should be based on the RFCs, and then if the backend implementation
> (dnsmasq in this case) is any different, that should be treated as a
> different kind of error.

We've had no trouble creating networks where multiple IP share a
common hostname in libvirt.  The only problem I'm aware of (and the
reason for this patch) is that libvirt doesn't allow you to *update*
those networks.  You currently have to delete them and then re-create
them, after which you need to re-attach your domains to the new
network.

Cheers,
Trevor

[1]: https://tools.ietf.org/html/rfc1035
[2]: https://tools.ietf.org/html/rfc1794

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] build: Fix uninstall when WITH_APPARMOR_PROFILES is defined

2018-11-07 Thread Jim Fehlig
When libvirt configuration includes '--with-apparmor-profiles', the
make uninstall target fails

make[1]: Entering directory '/home/jim/upstream/libvirt/examples'
 ( cd '/etc/apparmor.d//abstractions' && rm -f libvirt-qemu libvirt-lxc )
 ( cd '/etc/apparmor.d/' && rm -f usr.lib.libvirt.virt-aa-helper 
usr.sbin.libvirtd )
make[1]: *** No rule to make target 'uninstall-apparmor-local', needed by
'uninstall-local'.  Stop.

Add missing 'uninstall-apparmor-local' target to the examples Makefile.am.

Signed-off-by: Jim Fehlig 
---
 examples/Makefile.am | 4 
 1 file changed, 4 insertions(+)

diff --git a/examples/Makefile.am b/examples/Makefile.am
index 7069d74e74..27f8b0ef09 100644
--- a/examples/Makefile.am
+++ b/examples/Makefile.am
@@ -96,6 +96,10 @@ install-apparmor-local:
'usr.lib.libvirt.virt-aa-helper'" \
>$(APPARMOR_LOCAL_DIR)/usr.lib.libvirt.virt-aa-helper
 
+uninstall-apparmor-local:
+   rm -f "$(APPARMOR_LOCAL_DIR)/usr.lib.libvirt.virt-aa-helper"
+   rmdir $(APPARMOR_LOCAL_DIR) || :
+
 INSTALL_DATA_LOCAL += install-apparmor-local
 UNINSTALL_LOCAL += uninstall-apparmor-local
 endif WITH_APPARMOR_PROFILES
-- 
2.18.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 1/2] util: Fix virCgroupGetMemoryStat

2018-11-07 Thread John Ferlan
From: Peter Chubb 

Commit 901d2b9c introduced virCgroupGetMemoryStat and replaced
the LXC virLXCCgroupGetMemStat logic in commit e634c7cd0. However,
in doing so the replacement wasn't exact as the LXC logic used
getline() to process the cgroup controller data, while the new
virCgroupGetMemoryStat used "memory.stat" manual buffer read/
processing which neglected to forward through @line in order
to read each line in the output.

To fix that, we should be sure to carry forward the @line value
for each line read updating it beyond that current @newLine value
once we've calculated the values that we want.

Signed-off-by: John Ferlan 
---
 src/util/vircgroupv1.c | 7 ++-
 src/util/vircgroupv2.c | 7 ++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/util/vircgroupv1.c b/src/util/vircgroupv1.c
index 28a74474ee..678ffda35c 100644
--- a/src/util/vircgroupv1.c
+++ b/src/util/vircgroupv1.c
@@ -1476,7 +1476,7 @@ virCgroupV1GetMemoryStat(virCgroupPtr group,
 
 line = stat;
 
-while (line) {
+while (line && *line) {
 char *newLine = strchr(line, '\n');
 char *valueStr = strchr(line, ' ');
 unsigned long long value;
@@ -1506,6 +1506,11 @@ virCgroupV1GetMemoryStat(virCgroupPtr group,
 inactiveFileVal = value >> 10;
 else if (STREQ(line, "unevictable"))
 unevictableVal = value >> 10;
+
+if (newLine)
+line = newLine + 1;
+else
+break;
 }
 
 *cache = cacheVal;
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c
index 32adb06784..541e8e790e 100644
--- a/src/util/vircgroupv2.c
+++ b/src/util/vircgroupv2.c
@@ -1068,7 +1068,7 @@ virCgroupV2GetMemoryStat(virCgroupPtr group,
 
 line = stat;
 
-while (line) {
+while (line && *line) {
 char *newLine = strchr(line, '\n');
 char *valueStr = strchr(line, ' ');
 unsigned long long value;
@@ -1102,6 +1102,11 @@ virCgroupV2GetMemoryStat(virCgroupPtr group,
 inactiveFileVal = value >> 10;
 else if (STREQ(line, "unevictable"))
 unevictableVal = value >> 10;
+
+if (newLine)
+line = newLine + 1;
+else
+break;
 }
 
 *cache = cacheVal;
-- 
2.17.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 2/2] tests: Augment vcgrouptest to add virCgroupGetMemoryStat

2018-11-07 Thread John Ferlan
Add a test to fetch the GetMemoryStat output. This only gets
data for v1 only right now since the v2 data from commit 61ff6021
is rather useless returning all 0's.

Signed-off-by: John Ferlan 
---
 tests/vircgrouptest.c | 61 +++
 1 file changed, 61 insertions(+)

diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c
index 310e1fb6a2..06c4a8ef5c 100644
--- a/tests/vircgrouptest.c
+++ b/tests/vircgrouptest.c
@@ -802,6 +802,64 @@ static int testCgroupGetMemoryUsage(const void *args 
ATTRIBUTE_UNUSED)
 return ret;
 }
 
+
+static int
+testCgroupGetMemoryStat(const void *args ATTRIBUTE_UNUSED)
+{
+virCgroupPtr cgroup = NULL;
+int rv, ret = -1;
+size_t i;
+
+const unsigned long long expected_values[] = {
+1336619008ULL,
+67100672ULL,
+145887232ULL,
+661872640ULL,
+627400704UL,
+3690496ULL
+};
+const char* names[] = {
+"cache",
+"active_anon",
+"inactive_anon",
+"active_file",
+"inactive_file",
+"unevictable"
+};
+unsigned long long values[ARRAY_CARDINALITY(expected_values)];
+
+if ((rv = virCgroupNewPartition("/virtualmachines", true,
+(1 << VIR_CGROUP_CONTROLLER_MEMORY),
+)) < 0) {
+fprintf(stderr, "Could not create /virtualmachines cgroup: %d\n", -rv);
+goto cleanup;
+}
+
+if ((rv = virCgroupGetMemoryStat(cgroup, [0],
+ [1], [2],
+ [3], [4],
+ [5])) < 0) {
+fprintf(stderr, "Could not retrieve GetMemoryStat for /virtualmachines 
cgroup: %d\n", -rv);
+goto cleanup;
+}
+
+for (i = 0; i < ARRAY_CARDINALITY(expected_values); i++) {
+if (expected_values[i] != (values[i] << 10)) {
+fprintf(stderr,
+"Wrong value (%llu) for %s from virCgroupGetMemoryStat 
(expected %llu)\n",
+values[i], names[i], expected_values[i]);
+goto cleanup;
+}
+}
+
+ret = 0;
+
+ cleanup:
+virCgroupFree();
+return ret;
+}
+
+
 static int testCgroupGetBlkioIoServiced(const void *args ATTRIBUTE_UNUSED)
 {
 virCgroupPtr cgroup = NULL;
@@ -1035,6 +1093,9 @@ mymain(void)
 if (virTestRun("virCgroupGetMemoryUsage works", testCgroupGetMemoryUsage, 
NULL) < 0)
 ret = -1;
 
+if (virTestRun("virCgroupGetMemoryStat works", testCgroupGetMemoryStat, 
NULL) < 0)
+ret = -1;
+
 if (virTestRun("virCgroupGetPercpuStats works", testCgroupGetPercpuStats, 
NULL) < 0)
 ret = -1;
 cleanupFakeFS(fakerootdir);
-- 
2.17.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 0/2] Fix virCgroupGetMemoryStat memory.stat parsing

2018-11-07 Thread John Ferlan
Originally reported here:

https://www.redhat.com/archives/libvir-list/2018-November/msg00252.html

John Ferlan (1):
  tests: Augment vcgrouptest to add virCgroupGetMemoryStat

Peter Chubb (1):
  util: Fix virCgroupGetMemoryStat

 src/util/vircgroupv1.c |  7 -
 src/util/vircgroupv2.c |  7 -
 tests/vircgrouptest.c  | 61 ++
 3 files changed, 73 insertions(+), 2 deletions(-)

-- 
2.17.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Parsing of memory.stat in libvirt

2018-11-07 Thread John Ferlan



On 11/6/18 3:25 AM, peter.ch...@data61.csiro.au wrote:
> 
> Hi Folks,
>libvirt currently spams the logs with
> 
> error : virCgroupGetMemoryStat:2490 : internal error: Cannot parse 
> 'memory.stat' cgroup file
> 
>whenever someone does ps in a container.
> 
> This is because the parser for memory/stat is incorrect: the `line'
> variable is never updated, so each time through the loop, the same
> start-of-line is compared for the token;  and as all spaces are
> eventually replaced with NUL the error exit is taken instead of
> ending the loop properly. 
> 
> Here is a strawman patch to fix the problem. Please note, I'm not
> subscribed to this list;  I reported the bug as
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=913023 and was asked
> to send the patch upstream.
> 
>  src/util/vircgroup.c |7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 

H... well I see quite true... nice catch!

Seems commit id 901d2b9c87 added the code and commit e634c7cd0 used it
for LXC's virLXCCgroupGetMemStat (libvirt 4.7.0, so rather recent
fortunately). However that conversion seemed to missed the nuance that
getline() is a bit different than the algorithm generated parsing
through the read file.

Too bad an existing test wasn't utilized for it, because that would have
failed miserably.

An any case, the patch below is against an older version of libvirt, we
typically work from top of the git tree, but since you're not a regular
contributor I will put something together and CC you using top of tree.
Since we adhere to usage of Signed-Off-By, I'd need to have you agree to
have me add an S-O-B with your name. For now I'll put my name with you
as the author.

> --- libvirt.orig/src/util/vircgroup.c
> +++ libvirt/src/util/vircgroup.c
> @@ -2477,7 +2477,7 @@ virCgroupGetMemoryStat(virCgroupPtr grou
>  
>  line = stat;
>  
> -while (line) {
> +while (*line) {

probably should be line && *line  Since if line was for who knows what
reason NULL, then life wouldn't be happy.

Tks,

John

updated patch will be sent shortly (with any luck).

>  char *newLine = strchr(line, '\n');
>  char *valueStr = strchr(line, ' ');
>  unsigned long long value;
> @@ -2507,6 +2507,11 @@ virCgroupGetMemoryStat(virCgroupPtr grou
>  inactiveFileVal = value >> 10;
>  else if (STREQ(line, "unevictable"))
>  unevictableVal = value >> 10;
> +
> + if (newLine)
> + line = newLine + 1;
> + else
> + break;
>  }
>  
>  *cache = cacheVal;
> 
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] tests: Use correct function name in error path

2018-11-07 Thread John Ferlan
Commit id 5eb61e6846 neglected to change the name in the wrong value
output to virCgroupGetPercpuStats from virCgroupGetMemoryUsage.

Signed-off-by: John Ferlan 
---

 Pushed under the trivial rule.

 tests/vircgrouptest.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c
index 8fcee21bb2..310e1fb6a2 100644
--- a/tests/vircgrouptest.c
+++ b/tests/vircgrouptest.c
@@ -756,7 +756,7 @@ static int testCgroupGetPercpuStats(const void *args 
ATTRIBUTE_UNUSED)
 
 if (params[i].value.ul != expected[i]) {
 fprintf(stderr,
-"Wrong value from virCgroupGetMemoryUsage at %zu (expected 
%llu)\n",
+"Wrong value from virCgroupGetPercpuStats at %zu (expected 
%llu)\n",
 i, params[i].value.ul);
 goto cleanup;
 }
-- 
2.17.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC 3/3] qemu: update qemu command-line generating for NVDIMM memory

2018-11-07 Thread John Ferlan



On 10/16/18 10:21 PM, Luyao Zhong wrote:
> According to the result parsing from xml, add corresponding properties
> into QEMU command line, including 'align', 'pmem', 'persistence' and
> 'nvdimm-persistence'. And add testcases related to these properties.
> 
> Signed-off-by: Zhong,Luyao 
> ---
>  src/qemu/qemu_command.c| 25 ++
>  .../memory-hotplug-nvdimm-align.args   | 31 
>  .../memory-hotplug-nvdimm-align.xml| 58 
> ++
>  .../memory-hotplug-nvdimm-persistence.args | 31 
>  .../memory-hotplug-nvdimm-persistence.xml  | 58 
> ++
>  .../memory-hotplug-nvdimm-pmem.args| 31 
>  .../memory-hotplug-nvdimm-pmem.xml | 58 
> ++
>  .../memory-hotplug-nvdimm-unarmed.args | 31 
>  .../memory-hotplug-nvdimm-unarmed.xml  | 58 
> ++
>  tests/qemuxml2argvtest.c   | 12 +
>  10 files changed, 393 insertions(+)
>  create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-align.args
>  create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-align.xml
>  create mode 100644 
> tests/qemuxml2argvdata/memory-hotplug-nvdimm-persistence.args
>  create mode 100644 
> tests/qemuxml2argvdata/memory-hotplug-nvdimm-persistence.xml
>  create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-pmem.args
>  create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-pmem.xml
>  create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-unarmed.args
>  create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-unarmed.xml
> 

As I noted in patch2, the .xml files from xml2xmlargvdata belong there...

You are missing qemu_capabilites.{c,h} and friends changes. Many other
examples, but they essentially determine which version the attribute was
added to QEMU and only allow/use it for those versions and beyond.

When adding capabilities, we tend to make them singular patches, but
you'd only need 1 as I assume all the new switches were added for the
same qemu release.  You can note that in the commit message too.

Running make check make-check would have told you:

--- tests/qemuxml2argvdata/memory-hotplug-nvdimm-persistence.args
2018-11-07 12:55:59.165965066 -0500
+++ -   2018-11-07 12:56:21.555957343 -0500
@@ -7,7 +7,8 @@
 /usr/bin/qemu-system-i686 \
 -name QEMUGuest1 \
 -S \
--machine
pc,accel=tcg,usb=off,dump-guest-core=off,nvdimm=on,nvdimm-persistence=mem-ctrl
\
+-machine pc,accel=tcg,usb=off,dump-guest-core=off,nvdimm=on,\
+nvdimm-persistence=mem-ctrl \
 -m size=219136k,slots=16,maxmem=1099511627776k \
 -smp 2,sockets=2,cores=1,threads=1 \
 -numa node,nodeid=0,cpus=0-1,mem=214 \
Incorrect line wrapping in
tests/qemuxml2argvdata/memory-hotplug-nvdimm-persistence.args
Use test-wrap-argv.pl to wrap test data files
make: *** [cfg.mk:1139: test-wrap-argv] Error 1


You're also missing a patch to update docs/news.xml to describe your new
feature. So your simple 3 patches have probably blossomed into 12 or so.
 Don't feel the need to post all at the same time. Work through the
 changes first, post those, get feedback, maybe get them pushed,
and then go from there.

> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index 887947d..466a466 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c
> @@ -3319,6 +3319,22 @@ qemuBuildMemoryBackendProps(virJSONValuePtr 
> *backendProps,
>  if (virJSONValueObjectAdd(props, "U:size", mem->size * 1024, NULL) < 0)
>  goto cleanup;
>  
> +if (mem->alignsize) {

mem->alignsize > 0

It's one of my crusades ;-)

> +if (virJSONValueObjectAdd(props,
> +  "U:align",
> +  mem->alignsize * 1024,
> +  NULL) < 0)
> +goto cleanup;
> +}

Please multiple arguments on a line like "U:size" above - just don't go
beyond 80 char wide display...

> +
> +if (mem->nvdimmPmem) {

This would be bool, so OK.

> +if (virJSONValueObjectAdd(props,
> +  "s:pmem",
> +  
> virTristateSwitchTypeToString(mem->nvdimmPmem),
> +  NULL) < 0)

if (virJSONValueObjectAdd(props, "s:pmem", "on", NULL) < 0)

> +goto cleanup;
> +}
> +
>  if (mem->sourceNodes) {
>  nodemask = mem->sourceNodes;
>  } else {
> @@ -3480,6 +3496,10 @@ qemuBuildMemoryDeviceStr(virDomainMemoryDefPtr mem)
>  if (mem->labelsize)
>  virBufferAsprintf(, "label-size=%llu,", mem->labelsize * 
> 1024);
>  
> +if (mem->nvdimmUnarmed)
> +virBufferAsprintf(, "unarmed=%s,",
> +  
> virTristateSwitchTypeToString(mem->nvdimmUnarmed));

use virBufferAddLit here.  There's other examples.

> +
>

Re: [libvirt] [RFC 2/3] xml: update xml parsing and formating about NVDIMM memory

2018-11-07 Thread John Ferlan



NB: I had to remove "zh...@redhat.com" from the CC line since it failed
to send.

On 10/16/18 10:21 PM, Luyao Zhong wrote:
> Four new parameters were introduced into libvirt xml, including
> 'align', 'pmem', 'persistence' and 'unarmed', which are related to
> NVDIMM memory device. So we need parse and format the xml to save
> these configurations.Besides, more testcases related to these
> parameters were added to verify the xml2xml process.
> 
> Signed-off-by: Zhong,Luyao 
> ---
>  src/conf/domain_conf.c | 115 
> +++--
>  src/conf/domain_conf.h |  14 +++
>  src/libvirt_private.syms   |   2 +
>  .../memory-hotplug-nvdimm-align.xml|   1 +
>  .../memory-hotplug-nvdimm-persistence.xml  |   1 +
>  .../memory-hotplug-nvdimm-pmem.xml |   1 +
>  .../memory-hotplug-nvdimm-unarmed.xml  |   1 +
>  tests/qemuxml2xmltest.c|   4 +
>  8 files changed, 132 insertions(+), 7 deletions(-)
>  create mode 12 tests/qemuxml2xmloutdata/memory-hotplug-nvdimm-align.xml
>  create mode 12 
> tests/qemuxml2xmloutdata/memory-hotplug-nvdimm-persistence.xml
>  create mode 12 tests/qemuxml2xmloutdata/memory-hotplug-nvdimm-pmem.xml
>  create mode 12 tests/qemuxml2xmloutdata/memory-hotplug-nvdimm-unarmed.xml
> 

This patch causes failures for "make check", specifically virschematest
and qemuxml2xmltest... Seems you may have forgotten the input XML for
this patch, but included it in the next one.  This one needs the input data.

Similar to patch1 comments - you'll need to extract things out a bit,
essentially creating 3 patches from this one - although those would be
included with the 3 patches for each part of patch1.

> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index 9911d56..1326116 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -932,6 +932,12 @@ VIR_ENUM_IMPL(virDomainMemoryModel,
>"dimm",
>"nvdimm")
>  
> +VIR_ENUM_IMPL(virDomainMemoryPersistence,
> +  VIR_DOMAIN_MEMORY_PERSISTENCE_LAST,
> +  "",
> +  "mem-ctrl",
> +  "cpu")
> +
>  VIR_ENUM_IMPL(virDomainShmemModel, VIR_DOMAIN_SHMEM_MODEL_LAST,
>"ivshmem",
>"ivshmem-plain",
> @@ -15656,7 +15662,9 @@ virDomainMemorySourceDefParseXML(xmlNodePtr node,
>   virDomainMemoryDefPtr def)
>  {
>  int ret = -1;
> +int val;
>  char *nodemask = NULL;
> +char *ndPmem = NULL;
>  xmlNodePtr save = ctxt->node;
>  ctxt->node = node;
>  
> @@ -15685,6 +15693,21 @@ virDomainMemorySourceDefParseXML(xmlNodePtr node,
> _("path is required for model 'nvdimm'"));
>  goto cleanup;
>  }
> +
> +if (virDomainParseMemory("./alignsize", "./alignsize/@unit", ctxt,
> + >alignsize, false, false) < 0)
> +goto cleanup;
> +
> +if ((ndPmem = virXPathString("string(./pmem)", ctxt))) {

This is a much simpler check following nosharepages' model.

> +if ((val = virTristateSwitchTypeFromString(ndPmem)) < 0) {
> +virReportError(VIR_ERR_XML_DETAIL,
> +   _("Invalid value of nvdimm 'pmem': %s"),
> +   ndPmem);
> +goto cleanup;
> +}
> +def->nvdimmPmem = val;
> +}
> +VIR_FREE(ndPmem);
>  break;
>  
>  case VIR_DOMAIN_MEMORY_MODEL_NONE:
> @@ -15696,6 +15719,7 @@ virDomainMemorySourceDefParseXML(xmlNodePtr node,
>  
>   cleanup:
>  VIR_FREE(nodemask);
> +VIR_FREE(ndPmem);
>  ctxt->node = save;
>  return ret;
>  }


> @@ -15710,6 +15734,9 @@ virDomainMemoryTargetDefParseXML(xmlNodePtr node,
>  xmlNodePtr save = ctxt->node;
>  ctxt->node = node;
>  int rv;
> +int val;
> +char *ndPrst = NULL;
> +char *ndUnarmed = NULL;
>  
>  /* initialize to value which marks that the user didn't specify it */
>  def->targetNode = -1;
> @@ -15741,11 +15768,35 @@ virDomainMemoryTargetDefParseXML(xmlNodePtr node,
> _("label size must be smaller than NVDIMM size"));
>  goto cleanup;
>  }
> +
> +if ((ndPrst = virXPathString("string(./persistence)", ctxt))) {
> +   if ((val = virDomainMemoryPersistenceTypeFromString(ndPrst)) < 0) 
> {
> +   virReportError(VIR_ERR_XML_DETAIL,
> +  _("Invalid value of nvdimm 'persistence': %s"),
> +  ndPrst);
> +   goto cleanup;
> +   }
> +   def->nvdimmPersistence = val;
> +}
> +VIR_FREE(ndPrst);

This one seems strange to place on a target since it gets added to a
machine command line.  I would think it needs to be w/ machine, because
it's not 

Re: [libvirt] [PATCH v2] lxc: Clang is complaining about possible NULL pointer.

2018-11-07 Thread John Ferlan



On 11/7/18 3:57 PM, Julio Faracco wrote:
> The array "mount" inside lxc_container is not being checked before for
> loop. Clang syntax scan is complaining about this segmentation fault.
> 
> Signed-off-by: Julio Faracco 
> ---
>  src/lxc/lxc_container.c | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 

Reviewed-by: John Ferlan 
(and pushed)

John

FWIW:
Ironically Coverity never complained about this one even though it's in
the category of things Coverity doesn't like either ;-)

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC 1/3] xml: introduce more config elements for NVDIMM memory

2018-11-07 Thread John Ferlan



On 10/16/18 10:21 PM, Luyao Zhong wrote:
> In order to align with QEMU ,four more parameters about NVDIMM will
> be introduced into Libvirt xml.
> 
> 1.alignsize
> The 'alignsize' option allows users to specify the proper alignment. When
> mmap(2) the backend files, QEMU uses the host page size by default as
> the alignment of mapping address. However, some backends may require
> alignments different from the pagesize. For example, mmap a device DAX
> on NVDIMM maybe 2M-aligned.
> 
> 2.pmem
> The 'pmem' option allows users to specify whether the backend storage of
> memory-backend-file is a real persistent memory that can be accessed in
> SNIA NVM Programming Model. If the vNVDIMM backend is in host persistent
> memory , and 'pmem' is 'on' and QEMU is built with libpmem, qemu will
> guarantee the persistence of its own writes to the vNVDIMM backend.
> 
> 3.persistence
> The 'persistence' option allows users to set platform-supported features
> about NVDIMM data persistence of a guest. It has two values, 'mem-ctrl'
> means the platform supports flushing dirty data from the memory controller
> to the NVDIMMs in the event of power loss, 'cpu' means The platform supports
> flushing dirty data from the CPU cache to the NVDIMMs in the event of power
> loss, which contains what 'mem-ctrl' means.
> 
> 4.unarmed
> The 'unarmed' option allows users to mark vNVDIMM read-only. Only one type
> of vNVDIMM backends can guarantee the guest write persistence, which is the
> device DAX on the real NVDIMM device (e.g., /dev/dax0.0). When using other
> types of backends, it's suggested to set 'unarmed' option to 'on', so the
> guest Linux NVDIMM driver will mark such vNVDIMM device as read-only.
> 

caveat: I didn't research too deeply into each of these options, but
I'll give you some feedback from the aspect of how you should formulate
your patches.

Rather than essentially replicate what was added in formatdomain provide
the examples here and a summary of what the proposed XML would "look
like", e.g.

Since 'alignsize' and 'pmem' seem to be related specifically to memory
mapped files:

  

  /tmp/nvdimm
  2048
  on

...
  

The  would seem to similar in description to the existing
 - although yes for a different model type.

The  seems to have two states "on" or "off", thus having just
 similar to perhaps how nosharepages is handled.

Since they're both memory mapped related changes, keep them together
from conf, xml, rng, etc. Then the "next" patch would be to do the qemu
and capability changes. With the last being any virsh changes to allow
modification on the command line (if possible).

The "persistence" seems to be less of a  option and more of a
 option, true/false?  It would thus need to be separated from
the others. Still similar to the first one, separate patch for conf,
xml, rng, html... Then patch for qemu...

The "unarmed" is a  option since it's being added in patch2
after node and label are processed, so it should be separate from the
 and  option patches.  If it's just going to have 2
states, then it doesn't need on - instead it could
just be .

> For more details, see
> https://github.com/qemu/qemu/blob/master/docs/nvdimm.txt

Not a commit message worthy...  Place it under the ---...

> 
> Signed-off-by: Zhong,Luyao 
> ---
>  docs/formatdomain.html.in | 98 
> ---
>  docs/schemas/domaincommon.rng | 31 --
>  2 files changed, 111 insertions(+), 18 deletions(-)
> 

As noted above, but perhaps difficult to determine - the html.in, .rng,
domain_conf, xml2xml tests, etc. should be in one patch.  Then another
patch for the qemu, capabilities, xml2argv tests, etc. in the next
patch. Then if you're going to add virsh options a separate patch for
that (although probably doesn't apply here).

After each patch, running make check syntax-check needs to pass.

> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index 8189959..9dec984 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -8258,6 +8258,8 @@ qemu-kvm -net nic,model=? /dev/null
>memory model='nvdimm'
>  source
>path/tmp/nvdimm/path
> +  alignsize unit='KiB'2048/alignsize
> +  pmemon/pmem
>  /source
>  target
>size unit='KiB'524288/size
> @@ -8265,6 +8267,8 @@ qemu-kvm -net nic,model=? /dev/null
>label
>  size unit='KiB'128/size
>/label
> +  persistencecpu/persistence
> +  unarmedon/unarmed
>  /target
>/memory
>  /devices
> @@ -8339,10 +8343,38 @@ qemu-kvm -net nic,model=? /dev/null
>  
>  
>  
> -  For model nvdimm this element is mandatory and has a
> -  single child element path that represents a path
> -  in the host that backs the nvdimm module in the guest.
> +  For model nvdimm this element is mandatory. The
> +  mandatory child element path represents a path in
> +  the host that 

[libvirt] [PATCH v2] lxc: Clang is complaining about possible NULL pointer.

2018-11-07 Thread Julio Faracco
The array "mount" inside lxc_container is not being checked before for
loop. Clang syntax scan is complaining about this segmentation fault.

Signed-off-by: Julio Faracco 
---
 src/lxc/lxc_container.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
index 918194dacd..d834bf01d7 100644
--- a/src/lxc/lxc_container.c
+++ b/src/lxc/lxc_container.c
@@ -867,9 +867,13 @@ static int lxcContainerSetReadOnly(void)
 }
 }
 
-if (mounts)
-qsort(mounts, nmounts, sizeof(mounts[0]),
-  virStringSortRevCompare);
+if (!mounts) {
+ret = 0;
+goto cleanup;
+}
+
+qsort(mounts, nmounts, sizeof(mounts[0]),
+  virStringSortRevCompare);
 
 for (i = 0; i < nmounts; i++) {
 VIR_DEBUG("Bind readonly %s", mounts[i]);
@@ -883,9 +887,7 @@ static int lxcContainerSetReadOnly(void)
 
 ret = 0;
  cleanup:
-for (i = 0; i < nmounts; i++)
-VIR_FREE(mounts[i]);
-VIR_FREE(mounts);
+virStringListFreeCount(mounts, nmounts);
 endmntent(procmnt);
 return ret;
 
-- 
2.17.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH 3/3] qemu: don't duplicate suspended events and state changes

2018-11-07 Thread John Ferlan



On 10/17/18 4:58 AM, Nikolay Shirokovskiy wrote:
> 
> 
> On 16.10.2018 21:46, John Ferlan wrote:
>>
>> $SUBJ:
>>
>> s/don't/Don't
>>
>> On 10/10/18 4:04 AM, Nikolay Shirokovskiy wrote:
>>> Now when STOP event handler has correct both suspended event reason
>>> and paused state reason let's wipe out duplicated event sending and
>>> state changed in/after qemuProcessStopCPUs.
>>
>> Since the STOP event handler can use the pausedReason as sent to
>> qemuProcessStopCPUs, we no longer need to send duplicate suspended
>> lifecycle events because we know what caused the stop along with extra
>> details. This processing allows us to also remove the duplicated state
>> change from qemuProcessStopCPUs.
>>
>>

Placed pkrempa on the CC line to hopefully get his feedback in order to
either move this along or require some rework due to the change possibly
generating a lifecycle event now.  It's not clear from the referenced
commit why no event was desired and I'd prefer Peter's input just in
case there's some particular reason that wasn't immediately evident or
widely known...

Tks -

John

>>>
>>> Signed-off-by: Nikolay Shirokovskiy 
>>> ---
>>>  src/qemu/qemu_driver.c| 26 +++---
>>>  src/qemu/qemu_migration.c | 42 ++
>>>  src/qemu/qemu_migration.h |  4 
>>>  src/qemu/qemu_process.c   | 13 +++--
>>>  4 files changed, 16 insertions(+), 69 deletions(-)
>>>
>>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>>> index a52e249..7e08768 100644
>>> --- a/src/qemu/qemu_driver.c
>>> +++ b/src/qemu/qemu_driver.c
>>> @@ -1792,10 +1792,8 @@ static int qemuDomainSuspend(virDomainPtr dom)
>>>  virQEMUDriverPtr driver = dom->conn->privateData;
>>>  virDomainObjPtr vm;
>>>  int ret = -1;
>>> -virObjectEventPtr event = NULL;
>>>  qemuDomainObjPrivatePtr priv;
>>>  virDomainPausedReason reason;
>>> -int eventDetail;
>>>  int state;
>>>  virQEMUDriverConfigPtr cfg = NULL;
>>>  
>>> @@ -1814,16 +1812,12 @@ static int qemuDomainSuspend(virDomainPtr dom)
>>>  if (virDomainObjCheckActive(vm) < 0)
>>>  goto endjob;
>>>  
>>> -if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) {
>>> +if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT)
>>>  reason = VIR_DOMAIN_PAUSED_MIGRATION;
>>> -eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED;
>>> -} else if (priv->job.asyncJob == QEMU_ASYNC_JOB_SNAPSHOT) {
>>> +else if (priv->job.asyncJob == QEMU_ASYNC_JOB_SNAPSHOT)
>>>  reason = VIR_DOMAIN_PAUSED_SNAPSHOT;
>>> -eventDetail = -1; /* don't create lifecycle events when doing 
>>> snapshot */
>>
>> So with these changes how do we handle this special case? See commit
>> f569b87f5 when this was introduced.
>>
>> Do we need to adjust qemuProcessHandleStop to not generate the event
>> when "reason == VIR_DOMAIN_PAUSED_SNAPSHOT && priv->job.asyncJob ==
>> QEMU_ASYNC_JOB_SNAPSHOT"?
> 
> Well we had lifecycle event anyway because of stop handler so I decided
> to keep sending it. However I'm not sure why it was not sent in 
> qemuDomainSuspend
> originally. For example for migration we do send event. Looks like this
> event does not hurt anyone if it survives up to now.
> 
> I'm ok with commit message to if the case.
> 
> Nikolay
> 
>>
>> The rest is OK - just need your thoughts on this particular case for the
>> R-By.
>>
>> John
>>
>>> -} else {
>>> +else
>>>  reason = VIR_DOMAIN_PAUSED_USER;
>>> -eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_PAUSED;
>>> -}
>>>  
>>
>> [...]
>>

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] virNetworkDefUpdateDNSHost: Require both IP and a hostname to match

2018-11-07 Thread Laine Stump
On 11/6/18 12:10 PM, W. Trevor King wrote:
> Since fc19a0059 (network: backend functions for updating network dns
> host/srv/txt, 2012-11-12), the matching logic for various network
> components has been:
>
> 1) for HOST records, it's considered a match if the IP address or any
>of the hostnames of an existing record matches.
>
> 2) for SRV records, it's a match if all of
>domain+service+protocol+target *which have been specified* are
>matched.
>
> 3) for TXT records, there is only a single field to match - name
>(value can be the same for multiple records, and isn't considered a
>search term), so by definition there can be no ambiguous matches.
>
> But HOST records can have the same hostname for multiple records
> (similar to TXT records with the same value).  The value that needs to
> be distinct for HOST records is the IP address.


You're going to force me to go dig out the decades-old DNS RFCs, aren't
you? :-P (Seriously, I think we need to go back to the original source
and make sure we're interpreting it correctly before making any changes.
I can't do that right now without losing a ton of context on other
issues in my brain though...)

(but in the meantime, if something was working before (e.g. removing an
entry by hostname alone) then that needs to continue working, otherwise
some application depending on that behavior will now be broken, and
we've made promises about that not happening with libvirt.


>   This commit updates
> the matching logic to only consider the IP address.  Compared to the
> previous HOST logic:
>
> 1. You can now delete entries from an existing network like:
>
>  
>
>  example
>
>
>  example
>
>  
>
>   with input like:
>
>
>
>
>   or:
>
>
>  example
>
>
>   Previously, only the former would work (the latter used to raise
>   "multiple matching DNS HOST records were found in network").


Without thinking too much about it, that (the "multiple matching "
error) sounds like a bug that can/should be fixed - since both hostname
and ip find only a single record, there shouldn't be an error.


>
> 2. You can no longer remove entries by hostname alone.  Previously,
>you may have been able to remove an entry from an existing network
>like:
>
>  
>
>  example-1
>
>
>  example-2
>
>  
>
>with input like:
>
>  
>example-1
>  
>
>using the 'name' property to get through the partialOkay check in
>virNetworkDHCPHostDefParseXML.  Now that input will raise "Missing
>IP address in network '%s' DNS HOST record".


That is a change in the behavior of the API where the previous behavior
was desired (i.e. not a bug), so not acceptable.


>
> 3. You can now add multiple entries with a common hostname (as long as
>they have distinct IP addresses).  Previously, adding:
>
>  
>example
>  
>
>to an existing:
>
>  
>example
>  
>
>would have raised "there is already at least one DNS HOST record
>with a matching field in network".
> ---
>
> I'm actually not clear on whether the 'ip' attribute is required to be
> unique or not.


Well, *something* needs to be unique. Either one of the fields, or the
combination of some fields. If possible, decisions about that should be
based on the RFCs, and then if the backend implementation (dnsmasq in
this case) is any different, that should be treated as a different kind
of error.


>   If not, maybe the logic should be:
>
> * Deletes with just an IP remove all  entries that match that
>   IP.
> * Deletes with just a hostname remove all  entries that
>   match that hostname.
> * Deletes with an IP and a hostname remove matching  entries
>   from  entries which match the IP.
> * If  removal completely empties a , the  is
>   also removed.
>
> Thoughts?
>
>  src/conf/network_conf.c | 31 ++-
>  1 file changed, 10 insertions(+), 21 deletions(-)
>
> diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c
> index 39a13b4..8ed62ac 100644
> --- a/src/conf/network_conf.c
> +++ b/src/conf/network_conf.c
> @@ -587,14 +587,14 @@ virNetworkDNSHostDefParseXML(const char *networkName,
>  xmlNodePtr cur;
>  char *ip;
>  
> -if (!(ip = virXMLPropString(node, "ip")) && !partialOkay) {
> +if (!(ip = virXMLPropString(node, "ip"))) {
>  virReportError(VIR_ERR_XML_DETAIL,
> _("Missing IP address in network '%s' DNS HOST 
> record"),
> networkName);
>  goto error;
>  }
>  
> -if (ip && (virSocketAddrParse(>ip, ip, AF_UNSPEC) < 0)) {
> +if (virSocketAddrParse(>ip, ip, AF_UNSPEC) < 0) {
>  virReportError(VIR_ERR_XML_DETAIL,
> _("Invalid IP address in network '%s' DNS HOST 
> record"),
> networkName);
> @@ -603,6 +603,13 @@ 

Re: [libvirt] [PATCH 0/2] conf: Forbid some nonsensical combinations under

2018-11-07 Thread John Ferlan


On 10/15/18 8:39 AM, Michal Privoznik wrote:
> *** BLURB HERE ***
> 
> Michal Prívozník (2):
>   conf: Forbid hugepages and 
>   conf: Forbid hugepages and 
> 
>  src/conf/domain_conf.c| 25 +--
>  .../qemuxml2argvdata/hugepages-memaccess2.xml |  1 -
>  2 files changed, 23 insertions(+), 3 deletions(-)
> 

It wasn't clear from the existing review comment whether you were going
to update; however, the one thing I'd point out is that by changing Post
Parse processing to add some XML check/condition that was previously
accepted could theoretically cause a guest to disappear after libvirtd
restart, right?

Thus shouldn't the changes go in the Validate processing?

John

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 5/5] snapshot: Support reparenting external disk snapshots when deleting

2018-11-07 Thread Peter Krempa
On Sun, Oct 21, 2018 at 19:38:52 +0300, Povilas Kanapickas wrote:
> Signed-off-by: Povilas Kanapickas 
> ---
>  src/qemu/qemu_driver.c | 89 ++
>  1 file changed, 89 insertions(+)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 227ec1c6d9..a3ffc19122 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c

[]

> @@ -16617,6 +16618,88 @@ 
> qemuDomainSnapshotReparentChildrenMetadata(virDomainSnapshotObjPtr snap,
> rep->cfg->snapshotDir);
>  }
>  
> +static int
> +qemuDomainSnapshotReparentDiskExternal(virQEMUDriverPtr driver,
> +   virDomainSnapshotObjPtr parent_snap,
> +   virDomainSnapshotDiskDefPtr disk)
> +{
> +const char* qemu_img_path = NULL;
> +virCommandPtr cmd = NULL;
> +int i;
> +int ret = -1;
> +const char* parent_disk_path = NULL;
> +
> +// Find the path to the disk we should use as the base when reparenting
> +// FIXME: what if there's no parent snapshot? i.e. we need to reparent on
> +// "empty" disk?
> +for (i = 0; i < parent_snap->def->dom->ndisks; ++i) {
> +if (STREQ(parent_snap->def->dom->disks[i]->dst, disk->name)) {
> +parent_disk_path = parent_snap->def->dom->disks[i]->src->path;
> +break;
> +}
> +}
> +
> +if (parent_disk_path == NULL) {
> +virReportError(VIR_ERR_INTERNAL_ERROR,
> +   _("Could not find disk to reparent '%s' to"),
> +   disk->src->path);
> +goto cleanup;
> +}
> +
> +if (!(qemu_img_path = qemuFindQemuImgBinary(driver))) {
> +goto cleanup;
> +}
> +
> +if (!(cmd = virCommandNewArgList(qemu_img_path,
> + "rebase", "-b", parent_disk_path,
> + disk->src->path,
> + NULL))) {

Okay, this solves the problems that I've mentioned previously. There
still might be problems with matching which images actually get deleted
if the definition changed.

Also note that it needs to support network disks.

This code also should be added together with the deletion code as they
are tightly related.

> +goto cleanup;
> +}
> +
> +if (virCommandRun(cmd, NULL) < 0)
> +goto cleanup;
> +
> +ret = 0;
> +cleanup:
> +virCommandFree(cmd);
> +return ret;
> +}
> +
> +static int
> +qemuDomainSnapshotReparentDisks(virDomainSnapshotObjPtr snap,
> +virQEMUSnapReparentPtr rep)
> +{
> +int i;
> +virDomainSnapshotDiskDefPtr snap_disk;
> +
> +if (!snap->def->dom) {
> +VIR_WARN("Any external disk snapshots in a snapshot created with "
> + "pre-0.9.5 libvirt will not be correctly reparented.");
> +// In snapshots created with pre-0.9.5 libvirt we don't have 
> information
> +// needed to correctly reparent external disk snapshots but we also
> +// don't even know whether external disk snapshots were used so that
> +// we could report this as an error. We can only emit a warning in 
> that
> +// case.

As said earlier, this should not happen with external snapshots.


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 2/5] snapshot: Add VIR_DEBUG to qemuDomainSnapshotCreateXML()

2018-11-07 Thread Peter Krempa
On Sun, Oct 21, 2018 at 19:38:49 +0300, Povilas Kanapickas wrote:
> Signed-off-by: Povilas Kanapickas 
> ---
>  src/qemu/qemu_driver.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 279e5d93aa..eb104d306b 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -15298,6 +15298,8 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain,
>  const char *xmlDesc,
>  unsigned int flags)
>  {
> +VIR_DEBUG("domain=%p, flags=0x%x, xmlDesc=%s", domain, flags, xmlDesc);

This is logged by the public wrapper virDomainSnapshotCreateXML:

VIR_DOMAIN_DEBUG(domain, "xmlDesc=%s, flags=0x%x", xmlDesc, flags);



signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 3/5] snapshot: Support deleting external disk snapshots when deleting

2018-11-07 Thread Peter Krempa
On Sun, Oct 21, 2018 at 19:38:50 +0300, Povilas Kanapickas wrote:
> Signed-off-by: Povilas Kanapickas 
> ---
>  src/qemu/qemu_domain.c | 45 ++
>  src/qemu/qemu_driver.c |  7 ---
>  2 files changed, 45 insertions(+), 7 deletions(-)
> 
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index ba3fff607a..73c41788f8 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -8244,6 +8244,38 @@ qemuDomainSnapshotForEachQcow2(virQEMUDriverPtr driver,
>   op, try_all, def->ndisks);
>  }
>  
> +static int
> +qemuDomainSnapshotDiscardExternal(virDomainSnapshotObjPtr snap)
> +{
> +int i;
> +virDomainSnapshotDiskDefPtr snap_disk;
> +
> +// FIXME: libvirt stores the VM definition and the list of disks that
> +// snapshot has been taken of since 0.9.5. Before that we must assume 
> that
> +// all disks from within the VM definition have been snapshotted and that
> +// they all were internal disks. Did libvirt support external snapshots
> +// before 0.9.5? If so, if we ever end up in this code path we may incur
> +// data loss as we'll delete whole QEMU file thinking it's an external
> +// snapshot.

No any external snapshot does have the state, so the condition below is
dead code.

> +if (!snap->def->dom) {
> +virReportError(VIR_ERR_OPERATION_INVALID,
> +   _("Snapshots created using libvirt 9.5 and containing 
> "
> + "external disk snapshots cannot be safely "
> + "discarded."));
> +return -1;
> +}
> +
> +for (i = 0; i < snap->def->ndisks; ++i) {
> +snap_disk = &(snap->def->disks[i]);
> +if (snap_disk->snapshot != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)
> +continue;
> +if (unlink(snap_disk->src->path) < 0)
> +VIR_WARN("Failed to unlink snapshot disk '%s'",
> + snap_disk->src->path);

So this will ever only work properly for leaf snapshots. If you try to
delete a snapshot which has children you'll destroy any of it's
children. Since they are not deleted here we can't do it without that
since reverting to them would be broken.

Ideally we want to merge the changes to all relevant snapshots so that
the intermediate files and snapshot metadata can be deleted without data
loss even in the middle of the chain/tree.

Also one other note, snapshots are possible on networked storage where
unlink will not work, so this needs to be made more general.


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 1/5] snapshot: Implement reverting for external disk snapshots

2018-11-07 Thread Peter Krempa
On Sun, Oct 21, 2018 at 19:38:48 +0300, Povilas Kanapickas wrote:
> Signed-off-by: Povilas Kanapickas 
> ---
>  src/qemu/qemu_driver.c | 246 +
>  1 file changed, 224 insertions(+), 22 deletions(-)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index a52e2495d5..279e5d93aa 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -15952,19 +15952,194 @@ qemuDomainSnapshotHasMetadata(virDomainSnapshotPtr 
> snapshot,
>  return ret;
>  }
>  
> +static int
> +qemuDomainSnapshotRevertExternalSingleDisk(virQEMUDriverPtr driver,
> +   virDomainDiskDefPtr revert_disk,
> +   virDomainDiskDefPtr backing_disk)
> +{
> +virCommandPtr cmd = NULL;
> +const char *qemuImgPath;
> +virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
> +int ret = -1;
> +
> +if (!(qemuImgPath = qemuFindQemuImgBinary(driver)))
> +goto cleanup;
> +
> +if (unlink(revert_disk->src->path) < 0)
> +VIR_WARN("Failed to overwrite current image for snapshot '%s'",
> + revert_disk->src->path);
> +
> +/* TODO: maybe figure out the format from the backing_disk? */

This is necessary, but should be known since it's recorded in the
overlay file.

> +revert_disk->src->format = VIR_STORAGE_FILE_QCOW2;
> +/* FIXME: what about revert_disk->src->backingStore ? */
> +
> +/* creates cmd line args: qemu-img create -f qcow2 -o */
> +if (!(cmd = virCommandNewArgList(qemuImgPath,
> + "create",
> + "-f",
> + 
> virStorageFileFormatTypeToString(revert_disk->src->format),
> + "-o",
> + NULL)))
> +goto cleanup;
> +
> +/* adds cmd line arg: 
> backing_file=/path/to/backing/file,backing_fmt=format */
> +virCommandAddArgFormat(cmd, "backing_file=%s,backing_fmt=%s",
> +   backing_disk->src->path,
> +   
> virStorageFileFormatTypeToString(backing_disk->src->format));
> +
> +/* adds cmd line args: /path/to/target/file */
> +virCommandAddArg(cmd, revert_disk->src->path);
> +
> +if (virCommandRun(cmd, NULL) < 0)
> +goto cleanup;
> +
> +virCommandFree(cmd);
> +cmd = NULL;
> +ret = 0;
> +
> + cleanup:
> +virCommandFree(cmd);
> +virObjectUnref(cfg);
> +
> +return ret;
> +}
> +
> +static void
> +qemuComputeSnapshotDiskToDomainDiskMapping(virDomainSnapshotDefPtr snap_def,

There already should be a similar function named "...AlignDisks..." or
something like that.

> +   virDomainDefPtr dom_def,
> +   int* out_mapping)
> +{
> +size_t i, j;
> +int found_idx;
> +virDomainSnapshotDiskDefPtr snap_disk;
> +
> +for (i = 0; i < snap_def->ndisks; ++i) {
> +snap_disk = &(snap_def->disks[i]);
> +if (snap_disk->snapshot != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)
> +continue;
> +
> +found_idx = -1;
> +for (j = 0; j < dom_def->ndisks; ++j) {
> +if (STREQ(dom_def->disks[j]->dst, snap_disk->name)) {
> +found_idx = j;
> +break;
> +}
> +}
> +out_mapping[i] = found_idx;
> +}
> +}
> +
> +/* This function reverts an external snapshot disk state. With external disks
> +   we can't just revert to the disks listed in the domain stored within the
> +   snapshot, because it's read-only and might be used by other VMs in 
> different
> +   backing chains. Since the contents of the current disks will be discarded
> +   in favor of data in the snapshot, we reuse them by resetting them and
> +   pointing the backing image link to the disks listed within the snapshot.
> +
> +   The domain is expected to be inactive.
> +
> +   new_def is the new domain definition that will be stored to vm sometime in
> +   the future.
> + */
> +static int
> +qemuDomainSnapshotRevertInactiveExternal(virQEMUDriverPtr driver,
> + virDomainObjPtr vm,
> + virDomainSnapshotObjPtr snap,
> + virDomainDefPtr new_def)
> +{
> +/* We have the following disks recorded in the snapshot and the current
> +   domain definition:
> +
> +- the current disk state before the revert (vm->def->disks). We'll
> +  discard and reuse them.

So this has multiple problems:

1) You can have a chain of snapshots and revert in middle of that:
- this will destroy any snapshots which depend on the one you are
  reverting to and that is not acceptable

2) The snapshot can have a different mapping of disks. When reverting
the old topology should be kept as it existed at the time when 

[libvirt] [PATCH for-3.2 v2] vhost-user: define conventions for vhost-user backends

2018-11-07 Thread Marc-André Lureau
As discussed during "[PATCH v4 00/29] vhost-user for input & GPU"
review, let's define a common set of backend conventions to help with
management layer implementation, and interoperability.

v2:
 - use a vhost-user.json schema to discover backends and describe
   capability format
 - drop --pidfile
 - add some notes about daemonizing & stdin/out/err

Cc: libvir-list@redhat.com
Cc: Gerd Hoffmann 
Cc: Daniel P. Berrangé 
Cc: Changpeng Liu 
Cc: Dr. David Alan Gilbert 
Cc: Felipe Franciosi 
Cc: Gonglei 
Cc: Maxime Coquelin 
Cc: Michael S. Tsirkin 
Cc: Victor Kaplansky 
Signed-off-by: Marc-André Lureau 
---
 MAINTAINERS  |   1 +
 docs/interop/vhost-user.json | 219 +++
 docs/interop/vhost-user.txt  | 101 +++-
 3 files changed, 319 insertions(+), 2 deletions(-)
 create mode 100644 docs/interop/vhost-user.json

diff --git a/MAINTAINERS b/MAINTAINERS
index bd2dff7827..58082c6d92 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1238,6 +1238,7 @@ vhost
 M: Michael S. Tsirkin 
 S: Supported
 F: hw/*/*vhost*
+F: docs/interop/vhost-user.json
 F: docs/interop/vhost-user.txt
 F: backends/vhost-user.c
 F: include/sysemu/vhost-user-backend.h
diff --git a/docs/interop/vhost-user.json b/docs/interop/vhost-user.json
new file mode 100644
index 00..91b5bf499e
--- /dev/null
+++ b/docs/interop/vhost-user.json
@@ -0,0 +1,219 @@
+# -*- Mode: Python -*-
+#
+# Copyright (C) 2018 Red Hat, Inc.
+#
+# Authors:
+#  Marc-André Lureau 
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# later. See the COPYING file in the top-level directory.
+
+##
+# = vhost user backend discovery & capabilities
+##
+
+##
+# @VHostUserBackendType:
+#
+# List the various vhost user backend types.
+#
+# @net: virtio net
+# @block: virtio block
+# @console: virtio console
+# @rng: virtio rng
+# @balloon: virtio balloon
+# @rpmsg: virtio remote processor messaging
+# @scsi: virtio scsi
+# @9p: 9p virtio console
+# @rproc-serial: virtio remoteproc serial link
+# @caif: virtio caif
+# @gpu: virtio gpu
+# @input: virtio input
+# @vsock: virtio vsock transport
+# @crypto: virtio crypto
+#
+# Since: 3.2
+##
+{
+  'enum': 'VHostUserBackendType',
+  'data': [ 'net', 'block', 'console', 'rng', 'balloon', 'rpmsg',
+'scsi', '9p', 'rproc-serial', 'caif', 'gpu', 'input', 'vsock',
+'crypto' ]
+}
+
+##
+# @VHostUserBackendInputFeature:
+#
+# List of vhost user "input" features.
+#
+# @evdev-path: The --evdev-path command line option is supported.
+# @no-grab: The --no-grab command line option is supported.
+#
+# Since: 3.2
+##
+{
+  'enum': 'VHostUserBackendInputFeature',
+  'data': [ 'evdev-path', 'no-grab' ]
+}
+
+##
+# @VHostUserBackendCapabilitiesInput:
+#
+# Capabilities reported by vhost user "input" backends
+#
+# @features: list of supported features.
+#
+# Since: 3.2
+##
+{
+  'struct': 'VHostUserBackendCapabilitiesInput',
+  'data': {
+'features': [ 'VHostUserBackendInputFeature' ]
+  }
+}
+
+##
+# @VHostUserBackendGPUFeature:
+#
+# List of vhost user "gpu" features.
+#
+# @render-node: The --render-node command line option is supported.
+# @virgl: The --virgl command line option is supported.
+#
+# Since: 3.2
+##
+{
+  'enum': 'VHostUserBackendGPUFeature',
+  'data': [ 'render-node', 'virgl' ]
+}
+
+##
+# @VHostUserBackendCapabilitiesGPU:
+#
+# Capabilities reported by vhost user "gpu" backends.
+#
+# @features: list of supported features.
+#
+# Since: 3.2
+##
+{
+  'struct': 'VHostUserBackendCapabilitiesGPU',
+  'data': {
+'features': [ 'VHostUserBackendGPUFeature' ]
+  }
+}
+
+##
+# @VHostUserBackendCapabilities:
+#
+# Capabilities reported by vhost user backends.
+#
+# @type: The vhost user backend type.
+#
+# Since: 3.2
+##
+{
+  'union': 'VHostUserBackendCapabilities',
+  'base': { 'type': 'VHostUserBackendType' },
+  'discriminator': 'type',
+  'data': {
+'input': 'VHostUserBackendCapabilitiesInput',
+'gpu': 'VHostUserBackendCapabilitiesGPU'
+  }
+}
+
+##
+# @VhostUserBackend:
+#
+# Describes a vhost user backend to management software.
+#
+# It is possible for multiple @VhostUserBackend elements to match the
+# search criteria of management software. Applications thus need rules
+# to pick one of the many matches, and users need the ability to
+# override distro defaults.
+#
+# It is recommended to create vhost user backend JSON files (each
+# containing a single @VhostUserBackend root element) with a
+# double-digit prefix, for example "50-qemu-gpu.json",
+# "50-crosvm-gpu.json", etc, so they can be sorted in predictable
+# order. The backend JSON files should be searched for in three
+# directories:
+#
+#   - /usr/share/qemu/vhost-user -- populated by distro-provided
+#   packages (XDG_DATA_DIRS covers
+#   /usr/share by default),
+#
+#   - /etc/qemu/vhost-user -- exclusively for sysadmins' local additions,
+#
+#   - $XDG_CONFIG_HOME/qemu/vhost-user -- 

Re: [libvirt] [PATCH] qemu: Don't ignore resume events

2018-11-07 Thread Peter Krempa
On Wed, Nov 07, 2018 at 15:56:02 +0100, Jiri Denemark wrote:
> On Wed, Nov 07, 2018 at 15:48:44 +0100, Peter Krempa wrote:
> > On Wed, Nov 07, 2018 at 15:11:11 +0100, Jiri Denemark wrote:
> > > Since commit v4.7.0-302-ge6d77a75c4 processing RESUME event is mandatory
> > > for updating domain state. But the event handler explicitly ignored this
> > > event in some cases. Thus the state would be wrong after a fake reboot
> > > or when a domain was rebooted after it crashed.
> > > 
> > > BTW, the code to ignore RESUME event after SHUTDOWN didn't make sense
> > > even before making RESUME event mandatory. Most likely it was there as a
> > > result of careless copy from qemuProcessHandleStop.
> > > 
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1612943
> > > 
> > > Signed-off-by: Jiri Denemark 
> > > ---
> > >  src/qemu/qemu_process.c | 8 +---
> > >  1 file changed, 1 insertion(+), 7 deletions(-)
> > > 
> > > diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> > > index c698c3b29c..59ca7cd333 100644
> > > --- a/src/qemu/qemu_process.c
> > > +++ b/src/qemu/qemu_process.c
> > > @@ -719,12 +719,7 @@ qemuProcessHandleResume(qemuMonitorPtr mon 
> > > ATTRIBUTE_UNUSED,
> > >  priv->runningReason = VIR_DOMAIN_RUNNING_UNKNOWN;
> > >  }
> > >  
> > > -if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_PAUSED) {
> > > -if (priv->gotShutdown) {
> > > -VIR_DEBUG("Ignoring RESUME event after SHUTDOWN");
> > > -goto unlock;
> > > -}
> > > -
> > > +if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_RUNNING) {
> > >  eventDetail = qemuDomainRunningReasonToResumeEvent(reason);
> > >  VIR_DEBUG("Transitioned guest %s out of paused into resumed 
> > > state, "
> > 
> > This message might become misleading since the VM may not be paused, but
> > e.g. crashed or some other state. Maybe just drop the mention of the
> > PAUSED state?
> 
> Hmm, you're right. Should I squash the second trivial patch to this one
> then since I'll be touching the message here anyway?

Sure. Just mention it in the commit message please.


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] qemu: Don't ignore resume events

2018-11-07 Thread Jiri Denemark
On Wed, Nov 07, 2018 at 15:48:44 +0100, Peter Krempa wrote:
> On Wed, Nov 07, 2018 at 15:11:11 +0100, Jiri Denemark wrote:
> > Since commit v4.7.0-302-ge6d77a75c4 processing RESUME event is mandatory
> > for updating domain state. But the event handler explicitly ignored this
> > event in some cases. Thus the state would be wrong after a fake reboot
> > or when a domain was rebooted after it crashed.
> > 
> > BTW, the code to ignore RESUME event after SHUTDOWN didn't make sense
> > even before making RESUME event mandatory. Most likely it was there as a
> > result of careless copy from qemuProcessHandleStop.
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=1612943
> > 
> > Signed-off-by: Jiri Denemark 
> > ---
> >  src/qemu/qemu_process.c | 8 +---
> >  1 file changed, 1 insertion(+), 7 deletions(-)
> > 
> > diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> > index c698c3b29c..59ca7cd333 100644
> > --- a/src/qemu/qemu_process.c
> > +++ b/src/qemu/qemu_process.c
> > @@ -719,12 +719,7 @@ qemuProcessHandleResume(qemuMonitorPtr mon 
> > ATTRIBUTE_UNUSED,
> >  priv->runningReason = VIR_DOMAIN_RUNNING_UNKNOWN;
> >  }
> >  
> > -if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_PAUSED) {
> > -if (priv->gotShutdown) {
> > -VIR_DEBUG("Ignoring RESUME event after SHUTDOWN");
> > -goto unlock;
> > -}
> > -
> > +if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_RUNNING) {
> >  eventDetail = qemuDomainRunningReasonToResumeEvent(reason);
> >  VIR_DEBUG("Transitioned guest %s out of paused into resumed state, 
> > "
> 
> This message might become misleading since the VM may not be paused, but
> e.g. crashed or some other state. Maybe just drop the mention of the
> PAUSED state?

Hmm, you're right. Should I squash the second trivial patch to this one
then since I'll be touching the message here anyway?

Jirka

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v3 06/20] backup: Document new XML for backups

2018-11-07 Thread Vladimir Sementsov-Ogievskiy
25.10.2018 22:20, Eric Blake wrote:
> Prepare for new checkpoint and backup APIs by describing the XML
> that will represent a checkpoint and backup.  The checkpoint XML
> is modeled heavily after virDomainSnapshotPtr, since both represent
> a point in time of the guest (however, a snapshot exists with the
> intent to roll back to that point, while a checkpoint exists to
> facilitate later incremental backups). Meanwhile, the backup XML
> has enough information to represent both push model (the hypervisor
> writes the backup file to a location of the user's choice) and the
> pull model (the hypervisor needs local temporary storage, and also
> creates an NBD server that the user can use to read the backup via
> a third-party client)..  But while a snapshot exists with the
> intent of rolling back to that state, a checkpoint instead makes it
> possible to create an incremental backup at a later time.

hm, the last sentence is a duplication of "(however," above

>
> Add testsuite coverage for some minimal uses of both XML.
>
> Ultimately, I'd love for push model backups to target a network
> driver rather than just a local file or block device; but doing
> that got hairy

got hairy = is complicated?

>   (while  uses  as the description
> of a host or network resource, I picked  as the description
> of a push model backup target [defaults to qcow2 but can also be
> raw or any other format]

As I remember, it defaults to source disk format.

> , and  as the description
> of a pull model backup scratch space [must be qcow2]). The ideal
> refactoring would be a way to parameterize RNG to accept
> ... so that the name of the subelement
> can be  for domain, or  or  as needed for
> backups. Future patches may improve this area of code.
>
> Signed-off-by: Eric Blake 
>
> ---
> v2: apply (some) wording changes from review
> ---
>   docs/docs.html.in|   3 +-
>   docs/domainstatecapture.html.in  |   4 +-
>   docs/format.html.in  |   1 +
>   docs/formatcheckpoint.html.in| 291 +++
>   docs/index.html.in   |   3 +-
>   docs/schemas/domainbackup.rng| 185 
>   docs/schemas/domaincheckpoint.rng|  94 ++
>   libvirt.spec.in  |   2 +
>   mingw-libvirt.spec.in|   4 +
>   tests/Makefile.am|   6 +-
>   tests/domainbackupxml2xmlin/backup-pull.xml  |   9 +
>   tests/domainbackupxml2xmlin/backup-push.xml  |   9 +
>   tests/domainbackupxml2xmlin/empty.xml|   1 +
>   tests/domainbackupxml2xmlout/backup-pull.xml |   9 +
>   tests/domainbackupxml2xmlout/backup-push.xml |   9 +
>   tests/domainbackupxml2xmlout/empty.xml   |   7 +
>   tests/domaincheckpointxml2xmlin/empty.xml|   1 +
>   tests/domaincheckpointxml2xmlin/sample.xml   |   7 +
>   tests/domaincheckpointxml2xmlout/empty.xml   |  10 +
>   tests/domaincheckpointxml2xmlout/sample.xml  |  16 +
>   tests/virschematest.c|   4 +
>   21 files changed, 670 insertions(+), 5 deletions(-)
>   create mode 100644 docs/formatcheckpoint.html.in
>   create mode 100644 docs/schemas/domainbackup.rng
>   create mode 100644 docs/schemas/domaincheckpoint.rng
>   create mode 100644 tests/domainbackupxml2xmlin/backup-pull.xml
>   create mode 100644 tests/domainbackupxml2xmlin/backup-push.xml
>   create mode 100644 tests/domainbackupxml2xmlin/empty.xml
>   create mode 100644 tests/domainbackupxml2xmlout/backup-pull.xml
>   create mode 100644 tests/domainbackupxml2xmlout/backup-push.xml
>   create mode 100644 tests/domainbackupxml2xmlout/empty.xml
>   create mode 100644 tests/domaincheckpointxml2xmlin/empty.xml
>   create mode 100644 tests/domaincheckpointxml2xmlin/sample.xml
>   create mode 100644 tests/domaincheckpointxml2xmlout/empty.xml
>   create mode 100644 tests/domaincheckpointxml2xmlout/sample.xml
>
> diff --git a/docs/docs.html.in b/docs/docs.html.in
> index 4c46b74980..4914e7dbed 100644
> --- a/docs/docs.html.in
> +++ b/docs/docs.html.in
> @@ -79,7 +79,8 @@
> domain capabilities,
> node devices,
> secrets,
> -  snapshots
> +  snapshots,
> +  backups and checkpoints
>
>   URI format
>   The URI formats used for connecting to libvirt
> diff --git a/docs/domainstatecapture.html.in b/docs/domainstatecapture.html.in
> index f7f2fe0b98..9b890b4c0c 100644
> --- a/docs/domainstatecapture.html.in
> +++ b/docs/domainstatecapture.html.in
> @@ -259,9 +259,9 @@
>   a checkpoint as a side-effect of starting a new incremental
>   backup with virDomainBackupBegin(), since a
>   second incremental backup is most useful when using the
> -checkpoint created during the first.  
> +this command.
>
> virDomainBackupBegin(), virDomainBackupEnd()
> This API wraps approaches for capturing the state of disks

Re: [libvirt] [PATCH] qemu: snapshot: better error for active external readonly disk

2018-11-07 Thread Peter Krempa
On Mon, Oct 29, 2018 at 13:35:36 +0300, Nikolay Shirokovskiy wrote:
> External snapshot of readonly disk of active domain is impossible but error
> message [1] is cryptic (it's source is qemu). Let's make error message more
> user friendly.
> 
> The problem is libvirtd precreates snapshot image with no write permission 
> which
> is not expected by qemu.
> 
> [1] current error message
> error: internal error: unable to execute QEMU command 'transaction':
> Could not create file: Permission denied
> 
> Signed-off-by: Nikolay Shirokovskiy 
> ---
> 
> By the way if domain is not active then snapshot is possible. However top 
> image
> will have write permissions after snapshot. We can make snapshot work for
> active domain I guess if we precreate writable snapshot image, but then we end
> up running readonly disk on writable image. Also making snapshot of readonly
> disk is not that practical so let's just fix error message for now.

We should not allow doing this at all then. Not even when inactive.

>  src/qemu/qemu_driver.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index a52e249..e75931e 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -14683,6 +14683,14 @@ qemuDomainSnapshotPrepare(virDomainObjPtr vm,
>active, reuse) < 0)
>  goto cleanup;
>  
> +if (dom_disk->src->readonly && active) {
> +virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> +   _("external snapshot for readonly disk %s "
> + "of active domain is not supported"),
> +   disk->name);
> +goto cleanup;
> +}

Please put this into qemuDomainSnapshotPrepareDiskExternal.
qemuDomainSnapshotPrepare is already too long and we don't need to make
it even longer.

Also as noted I don't think we should allow this even when inactive.


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] qemu: Don't ignore resume events

2018-11-07 Thread Peter Krempa
On Wed, Nov 07, 2018 at 15:11:11 +0100, Jiri Denemark wrote:
> Since commit v4.7.0-302-ge6d77a75c4 processing RESUME event is mandatory
> for updating domain state. But the event handler explicitly ignored this
> event in some cases. Thus the state would be wrong after a fake reboot
> or when a domain was rebooted after it crashed.
> 
> BTW, the code to ignore RESUME event after SHUTDOWN didn't make sense
> even before making RESUME event mandatory. Most likely it was there as a
> result of careless copy from qemuProcessHandleStop.
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1612943
> 
> Signed-off-by: Jiri Denemark 
> ---
>  src/qemu/qemu_process.c | 8 +---
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index c698c3b29c..59ca7cd333 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -719,12 +719,7 @@ qemuProcessHandleResume(qemuMonitorPtr mon 
> ATTRIBUTE_UNUSED,
>  priv->runningReason = VIR_DOMAIN_RUNNING_UNKNOWN;
>  }
>  
> -if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_PAUSED) {
> -if (priv->gotShutdown) {
> -VIR_DEBUG("Ignoring RESUME event after SHUTDOWN");
> -goto unlock;
> -}
> -
> +if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_RUNNING) {
>  eventDetail = qemuDomainRunningReasonToResumeEvent(reason);
>  VIR_DEBUG("Transitioned guest %s out of paused into resumed state, "

This message might become misleading since the VM may not be paused, but
e.g. crashed or some other state. Maybe just drop the mention of the
PAUSED state?

>"reason '%s', event detail %d",

ACK


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH v2 7/8] qemu: implement usb hub device hotunplug

2018-11-07 Thread Han Han
On Tue, Nov 6, 2018 at 6:47 AM John Ferlan  wrote:

>
> $SUBJ:
>
> s/implement/Implement
>
> On 10/12/18 4:50 AM, Han Han wrote:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1375423
> >
>
> Add the infrastructure to allow a USB Hub device to be hot unplugged.
>
> > Signed-off-by: Han Han 
> > ---
> >  src/qemu/qemu_driver.c  |  5 ++-
> >  src/qemu/qemu_hotplug.c | 74 -
> >  src/qemu/qemu_hotplug.h |  4 +++
> >  3 files changed, 81 insertions(+), 2 deletions(-)
> >
>
> This is where things get a bit dicey. Are you sure we can allow this
> given that we automagically create hub devices when there's too many USB
> devices, see:
>
> tests/qemuxml2argvdata/usb-hub-autoadd-deluxe.args
> tests/qemuxml2argvdata/usb-hub-autoadd-deluxe.xml
>
> and in the code qemuDomainUSBAddressAddHubs?
>
> Note that the test XML doesn't have a hub device defined, but yet some
> are created. If someone decides to hot unplug one that has something
> plugged into it (e.g. in the case of that test XML, some USB Input
> device), then what happens?
>

> Can you test that and ensure the results that you get?
>
Currently, if you hot-unplug a hub with usb devices attached, these
attached usb
device will be removed, too. e.g:

Start a VM with a hub. 8 usb mouse attached to the hub(Port 3.1~3.8):
# virsh qemu-monitor-command rhel7 --hmp info usb
  Device 0.2, Port 3, Speed 12 Mb/s, Product QEMU USB Hub, ID: hub0
  Device 0.2, Port 1, Speed 480 Mb/s, Product QEMU USB Tablet, ID: input0
  Device 0.3, Port 3.1, Speed 12 Mb/s, Product QEMU USB Mouse, ID: input3
  Device 0.4, Port 3.2, Speed 12 Mb/s, Product QEMU USB Mouse, ID: input4
  Device 0.5, Port 3.3, Speed 12 Mb/s, Product QEMU USB Mouse, ID: input5
  Device 0.6, Port 3.4, Speed 12 Mb/s, Product QEMU USB Mouse, ID: input6
  Device 0.7, Port 3.5, Speed 12 Mb/s, Product QEMU USB Mouse, ID: input7
  Device 0.8, Port 3.6, Speed 12 Mb/s, Product QEMU USB Mouse, ID: input8
  Device 0.9, Port 3.7, Speed 12 Mb/s, Product QEMU USB Mouse, ID: input9
  Device 0.10, Port 3.8, Speed 12 Mb/s, Product QEMU USB Mouse, ID: input10
  Device 0.0, Port 2, Speed 1.5 Mb/s, Product USB Redirection Device, ID:
redir0

Then hot-unplug the hub by hmp command:
# virsh qemu-monitor-command rhel7 --hmp device_del hub0

All usb mouse devices attached to hub disappeared in the live xml. And no
error
in the qemu VM log. However, I am not sure if other usb devices attached to
hub
will be removed without error...

I see you've essentially copied the steps that an Input device would
> take; however, I'd suspect a Hub device is a bit more special and we may
> need to process the various USB devices to make sure there isn't one
> using the Hub device by port... Even worse if it's port=1 and there's a
> port=1.1 around that equates to more ports (like from the test).
>
> The question becomes - can we determine which port a USB device is using
> via the guest status/live XML? Would the qemu del device error out or
> happily accept deleting a hub with attached ports? Or would it just drop
> all those attached devices?
>

I think that is not the expected result above. It better to refer to the
scsi controller in libvirt.
When you hot-unplug a scsi controller with scsi disk attached, you will get:
# virsh detach-device rhel7 /tmp/scsi.xml
error: Failed to detach device from /tmp/scsi.xml
error: operation failed: device cannot be detached: device is busy

>
Thanks for your review and valuable advice.

>
> Logically what's here would appear to work and is essentially similar to
> the Input devices code, so from that aspect things look OK - it's this
> one (hah) minor detail.
>
> Tks -
>
> John
>
> > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> > index de764a7f1c..c8a6d98dc0 100644
> > --- a/src/qemu/qemu_driver.c
> > +++ b/src/qemu/qemu_driver.c
> > @@ -7822,11 +7822,14 @@ qemuDomainDetachDeviceLive(virDomainObjPtr vm,
> >  ret = qemuDomainDetachVsockDevice(vm, dev->data.vsock, async);
> >  break;
> >
> > +case VIR_DOMAIN_DEVICE_HUB:
> > +ret = qemuDomainDetachHubDevice(vm, dev->data.hub, async);
> > +break;
> > +
> >  case VIR_DOMAIN_DEVICE_FS:
> >  case VIR_DOMAIN_DEVICE_SOUND:
> >  case VIR_DOMAIN_DEVICE_VIDEO:
> >  case VIR_DOMAIN_DEVICE_GRAPHICS:
> > -case VIR_DOMAIN_DEVICE_HUB:
> >  case VIR_DOMAIN_DEVICE_SMARTCARD:
> >  case VIR_DOMAIN_DEVICE_MEMBALLOON:
> >  case VIR_DOMAIN_DEVICE_NVRAM:
> > diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
> > index 1b6cc36bc8..87749ec011 100644
> > --- a/src/qemu/qemu_hotplug.c
> > +++ b/src/qemu/qemu_hotplug.c
> > @@ -4965,6 +4965,32 @@ qemuDomainRemoveRedirdevDevice(virQEMUDriverPtr
> driver,
> >  }
> >
> >
> > +static int
> > +qemuDomainRemoveHubDevice(virDomainObjPtr vm,
> > +  virDomainHubDefPtr dev)
> > +{
> > +qemuDomainObjPrivatePtr priv = vm->privateData;
> > +virQEMUDriverPtr driver = priv->driver;
> > +

Re: [libvirt] [PATCH] qemu: Fix debug message in qemuProcessHandleResume

2018-11-07 Thread Peter Krempa
On Wed, Nov 07, 2018 at 15:31:04 +0100, Jiri Denemark wrote:
> The message talks about "resumed" state, which is a bit confusing. While
> we have a "resumed" event, the corresponding state is called "running".
> 
> Signed-off-by: Jiri Denemark 
> ---
>  src/qemu/qemu_process.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

ACK


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 0/3] add disk driver metadata_cache_size option

2018-11-07 Thread Peter Krempa
On Fri, Nov 02, 2018 at 11:11:50 +0100, Kevin Wolf wrote:
> Am 01.11.2018 um 12:32 hat Nikolay Shirokovskiy geschrieben:
> > Hi, all.
> > 
> > This is a patch series after offlist agreement on introducing
> > metadata-cache-size option for disks. The options itself is described in 2nd
> > patch of the series.
> > 
> > There is a plenty of attempts to add option to set qcow2 metadata cache 
> > sizes in
> > past, see [1] [2] to name recent that has comments I tried to address or has
> > work I reused to some extent.
> > 
> > I would like to address Peter's comment in [1] that this option is image's
> > option rather then disk's here. While this makes sense if we set specific 
> > cache
> > value that covers disk only partially, here when we have maximum policy that
> > covers whole disk it makes sense to set same value for all images. The 
> > setted
> > value is only upper limit on cache size and thus cache for every image will
> > grow proportionally to image data size "visible from top" eventually I 
> > guess.
> > And these sizes are changing during guest lifetime - thus we need set whole
> > disk limit for every image for 'maxium' effect.
> > 
> > On the other hand if we add policies different from 'maximum' it may require
> > per image option. Well I can't imagine name for element for backing chain 
> > that
> > will have cache size attribute better then 'driver'). And driver is already
> > used for top image so I leave it this way.
> > 
> >   KNOWN ISSUES
> > 
> > 1. when using -driver to configure disks in command line cache size does not
> >get propagated thru backing chain
> > 
> > 2. when making external disk snapshot cache size sneak into backing file
> >filename and thus cache size for backing image became remembered there
> > 
> > 3. blockcommit after such snapshot will not work because libvirt is not 
> > ready
> >for backing file name is reported as json sometimes
> > 
> > Looks like 1 can be addressed in qemu.
> 
> I don't think we want to add inheritance of driver-specific options.
> Option inheritance is already a bit messy with options that every driver
> understands.
> 
> And -drive is the obsolete interface anyway, when you use -blockdev you
> specify all nodes explicitly.

So this would mean we get different behaviour depending on whether
-blockdev is supported by libvirt and qemu or not.

This means that there are the following possibilities:

1) allow this feature only with -blockdev

2) limit this only to the top layer image

3) make it configurable per backing chain entry.

Given our discussion on the KVM forum, I don't think it makes sense to
do 3, 2 would make it incomplete, so 1 is the only reasonable option if
qemu will not do the inheritance.

> > The reason for behaviour in 2 I do not understand honestly.
> 
> QEMU doesn't recognise that the cache size option doesn't affect which
> data you're accessing. Max had a series that should probably fix it.
> Maybe we should finally get it merged.
> 
> Anyway, I suppose if you use libvirt in -blockdev mode, it doesn't make
> a difference anyway, because libvirt should then manually call
> blockdev-create for the overlay and therefore specify the backing file
> string explicitly.

Yes, in that case we'll create it manually with the correct backing
store path. Currently we'd ignore such an entry in the backing store
path when detecting the chain from the disk so this should not affect
anything.

> Can you confirm this in practice?
> 
> > And 3 will be naturally addessed after blockjobs starts working with
> > blockdev configuration I guess.

Did you test this? We do support backing files with 'json:' pseudo
protocol.

> 
> Hopefully (Peter?), but depending on 2. it might not even be necessary
> if libvirt doesn't explicitly store json: pseudo-filenames.

Well, we will store them eventually in json pseudo-protocol format since
in some cases (e.g. multi-host gluster) we don't have any other option.


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 2/3] xml: add disk driver metadata_cache_size option

2018-11-07 Thread Peter Krempa
On Thu, Nov 01, 2018 at 14:32:23 +0300, Nikolay Shirokovskiy wrote:
> The only possible value is 'maximum' which makes l2_cache_size large
> enough to keep all metadata in memory. This will unleash disks
> performace for some database workloads and IO benchmarks with random
> access to whole disk.
> 
> Note that imlementation sets l2-cache-size and not cache-size.
> Both *cache-size's is upper limit on cache size value thus instead of
> setting precise limit for disk which involves knowing disk size
> and disk's cluster size we can just set INT64_MAX. Unfortunately
> both old and new versions of qemu fail on setting cache-size to INT64_MAX.
> Fortunately both old and new versions works well on such setting for
> l2-cache-size. As guest performance depends only l2 cache size
> and not refcount cache size (which is documented in recent qemu)
> we can set l2 directly.
> 
> Signed-off-by: Nikolay Shirokovskiy 
> ---
>  docs/formatdomain.html.in  |  7 
>  docs/schemas/domaincommon.rng  | 10 +
>  src/conf/domain_conf.c | 17 
>  src/conf/domain_conf.h |  9 
>  src/qemu/qemu_command.c| 26 
>  src/qemu/qemu_domain.c |  1 +
>  .../qemuxml2argvdata/disk-metadata_cache_size.args | 34 +++
>  .../qemuxml2argvdata/disk-metadata_cache_size.xml  | 42 +++
>  tests/qemuxml2argvtest.c   |  2 +
>  .../disk-metadata_cache_size.xml   | 48 
> ++
>  tests/qemuxml2xmltest.c|  2 +
>  11 files changed, 198 insertions(+)
>  create mode 100644 tests/qemuxml2argvdata/disk-metadata_cache_size.args
>  create mode 100644 tests/qemuxml2argvdata/disk-metadata_cache_size.xml
>  create mode 100644 tests/qemuxml2xmloutdata/disk-metadata_cache_size.xml
> 
> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index 8189959..93e0009 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -3556,6 +3556,13 @@
>  virt queues for virtio-blk. (Since 
> 3.9.0)
>
>
> +The optional metadata_cache_size attribute specifies
> +metadata cache size policy. The only possible value is "maximum" 
> to
> +keep all metadata in cache, this will help if workload needs 
> access
> +to whole disk all the time.  (Since
> +4.9.0)

I wanted to complain that we prefer camelCase to underscores generally,
but given that the  element has at least 4 attributes using
underscores that point would be moot.

What's missing though is the description of the default value when the
attribute is not present. Also I think that we should allow to pass
"default" as a valid argument.

> +  
> +  
>For virtio disks,
>Virtio-specific options can also be
>set. (Since 3.5.0)
> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
> index 099a949..18efa3a 100644
> --- a/docs/schemas/domaincommon.rng
> +++ b/docs/schemas/domaincommon.rng
> @@ -1990,6 +1990,9 @@
>  
>
>
> +
> +  
> +  
>  
>
>  
> @@ -2090,6 +2093,13 @@
>
>  
>
> +  
> +
> +  
> +maximum

Here default should be allowed as well.

> +  
> +
> +  
>
>  
>

[...]

> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index 1ff593c..b33e6a5 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c
> @@ -1330,6 +1330,21 @@ qemuCheckDiskConfig(virDomainDiskDefPtr disk,
>  return -1;
>  }
>  
> +if (disk->metadata_cache_size &&
> +(disk->src->type != VIR_STORAGE_TYPE_FILE ||

Why don't do this for network-based qcow2s?

> + disk->src->format != VIR_STORAGE_FILE_QCOW2)) {

Note that a QCOW2 can also be a part of the backing chain where the top
image format is not qcow2. In such case it would not work. Without
-blockdev support I don't see a possibility to expose that to qemu
though since I expect the -drive command being rejected if the top image
is not a qcow2.

> +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> +   _("metadata_cache_size can only be set for qcow2 
> disks"));
> +return -1;
> +}
> +
> +if (disk->metadata_cache_size &&

This part is common to both of the above conditions.

> +disk->metadata_cache_size != 
> VIR_DOMAIN_DISK_METADATA_CACHE_SIZE_MAXIMUM) {
> +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> +   _("metadata_cache_size can only be set to 
> 'maximum'"));
> +return -1;
> +}
> +
>  if (qemuCaps) {
>  if (disk->serial &&
>  disk->bus == VIR_DOMAIN_DISK_BUS_SCSI &&
> @@ -1353,6 +1368,14 @@ 

[libvirt] [PATCH 1/2] qemu: Drop unreachable code from qemuProcessHandleStop

2018-11-07 Thread Jiri Denemark
If gotShutdown is true, the domain state cannot be running because of
the following code in qemuProcessHandleShutdown:

priv->gotShutdown = true;

VIR_DEBUG("Transitioned guest %s to shutdown state",
  vm->def->name);
virDomainObjSetState(vm,
 VIR_DOMAIN_SHUTDOWN,
 VIR_DOMAIN_SHUTDOWN_UNKNOWN);

Signed-off-by: Jiri Denemark 
---
 src/qemu/qemu_process.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 4e93b2d741..820d90aef7 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -649,11 +649,6 @@ qemuProcessHandleStop(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
 if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) {
 qemuDomainObjPrivatePtr priv = vm->privateData;
 
-if (priv->gotShutdown) {
-VIR_DEBUG("Ignoring STOP event after SHUTDOWN");
-goto unlock;
-}
-
 if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) {
 if (priv->job.current->status ==
 QEMU_DOMAIN_JOB_STATUS_POSTCOPY) {
@@ -690,7 +685,6 @@ qemuProcessHandleStop(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
 }
 }
 
- unlock:
 virObjectUnlock(vm);
 virObjectEventStateQueue(driver->domainEventState, event);
 virObjectUnref(cfg);
-- 
2.19.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 0/2] qemu: Drop priv->gotShutdown

2018-11-07 Thread Jiri Denemark
Jiri Denemark (2):
  qemu: Drop unreachable code from qemuProcessHandleStop
  qemu: Drop priv->gotShutdown

 src/qemu/qemu_domain.h  |  1 -
 src/qemu/qemu_driver.c  |  3 ++-
 src/qemu/qemu_process.c | 13 +
 3 files changed, 3 insertions(+), 14 deletions(-)

-- 
2.19.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 2/2] qemu: Drop priv->gotShutdown

2018-11-07 Thread Jiri Denemark
The gotShutdown bool has been redundant since we started setting
VIR_DOMAIN_SHUTDOWN state after receiving SHUTDOWN event from QEMU.

Signed-off-by: Jiri Denemark 
---
 src/qemu/qemu_domain.h  | 1 -
 src/qemu/qemu_driver.c  | 3 ++-
 src/qemu/qemu_process.c | 7 +--
 3 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index 80bd4bde91..63d645a31a 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -286,7 +286,6 @@ struct _qemuDomainObjPrivate {
 qemuAgentPtr agent;
 bool agentError;
 
-bool gotShutdown;
 bool beingDestroyed;
 char *pidfile;
 
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index a52e2495d5..cc7fde0695 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -4748,7 +4748,8 @@ processMonitorEOFEvent(virQEMUDriverPtr driver,
 goto endjob;
 }
 
-if (priv->monJSON && !priv->gotShutdown) {
+if (priv->monJSON &&
+virDomainObjGetState(vm, NULL) != VIR_DOMAIN_SHUTDOWN) {
 VIR_DEBUG("Monitor connection to '%s' closed without SHUTDOWN event; "
   "assuming the domain crashed", vm->def->name);
 eventReason = VIR_DOMAIN_EVENT_STOPPED_FAILED;
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 820d90aef7..0f74d72490 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -491,7 +491,6 @@ qemuProcessFakeReboot(void *opaque)
"%s", _("resume operation failed"));
 goto endjob;
 }
-priv->gotShutdown = false;
 
 if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm, driver->caps) < 
0) {
 VIR_WARN("Unable to save status on vm %s after state change",
@@ -578,7 +577,7 @@ qemuProcessHandleShutdown(qemuMonitorPtr mon 
ATTRIBUTE_UNUSED,
 virObjectLock(vm);
 
 priv = vm->privateData;
-if (priv->gotShutdown) {
+if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_SHUTDOWN) {
 VIR_DEBUG("Ignoring repeated SHUTDOWN event from domain %s",
   vm->def->name);
 goto unlock;
@@ -587,7 +586,6 @@ qemuProcessHandleShutdown(qemuMonitorPtr mon 
ATTRIBUTE_UNUSED,
   vm->def->name);
 goto unlock;
 }
-priv->gotShutdown = true;
 
 VIR_DEBUG("Transitioned guest %s to shutdown state",
   vm->def->name);
@@ -5980,7 +5978,6 @@ qemuProcessPrepareDomain(virQEMUDriverPtr driver,
 priv->monJSON = true;
 priv->monError = false;
 priv->monStart = 0;
-priv->gotShutdown = false;
 priv->runningReason = VIR_DOMAIN_RUNNING_UNKNOWN;
 
 VIR_DEBUG("Updating guest CPU definition");
@@ -7394,8 +7391,6 @@ int qemuProcessAttach(virConnectPtr conn ATTRIBUTE_UNUSED,
 monConfig = NULL;
 priv->monJSON = monJSON;
 
-priv->gotShutdown = false;
-
 /* Attaching to running QEMU so we need to detect whether it was started
  * with -no-reboot. */
 qemuProcessPrepareAllowReboot(vm);
-- 
2.19.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH 1/3] qemu: caps: add drive.qcow2.l2-cache-size

2018-11-07 Thread Peter Krempa
On Thu, Nov 01, 2018 at 14:32:22 +0300, Nikolay Shirokovskiy wrote:

Missing description of what the capability bit detects in the commit
message.

> Signed-off-by: Nikolay Shirokovskiy 
> ---
>  src/qemu/qemu_capabilities.c   | 3 +++
>  src/qemu/qemu_capabilities.h   | 1 +

[...]

>  28 files changed, 30 insertions(+)
> 
> diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
> index e228f52..7d42254 100644
> --- a/src/qemu/qemu_capabilities.c
> +++ b/src/qemu/qemu_capabilities.c
> @@ -508,6 +508,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST,
>/* 315 */
>"vfio-pci.display",
>"blockdev",
> +  "drive.qcow2.l2-cache-size",
>  );
>  
>  
> @@ -1234,6 +1235,8 @@ static struct virQEMUCapsStringFlags 
> virQEMUCapsQMPSchemaQueries[] = {
>  { "blockdev-add/arg-type/+vxhs", QEMU_CAPS_VXHS},
>  { "blockdev-add/arg-type/+iscsi/password-secret", 
> QEMU_CAPS_ISCSI_PASSWORD_SECRET },
>  { "blockdev-add/arg-type/+qcow2/encrypt/+luks/key-secret", 
> QEMU_CAPS_QCOW2_LUKS },
> +{ "blockdev-add/arg-type/options/+qcow2/l2-cache-size", 
> QEMU_CAPS_DRIVE_QCOW2_L2_CACHE_SIZE},

This can be dropped ...

> +{ "blockdev-add/arg-type/+qcow2/l2-cache-size", 
> QEMU_CAPS_DRIVE_QCOW2_L2_CACHE_SIZE},

... and this needs to be modified since the cache behaves differently
as pointed out in 3/3.

>  { "nbd-server-start/arg-type/tls-creds", QEMU_CAPS_NBD_TLS },
>  { "screendump/arg-type/device", QEMU_CAPS_SCREENDUMP_DEVICE },
>  { "block-commit/arg-type/*top",  QEMU_CAPS_ACTIVE_COMMIT },


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH] qemu: Fix debug message in qemuProcessHandleResume

2018-11-07 Thread Jiri Denemark
The message talks about "resumed" state, which is a bit confusing. While
we have a "resumed" event, the corresponding state is called "running".

Signed-off-by: Jiri Denemark 
---
 src/qemu/qemu_process.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 59ca7cd333..4e93b2d741 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -721,7 +721,7 @@ qemuProcessHandleResume(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
 
 if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_RUNNING) {
 eventDetail = qemuDomainRunningReasonToResumeEvent(reason);
-VIR_DEBUG("Transitioned guest %s out of paused into resumed state, "
+VIR_DEBUG("Transitioned guest %s out of paused into running state, "
   "reason '%s', event detail %d",
   vm->def->name, virDomainRunningReasonTypeToString(reason),
   eventDetail);
-- 
2.19.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] libvirt: add daemon itself as shutdown reason

2018-11-07 Thread John Ferlan



On 10/8/18 7:21 AM, Nikolay Shirokovskiy wrote:
> Let's introduce shutdown reason "daemon" which means we have to
> kill running domain ourselves as the best action we can take at
> that moment. Failure to pick up domain on daemon restart is
> one example of such case. Using reason "crashed" is a bit misleading
> as it is used when qemu is actually crashed or qemu destroyed on
> guest panic as result of user choice that is the problem was
> in qemu/guest itself. So I propose to use "crashed" only for
> qemu side issues and introduce "daemon" when we have to kill the qemu
> for good.
> 
> There is another example where "daemon" will be useful. If we can
> not reboot domain we kill it and got "crashed" reason now. Looks
> like good candidate for "daemon" reason.
> 
> Signed-off-by: Nikolay Shirokovskiy 
> ---
>  include/libvirt/libvirt-domain.h |  1 +
>  src/conf/domain_conf.c   |  3 ++-
>  src/qemu/qemu_process.c  | 11 ---
>  tools/virsh-domain-monitor.c |  3 ++-
>  4 files changed, 9 insertions(+), 9 deletions(-)
> 

[...]

Now that I've pushed commits 296e05b54 and 8f0f8425d, let's revisit
this...  Merging in that set of changes with this patch means that
instead of:

> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index e9c7618..c4bc9ca 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -7988,15 +7988,12 @@ qemuProcessReconnect(void *opaque)
>  /* We can't get the monitor back, so must kill the VM
>   * to remove danger of it ending up running twice if
>   * user tries to start it again later
> - * If we couldn't get the monitor since QEMU supports
> - * no-shutdown, we can safely say that the domain
> - * crashed ... */
> -state = VIR_DOMAIN_SHUTOFF_CRASHED;
> -/* If BeginJob failed, we jumped here without a job, let's hope 
> another
> + * If BeginJob failed, we jumped here without a job, let's hope 
> another
>   * thread didn't have a chance to start playing with the domain yet
>   * (it's all we can do anyway).
>   */
> -qemuProcessStop(driver, obj, state, QEMU_ASYNC_JOB_NONE, stopFlags);
> +qemuProcessStop(driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
> +QEMU_ASYNC_JOB_NONE, stopFlags);
>  }
>  goto cleanup;
>  }
> @@ -8035,7 +8032,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
>   * is no thread that could be doing anything else with the same 
> domain
>   * object.
>   */
> -qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
> +qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
>  QEMU_ASYNC_JOB_NONE, 0);
>  qemuDomainRemoveInactiveJobLocked(src->driver, obj);
>  

We'd have:

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 3bf84834ec..023e187729 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7995,10 +7995,14 @@ qemuProcessReconnect(void *opaque)
  *
  * If we cannot get to the monitor when the QEMU command
  * line used -no-shutdown, then we can safely say that the
- * domain crashed; otherwise, we don't really know. */
+ * domain crashed; otherwise, if the monitor was started,
+ * then we can blame ourselves, else we failed before the
+ * monitor started so we don't really know. */
 if (!priv->mon && tryMonReconn &&
 qemuDomainIsUsingNoShutdown(priv))
 state = VIR_DOMAIN_SHUTOFF_CRASHED;
+else if (priv->mon)
+state = VIR_DOMAIN_SHUTOFF_DAEMON;
 else
 state = VIR_DOMAIN_SHUTOFF_UNKNOWN;

@@ -8045,7 +8049,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
  * is no thread that could be doing anything else with the same
domain
  * object.
  */
-qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
 QEMU_ASYNC_JOB_NONE, 0);
 qemuDomainRemoveInactiveJobLocked(src->driver, obj);

[...]

So does this essentially meet/handle the condition you were trying to
alter?  That is - once we get beyond the monitor reconnection, then if
we end up in error that means the daemon has decided to stop things.
Leaving the UNKNOWN to be the period prior to attempting to reconnect
and CRASHED to the period during which we try to connect to the monitor.

This also captures failures after connection is successful (when
priv->mon in qemuConnectMonitor).

Getting more specific failures of why the connection failed is perhaps a
future task if it really matters.

If this looks good to you let me know, I can merge and push with the
previously noted changes and a commit message of:

"This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as 

[libvirt] [PATCH] qemu: Process RDMA GID state change event

2018-11-07 Thread Yuval Shaia
This event is emitted on the monitor when a GID table in pvrdma device
is modified and the change needs to be propagate to the backend RDMA
device's GID table.

The control over the RDMA device's GID table is done by updating the
device's Ethernet function addresses.
Usually the first GID entry is determine by the MAC address, the second
by the first IPv6 address and the third by the IPv4 address. Other
entries can be added by adding more IP addresses. The opposite is the
same, i.e. whenever an address is removed, the corresponding GID entry
is removed.

The process is done by the network and RDMA stacks. Whenever an address
is added the ib_core driver is notified and calls the device driver's
add_gid function which in turn update the device.

To support this in pvrdma device we need to hook into the create_bind
and destroy_bind HW commands triggered by pvrdma driver in guest.
Whenever a changed is made to the pvrdma device's GID table a special
QMP messages is sent to be processed by libvirt to update the address of
the backend Ethernet device.

Signed-off-by: Yuval Shaia 
---
 src/qemu/qemu_domain.c   |  3 ++
 src/qemu/qemu_domain.h   | 15 +
 src/qemu/qemu_driver.c   | 40 
 src/qemu/qemu_monitor.c  | 28 +
 src/qemu/qemu_monitor.h  | 13 
 src/qemu/qemu_monitor_json.c | 36 ++
 src/qemu/qemu_process.c  | 59 
 7 files changed, 194 insertions(+)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index ba3fff607a..8da54c7ee9 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -13479,6 +13479,9 @@ qemuProcessEventFree(struct qemuProcessEvent *event)
 case QEMU_PROCESS_EVENT_GUESTPANIC:
 qemuMonitorEventPanicInfoFree(event->data);
 break;
+case QEMU_PROCESS_EVENT_RDMA_GID_STATUS_CHANGED:
+qemuMonitorEventRdmaGidStatusFree(event->data);
+break;
 case QEMU_PROCESS_EVENT_WATCHDOG:
 case QEMU_PROCESS_EVENT_DEVICE_DELETED:
 case QEMU_PROCESS_EVENT_NIC_RX_FILTER_CHANGED:
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index 80bd4bde91..1b188843e3 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -478,6 +478,18 @@ struct _qemuDomainVsockPrivate {
 };
 
 
+typedef struct _qemuDomainRdmaGidStatusChangedPrivate 
qemuDomainRdmaGidStatusChangedPrivate;
+typedef qemuDomainRdmaGidStatusChangedPrivate 
*qemuDomainRdmaGidStatusChangedPrivatePtr;
+struct _qemuDomainRdmaGidStatusChangedPrivate {
+virObject parent;
+
+char *netdev;
+bool gid_status;
+uint64_t subnet_prefix;
+uint64_t interface_id;
+};
+
+
 typedef enum {
 QEMU_PROCESS_EVENT_WATCHDOG = 0,
 QEMU_PROCESS_EVENT_GUESTPANIC,
@@ -487,6 +499,7 @@ typedef enum {
 QEMU_PROCESS_EVENT_BLOCK_JOB,
 QEMU_PROCESS_EVENT_MONITOR_EOF,
 QEMU_PROCESS_EVENT_PR_DISCONNECT,
+QEMU_PROCESS_EVENT_RDMA_GID_STATUS_CHANGED,
 
 QEMU_PROCESS_EVENT_LAST
 } qemuProcessEventType;
@@ -499,6 +512,8 @@ struct qemuProcessEvent {
 void *data;
 };
 
+void 
qemuMonitorEventRdmaGidStatusFree(qemuDomainRdmaGidStatusChangedPrivatePtr 
info);
+
 void qemuProcessEventFree(struct qemuProcessEvent *event);
 
 typedef struct _qemuDomainLogContext qemuDomainLogContext;
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index a52e2495d5..dc088d844f 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -4788,6 +4788,43 @@ processPRDisconnectEvent(virDomainObjPtr vm)
 }
 
 
+static void
+processRdmaGidStatusChangedEvent(virDomainObjPtr vm,
+ qemuDomainRdmaGidStatusChangedPrivatePtr info)
+{
+unsigned int prefix_len;
+virSocketAddr addr = {0};
+int rc;
+
+if (!virDomainObjIsActive(vm))
+return;
+
+VIR_DEBUG("netdev=%s,gid_status=%d,subnet_prefix=0x%lx,interface_id=0x%lx",
+  info->netdev, info->gid_status, info->subnet_prefix,
+  info->interface_id);
+
+if (info->subnet_prefix) {
+prefix_len = 64;
+uint32_t ipv6[4];
+memcpy([0], >subnet_prefix, sizeof(info->subnet_prefix));
+memcpy([2], >interface_id, sizeof(info->subnet_prefix));
+virSocketAddrSetIPv6AddrNetOrder(, ipv6);
+} else {
+prefix_len = 24;
+virSocketAddrSetIPv4AddrNetOrder(, info->interface_id >> 32);
+}
+
+if (info->gid_status)
+rc = virNetDevIPAddrAdd(info->netdev, , NULL, prefix_len);
+else
+rc = virNetDevIPAddrDel(info->netdev, , prefix_len);
+
+if (rc)
+VIR_ERROR(_("Fail to update address 0x%lx to %s"), info->interface_id,
+  info->netdev);
+}
+
+
 static void qemuProcessEventHandler(void *data, void *opaque)
 {
 struct qemuProcessEvent *processEvent = data;
@@ -4828,6 +4865,9 @@ static void qemuProcessEventHandler(void *data, void 
*opaque)
 case QEMU_PROCESS_EVENT_PR_DISCONNECT:
 processPRDisconnectEvent(vm);
 

[libvirt] [PATCH] qemu: Don't ignore resume events

2018-11-07 Thread Jiri Denemark
Since commit v4.7.0-302-ge6d77a75c4 processing RESUME event is mandatory
for updating domain state. But the event handler explicitly ignored this
event in some cases. Thus the state would be wrong after a fake reboot
or when a domain was rebooted after it crashed.

BTW, the code to ignore RESUME event after SHUTDOWN didn't make sense
even before making RESUME event mandatory. Most likely it was there as a
result of careless copy from qemuProcessHandleStop.

https://bugzilla.redhat.com/show_bug.cgi?id=1612943

Signed-off-by: Jiri Denemark 
---
 src/qemu/qemu_process.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index c698c3b29c..59ca7cd333 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -719,12 +719,7 @@ qemuProcessHandleResume(qemuMonitorPtr mon 
ATTRIBUTE_UNUSED,
 priv->runningReason = VIR_DOMAIN_RUNNING_UNKNOWN;
 }
 
-if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_PAUSED) {
-if (priv->gotShutdown) {
-VIR_DEBUG("Ignoring RESUME event after SHUTDOWN");
-goto unlock;
-}
-
+if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_RUNNING) {
 eventDetail = qemuDomainRunningReasonToResumeEvent(reason);
 VIR_DEBUG("Transitioned guest %s out of paused into resumed state, "
   "reason '%s', event detail %d",
@@ -742,7 +737,6 @@ qemuProcessHandleResume(qemuMonitorPtr mon ATTRIBUTE_UNUSED,
 }
 }
 
- unlock:
 virObjectUnlock(vm);
 virObjectEventStateQueue(driver->domainEventState, event);
 virObjectUnref(cfg);
-- 
2.19.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] Parsing of memory.stat in libvirt

2018-11-07 Thread Peter.Chubb


Hi Folks,
   libvirt currently spams the logs with

error : virCgroupGetMemoryStat:2490 : internal error: Cannot parse 
'memory.stat' cgroup file

   whenever someone does ps in a container.

This is because the parser for memory/stat is incorrect: the `line'
variable is never updated, so each time through the loop, the same
start-of-line is compared for the token;  and as all spaces are
eventually replaced with NUL the error exit is taken instead of
ending the loop properly. 

Here is a strawman patch to fix the problem. Please note, I'm not
subscribed to this list;  I reported the bug as
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=913023 and was asked
to send the patch upstream.

 src/util/vircgroup.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- libvirt.orig/src/util/vircgroup.c
+++ libvirt/src/util/vircgroup.c
@@ -2477,7 +2477,7 @@ virCgroupGetMemoryStat(virCgroupPtr grou
 
 line = stat;
 
-while (line) {
+while (*line) {
 char *newLine = strchr(line, '\n');
 char *valueStr = strchr(line, ' ');
 unsigned long long value;
@@ -2507,6 +2507,11 @@ virCgroupGetMemoryStat(virCgroupPtr grou
 inactiveFileVal = value >> 10;
 else if (STREQ(line, "unevictable"))
 unevictableVal = value >> 10;
+
+   if (newLine)
+   line = newLine + 1;
+   else
+   break;
 }
 
 *cache = cacheVal;


-- 
Dr Peter Chubb Tel: +61 2 9490 5852  http://ts.data61.csiro.au/
Trustworthy Systems Group   Data61 (formerly NICTA)

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v2 2/8] qemu: Allow coldunplugging of hub device

2018-11-07 Thread John Ferlan


On 11/7/18 8:17 AM, Han Han wrote:
> 
> 
> On Tue, Nov 6, 2018 at 6:29 AM John Ferlan  > wrote:
> 
> 
> 
> On 10/12/18 4:50 AM, Han Han wrote:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1375423
> >
> > Signed-off-by: Han Han mailto:h...@redhat.com>>
> > ---
> >  src/conf/domain_conf.c   | 30 ++
> >  src/conf/domain_conf.h   |  3 +++
> >  src/libvirt_private.syms |  1 +
> >  src/qemu/qemu_driver.c   | 10 +-
> >  4 files changed, 43 insertions(+), 1 deletion(-)
> >
> 
> I assume no concerns over something that could be attached to a port on
> the hub... I guess for a cold unplug it probably won't matter so much.
> 
> For the usb hub coldunplug, I find a concern. If a usb device attatched
> to a usb-hub, then we coldunplug the hub, how will the hub and usb device
> look like? All disappear OR all remain?

Since you've posted the patches, then I would hope you could answer your
own question regarding what happens.

Still in a way this is different than hot/live unplug. Attempting to
start the guest after you've removed a hub that had defined/listed USB
address/attachments would I assume fail because the hub is gone. But
there is a little self doubt because of the automatic hub add algorithm
pointed out in patch 7.

If it does expectedly fail, then someone would have to cold plug a new
device in using the correct address before booting - which is an
expected operation.

In a way no different that someone changing a guest config to use 100
vCPUs or 100G on a host that couldn't support those amounts. When cold
changing things, as long as the value is valid - we're good. When
starting if we don't have the resources, then we're not so good

John

> 
> As I know, for scsi and scsi controllers, it is all remaining :)
> 
> 
> Reviewed-by: John Ferlan  >
> 
> John
> 
> 
> 
> -- 
> Best regards,
> ---
> Han Han
> Quality Engineer
> Redhat.
> 
> Email: h...@redhat.com 
> Phone: +861065339333

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH v2 6/8] qemu: implement usb hub device hotplug

2018-11-07 Thread Han Han
On Tue, Nov 6, 2018 at 6:34 AM John Ferlan  wrote:

> $SUBJ:
>
> s/implement/Implement
>
> On 10/12/18 4:50 AM, Han Han wrote:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1375423
>
> Add the infrastructure to allow a USB Hub device to be hotplugged.
>
> >
> > Signed-off-by: Han Han 
> > ---
> >  src/qemu/qemu_driver.c  |  9 ++-
> >  src/qemu/qemu_hotplug.c | 58 +
> >  src/qemu/qemu_hotplug.h |  8 ++
> >  3 files changed, 74 insertions(+), 1 deletion(-)
> >
>
> [...]
>
> > diff --git a/src/qemu/qemu_hotplug.h b/src/qemu/qemu_hotplug.h
> > index 0297e42a98..444333c4df 100644
> > --- a/src/qemu/qemu_hotplug.h
> > +++ b/src/qemu/qemu_hotplug.h
> > @@ -135,10 +135,18 @@ int qemuDomainAttachInputDevice(virQEMUDriverPtr
> driver,
> >  virDomainObjPtr vm,
> >  virDomainInputDefPtr input);
> >
> > +int qemuDomainAttachHubDevice(virQEMUDriverPtr driver,
> > +  virDomainObjPtr vm,
> > +  virDomainHubDefPtr hub);
> > +
> >  int qemuDomainAttachVsockDevice(virQEMUDriverPtr driver,
> >  virDomainObjPtr vm,
> >  virDomainVsockDefPtr vsock);
> >
> > +int qemuDomainAttachInputDevice(virQEMUDriverPtr driver,
> > +virDomainObjPtr vm,
> > +virDomainInputDefPtr input);
> > +
>
> cut-n-paste error perhaps? I'll remove before pushing though...
>
Sorry for the silly mistake... I will remove it in new patch series.

>
> Reviewed-by: John Ferlan 
>
> John
>
>
> >  int qemuDomainAttachLease(virQEMUDriverPtr driver,
> >virDomainObjPtr vm,
> >virDomainLeaseDefPtr lease);
> >
>


-- 
Best regards,
---
Han Han
Quality Engineer
Redhat.

Email: h...@redhat.com
Phone: +861065339333
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] qemu: Don't use -mem-prealloc among with .prealloc=yes

2018-11-07 Thread Martin Kletzander

On Wed, Nov 07, 2018 at 10:47:01AM +0100, Michal Privoznik wrote:

On 11/07/2018 12:43 AM, John Ferlan wrote:



On 11/5/18 9:49 AM, Michal Privoznik wrote:

https://bugzilla.redhat.com/show_bug.cgi?id=1624223

There are two ways to request memory preallocation on cmd line:
-mem-prealloc and .prealloc attribute to memory-backend-file.


s/to/for a/ ?


However, as it turns out it's not safe to use both at the same
time. Prefer -mem-prealloc as it is more backward compatible
compared to switching to "-numa node,memdev=  + -object
memory-backend-file".



FWIW: Issue introduced by commit 1c4f3b56..

While I understand the reasoning, it's really too bad we couldn't "move"
the determination over which conflicting qualifier is used to earlier.
By the time we call the -numa backend we would already have had to make
the choice if I'm reading the ordering right.


Correct, you're reading it right.



But if it doesn't matter for the -numa object to use the -mem-prealloc,
then who am I to complain.  Of course the "future thinking" me that is
living in the present issues surrounding machine and pc makes me wonder
if choosing this as the default going forward into the future where
someone could deprecate the -mem-prealloc because -numa will be so
prevelant won't bite us down the road.


If -mem-prealloc is deprecated then we would have to construct -object
memory-backend-file. I'm not against this, but IIRC this fails during
migration. I mean, if you have a guest that uses -mem-path you can't
migrate it to -object memory-backing-file because qemu would fail to
load the migration stream. That is why we have @needBackend in
qemuBuildNumaArgStr(), so that new cmd line is built iff really needed.

This is the reason I went this way even though BZ suggests otherwise.



Curious how others feel - I'm not against this choice, just trying to
supply an opposing/differing viewpoint. We really have to start coding
for the future and consider what deprecation could mean especially for
arguments that essentially mean the same thing.


Signed-off-by: Michal Privoznik 
---
 src/qemu/qemu_command.c   | 37 +--
 src/qemu/qemu_command.h   |  1 +
 src/qemu/qemu_domain.c|  2 +
 src/qemu/qemu_domain.h|  3 ++
 src/qemu/qemu_hotplug.c   |  3 +-
 .../hugepages-numa-default-dimm.args  |  2 +-
 6 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index e338d3172e..0294030f0e 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -3123,6 +3123,7 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
  * @def: domain definition object
  * @mem: memory definition object
  * @autoNodeset: fallback nodeset in case of automatic NUMA placement
+ * @forbidPrealloc: don't set prealloc attribute


Slight bikeshed, but this changes the priv->memAlloc to @forbidPrealloc
which is IMO a bit odd.


Okay, what name do you suggest? My reasoning for the name was that it
should make sense from the function POV. That's why calling the variable
'memAlloc' did not make sense to me.



Beyond that, this becomes the 3rd @priv field to be passed along...
Maybe @priv should just be passed to access qemuCaps, autoNodeset, and
memPrealloc.


Ah sure.




  * @force: forcibly use one of the backends
  *
  * Creates a configuration object that represents memory backend of given guest
@@ -3136,6 +3137,9 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
  * Then, if one of the two memory-backend-* should be used, the @qemuCaps is
  * consulted to check if qemu does support it.
  *
+ * If @forbidPrealloc is true then 'prealloc' attribute of the backend is not
+ * set. This may come handy when global -mem-prealloc is already specified.
+ *
  * Returns: 0 on success,
  *  1 on success and if there's no need to use memory-backend-*
  * -1 on error.
@@ -3148,6 +3152,7 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
 virDomainDefPtr def,
 virDomainMemoryDefPtr mem,
 virBitmapPtr autoNodeset,
+bool forbidPrealloc,
 bool force)
 {
 const char *backendType = "memory-backend-file";
@@ -3265,11 +3270,13 @@ qemuBuildMemoryBackendProps(virJSONValuePtr 
*backendProps,
 if (mem->nvdimmPath) {
 if (VIR_STRDUP(memPath, mem->nvdimmPath) < 0)
 goto cleanup;
-prealloc = true;
+if (!forbidPrealloc)
+prealloc = true;
 } else if (useHugepage) {
 if (qemuGetDomainHupageMemPath(def, cfg, pagesize, ) < 0)
 goto cleanup;
-prealloc = true;
+if (!forbidPrealloc)
+prealloc = true;
 } else {
 /* We can have both pagesize and mem source. If that's the 

Re: [libvirt] [PATCH] qemu: Don't use -mem-prealloc among with .prealloc=yes

2018-11-07 Thread John Ferlan



On 11/7/18 4:47 AM, Michal Privoznik wrote:
> On 11/07/2018 12:43 AM, John Ferlan wrote:
>>
>>
>> On 11/5/18 9:49 AM, Michal Privoznik wrote:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1624223
>>>
>>> There are two ways to request memory preallocation on cmd line:
>>> -mem-prealloc and .prealloc attribute to memory-backend-file.
>>
>> s/to/for a/ ?
>>
>>> However, as it turns out it's not safe to use both at the same
>>> time. Prefer -mem-prealloc as it is more backward compatible
>>> compared to switching to "-numa node,memdev=  + -object
>>> memory-backend-file".
>>>
>>
>> FWIW: Issue introduced by commit 1c4f3b56..
>>
>> While I understand the reasoning, it's really too bad we couldn't "move"
>> the determination over which conflicting qualifier is used to earlier.
>> By the time we call the -numa backend we would already have had to make
>> the choice if I'm reading the ordering right.
> 
> Correct, you're reading it right.
> 
>>
>> But if it doesn't matter for the -numa object to use the -mem-prealloc,
>> then who am I to complain.  Of course the "future thinking" me that is
>> living in the present issues surrounding machine and pc makes me wonder
>> if choosing this as the default going forward into the future where
>> someone could deprecate the -mem-prealloc because -numa will be so
>> prevelant won't bite us down the road.
> 
> If -mem-prealloc is deprecated then we would have to construct -object
> memory-backend-file. I'm not against this, but IIRC this fails during
> migration. I mean, if you have a guest that uses -mem-path you can't
> migrate it to -object memory-backing-file because qemu would fail to
> load the migration stream. That is why we have @needBackend in
> qemuBuildNumaArgStr(), so that new cmd line is built iff really needed.
> 
> This is the reason I went this way even though BZ suggests otherwise.
> 

So having the need for -mem-path would seem to need to be a migration
deal breaker regardless, true? It's "confusing" to tie -mem-path,
-mem-prealloc, and .prealloc=yes for the less informed reader. There's
some "relationships" here that without explicitly detailing them could
at some point in time get lost/misunderstood and then cause problems.

>>
>> Curious how others feel - I'm not against this choice, just trying to
>> supply an opposing/differing viewpoint. We really have to start coding
>> for the future and consider what deprecation could mean especially for
>> arguments that essentially mean the same thing.
>>
>>> Signed-off-by: Michal Privoznik 
>>> ---
>>>  src/qemu/qemu_command.c   | 37 +--
>>>  src/qemu/qemu_command.h   |  1 +
>>>  src/qemu/qemu_domain.c|  2 +
>>>  src/qemu/qemu_domain.h|  3 ++
>>>  src/qemu/qemu_hotplug.c   |  3 +-
>>>  .../hugepages-numa-default-dimm.args  |  2 +-
>>>  6 files changed, 35 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>>> index e338d3172e..0294030f0e 100644
>>> --- a/src/qemu/qemu_command.c
>>> +++ b/src/qemu/qemu_command.c
>>> @@ -3123,6 +3123,7 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
>>>   * @def: domain definition object
>>>   * @mem: memory definition object
>>>   * @autoNodeset: fallback nodeset in case of automatic NUMA placement
>>> + * @forbidPrealloc: don't set prealloc attribute
>>
>> Slight bikeshed, but this changes the priv->memAlloc to @forbidPrealloc
>> which is IMO a bit odd.
> 
> Okay, what name do you suggest? My reasoning for the name was that it
> should make sense from the function POV. That's why calling the variable
> 'memAlloc' did not make sense to me.
> 

No real suggestion other than @memPrealloc for consistency (which you
figured out from my miss-typed priv->memAlloc).

>>
>> Beyond that, this becomes the 3rd @priv field to be passed along...
>> Maybe @priv should just be passed to access qemuCaps, autoNodeset, and
>> memPrealloc.
> 
> Ah sure.
> 
>>

[...]

>>>  qemuBuildMemCommandLine(virCommandPtr cmd,
>>>  virQEMUDriverConfigPtr cfg,
>>>  const virDomainDef *def,
>>> -virQEMUCapsPtr qemuCaps)
>>> +virQEMUCapsPtr qemuCaps,
>>> +qemuDomainObjPrivatePtr priv)
>>>  {
>>>  if (qemuDomainDefValidateMemoryHotplug(def, qemuCaps, NULL) < 0)
>>>  return -1;
>>> @@ -7498,15 +7511,17 @@ qemuBuildMemCommandLine(virCommandPtr cmd,
>>>virDomainDefGetMemoryInitial(def) / 1024);
>>>  }
>>>  
>>> -if (def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE)
>>> +if (def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE) {
>>>  virCommandAddArgList(cmd, "-mem-prealloc", NULL);
>>> +priv->memPrealloc = true;
>>> +}
>>
>> I find it "confusing" that setting memPrealloc = true when
>> "def->mem.allocation == 

Re: [libvirt] [PATCH v2 2/8] qemu: Allow coldunplugging of hub device

2018-11-07 Thread Han Han
On Tue, Nov 6, 2018 at 6:29 AM John Ferlan  wrote:

>
>
> On 10/12/18 4:50 AM, Han Han wrote:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1375423
> >
> > Signed-off-by: Han Han 
> > ---
> >  src/conf/domain_conf.c   | 30 ++
> >  src/conf/domain_conf.h   |  3 +++
> >  src/libvirt_private.syms |  1 +
> >  src/qemu/qemu_driver.c   | 10 +-
> >  4 files changed, 43 insertions(+), 1 deletion(-)
> >
>
> I assume no concerns over something that could be attached to a port on
> the hub... I guess for a cold unplug it probably won't matter so much.
>
For the usb hub coldunplug, I find a concern. If a usb device attatched
to a usb-hub, then we coldunplug the hub, how will the hub and usb device
look like? All disappear OR all remain?

As I know, for scsi and scsi controllers, it is all remaining :)

>
> Reviewed-by: John Ferlan 
>
> John
>


-- 
Best regards,
---
Han Han
Quality Engineer
Redhat.

Email: h...@redhat.com
Phone: +861065339333
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH v2 0/3] qemu: Restore lost shutdown reason

2018-11-07 Thread John Ferlan



On 11/1/18 12:04 PM, John Ferlan wrote:
> v1:
> https://www.redhat.com/archives/libvir-list/2018-October/msg00945.html
> 
> Changes since v1:
> 
> Patch 1 (NEW)
> 
>   - Set the priv->allowReboot prior to any error processing since we
> could/should be using that.
> 
> Patch 2 (ADJUSTED LOGIC)
> 
>   - Rather than open code the "reason" to use the -no-shutdown flag,
> let's create a qemu_domain helper for that so that both the command
> line building and the Reconnection failure logic can use it.
> 
> Patch 3 (NEW)
> 
>   - Let's narrow the window for using VIR_DOMAIN_SHUTOFF_CRASHED to
> just the period where we try to open a monitor channel. If we
> cannot do so, then "assume" the reason was crashed. There are
> a few open failure steps that may not exactly fit the model, but
> those are probably splitting hairs.
> 
> John Ferlan (3):
>   qemu: Move allow reboot check setting
>   qemu: Restore lost shutdown reason
>   qemu: Narrow the shutdown reconnection failure reason window
> 
>  src/qemu/qemu_command.c |  6 +-
>  src/qemu/qemu_domain.c  | 17 +
>  src/qemu/qemu_domain.h  |  3 +++
>  src/qemu/qemu_process.c | 27 ++-
>  4 files changed, 39 insertions(+), 14 deletions(-)
> 

After sleeping on it, although we cannot come to an agreement on patch1,
I guess in the long run it doesn't matter since patch3 narrows the
window that made patch1 unnecessary. In order for CRASHED to be elicited
the monitor connection must've failed; otherwise, UNKNOWN would be used.

Moving forward also provides the change I found necessary for Nikolay's
domain reason addition patch to proceed forward, e.g.:

https://www.redhat.com/archives/libvir-list/2018-October/msg00493.html

So, I'll drop patch 1 and push patch2/3, then go back to Nikolay's code.

Tks -

John

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v3 04/20] backup: Document nuances between different state capture APIs

2018-11-07 Thread Vladimir Sementsov-Ogievskiy
Hi!

Finally I'm here, sorry for being so late, and sorry again if I touch something 
already discussed in v1/v2 which I've missed :(

25.10.2018 22:20, Eric Blake wrote:

Upcoming patches will add support for incremental backups via
a new API; but first, we need a landing page that gives an
overview of capturing various pieces of guest state, and which
APIs are best suited to which tasks.

Signed-off-by: Eric Blake 

---
v2: wording improvements based on review
---
 docs/docs.html.in   |   5 +
 docs/domainstatecapture.html.in | 314 
 docs/formatsnapshot.html.in |   2 +
 3 files changed, 321 insertions(+)
 create mode 100644 docs/domainstatecapture.html.in

diff --git a/docs/docs.html.in b/docs/docs.html.in
index 40e0e3b82e..4c46b74980 100644
--- a/docs/docs.html.in
+++ b/docs/docs.html.in
@@ -120,6 +120,11 @@

 Secure usage
 Secure usage of the libvirt APIs
+
+Domain state
+capture
+Comparison between different methods of capturing domain
+  state
   
 

diff --git a/docs/domainstatecapture.html.in b/docs/domainstatecapture.html.in
new file mode 100644
index 00..f7f2fe0b98
--- /dev/null
+++ b/docs/domainstatecapture.html.in
@@ -0,0 +1,314 @@
+
+
+http://www.w3.org/1999/xhtml;>
+  
+
+Domain state capture using Libvirt
+
+
+
+
+  In order to aid application developers to choose which
+  operations best suit their needs, this page compares the
+  different means for capturing state related to a domain managed
+  by libvirt.
+
+
+
+  The information here is primarily geared towards capturing the
+  state of an active domain. Capturing the state of an inactive
+  domain essentially amounts to copying the contents of guest
+  disks, followed by a fresh boot with disks restored to that
+  state. Some of the topics presented below may relate to inactive
+  state collection, but it is not the primary focus of this page.
+
+
+State capture trade-offs
+
+One of the features made possible with virtual machines is live
+  migration -- transferring all state related to the guest from
+  one host to another with minimal interruption to the guest's
+  activity. In this case, state includes domain memory (including
+  register and device contents), and domain storage (whether the
+  guest's view of the disks are backed by local storage on the
+  host, or by the hypervisor accessing shared storage over a
+  network).  A clever observer will then note that if all state is
+  available for live migration, then there is nothing stopping a
+  user from saving some or all of that state at a given point of
+  time in order to be able to later rewind guest execution back to
+  the state it previously had. The astute reader will also realize
+  that state capture at any level requires that the data must be
+  stored and managed by some mechanism. This processing might fit
+  in a single file, or more likely require a chain of related
+  files, and may require synchronization with third-party tools
+  built around managing the amount of data resulting from
+  capturing the state of multiple guests that each use multiple
+  disks.
+
+
+
+  There are several libvirt APIs associated with capturing the
+  state of a guest, which can later be used to rewind that guest
+  to the conditions it was in earlier.  The following is a list of
+  trade-offs and differences between the various facets that
+  affect capturing domain state for active domains:
+
+
+
+  Duration
+  Capturing state can be a lengthy process, so while the
+captured state ideally represents an atomic point in time
+correpsonding to something the guest was actually executing,
+capturing state tends to focus on minimizing guest downtime
+while performing the rest of the state capture in parallel
+with guest execution.  Some interfaces require up-front
+preparation (the state captured is not complete until the API
+ends, which may be some time after the command was first
+started), while other interfaces track the state when the
+command was first issued, regardless of the time spent in
+capturing the rest of the state.  Also, time spent in state
+capture may be longer than the time required for live
+migration, when state must be duplicated rather than shared.

Do you mean migration with shared storage in last sentence, or what? In the 
same way we can say that duplication may go faster if we mean incremental 
backup:) Or I don't follow) Moreover, migration time may be longer because of a 
lot of iterations (when backup always have one).


+  
+
+  Amount of state
+  For an online guest, there is a choice 

Re: [libvirt] [jenkins-ci PATCH 0/4] Add Fedora 29, drop Fedora 27

2018-11-07 Thread Pavel Hrdina
On Fri, Nov 02, 2018 at 03:39:32PM +0100, Andrea Bolognani wrote:
> It's that time of the year again :)
> 
> Andrea Bolognani (4):
>   guests: Add Fedora 29
>   Build on Fedora 29
>   Stop building on Fedora 27
>   guests: Drop Fedora 27

Reviewed-by: Pavel Hrdina 


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] qemu: Don't use -mem-prealloc among with .prealloc=yes

2018-11-07 Thread Michal Privoznik
On 11/07/2018 12:43 AM, John Ferlan wrote:
> 
> 
> On 11/5/18 9:49 AM, Michal Privoznik wrote:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1624223
>>
>> There are two ways to request memory preallocation on cmd line:
>> -mem-prealloc and .prealloc attribute to memory-backend-file.
> 
> s/to/for a/ ?
> 
>> However, as it turns out it's not safe to use both at the same
>> time. Prefer -mem-prealloc as it is more backward compatible
>> compared to switching to "-numa node,memdev=  + -object
>> memory-backend-file".
>>
> 
> FWIW: Issue introduced by commit 1c4f3b56..
> 
> While I understand the reasoning, it's really too bad we couldn't "move"
> the determination over which conflicting qualifier is used to earlier.
> By the time we call the -numa backend we would already have had to make
> the choice if I'm reading the ordering right.

Correct, you're reading it right.

> 
> But if it doesn't matter for the -numa object to use the -mem-prealloc,
> then who am I to complain.  Of course the "future thinking" me that is
> living in the present issues surrounding machine and pc makes me wonder
> if choosing this as the default going forward into the future where
> someone could deprecate the -mem-prealloc because -numa will be so
> prevelant won't bite us down the road.

If -mem-prealloc is deprecated then we would have to construct -object
memory-backend-file. I'm not against this, but IIRC this fails during
migration. I mean, if you have a guest that uses -mem-path you can't
migrate it to -object memory-backing-file because qemu would fail to
load the migration stream. That is why we have @needBackend in
qemuBuildNumaArgStr(), so that new cmd line is built iff really needed.

This is the reason I went this way even though BZ suggests otherwise.

> 
> Curious how others feel - I'm not against this choice, just trying to
> supply an opposing/differing viewpoint. We really have to start coding
> for the future and consider what deprecation could mean especially for
> arguments that essentially mean the same thing.
> 
>> Signed-off-by: Michal Privoznik 
>> ---
>>  src/qemu/qemu_command.c   | 37 +--
>>  src/qemu/qemu_command.h   |  1 +
>>  src/qemu/qemu_domain.c|  2 +
>>  src/qemu/qemu_domain.h|  3 ++
>>  src/qemu/qemu_hotplug.c   |  3 +-
>>  .../hugepages-numa-default-dimm.args  |  2 +-
>>  6 files changed, 35 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>> index e338d3172e..0294030f0e 100644
>> --- a/src/qemu/qemu_command.c
>> +++ b/src/qemu/qemu_command.c
>> @@ -3123,6 +3123,7 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
>>   * @def: domain definition object
>>   * @mem: memory definition object
>>   * @autoNodeset: fallback nodeset in case of automatic NUMA placement
>> + * @forbidPrealloc: don't set prealloc attribute
> 
> Slight bikeshed, but this changes the priv->memAlloc to @forbidPrealloc
> which is IMO a bit odd.

Okay, what name do you suggest? My reasoning for the name was that it
should make sense from the function POV. That's why calling the variable
'memAlloc' did not make sense to me.

> 
> Beyond that, this becomes the 3rd @priv field to be passed along...
> Maybe @priv should just be passed to access qemuCaps, autoNodeset, and
> memPrealloc.

Ah sure.

> 
>>   * @force: forcibly use one of the backends
>>   *
>>   * Creates a configuration object that represents memory backend of given 
>> guest
>> @@ -3136,6 +3137,9 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
>>   * Then, if one of the two memory-backend-* should be used, the @qemuCaps is
>>   * consulted to check if qemu does support it.
>>   *
>> + * If @forbidPrealloc is true then 'prealloc' attribute of the backend is 
>> not
>> + * set. This may come handy when global -mem-prealloc is already specified.
>> + *
>>   * Returns: 0 on success,
>>   *  1 on success and if there's no need to use memory-backend-*
>>   * -1 on error.
>> @@ -3148,6 +3152,7 @@ qemuBuildMemoryBackendProps(virJSONValuePtr 
>> *backendProps,
>>  virDomainDefPtr def,
>>  virDomainMemoryDefPtr mem,
>>  virBitmapPtr autoNodeset,
>> +bool forbidPrealloc,
>>  bool force)
>>  {
>>  const char *backendType = "memory-backend-file";
>> @@ -3265,11 +3270,13 @@ qemuBuildMemoryBackendProps(virJSONValuePtr 
>> *backendProps,
>>  if (mem->nvdimmPath) {
>>  if (VIR_STRDUP(memPath, mem->nvdimmPath) < 0)
>>  goto cleanup;
>> -prealloc = true;
>> +if (!forbidPrealloc)
>> +prealloc = true;
>>  } else if (useHugepage) {
>>  if (qemuGetDomainHupageMemPath(def, cfg, pagesize, ) < 
>> 0)
>>  goto cleanup;
>> - 

[libvirt] RFC: put domain's interfaces into distinct namespaces

2018-11-07 Thread Nikolay Shirokovskiy
Hi, all!

There is performance issue with network filters and broadcast ethernet traffic.
If L2 segment is large enough (several thousands of VMs) then there is a lot of
broadcast ARP traffic (about frames 100/s). As aresult on host with several 
hundreds
VMs (say 300) we have kernel thread eating 100% of CPUs just for checking this 
traffic
against firewall rules. The problem is if there are rules in ebtables 
POSTROUTING chain
(clean-traffic is example of such filter) then when every single broadcast 
frame turns into
300, one for every distinct bridge port and then each one of these 300 is 
checked against
300 / 2 rules average to find chain for that port. As a result we have 100 * 
300 * 300 / 2
= 4.5 * 10^6 rules checks per second. Kernel does not spread this workload onto
different CPUs and anyway this is wasting CPUs!

The simple solution is to put rules that ACCEPT ARP traffic into POSTROUTING
itself before any port specific chains. But this will affect non-VM ports too 
and host itself. So can we instead make a distinct network namespace for every
VM and put tap there, next add the bridge into the namespace too so we can apply
ebtables rules there and insert tap into the bridge. Finally connect the bridges
in root namespace and VM namespace by veth pair. As result in the situation
described above each cloned frame will be cheched only againt rules for this
very VM. The regular TCP traffic will have same benefits. On the other hand we
need a bridge and veth pair for every VM and some CPU power to process this 
extra
traffic path.

The proposed approach also fixes the problem of slow libvirtd restarting with
network filters ([1], [2]) as it is rather difficult to mess network rules in
different network namespace, at least restarting/reloading firewalld won't
hurt such rules so we just don't need to reinstantiate rules at all.



[1] [RFC] Faster libvirtd restart with nwfilter rules
https://www.redhat.com/archives/libvir-list/2018-September/msg01206.html
  which continues in
https://www.redhat.com/archives/libvir-list/2018-October/msg00657.html

[2] [PATCH v2 0/2] nwfilter: don't reinstantiate rules if there is no need to
https://www.redhat.com/archives/libvir-list/2018-October/msg01317.html


Nikolay

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list