Re: [libvirt] libvirt? qemu change that mmaps ELF files breaks libvirt svirt handling for os.kernel

2019-10-04 Thread Paolo Bonzini
On 04/10/19 14:47, Christian Borntraeger wrote:
>> Please file a bz entry against the selinux-policy component for
>> whatever distro you're using, and/or Fedora rawhide, and CC me
>> on it too.
> 
> Done. This was on Fedora 30.
> https://bugzilla.redhat.com/show_bug.cgi?id=1758525
> 
>  Now sure about others like RHEL. RHV.
> 

We'll take care of that.  Thanks!

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH 05/11] util: use glib string allocation/formatting functions

2019-10-04 Thread Daniel P . Berrangé
On Mon, Sep 30, 2019 at 01:35:36PM +0200, Pavel Hrdina wrote:
> On Fri, Sep 27, 2019 at 06:17:27PM +0100, Daniel P. Berrangé wrote:
> > Convert the string duplication APIs to use the g_strdup family of APIs.
> > 
> > Annoyingly our virVasprintf/virAsprintf functions return the character
> > count, even though 90% of our usage doesn't need it. To retain compat
> > with these semantics we have a call to strlen which costs CPU time.
> > 
> > We previously used the 'strdup-posix' gnulib module because mingw does
> > not set errno to ENOMEM on failure
> > 
> > We previously used the 'strndup' gnulib module because this function
> > does not exist on mingw.
> > 
> > We previously used the 'vasprintf' gnulib module because of many GNU
> > supported format specifiers not working on non-Linux platforms. glib's
> > own equivalent standardizes on GNU format specifiers too.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >  bootstrap.conf   |  3 ---
> >  src/util/virstring.c | 19 +++
> >  2 files changed, 7 insertions(+), 15 deletions(-)
> > 
> > diff --git a/bootstrap.conf b/bootstrap.conf
> > index 549d18c6d4..b6b75f9301 100644
> > --- a/bootstrap.conf
> > +++ b/bootstrap.conf
> > @@ -100,8 +100,6 @@ stat-time
> >  stdarg
> >  stpcpy
> >  strchrnul
> > -strdup-posix
> > -strndup
> >  strerror
> >  strerror_r-posix
> >  strptime
> > @@ -117,7 +115,6 @@ uname
> >  unsetenv
> >  useless-if-before-free
> >  usleep
> > -vasprintf
> >  verify
> >  vc-list-files
> >  vsnprintf
> > diff --git a/src/util/virstring.c b/src/util/virstring.c
> > index a4cc7e9c0a..c8c888b2a0 100644
> > --- a/src/util/virstring.c
> > +++ b/src/util/virstring.c
> > @@ -730,12 +730,9 @@ virVasprintfInternal(char **strp,
> >   const char *fmt,
> >   va_list list)
> >  {
> > -int ret;
> > +*strp = g_strdup_vprintf(fmt, list);
> >  
> > -if ((ret = vasprintf(strp, fmt, list)) == -1)
> > -abort();
> > -
> > -return ret;
> > +return strlen(*strp);
> 
> This will cause a SEGFAULT if strp is NULL as g_strdup_vprintf doesn't
> abort on failure.

I spent a long time investigating this

g_strdup_vprintf calls g_vasprintf() which in turn has 3 impls.

2 out of the 3 impls will abort on OOM, but one won't. The one
we use on Linux is the one that won't abort.

No application code that I can find ever checks the return value
of g_strdup_vprintf or the output string of g_vasprintf.

I eventually found a bug indicating the lack of abort on OOM is
indeed considered a mistake:

  https://gitlab.gnome.org/GNOME/glib/issues/1622

I've thus sent a patch to force an abort on OOM:

  https://gitlab.gnome.org/GNOME/glib/merge_requests/1145

Thus I think from libvirt's POV we can assume this aborts on OOM,
since every single other application using this does the same.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] libvirt? qemu change that mmaps ELF files breaks libvirt svirt handling for os.kernel

2019-10-04 Thread Christian Borntraeger


On 04.10.19 14:36, Daniel P. Berrangé wrote:
> On Fri, Oct 04, 2019 at 02:18:49PM +0200, Christian Borntraeger wrote:
>>
>>
>> On 04.10.19 14:13, Paolo Bonzini wrote:
>>> On 04/10/19 14:03, Christian Borntraeger wrote:
 Stefano, Paolo,

 I have an interesting fail in QEMU 

 2019-10-04T12:00:32.675188Z qemu-system-s390x: GLib: g_mapped_file_unref: 
 assertion 'file != NULL' failed
 that bisected to 
 commit 816b9fe450220e19acb91a0ce4a8ade7000648d1 (refs/bisect/bad)
 elf-ops.h: Map into memory the ELF to load

 strace tells that I can read the ELF file, but not mmap
 strace:
 214365 openat(AT_FDCWD, "/var/lib/libvirt/images/test_cpu_timer.elf", 
 O_RDONLY) = 36
 214365 read(46, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0", 16) = 16
 214365 lseek(46, 0, SEEK_SET)   = 0
 [...]
 214365 fstat(46, {st_mode=S_IFREG|0755, st_size=168176, ...}) = 0
 214365 mmap(NULL, 168176, PROT_READ|PROT_WRITE, MAP_PRIVATE, 46, 0) = -1 
 EACCES (Permission denied)

 So reading from /var/lib/libvirt/images/test_cpu_timer.elf does work, 
 mmaping does not.
 setenforce 0 makes the problem go away. 

 This might be more of an issue in libvirt, setting the svirt context too
 restrictive, but I am not too deep into the svirt part of libvirt.
 Reverting the qemu commit makes the problem go away.
>>>
>>> Yes, the policy is too restrictive in my opinion.
>>>
>>> Can you include the output of "audit2allow" and/or "audit2allow -R"?
>>>
>>> Thanks,
>>>
>>> Paolo
>>>
>>
>> require {
>>  type unconfined_t;
>>  type virt_content_t;
>>  type svirt_t;
>>  type systemd_tmpfiles_t;
>>  type user_home_t;
>>  type NetworkManager_t;
>>  class file { entrypoint execute ioctl lock map open read write };
>>  class bpf prog_run;
>> }
>>
>> #= svirt_t ==
>> allow svirt_t user_home_t:file { entrypoint execute ioctl lock open read 
>> write };
>>
>> # This avc can be allowed using the boolean 'domain_can_mmap_files'
> 
> This is an unrelated boolean and we don't want to use that so ignore
> this suggestion !
> 
>> allow svirt_t virt_content_t:file map;
> 
> This matches your strace.  virt_content_t is the label we use on
> files that are exposed to QEMU read-only.
> 
> The svirt policy only allows mmap on files that re exposed read-write
> currently.
> 
> I believe we can safely allow this mmap on read-only files too
> because  the read-only restriction is enforced at time of open,
> and any writes to the mapped file  will result in a private
> copy on write.
> 
> Please file a bz entry against the selinux-policy component for
> whatever distro you're using, and/or Fedora rawhide, and CC me
> on it too.

Done. This was on Fedora 30.
https://bugzilla.redhat.com/show_bug.cgi?id=1758525

 Now sure about others like RHEL. RHV.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] libvirt? qemu change that mmaps ELF files breaks libvirt svirt handling for os.kernel

2019-10-04 Thread Daniel P . Berrangé
On Fri, Oct 04, 2019 at 02:18:49PM +0200, Christian Borntraeger wrote:
> 
> 
> On 04.10.19 14:13, Paolo Bonzini wrote:
> > On 04/10/19 14:03, Christian Borntraeger wrote:
> >> Stefano, Paolo,
> >>
> >> I have an interesting fail in QEMU 
> >>
> >> 2019-10-04T12:00:32.675188Z qemu-system-s390x: GLib: g_mapped_file_unref: 
> >> assertion 'file != NULL' failed
> >> that bisected to 
> >> commit 816b9fe450220e19acb91a0ce4a8ade7000648d1 (refs/bisect/bad)
> >> elf-ops.h: Map into memory the ELF to load
> >>
> >> strace tells that I can read the ELF file, but not mmap
> >> strace:
> >> 214365 openat(AT_FDCWD, "/var/lib/libvirt/images/test_cpu_timer.elf", 
> >> O_RDONLY) = 36
> >> 214365 read(46, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0", 16) = 16
> >> 214365 lseek(46, 0, SEEK_SET)   = 0
> >> [...]
> >> 214365 fstat(46, {st_mode=S_IFREG|0755, st_size=168176, ...}) = 0
> >> 214365 mmap(NULL, 168176, PROT_READ|PROT_WRITE, MAP_PRIVATE, 46, 0) = -1 
> >> EACCES (Permission denied)
> >>
> >> So reading from /var/lib/libvirt/images/test_cpu_timer.elf does work, 
> >> mmaping does not.
> >> setenforce 0 makes the problem go away. 
> >>
> >> This might be more of an issue in libvirt, setting the svirt context too
> >> restrictive, but I am not too deep into the svirt part of libvirt.
> >> Reverting the qemu commit makes the problem go away.
> > 
> > Yes, the policy is too restrictive in my opinion.
> > 
> > Can you include the output of "audit2allow" and/or "audit2allow -R"?
> > 
> > Thanks,
> > 
> > Paolo
> > 
> 
> require {
>   type unconfined_t;
>   type virt_content_t;
>   type svirt_t;
>   type systemd_tmpfiles_t;
>   type user_home_t;
>   type NetworkManager_t;
>   class file { entrypoint execute ioctl lock map open read write };
>   class bpf prog_run;
> }
> 
> #= svirt_t ==
> allow svirt_t user_home_t:file { entrypoint execute ioctl lock open read 
> write };
> 
> # This avc can be allowed using the boolean 'domain_can_mmap_files'

This is an unrelated boolean and we don't want to use that so ignore
this suggestion !

> allow svirt_t virt_content_t:file map;

This matches your strace.  virt_content_t is the label we use on
files that are exposed to QEMU read-only.

The svirt policy only allows mmap on files that re exposed read-write
currently.

I believe we can safely allow this mmap on read-only files too
because  the read-only restriction is enforced at time of open,
and any writes to the mapped file  will result in a private
copy on write.

Please file a bz entry against the selinux-policy component for
whatever distro you're using, and/or Fedora rawhide, and CC me
on it too.

> corecmd_bin_entry_type(svirt_t)
> userdom_manage_user_home_content_dirs(svirt_t)
> userdom_map_user_home_files(svirt_t)
> virt_rw_svirt_image(svirt_t)

These look unrelated to the problem above.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] libvirt? qemu change that mmaps ELF files breaks libvirt svirt handling for os.kernel

2019-10-04 Thread Christian Borntraeger



On 04.10.19 14:13, Paolo Bonzini wrote:
> On 04/10/19 14:03, Christian Borntraeger wrote:
>> Stefano, Paolo,
>>
>> I have an interesting fail in QEMU 
>>
>> 2019-10-04T12:00:32.675188Z qemu-system-s390x: GLib: g_mapped_file_unref: 
>> assertion 'file != NULL' failed
>> that bisected to 
>> commit 816b9fe450220e19acb91a0ce4a8ade7000648d1 (refs/bisect/bad)
>> elf-ops.h: Map into memory the ELF to load
>>
>> strace tells that I can read the ELF file, but not mmap
>> strace:
>> 214365 openat(AT_FDCWD, "/var/lib/libvirt/images/test_cpu_timer.elf", 
>> O_RDONLY) = 36
>> 214365 read(46, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0", 16) = 16
>> 214365 lseek(46, 0, SEEK_SET)   = 0
>> [...]
>> 214365 fstat(46, {st_mode=S_IFREG|0755, st_size=168176, ...}) = 0
>> 214365 mmap(NULL, 168176, PROT_READ|PROT_WRITE, MAP_PRIVATE, 46, 0) = -1 
>> EACCES (Permission denied)
>>
>> So reading from /var/lib/libvirt/images/test_cpu_timer.elf does work, 
>> mmaping does not.
>> setenforce 0 makes the problem go away. 
>>
>> This might be more of an issue in libvirt, setting the svirt context too
>> restrictive, but I am not too deep into the svirt part of libvirt.
>> Reverting the qemu commit makes the problem go away.
> 
> Yes, the policy is too restrictive in my opinion.
> 
> Can you include the output of "audit2allow" and/or "audit2allow -R"?
> 
> Thanks,
> 
> Paolo
> 

require {
type unconfined_t;
type virt_content_t;
type svirt_t;
type systemd_tmpfiles_t;
type user_home_t;
type NetworkManager_t;
class file { entrypoint execute ioctl lock map open read write };
class bpf prog_run;
}

#= svirt_t ==
allow svirt_t user_home_t:file { entrypoint execute ioctl lock open read write 
};

# This avc can be allowed using the boolean 'domain_can_mmap_files'
allow svirt_t virt_content_t:file map;
corecmd_bin_entry_type(svirt_t)
userdom_manage_user_home_content_dirs(svirt_t)
userdom_map_user_home_files(svirt_t)
virt_rw_svirt_image(svirt_t)

#= systemd_tmpfiles_t ==
kernel_read_usermodehelper_state(systemd_tmpfiles_t)

#= unconfined_t ==
allow unconfined_t NetworkManager_t:bpf prog_run;

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] libvirt? qemu change that mmaps ELF files breaks libvirt svirt handling for os.kernel

2019-10-04 Thread Christian Borntraeger
Stefano, Paolo,

I have an interesting fail in QEMU 

2019-10-04T12:00:32.675188Z qemu-system-s390x: GLib: g_mapped_file_unref: 
assertion 'file != NULL' failed
that bisected to 
commit 816b9fe450220e19acb91a0ce4a8ade7000648d1 (refs/bisect/bad)
elf-ops.h: Map into memory the ELF to load

strace tells that I can read the ELF file, but not mmap
strace:
214365 openat(AT_FDCWD, "/var/lib/libvirt/images/test_cpu_timer.elf", O_RDONLY) 
= 36
214365 read(46, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0", 16) = 16
214365 lseek(46, 0, SEEK_SET)   = 0
[...]
214365 fstat(46, {st_mode=S_IFREG|0755, st_size=168176, ...}) = 0
214365 mmap(NULL, 168176, PROT_READ|PROT_WRITE, MAP_PRIVATE, 46, 0) = -1 EACCES 
(Permission denied)

So reading from /var/lib/libvirt/images/test_cpu_timer.elf does work, mmaping 
does not.
setenforce 0 makes the problem go away. 

This might be more of an issue in libvirt, setting the svirt context too
restrictive, but I am not too deep into the svirt part of libvirt.
Reverting the qemu commit makes the problem go away.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] libvirt? qemu change that mmaps ELF files breaks libvirt svirt handling for os.kernel

2019-10-04 Thread Paolo Bonzini
On 04/10/19 14:03, Christian Borntraeger wrote:
> Stefano, Paolo,
> 
> I have an interesting fail in QEMU 
> 
> 2019-10-04T12:00:32.675188Z qemu-system-s390x: GLib: g_mapped_file_unref: 
> assertion 'file != NULL' failed
> that bisected to 
> commit 816b9fe450220e19acb91a0ce4a8ade7000648d1 (refs/bisect/bad)
> elf-ops.h: Map into memory the ELF to load
> 
> strace tells that I can read the ELF file, but not mmap
> strace:
> 214365 openat(AT_FDCWD, "/var/lib/libvirt/images/test_cpu_timer.elf", 
> O_RDONLY) = 36
> 214365 read(46, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0", 16) = 16
> 214365 lseek(46, 0, SEEK_SET)   = 0
> [...]
> 214365 fstat(46, {st_mode=S_IFREG|0755, st_size=168176, ...}) = 0
> 214365 mmap(NULL, 168176, PROT_READ|PROT_WRITE, MAP_PRIVATE, 46, 0) = -1 
> EACCES (Permission denied)
> 
> So reading from /var/lib/libvirt/images/test_cpu_timer.elf does work, mmaping 
> does not.
> setenforce 0 makes the problem go away. 
> 
> This might be more of an issue in libvirt, setting the svirt context too
> restrictive, but I am not too deep into the svirt part of libvirt.
> Reverting the qemu commit makes the problem go away.

Yes, the policy is too restrictive in my opinion.

Can you include the output of "audit2allow" and/or "audit2allow -R"?

Thanks,

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v2 0/1] virsh: fixed handling of sourceless disks in 'domblkinfo' cmd

2019-10-04 Thread Pavel Mores
Integrating feedback to the original version.

This changes the approach to fixing the problem - previously, we called
virDomainGetBlockInfo() and if it failed, we mitigated the failure for CDROMs
and floppies specifically.  Now we first check for existence of a 
element in the corresponding XML and if we find none, we avoid calling
virDomainGetBlockInfo() altogether as we know it's bound to fail in that case.

The benefit is that whereas the previous fix swallowed all errors concerning
CDROMs and floppies - not just missing  - this one only handles
missing  specifically and doesn't mask other problems that might come
up.

The patch itself is fairly simple, unfortunately some noise is caused by
additional indentation of code related to the virDomainGetBlockInfo() call.
That code should however be unchanged (apart from reformating a comment to
keep line lengths under 80 chars).

Pavel Mores (1):
  fixed handling of sourceless disks in 'domblkinfo' cmd

 tools/virsh-domain-monitor.c | 33 -
 1 file changed, 20 insertions(+), 13 deletions(-)

-- 
2.21.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v2 1/1] fixed handling of sourceless disks in 'domblkinfo' cmd

2019-10-04 Thread Pavel Mores
virDomainGetBlockInfo() returns error if called on a disk with no
source (a sourceless disk might be a removable media drive with no
media in it, for instance an empty CDROM or floppy drive).

So far this caused the virsh domblkinfo --all command to abort and
ignore any remaining (not yet displayed) disk devices.  This patch
fixes the problem by first checking for existence of a 
element in the corresponding XML.  If none is found, we avoid calling
virDomainGetBlockInfo() altogether as we know it's bound to fail in
that case.

https://bugzilla.redhat.com/show_bug.cgi?id=1619625

Signed-off-by: Pavel Mores 
---
 tools/virsh-domain-monitor.c | 33 -
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c
index 0e2c4191d7..b69283b2da 100644
--- a/tools/virsh-domain-monitor.c
+++ b/tools/virsh-domain-monitor.c
@@ -507,20 +507,27 @@ cmdDomblkinfo(vshControl *ctl, const vshCmd *cmd)
 protocol = virXPathString("string(./source/@protocol)", ctxt);
 target = virXPathString("string(./target/@dev)", ctxt);
 
-rc = virDomainGetBlockInfo(dom, target, , 0);
-
-if (rc < 0) {
-/* If protocol is present that's an indication of a networked
- * storage device which cannot provide statistics, so generate
- * 0 based data and get the next disk. */
-if (protocol && !active &&
-virGetLastErrorCode() == VIR_ERR_INTERNAL_ERROR &&
-virGetLastErrorDomain() == VIR_FROM_STORAGE) {
-memset(, 0, sizeof(info));
-vshResetLibvirtError();
-} else {
-goto cleanup;
+if (virXPathBoolean("boolean(./source)", ctxt) == 1) {
+
+rc = virDomainGetBlockInfo(dom, target, , 0);
+
+if (rc < 0) {
+/* If protocol is present that's an indication of a
+ * networked storage device which cannot provide 
statistics,
+ * so generate 0 based data and get the next disk. */
+if (protocol && !active &&
+virGetLastErrorCode() == VIR_ERR_INTERNAL_ERROR &&
+virGetLastErrorDomain() == VIR_FROM_STORAGE) {
+memset(, 0, sizeof(info));
+vshResetLibvirtError();
+} else {
+goto cleanup;
+}
 }
+} else {
+/* if we don't call virDomainGetBlockInfo() who clears 'info'
+ * we have to do it manually */
+memset(, 0, sizeof(info));
 }
 
 if (!cmdDomblkinfoGet(ctl, , , , , human))
-- 
2.21.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [jenkins-ci PATCH] Use all_machines whenever possible

2019-10-04 Thread Andrea Bolognani
The reasons we had to prevent some targets from building
libvirt-dbus and libvirt-ocaml in the past clearly no longer
apply, since we're back to building them on all targets.

Drop the explicit target list and use all_machines instead.

Signed-off-by: Andrea Bolognani 
---
Pushed as trivial.

 guests/playbooks/build/projects/libvirt-dbus.yml  | 14 +-
 guests/playbooks/build/projects/libvirt-ocaml.yml | 14 +-
 jenkins/projects/libvirt-dbus.yaml| 10 +-
 jenkins/projects/libvirt-ocaml.yaml   | 10 +-
 4 files changed, 4 insertions(+), 44 deletions(-)

diff --git a/guests/playbooks/build/projects/libvirt-dbus.yml 
b/guests/playbooks/build/projects/libvirt-dbus.yml
index a6ff025..218bcc1 100644
--- a/guests/playbooks/build/projects/libvirt-dbus.yml
+++ b/guests/playbooks/build/projects/libvirt-dbus.yml
@@ -1,19 +1,7 @@
 ---
 - set_fact:
 name: libvirt-dbus
-machines:
-  - libvirt-centos-7
-  - libvirt-debian-9
-  - libvirt-debian-10
-  - libvirt-debian-sid
-  - libvirt-fedora-29
-  - libvirt-fedora-30
-  - libvirt-fedora-rawhide
-  - libvirt-freebsd-11
-  - libvirt-freebsd-12
-  - libvirt-freebsd-current
-  - libvirt-ubuntu-16
-  - libvirt-ubuntu-18
+machines: '{{ all_machines }}'
 archive_format: xz
 git_url: '{{ git_urls["libvirt-dbus"][git_remote] }}'
 
diff --git a/guests/playbooks/build/projects/libvirt-ocaml.yml 
b/guests/playbooks/build/projects/libvirt-ocaml.yml
index 58f017b..4202835 100644
--- a/guests/playbooks/build/projects/libvirt-ocaml.yml
+++ b/guests/playbooks/build/projects/libvirt-ocaml.yml
@@ -1,19 +1,7 @@
 ---
 - set_fact:
 name: libvirt-ocaml
-machines:
-  - libvirt-centos-7
-  - libvirt-debian-9
-  - libvirt-debian-10
-  - libvirt-debian-sid
-  - libvirt-fedora-29
-  - libvirt-fedora-30
-  - libvirt-fedora-rawhide
-  - libvirt-freebsd-11
-  - libvirt-freebsd-12
-  - libvirt-freebsd-current
-  - libvirt-ubuntu-16
-  - libvirt-ubuntu-18
+machines: '{{ all_machines }}'
 archive_format: xz
 git_url: '{{ git_urls["libvirt-ocaml"][git_remote] }}'
 
diff --git a/jenkins/projects/libvirt-dbus.yaml 
b/jenkins/projects/libvirt-dbus.yaml
index 2b72fb4..f7a52ef 100644
--- a/jenkins/projects/libvirt-dbus.yaml
+++ b/jenkins/projects/libvirt-dbus.yaml
@@ -1,15 +1,7 @@
 ---
 - project:
 name: libvirt-dbus
-machines:
-  - libvirt-centos-7
-  - libvirt-debian-9
-  - libvirt-debian-10
-  - libvirt-fedora-29
-  - libvirt-fedora-30
-  - libvirt-fedora-rawhide
-  - libvirt-freebsd-11
-  - libvirt-freebsd-12
+machines: '{all_machines}'
 title: Libvirt D-Bus
 archive_format: xz
 git_url: '{git_urls[libvirt-dbus][default]}'
diff --git a/jenkins/projects/libvirt-ocaml.yaml 
b/jenkins/projects/libvirt-ocaml.yaml
index 0c819cc..4a9bb65 100644
--- a/jenkins/projects/libvirt-ocaml.yaml
+++ b/jenkins/projects/libvirt-ocaml.yaml
@@ -1,15 +1,7 @@
 ---
 - project:
 name: libvirt-ocaml
-machines:
-  - libvirt-centos-7
-  - libvirt-debian-9
-  - libvirt-debian-10
-  - libvirt-fedora-29
-  - libvirt-fedora-30
-  - libvirt-fedora-rawhide
-  - libvirt-freebsd-11
-  - libvirt-freebsd-12
+machines: '{all_machines}'
 title: Libvirt OCaml
 archive_format: xz
 git_url: '{git_urls[libvirt-ocaml][default]}'
-- 
2.21.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC] Add API to change qemu agent response timeout

2019-10-04 Thread Michal Privoznik

On 10/3/19 11:52 PM, Jonathon Jongsma wrote:

Some layered products such as oVirt have requested a way to avoid being
blocked by guest agent commands when querying a loaded vm. For example,
many guest agent commands are polled periodically to monitor changes,
and rather than blocking the calling process, they'd prefer to simply
time out when an agent query is taking too long.

This patch adds a way for the user to specify a custom agent timeout
that is applied to all agent commands.

One special case to note here is the 'guest-sync' command. 'guest-sync'
is issued internally prior to calling any other command. (For example,
when libvirt wants to call 'guest-get-fsinfo', we first call
'guest-sync' and then call 'guest-get-fsinfo').

Previously, the 'guest-sync' command used a 5-second timeout
(VIR_DOMAIN_QEMU_AGENT_COMMAND_DEFAULT), whereas the actual command that
followed always blocked indefinitely
(VIR_DOMAIN_QEMU_AGENT_COMMAND_BLOCK). As part of this patch, if a
custom timeout is specified that is shorter than
5 seconds,  this new timeout also used for 'guest-sync'. If there is no
custom timeout or if the custom timeout is longer than 5 seconds, we
will continue to use the 5-second timeout.

See https://bugzilla.redhat.com/show_bug.cgi?id=1705426 for additional details.

Signed-off-by: Jonathon Jongsma 
---
  include/libvirt/libvirt-qemu.h |  2 +
  src/driver-hypervisor.h|  5 +++
  src/libvirt-qemu.c | 40 
  src/libvirt_qemu.syms  |  4 ++
  src/qemu/qemu_agent.c  | 69 +++---
  src/qemu/qemu_agent.h  |  3 ++
  src/qemu/qemu_driver.c | 24 
  src/qemu_protocol-structs  |  8 
  src/remote/qemu_protocol.x | 18 -
  src/remote/remote_driver.c |  1 +
  10 files changed, 143 insertions(+), 31 deletions(-)

diff --git a/include/libvirt/libvirt-qemu.h b/include/libvirt/libvirt-qemu.h
index 891617443f..8d3cc776e9 100644
--- a/include/libvirt/libvirt-qemu.h
+++ b/include/libvirt/libvirt-qemu.h
@@ -53,6 +53,8 @@ typedef enum {
  char *virDomainQemuAgentCommand(virDomainPtr domain, const char *cmd,
  int timeout, unsigned int flags);
  
+int virDomainQemuAgentSetTimeout(virDomainPtr domain, int timeout);

+
  /**
   * virConnectDomainQemuMonitorEventCallback:
   * @conn: the connection pointer
diff --git a/src/driver-hypervisor.h b/src/driver-hypervisor.h
index 015b2cd01c..2f17bff844 100644
--- a/src/driver-hypervisor.h
+++ b/src/driver-hypervisor.h
@@ -1372,6 +1372,10 @@ typedef int
  int *nparams,
  unsigned int flags);
  
+typedef int

+(*virDrvDomainQemuAgentSetTimeout)(virDomainPtr domain,
+   int timeout);
+
  typedef struct _virHypervisorDriver virHypervisorDriver;
  typedef virHypervisorDriver *virHypervisorDriverPtr;
  
@@ -1632,4 +1636,5 @@ struct _virHypervisorDriver {

  virDrvDomainCheckpointGetParent domainCheckpointGetParent;
  virDrvDomainCheckpointDelete domainCheckpointDelete;
  virDrvDomainGetGuestInfo domainGetGuestInfo;
+virDrvDomainQemuAgentSetTimeout domainQemuAgentSetTimeout;
  };
diff --git a/src/libvirt-qemu.c b/src/libvirt-qemu.c
index 1afb5fe529..73f119cb23 100644
--- a/src/libvirt-qemu.c
+++ b/src/libvirt-qemu.c
@@ -216,6 +216,46 @@ virDomainQemuAgentCommand(virDomainPtr domain,
  return NULL;
  }
  
+/**

+ * virDomainQemuAgentSetTimeout:
+ * @domain: a domain object
+ * @timeout: timeout in seconds
+ *
+ * Set how long to wait for a response from qemu agent commands. By default,
+ * agent commands block forever waiting for a response.
+ *
+ * @timeout must be -2, -1, 0 or positive.
+ * VIR_DOMAIN_QEMU_AGENT_COMMAND_BLOCK(-2): meaning to block forever waiting 
for
+ * a result.
+ * VIR_DOMAIN_QEMU_AGENT_COMMAND_DEFAULT(-1): use default timeout value.
+ * VIR_DOMAIN_QEMU_AGENT_COMMAND_NOWAIT(0): does not wait.
+ * positive value: wait for @timeout seconds
+ *
+ * Returns 0 on success, -1 on failure
+ */
+int
+virDomainQemuAgentSetTimeout(virDomainPtr domain,
+ int timeout)


This definitely deserves some @flags argument. Even though we don't have 
any flags to set now, it has proven to save us in the future couple of 
times. Look at "extra flags; not used yet " occurring in docs for 
our public APIs.



+{
+virConnectPtr conn;
+
+virResetLastError();
+
+virCheckDomainReturn(domain, -1);
+conn = domain->conn;
+
+if (conn->driver->domainQemuAgentSetTimeout) {
+if (conn->driver->domainQemuAgentSetTimeout(domain, timeout) < 0)
+goto error;
+return 0;
+}
+
+virReportUnsupportedError();
+
+ error:
+virDispatchError(conn);
+return -1;
+}
  
  /**

   * virConnectDomainQemuMonitorEventRegister:
diff --git a/src/libvirt_qemu.syms b/src/libvirt_qemu.syms
index 3a297e3a2b..348caea72e 100644
--- a/src/libvirt_qemu.syms
+++ 

Re: [libvirt] [RFC] Add API to change qemu agent response timeout

2019-10-04 Thread Michal Privoznik

On 10/4/19 8:14 AM, Peter Krempa wrote:

On Thu, Oct 03, 2019 at 16:52:12 -0500, Jonathon Jongsma wrote:

Some layered products such as oVirt have requested a way to avoid being
blocked by guest agent commands when querying a loaded vm. For example,
many guest agent commands are polled periodically to monitor changes,
and rather than blocking the calling process, they'd prefer to simply
time out when an agent query is taking too long.

This patch adds a way for the user to specify a custom agent timeout
that is applied to all agent commands.

One special case to note here is the 'guest-sync' command. 'guest-sync'
is issued internally prior to calling any other command. (For example,
when libvirt wants to call 'guest-get-fsinfo', we first call
'guest-sync' and then call 'guest-get-fsinfo').

Previously, the 'guest-sync' command used a 5-second timeout
(VIR_DOMAIN_QEMU_AGENT_COMMAND_DEFAULT), whereas the actual command that
followed always blocked indefinitely
(VIR_DOMAIN_QEMU_AGENT_COMMAND_BLOCK). As part of this patch, if a
custom timeout is specified that is shorter than
5 seconds,  this new timeout also used for 'guest-sync'. If there is no
custom timeout or if the custom timeout is longer than 5 seconds, we
will continue to use the 5-second timeout.

See https://bugzilla.redhat.com/show_bug.cgi?id=1705426 for additional details.

Signed-off-by: Jonathon Jongsma 
---
  include/libvirt/libvirt-qemu.h |  2 +
  src/driver-hypervisor.h|  5 +++
  src/libvirt-qemu.c | 40 
  src/libvirt_qemu.syms  |  4 ++
  src/qemu/qemu_agent.c  | 69 +++---
  src/qemu/qemu_agent.h  |  3 ++
  src/qemu/qemu_driver.c | 24 
  src/qemu_protocol-structs  |  8 
  src/remote/qemu_protocol.x | 18 -
  src/remote/remote_driver.c |  1 +
  10 files changed, 143 insertions(+), 31 deletions(-)


[...]


@@ -2737,3 +2730,19 @@ qemuAgentGetTimezone(qemuAgentPtr mon,
  
  return 0;

  }
+
+int
+qemuAgentSetTimeout(qemuAgentPtr mon,
+int timeout)
+{
+if (timeout < VIR_DOMAIN_QEMU_AGENT_COMMAND_MIN) {
+virReportError(VIR_ERR_INVALID_ARG,
+   _("guest agent timeout '%d' is "
+ "less than the minimum '%d'"),
+   timeout, VIR_DOMAIN_QEMU_AGENT_COMMAND_MIN);


This error is misleading as -1 and -2 are special values and not actual
timeout.


+return -1;
+}
+
+mon->timeout = timeout;
+return 0;
+}


[...]


diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 1e041a8bac..09251cc9e2 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -23434,6 +23434,29 @@ qemuDomainGetGuestInfo(virDomainPtr dom,
  return ret;
  }
  
+static int

+qemuDomainQemuAgentSetTimeout(virDomainPtr dom,
+  int timeout)
+{
+virDomainObjPtr vm = NULL;
+qemuAgentPtr agent;
+int ret = -1;
+
+if (!(vm = qemuDomObjFromDomain(dom)))
+goto cleanup;
+
+if (virDomainQemuAgentSetTimeoutEnsureACL(dom->conn, vm->def) < 0)
+goto cleanup;
+
+agent = qemuDomainObjEnterAgent(vm);


You must acquire a job on @vm if you want to call this:

/*
  * obj must be locked before calling
  *
  * To be called immediately before any QEMU agent API call.
  * Must have already called qemuDomainObjBeginAgentJob() or
  * qemuDomainObjBeginJobWithAgent() and checked that the VM is
  * still active.
  *
  * To be followed with qemuDomainObjExitAgent() once complete
  */
qemuAgentPtr
qemuDomainObjEnterAgent(virDomainObjPtr obj)



+ret = qemuAgentSetTimeout(agent, timeout);
+qemuDomainObjExitAgent(vm, agent);


Also this API is inherently racy if you have two clients setting the
timeout and it will influence calls of a different client.


How is this different to one client calling virDomainCreate() whilst the 
other calls virDomainDestroy() over the same domain? I believe our APIs 
are build on assumption that management layer calls only those APIs and 
in the right sequence so that desired state of things is achieved. If 
we'd have two management apps competing over the same domain then agent 
timeout is one of the smaller problems IMO.




IMO the only reasonable approach is to add new APIs which have a
'timeout' parameter for any agent API which requires tunable timeout to
prevent any races and unexpected behaviour.


Thing is, we don't tell users which APIs will talk to monitor and which 
will talk to the agent. For instance, qemuDomainShutdownFlags() prefers 
agent, but it wasn't always like that. And even if we did tell users 
that, we would need to document that @timeout applies to guest agent 
only and might not be honoured if API decides to talk to monitor.




Other possibility may be to add a qemu config file option for this but
that is not really dynamic.


I'd rather see this patch in than yet another qemu.conf knob.

However, reading the 

Re: [libvirt] [RFC] Add API to change qemu agent response timeout

2019-10-04 Thread Daniel P . Berrangé
On Thu, Oct 03, 2019 at 04:52:12PM -0500, Jonathon Jongsma wrote:
> Some layered products such as oVirt have requested a way to avoid being
> blocked by guest agent commands when querying a loaded vm. For example,
> many guest agent commands are polled periodically to monitor changes,
> and rather than blocking the calling process, they'd prefer to simply
> time out when an agent query is taking too long.
> 
> This patch adds a way for the user to specify a custom agent timeout
> that is applied to all agent commands.
> 
> One special case to note here is the 'guest-sync' command. 'guest-sync'
> is issued internally prior to calling any other command. (For example,
> when libvirt wants to call 'guest-get-fsinfo', we first call
> 'guest-sync' and then call 'guest-get-fsinfo').
> 
> Previously, the 'guest-sync' command used a 5-second timeout
> (VIR_DOMAIN_QEMU_AGENT_COMMAND_DEFAULT), whereas the actual command that
> followed always blocked indefinitely
> (VIR_DOMAIN_QEMU_AGENT_COMMAND_BLOCK). As part of this patch, if a
> custom timeout is specified that is shorter than
> 5 seconds,  this new timeout also used for 'guest-sync'. If there is no
> custom timeout or if the custom timeout is longer than 5 seconds, we
> will continue to use the 5-second timeout.
> 
> See https://bugzilla.redhat.com/show_bug.cgi?id=1705426 for additional 
> details.
> 
> Signed-off-by: Jonathon Jongsma 
> ---
>  include/libvirt/libvirt-qemu.h |  2 +

For something that is fullfilling a RFE from oVirt, adding an API in
libvirt-qemu.h is not a viable solution.

We class libvirt-qemu.h API usage as unsupported, as it is a
providing a backdoor around libvirt, for dev/test scenarios
only.

For anything that is to be used in production, it has to be
fully supported in the main libvirt API or XML schema.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC] Add API to change qemu agent response timeout

2019-10-04 Thread Daniel P . Berrangé
On Fri, Oct 04, 2019 at 08:14:39AM +0200, Peter Krempa wrote:
> On Thu, Oct 03, 2019 at 16:52:12 -0500, Jonathon Jongsma wrote:
> > Some layered products such as oVirt have requested a way to avoid being
> > blocked by guest agent commands when querying a loaded vm. For example,
> > many guest agent commands are polled periodically to monitor changes,
> > and rather than blocking the calling process, they'd prefer to simply
> > time out when an agent query is taking too long.
> > 
> > This patch adds a way for the user to specify a custom agent timeout
> > that is applied to all agent commands.
> > 
> > One special case to note here is the 'guest-sync' command. 'guest-sync'
> > is issued internally prior to calling any other command. (For example,
> > when libvirt wants to call 'guest-get-fsinfo', we first call
> > 'guest-sync' and then call 'guest-get-fsinfo').
> > 
> > Previously, the 'guest-sync' command used a 5-second timeout
> > (VIR_DOMAIN_QEMU_AGENT_COMMAND_DEFAULT), whereas the actual command that
> > followed always blocked indefinitely
> > (VIR_DOMAIN_QEMU_AGENT_COMMAND_BLOCK). As part of this patch, if a
> > custom timeout is specified that is shorter than
> > 5 seconds,  this new timeout also used for 'guest-sync'. If there is no
> > custom timeout or if the custom timeout is longer than 5 seconds, we
> > will continue to use the 5-second timeout.
> > 
> > See https://bugzilla.redhat.com/show_bug.cgi?id=1705426 for additional 
> > details.
> > 
> > Signed-off-by: Jonathon Jongsma 


> > @@ -2737,3 +2730,19 @@ qemuAgentGetTimezone(qemuAgentPtr mon,
> >  
> >  return 0;
> >  }
> > +
> > +int
> > +qemuAgentSetTimeout(qemuAgentPtr mon,
> > +int timeout)
> > +{
> > +if (timeout < VIR_DOMAIN_QEMU_AGENT_COMMAND_MIN) {
> > +virReportError(VIR_ERR_INVALID_ARG,
> > +   _("guest agent timeout '%d' is "
> > + "less than the minimum '%d'"),
> > +   timeout, VIR_DOMAIN_QEMU_AGENT_COMMAND_MIN);
> 
> This error is misleading as -1 and -2 are special values and not actual
> timeout.
> 
> > +return -1;
> > +}
> > +
> > +mon->timeout = timeout;
> > +return 0;
> > +}
> 
> [...]
> 
> > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> > index 1e041a8bac..09251cc9e2 100644
> > --- a/src/qemu/qemu_driver.c
> > +++ b/src/qemu/qemu_driver.c
> > @@ -23434,6 +23434,29 @@ qemuDomainGetGuestInfo(virDomainPtr dom,
> >  return ret;
> >  }
> >  
> > +static int
> > +qemuDomainQemuAgentSetTimeout(virDomainPtr dom,
> > +  int timeout)
> > +{
> > +virDomainObjPtr vm = NULL;
> > +qemuAgentPtr agent;
> > +int ret = -1;
> > +
> > +if (!(vm = qemuDomObjFromDomain(dom)))
> > +goto cleanup;
> > +
> > +if (virDomainQemuAgentSetTimeoutEnsureACL(dom->conn, vm->def) < 0)
> > +goto cleanup;
> > +
> > +agent = qemuDomainObjEnterAgent(vm);
> 
> You must acquire a job on @vm if you want to call this:
> 
> /*
>  * obj must be locked before calling
>  *
>  * To be called immediately before any QEMU agent API call.
>  * Must have already called qemuDomainObjBeginAgentJob() or
>  * qemuDomainObjBeginJobWithAgent() and checked that the VM is
>  * still active.
>  *
>  * To be followed with qemuDomainObjExitAgent() once complete
>  */
> qemuAgentPtr
> qemuDomainObjEnterAgent(virDomainObjPtr obj)
> 
> 
> > +ret = qemuAgentSetTimeout(agent, timeout);
> > +qemuDomainObjExitAgent(vm, agent);
> 
> Also this API is inherently racy if you have two clients setting the
> timeout and it will influence calls of a different client.
> 
> IMO the only reasonable approach is to add new APIs which have a
> 'timeout' parameter for any agent API which requires tunable timeout to
> prevent any races and unexpected behaviour.
> 
> Other possibility may be to add a qemu config file option for this but
> that is not really dynamic.

I guess the key question is whether we actually have a compelling
use case to vary the timeout on a per-command basis.

If not, then we could do fine with a global config that is either
recorded in the domain XML, or in the global QEMU config.

The possible causes of slowness are

 - Host is overloaded so the guest is not being scheduled
 - Guest is overloaded so the agent is not being scheduled
 - The agent has crash/deadlocked
 - The agent is maliciously not responding
 - The command genuinely takes a long time to perform its action

The first 4 of those are fine with a global timeout either on guest or
in the driver.

Only the last one really pushes for having this per-public API.

Looking commands, ones I think can take a long time are

 - FS trim - this can run for many minutes if there's alot to trim
 - FS freeze - this might need to wait for apps to quiesce their I/O IIUC
 - Guest exec - it can run any command

The latter isn't used in libvirt, can be be run via the QGA passthrough
api in libvirt-qemu.so

So sadly I think 

Re: [libvirt] [PATCH] virsh: fixed handling of targetless disks (e.g. empty CDROM) in 'domblkinfo' cmd

2019-10-04 Thread Pavel Mores
On Thu, Oct 03, 2019 at 06:25:40PM +0200, Ján Tomko wrote:
> On Thu, Oct 03, 2019 at 02:45:36PM +0200, Pavel Mores wrote:
> > On Thu, Oct 03, 2019 at 01:09:55PM +0200, Pavel Mores wrote:
> > > On Thu, Oct 03, 2019 at 11:54:44AM +0200, Ján Tomko wrote:
> > > > On Wed, Oct 02, 2019 at 04:21:52PM +0200, Pavel Mores wrote:
> > > > > On Wed, Oct 02, 2019 at 08:34:10AM -0300, Daniel Henrique Barboza 
> > > > > wrote:
> > > > > >
> > > > > >
> > > > > > On 10/1/19 11:45 AM, Pavel Mores wrote:
> > > > > > > virDomainGetBlockInfo() returns error if called on a disk with no 
> > > > > > > target
> > > > > > > (a targetless disk might be a removable media drive with no media 
> > > > > > > in it,
> > > > > > > for instance an empty CDROM drive).
> > > > > > >
> > > > > > > So far this caused the virsh domblkinfo --all command to abort 
> > > > > > > and ignore
> > > > > > > any remaining (not yet displayed) disk devices.  This patch fixes 
> > > > > > > it by
> > > > > > > ignoring virDomainGetBlockInfo() errors for CDROM and floppy 
> > > > > > > drives,
> > > > > > > similar to how it's done for network drives.
> > > > > > >
> > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1619625
> > > > > > >
> > > > > > > Signed-off-by: Pavel Mores 
> > > > > > > ---
> > > > > > >   tools/virsh-domain-monitor.c | 11 +++
> > > > > > >   1 file changed, 11 insertions(+)
> > > > > > >
> > > > > > > diff --git a/tools/virsh-domain-monitor.c 
> > > > > > > b/tools/virsh-domain-monitor.c
> > > > > > > index 0e2c4191d7..0f495c1a3f 100644
> > > > > > > --- a/tools/virsh-domain-monitor.c
> > > > > > > +++ b/tools/virsh-domain-monitor.c
> > > > > > > @@ -473,6 +473,7 @@ cmdDomblkinfo(vshControl *ctl, const vshCmd 
> > > > > > > *cmd)
> > > > > > >   char *cap = NULL;
> > > > > > >   char *alloc = NULL;
> > > > > > >   char *phy = NULL;
> > > > > > > +char *device_type = NULL;
> > > > > >
> > > > > > I believe you need to use VIR_AUTO(char *) here. Even considering 
> > > > > > that
> > > > > > the macro will be deprecated in the near future for glib stuff, by 
> > > > > > using
> > > > > > VIR_AUTO here:
> > > > > >
> > > > > > - you'll make it easier to introduce the glib replacement, since
> > > > > > s/VIR_AUTO/ is a trivial change;
> > > > > >
> > > > > > - you won't be adding more VIR_FREE() on top of existing cases that 
> > > > > > will
> > > > > > need to be addressed in the future.
> > > > >
> > > > > Was that the consensus from the migration debate?  If so I'm happy to 
> > > > > fix my
> > > > > patch.
> > > >
> > > > IIUC the consensus was that we will switch to g_auto eventually,
> > > > but it's a simple string replace that will be dealt with by whoever
> > > > pushes the patch.
> > > >
> > > > Also, if you go with any of the two approaches I suggested (ignoring all
> > > > errors or using the XPathBoolean), you don't need to introduce a new
> > > > allocated variable
> > > 
> > > Cool, so if there aren't any objections, I'll implement the XPathBoolean
> > > variant and submit v2.
> > 
> > As it turns out, the thing with the virDomainGetBlockInfo() call is that 
> > even
> > if it fails, it does initialise the virDomainBlockInfo structure enough to 
> > be
> > displayed in the table (even if just as a target name and dashes).
> 
> The https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockInfo
> only contains numbers, not the target name.
> 
> The target name is output by the 'vshTableRowAppend' command.
> 
> > 
> > It follows apparently that if we go the virXPathBoolean() way and skip the
> > virDomainGetBlockInfo() call, we can't show the sourceless device at all, we
> > can't even indicate that it exists.
> > 
> > Is that acceptable?  Or is there another way to fill in virDomainBlockInfo?
> > 
> 
> cmdDomblkinfoGet happily handles a virDomainBlockInfo with all zeroes,
> so as long as info is reset to zeroes on every loop iteration, it's okay
> to execute the rest of the loop even if we did not call virDomainGetBlockInfo

You're right, the problem wasn't actually caused by the virDomainBlockInfo
instance.

I'll post a v2 shortly.

Thanks!

pvl

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] qemu: capabilities: Fill in bochs-display info

2019-10-04 Thread Erik Skultety
On Thu, Oct 03, 2019 at 03:28:40PM +0200, Fabiano Fidêncio wrote:
> 086c19d69 added bochs-display capability but didn't fill in the info for
> domain capabilities.
>
> Signed-off-by: Fabiano Fidêncio 
> ---
If the bochs-was added in this release, then I'd say we should put it right in,
but since we're already one release late, I think it can way 1 more :).
Nevertheless:

Reviewed-by: Erik Skultety 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [RFC] Add API to change qemu agent response timeout

2019-10-04 Thread Peter Krempa
On Thu, Oct 03, 2019 at 16:52:12 -0500, Jonathon Jongsma wrote:
> Some layered products such as oVirt have requested a way to avoid being
> blocked by guest agent commands when querying a loaded vm. For example,
> many guest agent commands are polled periodically to monitor changes,
> and rather than blocking the calling process, they'd prefer to simply
> time out when an agent query is taking too long.
> 
> This patch adds a way for the user to specify a custom agent timeout
> that is applied to all agent commands.
> 
> One special case to note here is the 'guest-sync' command. 'guest-sync'
> is issued internally prior to calling any other command. (For example,
> when libvirt wants to call 'guest-get-fsinfo', we first call
> 'guest-sync' and then call 'guest-get-fsinfo').
> 
> Previously, the 'guest-sync' command used a 5-second timeout
> (VIR_DOMAIN_QEMU_AGENT_COMMAND_DEFAULT), whereas the actual command that
> followed always blocked indefinitely
> (VIR_DOMAIN_QEMU_AGENT_COMMAND_BLOCK). As part of this patch, if a
> custom timeout is specified that is shorter than
> 5 seconds,  this new timeout also used for 'guest-sync'. If there is no
> custom timeout or if the custom timeout is longer than 5 seconds, we
> will continue to use the 5-second timeout.
> 
> See https://bugzilla.redhat.com/show_bug.cgi?id=1705426 for additional 
> details.
> 
> Signed-off-by: Jonathon Jongsma 
> ---
>  include/libvirt/libvirt-qemu.h |  2 +
>  src/driver-hypervisor.h|  5 +++
>  src/libvirt-qemu.c | 40 
>  src/libvirt_qemu.syms  |  4 ++
>  src/qemu/qemu_agent.c  | 69 +++---
>  src/qemu/qemu_agent.h  |  3 ++
>  src/qemu/qemu_driver.c | 24 
>  src/qemu_protocol-structs  |  8 
>  src/remote/qemu_protocol.x | 18 -
>  src/remote/remote_driver.c |  1 +
>  10 files changed, 143 insertions(+), 31 deletions(-)

[...]

> @@ -2737,3 +2730,19 @@ qemuAgentGetTimezone(qemuAgentPtr mon,
>  
>  return 0;
>  }
> +
> +int
> +qemuAgentSetTimeout(qemuAgentPtr mon,
> +int timeout)
> +{
> +if (timeout < VIR_DOMAIN_QEMU_AGENT_COMMAND_MIN) {
> +virReportError(VIR_ERR_INVALID_ARG,
> +   _("guest agent timeout '%d' is "
> + "less than the minimum '%d'"),
> +   timeout, VIR_DOMAIN_QEMU_AGENT_COMMAND_MIN);

This error is misleading as -1 and -2 are special values and not actual
timeout.

> +return -1;
> +}
> +
> +mon->timeout = timeout;
> +return 0;
> +}

[...]

> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 1e041a8bac..09251cc9e2 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -23434,6 +23434,29 @@ qemuDomainGetGuestInfo(virDomainPtr dom,
>  return ret;
>  }
>  
> +static int
> +qemuDomainQemuAgentSetTimeout(virDomainPtr dom,
> +  int timeout)
> +{
> +virDomainObjPtr vm = NULL;
> +qemuAgentPtr agent;
> +int ret = -1;
> +
> +if (!(vm = qemuDomObjFromDomain(dom)))
> +goto cleanup;
> +
> +if (virDomainQemuAgentSetTimeoutEnsureACL(dom->conn, vm->def) < 0)
> +goto cleanup;
> +
> +agent = qemuDomainObjEnterAgent(vm);

You must acquire a job on @vm if you want to call this:

/*
 * obj must be locked before calling
 *
 * To be called immediately before any QEMU agent API call.
 * Must have already called qemuDomainObjBeginAgentJob() or
 * qemuDomainObjBeginJobWithAgent() and checked that the VM is
 * still active.
 *
 * To be followed with qemuDomainObjExitAgent() once complete
 */
qemuAgentPtr
qemuDomainObjEnterAgent(virDomainObjPtr obj)


> +ret = qemuAgentSetTimeout(agent, timeout);
> +qemuDomainObjExitAgent(vm, agent);

Also this API is inherently racy if you have two clients setting the
timeout and it will influence calls of a different client.

IMO the only reasonable approach is to add new APIs which have a
'timeout' parameter for any agent API which requires tunable timeout to
prevent any races and unexpected behaviour.

Other possibility may be to add a qemu config file option for this but
that is not really dynamic.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list