Re: "default" watchdog device - ?

2022-04-08 Thread Nir Soffer
On Tue, Apr 5, 2022 at 7:27 PM lejeczek  wrote:
>
>
>
> On 29/03/2022 20:25, Nir Soffer wrote:
> > On Wed, Mar 16, 2022 at 1:55 PM lejeczek  wrote:
> >>
> >>
> >> On 15/03/2022 11:21, Daniel P. Berrangé wrote:
> >>> On Tue, Mar 15, 2022 at 10:39:50AM +, lejeczek wrote:
> >>>> Hi guys.
> >>>>
> >>>> Without explicitly/manually adding a watchdog device to a VM, the VM
> >>>> (CentOS 8 Stream, 4.18.0-365.el8.x86_64) shows that '/dev/watchdog' exists.
> >>>> To double-check - 'dumpxml' does not show any such device - so what kind
> >>>> of 'watchdog' is that?
> >>> The kernel can always provide a pure software watchdog IIRC. It can be
> >>> useful if a userspace app wants a watchdog. The limitation is that it
> >>> relies on the kernel remaining functional, as there's no hardware
> >>> backing it up.
> >>>
> >>> Regards,
> >>> Daniel
> >> On a related note - the 'i6300esb' watchdog, which I tested
> >> and believe is working.
> >> I often get this in my VMs' 'dmesg':
> >> ...
> >> watchdog: BUG: soft lockup - CPU#0 stuck for xxxs! [swapper/0:0]
> >> rcu: INFO: rcu_sched self-detected stall on CPU
> >> ...
> >> The above is from Ubuntu and CentOS alike, and when this
> >> happens the console via VNC responds only until the first
> >> 'Enter', then becomes non-responsive.
> >> This happens after the VM(s) were migrated between hosts, but
> >> anyway...
> >> I do not see what I expected from the 'watchdog' - there is no
> >> action whatsoever, though it should be 'reset'. The VM remains
> >> in such a 'frozen' state forever.
> >>
> >> any & all shared thoughts much appreciated.
> >> L.
> > You need to run some userspace tool that will open the watchdog
> > device, and pet it periodically, telling the kernel that userspace is alive.
> >
> > If this tool stops petting the watchdog, maybe because of a soft lockup
> > or other trouble, the watchdog device will reset the VM.
> >
> > watchdog(8) may be the tool you need.
> >
> > See also
> > https://www.kernel.org/doc/Documentation/watchdog/watchdog-api.rst
> >
> > Nir
> >
> I do not think that the 'i6300esb' watchdog works under those
> soft-lockups, whether it's on the qemu or the OS end I cannot say.
> With:
> <watchdog model='i6300esb' action='reset'/>
> in the dom xml, the OS sees:
> -> $ llr /dev/watchdog*
> crw---. 1 root root  10, 130 Apr  5 16:59 /dev/watchdog
> crw---. 1 root root 248,   0 Apr  5 16:59 /dev/watchdog0
> crw---. 1 root root 248,   1 Apr  5 16:59 /dev/watchdog1
> and
> -> $ wdctl
> Device:        /dev/watchdog
> Identity:      i6300ESB timer [version 0]
> Timeout:       30 seconds
> Pre-timeout:    0 seconds
> FLAG           DESCRIPTION                STATUS   BOOT-STATUS
> KEEPALIVEPING  Keep alive ping reply           1             0
> MAGICCLOSE     Supports magic close char       0             0
> SETTIMEOUT     Set timeout (in seconds)        0             0
>
> If the HW watchdog worked, then 'i6300esb' should reset
> the VM if nothing is pinging the watchdog - I read that it's
> possible to exit the 'software' watchdog cleanly and not cause
> the HW watchdog to take action. I do not know if that's what
> happens here when I just 'systemctl stop watchdog'.
> In '/etc/watchdog.conf' I do not point to any specific
> device, which I believe lets watchdogd do its thing.
> A simple test:
> -> $ cat >> /dev/watchdog
> and pressing 'Enter' twice
> does invoke the 'reset' action, so per 'wdctl' I was led to
> believe the HW watchdog is working. But!...
> The main issue I have is those "soft lockups" where the VM's OS
> becomes frozen but there is nothing from the watchdog, no action -
> though while the VM is in such a frozen state the host shows high
> CPU for the VM.
>
> I do not do anything fancy, so I really wonder if what I see is
> that rare.
> Soft-lockups occur, I think, usually - though I cannot say
> exclusively - during or after VM live-migration.
>
> thanks, L.

On my fedora 35 vm, I see that /dev/watchdog0 is the right device:

# wdctl
Device:        /dev/watchdog0
Identity:      i6300ESB timer [version 0]
Timeout:       30 seconds
Pre-timeout:    0 seconds
FLAG           DESCRIPTION                STATUS   BOOT-STATUS
KEEPALIVEPING  Keep alive ping reply           1             0
MAGICCLOSE     Supports magic close char       0             0
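When several watchdog nodes exist, the kernel's sysfs view can be used to
tell them apart. A small sketch, assuming a kernel built with watchdog sysfs
support (CONFIG_WATCHDOG_SYSFS):

import os

SYSFS = "/sys/class/watchdog"

# Print each watchdog node with its driver identity, so the hardware-backed
# i6300ESB device can be told apart from the kernel's software watchdog.
for name in sorted(os.listdir(SYSFS)):
    identity_path = os.path.join(SYSFS, name, "identity")
    try:
        with open(identity_path) as f:
            identity = f.read().strip()
    except (IOError, OSError):
        identity = "unknown"
    print("/dev/%s: %s" % (name, identity))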

Re: "default" watchdog device - ?

2022-03-29 Thread Nir Soffer
On Wed, Mar 16, 2022 at 1:55 PM lejeczek  wrote:
>
>
>
> On 15/03/2022 11:21, Daniel P. Berrangé wrote:
> > On Tue, Mar 15, 2022 at 10:39:50AM +, lejeczek wrote:
> >> Hi guys.
> >>
> >> Without explicitly/manually adding a watchdog device to a VM, the VM (CentOS
> >> 8 Stream, 4.18.0-365.el8.x86_64) shows that '/dev/watchdog' exists.
> >> To double-check - 'dumpxml' does not show any such device - so what kind of
> >> 'watchdog' is that?
> > The kernel can always provide a pure software watchdog IIRC. It can be
> > useful if a userspace app wants a watchdog. The limitation is that it
> > relies on the kernel remaining functional, as there's no hardware
> > backing it up.
> >
> > Regards,
> > Daniel
> On a related note - the 'i6300esb' watchdog, which I tested
> and believe is working.
> I often get this in my VMs' 'dmesg':
> ...
> watchdog: BUG: soft lockup - CPU#0 stuck for xxxs! [swapper/0:0]
> rcu: INFO: rcu_sched self-detected stall on CPU
> ...
> The above is from Ubuntu and CentOS alike, and when this
> happens the console via VNC responds only until the first
> 'Enter', then becomes non-responsive.
> This happens after the VM(s) were migrated between hosts, but
> anyway...
> I do not see what I expected from the 'watchdog' - there is no
> action whatsoever, though it should be 'reset'. The VM remains
> in such a 'frozen' state forever.
>
> any & all shared thoughts much appreciated.
> L.

You need to run some userspace tool that will open the watchdog
device, and pet it periodically, telling the kernel that userspace is alive.

If this tool stops petting the watchdog, maybe because of a soft lockup
or other trouble, the watchdog device will reset the VM.

watchdog(8) may be the tool you need.

See also
https://www.kernel.org/doc/Documentation/watchdog/watchdog-api.rst

Nir
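For illustration, a minimal sketch of what such a petting tool does, following
the kernel watchdog API linked above; the device path, interval, and iteration
count are only examples:

import os
import time

DEVICE = "/dev/watchdog0"     # use the hardware-backed node reported by wdctl

# Opening the device arms the timer; every write pets it. Writing 'V' before
# closing is the "magic close" that asks the driver to disarm cleanly instead
# of resetting the machine when the file descriptor is closed.
fd = os.open(DEVICE, os.O_WRONLY)
try:
    for _ in range(60):
        os.write(fd, b"\0")   # pet well within the 30 second i6300esb timeout
        time.sleep(10)
finally:
    os.write(fd, b"V")        # magic close: stop petting without triggering a reset
    os.close(fd)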



Re: Libgfapi gone too?

2021-12-11 Thread Nir Soffer
On Fri, Dec 10, 2021 at 7:07 PM Dmitry Melekhov  wrote:
>
>
> 10.12.2021 18:27, lejeczek пишет:
> > really??
> > That was one of the most useful & practical bits - having libvirt & qemu
> > make use of GlusterFS without having to "expose" GF volumes to the
> > filesystem. (Everybody I know who does virt+glusterfs has it that way.)
> >
> >
> I guess this will be available only in RHEV :-)

Even in RHV (RHEV was renamed to RHV a few years ago) native Gluster
disk was never fully supported...

>
> or
>
> in-kernel glusterfs client... what is it? I know only about fuse.

... and fuse is the only fully supported way in RHV.

Nir




Re: Libgfapi gone too?

2021-12-11 Thread Nir Soffer
On Sun, Dec 12, 2021 at 12:51 AM Nir Soffer  wrote:
>
> On Fri, Dec 10, 2021 at 7:18 PM Dmitry Melekhov  wrote:
> >
> >
> > 10.12.2021 21:10, Daniel P. Berrangé пишет:
> > > On Fri, Dec 10, 2021 at 09:03:55PM +0400, Dmitry Melekhov wrote:
> > >> 10.12.2021 18:27, lejeczek пишет:
> > >>> really??
> > >>> That was one of the most useful & practical bits - having libvirt & qemu
> > >>> make use of GlusterFS without having to "expose" GF volumes to the
> > >>> filesystem. (Everybody I know who does virt+glusterfs has it that way.)
> > >>>
> > >>>
> > >> I guess this will be available only in RHEV :-)
> > > RHEV is based on RHEL-8, so changes in RHEL-9 don't impact it.
> > >
> > I'm sure next RHEV will be based on 9 :-)
>
> There will be no RHV on RHEL 9, but we are working on porting oVirt to CentOS
> Stream 9, so it should work on RHEL 9. Since native Gluster support was
> removed in qemu and libvirt, oVirt will not be able to support it in 9.
>
> You can ask on de...@ovirt.org if you need more info.

Please see
https://access.redhat.com/support/policy/updates/rhev




Re: Libgfapi gone too?

2021-12-11 Thread Nir Soffer
On Fri, Dec 10, 2021 at 7:18 PM Dmitry Melekhov  wrote:
>
>
> 10.12.2021 21:10, Daniel P. Berrangé пишет:
> > On Fri, Dec 10, 2021 at 09:03:55PM +0400, Dmitry Melekhov wrote:
> >> 10.12.2021 18:27, lejeczek пишет:
> >>> really??
> >>> That was one of the most useful & practical bits - having libvirt & qemu
> >>> make use of GlusterFS without having to "expose" GF volumes to the
> >>> filesystem. (Everybody I know who does virt+glusterfs has it that way.)
> >>>
> >>>
> >> I guess this will be available only in RHEV :-)
> > RHEV is based on RHEL-8, so changes in RHEL-9 don't impact it.
> >
> I'm sure next RHEV will be based on 9 :-)

There will be no RHV on RHEL 9, but we are working on porting oVirt to CentOS
Stream 9, so it should work on RHEL 9. Since native Gluster support was
removed in qemu and libvirt, oVirt will not be able to support it in 9.

You can ask on de...@ovirt.org if you need more info.

Nir




Re: [ovirt-devel] Issue: Device path changed after adding disks to guest VM

2021-01-06 Thread Nir Soffer
 vda  /dev/mapper/3600140594af345ed76d42058f2b1a454
 vdb  /dev/mapper/360014050058f2f8a0474dc7a8a7cc6a5
 vdc  /dev/mapper/36001405b4d0c0b7544d47438b21296ef

In the guest:

# ls -lh /dev/disk/by-id/virtio-*
lrwxrwxrwx. 1 root root 9 Jan  6 09:42
/dev/disk/by-id/virtio-b97e68b2-87ea-45ca-9 -> ../../vda
lrwxrwxrwx. 1 root root 9 Jan  6 09:42
/dev/disk/by-id/virtio-d9a29187-f492-4a0d-a -> ../../vdb
lrwxrwxrwx. 1 root root 9 Jan  6 09:51
/dev/disk/by-id/virtio-e801c2e4-dc2e-4c53-b -> ../../vdc


Shutdown VM and start it again

# virsh -r dumpxml disk-mapping
...

  
  

  
  


  


  
  
  40018b33-2b11-4d10-82e4-604a5b135fb2
  
  
  


  
  

  
  
  
  b97e68b2-87ea-45ca-94fb-277d5b30baa2
  
  


  
  

  
  
  
  e801c2e4-dc2e-4c53-b17b-bf6de99f16ed
  
  


  
  

  
  
  
  d9a29187-f492-4a0d-aea2-7d5216c957d7
  
  

...

# virsh -r domblklist disk-mapping
 Target   Source
---
 sdc  -
 sda  
/rhev/data-center/mnt/blockSD/84dc4e3c-00fd-4263-84e8-fc2466e9/images/40018b33-2b11-4d10-82e4-604a5b135fb2/40f455c4-8c92-4f8f-91c2-991b0ddfc2f5
 vda  /dev/mapper/3600140594af345ed76d42058f2b1a454
 vdb  /dev/mapper/36001405b4d0c0b7544d47438b21296ef
 vdc  /dev/mapper/360014050058f2f8a0474dc7a8a7cc6a5


In the guest:

# ls -lh /dev/disk/by-id/virtio-*
lrwxrwxrwx. 1 root root 9 Jan  6 09:55
/dev/disk/by-id/virtio-b97e68b2-87ea-45ca-9 -> ../../vda
lrwxrwxrwx. 1 root root 9 Jan  6 09:55
/dev/disk/by-id/virtio-d9a29187-f492-4a0d-a -> ../../vdb
lrwxrwxrwx. 1 root root 9 Jan  6 09:55
/dev/disk/by-id/virtio-e801c2e4-dc2e-4c53-b -> ../../vdc

Comparing to state before reboot:

# virsh -r domblklist disk-mapping
 Target   Source
---
 sdc  -
 sda  
/rhev/data-center/mnt/blockSD/84dc4e3c-00fd-4263-84e8-fc2466e9/images/40018b33-2b11-4d10-82e4-604a5b135fb2/40f455c4-8c92-4f8f-91c2-991b0ddfc2f5
 vda  /dev/mapper/3600140594af345ed76d42058f2b1a454
 vdb  /dev/mapper/360014050058f2f8a0474dc7a8a7cc6a5
 vdc  /dev/mapper/36001405b4d0c0b7544d47438b21296ef

# ls -lh /dev/disk/by-id/virtio-*
lrwxrwxrwx. 1 root root 9 Jan  6 09:42
/dev/disk/by-id/virtio-b97e68b2-87ea-45ca-9 -> ../../vda
lrwxrwxrwx. 1 root root 9 Jan  6 09:42
/dev/disk/by-id/virtio-d9a29187-f492-4a0d-a -> ../../vdb
lrwxrwxrwx. 1 root root 9 Jan  6 09:51
/dev/disk/by-id/virtio-e801c2e4-dc2e-4c53-b -> ../../vdc

In the guest disks are mapped to the same device name.

It looks like libvirt domblklist is not correct - vdb and vdc are switched.
Peter, is this expected?

Nir
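For completeness, a small guest-side sketch of the lookup the by-id links above
provide: the serial (set by oVirt to the disk UUID) is stable, so applications
can resolve it to whatever vdX name the kernel picked on this boot:

import glob
import os

# Map each stable virtio serial to the device node it currently points at.
for link in sorted(glob.glob("/dev/disk/by-id/virtio-*")):
    serial = os.path.basename(link)[len("virtio-"):]
    device = os.path.realpath(link)
    print("%s -> %s" % (serial, device))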

>
> Joy
>
> On Wed, Dec 2, 2020 at 1:28 PM Nir Soffer  wrote:
>>
>> On Wed, Dec 2, 2020 at 10:27 AM Joy Li  wrote:
>> >
>> > Hi All,
>> >
>> > I'm facing the problem that after adding disks to a guest VM, the device
>> > target paths changed (my oVirt version is 4.3). For example:
>> >
>> > Before adding a disk:
>> >
>> > virsh # domblklist 
>> > Target   Source
>> > -
>> >  hdc  -
>> >  vda  /dev/mapper/3600a09803830386546244a546d494f53
>> >  vdb  /dev/mapper/3600a09803830386546244a546d494f54
>> >  vdc  /dev/mapper/3600a09803830386546244a546d494f55
>> >  vdd  /dev/mapper/3600a09803830386546244a546d494f56
>> >  vde  /dev/mapper/3600a09803830386546244a546d494f57
>> >  vdf  /dev/mapper/3600a09803830386546244a546d494f58
>> >
>> > After adding a disk, and then shutdown and start the VM:
>> >
>> > virsh # domblklist 
>> > Target   Source
>> > -
>> >  hdc  -
>> >  vda  /dev/mapper/3600a09803830386546244a546d494f53
>> >  vdb  /dev/mapper/3600a09803830386546244a546d494f54
>> >  vdc  /dev/mapper/3600a09803830386546244a546d494f6c
>> >  vdd  /dev/mapper/3600a09803830386546244a546d494f55
>> >  vde  /dev/mapper/3600a09803830386546244a546d494f56
>> >  vdf  /dev/mapper/3600a09803830386546244a546d494f57
>> >  vdg  /dev/mapper/3600a09803830386546244a546d494f58
>> >
>> > The devices' multipath doesn't map to the same target path as before, so 
>> > in my VM the /dev/vdc doesn't point to the old 
>> > /dev/mapper

Re: Libvirt driver iothread property for virtio-scsi disks

2020-11-04 Thread Nir Soffer
On Wed, Nov 4, 2020 at 6:42 PM Sergio Lopez  wrote:
>
> On Wed, Nov 04, 2020 at 05:48:40PM +0200, Nir Soffer wrote:
> > The docs[1] say:
> >
> > - The optional iothread attribute assigns the disk to an IOThread as 
> > defined by
> >   the range for the domain iothreads value. Multiple disks may be assigned 
> > to
> >   the same IOThread and are numbered from 1 to the domain iothreads value.
> >   Available for a disk device target configured to use "virtio" bus and 
> > "pci"
> >   or "ccw" address types. Since 1.2.8 (QEMU 2.1)
> >
> > Does it mean that virtio-scsi disks do not use iothreads?
>
> virtio-scsi disks can use iothreads, but they are configured in the
> scsi controller, not in the disk itself. All disks attached to the
> same controller will share the same iothread, but you can also attach
> multiple controllers.

Thanks, I found that we do use this in ovirt:


  
  
  


However the VMs in this setup are not created by oVirt, but manually using
libvirt. I'll make sure we configure the controller in the same way.
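A minimal sketch of the check Sergio's point suggests: read a domain's XML and
confirm the domain defines iothreads and that the virtio-scsi controller
carries a driver iothread attribute. The domain name here is only an example:

import xml.etree.ElementTree as ET
import libvirt

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("nfs-server")          # hypothetical domain name
root = ET.fromstring(dom.XMLDesc())

# <iothreads> must be present in the domain, and the scsi controller's
# <driver iothread='N'/> pins the controller (and all its disks) to one.
print("iothreads: %s" % root.findtext("iothreads"))
for ctrl in root.findall("./devices/controller[@type='scsi']"):
    driver = ctrl.find("driver")
    iothread = driver.get("iothread") if driver is not None else None
    print("scsi controller index=%s model=%s iothread=%s"
          % (ctrl.get("index"), ctrl.get("model"), iothread))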

> > I'm experiencing horrible performance using nested VMs (up to 2 levels of
> > nesting) when accessing NFS storage running on one of the VMs. The NFS
> > server is using a SCSI disk.
> >
> > My theory is:
> > - Writing to NFS server is very slow (too much nesting, slow disk)
> > - Not using iothreads (because we don't use virtio?)
> > - Guest CPU is blocked by slow I/O
>
> I would discard the lack of iothreads as the culprit. They do improve
> the performance, but without them the performance should be quite
> decent anyway. Probably something else is causing the trouble.
>
> I would do a step by step analysis, testing the NFS performance from
> outside the VM first, and then elaborating upwards from that.

Makes sense, thanks.



Re: Libvirt driver iothread property for virtio-scsi disks

2020-11-04 Thread Nir Soffer
On Wed, Nov 4, 2020 at 6:54 PM Daniel P. Berrangé  wrote:
>
> On Wed, Nov 04, 2020 at 05:48:40PM +0200, Nir Soffer wrote:
> > The docs[1] say:
> >
> > - The optional iothread attribute assigns the disk to an IOThread as 
> > defined by
> >   the range for the domain iothreads value. Multiple disks may be assigned 
> > to
> >   the same IOThread and are numbered from 1 to the domain iothreads value.
> >   Available for a disk device target configured to use "virtio" bus and 
> > "pci"
> >   or "ccw" address types. Since 1.2.8 (QEMU 2.1)
> >
> > Does it mean that virtio-scsi disks do not use iothreads?
> >
> > I'm experiencing horrible performance using nested VMs (up to 2 levels of
> > nesting) when accessing NFS storage running on one of the VMs. The NFS
> > server is using a SCSI disk.
>
> When you say  2 levels of nesting do you definitely have KVM enabled at
> all levels, or are you ending up using TCG emulation, because the latter
> would certainly explain terrible performance.

Good point, I'll check that out, thanks.
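A quick sketch of that check, to be run at each nesting level; the paths cover
the Intel and AMD kvm modules:

import os

# /dev/kvm missing at some level means qemu falls back to TCG emulation there.
print("/dev/kvm present: %s" % os.path.exists("/dev/kvm"))

# The nested parameter tells whether this level will expose VMX/SVM to guests.
for module in ("kvm_intel", "kvm_amd"):
    param = "/sys/module/%s/parameters/nested" % module
    if os.path.exists(param):
        with open(param) as f:
            print("%s nested: %s" % (module, f.read().strip()))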

> > My theory is:
> > - Writing to NFS server is very slow (too much nesting, slow disk)
> > - Not using iothreads (because we don't use virtio?)
> > - Guest CPU is blocked by slow I/O
>
> Regards,
> Daniel
> --
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
>




Libvirt driver iothread property for virtio-scsi disks

2020-11-04 Thread Nir Soffer
The docs[1] say:

- The optional iothread attribute assigns the disk to an IOThread as defined by
  the range for the domain iothreads value. Multiple disks may be assigned to
  the same IOThread and are numbered from 1 to the domain iothreads value.
  Available for a disk device target configured to use "virtio" bus and "pci"
  or "ccw" address types. Since 1.2.8 (QEMU 2.1)

Does it mean that virtio-scsi disks do not use iothreads?

I'm experiencing horrible performance using nested VMs (up to 2 levels of
nesting) when accessing NFS storage running on one of the VMs. The NFS
server is using a SCSI disk.

My theory is:
- Writing to NFS server is very slow (too much nesting, slow disk)
- Not using iothreads (because we don't use virtio?)
- Guest CPU is blocked by slow I/O

Does this make sense?

[1] https://libvirt.org/formatdomain.html#hard-drives-floppy-disks-cdroms

Nir



Re: [ovirt-users] Re: Testing ovirt 4.4.1 Nested KVM on Skylake-client (core i5) does not work

2020-09-14 Thread Nir Soffer
On Mon, Sep 14, 2020 at 8:42 AM Yedidyah Bar David  wrote:
>
> On Mon, Sep 14, 2020 at 12:28 AM wodel youchi  wrote:
> >
> > Hi,
> >
> > Thanks for the help, I think I found the solution using this link : 
> > https://www.berrange.com/posts/2018/06/29/cpu-model-configuration-for-qemu-kvm-on-x86-hosts/
> >
> > When executing virsh dumpxml on my oVirt hypervisor I saw that the mpx
> > flag was disabled, so I edited the XML file of the hypervisor VM and did
> > this: added the already-enabled features and enabled mpx with them. I
> > stopped/started my hypervisor VM and voila, the nested VM-Manager booted
> > successfully.
> >
> >
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> >   
> Thanks for the report!
>
> Would you like to open a bug about this?
>
> A possible fix is probably to pass relevant options to the
> virt-install command in ovirt-ansible-hosted-engine-setup.
> Either always - no idea what the implications are - or
> optionally, or even allow the user to pass arbitrary options.

I don't think we need to make such a change on our side. This seems like a
hard-to-reproduce libvirt bug.

The strange thing is that after playing with the XML generated by
virt-manager, using

[x] Copy host CPU configuration

Creating this XML:

  
Skylake-Client-IBRS
Intel



















  

Or using this XML in virt-manager:

  

Both work with these cluster CPU Types:

- Secure Intel Skylake Client Family
- Intel Skylake Client Family

I think the best place to discuss this is libvirt-users mailing list:
https://www.redhat.com/mailman/listinfo/libvirt-users

Nir
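A sketch of how to inspect what host-model would require on a given host,
parsing the same data as the virsh domcapabilities output quoted below; it
shows whether a feature such as mpx is part of the required set:

import xml.etree.ElementTree as ET
import libvirt

conn = libvirt.open("qemu:///system")
caps = ET.fromstring(conn.getDomainCapabilities())

# The host-model mode lists the model, vendor, and every feature libvirt
# would mark policy='require' when copying the host CPU configuration.
for mode in caps.findall("./cpu/mode"):
    if mode.get("name") != "host-model":
        continue
    features = sorted(f.get("name") for f in mode.findall("feature"))
    print("model: %s" % mode.findtext("model"))
    print("required features: %s" % ", ".join(features))
    print("mpx included: %s" % ("mpx" in features))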

> Thanks and best regards,
>
> >
> >
> > Regards.
> >
> > Le dim. 13 sept. 2020 à 19:47, Nir Soffer  a écrit :
> >>
> >> On Sun, Sep 13, 2020 at 8:32 PM wodel youchi  
> >> wrote:
> >> >
> >> > Hi,
> >> >
> >> > I've been using my core i5 6500 (skylake-client) for some time now to 
> >> > test oVirt on my machine.
> >> > However this is no longer the case.
> >> >
> >> > I am using Fedora 32 as my base system with nested-kvm enabled, when I 
> >> > try to install oVirt 4.4 as HCI single node, I get an error in the last 
> >> > phase which consists of copying the VM-Manager to the engine volume and 
> >> > boot it.
> >> > It is the boot that causes the problem, I get an error about the CPU :
> >> > the CPU is incompatible with host CPU: Host CPU does not provide 
> >> > required features: mpx
> >> >
> >> > This is the CPU part from virsh domcapabilities on my physical machine
> >> > 
> >> >
> >> >
> >> >  Skylake-Client-IBRS
> >> >  Intel
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >  
> >> >
> >> >
> >> >  qemu64
> >> >  qemu32
> >> >  phenom
> >> >  pentium3
> >> >  pentium2
> >> >  pentium
> >> >  n270
> >> >  kvm64
> >> >  kvm32
> >> >  coreduo
> >> >  core2duo
> >> >  athlon
> >> >  Westmere-IBRS
> >> >  Westmere
> >> >  Skylake-Server-IBRS
> >> >  Skylake-Server
> >> >  Skylake-Client-IBRS
> >> >  Skylake-Client
> >> >  SandyBridge-IBRS
> >> >  SandyBridge
> >> >  Penryn
> >> >  Opteron_G5
> >> >  Opteron_G4
> >> >  Opteron_G3
> >> >  Opteron_G2
> >> >  Opteron_G1
> >> >  Nehalem-IBRS
> >> >  Nehalem
> >> >  IvyBridge-IBRS
> >> >  IvyBridge
> >> >  Icelake-Server
> >> >  Icelake-Client
> >> >  Haswell-noTSX-IBRS
> >> >  Haswell-noTSX
> >> >  Has

Re: [libvirt-users] Starting VM fails with: "Setting different DAC user or group on /path... which is already in use" after upgrading to libvirt 5.6.0-1

2019-08-20 Thread Nir Soffer
On Tue, Aug 20, 2019 at 1:12 PM Michal Privoznik 
wrote:

> On 8/19/19 9:53 PM, Nir Soffer wrote:
> > Hi,
> >
> > I upgraded a Fedora 29 host using the virt-preview repo to
> > libvirt-daemon-5.6.0-1.fc29.x86_64
> > The host was using plain Fedora 29 without virt-preview before that.
> >
> > After the upgrade, starting some VMs that were running fine now fails with
> > this error:
> >
> > Error starting domain: internal error: child reported (status=125):
> > Requested operation is not valid: Setting different DAC user or group on
> > /home/libvirt/images/voodoo4-os.img which is already in use
> >
> > Traceback (most recent call last):
> >File "/usr/share/virt-manager/virtManager/asyncjob.py", line 75, in
> > cb_wrapper
> >  callback(asyncjob, *args, **kwargs)
> >File "/usr/share/virt-manager/virtManager/asyncjob.py", line 111, in
> tmpcb
> >  callback(*args, **kwargs)
> >File "/usr/share/virt-manager/virtManager/object/libvirtobject.py",
> line
> > 66, in newfn
> >  ret = fn(self, *args, **kwargs)
> >File "/usr/share/virt-manager/virtManager/object/domain.py", line
> 1279,
> > in startup
> >  self._backend.create()
> >File "/usr/lib64/python3.7/site-packages/libvirt.py", line 1089, in
> create
> >  if ret == -1: raise libvirtError ('virDomainCreate() failed',
> dom=self)
> > libvirt.libvirtError: internal error: child reported (status=125):
> > Requested operation is not valid: Setting different DAC user or group on
> > /home/libvirt/images/voodoo4-os.img which is already in use
> >
> > These VMs were created by creating one VM and then cloning it.
> >
> > I tried to delete the disks and add them back in one of the VMs, but the VM
> > still fails with the same error.
> >
> > I hope that someone has a clue what the issue is and how it can be fixed.
>
> How do you clone the vms?


Using the virt-manager "Clone..." command.

> The error message suggests that the image is
> in use - is it possible that you're trying to start two domains over the
> same disk?
>

No, the disks are different (these VMs were running for 2-3 weeks before
the upgrade).

# ls -lhiZ /home/libvirt/images/voodoo{4,5,8}-*.img
  247 -rw---. 1 root root system_u:object_r:virt_image_t:s0 20G Aug 17
03:55 /home/libvirt/images/voodoo4-gv0.img
37249 -rw---. 1 root root system_u:object_r:virt_image_t:s0 20G Aug 17
03:55 /home/libvirt/images/voodoo4-gv1.img
37252 -rw---. 1 root root system_u:object_r:virt_image_t:s0 50G Aug 17
03:52 /home/libvirt/images/voodoo4-os.img
37250 -rw---. 1 root root system_u:object_r:virt_image_t:s0 20G Aug 17
03:55 /home/libvirt/images/voodoo5-gv0.img
37281 -rw---. 1 root root system_u:object_r:virt_image_t:s0 20G Aug 17
03:55 /home/libvirt/images/voodoo5-gv1.img
  223 -rw---. 1 root root system_u:object_r:virt_image_t:s0 50G Aug 17
03:45 /home/libvirt/images/voodoo5-os.img
37253 -rw---. 1 root root system_u:object_r:virt_image_t:s0 20G Aug 17
03:55 /home/libvirt/images/voodoo8-gv0.img
37282 -rw---. 1 root root system_u:object_r:virt_image_t:s0 20G Aug 17
03:55 /home/libvirt/images/voodoo8-gv1.img
37251 -rw---. 1 root root system_u:object_r:virt_image_t:s0 50G Aug 17
03:46 /home/libvirt/images/voodoo8-os.img

I suspected that the disk serial is the issue - I'm using
the same serial for all "os" and "gvX" disks, which is used in the guest
to locate the right disk (e.g. /dev/disk/by-id/virtio-gv0).

I tried switching to unique serials (e.g. gv0 -> voodoo4-gv0) but I still see
the same error, so this must be something else.
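One way to test the "two domains over the same disk" hypothesis directly is to
ask libvirt which defined domains reference the image; a sketch, with the image
path taken from the listing above:

import xml.etree.ElementTree as ET
import libvirt

IMAGE = "/home/libvirt/images/voodoo4-os.img"

conn = libvirt.open("qemu:///system")
# Walk every defined domain and report the ones whose disks point at IMAGE.
for dom in conn.listAllDomains():
    root = ET.fromstring(dom.XMLDesc())
    for source in root.findall("./devices/disk/source"):
        if source.get("file") == IMAGE:
            print("%s (active: %s)" % (dom.name(), bool(dom.isActive())))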


>
> Michal
>

[libvirt-users] Starting VM fails with: "Setting different DAC user or group on /path... which is already in use" after upgrading to libvirt 5.6.0-1

2019-08-19 Thread Nir Soffer
Hi,

I upgraded a Fedora 29 host using the virt-preview repo to
libvirt-daemon-5.6.0-1.fc29.x86_64
The host was using plain Fedora 29 without virt-preview before that.

After the upgrade, starting some VMs that were running fine now fails with
this error:

Error starting domain: internal error: child reported (status=125):
Requested operation is not valid: Setting different DAC user or group on
/home/libvirt/images/voodoo4-os.img which is already in use

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 75, in
cb_wrapper
callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 111, in tmpcb
callback(*args, **kwargs)
  File "/usr/share/virt-manager/virtManager/object/libvirtobject.py", line
66, in newfn
ret = fn(self, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/object/domain.py", line 1279,
in startup
self._backend.create()
  File "/usr/lib64/python3.7/site-packages/libvirt.py", line 1089, in create
if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
libvirt.libvirtError: internal error: child reported (status=125):
Requested operation is not valid: Setting different DAC user or group on
/home/libvirt/images/voodoo4-os.img which is already in use

These VMs were created by creating one VM and then cloning it.

I tried to delete the disks and add them back in one of the VMs, but the VM
still fails with the same error.

I hope that someone has a clue what the issue is and how it can be fixed.

Here are some details about the setup:

vm1:


  voodoo4
  0b3aa57a-00b6-4e99-81f9-8f216f85ccaf
  voodoo4 (fedora 29, gluster)
  
http://libosinfo.org/xmlns/libvirt/domain/1.0";>
  http://fedoraproject.org/fedora/29"/>

  
  2097152
  2097152
  2
  
hvm

  
  



  
  

  
  



  
  destroy
  restart
  destroy
  


  
  
/usr/bin/qemu-kvm

  
  
  
  


  
  
  
  os
  


  
  
  
  gv0
  


  
  
  
  gv1
  


  


  



  
  
  


  
  
  


  
  
  


  
  
  


  
  
  


  
  
  


  
  
  


  
  
  


  


  
  
  
  


  

  


  


  
  


  
  


  




  


  


  
  


  


  


  


  /dev/urandom
  

  



vm 2:


  voodoo5
  8ded8ea2-6524-4fc0-94f6-31667338a5f2
  voodoo5 (fedora 29, gluster)
  
http://libosinfo.org/xmlns/libvirt/domain/1.0";>
  http://fedoraproject.org/fedora/29"/>

  
  2097152
  2097152
  2
  
hvm

  
  



  
  

  
  



  
  destroy
  restart
  destroy
  


  
  
/usr/bin/qemu-kvm

  
  
  
  os
  


  
  
  
  gv0
  


  
  
  
  gv1
  


  
  
  
  


  


  



  
  
  


  
  
  


  
  
  


  
  
  


  
  
  


  
  
  


  
  
  


  
  
  


  


  
  
  
  


  

  


  


  
  


  
  


  




  


  


  
  


  


  


  


  /dev/urandom
  

  



# ls -lhZ /home/libvirt/images/voodoo4*
-rw---. 1 root root system_u:object_r:virt_image_t:s0 20G Aug 17 03:55
/home/libvirt/images/voodoo4-gv0.img
-rw---. 1 root root system_u:object_r:virt_image_t:s0 20G Aug 17 03:55
/home/libvirt/images/voodoo4-gv1.img
-rw---. 1 root root system_u:object_r:virt_image_t:s0 50G Aug 17 03:52
/home/libvirt/images/voodoo4-os.img


cat /etc/libvirt/storage/images.xml


The related pool:


  images
  f7190095-947d-442b-b94b-4a99790795bc
  0
  0
  0
  
  
  
/home/libvirt/images

  0755
  -1
  -1

  


Nir

Re: [libvirt-users] stream finish throws exception via python API

2016-04-27 Thread Nir Soffer
On Apr 27, 2016, 5:37 PM, "Cole Robinson"  wrote:
>
> On 04/27/2016 04:26 AM, Daniel P. Berrange wrote:
> > On Tue, Apr 26, 2016 at 03:17:19PM -0400, Cole Robinson wrote:
> >> On 04/26/2016 02:56 PM, Nir Soffer wrote:
> >>> On Tue, Apr 26, 2016 at 4:37 PM, Cole Robinson 
wrote:
> >>>> On 04/26/2016 09:35 AM, Shahar Havivi wrote:
> >>>>> On 26.04.16 15:30, Shahar Havivi wrote:
> >>>>
> >>>> Libvirt doesn't invoke qemu-img check anywhere AFAIK, so if that's the only
> >>>> way to get that info, then it isn't available
> >>>
> >>> We would like to report progress for downloads, so we need to know in
> >>> advance what the size of the download is.
> >>>
> >>> I guess we can use an estimate (e.g. capacity * 1.1), or maybe someone
> >>> has a better idea how to estimate the download size?
> >>>
> >>
> >> Hmm, I didn't realize  for a qcow2 volume isn't the full size on
> >> disk, but instead the size of the virtual disk image. We should probably add
> >> an extra volume field to report the actual on disk size too. Please file a
> >> RHEL bug
> >
> > <allocation> is intended to give the actual size on disk.
> >
>
> Hmm, I see we do track a src->physical value via virstoragefile but that isn't
> reflected in the storage volume XML at all, there's no <physical> XML element.
> Should be simple to add, but someone on the oVirt side please file an RFE so
> it's properly prioritized

Sure, we will file a bug.
>
> Thanks,
> Cole

Re: [libvirt-users] stream finish throws exception via python API

2016-04-26 Thread Nir Soffer
On Tue, Apr 26, 2016 at 4:37 PM, Cole Robinson  wrote:
> On 04/26/2016 09:35 AM, Shahar Havivi wrote:
>> On 26.04.16 15:30, Shahar Havivi wrote:
>>> On 26.04.16 14:14, Shahar Havivi wrote:
 On 25.04.16 09:11, Cole Robinson wrote:
> On 04/25/2016 08:10 AM, Shahar Havivi wrote:
>> On 17.04.16 15:41, Shahar Havivi wrote:
>>> Hi,
>>> The following snippet works fine e.g. receiving the data but when 
>>> calling
>>> stream.finish() we get the following error:
>>>
>>> stream = con.newStream()
>>> vol.download(stream, 0, 0, 0)
>>> buf = stream.recv(1024)
>>> stream.finish()
>>>
>>> libvirt: I/O Stream Utils error : internal error: I/O helper exited 
>>> abnormally
>>> Traceback (most recent call last):
>>>   File "./helpers/kvm2ovirt", line 149, in 
>>> download_volume(vol, item[1], diskno, disksitems, pksize)
>>>   File "./helpers/kvm2ovirt", line 102, in download_volume
>>> stream.finish()
>>>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 5501, in 
>>> finish
>>> if ret == -1: raise libvirtError ('virStreamFinish() failed')
>>> libvirt.libvirtError: internal error: I/O helper exited abnormally
>>>
>
> The error message sucks, I'll send patches to improve it a little at 
> least.
>
> What's happening here is that you haven't read all the data you
> requested
> (it's vol.download(path, offset, length, flags), length == 0 means read 
> the
> whole file which I suspect you haven't done). In this case the iohelper
> program that libvirt uses won't complete feeding us all the data, and it 
> exits
> with SIGPIPE when we close the read end of the pipe.
>
> Now whether that should actually be an error condition is open to debate.
> virStreamFinish docs make it sound like it's legitimate to throw an error 
> if
> it appears that all requested data wasn't read.
 Thanks checking...
>>> Thanks Cole,
>>> I modified our script to check if stream.recv() returns zero and the finish
>>> works fine.
>>> Our API needs to know the number of bytes that you are going to stream back,
>>> which is not provided by the allocation and capacity.
>>> (ie the same as du -s /path/to/disk)
>>>
>>>   Shahar.
>> We need the size of the image "Image end offset" you get from "qemu-img 
>> check",
>> Does libvirt have an API for that?
>>
>
> Libvirt doesn't invoke qemu-img check anywhere AFAIK, so if that's the only
> way to get that info, then it isn't available

We would like to report progress for downloads, so we need to know in advance
what the size of the download is.

I guess we can use an estimate (e.g. capacity * 1.1), or maybe someone has
a better idea how to estimate the download size?
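One way to get the "image end offset" mentioned above without a libvirt API is
to ask qemu-img directly; a sketch, assuming a healthy qcow2 image (qemu-img
check exits non-zero when it finds problems):

import json
import subprocess

def image_end_offset(path):
    # "image-end-offset" in the JSON report is the highest offset in use,
    # which is a reasonable upper bound for the bytes a download will carry.
    out = subprocess.check_output(
        ["qemu-img", "check", "--output=json", path])
    return json.loads(out.decode("utf-8"))["image-end-offset"]

print(image_end_offset("/home/nsoffer/var/libvirt/images/tiny.qcow2"))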

A second issue: libvirt does not seem to understand sparseness, so we are
copying around gigabytes of zeros.

For example, I created a 102M empty image (using virt-manager):

$ qemu-img info /home/nsoffer/var/libvirt/images/tiny.qcow2
image: /home/nsoffer/var/libvirt/images/tiny.qcow2
file format: qcow2
virtual size: 102M (107374592 bytes)
disk size: 304K
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: true
refcount bits: 16
corrupt: false

Downloading it using the attached test script:

$ python download.py /home/nsoffer/var/libvirt/images/tiny.qcow2
written 107741184 bytes

$ qemu-img info download.qcow2
image: download.qcow2
file format: qcow2
virtual size: 102M (107374592 bytes)
disk size: 103M
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: true
refcount bits: 16
corrupt: false

Same operation using qemu-img convert:

$ qemu-img convert -O qcow2
/home/nsoffer/var/libvirt/images/tiny.qcow2 convert.qcow2

$ qemu-img info convert.qcow2
image: convert.qcow2
file format: qcow2
virtual size: 102M (107374592 bytes)
disk size: 196K
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false

So with libvirt we downloaded 102M, and qemu copied only 196K.

Nir
import sys

import libvirt

if len(sys.argv) < 2:
    print "Usage: download PATH"
    sys.exit(2)

path = sys.argv[1]

con = libvirt.open('qemu:///system')
vol = con.storageVolLookupByPath(path)
stream = con.newStream()
vol.download(stream, 0, 0, 0)

total = 0
with open('download.qcow2', 'wb', 1048576) as f:
    while True:
        buf = stream.recv(1048576)
        if len(buf) == 0:
            stream.finish()
            break
        f.write(buf)
        total += len(buf)

print 'written %d bytes' % total
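A variant of the receive loop above that at least keeps the local copy sparse:
all-zero chunks are skipped with a seek instead of being written, and the file
is truncated to the final size. The zeros still travel over the stream, but the
download no longer consumes gigabytes on disk:

total = 0
with open('download.qcow2', 'wb') as f:
    while True:
        buf = stream.recv(1048576)
        if len(buf) == 0:
            stream.finish()
            break
        if buf.count('\0') == len(buf):
            # leave a hole instead of writing a buffer full of zeros
            f.seek(len(buf), 1)
        else:
            f.write(buf)
        total += len(buf)
    f.truncate(total)

print 'written %d bytes' % total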