Re: [ovirt-users] virtio-serial0 duplicate id
Hi,

as suggested, I opened a bug for this issue:

https://bugzilla.redhat.com/show_bug.cgi?id=1308885

all the best
Jojo

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] virtio-serial0 duplicate id
On 2016-02-14 12:16, Arik Hadas wrote:

- Original Message -
- Original Message -
On 11 Feb 2016, at 17:02, Johannes Tiefenbacher wrote:

Hi,
finally I am posting something to this list :) I have been reading it for quite some time now and have been an oVirt user since 3.0.

Hi,
welcome :)

I updated an engine installation from 3.2 to 3.6 (stepwise of course, and yes, I know that's pretty outdated ;-). Then I updated vdsm on the associated CentOS 6 hosts as well, from 3.10.x to 3.16.30. I also set my cluster compatibility level to 3.5 (the 3.6 compatibility level is only possible with EL7 hosts, if I understood correctly).

After my first failover test a VM could not be restarted, although the host where it was running could be fenced correctly.

The reason according to the engine's log was this:

VM is down with error. Exit message: internal error process exited while connecting to monitor: qemu-kvm: -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x4: Duplicate ID 'virtio-serial0' for device

I then realized that I am not able to run this VM on any host. I checked the virtual hardware in the engine database and could confirm that ALL my VMs had this problem: 2 devices with alias='virtio-serial0'

it may very well be a bug, but it would be quite difficult to say unless it is reproducible. It may be broken from earlier releases.
Arik/Shmuel, maybe it rings a bell?

In 3.6 we changed virtio-serial to be a managed device.
The script named 03_06_0310_change_virtio_serial_to_managed_device.sql changes unmanaged virtio-serial devices (that were all unmanaged before) to be managed.
A potential flow that will cause this duplication I can think of is:
1. Have a running VM in a pre-3.6 engine - it has unmanaged virtio-serial
2. Upgrade to 3.6 while the VM is running - the unmanaged virtio-serial becomes managed
3. Do something that will change the hash of the devices
=> the engine will add an additional unmanaged virtio-serial device

Why didn't it happen before? Because the handling of unmanaged devices was:
1. Upon change in the VM devices (their hash), ask for all the devices (full-list)
2. Remove all previous unmanaged devices
3. Add every device that does not exist in the database
When we add an unmanaged device we generate a new ID (!) - therefore we had to remove all the previous unmanaged devices before adding the new ones.
If the previous unmanaged virtio-serial became managed, it is not removed and we will end up having two virtio-serial devices.

@Johannes - is it true that the VM was running before the engine got updated to 3.6 and wasn't powered off since then?

yes that's true

I managed to simulate this.
We probably need to prevent the addition of unmanaged virtio-serial in 3.6 engine, but IMO we should also use the ID reported by VDSM instead of generating a new one to eliminate similar issues in the future.
@Eli, Omer - can you recall why we can't use the ID we get from VDSM for the unmanaged devices?
(we can continue this discussion in devel-list or in bugzilla..)

e.g.:

engine=# SELECT * FROM vm_device WHERE vm_device.device = 'virtio-serial' AND vm_id = 'cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec' ORDER BY vm_id;
-[ RECORD 1 ]-+-
device_id                 | 2821d03c-ce88-4613-9095-e88eadcd3792
vm_id                     | cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec
type                      | controller
device                    | virtio-serial
address                   |
boot_order                | 0
spec_params               | { }
is_managed                | t
is_plugged                | f
is_readonly               | f
_create_date              | 2016-01-14 08:30:43.797161+01
_update_date              | 2016-02-10 10:04:56.228724+01
alias                     | virtio-serial0
custom_properties         | { }
snapshot_id               |
logical_name              |
is_using_scsi_reservation | f
-[ RECORD 2 ]-+-
device_id                 | 29e0805f-d836-451a-9ec3-9031baa995e6
vm_id                     | cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec
type                      | controller
device                    | virtio-serial
address                   | {bus=0x00, domain=0x, type=pci, slot=0x04, function=0x0}
boot_order                | 0
spec_params               | { }
is_managed                | f
is_plugged                | t
is_readonly               | f
_create_date              | 2016-02-11 13:47:02.69992+01
_update_date              |
alias                     | virtio-serial0
custom_properties         |
snapshot_id               |
logical_name              |
is_using_scsi_reservation | f

My solution was this:

DELETE FROM vm_device WHERE vm_id='cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec' AND vm_device.device = 'virtio-serial' AND address = '';

(just renaming one of the aliases to "virtio-serial1" did not help)

I believe it is not
Re: [ovirt-users] virtio-serial0 duplicate id
- Original Message -
>
> > On 11 Feb 2016, at 17:02, Johannes Tiefenbacher wrote:
> >
> > Hi,
> > finally I am posting something to this list :) I have been reading it for
> > quite some time now and have been an oVirt user since 3.0.
>
> Hi,
> welcome :)
>
> >
> > I updated an engine installation from 3.2 to 3.6 (stepwise of course, and
> > yes, I know that's pretty outdated ;-). Then I updated vdsm on the
> > associated CentOS 6 hosts as well, from 3.10.x to 3.16.30. I also set my
> > cluster compatibility level to 3.5 (the 3.6 compatibility level is only
> > possible with EL7 hosts, if I understood correctly).
> >
> > After my first failover test a VM could not be restarted, although the host
> > where it was running could be fenced correctly.
> >
> > The reason according to the engine's log was this:
> >
> > VM is down with error. Exit message: internal error process exited
> > while connecting to monitor: qemu-kvm: -device
> > virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x4:
> > Duplicate ID 'virtio-serial0' for device
> >
> >
> > I then realized that I am not able to run this VM on any host. I
> > checked the virtual hardware in the engine database and could confirm that
> > ALL my VMs had this problem: 2 devices with alias='virtio-serial0'
>
> it may very well be a bug, but it would be quite difficult to say unless it
> is reproducible. It may be broken from earlier releases.
> Arik/Shmuel, maybe it rings a bell?

In 3.6 we changed virtio-serial to be a managed device.
The script named 03_06_0310_change_virtio_serial_to_managed_device.sql changes unmanaged virtio-serial devices (that were all unmanaged before) to be managed.
A potential flow that will cause this duplication I can think of is:
1. Have a running VM in a pre-3.6 engine - it has unmanaged virtio-serial
2. Upgrade to 3.6 while the VM is running - the unmanaged virtio-serial becomes managed
3. Do something that will change the hash of the devices
=> the engine will add an additional unmanaged virtio-serial device

Why didn't it happen before? Because the handling of unmanaged devices was:
1. Upon change in the VM devices (their hash), ask for all the devices (full-list)
2. Remove all previous unmanaged devices
3. Add every device that does not exist in the database
When we add an unmanaged device we generate a new ID (!) - therefore we had to remove all the previous unmanaged devices before adding the new ones.
If the previous unmanaged virtio-serial became managed, it is not removed and we will end up having two virtio-serial devices.

@Johannes - is it true that the VM was running before the engine got updated to 3.6 and wasn't powered off since then?

I managed to simulate this.
We probably need to prevent the addition of unmanaged virtio-serial in 3.6 engine, but IMO we should also use the ID reported by VDSM instead of generating a new one to eliminate similar issues in the future.
@Eli, Omer - can you recall why we can't use the ID we get from VDSM for the unmanaged devices?
(we can continue this discussion in devel-list or in bugzilla..)

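The flow described above can be played through with a small toy model. This is a sketch in Python, not oVirt engine code: `Device`, `upgrade_to_36`, `refresh_devices` and `simulate` are hypothetical names, and the table is just a list. It also assumes that a reported device counts as "existing in the database" only when it carries an ID that is already stored, and that the host reports the virtio-serial of the long-running VM without such an ID.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Device:
    """One row of a toy vm_device table (only the columns this flow needs)."""
    device: str
    alias: str
    is_managed: bool
    # A fresh ID is generated for every new row, as described in the thread.
    device_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def upgrade_to_36(db):
    """Toy stand-in for the upgrade script: virtio-serial rows become managed."""
    for row in db:
        if row.device == "virtio-serial":
            row.is_managed = True


def refresh_devices(db, reported):
    """Toy version of the unmanaged-device handling (steps 2 and 3 above)."""
    # Step 2: remove all previous unmanaged devices.
    db[:] = [row for row in db if row.is_managed]
    # Step 3: add every reported device that does not exist in the database.
    # Assumption: existence is decided by ID; a device reported without an ID
    # (device_id is None) never matches and is added as a new unmanaged row.
    known_ids = {row.device_id for row in db}
    for dev in reported:
        if dev["device_id"] not in known_ids:
            db.append(Device(dev["device"], dev["alias"], is_managed=False))


def simulate():
    """Return the number of virtio-serial rows before and after the upgrade."""
    # Step 1 of the flow: a running VM with an unmanaged virtio-serial.
    db = [Device("virtio-serial", "virtio-serial0", is_managed=False)]
    reported = [
        {"device": "virtio-serial", "alias": "virtio-serial0", "device_id": None}
    ]
    refresh_devices(db, reported)   # pre-3.6: old row removed, new row added
    before = len(db)
    upgrade_to_36(db)               # the row is now managed
    refresh_devices(db, reported)   # managed row survives, another one is added
    after = len(db)
    return before, after
```

In this model `simulate()` returns `(1, 2)`: before the upgrade each refresh replaces the single unmanaged row, while after the upgrade the now-managed row is kept and a second row with the same alias is added next to it.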
Re: [ovirt-users] virtio-serial0 duplicate id
> On 11 Feb 2016, at 17:02, Johannes Tiefenbacher wrote:
>
> Hi,
> finally I am posting something to this list :) I have been reading it for
> quite some time now and have been an oVirt user since 3.0.

Hi,
welcome :)

>
> I updated an engine installation from 3.2 to 3.6 (stepwise of course, and yes,
> I know that's pretty outdated ;-). Then I updated vdsm on the associated
> CentOS 6 hosts as well, from 3.10.x to 3.16.30. I also set my cluster
> compatibility level to 3.5 (the 3.6 compatibility level is only possible with
> EL7 hosts, if I understood correctly).
>
> After my first failover test a VM could not be restarted, although the host
> where it was running could be fenced correctly.
>
> The reason according to the engine's log was this:
>
> VM is down with error. Exit message: internal error process exited
> while connecting to monitor: qemu-kvm: -device
> virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x4:
> Duplicate ID 'virtio-serial0' for device
>
>
> I then realized that I am not able to run this VM on any host. I checked
> the virtual hardware in the engine database and could confirm that ALL my VMs
> had this problem: 2 devices with alias='virtio-serial0'

it may very well be a bug, but it would be quite difficult to say unless it is reproducible. It may be broken from earlier releases.
Arik/Shmuel, maybe it rings a bell?

[ovirt-users] virtio-serial0 duplicate id
Hi,
finally I am posting something to this list :) I have been reading it for quite some time now and have been an oVirt user since 3.0.

I updated an engine installation from 3.2 to 3.6 (stepwise of course, and yes, I know that's pretty outdated ;-). Then I updated vdsm on the associated CentOS 6 hosts as well, from 3.10.x to 3.16.30. I also set my cluster compatibility level to 3.5 (the 3.6 compatibility level is only possible with EL7 hosts, if I understood correctly).

After my first failover test a VM could not be restarted, although the host where it was running could be fenced correctly.

The reason according to the engine's log was this:

VM is down with error. Exit message: internal error process exited while connecting to monitor: qemu-kvm: -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x4: Duplicate ID 'virtio-serial0' for device

I then realized that I am not able to run this VM on any host. I checked the virtual hardware in the engine database and could confirm that ALL my VMs had this problem: 2 devices with alias='virtio-serial0'

e.g.:

engine=# SELECT * FROM vm_device WHERE vm_device.device = 'virtio-serial' AND vm_id = 'cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec' ORDER BY vm_id;
-[ RECORD 1 ]-+-
device_id                 | 2821d03c-ce88-4613-9095-e88eadcd3792
vm_id                     | cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec
type                      | controller
device                    | virtio-serial
address                   |
boot_order                | 0
spec_params               | { }
is_managed                | t
is_plugged                | f
is_readonly               | f
_create_date              | 2016-01-14 08:30:43.797161+01
_update_date              | 2016-02-10 10:04:56.228724+01
alias                     | virtio-serial0
custom_properties         | { }
snapshot_id               |
logical_name              |
is_using_scsi_reservation | f
-[ RECORD 2 ]-+-
device_id                 | 29e0805f-d836-451a-9ec3-9031baa995e6
vm_id                     | cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec
type                      | controller
device                    | virtio-serial
address                   | {bus=0x00, domain=0x, type=pci, slot=0x04, function=0x0}
boot_order                | 0
spec_params               | { }
is_managed                | f
is_plugged                | t
is_readonly               | f
_create_date              | 2016-02-11 13:47:02.69992+01
_update_date              |
alias                     | virtio-serial0
custom_properties         |
snapshot_id               |
logical_name              |
is_using_scsi_reservation | f

My solution was this:

DELETE FROM vm_device WHERE vm_id='cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec' AND vm_device.device = 'virtio-serial' AND address = '';

(just renaming one of the aliases to "virtio-serial1" did not help)

Is this a known issue? I couldn't find anything so far.

Should I also post this to the developer list? I am not subscribed there yet and wanted to check here first.

thanks in advance and all the best
Jojo @ LINBIT VIE
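
Since every VM was affected, a check across the whole table may be quicker than inspecting one vm_id at a time. The following is a minimal sketch under stated assumptions: it uses Python's sqlite3 with an in-memory table that only mimics a few vm_device columns (the real engine database is PostgreSQL), the helper names `find_duplicate_aliases` and `demo` are made up, and the third row is invented purely as a healthy counter-example. The GROUP BY / HAVING query itself is plain SQL and should carry over to psql, but that has not been verified against an engine database here.

```python
import sqlite3


def find_duplicate_aliases(conn):
    """Return (vm_id, alias, count) for virtio-serial aliases used twice or more."""
    query = """
        SELECT vm_id, alias, COUNT(*) AS n
        FROM vm_device
        WHERE device = 'virtio-serial'
        GROUP BY vm_id, alias
        HAVING COUNT(*) > 1
        ORDER BY vm_id
    """
    return conn.execute(query).fetchall()


def demo():
    """Build a toy vm_device table in memory and run the check on it."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the engine table; only the columns used here.
    conn.execute(
        "CREATE TABLE vm_device ("
        "device_id TEXT PRIMARY KEY, vm_id TEXT, device TEXT, "
        "alias TEXT, is_managed INTEGER)"
    )
    rows = [
        # The two rows from the query output above.
        ("2821d03c-ce88-4613-9095-e88eadcd3792",
         "cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec",
         "virtio-serial", "virtio-serial0", 1),
        ("29e0805f-d836-451a-9ec3-9031baa995e6",
         "cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec",
         "virtio-serial", "virtio-serial0", 0),
        # Hypothetical healthy VM with a single virtio-serial device.
        ("hypothetical-device-id", "hypothetical-vm-id",
         "virtio-serial", "virtio-serial0", 1),
    ]
    conn.executemany("INSERT INTO vm_device VALUES (?, ?, ?, ?, ?)", rows)
    result = find_duplicate_aliases(conn)
    conn.close()
    return result
```

On the toy data `demo()` reports only the VM from this thread, with a count of 2; the hypothetical VM with a single device is not listed.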