Re: [ovirt-users] virtio-serial0 duplicate id

2016-02-18 Thread Johannes Tiefenbacher

Hi, as suggested, I opened a bug for this issue:

https://bugzilla.redhat.com/show_bug.cgi?id=1308885

all the best
Jojo
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] virtio-serial0 duplicate id

2016-02-15 Thread jojo

On 2016-02-14 12:16, Arik Hadas wrote:


- Original Message -


- Original Message -

On 11 Feb 2016, at 17:02, Johannes Tiefenbacher  wrote:

Hi,
I am finally posting something to this list :) I have been reading it for
quite some time now and have been an oVirt user since 3.0.

Hi,
welcome:)



I updated an engine installation from 3.2 to 3.6 (stepwise of course, and
yes I know that's pretty outdated ;-). Then I updated vdsm on the associated
CentOS 6 hosts as well, from 3.10.x to 3.16.30. I also set my cluster
compatibility level to 3.5 (the 3.6 compatibility level is only possible
with EL7 hosts, if I understood correctly).

After my first failover test a VM could not be restarted, although the
host where it was running could be fenced correctly.

The reason according to engine's log was this:

VM  is down with error. Exit message: internal error process
exited
while connecting to monitor: qemu-kvm: -device
virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x4:
Duplicate ID 'virtio-serial0' for device


I then realized that I was not able to run this VM on any host. I
checked the virtual hardware in the engine database and could confirm
that ALL my VMs had this problem: 2 devices with alias='virtio-serial0'

It may very well be a bug, but that is difficult to say unless it is
reproducible. It may have been broken in earlier releases already.
Arik/Shmuel, maybe it rings a bell?

In 3.6 we changed virtio-serial to be a managed device.
The script named 03_06_0310_change_virtio_serial_to_managed_device.sql
changes unmanaged virtio-serial devices (that were all unmanaged before) to
be managed.
A potential flow that will cause this duplication I can think of is:
1. Have a running VM in a pre-3.6 engine - it has unmanaged virtio-serial
2. Upgrade to 3.6 while the VM is running - the unmanaged virtio-serial
becomes managed
3. Do something that will change the hash of the devices
=> the engine will add an additional unmanaged virtio-serial device

Why didn't it happen before? Because the handling of unmanaged devices was:
1. Upon change in the VM devices (their hash), ask for all the devices
(full-list)
2. Remove all previous unmanaged devices
3. Add every device that does not exist in the database
When we add an unmanaged device we generate a new ID (!) - therefore we had
to remove all the previous unmanaged devices before adding the new ones.
If the previous unmanaged virtio-serial became managed, it is not removed and
we will end up having two virtio-serial devices.

@Johannes - is it true that the VM was running before the engine got updated
to 3.6 and hasn't been powered off since then?

yes that's true


I managed to simulate this.
We probably need to prevent the addition of an unmanaged virtio-serial in the
3.6 engine, but IMO we should also use the ID reported by VDSM instead of
generating a new one, to eliminate similar issues in the future.
@Eli, Omer - can you recall why we can't use the ID we get from VDSM for the
unmanaged devices?
(we can continue this discussion on the devel list or in Bugzilla..)


e.g.:


engine=# SELECT * FROM vm_device WHERE vm_device.device = 'virtio-serial'
AND vm_id = 'cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec' ORDER BY vm_id;
-[ RECORD 1 ]-------------+-------------------------------------
device_id                 | 2821d03c-ce88-4613-9095-e88eadcd3792
vm_id                     | cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec
type                      | controller
device                    | virtio-serial
address                   |
boot_order                | 0
spec_params               | { }
is_managed                | t
is_plugged                | f
is_readonly               | f
_create_date              | 2016-01-14 08:30:43.797161+01
_update_date              | 2016-02-10 10:04:56.228724+01
alias                     | virtio-serial0
custom_properties         | { }
snapshot_id               |
logical_name              |
is_using_scsi_reservation | f
-[ RECORD 2 ]-------------+-------------------------------------
device_id                 | 29e0805f-d836-451a-9ec3-9031baa995e6
vm_id                     | cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec
type                      | controller
device                    | virtio-serial
address                   | {bus=0x00, domain=0x, type=pci, slot=0x04, function=0x0}
boot_order                | 0
spec_params               | { }
is_managed                | f
is_plugged                | t
is_readonly               | f
_create_date              | 2016-02-11 13:47:02.69992+01
_update_date              |
alias                     | virtio-serial0
custom_properties         |
snapshot_id               |
logical_name              |
is_using_scsi_reservation | f



My solution was this:

DELETE FROM vm_device WHERE vm_id='cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec'
AND vm_device.device = 'virtio-serial' AND address = '';

(just renaming one of the aliases to "virtio-serial1" did not help)

I believe it is not 

Re: [ovirt-users] virtio-serial0 duplicate id

2016-02-14 Thread Arik Hadas


- Original Message -
> 
> > On 11 Feb 2016, at 17:02, Johannes Tiefenbacher  wrote:
> > 
> > Hi,
> > I am finally posting something to this list :) I have been reading it for
> > quite some time now and have been an oVirt user since 3.0.
> 
> Hi,
> welcome:)
> 
> > 
> > 
> > I updated an engine installation from 3.2 to 3.6 (stepwise of course, and
> > yes I know that's pretty outdated ;-). Then I updated vdsm on the associated
> > CentOS 6 hosts as well, from 3.10.x to 3.16.30. I also set my cluster
> > compatibility level to 3.5 (the 3.6 compatibility level is only possible
> > with EL7 hosts, if I understood correctly).
> > 
> > After my first failover test a VM could not be restarted, although the host
> > where it was running could be fenced correctly.
> > 
> > The reason according to engine's log was this:
> > 
> > VM  is down with error. Exit message: internal error process exited
> > while connecting to monitor: qemu-kvm: -device
> > virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x4:
> > Duplicate ID 'virtio-serial0' for device
> > 
> > 
> > I then realized that I was not able to run this VM on any host. I
> > checked the virtual hardware in the engine database and could confirm that
> > ALL my VMs had this problem: 2 devices with alias='virtio-serial0'
> 
> It may very well be a bug, but that is difficult to say unless it is
> reproducible. It may have been broken in earlier releases already.
> Arik/Shmuel, maybe it rings a bell?

In 3.6 we changed virtio-serial to be a managed device.
The script named 03_06_0310_change_virtio_serial_to_managed_device.sql changes 
unmanaged virtio-serial devices (that were all unmanaged before) to be managed.
A potential flow that will cause this duplication I can think of is:
1. Have a running VM in a pre-3.6 engine - it has unmanaged virtio-serial
2. Upgrade to 3.6 while the VM is running - the unmanaged virtio-serial becomes 
managed
3. Do something that will change the hash of the devices
=> the engine will add an additional unmanaged virtio-serial device

Why didn't it happen before? Because the handling of unmanaged devices was:
1. Upon change in the VM devices (their hash), ask for all the devices 
(full-list)
2. Remove all previous unmanaged devices
3. Add every device that does not exist in the database
When we add an unmanaged device we generate a new ID (!) - therefore we had to 
remove all the previous unmanaged devices before adding the new ones.
If the previous unmanaged virtio-serial became managed, it is not removed and 
we will end up having two virtio-serial devices.
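
As a rough illustration of the flow described above, here is a minimal Python
sketch. It is not engine code: the Device class, sync_unmanaged() and
upgrade_to_3_6() are hypothetical names, the "database" is a plain list, and it
assumes that a device reported by the host without an engine-known ID is
treated as unmanaged.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Device:
    device: str          # e.g. "virtio-serial"
    alias: str           # e.g. "virtio-serial0"
    is_managed: bool
    # A fresh random ID is generated for every newly created row.
    device_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def sync_unmanaged(db, reported):
    """Re-sync one VM's rows after its device hash changed.

    db       -- list of Device rows currently stored for the VM
    reported -- (device, alias) pairs the host reports without an
                engine-known ID (assumption: these count as unmanaged)
    """
    # Step 2: remove all previously stored unmanaged devices.
    rows = [d for d in db if d.is_managed]
    # Step 3: add every reported device under a newly generated ID. Because
    # the ID is new, it can never match a row that is already stored.
    rows.extend(Device(dev, alias, is_managed=False) for dev, alias in reported)
    return rows


def upgrade_to_3_6(db):
    """Stand-in for the upgrade script: flag virtio-serial rows as managed."""
    for d in db:
        if d.device == "virtio-serial":
            d.is_managed = True
    return db


if __name__ == "__main__":
    reported = [("virtio-serial", "virtio-serial0")]
    db = sync_unmanaged([Device("virtio-serial", "virtio-serial0", False)], reported)
    print(len(db), "row(s) before the upgrade")
    db = sync_unmanaged(upgrade_to_3_6(db), reported)
    print(len(db), "row(s) after the upgrade:", [(d.alias, d.is_managed) for d in db])
```

Before the upgrade the old unmanaged row is dropped and re-added on every sync,
so the count stays at one. After the upgrade the now-managed row survives step 2
and a second row with the same alias is added in step 3.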

@Johannes - is it true that the VM was running before the engine got updated to
3.6 and hasn't been powered off since then?

I managed to simulate this.
We probably need to prevent the addition of an unmanaged virtio-serial in the
3.6 engine, but IMO we should also use the ID reported by VDSM instead of
generating a new one, to eliminate similar issues in the future.
@Eli, Omer - can you recall why we can't use the ID we get from VDSM for the
unmanaged devices?
(we can continue this discussion on the devel list or in Bugzilla..)
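
To make the second idea a bit more concrete, here is a small Python sketch of
matching on a host-reported ID instead of generating a new one. Everything in
it is an assumption for illustration: sync_by_reported_id() is a hypothetical
name, the "id" field stands for whatever stable identifier the host would
report, and whether such an ID is actually usable is exactly the open question
above.

```python
def sync_by_reported_id(db, reported):
    """Sync one VM's device rows, keyed by a host-reported ID.

    db       -- dict mapping device ID -> {"alias": str, "is_managed": bool}
    reported -- list of {"id": str, "alias": str} dicts from the host
                (the "id" field is hypothetical)
    """
    reported_ids = {r["id"] for r in reported}
    # Drop unmanaged rows that the host no longer reports.
    stale = [dev_id for dev_id, row in db.items()
             if not row["is_managed"] and dev_id not in reported_ids]
    for dev_id in stale:
        del db[dev_id]
    # Add only devices whose ID is not stored yet. A row that was flipped
    # to managed keeps its ID, so it is recognised and not added again.
    for r in reported:
        db.setdefault(r["id"], {"alias": r["alias"], "is_managed": False})
    return db


if __name__ == "__main__":
    db = {"abc": {"alias": "virtio-serial0", "is_managed": False}}
    db["abc"]["is_managed"] = True  # what the upgrade script would do
    sync_by_reported_id(db, [{"id": "abc", "alias": "virtio-serial0"}])
    print(db)
```

With a stable key, the unmanaged-to-managed flip no longer leads to a second row
for the same device.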


Re: [ovirt-users] virtio-serial0 duplicate id

2016-02-12 Thread Michal Skrivanek

> On 11 Feb 2016, at 17:02, Johannes Tiefenbacher  wrote:
> 
> Hi,
> I am finally posting something to this list :) I have been reading it for
> quite some time now and have been an oVirt user since 3.0.

Hi,
welcome:)

> 
> 
> I updated an engine installation from 3.2 to 3.6 (stepwise of course, and yes
> I know that's pretty outdated ;-). Then I updated vdsm on the associated
> CentOS 6 hosts as well, from 3.10.x to 3.16.30. I also set my cluster
> compatibility level to 3.5 (the 3.6 compatibility level is only possible with
> EL7 hosts, if I understood correctly).
> 
> After my first failover test a VM could not be restarted, although the host
> where it was running could be fenced correctly.
> 
> The reason according to engine's log was this:
> 
> VM  is down with error. Exit message: internal error process exited 
> while connecting to monitor: qemu-kvm: -device 
> virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x4: 
> Duplicate ID 'virtio-serial0' for device
> 
> 
> I then realized that I was not able to run this VM on any host. I checked
> the virtual hardware in the engine database and could confirm that ALL my VMs
> had this problem: 2 devices with alias='virtio-serial0'

It may very well be a bug, but that is difficult to say unless it is
reproducible. It may have been broken in earlier releases already.
Arik/Shmuel, maybe it rings a bell?




[ovirt-users] virtio-serial0 duplicate id

2016-02-11 Thread Johannes Tiefenbacher

Hi,
I am finally posting something to this list :) I have been reading it for
quite some time now and have been an oVirt user since 3.0.



I updated an engine installation from 3.2 to 3.6 (stepwise of course,
and yes I know that's pretty outdated ;-). Then I updated vdsm on the
associated CentOS 6 hosts as well, from 3.10.x to 3.16.30. I also set my
cluster compatibility level to 3.5 (the 3.6 compatibility level is only
possible with EL7 hosts, if I understood correctly).


After my first failover test a VM could not be restarted, although the
host where it was running could be fenced correctly.


The reason according to engine's log was this:

VM  is down with error. Exit message: internal error process 
exited while connecting to monitor: qemu-kvm: -device 
virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x4: 
Duplicate ID 'virtio-serial0' for device



I then realized that I was not able to run this VM on any host. I
checked the virtual hardware in the engine database and could confirm
that ALL my VMs had this problem: 2 devices with alias='virtio-serial0'


e.g.:


engine=# SELECT * FROM vm_device WHERE vm_device.device = 'virtio-serial'
AND vm_id = 'cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec' ORDER BY vm_id;
-[ RECORD 1 ]-------------+-------------------------------------
device_id                 | 2821d03c-ce88-4613-9095-e88eadcd3792
vm_id                     | cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec
type                      | controller
device                    | virtio-serial
address                   |
boot_order                | 0
spec_params               | { }
is_managed                | t
is_plugged                | f
is_readonly               | f
_create_date              | 2016-01-14 08:30:43.797161+01
_update_date              | 2016-02-10 10:04:56.228724+01
alias                     | virtio-serial0
custom_properties         | { }
snapshot_id               |
logical_name              |
is_using_scsi_reservation | f
-[ RECORD 2 ]-------------+-------------------------------------
device_id                 | 29e0805f-d836-451a-9ec3-9031baa995e6
vm_id                     | cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec
type                      | controller
device                    | virtio-serial
address                   | {bus=0x00, domain=0x, type=pci, slot=0x04, function=0x0}
boot_order                | 0
spec_params               | { }
is_managed                | f
is_plugged                | t
is_readonly               | f
_create_date              | 2016-02-11 13:47:02.69992+01
_update_date              |
alias                     | virtio-serial0
custom_properties         |
snapshot_id               |
logical_name              |
is_using_scsi_reservation | f
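
To check all VMs at once rather than one vm_id at a time, a grouped query along
the following lines might help. This is only a sketch: it runs the query from
Python against an in-memory SQLite table that mimics a few columns of vm_device
(the real engine database is PostgreSQL and has more columns), and the third
sample row is made up. find_duplicate_aliases() and demo_connection() are
hypothetical helper names.

```python
import sqlite3

# Lists every (vm_id, alias) pair that occurs more than once among the
# virtio-serial rows. Plain GROUP BY / HAVING, nothing engine-specific.
FIND_DUPLICATES = """
SELECT vm_id, alias, COUNT(*) AS copies
FROM vm_device
WHERE device = 'virtio-serial'
GROUP BY vm_id, alias
HAVING COUNT(*) > 1
ORDER BY vm_id
"""


def find_duplicate_aliases(conn):
    """Return (vm_id, alias, copies) tuples for duplicated aliases."""
    return conn.execute(FIND_DUPLICATES).fetchall()


def demo_connection():
    """Build a tiny stand-in for vm_device with three sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vm_device ("
        "device_id TEXT PRIMARY KEY, vm_id TEXT, device TEXT, "
        "alias TEXT, is_managed INTEGER)"
    )
    conn.executemany(
        "INSERT INTO vm_device VALUES (?, ?, ?, ?, ?)",
        [
            # The two rows shown in the output above.
            ("2821d03c-ce88-4613-9095-e88eadcd3792",
             "cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec",
             "virtio-serial", "virtio-serial0", 1),
            ("29e0805f-d836-451a-9ec3-9031baa995e6",
             "cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec",
             "virtio-serial", "virtio-serial0", 0),
            # Made-up healthy VM with a single virtio-serial device.
            ("hypothetical-device-id", "hypothetical-vm-id",
             "virtio-serial", "virtio-serial0", 1),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in find_duplicate_aliases(demo_connection()):
        print(row)
```

Only the VM with two virtio-serial0 rows is reported; the VM with a single
device is left out.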



My solution was this:

DELETE FROM vm_device WHERE vm_id='cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec' 
AND vm_device.device = 'virtio-serial' AND address = '';


(just renaming one of the aliases to "virtio-serial1" did not help)



Is this a known issue? Couldn't find anything so far.

Should I also post this to the developer list? I am not subscribed there
yet; I wanted to check here first.



thanks in advance and all the best
Jojo @ LINBIT VIE







