Re: [ovirt-devel] Re: URGENT - ovirt-engine broken for 3 days Re: Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-10 Thread Dafna Ron
We have a passing ovirt-engine build today.
Thank you all for the fast response.
Dafna


On Thu, May 9, 2019 at 12:43 PM Sandro Bonazzola 
wrote:

>
>
> Il giorno gio 9 mag 2019 alle ore 12:59 Dafna Ron  ha
> scritto:
>
>> As IL is on Independence Day, can anyone else merge?
>> https://gerrit.ovirt.org/#/c/99845/
>>
>>
> I have merge rights but I need at least CI to pass. Waiting on jenkins.
>
>
>>
>> On Thu, May 9, 2019 at 11:30 AM Dafna Ron  wrote:
>>
>>> Thanks Andrej.
>>> I will follow the patch and update.
>>> Dafna
>>>
>>> On Thu, May 9, 2019 at 11:23 AM Andrej Krejcir 
>>> wrote:
>>>
 Hi,

 Ok, I have posted the reverting patch:
 https://gerrit.ovirt.org/#/c/99845/

 I'm still investigating what the problem is. Sorry for the delay, we
 had a public holiday yesterday.


 Andrej

 On Thu, 9 May 2019 at 11:20, Dafna Ron  wrote:

> Hi,
>
> I have not heard back on this issue and ovirt-engine has been broken
> for the past 3 days.
>
> As this does not seem a simple debug and fix I suggest reverting the
> patch and investigating later.
>
> thanks,
> Dafna
>
>
>
> On Wed, May 8, 2019 at 9:42 AM Dafna Ron  wrote:
>
>> Any news?
>>
>> Thanks,
>> Dafna
>>
>>
>> On Tue, May 7, 2019 at 4:57 PM Dafna Ron  wrote:
>>
>>> thanks for the quick reply and investigation.
>>> Please update me if I can help any further and if you find the cause
>>> and have a patch let me know.
>>> Note that ovirt-engine project is broken and if we cannot find the
>>> cause relatively fast we should consider reverting the patch to allow a 
>>> new
>>> package to be built in CQ with other changes that were submitted.
>>>
>>> Thanks,
>>> Dafna
>>>
>>>
>>> On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir 
>>> wrote:
>>>
 After running a few OSTs manually, it seems that the patch is the
 cause. Investigating...

 On Tue, 7 May 2019 at 14:58, Andrej Krejcir 
 wrote:

> Hi,
>
> The issue is probably not caused by the patch.
>
> This log line means that the VM does not exist in the DB:
>
> 2019-05-07 06:02:04,215-04 WARN
> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
> Validation
> of action 'MigrateMultipleVms' failed for user admin@internal-authz.
> Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>
> I will investigate further why the VM is missing.
>
> On Tue, 7 May 2019 at 14:07, Dafna Ron  wrote:
>
>> Hi,
>>
>> We are failing test upgrade_hosts on
>> upgrade-from-release-suite-master.
>> From the logs I can see that we are calling VM migration when we
>> have only one host, and the VM seems to have been shut down before the
>> maintenance call is issued.
>>
>> Can you please look into this?
>>
>> suspected patch reported as root cause by CQ is:
>>
>> https://gerrit.ovirt.org/#/c/98920/ - core: Add
>> MigrateMultipleVms command and use it for host maintenance
>>
>>
>> logs are found here:
>>
>>
>>
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>>
>>
>> I can see the issue is vm migration when putting host in
>> maintenance:
>>
>>
>> 2019-05-07 06:02:04,170-04 INFO
>> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2)
>> [05592db2-f859-487b-b779-4b32eec5bab
>> 3] Running command: MaintenanceVdsCommand internal: true.
>> Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: 
>> VDS
>> 2019-05-07 06:02:04,215-04 WARN
>> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
>> Validation
>> of action
>> 'MigrateMultipleVms' failed for user admin@internal-authz.
>> Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>> 2019-05-07 06:02:04,221-04 ERROR
>> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
>> Failed to
>> migrate one or
>> more VMs.
>> 2019-05-07 06:02:04,227-04 ERROR
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVEN
>> T_ID: VDS_MAINTENANCE_FAILED(17), Failed to s

Re: [ovirt-devel] Re: URGENT - ovirt-engine broken for 3 days Re: Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-09 Thread Sandro Bonazzola
Il giorno gio 9 mag 2019 alle ore 12:59 Dafna Ron  ha
scritto:

> As IL is on Independence Day, can anyone else merge?
> https://gerrit.ovirt.org/#/c/99845/
>
>
I have merge rights but I need at least CI to pass. Waiting on jenkins.


>
> On Thu, May 9, 2019 at 11:30 AM Dafna Ron  wrote:
>
>> Thanks Andrej.
>> I will follow the patch and update.
>> Dafna
>>
>> On Thu, May 9, 2019 at 11:23 AM Andrej Krejcir 
>> wrote:
>>
>>> Hi,
>>>
>>> Ok, I have posted the reverting patch:
>>> https://gerrit.ovirt.org/#/c/99845/
>>>
>>> I'm still investigating what the problem is. Sorry for the delay, we had
>>> a public holiday yesterday.
>>>
>>>
>>> Andrej
>>>
>>> On Thu, 9 May 2019 at 11:20, Dafna Ron  wrote:
>>>
 Hi,

 I have not heard back on this issue and ovirt-engine has been broken
 for the past 3 days.

 As this does not seem a simple debug and fix I suggest reverting the
 patch and investigating later.

 thanks,
 Dafna



 On Wed, May 8, 2019 at 9:42 AM Dafna Ron  wrote:

> Any news?
>
> Thanks,
> Dafna
>
>
> On Tue, May 7, 2019 at 4:57 PM Dafna Ron  wrote:
>
>> thanks for the quick reply and investigation.
>> Please update me if I can help any further and if you find the cause
>> and have a patch let me know.
>> Note that ovirt-engine project is broken and if we cannot find the
>> cause relatively fast we should consider reverting the patch to allow a 
>> new
>> package to be built in CQ with other changes that were submitted.
>>
>> Thanks,
>> Dafna
>>
>>
>> On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir 
>> wrote:
>>
>>> After running a few OSTs manually, it seems that the patch is the
>>> cause. Investigating...
>>>
>>> On Tue, 7 May 2019 at 14:58, Andrej Krejcir 
>>> wrote:
>>>
 Hi,

 The issue is probably not caused by the patch.

 This log line means that the VM does not exist in the DB:

 2019-05-07 06:02:04,215-04 WARN
 [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
 Validation
 of action 'MigrateMultipleVms' failed for user admin@internal-authz.
 Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND

 I will investigate further why the VM is missing.

 On Tue, 7 May 2019 at 14:07, Dafna Ron  wrote:

> Hi,
>
> We are failing test upgrade_hosts on
> upgrade-from-release-suite-master.
> From the logs I can see that we are calling VM migration when we
> have only one host, and the VM seems to have been shut down before the
> maintenance call is issued.
>
> Can you please look into this?
>
> suspected patch reported as root cause by CQ is:
>
> https://gerrit.ovirt.org/#/c/98920/ - core: Add
> MigrateMultipleVms command and use it for host maintenance
>
>
> logs are found here:
>
>
>
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>
>
> I can see the issue is vm migration when putting host in
> maintenance:
>
>
> 2019-05-07 06:02:04,170-04 INFO
> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2)
> [05592db2-f859-487b-b779-4b32eec5bab
> 3] Running command: MaintenanceVdsCommand internal: true. Entities
> affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
> 2019-05-07 06:02:04,215-04 WARN
> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
> Validation
> of action
> 'MigrateMultipleVms' failed for user admin@internal-authz.
> Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
> 2019-05-07 06:02:04,221-04 ERROR
> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
> Failed to
> migrate one or
> more VMs.
> 2019-05-07 06:02:04,227-04 ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVEN
> T_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host
> lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
> 2019-05-07 06:02:04,239-04 INFO
> [org.ovirt.engine.core.bll.ActivateVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
> Acquired to object 'Eng
>>>

Re: URGENT - ovirt-engine broken for 3 days Re: Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-09 Thread Dafna Ron
As IL is on Independence Day, can anyone else merge?
https://gerrit.ovirt.org/#/c/99845/


On Thu, May 9, 2019 at 11:30 AM Dafna Ron  wrote:

> Thanks Andrej.
> I will follow the patch and update.
> Dafna
>
> On Thu, May 9, 2019 at 11:23 AM Andrej Krejcir 
> wrote:
>
>> Hi,
>>
>> Ok, I have posted the reverting patch:
>> https://gerrit.ovirt.org/#/c/99845/
>>
>> I'm still investigating what the problem is. Sorry for the delay, we had
>> a public holiday yesterday.
>>
>>
>> Andrej
>>
>> On Thu, 9 May 2019 at 11:20, Dafna Ron  wrote:
>>
>>> Hi,
>>>
>>> I have not heard back on this issue and ovirt-engine has been broken for
>>> the past 3 days.
>>>
>>> As this does not seem a simple debug and fix I suggest reverting the
>>> patch and investigating later.
>>>
>>> thanks,
>>> Dafna
>>>
>>>
>>>
>>> On Wed, May 8, 2019 at 9:42 AM Dafna Ron  wrote:
>>>
 Any news?

 Thanks,
 Dafna


 On Tue, May 7, 2019 at 4:57 PM Dafna Ron  wrote:

> thanks for the quick reply and investigation.
> Please update me if I can help any further and if you find the cause
> and have a patch let me know.
> Note that ovirt-engine project is broken and if we cannot find the
> cause relatively fast we should consider reverting the patch to allow a 
> new
> package to be built in CQ with other changes that were submitted.
>
> Thanks,
> Dafna
>
>
> On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir 
> wrote:
>
>> After running a few OSTs manually, it seems that the patch is the
>> cause. Investigating...
>>
>> On Tue, 7 May 2019 at 14:58, Andrej Krejcir 
>> wrote:
>>
>>> Hi,
>>>
>>> The issue is probably not caused by the patch.
>>>
>>> This log line means that the VM does not exist in the DB:
>>>
>>> 2019-05-07 06:02:04,215-04 WARN
>>> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
>>> Validation
>>> of action 'MigrateMultipleVms' failed for user admin@internal-authz.
>>> Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>>>
>>> I will investigate further why the VM is missing.
>>>
>>> On Tue, 7 May 2019 at 14:07, Dafna Ron  wrote:
>>>
 Hi,

 We are failing test upgrade_hosts on
 upgrade-from-release-suite-master.
 From the logs I can see that we are calling VM migration when we have
 only one host, and the VM seems to have been shut down before the
 maintenance call is issued.

 Can you please look into this?

 suspected patch reported as root cause by CQ is:

 https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms
 command and use it for host maintenance


 logs are found here:



 http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/


 I can see the issue is vm migration when putting host in
 maintenance:


 2019-05-07 06:02:04,170-04 INFO
 [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2)
 [05592db2-f859-487b-b779-4b32eec5bab
 3] Running command: MaintenanceVdsCommand internal: true. Entities
 affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
 2019-05-07 06:02:04,215-04 WARN
 [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
 Validation
 of action
 'MigrateMultipleVms' failed for user admin@internal-authz.
 Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
 2019-05-07 06:02:04,221-04 ERROR
 [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
 Failed to
 migrate one or
 more VMs.
 2019-05-07 06:02:04,227-04 ERROR
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVEN
 T_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host
 lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
 2019-05-07 06:02:04,239-04 INFO
 [org.ovirt.engine.core.bll.ActivateVdsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
 Acquired to object 'Eng
 ineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]',
 sharedLocks=''}'
 2019-05-07 06:02:04,242-04 INFO
 [org.ovirt.engine.core.bll.ActivateVdsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] 
>

Re: URGENT - ovirt-engine broken for 3 days Re: Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-09 Thread Dafna Ron
Thanks Andrej.
I will follow the patch and update.
Dafna

On Thu, May 9, 2019 at 11:23 AM Andrej Krejcir  wrote:

> Hi,
>
> Ok, I have posted the reverting patch: https://gerrit.ovirt.org/#/c/99845/
>
> I'm still investigating what the problem is. Sorry for the delay, we had a
> public holiday yesterday.
>
>
> Andrej
>
> On Thu, 9 May 2019 at 11:20, Dafna Ron  wrote:
>
>> Hi,
>>
>> I have not heard back on this issue and ovirt-engine has been broken for
>> the past 3 days.
>>
>> As this does not seem a simple debug and fix I suggest reverting the
>> patch and investigating later.
>>
>> thanks,
>> Dafna
>>
>>
>>
>> On Wed, May 8, 2019 at 9:42 AM Dafna Ron  wrote:
>>
>>> Any news?
>>>
>>> Thanks,
>>> Dafna
>>>
>>>
>>> On Tue, May 7, 2019 at 4:57 PM Dafna Ron  wrote:
>>>
 thanks for the quick reply and investigation.
 Please update me if I can help any further and if you find the cause
 and have a patch let me know.
 Note that ovirt-engine project is broken and if we cannot find the
 cause relatively fast we should consider reverting the patch to allow a new
 package to be built in CQ with other changes that were submitted.

 Thanks,
 Dafna


 On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir 
 wrote:

> After running a few OSTs manually, it seems that the patch is the
> cause. Investigating...
>
> On Tue, 7 May 2019 at 14:58, Andrej Krejcir 
> wrote:
>
>> Hi,
>>
>> The issue is probably not caused by the patch.
>>
>> This log line means that the VM does not exist in the DB:
>>
>> 2019-05-07 06:02:04,215-04 WARN
>> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
>> Validation
>> of action 'MigrateMultipleVms' failed for user admin@internal-authz.
>> Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>>
>> I will investigate further why the VM is missing.
>>
>> On Tue, 7 May 2019 at 14:07, Dafna Ron  wrote:
>>
>>> Hi,
>>>
>>> We are failing test upgrade_hosts on
>>> upgrade-from-release-suite-master.
>>> From the logs I can see that we are calling VM migration when we have
>>> only one host, and the VM seems to have been shut down before the
>>> maintenance call is issued.
>>>
>>> Can you please look into this?
>>>
>>> suspected patch reported as root cause by CQ is:
>>>
>>> https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms
>>> command and use it for host maintenance
>>>
>>>
>>> logs are found here:
>>>
>>>
>>>
>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>>>
>>>
>>> I can see the issue is vm migration when putting host in
>>> maintenance:
>>>
>>>
>>> 2019-05-07 06:02:04,170-04 INFO
>>> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2)
>>> [05592db2-f859-487b-b779-4b32eec5bab
>>> 3] Running command: MaintenanceVdsCommand internal: true. Entities
>>> affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
>>> 2019-05-07 06:02:04,215-04 WARN
>>> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
>>> Validation
>>> of action
>>> 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons:
>>> ACTION_TYPE_FAILED_VMS_NOT_FOUND
>>> 2019-05-07 06:02:04,221-04 ERROR
>>> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed 
>>> to
>>> migrate one or
>>> more VMs.
>>> 2019-05-07 06:02:04,227-04 ERROR
>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVEN
>>> T_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host
>>> lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
>>> 2019-05-07 06:02:04,239-04 INFO
>>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
>>> Acquired to object 'Eng
>>> ineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]',
>>> sharedLocks=''}'
>>> 2019-05-07 06:02:04,242-04 INFO
>>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running
>>> command: ActivateVds
>>> Command internal: true. Entities affected : ID:
>>> 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group 
>>> MANIPULATE_HOST
>>> with role type ADMIN
>>> 2019-05-07 06:02:04,243-04 INFO
>>> [org.ovirt.engine.core.bll.ActivateVdsComm

Re: URGENT - ovirt-engine broken for 3 days Re: Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-09 Thread Andrej Krejcir
Hi,

Ok, I have posted the reverting patch: https://gerrit.ovirt.org/#/c/99845/

I'm still investigating what the problem is. Sorry for the delay, we had a
public holiday yesterday.


Andrej

On Thu, 9 May 2019 at 11:20, Dafna Ron  wrote:

> Hi,
>
> I have not heard back on this issue and ovirt-engine has been broken for
> the past 3 days.
>
> As this does not seem a simple debug and fix I suggest reverting the patch
> and investigating later.
>
> thanks,
> Dafna
>
>
>
> On Wed, May 8, 2019 at 9:42 AM Dafna Ron  wrote:
>
>> Any news?
>>
>> Thanks,
>> Dafna
>>
>>
>> On Tue, May 7, 2019 at 4:57 PM Dafna Ron  wrote:
>>
>>> thanks for the quick reply and investigation.
>>> Please update me if I can help any further and if you find the cause and
>>> have a patch let me know.
>>> Note that ovirt-engine project is broken and if we cannot find the cause
>>> relatively fast we should consider reverting the patch to allow a new
>>> package to be built in CQ with other changes that were submitted.
>>>
>>> Thanks,
>>> Dafna
>>>
>>>
>>> On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir 
>>> wrote:
>>>
 After running a few OSTs manually, it seems that the patch is the
 cause. Investigating...

 On Tue, 7 May 2019 at 14:58, Andrej Krejcir 
 wrote:

> Hi,
>
> The issue is probably not caused by the patch.
>
> This log line means that the VM does not exist in the DB:
>
> 2019-05-07 06:02:04,215-04 WARN
> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
> Validation
> of action 'MigrateMultipleVms' failed for user admin@internal-authz.
> Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>
> I will investigate further why the VM is missing.
>
> On Tue, 7 May 2019 at 14:07, Dafna Ron  wrote:
>
>> Hi,
>>
>> We are failing test upgrade_hosts on
>> upgrade-from-release-suite-master.
>> From the logs I can see that we are calling VM migration when we have
>> only one host, and the VM seems to have been shut down before the
>> maintenance call is issued.
>>
>> Can you please look into this?
>>
>> suspected patch reported as root cause by CQ is:
>>
>> https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms
>> command and use it for host maintenance
>>
>>
>> logs are found here:
>>
>>
>>
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>>
>>
>> I can see the issue is vm migration when putting host in maintenance:
>>
>>
>> 2019-05-07 06:02:04,170-04 INFO
>> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2)
>> [05592db2-f859-487b-b779-4b32eec5bab
>> 3] Running command: MaintenanceVdsCommand internal: true. Entities
>> affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
>> 2019-05-07 06:02:04,215-04 WARN
>> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
>> Validation
>> of action
>> 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons:
>> ACTION_TYPE_FAILED_VMS_NOT_FOUND
>> 2019-05-07 06:02:04,221-04 ERROR
>> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed 
>> to
>> migrate one or
>> more VMs.
>> 2019-05-07 06:02:04,227-04 ERROR
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVEN
>> T_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host
>> lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
>> 2019-05-07 06:02:04,239-04 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
>> Acquired to object 'Eng
>> ineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]',
>> sharedLocks=''}'
>> 2019-05-07 06:02:04,242-04 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running
>> command: ActivateVds
>> Command internal: true. Entities affected : ID:
>> 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group 
>> MANIPULATE_HOST
>> with role type ADMIN
>> 2019-05-07 06:02:04,243-04 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Before
>> acquiring lock in ord
>> er to prevent monitoring for host
>> 'lago-upgrade-from-release-suite-master-host-0' from data-center 
>> 'test-dc'
>> 2019-05-07 06:02

URGENT - ovirt-engine broken for 3 days Re: Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-09 Thread Dafna Ron
Hi,

I have not heard back on this issue and ovirt-engine has been broken for
the past 3 days.

As this does not seem a simple debug and fix I suggest reverting the patch
and investigating later.

thanks,
Dafna



On Wed, May 8, 2019 at 9:42 AM Dafna Ron  wrote:

> Any news?
>
> Thanks,
> Dafna
>
>
> On Tue, May 7, 2019 at 4:57 PM Dafna Ron  wrote:
>
>> thanks for the quick reply and investigation.
>> Please update me if I can help any further and if you find the cause and
>> have a patch let me know.
>> Note that ovirt-engine project is broken and if we cannot find the cause
>> relatively fast we should consider reverting the patch to allow a new
>> package to be built in CQ with other changes that were submitted.
>>
>> Thanks,
>> Dafna
>>
>>
>> On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir 
>> wrote:
>>
>>> After running a few OSTs manually, it seems that the patch is the cause.
>>> Investigating...
>>>
>>> On Tue, 7 May 2019 at 14:58, Andrej Krejcir  wrote:
>>>
 Hi,

 The issue is probably not caused by the patch.

 This log line means that the VM does not exist in the DB:

 2019-05-07 06:02:04,215-04 WARN
 [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation
 of action 'MigrateMultipleVms' failed for user admin@internal-authz.
 Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND

 I will investigate further why the VM is missing.

 On Tue, 7 May 2019 at 14:07, Dafna Ron  wrote:

> Hi,
>
> We are failing test upgrade_hosts on
> upgrade-from-release-suite-master.
> From the logs I can see that we are calling VM migration when we have
> only one host, and the VM seems to have been shut down before the
> maintenance call is issued.
>
> Can you please look into this?
>
> suspected patch reported as root cause by CQ is:
>
> https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms
> command and use it for host maintenance
>
>
> logs are found here:
>
>
>
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>
>
> I can see the issue is vm migration when putting host in maintenance:
>
>
> 2019-05-07 06:02:04,170-04 INFO
> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2)
> [05592db2-f859-487b-b779-4b32eec5bab
> 3] Running command: MaintenanceVdsCommand internal: true. Entities
> affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
> 2019-05-07 06:02:04,215-04 WARN
> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] 
> Validation
> of action
> 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons:
> ACTION_TYPE_FAILED_VMS_NOT_FOUND
> 2019-05-07 06:02:04,221-04 ERROR
> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to
> migrate one or
> more VMs.
> 2019-05-07 06:02:04,227-04 ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVEN
> T_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host
> lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
> 2019-05-07 06:02:04,239-04 INFO
> [org.ovirt.engine.core.bll.ActivateVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
> Acquired to object 'Eng
> ineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]',
> sharedLocks=''}'
> 2019-05-07 06:02:04,242-04 INFO
> [org.ovirt.engine.core.bll.ActivateVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running
> command: ActivateVds
> Command internal: true. Entities affected : ID:
> 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group MANIPULATE_HOST
> with role type ADMIN
> 2019-05-07 06:02:04,243-04 INFO
> [org.ovirt.engine.core.bll.ActivateVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Before
> acquiring lock in ord
> er to prevent monitoring for host
> 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
> 2019-05-07 06:02:04,243-04 INFO
> [org.ovirt.engine.core.bll.ActivateVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
> acquired, from now a mo
> nitoring of host will be skipped for host
> 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
> 2019-05-07 06:02:04,252-04 INFO
> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
> (

Re: Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-08 Thread Dafna Ron
Any news?

Thanks,
Dafna


On Tue, May 7, 2019 at 4:57 PM Dafna Ron  wrote:

> thanks for the quick reply and investigation.
> Please update me if I can help any further and if you find the cause and
> have a patch let me know.
> Note that ovirt-engine project is broken and if we cannot find the cause
> relatively fast we should consider reverting the patch to allow a new
> package to be built in CQ with other changes that were submitted.
>
> Thanks,
> Dafna
>
>
> On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir  wrote:
>
>> After running a few OSTs manually, it seems that the patch is the cause.
>> Investigating...
>>
>> On Tue, 7 May 2019 at 14:58, Andrej Krejcir  wrote:
>>
>>> Hi,
>>>
>>> The issue is probably not caused by the patch.
>>>
>>> This log line means that the VM does not exist in the DB:
>>>
>>> 2019-05-07 06:02:04,215-04 WARN
>>> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation
>>> of action 'MigrateMultipleVms' failed for user admin@internal-authz.
>>> Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>>>
>>> I will investigate further why the VM is missing.
>>>
>>> On Tue, 7 May 2019 at 14:07, Dafna Ron  wrote:
>>>
 Hi,

 We are failing test upgrade_hosts on upgrade-from-release-suite-master.
 From the logs I can see that we are calling VM migration when we have
 only one host, and the VM seems to have been shut down before the maintenance
 call is issued.

 Can you please look into this?

 suspected patch reported as root cause by CQ is:

 https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms
 command and use it for host maintenance


 logs are found here:



 http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/


 I can see the issue is vm migration when putting host in maintenance:


 2019-05-07 06:02:04,170-04 INFO
 [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2)
 [05592db2-f859-487b-b779-4b32eec5bab
 3] Running command: MaintenanceVdsCommand internal: true. Entities
 affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
 2019-05-07 06:02:04,215-04 WARN
 [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation
 of action
 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons:
 ACTION_TYPE_FAILED_VMS_NOT_FOUND
 2019-05-07 06:02:04,221-04 ERROR
 [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to
 migrate one or
 more VMs.
 2019-05-07 06:02:04,227-04 ERROR
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVEN
 T_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host
 lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
 2019-05-07 06:02:04,239-04 INFO
 [org.ovirt.engine.core.bll.ActivateVdsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
 Acquired to object 'Eng
 ineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]',
 sharedLocks=''}'
 2019-05-07 06:02:04,242-04 INFO
 [org.ovirt.engine.core.bll.ActivateVdsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running
 command: ActivateVds
 Command internal: true. Entities affected : ID:
 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group MANIPULATE_HOST
 with role type ADMIN
 2019-05-07 06:02:04,243-04 INFO
 [org.ovirt.engine.core.bll.ActivateVdsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Before
 acquiring lock in ord
 er to prevent monitoring for host
 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
 2019-05-07 06:02:04,243-04 INFO
 [org.ovirt.engine.core.bll.ActivateVdsCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
 acquired, from now a mo
 nitoring of host will be skipped for host
 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
 2019-05-07 06:02:04,252-04 INFO
 [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
 (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] START,
 SetVdsStatu
 sVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0,
 SetVdsStatusVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9',
 status='Unassigned', n
 onOperationalReason='NONE', stopSpmFailureLogged='false',
 maintenanceReason='null'}), log id: 2c8aa211
 2019-05-07 06

Re: Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-07 Thread Dafna Ron
thanks for the quick reply and investigation.
Please update me if I can help any further and if you find the cause and
have a patch let me know.
Note that ovirt-engine project is broken and if we cannot find the cause
relatively fast we should consider reverting the patch to allow a new
package to be built in CQ with other changes that were submitted.

Thanks,
Dafna


On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir  wrote:

> After running a few OSTs manually, it seems that the patch is the cause.
> Investigating...
>
> On Tue, 7 May 2019 at 14:58, Andrej Krejcir  wrote:
>
>> Hi,
>>
>> The issue is probably not caused by the patch.
>>
>> This log line means that the VM does not exist in the DB:
>>
>> 2019-05-07 06:02:04,215-04 WARN
>> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation
>> of action 'MigrateMultipleVms' failed for user admin@internal-authz.
>> Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>>
>> I will investigate further why the VM is missing.
>>
>> On Tue, 7 May 2019 at 14:07, Dafna Ron  wrote:
>>
>>> Hi,
>>>
>>> We are failing test upgrade_hosts on upgrade-from-release-suite-master.
>>> From the logs I can see that we are calling VM migration when we have only
>>> one host, and the VM seems to have been shut down before the maintenance call
>>> is issued.
>>>
>>> Can you please look into this?
>>>
>>> suspected patch reported as root cause by CQ is:
>>>
>>> https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms
>>> command and use it for host maintenance
>>>
>>>
>>> logs are found here:
>>>
>>>
>>>
>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>>>
>>>
>>> I can see the issue is vm migration when putting host in maintenance:
>>>
>>>
>>> 2019-05-07 06:02:04,170-04 INFO
>>> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2)
>>> [05592db2-f859-487b-b779-4b32eec5bab
>>> 3] Running command: MaintenanceVdsCommand internal: true. Entities
>>> affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
>>> 2019-05-07 06:02:04,215-04 WARN
>>> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation
>>> of action
>>> 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons:
>>> ACTION_TYPE_FAILED_VMS_NOT_FOUND
>>> 2019-05-07 06:02:04,221-04 ERROR
>>> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to
>>> migrate one or
>>> more VMs.
>>> 2019-05-07 06:02:04,227-04 ERROR
>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVEN
>>> T_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host
>>> lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
>>> 2019-05-07 06:02:04,239-04 INFO
>>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
>>> Acquired to object 'Eng
>>> ineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]',
>>> sharedLocks=''}'
>>> 2019-05-07 06:02:04,242-04 INFO
>>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running
>>> command: ActivateVds
>>> Command internal: true. Entities affected : ID:
>>> 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group MANIPULATE_HOST
>>> with role type ADMIN
>>> 2019-05-07 06:02:04,243-04 INFO
>>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Before
>>> acquiring lock in ord
>>> er to prevent monitoring for host
>>> 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
>>> 2019-05-07 06:02:04,243-04 INFO
>>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
>>> acquired, from now a mo
>>> nitoring of host will be skipped for host
>>> 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
>>> 2019-05-07 06:02:04,252-04 INFO
>>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] START,
>>> SetVdsStatu
>>> sVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0,
>>> SetVdsStatusVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9',
>>> status='Unassigned', n
>>> onOperationalReason='NONE', stopSpmFailureLogged='false',
>>> maintenanceReason='null'}), log id: 2c8aa211
>>> 2019-05-07 06:02:04,256-04 INFO
>>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
>>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] FINISH,
>>> SetVdsStat
>>> usVDSCommand, return: , lo

Re: Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-07 Thread Andrej Krejcir
After running a few OSTs manually, it seems that the patch is the cause.
Investigating...

On Tue, 7 May 2019 at 14:58, Andrej Krejcir  wrote:

> Hi,
>
> The issue is probably not caused by the patch.
>
> This log line means that the VM does not exist in the DB:
>
> 2019-05-07 06:02:04,215-04 WARN
> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation
> of action 'MigrateMultipleVms' failed for user admin@internal-authz.
> Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>
> I will investigate further why the VM is missing.
>
> On Tue, 7 May 2019 at 14:07, Dafna Ron  wrote:
>
>> Hi,
>>
>> We are failing test upgrade_hosts on upgrade-from-release-suite-master.
>> From the logs I can see that we are calling VM migration when we have only
>> one host, and the VM seems to have been shut down before the maintenance call
>> is issued.
>>
>> Can you please look into this?
>>
>> suspected patch reported as root cause by CQ is:
>>
>> https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms
>> command and use it for host maintenance
>>
>>
>> logs are found here:
>>
>>
>>
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>>
>>
>> I can see the issue is vm migration when putting host in maintenance:
>>
>>
>> 2019-05-07 06:02:04,170-04 INFO
>> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2)
>> [05592db2-f859-487b-b779-4b32eec5bab
>> 3] Running command: MaintenanceVdsCommand internal: true. Entities
>> affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
>> 2019-05-07 06:02:04,215-04 WARN
>> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation
>> of action
>> 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons:
>> ACTION_TYPE_FAILED_VMS_NOT_FOUND
>> 2019-05-07 06:02:04,221-04 ERROR
>> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to
>> migrate one or
>> more VMs.
>> 2019-05-07 06:02:04,227-04 ERROR
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVEN
>> T_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host
>> lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
>> 2019-05-07 06:02:04,239-04 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
>> Acquired to object 'Eng
>> ineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]',
>> sharedLocks=''}'
>> 2019-05-07 06:02:04,242-04 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running
>> command: ActivateVds
>> Command internal: true. Entities affected : ID:
>> 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group MANIPULATE_HOST
>> with role type ADMIN
>> 2019-05-07 06:02:04,243-04 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Before
>> acquiring lock in ord
>> er to prevent monitoring for host
>> 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
>> 2019-05-07 06:02:04,243-04 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
>> acquired, from now a mo
>> nitoring of host will be skipped for host
>> 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
>> 2019-05-07 06:02:04,252-04 INFO
>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] START,
>> SetVdsStatu
>> sVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0,
>> SetVdsStatusVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9',
>> status='Unassigned', n
>> onOperationalReason='NONE', stopSpmFailureLogged='false',
>> maintenanceReason='null'}), log id: 2c8aa211
>> 2019-05-07 06:02:04,256-04 INFO
>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] FINISH,
>> SetVdsStat
>> usVDSCommand, return: , log id: 2c8aa211
>> 2019-05-07 06:02:04,261-04 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Activate
>> host finished. Lock
>> released. Monitoring can run now for host
>> 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
>> 2019-05-07 06:02:04,265-04 INFO
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] EVEN
>> T_ID: VDS_ACTIVATE(16), Acti

Re: Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-07 Thread Andrej Krejcir
Hi,

The issue is probably not caused by the patch.

This log line means that the VM does not exist in the DB:

2019-05-07 06:02:04,215-04 WARN
[org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
(EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation
of action 'MigrateMultipleVms' failed for user admin@internal-authz.
Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND

I will investigate further why the VM is missing.
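For anyone following along, this is roughly the kind of check that produces that reason code. It is only an illustrative sketch with assumed names (VmLookup, validateVmsExist), not the actual MigrateMultipleVmsCommand code:

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Illustrative sketch only -- not the real ovirt-engine code.
// Assumption: before scheduling any migration, the bulk-migration command
// checks that every requested VM id still resolves to a row in the engine DB.
class VmsNotFoundValidationSketch {

    /** Stand-in for the engine's VM lookup (DB access). */
    interface VmLookup {
        boolean exists(UUID vmId);
    }

    static final String ACTION_TYPE_FAILED_VMS_NOT_FOUND =
            "ACTION_TYPE_FAILED_VMS_NOT_FOUND";

    /**
     * Returns the validation failure reasons; an empty list means the
     * command may proceed. A single missing VM id fails the whole command,
     * which would explain the WARN line quoted above if the VM was removed
     * before maintenance started.
     */
    static List<String> validateVmsExist(List<UUID> vmIds, VmLookup lookup) {
        List<String> reasons = new ArrayList<>();
        for (UUID vmId : vmIds) {
            if (!lookup.exists(vmId)) {
                reasons.add(ACTION_TYPE_FAILED_VMS_NOT_FOUND);
                break;
            }
        }
        return reasons;
    }
}

Under that reading, the question is why the id handed to the command no longer resolves by the time validation runs.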

On Tue, 7 May 2019 at 14:07, Dafna Ron  wrote:

> Hi,
>
> We are failing test upgrade_hosts on upgrade-from-release-suite-master.
> From the logs I can see that we are calling VM migration when we have only
> one host, and the VM seems to have been shut down before the maintenance call
> is issued.
>
> Can you please look into this?
>
> suspected patch reported as root cause by CQ is:
>
> https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms
> command and use it for host maintenance
>
>
> logs are found here:
>
>
>
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>
>
> I can see the issue is vm migration when putting host in maintenance:
>
>
> 2019-05-07 06:02:04,170-04 INFO
> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2)
> [05592db2-f859-487b-b779-4b32eec5bab
> 3] Running command: MaintenanceVdsCommand internal: true. Entities
> affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
> 2019-05-07 06:02:04,215-04 WARN
> [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation
> of action
> 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons:
> ACTION_TYPE_FAILED_VMS_NOT_FOUND
> 2019-05-07 06:02:04,221-04 ERROR
> [org.ovirt.engine.core.bll.MaintenanceVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to
> migrate one or
> more VMs.
> 2019-05-07 06:02:04,227-04 ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVEN
> T_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host
> lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
> 2019-05-07 06:02:04,239-04 INFO
> [org.ovirt.engine.core.bll.ActivateVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
> Acquired to object 'Eng
> ineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]',
> sharedLocks=''}'
> 2019-05-07 06:02:04,242-04 INFO
> [org.ovirt.engine.core.bll.ActivateVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running
> command: ActivateVds
> Command internal: true. Entities affected : ID:
> 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group MANIPULATE_HOST
> with role type ADMIN
> 2019-05-07 06:02:04,243-04 INFO
> [org.ovirt.engine.core.bll.ActivateVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Before
> acquiring lock in ord
> er to prevent monitoring for host
> 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
> 2019-05-07 06:02:04,243-04 INFO
> [org.ovirt.engine.core.bll.ActivateVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock
> acquired, from now a mo
> nitoring of host will be skipped for host
> 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
> 2019-05-07 06:02:04,252-04 INFO
> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] START,
> SetVdsStatu
> sVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0,
> SetVdsStatusVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9',
> status='Unassigned', n
> onOperationalReason='NONE', stopSpmFailureLogged='false',
> maintenanceReason='null'}), log id: 2c8aa211
> 2019-05-07 06:02:04,256-04 INFO
> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] FINISH,
> SetVdsStat
> usVDSCommand, return: , log id: 2c8aa211
> 2019-05-07 06:02:04,261-04 INFO
> [org.ovirt.engine.core.bll.ActivateVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Activate
> host finished. Lock
> released. Monitoring can run now for host
> 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
> 2019-05-07 06:02:04,265-04 INFO
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] EVEN
> T_ID: VDS_ACTIVATE(16), Activation of host
> lago-upgrade-from-release-suite-master-host-0 initiated by
> admin@internal-authz.
> 2019-05-07 06:02:04,266-04 INFO
> [org.ovirt.engine.core.bll.ActivateVdsCommand]
> (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock freed
> to 

Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-07 Thread Dafna Ron
Hi,

We are failing test upgrade_hosts on upgrade-from-release-suite-master.
From the logs I can see that we are calling VM migration when we have only
one host, and the VM seems to have been shut down before the maintenance call
is issued.
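To make the suspected ordering concrete, here is a minimal sketch of how I read the failure. The class and method names are assumptions for illustration only, not the engine code:

import java.util.List;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of the suspected ordering, with assumed names -- not the engine code.
// 1. The suite shuts down (removes) the only VM on the host.
// 2. upgrade_hosts then asks the engine to move the host to maintenance.
// 3. The maintenance flow still hands the stale VM id to the bulk-migration step,
//    whose validation fails with VMS_NOT_FOUND, so maintenance is aborted.
class UpgradeHostsRaceSketch {

    static final ConcurrentHashMap<UUID, String> vmsInDb = new ConcurrentHashMap<>();

    static boolean moveHostToMaintenance(List<UUID> vmsOnHost) {
        for (UUID vmId : vmsOnHost) {
            if (!vmsInDb.containsKey(vmId)) {
                // Corresponds to the WARN/ERROR pair in the log excerpt below.
                System.out.println("ACTION_TYPE_FAILED_VMS_NOT_FOUND -> VDS_MAINTENANCE_FAILED");
                return false;
            }
        }
        return true; // migrations would be scheduled here
    }

    public static void main(String[] args) {
        UUID vm = UUID.randomUUID();
        vmsInDb.put(vm, "vm-with-iface");       // name taken from the qemu log listing below
        List<UUID> vmsOnHost = List.of(vm);     // VM list built while the VM still existed

        vmsInDb.remove(vm);                     // the VM is shut down/removed first

        moveHostToMaintenance(vmsOnHost);       // then maintenance runs and fails
    }
}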

Can you please look into this?

suspected patch reported as root cause by CQ is:

https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms command
and use it for host maintenance


logs are found here:


http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/


I can see the issue is VM migration when putting the host into maintenance:


2019-05-07 06:02:04,170-04 INFO [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [05592db2-f859-487b-b779-4b32eec5bab3] Running command: MaintenanceVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
2019-05-07 06:02:04,221-04 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to migrate one or more VMs.
2019-05-07 06:02:04,227-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVENT_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
2019-05-07 06:02:04,239-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock Acquired to object 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]', sharedLocks=''}'
2019-05-07 06:02:04,242-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running command: ActivateVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2019-05-07 06:02:04,243-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Before acquiring lock in order to prevent monitoring for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
2019-05-07 06:02:04,243-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock acquired, from now a monitoring of host will be skipped for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
2019-05-07 06:02:04,252-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] START, SetVdsStatusVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, SetVdsStatusVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', status='Unassigned', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 2c8aa211
2019-05-07 06:02:04,256-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] FINISH, SetVdsStatusVDSCommand, return: , log id: 2c8aa211
2019-05-07 06:02:04,261-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Activate host finished. Lock released. Monitoring can run now for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
2019-05-07 06:02:04,265-04 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] EVENT_ID: VDS_ACTIVATE(16), Activation of host lago-upgrade-from-release-suite-master-host-0 initiated by admin@internal-authz.
2019-05-07 06:02:04,266-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock freed to object 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]', sharedLocks=''}'
2019-05-07 06:02:04,484-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-96) [05592db2-f859-487b-b779-4b32eec5bab3] Host 'lago-upgrade-from-release-suite-master-host-0' failed to move to maintenance mode. Upgrade process is terminated.

I can see there was only one VM running:


drwxrwxr-x. 2 dron dron 1024 May 7 11:49 qemu
[dron@dron post-004_basic_sanity.py]$ ls -l lago-upgrade-from-release-suite-master-host-0/_var_log/libvirt/qemu/
total 6
-rw-rw-r--. 1 dron dron 4466 May 7 10:12 vm-with-iface.log