Re: [ovirt-devel] Re: URGENT - ovirt-engine broken for 3 days Re: Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-10 Thread Dafna Ron
We have a passing ovirt-engine build today.
Thank you all for a fast response.
Dafna


On Thu, May 9, 2019 at 12:43 PM Sandro Bonazzola 
wrote:

>
>
> On Thu, 9 May 2019 at 12:59, Dafna Ron wrote:
>
>> As IL is on Independence Day, can anyone else merge?
>> https://gerrit.ovirt.org/#/c/99845/
>>
>>
> I have merge rights but I need at least CI to pass. Waiting on jenkins.
>
>
>>
>> On Thu, May 9, 2019 at 11:30 AM Dafna Ron  wrote:
>>
>>> Thanks Andrej.
>>> I will follow the patch and update.
>>> Dafna
>>>
>>> On Thu, May 9, 2019 at 11:23 AM Andrej Krejcir 
>>> wrote:
>>>
 Hi,

 Ok, I have posted the reverting patch:
 https://gerrit.ovirt.org/#/c/99845/

 I'm still investigating what the problem is. Sorry for the delay, we
 had a public holiday yesterday.


 Andrej

 On Thu, 9 May 2019 at 11:20, Dafna Ron  wrote:

> Hi,
>
> I have not heard back on this issue and ovirt-engine has been broken
> for the past 3 days.
>
> As this does not seem a simple debug and fix I suggest reverting the
> patch and investigating later.
>
> thanks,
> Dafna
>
>
>
> On Wed, May 8, 2019 at 9:42 AM Dafna Ron  wrote:
>
>> Any news?
>>
>> Thanks,
>> Dafna
>>
>>
>> On Tue, May 7, 2019 at 4:57 PM Dafna Ron  wrote:
>>
>>> thanks for the quick reply and investigation.
>>> Please update me if I can help any further and if you find the cause
>>> and have a patch let me know.
>>> Note that ovirt-engine project is broken and if we cannot find the
>>> cause relatively fast we should consider reverting the patch to allow a new
>>> package to be built in CQ with other changes that were submitted.
>>>
>>> Thanks,
>>> Dafna
>>>
>>>
>>> On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir 
>>> wrote:
>>>
 After running a few OSTs manually, it seems that the patch is the
 cause. Investigating...

 On Tue, 7 May 2019 at 14:58, Andrej Krejcir 
 wrote:

> Hi,
>
> The issue is probably not caused by the patch.
>
> This log line means that the VM does not exist in the DB:
>
> 2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>
> I will investigate further why the VM is missing.
>
> On Tue, 7 May 2019 at 14:07, Dafna Ron  wrote:
>
>> Hi,
>>
>> We are failing test upgrade_hosts on
>> upgrade-from-release-suite-master.
>> From the logs I can see that we are calling migrate VM when we
>> have only one host, and the VM seems to have been shut down before the
>> maintenance call is issued.
>>
>> Can you please look into this?
>>
>> suspected patch reported as root cause by CQ is:
>>
>> https://gerrit.ovirt.org/#/c/98920/ - core: Add
>> MigrateMultipleVms command and use it for host maintenance
>>
>>
>> logs are found here:
>>
>>
>>
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>>
>>
>> I can see the issue is vm migration when putting host in
>> maintenance:
>>
>>
>> 2019-05-07 06:02:04,170-04 INFO [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [05592db2-f859-487b-b779-4b32eec5bab3] Running command: MaintenanceVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
>> 2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>> 2019-05-07 06:02:04,221-04 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to migrate one or more VMs.
>> 2019-05-07 06:02:04,227-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVENT_ID: VDS_MAINTENANCE_FAILED(17), Failed to

Re: [ovirt-devel] Re: URGENT - ovirt-engine broken for 3 days Re: Subject: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 05-05-2019 ] [ upgrade_hosts ]

2019-05-09 Thread Sandro Bonazzola
On Thu, 9 May 2019 at 12:59, Dafna Ron wrote:

> As IL is on Independence Day, can anyone else merge?
> https://gerrit.ovirt.org/#/c/99845/
>
>
I have merge rights but I need at least CI to pass. Waiting on jenkins.


>
> On Thu, May 9, 2019 at 11:30 AM Dafna Ron  wrote:
>
>> Thanks Andrej.
>> I will follow the patch and update.
>> Dafna
>>
>> On Thu, May 9, 2019 at 11:23 AM Andrej Krejcir 
>> wrote:
>>
>>> Hi,
>>>
>>> Ok, I have posted the reverting patch:
>>> https://gerrit.ovirt.org/#/c/99845/
>>>
>>> I'm still investigating what the problem is. Sorry for the delay, we had
>>> a public holiday yesterday.
>>>
>>>
>>> Andrej
>>>
>>> On Thu, 9 May 2019 at 11:20, Dafna Ron  wrote:
>>>
 Hi,

 I have not heard back on this issue and ovirt-engine has been broken
 for the past 3 days.

 As this does not seem a simple debug and fix I suggest reverting the
 patch and investigating later.

 thanks,
 Dafna



 On Wed, May 8, 2019 at 9:42 AM Dafna Ron  wrote:

> Any news?
>
> Thanks,
> Dafna
>
>
> On Tue, May 7, 2019 at 4:57 PM Dafna Ron  wrote:
>
>> thanks for the quick reply and investigation.
>> Please update me if I can help any further and if you find the cause
>> and have a patch let me know.
>> Note that ovirt-engine project is broken and if we cannot find the
>> cause relatively fast we should consider reverting the patch to allow a new
>> package to be built in CQ with other changes that were submitted.
>>
>> Thanks,
>> Dafna
>>
>>
>> On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir 
>> wrote:
>>
>>> After running a few OSTs manually, it seems that the patch is the
>>> cause. Investigating...
>>>
>>> On Tue, 7 May 2019 at 14:58, Andrej Krejcir 
>>> wrote:
>>>
 Hi,

 The issue is probably not caused by the patch.

 This log line means that the VM does not exist in the DB:

 2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND

 I will investigate further why the VM is missing.

 On Tue, 7 May 2019 at 14:07, Dafna Ron  wrote:

> Hi,
>
> We are failing test upgrade_hosts on
> upgrade-from-release-suite-master.
> From the logs I can see that we are calling migrate VM when we
> have only one host, and the VM seems to have been shut down before the
> maintenance call is issued.
>
> Can you please look into this?
>
> suspected patch reported as root cause by CQ is:
>
> https://gerrit.ovirt.org/#/c/98920/ - core: Add
> MigrateMultipleVms command and use it for host maintenance
>
>
> logs are found here:
>
>
>
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>
>
> I can see the issue is vm migration when putting host in
> maintenance:
>
>
> 2019-05-07 06:02:04,170-04 INFO [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [05592db2-f859-487b-b779-4b32eec5bab3] Running command: MaintenanceVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
> 2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
> 2019-05-07 06:02:04,221-04 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to migrate one or more VMs.
> 2019-05-07 06:02:04,227-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVENT_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
> 2019-05-07 06:02:04,239-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock Acquired to object 'Eng