Thanks Andrej. I will follow the patch and update.

Dafna

On Thu, May 9, 2019 at 11:23 AM Andrej Krejcir <akrej...@redhat.com> wrote:
> Hi,
>
> Ok, I have posted the reverting patch: https://gerrit.ovirt.org/#/c/99845/
>
> I'm still investigating what the problem is. Sorry for the delay, we had
> a public holiday yesterday.
>
>
> Andrej
>
> On Thu, 9 May 2019 at 11:20, Dafna Ron <d...@redhat.com> wrote:
>
>> Hi,
>>
>> I have not heard back on this issue, and ovirt-engine has been broken
>> for the past 3 days.
>>
>> As this does not seem a simple debug and fix, I suggest reverting the
>> patch and investigating later.
>>
>> Thanks,
>> Dafna
>>
>>
>> On Wed, May 8, 2019 at 9:42 AM Dafna Ron <d...@redhat.com> wrote:
>>
>>> Any news?
>>>
>>> Thanks,
>>> Dafna
>>>
>>>
>>> On Tue, May 7, 2019 at 4:57 PM Dafna Ron <d...@redhat.com> wrote:
>>>
>>>> Thanks for the quick reply and investigation.
>>>> Please update me if I can help any further, and if you find the cause
>>>> and have a patch, let me know.
>>>> Note that the ovirt-engine project is broken; if we cannot find the
>>>> cause relatively fast, we should consider reverting the patch to allow
>>>> a new package to be built in CQ with the other changes that were
>>>> submitted.
>>>>
>>>> Thanks,
>>>> Dafna
>>>>
>>>>
>>>> On Tue, May 7, 2019 at 4:42 PM Andrej Krejcir <akrej...@redhat.com> wrote:
>>>>
>>>>> After running a few OSTs manually, it seems that the patch is the
>>>>> cause. Investigating...
>>>>>
>>>>> On Tue, 7 May 2019 at 14:58, Andrej Krejcir <akrej...@redhat.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> The issue is probably not caused by the patch.
>>>>>>
>>>>>> This log line means that the VM does not exist in the DB:
>>>>>>
>>>>>> 2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>>>>>>
>>>>>> I will investigate more why the VM is missing.
>>>>>>
>>>>>> On Tue, 7 May 2019 at 14:07, Dafna Ron <d...@redhat.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> We are failing test upgrade_hosts on
>>>>>>> upgrade-from-release-suite-master.
>>>>>>> From the logs I can see that we are calling migrate vm when we have
>>>>>>> only one host, and the vm seems to have been shut down before the
>>>>>>> maintenance call is issued.
>>>>>>>
>>>>>>> Can you please look into this?
>>>>>>>
>>>>>>> The suspected patch reported as root cause by CQ is:
>>>>>>>
>>>>>>> https://gerrit.ovirt.org/#/c/98920/ - core: Add MigrateMultipleVms
>>>>>>> command and use it for host maintenance
>>>>>>>
>>>>>>> Logs are found here:
>>>>>>>
>>>>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14021/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-004_basic_sanity.py/
>>>>>>>
>>>>>>> I can see the issue is vm migration when putting the host in
>>>>>>> maintenance:
>>>>>>>
>>>>>>> 2019-05-07 06:02:04,170-04 INFO [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [05592db2-f859-487b-b779-4b32eec5bab3] Running command: MaintenanceVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDS
>>>>>>> 2019-05-07 06:02:04,215-04 WARN [org.ovirt.engine.core.bll.MigrateMultipleVmsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Validation of action 'MigrateMultipleVms' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VMS_NOT_FOUND
>>>>>>> 2019-05-07 06:02:04,221-04 ERROR [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] Failed to migrate one or more VMs.
>>>>>>> 2019-05-07 06:02:04,227-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [33485140] EVENT_ID: VDS_MAINTENANCE_FAILED(17), Failed to switch Host lago-upgrade-from-release-suite-master-host-0 to Maintenance mode.
>>>>>>> 2019-05-07 06:02:04,239-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock Acquired to object 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]', sharedLocks=''}'
>>>>>>> 2019-05-07 06:02:04,242-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Running command: ActivateVdsCommand internal: true. Entities affected : ID: 38e1379b-c3b6-4a2e-91df-d1f346e414a9 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
>>>>>>> 2019-05-07 06:02:04,243-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Before acquiring lock in order to prevent monitoring for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
>>>>>>> 2019-05-07 06:02:04,243-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock acquired, from now a monitoring of host will be skipped for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
>>>>>>> 2019-05-07 06:02:04,252-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] START, SetVdsStatusVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, SetVdsStatusVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', status='Unassigned', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 2c8aa211
>>>>>>> 2019-05-07 06:02:04,256-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] FINISH, SetVdsStatusVDSCommand, return: , log id: 2c8aa211
>>>>>>> 2019-05-07 06:02:04,261-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Activate host finished. Lock released. Monitoring can run now for host 'lago-upgrade-from-release-suite-master-host-0' from data-center 'test-dc'
>>>>>>> 2019-05-07 06:02:04,265-04 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] EVENT_ID: VDS_ACTIVATE(16), Activation of host lago-upgrade-from-release-suite-master-host-0 initiated by admin@internal-authz.
>>>>>>> 2019-05-07 06:02:04,266-04 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [70840477] Lock freed to object 'EngineLock:{exclusiveLocks='[38e1379b-c3b6-4a2e-91df-d1f346e414a9=VDS]', sharedLocks=''}'
>>>>>>> 2019-05-07 06:02:04,484-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-96) [05592db2-f859-487b-b779-4b32eec5bab3] Host 'lago-upgrade-from-release-suite-master-host-0' failed to move to maintenance mode. Upgrade process is terminated.
>>>>>>>
>>>>>>> I can see there was only one vm running:
>>>>>>>
>>>>>>> drwxrwxr-x. 2 dron dron 1024 May 7 11:49 qemu
>>>>>>> [dron@dron post-004_basic_sanity.py]$ ls -l lago-upgrade-from-release-suite-master-host-0/_var_log/libvirt/qemu/
>>>>>>> total 6
>>>>>>> -rw-rw-r--. 1 dron dron 4466 May 7 10:12 vm-with-iface.log
>>>>>>>
>>>>>>> and I can see that there was an attempt to terminate it with an
>>>>>>> error that it does not exist:
>>>>>>>
>>>>>>> stroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'}), log id: 24278e9b
>>>>>>> 2019-05-07 06:01:41,082-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] START, DestroyVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'}), log id: 78bba2f8
>>>>>>> 2019-05-07 06:01:42,090-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] FINISH, DestroyVDSCommand, return: , log id: 78bba2f8
>>>>>>> 2019-05-07 06:01:42,090-04 INFO [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (default task-1) [105f7555-517b-4bf9-b86e-6eb42375de20] FINISH, DestroyVmVDSCommand, return: , log id: 24278e9b
>>>>>>> 2019-05-07 06:01:42,094-04 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-4) [] VM 'dfbd75e2-a9cb-4fca-8788-a16954db4abf' was reported as Down on VDS '38e1379b-c3b6-4a2e-91df-d1f346e414a9'(lago-upgrade-from-release-suite-master-host-0)
>>>>>>> 2019-05-07 06:01:42,096-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-4) [] START, DestroyVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, DestroyVmVDSCommandParameters:{hostId='38e1379b-c3b6-4a2e-91df-d1f346e414a9', vmId='dfbd75e2-a9cb-4fca-8788-a16954db4abf', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 1dbd31eb
>>>>>>> 2019-05-07 06:01:42,114-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-4) [] Failed to destroy VM 'dfbd75e2-a9cb-4fca-8788-a16954db4abf' because VM does not exist, ignoring
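[The analysis above traces two separate flows through engine.log by their bracketed correlation IDs, e.g. [33485140] for the failed MigrateMultipleVms validation and [70840477] for the host re-activation. A minimal sketch of that kind of filtering, as a standalone helper that is not part of oVirt (the function name and sample lines are illustrative only):]

```python
# Sketch: pick out the engine.log lines that belong to one command flow
# by matching its bracketed correlation ID. Illustrative helper only.

def filter_by_correlation_id(lines, corr_id):
    """Return the log lines tagged with the given correlation ID."""
    tag = f"[{corr_id}]"
    return [line for line in lines if tag in line]

# Shortened sample records, modeled on the excerpts quoted above.
sample = [
    "2019-05-07 06:02:04,215-04 WARN [MigrateMultipleVmsCommand] [33485140] Validation of action 'MigrateMultipleVms' failed",
    "2019-05-07 06:02:04,221-04 ERROR [MaintenanceVdsCommand] [33485140] Failed to migrate one or more VMs.",
    "2019-05-07 06:02:04,239-04 INFO [ActivateVdsCommand] [70840477] Lock Acquired",
]

for line in filter_by_correlation_id(sample, "33485140"):
    print(line)
```

[Running the sketch on the sample records prints only the two [33485140] lines, which is essentially what was done by hand when isolating the MigrateMultipleVms failure from the surrounding activation noise.]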
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/6JO6RPROKBSCX2ROV6WEWDHJADSUWJRS/