After investigating, it looks like the issues started when this patch was merged.
Marcin, can you help debug it?
https://gerrit.ovirt.org/#/c/107399/

Thanks,
Galit

On Mon, Mar 30, 2020 at 6:42 PM Martin Perina <mper...@redhat.com> wrote:

>
> On Mon, Mar 30, 2020 at 5:38 PM Galit Rosenthal <grose...@redhat.com> wrote:
>
>> It looks like the local repo stops running.
>> When I run curl before the failure just to check the status, I can see it
>> isn't accessible.
>>
>> I'm trying to see where it fails or what causes it to fail.
>>
>> I managed to reproduce it on BM.
>>
>
> I thought that moving setup_storage would mitigate the issue:
> https://gerrit.ovirt.org/#/c/107989/
> But it just postponed the error to a later phase; now adding a host fails
> with the same issue: Failed to download metadata for repo 'alocalsync'
>
> https://jenkins.ovirt.org/view/oVirt system tests/job/ovirt-system-tests_manual/6710
>
> So Galit, please take a look, oVirt CQ has been suffering from this issue
> for more than a week now.
>
>>
>> On Mon, Mar 30, 2020 at 6:23 PM Marcin Sobczyk <msobc...@redhat.com> wrote:
>>
>>> Hi Galit
>>>
>>> I can see the issue again - now in manual OST runs:
>>>
>>> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/6711/consoleFull#L2,856
>>>
>>> Regards, Marcin
>>>
>>> On 3/23/20 10:09 PM, Marcin Sobczyk wrote:
>>>
>>> On 3/23/20 8:51 PM, Galit Rosenthal wrote:
>>>
>>> I ran it now locally using the extra sources as it runs in the CQ and it
>>> didn't fail for me.
>>>
>>> I will continue to investigate tomorrow.
>>>
>>> Marcin, did you see this issue also in check_patch or only in CQ?
>>>
>>> I wasn't aware of the issue till Nir raised it - I was working with the
>>> patch previously and both check-patch and manual runs were fine.
>>> I think it concerns only CQ then.
>>>
>>> Regards,
>>> Galit
>>>
>>> On Mon, Mar 23, 2020 at 4:29 PM Galit Rosenthal <grose...@redhat.com> wrote:
>>>
>>>> I will look at it.
>>>>
>>>> On Mon, Mar 23, 2020 at 4:18 PM Martin Perina <mper...@redhat.com> wrote:
>>>>
>>>>>
>>>>> On Mon, Mar 23, 2020 at 3:16 PM Marcin Sobczyk <msobc...@redhat.com> wrote:
>>>>>
>>>>>>
>>>>>> On 3/23/20 3:10 PM, Marcin Sobczyk wrote:
>>>>>> >
>>>>>> > On 3/23/20 2:53 PM, Nir Soffer wrote:
>>>>>> >> On Mon, Mar 23, 2020 at 3:26 PM Marcin Sobczyk <msobc...@redhat.com> wrote:
>>>>>> >>>
>>>>>> >>> On 3/23/20 2:17 PM, Nir Soffer wrote:
>>>>>> >>>> On Mon, Mar 23, 2020 at 1:25 PM Marcin Sobczyk <msobc...@redhat.com> wrote:
>>>>>> >>>>>
>>>>>> >>>>> On 3/21/20 1:18 AM, Nir Soffer wrote:
>>>>>> >>>>>
>>>>>> >>>>> On Fri, Mar 20, 2020 at 9:35 PM Nir Soffer <nsof...@redhat.com> wrote:
>>>>>> >>>>>> Looks like an infrastructure issue setting up storage on the engine host.
>>>>>> >>>>>>
>>>>>> >>>>>> Here are 2 failing builds with unrelated changes:
>>>>>> >>>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6677/
>>>>>> >>>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6678/
>>>>>> >>>>> Rebuilding still fails in setup_storage:
>>>>>> >>>>>
>>>>>> >>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6679/testReport/
>>>>>> >>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6680/testReport/
>>>>>> >>>>>
>>>>>> >>>>>> Is this a known issue?
>>>>>> >>>>>>
>>>>>> >>>>>> Error Message
>>>>>> >>>>>>
>>>>>> >>>>>> AssertionError: setup_storage.sh failed. Exit code is 1 assert 1 == 0 -1 +0
>>>>>> >>>>>>
>>>>>> >>>>>> Stacktrace
>>>>>> >>>>>>
>>>>>> >>>>>> prefix = <ovirtlago.prefix.OvirtPrefix object at 0x7f6fd2b998d0>
>>>>>> >>>>>>
>>>>>> >>>>>>     @pytest.mark.run(order=14)
>>>>>> >>>>>>     def test_configure_storage(prefix):
>>>>>> >>>>>>         engine = prefix.virt_env.engine_vm()
>>>>>> >>>>>>         result = engine.ssh(
>>>>>> >>>>>>             [
>>>>>> >>>>>>                 '/tmp/setup_storage.sh',
>>>>>> >>>>>>             ],
>>>>>> >>>>>>         )
>>>>>> >>>>>> >       assert result.code == 0, 'setup_storage.sh failed. Exit code is %s' % result.code
>>>>>> >>>>>> E       AssertionError: setup_storage.sh failed. Exit code is 1
>>>>>> >>>>>> E       assert 1 == 0
>>>>>> >>>>>> E         -1
>>>>>> >>>>>> E         +0
>>>>>> >>>>>>
>>>>>> >>>>>> The pytest traceback is nice, but in this case it does not
>>>>>> >>>>>> show any useful info.
>>>>>> >>>>>>
>>>>>> >>>>>> Since we run a script using ssh, the error message should include
>>>>>> >>>>>> the process stdout and stderr,
>>>>>> >>>>>> which can probably explain the failure.
>>>>>> >>>>> I posted https://gerrit.ovirt.org/#/c/107830/ to improve logging
>>>>>> >>>>> during storage setup.
>>>>>> >>>>> Unfortunately AFAICS it didn't fail, so I guess we'll have to
>>>>>> >>>>> merge it and wait for a failed job to get some helpful logs.
>>>>>> >>>> Thanks.
>>>>>> >>>>
>>>>>> >>>> It still fails for me with current code:
>>>>>> >>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6689/testReport/
>>>>>> >>>>
>>>>>> >>>> Same when using current vdsm master.
>>>>>> >>> Updated the patch according to your suggestions and currently trying
>>>>>> >>> out OST for the 4th time -
>>>>>> >>> all previous runs succeeded. I guess I'm out of luck :)
>>>>>> >> It succeeds on your local OST setup but fails on Jenkins?
>>>>>> > No, I mean jenkins - both check-patch runs didn't fail on this script.
>>>>>> > I also tried running OST manually twice and the same thing happened.
>>>>>> > Anyway - the patch has been merged now so if any failure occurs in CQ
>>>>>> > we should know what's going on.
>>>>>> Ok, finally caught a failure in CQ [1]:
>>>>>>
>>>>>> [2020-03-23T14:14:09.836Z]     if result.code != 0:
>>>>>> [2020-03-23T14:14:09.836Z]         msg = (
>>>>>> [2020-03-23T14:14:09.836Z]             'setup_storage.sh failed with exit code: {}.\n'
>>>>>> [2020-03-23T14:14:09.836Z]             'stdout:\n{}'
>>>>>> [2020-03-23T14:14:09.836Z]             'stderr:\n{}'
>>>>>> [2020-03-23T14:14:09.836Z]         ).format(result.code, result.out, result.err)
>>>>>> [2020-03-23T14:14:09.836Z] >       raise RuntimeError(msg)
>>>>>> [2020-03-23T14:14:09.836Z] E       RuntimeError: setup_storage.sh failed with exit code: 1.
>>>>>> [2020-03-23T14:14:09.836Z] E       stdout:
>>>>>> [2020-03-23T14:14:09.836Z] E       Reposync & Extra Sources Content 0.0 B/s | 0 B 00:00
>>>>>> [2020-03-23T14:14:09.836Z] E       stderr:
>>>>>> [2020-03-23T14:14:09.836Z] E       + set -xe
>>>>>> [2020-03-23T14:14:09.836Z] E       + MAIN_NFS_DEV=disk/by-id/scsi-0QEMU_QEMU_HARDDISK_2
>>>>>> [2020-03-23T14:14:09.836Z] E       + ISCSI_DEV=disk/by-id/scsi-0QEMU_QEMU_HARDDISK_3
>>>>>> [2020-03-23T14:14:09.836Z] E       + NUM_LUNS=5
>>>>>> [2020-03-23T14:14:09.836Z] E       ++ uname -r
>>>>>> [2020-03-23T14:14:09.836Z] E       ++ awk -F. '{print $(NF-1)}'
>>>>>> [2020-03-23T14:14:09.836Z] E       + DIST=el8_1
>>>>>> [2020-03-23T14:14:09.836Z] E       + main
>>>>>> [2020-03-23T14:14:09.836Z] E       ++ hostname
>>>>>> [2020-03-23T14:14:09.836Z] E       + [[ lago-basic-suite-master-engine == *\i\p\v\6* ]]
>>>>>> [2020-03-23T14:14:09.836Z] E       + install_deps
>>>>>> [2020-03-23T14:14:09.836Z] E       + systemctl disable --now kdump.service
>>>>>> [2020-03-23T14:14:09.836Z] E       Removed /etc/systemd/system/multi-user.target.wants/kdump.service.
>>>>>> [2020-03-23T14:14:09.836Z] E       + yum install --nogpgcheck -y nfs-utils rpcbind lvm2 targetcli sg3_utils iscsi-initiator-utils lsscsi policycoreutils-python-utils
>>>>>> [2020-03-23T14:14:09.836Z] E       Failed to download metadata for repo 'alocalsync'
>>>>>> [2020-03-23T14:14:09.836Z] E       Error: Failed to download metadata for repo 'alocalsync'
>>>>>>
>>>>>> [1] https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-master_change-queue-tester/detail/ovirt-master_change-queue-tester/21420/pipeline
>>>>>
>>>>> Galit, could you please take a look?
>>>>>
>>>>>> >
>>>>>> >>
>>>>>> >>>>>> Also I wonder why this code is called as a test
>>>>>> >>>>>> (test_configure_storage). This looks like a setup
>>>>>> >>>>>> step so it should run as a fixture.
>>>>>> >>>>> That's true, but the pytest porting effort was about providing a
>>>>>> >>>>> bare minimum to move away from nose.
>>>>>> >>>>> Organizing the tests into proper setup/fixtures is a huge task and
>>>>>> >>>>> will probably be implemented
>>>>>> >>>>> incrementally in the near future.
>>>>>> >>>> Understood
>>>>>> >>>>
>>>>>> >
>>>>>>
>>>>>
>>>>> --
>>>>> Martin Perina
>>>>> Manager, Software Engineering
>>>>> Red Hat Czech s.r.o.
>>>>>
>>>>
>>>> --
>>>> GALIT ROSENTHAL
>>>> SOFTWARE ENGINEER
>>>> Red Hat <https://www.redhat.com/>
>>>> ga...@redhat.com T: 972-9-7692230
>>>> <https://red.ht/sig>
>>>>
>>>
>>> --
>>> GALIT ROSENTHAL
>>> SOFTWARE ENGINEER
>>> Red Hat <https://www.redhat.com/>
>>> ga...@redhat.com T: 972-9-7692230
>>> <https://red.ht/sig>
>>>
>>
>> --
>> GALIT ROSENTHAL
>> SOFTWARE ENGINEER
>> Red Hat <https://www.redhat.com/>
>> ga...@redhat.com T: 972-9-7692230
>> <https://red.ht/sig>
>>
>
> --
> Martin Perina
> Manager, Software Engineering
> Red Hat Czech s.r.o.
>

--
GALIT ROSENTHAL
SOFTWARE ENGINEER
Red Hat <https://www.redhat.com/>
ga...@redhat.com T: 972-9-7692230
<https://red.ht/sig>
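
P.S. For anyone who wants to script the curl check mentioned in the thread, here is a minimal sketch in Python. It is not part of OST: the function names are made up for illustration, and it only assumes that the local repo is served over HTTP with the usual repodata/repomd.xml metadata file that yum/dnf downloads first. The base URL has to be taken from the 'alocalsync' repo definition on the VM; I am not assuming any particular address here.

```python
import urllib.error
import urllib.request


def repomd_url(base_url):
    """Build the URL of repomd.xml, the metadata index yum/dnf fetches first.

    base_url is whatever baseurl is configured for the repo (hypothetical
    input, not something this sketch discovers on its own).
    """
    return base_url.rstrip("/") + "/repodata/repomd.xml"


def repo_metadata_reachable(base_url, timeout=5.0):
    """Return True if repomd.xml under base_url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(repomd_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, DNS failures and HTTP error
        # statuses (HTTPError is a subclass of URLError).
        return False
```

Calling repo_metadata_reachable() with the repo's baseurl right before the yum install step, and logging the result, might help narrow down when the repo server goes away.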
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/4UWCFMCSBDPQKCUXAZPUUGWIA5AXWFW5/