After investigating, it looks like the issues started when this patch was
merged.

Marcin, can you help debug it?

https://gerrit.ovirt.org/#/c/107399/

Thanks
Galit

On Mon, Mar 30, 2020 at 6:42 PM Martin Perina <mper...@redhat.com> wrote:

>
>
> On Mon, Mar 30, 2020 at 5:38 PM Galit Rosenthal <grose...@redhat.com>
> wrote:
>
>> It looks like the local repo stops running.
>> When I run curl just before the failure to check the status, I can see that
>> it isn't accessible.
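>> Roughly, the check I run is equivalent to this (just a sketch - the URL and
>> port below are placeholders, the real repo address comes from the lago setup):
>>
>>     # Quick reachability probe for the internal reposync repo
>>     # (address and port are hypothetical, not the real lago values).
>>     import urllib.request
>>
>>     try:
>>         urllib.request.urlopen(
>>             'http://192.168.200.1:8585/default/el8/repodata/repomd.xml',
>>             timeout=10)
>>         print('repo is reachable')
>>     except OSError as exc:
>>         print('repo is NOT reachable:', exc)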
>>
>> I'm trying to see where it fails or what causes it to fail.
>>
>> I managed to reproduce it on BM.
>>
>
> I thought that moving setup_storage would mitigate the issue:
> https://gerrit.ovirt.org/#/c/107989/
> But it just postponed the error to a later phase; now adding a host fails
> with the same issue: Failed to download metadata for repo 'alocalsync'
>
> https://jenkins.ovirt.org/view/oVirt system tests/job/ovirt-system-tests_manual/6710
>
> So Galit, please take a look; oVirt CQ has been suffering from this issue for
> more than a week now.
>
>>
>> On Mon, Mar 30, 2020 at 6:23 PM Marcin Sobczyk <msobc...@redhat.com>
>> wrote:
>>
>>> Hi Galit
>>>
>>> I can see the issue again - now in manual OST runs:
>>>
>>>
>>> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/6711/consoleFull#L2,856
>>>
>>> Regards, Marcin
>>>
>>> On 3/23/20 10:09 PM, Marcin Sobczyk wrote:
>>>
>>>
>>>
>>> On 3/23/20 8:51 PM, Galit Rosenthal wrote:
>>>
>>> I ran it now locally using the extra sources, as it runs in the CQ, and it
>>> didn't fail for me.
>>>
>>> I will continue to investigate tomorrow,
>>>
>>> Marcin, did you see this issue also in check_patch or only in CQ?
>>>
>>> I wasn't aware of the issue until Nir raised it - I was working with the
>>> patch previously, and both check-patch and manual runs were fine.
>>> I think it only concerns CQ then.
>>>
>>> Regards,
>>> Galit
>>>
>>> On Mon, Mar 23, 2020 at 4:29 PM Galit Rosenthal <grose...@redhat.com>
>>> wrote:
>>>
>>>> I will look at it.
>>>>
>>>> On Mon, Mar 23, 2020 at 4:18 PM Martin Perina <mper...@redhat.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Mar 23, 2020 at 3:16 PM Marcin Sobczyk <msobc...@redhat.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On 3/23/20 3:10 PM, Marcin Sobczyk wrote:
>>>>>> >
>>>>>> >
>>>>>> > On 3/23/20 2:53 PM, Nir Soffer wrote:
>>>>>> >> On Mon, Mar 23, 2020 at 3:26 PM Marcin Sobczyk <msobc...@redhat.com> wrote:
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> On 3/23/20 2:17 PM, Nir Soffer wrote:
>>>>>> >>>> On Mon, Mar 23, 2020 at 1:25 PM Marcin Sobczyk
>>>>>> >>>> <msobc...@redhat.com> wrote:
>>>>>> >>>>>
>>>>>> >>>>> On 3/21/20 1:18 AM, Nir Soffer wrote:
>>>>>> >>>>>
>>>>>> >>>>> On Fri, Mar 20, 2020 at 9:35 PM Nir Soffer <nsof...@redhat.com> wrote:
>>>>>> >>>>>> Looks like an infrastructure issue setting up storage on the engine host.
>>>>>> >>>>>>
>>>>>> >>>>>> Here are 2 failing builds with unrelated changes:
>>>>>> >>>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6677/
>>>>>> >>>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6678/
>>>>>> >>>>> Rebuilding still fails in setup_storage:
>>>>>> >>>>>
>>>>>> >>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6679/testReport/
>>>>>> >>>>>
>>>>>> >>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6680/testReport/
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>>> Is this a known issue?
>>>>>> >>>>>>
>>>>>> >>>>>> Error Message
>>>>>> >>>>>>
>>>>>> >>>>>> AssertionError: setup_storage.sh failed. Exit code is 1 assert 1 == 0   -1   +0
>>>>>> >>>>>>
>>>>>> >>>>>> Stacktrace
>>>>>> >>>>>>
>>>>>> >>>>>> prefix = <ovirtlago.prefix.OvirtPrefix object at 0x7f6fd2b998d0>
>>>>>> >>>>>>
>>>>>> >>>>>>       @pytest.mark.run(order=14)
>>>>>> >>>>>>       def test_configure_storage(prefix):
>>>>>> >>>>>>           engine = prefix.virt_env.engine_vm()
>>>>>> >>>>>>           result = engine.ssh(
>>>>>> >>>>>>               [
>>>>>> >>>>>>                   '/tmp/setup_storage.sh',
>>>>>> >>>>>>               ],
>>>>>> >>>>>>           )
>>>>>> >>>>>>>         assert result.code == 0, 'setup_storage.sh failed. Exit code is %s' % result.code
>>>>>> >>>>>> E       AssertionError: setup_storage.sh failed. Exit code is 1
>>>>>> >>>>>> E       assert 1 == 0
>>>>>> >>>>>> E         -1
>>>>>> >>>>>> E         +0
>>>>>> >>>>>>
>>>>>> >>>>>>
>>>>>> >>>>>> The pytest traceback is nice, but in this case it does not
>>>>>> >>>>>> show any useful info.
>>>>>> >>>>>>
>>>>>> >>>>>> Since we run a script using ssh, the error message should include
>>>>>> >>>>>> the process stdout and stderr, which can probably explain the failure.
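>>>>>> >>>>>> Something along these lines, for example (a rough sketch, assuming
>>>>>> >>>>>> the ssh result object exposes .code, .out and .err as used elsewhere
>>>>>> >>>>>> in this thread):
>>>>>> >>>>>>
>>>>>> >>>>>>     result = engine.ssh(['/tmp/setup_storage.sh'])
>>>>>> >>>>>>     # Put the script's output into the assertion message so a failed
>>>>>> >>>>>>     # run shows what actually went wrong on the engine VM.
>>>>>> >>>>>>     assert result.code == 0, (
>>>>>> >>>>>>         'setup_storage.sh failed. Exit code is {}\n'
>>>>>> >>>>>>         'stdout:\n{}\n'
>>>>>> >>>>>>         'stderr:\n{}'
>>>>>> >>>>>>     ).format(result.code, result.out, result.err)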
>>>>>> >>>>> I posted https://gerrit.ovirt.org/#/c/107830/ to improve logging
>>>>>> >>>>> during storage setup.
>>>>>> >>>>> Unfortunately AFAICS it didn't fail, so I guess we'll have to
>>>>>> >>>>> merge it and wait for a failed job to get some helpful logs.
>>>>>> >>>> Thanks.
>>>>>> >>>>
>>>>>> >>>> It still fails for me with current code:
>>>>>> >>>>
>>>>>> >>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6689/testReport/
>>>>>> >>>>
>>>>>> >>>>
>>>>>> >>>> Same when using current vdsm master.
>>>>>> >>> Updated the patch according to your suggestions; currently trying out
>>>>>> >>> OST for the 4th time - all previous runs succeeded. I guess I'm out of
>>>>>> >>> luck :)
>>>>>> >> It succeeds on your local OST setup but fails on Jenkins?
>>>>>> > No, I mean Jenkins - neither check-patch run failed on this script.
>>>>>> > I also tried running OST manually twice and the same thing happened.
>>>>>> > Anyway - the patch has been merged now, so if any failure occurs in CQ
>>>>>> > we should know what's going on.
>>>>>> Ok, finally caught a failure in CQ [1]:
>>>>>>
>>>>>> [2020-03-23T14:14:09.836Z]         if result.code != 0:
>>>>>> [2020-03-23T14:14:09.836Z]             msg = (
>>>>>> [2020-03-23T14:14:09.836Z]                 'setup_storage.sh failed with exit code: {}.\n'
>>>>>> [2020-03-23T14:14:09.836Z]                 'stdout:\n{}'
>>>>>> [2020-03-23T14:14:09.836Z]                 'stderr:\n{}'
>>>>>> [2020-03-23T14:14:09.836Z]             ).format(result.code, result.out, result.err)
>>>>>> [2020-03-23T14:14:09.836Z] >           raise RuntimeError(msg)
>>>>>> [2020-03-23T14:14:09.836Z] E           RuntimeError: setup_storage.sh failed with exit code: 1.
>>>>>> [2020-03-23T14:14:09.836Z] E           stdout:
>>>>>> [2020-03-23T14:14:09.836Z] E           Reposync & Extra Sources Content                0.0  B/s |   0  B     00:00
>>>>>> [2020-03-23T14:14:09.836Z] E           stderr:
>>>>>> [2020-03-23T14:14:09.836Z] E           + set -xe
>>>>>> [2020-03-23T14:14:09.836Z] E           + MAIN_NFS_DEV=disk/by-id/scsi-0QEMU_QEMU_HARDDISK_2
>>>>>> [2020-03-23T14:14:09.836Z] E           + ISCSI_DEV=disk/by-id/scsi-0QEMU_QEMU_HARDDISK_3
>>>>>> [2020-03-23T14:14:09.836Z] E           + NUM_LUNS=5
>>>>>> [2020-03-23T14:14:09.836Z] E           ++ uname -r
>>>>>> [2020-03-23T14:14:09.836Z] E           ++ awk -F. '{print $(NF-1)}'
>>>>>> [2020-03-23T14:14:09.836Z] E           + DIST=el8_1
>>>>>> [2020-03-23T14:14:09.836Z] E           + main
>>>>>> [2020-03-23T14:14:09.836Z] E           ++ hostname
>>>>>> [2020-03-23T14:14:09.836Z] E           + [[ lago-basic-suite-master-engine == *\i\p\v\6* ]]
>>>>>> [2020-03-23T14:14:09.836Z] E           + install_deps
>>>>>> [2020-03-23T14:14:09.836Z] E           + systemctl disable --now kdump.service
>>>>>> [2020-03-23T14:14:09.836Z] E           Removed /etc/systemd/system/multi-user.target.wants/kdump.service.
>>>>>> [2020-03-23T14:14:09.836Z] E           + yum install --nogpgcheck -y nfs-utils rpcbind lvm2 targetcli sg3_utils iscsi-initiator-utils lsscsi policycoreutils-python-utils
>>>>>> [2020-03-23T14:14:09.836Z] E           Failed to download metadata for repo 'alocalsync'
>>>>>> [2020-03-23T14:14:09.836Z] E           Error: Failed to download metadata for repo 'alocalsync'
>>>>>>
>>>>>>
>>>>>> [1]
>>>>>>
>>>>>> https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-master_change-queue-tester/detail/ovirt-master_change-queue-tester/21420/pipeline
>>>>>
>>>>>
>>>>> Galit, could you please take a look?
>>>>>
>>>>>>
>>>>>>
>>>>>> >
>>>>>> >>
>>>>>> >>>>>> Also I wonder why this code is called as a test
>>>>>> >>>>>> (test_configure_storage). This looks like a setup
>>>>>> >>>>>> step, so it should run as a fixture.
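>>>>>> >>>>>> For instance, something like this (just a sketch, reusing the prefix
>>>>>> >>>>>> fixture and ssh API from the test above; the fixture name is made up):
>>>>>> >>>>>>
>>>>>> >>>>>>     import pytest
>>>>>> >>>>>>
>>>>>> >>>>>>     @pytest.fixture
>>>>>> >>>>>>     def storage(prefix):
>>>>>> >>>>>>         # Run the storage setup as a setup step instead of an ordered
>>>>>> >>>>>>         # test, and fail loudly with the script's output.
>>>>>> >>>>>>         engine = prefix.virt_env.engine_vm()
>>>>>> >>>>>>         result = engine.ssh(['/tmp/setup_storage.sh'])
>>>>>> >>>>>>         if result.code != 0:
>>>>>> >>>>>>             raise RuntimeError(
>>>>>> >>>>>>                 'setup_storage.sh failed with exit code: {}.\n'
>>>>>> >>>>>>                 'stdout:\n{}\n'
>>>>>> >>>>>>                 'stderr:\n{}'.format(result.code, result.out, result.err)
>>>>>> >>>>>>             )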
>>>>>> >>>>> That's true, but the pytest porting effort was about providing a
>>>>>> >>>>> bare minimum to move away from nose.
>>>>>> >>>>> Organizing the tests into proper setup/fixtures is a huge task and
>>>>>> >>>>> will probably be implemented incrementally in the near future.
>>>>>> >>>> Understood
>>>>>> >>>>
>>>>>> >
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
> --
> Martin Perina
> Manager, Software Engineering
> Red Hat Czech s.r.o.
>


-- 

GALIT ROSENTHAL

SOFTWARE ENGINEER

Red Hat

<https://www.redhat.com/>

ga...@redhat.com    T: 972-9-7692230
<https://red.ht/sig>
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/4UWCFMCSBDPQKCUXAZPUUGWIA5AXWFW5/
