Hi Galit

I can see the issue again - now in manual OST runs:

https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/6711/consoleFull#L2,856

Regards, Marcin

On 3/23/20 10:09 PM, Marcin Sobczyk wrote:


On 3/23/20 8:51 PM, Galit Rosenthal wrote:
I ran it locally just now, using the extra sources the same way the CQ does, and it didn't fail for me.

I will continue to investigate tomorrow.

Marcin, did you see this issue also in check_patch or only in CQ?
I wasn't aware of the issue till Nir raised it - I was working with the patch previously and both check-patch and manual runs were fine. I think it concerns only CQ then.

Regards,
Galit

On Mon, Mar 23, 2020 at 4:29 PM Galit Rosenthal <grose...@redhat.com <mailto:grose...@redhat.com>> wrote:

    I will look at it.

    On Mon, Mar 23, 2020 at 4:18 PM Martin Perina <mper...@redhat.com
    <mailto:mper...@redhat.com>> wrote:



        On Mon, Mar 23, 2020 at 3:16 PM Marcin Sobczyk
        <msobc...@redhat.com <mailto:msobc...@redhat.com>> wrote:



            On 3/23/20 3:10 PM, Marcin Sobczyk wrote:
            >
            >
            > On 3/23/20 2:53 PM, Nir Soffer wrote:
            >> On Mon, Mar 23, 2020 at 3:26 PM Marcin Sobczyk <msobc...@redhat.com <mailto:msobc...@redhat.com>> wrote:
            >>>
            >>>
            >>> On 3/23/20 2:17 PM, Nir Soffer wrote:
            >>>> On Mon, Mar 23, 2020 at 1:25 PM Marcin Sobczyk <msobc...@redhat.com <mailto:msobc...@redhat.com>> wrote:
            >>>>>
            >>>>> On 3/21/20 1:18 AM, Nir Soffer wrote:
            >>>>>
            >>>>> On Fri, Mar 20, 2020 at 9:35 PM Nir Soffer <nsof...@redhat.com <mailto:nsof...@redhat.com>> wrote:
            >>>>>> Looks like an infrastructure issue setting up storage on the engine host.
            >>>>>>
            >>>>>> Here are 2 failing builds with unrelated changes:
            >>>>>>
            >>>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6677/
            >>>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6678/
            >>>>> Rebuilding still fails in setup_storage:
            >>>>>
            >>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6679/testReport/
            >>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6680/testReport/
            >>>>>
            >>>>>> Is this a known issue?
            >>>>>>
            >>>>>> Error Message
            >>>>>>
            >>>>>> AssertionError: setup_storage.sh failed. Exit code is 1 assert 1 == 0   -1   +0
            >>>>>>
            >>>>>> Stacktrace
            >>>>>>
            >>>>>> prefix = <ovirtlago.prefix.OvirtPrefix object at 0x7f6fd2b998d0>
            >>>>>>
            >>>>>>       @pytest.mark.run(order=14)
            >>>>>>       def test_configure_storage(prefix):
            >>>>>>           engine = prefix.virt_env.engine_vm()
            >>>>>>           result = engine.ssh(
            >>>>>>               [
            >>>>>>                   '/tmp/setup_storage.sh',
            >>>>>>               ],
            >>>>>>           )
            >>>>>> >        assert result.code == 0, 'setup_storage.sh failed. Exit code is %s' % result.code
            >>>>>> E       AssertionError: setup_storage.sh failed. Exit code is 1
            >>>>>> E       assert 1 == 0
            >>>>>> E         -1
            >>>>>> E         +0
            >>>>>>
            >>>>>>
            >>>>>> The pytest traceback is nice, but in this case it does not
            >>>>>> show any useful info.
            >>>>>>
            >>>>>> Since we run a script using ssh, the error message should include
            >>>>>> the process stdout and stderr, which can probably explain the failure.
            >>>>> I posted https://gerrit.ovirt.org/#/c/107830/ to improve logging
            >>>>> during storage setup.
            >>>>> Unfortunately AFAICS it didn't fail, so I guess we'll have to
            >>>>> merge it and wait for a failed job to get some helpful logs.
            >>>> Thanks.
            >>>>
            >>>> It still fails for me with current code:
            >>>>
            >>>> https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6689/testReport/
            >>>>
            >>>> Same when using current vdsm master.
            >>> Updated the patch according to your suggestions and currently trying out
            >>> OST for the 4th time - all previous runs succeeded. I guess I'm out of luck :)
            >> It succeeds on your local OST setup but fails on Jenkins?
            > No, I mean Jenkins - both check-patch runs didn't fail on this script.
            > I also tried running OST manually twice and the same thing happened.
            > Anyway - the patch has been merged now, so if any failure occurs in CQ
            > we should know what's going on.
            Ok, finally caught a failure in CQ [1]:

            [2020-03-23T14:14:09.836Z]         if result.code != 0:
            [2020-03-23T14:14:09.836Z]             msg = (
            [2020-03-23T14:14:09.836Z]                 'setup_storage.sh failed with exit code: {}.\n'
            [2020-03-23T14:14:09.836Z]                 'stdout:\n{}'
            [2020-03-23T14:14:09.836Z]                 'stderr:\n{}'
            [2020-03-23T14:14:09.836Z]             ).format(result.code, result.out, result.err)
            [2020-03-23T14:14:09.836Z] >           raise RuntimeError(msg)
            [2020-03-23T14:14:09.836Z] E           RuntimeError: setup_storage.sh failed with exit code: 1.
            [2020-03-23T14:14:09.836Z] E           stdout:
            [2020-03-23T14:14:09.836Z] E           Reposync & Extra Sources Content                0.0  B/s |   0  B     00:00
            [2020-03-23T14:14:09.836Z] E           stderr:
            [2020-03-23T14:14:09.836Z] E           + set -xe
            [2020-03-23T14:14:09.836Z] E           + MAIN_NFS_DEV=disk/by-id/scsi-0QEMU_QEMU_HARDDISK_2
            [2020-03-23T14:14:09.836Z] E           + ISCSI_DEV=disk/by-id/scsi-0QEMU_QEMU_HARDDISK_3
            [2020-03-23T14:14:09.836Z] E           + NUM_LUNS=5
            [2020-03-23T14:14:09.836Z] E           ++ uname -r
            [2020-03-23T14:14:09.836Z] E           ++ awk -F. '{print $(NF-1)}'
            [2020-03-23T14:14:09.836Z] E           + DIST=el8_1
            [2020-03-23T14:14:09.836Z] E           + main
            [2020-03-23T14:14:09.836Z] E           ++ hostname
            [2020-03-23T14:14:09.836Z] E           + [[ lago-basic-suite-master-engine == *\i\p\v\6* ]]
            [2020-03-23T14:14:09.836Z] E           + install_deps
            [2020-03-23T14:14:09.836Z] E           + systemctl disable --now kdump.service
            [2020-03-23T14:14:09.836Z] E           Removed /etc/systemd/system/multi-user.target.wants/kdump.service.
            [2020-03-23T14:14:09.836Z] E           + yum install --nogpgcheck -y nfs-utils rpcbind lvm2 targetcli sg3_utils iscsi-initiator-utils lsscsi policycoreutils-python-utils
            [2020-03-23T14:14:09.836Z] E           Failed to download metadata for repo 'alocalsync'
            [2020-03-23T14:14:09.836Z] E           Error: Failed to download metadata for repo 'alocalsync'


            [1] https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-master_change-queue-tester/detail/ovirt-master_change-queue-tester/21420/pipeline

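For reference, the error-reporting pattern visible in the traceback above can be tried outside OST with a minimal sketch like the one below. `CommandResult` is a hypothetical stand-in for whatever `engine.ssh()` returns; only the attribute names `code`, `out` and `err` come from the log, and treating `out` and `err` as strings is an assumption. Unlike the quoted code, this version puts a newline between the stdout and stderr sections so the two do not run together.

```python
from collections import namedtuple

# Hypothetical stand-in for the object returned by engine.ssh(); only the
# attribute names (code, out, err) are taken from the traceback.
CommandResult = namedtuple('CommandResult', ['code', 'out', 'err'])


def format_failure(script, result):
    """Build an error message carrying the exit code and both output streams."""
    return (
        '{} failed with exit code: {}.\n'
        'stdout:\n{}\n'
        'stderr:\n{}'
    ).format(script, result.code, result.out.rstrip(), result.err.rstrip())


def check_result(script, result):
    """Raise RuntimeError with the full output if the script exited non-zero."""
    if result.code != 0:
        raise RuntimeError(format_failure(script, result))


if __name__ == '__main__':
    failed = CommandResult(
        code=1,
        out='',
        err="Error: Failed to download metadata for repo 'alocalsync'\n",
    )
    try:
        check_result('setup_storage.sh', failed)
    except RuntimeError as exc:
        print(exc)
```

With a message built this way, a failure such as the 'alocalsync' metadata error shows up directly in the test report instead of only an exit code.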

        Galit, could you please take a look?



            >
            >>
            >>>>>> Also I wonder why this code is called as a test
            >>>>>> (test_configure_storage). This looks like a setup
            >>>>>> step, so it should run as a fixture.
            >>>>> That's true, but the pytest porting effort was about providing a
            >>>>> bare minimum to move away from nose.
            >>>>> Organizing the tests into proper setup/fixtures is a huge task and
            >>>>> will probably be implemented incrementally in the near future.
            >>>> Understood
            >>>>
            >
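The suggestion quoted above, running storage setup as a fixture rather than as an ordered test, could look roughly like the sketch below. This is not the OST implementation: `configure_storage` and the `storage` fixture are hypothetical names, and `prefix` is assumed to be the existing fixture seen in the quoted test.

```python
import pytest


def configure_storage(engine, script='/tmp/setup_storage.sh'):
    """Run the setup script on the engine VM and fail loudly with its output."""
    result = engine.ssh([script])
    if result.code != 0:
        raise RuntimeError(
            '{} failed with exit code: {}.\nstdout:\n{}\nstderr:\n{}'.format(
                script, result.code, result.out, result.err
            )
        )
    return result


@pytest.fixture(scope='session')
def storage(prefix):
    # Hypothetical fixture: runs once per session, and a setup failure is
    # reported as an error in every dependent test, not as a failed "test".
    engine = prefix.virt_env.engine_vm()
    return configure_storage(engine)
```

Tests that need configured storage would then request `storage` as an argument, which also removes the dependency on `@pytest.mark.run(order=14)` for this step.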



-- Martin Perina
        Manager, Software Engineering
        Red Hat Czech s.r.o.



--
    GALIT ROSENTHAL

    SOFTWARE ENGINEER

    Red Hat

    <https://www.redhat.com/>

    ga...@redhat.com <mailto:ga...@redhat.com> T: 972-9-7692230
    <tel:972-9-7692230>

    <https://red.ht/sig>






_______________________________________________
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/MIVDNCZTH4S5OPQ4JIVQPUNRHOC4DC7V/
