Hi all,

After investigating, I really don't think my patch is the cause of the issue:

- my patch touched only 'vdsm-client', and AFAICT the failing stages have nothing to do with it. Even now, a manual basic-suite-4.3 run is going fine: https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/4400/ (still running at the time of writing, but already past 'add_master_storage_domain')
- I can't reproduce it locally

Regards, Marcin

On 3/25/19 10:13 AM, Eyal Edri wrote:
Still fails, now on a different component. ( ovirt-web-ui-extentions )

https://jenkins.ovirt.org/job/ovirt-4.3_change-queue-tester/339/

On Fri, Mar 22, 2019 at 3:59 PM Dan Kenigsberg <dan...@redhat.com> wrote:



    On Fri, Mar 22, 2019 at 3:21 PM Marcin Sobczyk
    <msobc...@redhat.com> wrote:

        Dafna,

        in 'verify_add_hosts' we specifically wait, with a timeout, for
        a single host to be up:

          144     up_hosts = hosts_service.list(search='datacenter={} AND status=up'.format(DC_NAME))
          145     if len(up_hosts):
          146         return True
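
For readers unfamiliar with the pattern, the "wait for a host with a timeout" check quoted above can be sketched as a generic polling loop. This is only an illustration, not the helper OST actually uses: the name `wait_until` and the default `timeout` and `interval` values are assumptions made up for this example.

```python
import time


def wait_until(predicate, timeout=600, interval=3,
               clock=time.time, sleep=time.sleep):
    """Call predicate until it returns a truthy value or timeout expires.

    Hypothetical helper: name and default values are assumptions,
    not taken from the OST code base.
    """
    deadline = clock() + timeout
    while True:
        # The predicate is always evaluated at least once.
        if predicate():
            return True
        # Give up once the deadline has passed.
        if clock() >= deadline:
            return False
        sleep(interval)
```

With something like this, the check at lines 144-146 would be passed in as the predicate. Note that a single truthy result only says a host was up at that instant; it guarantees nothing about a second query issued later.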

        The log files say that it took ~50 secs for one of the hosts
        to come up (seems reasonable), and no timeout is reported.
        Right after 'verify_add_hosts' we run
        'add_master_storage_domain', which calls the '_hosts_in_dc'
        function. That function does the exact same check, but it fails:

          113     hosts = hosts_service.list(search='datacenter={} AND status=up'.format(dc_name))
          114     if hosts:
          115         if random_host:
          116             return random.choice(hosts)

    I don't think it is relevant to our current failure, but I
    consider random_host=True bad practice. As if we did not have
    enough moving parts, we are adding intentional randomness.
    Reproducibility is far more important than coverage, particularly
    for a shared system test like OST.

          117         else:
          118             return sorted(hosts, key=lambda host: host.name)
          119     raise RuntimeError('Could not find hosts that are up in DC %s' % dc_name)

        I'm also not able to reproduce this issue locally on my
        server. The investigation continues...


    I think that it would be fair to take the filtering by host state
    out of Engine and into the test, where we can easily log the
    current status of each host. Then we'd have a better understanding
    of what happened on the next failure.
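
A minimal sketch of what that suggestion could look like, under stated assumptions: `list_hosts` is a hypothetical callable standing in for `hosts_service.list`, `Host` is a stand-in for the SDK host object, and the status is compared as a plain string (the real SDK may well expose it as an enum value instead). The function name `up_hosts_in_dc` is also made up for this example.

```python
import logging
from collections import namedtuple

LOGGER = logging.getLogger(__name__)

# Stand-in for the SDK host object (assumption): only the two
# attributes this sketch needs.
Host = namedtuple('Host', ['name', 'status'])


def up_hosts_in_dc(list_hosts, dc_name, up_status='up'):
    """Return the hosts of dc_name that are up, sorted by name."""
    # Query by data center only; the status filter moves into the test.
    hosts = list_hosts(search='datacenter={}'.format(dc_name))
    for host in hosts:
        # Log every host's status so a failure leaves a trace.
        LOGGER.info('host %s in DC %s has status %s',
                    host.name, dc_name, host.status)
    up = sorted((h for h in hosts if h.status == up_status),
                key=lambda h: h.name)
    if not up:
        statuses = ', '.join('%s=%s' % (h.name, h.status) for h in hosts)
        raise RuntimeError(
            'Could not find hosts that are up in DC %s (statuses: %s)'
            % (dc_name, statuses))
    return up
```

Returning the name-sorted list and taking its first element would also keep the host choice deterministic, which addresses the random_host concern raised earlier in this message.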

        On 3/22/19 1:17 PM, Marcin Sobczyk wrote:

        Hi,

        sure, I'm on it. It's weird though: I did run the 4.3 basic
        suite for this patch manually and everything was OK.

        On 3/22/19 1:05 PM, Dafna Ron wrote:
        Hi,

        We are failing branch 4.3 for test:
        002_bootstrap.add_master_storage_domain

        It seems that on one of the hosts vdsm is not starting;
        there is nothing in vdsm.log or in supervdsm.log.

        CQ identified this patch as the suspected root cause:

        https://gerrit.ovirt.org/#/c/98748/ - vdsm: client: Add
        support for flow id

        Milan, Marcin, can you please have a look?

        full logs:

        
http://jenkins.ovirt.org/job/ovirt-4.3_change-queue-tester/326/artifact/basic-suite.el7.x86_64/test_logs/basic-suite-4.3/post-002_bootstrap.py/

        the only error I can see is about host not being up (makes
        sense as vdsm is not running)


              Stacktrace

           File "/usr/lib64/python2.7/unittest/case.py", line 369, in run
             testMethod()
           File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
             self.test(*self.arg)
           File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 142, in wrapped_test
             test()
           File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 60, in wrapper
             return func(get_test_prefix(), *args, **kwargs)
           File "/home/jenkins/workspace/ovirt-4.3_change-queue-tester/ovirt-system-tests/basic-suite-4.3/test-scenarios/002_bootstrap.py", line 417, in add_master_storage_domain
             add_iscsi_storage_domain(prefix)
           File "/home/jenkins/workspace/ovirt-4.3_change-queue-tester/ovirt-system-tests/basic-suite-4.3/test-scenarios/002_bootstrap.py", line 561, in add_iscsi_storage_domain
             host=_random_host_from_dc(api, DC_NAME),
           File "/home/jenkins/workspace/ovirt-4.3_change-queue-tester/ovirt-system-tests/basic-suite-4.3/test-scenarios/002_bootstrap.py", line 122, in _random_host_from_dc
             return _hosts_in_dc(api, dc_name, True)
           File "/home/jenkins/workspace/ovirt-4.3_change-queue-tester/ovirt-system-tests/basic-suite-4.3/test-scenarios/002_bootstrap.py", line 119, in _hosts_in_dc
             raise RuntimeError('Could not find hosts that are up in DC %s' % dc_name)
        'Could not find hosts that are up in DC test-dc\n-------------------- >> 
begin captured logging << --------------------\nlago.ssh: DEBUG: start 
task:937bdea7-a2a3-47ad-9383-36647ea37ddf:Get ssh client for 
lago-basic-suite-4-3-engine:\nlago.ssh: DEBUG: end 
task:937bdea7-a2a3-47ad-9383-36647ea37ddf:Get ssh client for 
lago-basic-suite-4-3-engine:\nlago.ssh: DEBUG: Running c07b5ee2 on 
lago-basic-suite-4-3-engine: cat /root/multipath.txt\nlago.ssh: DEBUG: Command c07b5ee2 on 
lago-basic-suite-4-3-engine returned with 0\nlago.ssh: DEBUG: Command c07b5ee2 on 
lago-basic-suite-4-3-engine output:\n 
3600140516f88cafa71243648ea218995\n360014053e28f60001764fed9978ec4b3\n360014059edc777770114a6484891dcf1\n36001405d93d8585a50d43a4ad0bd8d19\n36001405e31361631de14bcf87d43e55a\n\n-----------
        _______________________________________________
        Devel mailing list -- devel@ovirt.org
        To unsubscribe send an email to devel-le...@ovirt.org
        Privacy Statement: https://www.ovirt.org/site/privacy-policy/
        oVirt Code of Conduct:
        https://www.ovirt.org/community/about/community-guidelines/
        List Archives:
        
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/J4NCHXTK5ZYLXWW36DZKAUL5DN7WBNW4/




--

Eyal edri


MANAGER

RHV/CNV DevOps

EMEA VIRTUALIZATION R&D


Red Hat EMEA <https://www.redhat.com/>

<https://red.ht/sig> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>

phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)
