On Sun, Dec 18, 2016 at 4:17 PM, Barak Korren <bkor...@redhat.com> wrote:
> After we've fixed the various system test issues that arose from the
> CentOS 7.3 release, we're now seeing a new failure that seems to have
> to do with snapshot merges.
>
> I'm guessing this may have to do with something that went in last week
> while we "weren't looking".
>
> Failing job can be seen here:
> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/4275
>
> The test code snippet that is failing is as follows:
>
>     226     api.vms.get(VM0_NAME).snapshots.list()[-2].delete()
>     227     testlib.assert_true_within_short(
>     228         lambda:
>     229         (len(api.vms.get(VM0_NAME).snapshots.list()) == 2) and
>     230         (api.vms.get(VM0_NAME).snapshots.list()[-1].snapshot_status
>     231          == 'ok'),
>     232     )
>
> The failure itself is a test timeout:
>
>   ...
>   File 
> "/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
> line 228, in snapshot_merge
>     lambda:
>   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line
> 248, in assert_true_within_short
>     allowed_exceptions=allowed_exceptions,
>   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line
> 240, in assert_true_within
>     raise AssertionError('Timed out after %s seconds' % timeout)
> AssertionError: Timed out after 180 seconds
>
> Engine log generated during the test can be found here:
>
> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/4275/artifact/exported-artifacts/basic_suite_master.sh-el7/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity.py/lago-basic-suite-master-engine/_var_log_ovirt-engine/engine.log

A quick look at the engine log does not reveal anything useful.

Understanding issues like this requires the vdsm log from the host
performing the operation.
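
When skimming a big vdsm.log, a quick filter like this helps pull out just
the warnings, errors and the tracebacks that follow them (a rough sketch,
not part of any oVirt tooling; pass the path of the downloaded log as the
only argument):

    import re
    import sys

    # vdsm log records start with a timestamp; traceback lines do not.
    TIMESTAMP = re.compile(r'^\d{4}-\d{2}-\d{2} ')

    def interesting_lines(path, levels=(' WARN ', ' ERROR ')):
        """Yield WARN/ERROR records plus the traceback text following them."""
        keep = False
        with open(path) as f:
            for line in f:
                if TIMESTAMP.match(line):
                    # A new record starts; keep it only if it is WARN/ERROR.
                    keep = any(level in line for level in levels)
                if keep:
                    yield line.rstrip()

    if __name__ == '__main__':
        # usage: python scan_vdsm_log.py vdsm.log
        for line in interesting_lines(sys.argv[1]):
            print(line)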

Looking in the vdsm log:

http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/4275/artifact/exported-artifacts/basic_suite_master.sh-el7/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity.py/lago-basic-suite-master-host1/_var_log_vdsm/vdsm.log

2016-12-18 07:54:44,325 INFO  (jsonrpc/1) [dispatcher] Run and
protect: getDeviceList(storageType=3, guids=(), checkStatus=False,
options={}) (logUtils:49)

2016-12-18 07:54:44,565 INFO  (jsonrpc/1) [dispatcher] Run and
protect: getDeviceList, Return response: {'devList': [{'status':
'unknown', 'vendorID': 'LIO-ORG', 'capacity': '21474836480', 'fwrev':
'4.0', 'discard_zeroes_data': 0, 'vgUUID': '', 'pvsize': '',
'pathlist': [{'initiatorname': 'default', 'connection':
'192.168.200.3', 'iqn': 'iqn.2014-07.org.ovirt:storage', 'portal':
'1', 'user': 'username', 'password': '********', 'port': '3260'}],
'logicalblocksize': '512', 'discard_max_bytes': 0, 'pathstatus':
[{'type': 'iSCSI', 'physdev': 'sdd', 'capacity': '21474836480',
'state': 'active', 'lun': '3'}], 'devtype': 'iSCSI',
'physicalblocksize': '512', 'pvUUID': '', 'serial':
'SLIO-ORG_lun3_bdev_b94d3d26-d6e4-4bfc-9d91-4898731b721f', 'GUID':
'36001405b94d3d26d6e44bfc9d9148987', 'productID': 'lun3_bdev'},
{'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity':
'21474836480', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': '',
'pvsize': '', 'pathlist': [{'initiatorname': 'default', 'connection':
'192.168.200.3', 'iqn': 'iqn.2014-07.org.ovirt:storage', 'portal':
'1', 'user': 'username', 'password': '********', 'port': '3260'}],
'logicalblocksize': '512', 'discard_max_bytes': 0, 'pathstatus':
[{'type': 'iSCSI', 'physdev': 'sdc', 'capacity': '21474836480',
'state': 'active', 'lun': '2'}], 'devtype': 'iSCSI',
'physicalblocksize': '512', 'pvUUID': '', 'serial':
'SLIO-ORG_lun2_bdev_eddd3ce8-2ea8-4326-a9a8-ebc609a768db', 'GUID':
'36001405eddd3ce82ea84326a9a8ebc60', 'productID': 'lun2_bdev'},
{'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity':
'21474836480', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': '',
'pvsize': '', 'pathlist': [{'initiatorname': 'default', 'connection':
'192.168.200.3', 'iqn': 'iqn.2014-07.org.ovirt:storage', 'portal':
'1', 'user': 'username', 'password': '********', 'port': '3260'}],
'logicalblocksize': '512', 'discard_max_bytes': 0, 'pathstatus':
[{'type': 'iSCSI', 'physdev': 'sda', 'capacity': '21474836480',
'state': 'active', 'lun': '0'}], 'devtype': 'iSCSI',
'physicalblocksize': '512', 'pvUUID': '', 'serial':
'SLIO-ORG_lun0_bdev_2e4947ab-dee9-47b5-b55c-9e28a18d6831', 'GUID':
'360014052e4947abdee947b5b55c9e28a', 'productID': 'lun0_bdev'},
{'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity':
'21474836480', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': '',
'pvsize': '', 'pathlist': [{'initiatorname': 'default', 'connection':
'192.168.200.3', 'iqn': 'iqn.2014-07.org.ovirt:storage', 'portal':
'1', 'user': 'username', 'password': '********', 'port': '3260'}],
'logicalblocksize': '512', 'discard_max_bytes': 0, 'pathstatus':
[{'type': 'iSCSI', 'physdev': 'sdb', 'capacity': '21474836480',
'state': 'active', 'lun': '1'}], 'devtype': 'iSCSI',
'physicalblocksize': '512', 'pvUUID': '', 'serial':
'SLIO-ORG_lun1_bdev_c581fee8-4da8-42c8-8414-305e003cd57c', 'GUID':
'36001405c581fee84da842c88414305e0', 'productID': 'lun1_bdev'},
{'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity':
'21474836480', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': '',
'pvsize': '', 'pathlist': [{'initiatorname': 'default', 'connection':
'192.168.200.3', 'iqn': 'iqn.2014-07.org.ovirt:storage', 'portal':
'1', 'user': 'username', 'password': '********', 'port': '3260'}],
'logicalblocksize': '512', 'discard_max_bytes': 0, 'pathstatus':
[{'type': 'iSCSI', 'physdev': 'sde', 'capacity': '21474836480',
'state': 'active', 'lun': '4'}], 'devtype': 'iSCSI',
'physicalblocksize': '512', 'pvUUID': '', 'serial':
'SLIO-ORG_lun4_bdev_18c78046-b159-4d14-a65e-5f9ad3b5b4f2', 'GUID':
'3600140518c78046b1594d14a65e5f9ad', 'productID': 'lun4_bdev'}]}
(logUtils:52)

2016-12-18 07:54:44,567 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC
call Host.getDeviceList succeeded in 0.24 seconds (__init__:515)

2016-12-18 07:54:45,692 INFO  (jsonrpc/2) [dispatcher] Run and
protect: getDeviceList(storageType=3,
guids=[u'3600140518c78046b1594d14a65e5f9ad',
u'360014052e4947abdee947b5b55c9e28a'], checkStatus=False, options={})
(logUtils:49)

2016-12-18 07:54:45,882 WARN  (jsonrpc/2) [storage.LVM] lvm pvs
failed: 5 [] ['  WARNING: Not using lvmetad because config setting
use_lvmetad=0.', '  WARNING: To avoid corruption, rescan devices to
make changes visible (pvscan --cache).', '  Failed to find physical
volume "/dev/mapper/3600140518c78046b1594d14a65e5f9ad".'] (lvm:324)
2016-12-18 07:54:45,883 WARN  (jsonrpc/2) [storage.HSM] getPV failed
for guid: 3600140518c78046b1594d14a65e5f9ad (hsm:1970)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 1967, in _getDeviceList
    pv = lvm.getPV(guid)
  File "/usr/share/vdsm/storage/lvm.py", line 856, in getPV
    raise se.InaccessiblePhysDev((pvName,))
InaccessiblePhysDev: Multipath cannot access physical device(s):
"devices=(u'3600140518c78046b1594d14a65e5f9ad',)"
2016-12-18 07:54:45,944 WARN  (jsonrpc/2) [storage.LVM] lvm pvs
failed: 5 [] ['  WARNING: Not using lvmetad because config setting
use_lvmetad=0.', '  WARNING: To avoid corruption, rescan devices to
make changes visible (pvscan --cache).', '  Failed to find physical
volume "/dev/mapper/360014052e4947abdee947b5b55c9e28a".'] (lvm:324)
2016-12-18 07:54:45,944 WARN  (jsonrpc/2) [storage.HSM] getPV failed
for guid: 360014052e4947abdee947b5b55c9e28a (hsm:1970)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 1967, in _getDeviceList
    pv = lvm.getPV(guid)
  File "/usr/share/vdsm/storage/lvm.py", line 856, in getPV
    raise se.InaccessiblePhysDev((pvName,))
InaccessiblePhysDev: Multipath cannot access physical device(s):
"devices=(u'360014052e4947abdee947b5b55c9e28a',)"
2016-12-18 07:54:45,977 INFO  (jsonrpc/2) [dispatcher] Run and
protect: getDeviceList, Return response: {'devList': [{'status':
'unknown', 'vendorID': 'LIO-ORG', 'capacity': '21474836480', 'fwrev':
'4.0', 'discard_zeroes_data': 0, 'vgUUID': '', 'pvsize': '',
'pathlist': [{'initiatorname': 'default', 'connection':
'192.168.200.3', 'iqn': 'iqn.2014-07.org.ovirt:storage', 'portal':
'1', 'user': 'username', 'password': '********', 'port': '3260'}],
'logicalblocksize': '512', 'discard_max_bytes': 0, 'pathstatus':
[{'type': 'iSCSI', 'physdev': 'sda', 'capacity': '21474836480',
'state': 'active', 'lun': '0'}], 'devtype': 'iSCSI',
'physicalblocksize': '512', 'pvUUID': '', 'serial':
'SLIO-ORG_lun0_bdev_2e4947ab-dee9-47b5-b55c-9e28a18d6831', 'GUID':
'360014052e4947abdee947b5b55c9e28a', 'productID': 'lun0_bdev'},
{'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity':
'21474836480', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': '',
'pvsize': '', 'pathlist': [{'initiatorname': 'default', 'connection':
'192.168.200.3', 'iqn': 'iqn.2014-07.org.ovirt:storage', 'portal':
'1', 'user': 'username', 'password': '********', 'port': '3260'}],
'logicalblocksize': '512', 'discard_max_bytes': 0, 'pathstatus':
[{'type': 'iSCSI', 'physdev': 'sde', 'capacity': '21474836480',
'state': 'active', 'lun': '4'}], 'devtype': 'iSCSI',
'physicalblocksize': '512', 'pvUUID': '', 'serial':
'SLIO-ORG_lun4_bdev_18c78046-b159-4d14-a65e-5f9ad3b5b4f2', 'GUID':
'3600140518c78046b1594d14a65e5f9ad', 'productID': 'lun4_bdev'}]}
(logUtils:52)
2016-12-18 07:54:45,978 INFO  (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC
call Host.getDeviceList succeeded in 0.29 seconds (__init__:515)

Looks like a PV is not accessible - a storage issue?

Please check that the PVs are clean before using them for the test.
Clearing the LUNs on the storage before the test would be a good idea.
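
For example, something along these lines could run before the test, on a
host that sees the LUNs, to wipe stale PV/VG metadata left from a previous
run (just a sketch; the GUIDs are copied from the log above, and the
device-mapper paths and wipe size are assumptions):

    import subprocess

    # GUIDs taken from the getDeviceList output above.
    LUN_GUIDS = [
        '3600140518c78046b1594d14a65e5f9ad',
        '360014052e4947abdee947b5b55c9e28a',
    ]

    def wipe_lun(guid, mib=1):
        """Zero the start of the LUN, where LVM keeps its label and metadata."""
        dev = '/dev/mapper/%s' % guid
        subprocess.check_call([
            'dd', 'if=/dev/zero', 'of=%s' % dev,
            'bs=1M', 'count=%d' % mib, 'oflag=direct',
        ])

    for guid in LUN_GUIDS:
        wipe_lun(guid)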

2016-12-18 07:54:47,165 INFO  (jsonrpc/4) [dispatcher] Run and
protect: createVG(vgname=u'097c3032-99da-4e18-b547-19ff340f846e',
devlist=[u'3600140518c78046b1594d14a65e5f9ad',
u'360014052e4947abdee947b5b55c9e28a'], force=False, options=None)
(logUtils:49)

The engine is trying to create a VG with the PV
360014052e4947abdee947b5b55c9e28a, which failed earlier in getDeviceList.

This looks like a bug in the engine, ignoring the results of getDeviceList.

2016-12-18 07:54:47,552 INFO  (jsonrpc/4) [dispatcher] Run and
protect: createVG, Return response: {'uuid':
'zJyjdF-AvTC-Z6wN-XNRO-1jDd-2SUH-dVQTJW'} (logUtils:52)
2016-12-18 07:54:47,552 INFO  (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC
call LVMVolumeGroup.create succeeded in 0.38 seconds (__init__:515)
2016-12-18 07:54:48,562 INFO  (jsonrpc/5) [dispatcher] Run and
protect: createStorageDomain(storageType=3,
sdUUID=u'097c3032-99da-4e18-b547-19ff340f846e', domainName=u'iscsi',
typeSpecificArg=u'zJyjdF-AvTC-Z6wN-XNRO-1jDd-2SUH-dVQTJW', domClass=1,
domVersion=u'3', options=None) (logUtils:49)
2016-12-18 07:54:48,563 ERROR (jsonrpc/5) [storage.StorageDomainCache]
looking for unfetched domain 097c3032-99da-4e18-b547-19ff340f846e
(sdc:151)
2016-12-18 07:54:48,563 ERROR (jsonrpc/5) [storage.StorageDomainCache]
looking for domain 097c3032-99da-4e18-b547-19ff340f846e (sdc:168)
2016-12-18 07:54:48,564 ERROR (jsonrpc/5) [storage.StorageDomainCache]
domain 097c3032-99da-4e18-b547-19ff340f846e not found (sdc:157)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 155, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 185, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist:
(u'097c3032-99da-4e18-b547-19ff340f846e',)

We see a lot of these errors in the rest of the log. This means something
is wrong with this VG.
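
To see what LVM itself reports for these devices, running pvs and vgs
against them on the host is a good first step (a sketch; the GUIDs and VG
name are the ones from the log above):

    import subprocess

    PV_GUIDS = [
        '3600140518c78046b1594d14a65e5f9ad',
        '360014052e4947abdee947b5b55c9e28a',
    ]
    VG_NAME = '097c3032-99da-4e18-b547-19ff340f846e'

    def run(cmd):
        """Print the command and its output, like running it in a shell."""
        print('$ ' + ' '.join(cmd))
        subprocess.call(cmd)

    run(['pvs', '-o', 'pv_name,vg_name,pv_uuid,pv_size'] +
        ['/dev/mapper/%s' % guid for guid in PV_GUIDS])
    run(['vgs', '-o', 'vg_name,pv_count,lv_count,vg_size', VG_NAME])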

This needs deeper investigation from a storage developer on both the engine
and vdsm sides, but I would start by making sure we use clean LUNs. We are
not trying to test esoteric negative flows in the system tests.

Did we change something in the system tests project or lago while we
were not looking?

Can we reproduce this issue manually with the same engine and vdsm versions?
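
For reference, the failing step can be driven manually with the same SDK
calls the test uses, something like this (a sketch; the engine URL,
credentials and VM name are placeholders, and the VM is assumed to already
have three snapshots, as in the test):

    import time
    from ovirtsdk.api import API

    api = API(url='https://engine.example.com/ovirt-engine/api',
              username='admin@internal', password='password', insecure=True)

    VM_NAME = 'vm0'  # placeholder - a VM that already has 3 snapshots

    # Same operation as line 226 of 004_basic_sanity.py.
    api.vms.get(VM_NAME).snapshots.list()[-2].delete()

    # Same condition the test waits for, but without the 180 seconds limit,
    # so we can see how long the merge really takes.
    while True:
        snapshots = api.vms.get(VM_NAME).snapshots.list()
        if len(snapshots) == 2 and snapshots[-1].snapshot_status == 'ok':
            print('merge completed')
            break
        time.sleep(5)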

Nir

>
> Please have a look.
>
> Thanks,
> Barak.
>
>
> --
> Barak Korren
> bkor...@redhat.com
> RHCE, RHCi, RHV-DevOps Team
> https://ifireball.wordpress.com/
_______________________________________________
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra
