On Tue, Mar 10, 2020 at 8:14 PM Nir Soffer <nsof...@redhat.com> wrote:
>
> On Tue, Mar 10, 2020 at 7:03 PM Amit Bawer <aba...@redhat.com> wrote:
> >
> > Seems like a reproduction of 
> > https://bugzilla.redhat.com/show_bug.cgi?id=1807050#c1
>
> Agree, because...
>
> > Snipped from 
> > https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/21146/artifact/basic-suite.el7.x86_64/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host-1/_var_log/vdsm/vdsm.log:
> >
> > 2020-03-10 05:59:18,549-0400 ERROR (jsonrpc/3) [storage.LVM] vg 
> > cceb9d83-7b76-4840-a189-c82f3c18760e has pv_count 2 but pv_names 
> > ('/dev/mapper/3600140544bef7e411164e5f94e13b5d8',) (lvm:578)
> > 2020-03-10 05:59:18,551-0400 INFO  (jsonrpc/3) [storage.StorageDomain] 
> > sdUUID=cceb9d83-7b76-4840-a189-c82f3c18760e (blockSD:1192)
> > 2020-03-10 05:59:18,551-0400 DEBUG (jsonrpc/3) [common.commands] 
> > /usr/bin/taskset --cpu-list 0-1 /usr/bin/sudo -n /sbin/lvm vgck --config 
> > 'devices {  preferred_names=["^/dev/mapper/"]  ignore_suspended_devices=1  
> > write_cache_state=0  disable_after_error_count=3  
> > filter=["a|^/dev/mapper/3600140544bef7e411164e5f94e13b5d8$|", "r|.*|"]  
> > hints="none" } global {  locking_type=1  prioritise_write_locks=1  
> > wait_for_locks=1  use_lvmetad=0 } backup {  retain_min=50  retain_days=0 }' 
> > cceb9d83-7b76-4840-a189-c82f3c18760e (cwd None) (commands:153)
> > 2020-03-10 05:59:18,634-0400 DEBUG (jsonrpc/3) [common.commands] FAILED: 
> > <err> = b"  WARNING: Couldn't find device with uuid 
> > FH6lfD-DZus-6Ndn-tkr8-5Hsy-lt2c-CDRPDU.\n  WARNING: VG 
> > cceb9d83-7b76-4840-a189-c82f3c18760e is missing PV 
> > FH6lfD-DZus-6Ndn-tkr8-5Hsy-lt2c-CDRPDU.\n  The volume group is missing 1 
> > physical volumes.\n"; <rc> = 5 (commands:185)
> > 2020-03-10 05:59:18,637-0400 INFO  (jsonrpc/3) [vdsm.api] FINISH 
> > getStorageDomainInfo error=Domain is either partially accessible or 
> > entirely inaccessible: ('cceb9d83-7b76-4840-a189-c82f3c18760e: ["  WARNING: 
> > Couldn\'t find device with uuid FH6lfD-DZus-6Ndn-tkr8-5Hsy-lt2c-CDRPDU.", 
> > \'  WARNING: VG cceb9d83-7b76-4840-a189-c82f3c18760e is missing PV 
> > FH6lfD-DZus-6Ndn-tkr8-5Hsy-lt2c-CDRPDU.\', \'  The volume group is missing 
> > 1 physical volumes.\']',) from=::ffff:192.168.201.4,47796, 
> > flow_id=5f02a1ec-db37-470d-b329-41b22f23582b, 
> > task_id=9be86ca4-49ac-47ea-b0e2-8182e33924ff (api:52)
>
> This command was run only once. Usually, when a command using a specific
> filter (e.g. filter=["a|^/dev/mapper/3600140544bef7e411164e5f94e13b5d8$|", "r|.*|"])
> fails, we rebuild the filter. If the new filter is different (e.g. it has
> more devices), we run the command again.
>
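The retry rule described above can be sketched as follows. This is only an
illustration: run_with_filter_retry, run and rebuild_filter are hypothetical
names made up for this example, not vdsm's actual API, and the fake vgck
below just simulates a VG that needs two devices.

```python
from typing import Callable, FrozenSet, Tuple

Result = Tuple[int, str]  # (return code, stderr)


def run_with_filter_retry(
    run: Callable[[FrozenSet[str]], Result],
    rebuild_filter: Callable[[], FrozenSet[str]],
    current_filter: FrozenSet[str],
) -> Tuple[Result, int]:
    """Run once; retry only if the command failed and the rebuilt filter differs.

    Returns the final result and the number of times the command ran.
    """
    result = run(current_filter)
    if result[0] == 0:
        return result, 1
    new_filter = rebuild_filter()
    if new_filter == current_filter:
        # Same devices as before: a retry could not see anything new.
        return result, 1
    return run(new_filter), 2


# Hypothetical stand-in for "lvm vgck": fails unless both PVs are visible.
def fake_vgck(devices: FrozenSet[str]) -> Result:
    return (0, "") if len(devices) >= 2 else (5, "VG is missing PV")


dev = "/dev/mapper/3600140544bef7e411164e5f94e13b5d8"
only_one = frozenset({dev})

# The rebuilt filter is identical, so the command runs exactly once.
result, runs = run_with_filter_retry(fake_vgck, lambda: only_one, only_one)
```

With an unchanged filter the command runs once and its failure is returned
as is, which matches the single vgck invocation seen in the log.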
> Since we ran the command only once, we know that the filter was correct,
> so we had only /dev/mapper/3600140544bef7e411164e5f94e13b5d8 on the host.
> The other PV was not available when this command was run.
>
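To make the mismatch concrete, here is a small sketch that extracts the
accepted devices from the filter in the quoted vgck command and compares
them with the pv_count reported by storage.LVM. The regular expression and
the accepted_devices helper are assumptions written for this example; they
only handle filter entries of the exact form shown in the log.

```python
import re

# Matches accept entries of the form "a|^/dev/mapper/<name>$|".
ACCEPT = re.compile(r'"a\|\^(?P<dev>[^$|]+)\$\|"')


def accepted_devices(lvm_config: str) -> list:
    """Return the device paths accepted by the filter in an lvm --config string."""
    return [m.group("dev") for m in ACCEPT.finditer(lvm_config)]


# Filter copied from the vgck command in vdsm.log (other options omitted).
config = (
    'devices { filter=["a|^/dev/mapper/3600140544bef7e411164e5f94e13b5d8$|", '
    '"r|.*|"] }'
)

devices = accepted_devices(config)
pv_count = 2  # from the storage.LVM error: "has pv_count 2 but pv_names (...)"
missing = pv_count - len(devices)
```

Here the filter accepts one device while the VG metadata expects two PVs,
so one PV is unaccounted for, in line with the "missing 1 physical volumes"
message from lvm.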
> We started the connection here:
>
> 2020-03-10 05:59:17,364-0400 DEBUG (jsonrpc/2) [common.commands]
> /usr/bin/taskset --cpu-list 0-1 /usr/bin/sudo -n /sbin/iscsiadm -m
> node -T iqn.2014-07.org.ovirt:storage -I default -p
> 192.168.200.4:3260,1 -l (cwd None) (commands:153)
> 2020-03-10 05:59:17,504-0400 DEBUG (jsonrpc/2) [common.commands]
> SUCCESS: <err> = b''; <rc> = 0 (commands:98)
>
> And finished here:
>
> 2020-03-10 05:59:17,610-0400 DEBUG (jsonrpc/2) [common.commands]
> /usr/bin/taskset --cpu-list 0-1 /sbin/udevadm settle --timeout=5 (cwd
> None) (commands:153)
> 2020-03-10 05:59:17,787-0400 DEBUG (jsonrpc/2) [common.commands]
> SUCCESS: <err> = b''; <rc> = 0 (commands:98)
>
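For reference, the gap between the end of "udevadm settle" and the start of
the failing vgck command can be computed from the two vdsm.log timestamps
quoted in this thread. This is just arithmetic on the log lines; the parse
helper is written for this example.

```python
from datetime import datetime

# vdsm.log timestamp format: date, time, milliseconds, UTC offset.
FMT = "%Y-%m-%d %H:%M:%S,%f%z"


def parse(ts: str) -> datetime:
    return datetime.strptime(ts, FMT)


settle_done = parse("2020-03-10 05:59:17,787-0400")  # udevadm settle SUCCESS
vgck_start = parse("2020-03-10 05:59:18,551-0400")   # lvm vgck command logged

gap_seconds = (vgck_start - settle_done).total_seconds()
print(f"vgck started {gap_seconds:.3f}s after udevadm settle returned")
```

The syslog lines only have one-second resolution, so they cannot be ordered
precisely against these timestamps.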
> In /var/log/messages we see the connection starting here:
>
> Mar 10 05:59:17 lago-basic-suite-master-host-1 iscsid[21973]: iscsid:
> Connection2:0 to [target: iqn.2014-07.org.ovirt:storage, portal:
> 192.168.200.4,3260] through [iface: default] is operational now
>
> Adding devices:
>
> Mar 10 05:59:17 lago-basic-suite-master-host-1 kernel: sd 3:0:0:0:
> [sdf] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
> Mar 10 05:59:17 lago-basic-suite-master-host-1 kernel: sd 3:0:0:4:
> [sdg] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
> Mar 10 05:59:17 lago-basic-suite-master-host-1 kernel: sd 3:0:0:3:
> [sdh] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
> Mar 10 05:59:17 lago-basic-suite-master-host-1 kernel: sd 3:0:0:2:
> [sdi] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
> Mar 10 05:59:17 lago-basic-suite-master-host-1 kernel: sd 3:0:0:1:
> [sdj] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
>
> Multipath adding devices to maps:
>
> Mar 10 05:59:18 lago-basic-suite-master-host-1 multipathd[21959]: sdb
> [8:16]: path added to devmap 36001405b39cc4e33bd24f35a81c0c140
> Mar 10 05:59:18 lago-basic-suite-master-host-1 multipathd[21959]: sdc
> [8:32]: path added to devmap 36001405a70a062950224fc985825aa0d
> Mar 10 05:59:18 lago-basic-suite-master-host-1 multipathd[21959]: sda
> [8:0]: path added to devmap 3600140559c49ea12b0d4dc1994ba4ef0
> Mar 10 05:59:18 lago-basic-suite-master-host-1 multipathd[21959]: sde
> [8:64]: path added to devmap 36001405f277c71b13814669926ffbae4
> Mar 10 05:59:18 lago-basic-suite-master-host-1 multipathd[21959]: sdi
> [8:128]: path added to devmap 3600140544bef7e411164e5f94e13b5d8  <<<
> This is probably the missing device
>
> So we need to wait for a while, until multipath handles all the devices.
>
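A generic way to "wait for a while" is to poll with a timeout. The sketch
below is not taken from the patch mentioned in this thread and does not
claim to show what it does; wait_for and all_present are hypothetical
helpers, and checking that a /dev/mapper node exists is only an assumed
stand-in for "multipath handled the device".

```python
import os
import time
from typing import Callable, Iterable


def wait_for(
    predicate: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll predicate until it returns True or timeout seconds have passed."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def all_present(paths: Iterable[str]) -> bool:
    """True if every path exists (assumed proxy for device readiness)."""
    return all(os.path.exists(p) for p in paths)
```

A caller could then use something like
wait_for(lambda: all_present(expected_paths), timeout=10) before running
LVM commands, where expected_paths is whatever list of device nodes the
caller expects to appear.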
> https://gerrit.ovirt.org/c/107206/ should avoid this issue.

Merged now. We should not see this issue anymore.

> Benny, please try to run OST.
>
> > 2020-03-10 05:59:18,637-0400 ERROR (jsonrpc/3) [storage.TaskManager.Task] 
> > (Task='9be86ca4-49ac-47ea-b0e2-8182e33924ff') Unexpected error (task:880)
> > Traceback (most recent call last):
> >   File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 887, 
> > in _run
> >     return fn(*args, **kargs)
> >   File "<decorator-gen-129>", line 2, in getStorageDomainInfo
> >   File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in 
> > method
> >     ret = func(*args, **kwargs)
> >   File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 2752, 
> > in getStorageDomainInfo
> >     dom = self.validateSdUUID(sdUUID)
> >   File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 310, in 
> > validateSdUUID
> >     sdDom.validate()
> >   File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 
> > 1193, in validate
> >     lvm.chkVG(self.sdUUID)
> >   File "/usr/lib/python3.6/site-packages/vdsm/storage/lvm.py", line 1278, 
> > in chkVG
> >     raise se.StorageDomainAccessError("%s: %s" % (vgName, err))
> > vdsm.storage.exception.StorageDomainAccessError: Domain is either partially 
> > accessible or entirely inaccessible: 
> > ('cceb9d83-7b76-4840-a189-c82f3c18760e: ["  WARNING: Couldn\'t find device 
> > with uuid FH6lfD-DZus-6Ndn-tkr8-5Hsy-lt2c-CDRPDU.", \'  WARNING: VG 
> > cceb9d83-7b76-4840-a189-c82f3c18760e is missing PV 
> > FH6lfD-DZus-6Ndn-tkr8-5Hsy-lt2c-CDRPDU.\', \'  The volume group is missing 
> > 1 physical volumes.\']',)
> > 2020-03-10 05:59:18,637-0400 INFO  (jsonrpc/3) [storage.TaskManager.Task] 
> > (Task='9be86ca4-49ac-47ea-b0e2-8182e33924ff') aborting: Task is aborted: 
> > 'value=Domain is either partially accessible or entirely inaccessible: 
> > (\'cceb9d83-7b76-4840-a189-c82f3c18760e: ["  WARNING: Couldn\\\'t find 
> > device with uuid FH6lfD-DZus-6Ndn-tkr8-5Hsy-lt2c-CDRPDU.", \\\'  WARNING: 
> > VG cceb9d83-7b76-4840-a189-c82f3c18760e is missing PV 
> > FH6lfD-DZus-6Ndn-tkr8-5Hsy-lt2c-CDRPDU.\\\', \\\'  The volume group is 
> > missing 1 physical volumes.\\\']\',) abortedcode=379' (task:1190)
> > 2020-03-10 05:59:18,638-0400 ERROR (jsonrpc/3) [storage.Dispatcher] FINISH 
> > getStorageDomainInfo error=Domain is either partially accessible or 
> > entirely inaccessible: ('cceb9d83-7b76-4840-a189-c82f3c18760e: ["  WARNING: 
> > Couldn\'t find device with uuid FH6lfD-DZus-6Ndn-tkr8-5Hsy-lt2c-CDRPDU.", 
> > \'  WARNING: VG cceb9d83-7b76-4840-a189-c82f3c18760e is missing PV 
> > FH6lfD-DZus-6Ndn-tkr8-5Hsy-lt2c-CDRPDU.\', \'  The volume group is missing 
> > 1 physical volumes.\']',) (dispatcher:83)
> >
> >
> > I suggest trying again once the BZ is fixed on master.
> >
> > On Tue, Mar 10, 2020 at 1:36 PM Yedidyah Bar David <d...@redhat.com> wrote:
> > >
> > > Hi all,
> > >
> > > Anyone looking at this?
> > >
> > > See e.g.:
> > >
> > > https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/21146/
> > >
> > > Thanks,
> > > --
> > > Didi
> > > _______________________________________________
> > > Devel mailing list -- devel@ovirt.org
> > > To unsubscribe send an email to devel-le...@ovirt.org
> > > Privacy Statement: https://www.ovirt.org/privacy-policy.html
> > > oVirt Code of Conduct: 
> > > https://www.ovirt.org/community/about/community-guidelines/
> > > List Archives: 
> > > https://lists.ovirt.org/archives/list/devel@ovirt.org/message/ED57V5XW4B3WC7AM5GRYDE6CJJL7PWPM/
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/4VB7YMBOZYXP3T5OS25L3HK6XAJUJ2XL/