So, after a week of bashing my head against a wall, we finally tracked it down.

One of the developers was using the hosts for extra processing power and, in
the process, periodically brought up an sshfs mount. When this appeared in
/etc/mnttab it broke vdsm and caused the error below, so it was unrelated to
the power outage.

This is in 4.2. If I can re-create it in 4.3 I'll file a bug; I think the
parser really should be a bit more robust than this.
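
To illustrate what "more robust" could mean here, below is a minimal sketch
of a mount-table reader that logs and skips an entry it cannot parse instead
of letting the exception abort the whole scan. This is not vdsm's code: the
names MountRecord, BadSpecError, parse_line and iter_mounts are hypothetical,
and the check for an empty host part is only a stand-in for whatever
validation the real parser performs. The ":/" spec comes from the error
message below; the rest of the sample sshfs line is made up.

```python
import logging
from collections import namedtuple

log = logging.getLogger("mount_sketch")

# Hypothetical record type holding the first four fstab-style fields.
MountRecord = namedtuple("MountRecord", "fs_spec fs_file fs_vfstype fs_mntops")


class BadSpecError(ValueError):
    """Raised when a single mount-table line cannot be parsed."""


def parse_line(line):
    # Split one fstab-style line into whitespace-separated fields.
    fields = line.split()
    if len(fields) < 4:
        raise BadSpecError("expected at least 4 fields: %r" % line)
    fs_spec, fs_file, fs_vfstype, fs_mntops = fields[:4]
    # Stand-in for stricter validation: a "host:path" spec with an
    # empty host part (such as ":/") is treated as unparsable.
    if fs_spec.startswith(":"):
        raise BadSpecError("empty host in spec %r" % fs_spec)
    return MountRecord(fs_spec, fs_file, fs_vfstype, fs_mntops)


def iter_mounts(lines):
    # Yield a record for every line that parses; log and skip the rest
    # so one odd entry does not hide all the other mounts.
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        try:
            yield parse_line(line)
        except BadSpecError as e:
            log.warning("Skipping unparsable mount entry: %s", e)


if __name__ == "__main__":
    sample = [
        "/dev/mapper/vg-root / xfs rw,relatime 0 0",
        ":/ /home/dev/remote fuse.sshfs rw,nosuid,nodev 0 0",  # made-up line
        "nfs1:/export /mnt/nfs nfs rw 0 0",
    ]
    for rec in iter_mounts(sample):
        print(rec.fs_file, rec.fs_vfstype)
```

Whether skipping is the right behaviour for a storage monitor is a design
question for the vdsm developers; the point is only that one unrelated mount
should probably not make every storage domain check fail.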


---- On Tue, 07 May 2019 15:47:45 +0100 Darrell Budic
<bu...@onholyground.com> wrote ----


Was your setup hyperconverged, and is this storage Gluster-based?

Your error is DNS related, if a bit odd. Have you checked the resolv.conf 
configs and confirmed the servers listed there are reachable and responsive? 
When your hosts are active, are they able to mount all the storage domains they 
need? You should also make sure each HA node can reliably ping your gateway IP, 
failures there will cause nodes to bounce.



A starting place rather than a solution, but these are the first places to
look. Good luck!



  -Darrell





On May 7, 2019, at 5:14 AM, Alan G <alan+ov...@griff.me.uk> wrote:


Hi,



We have a dev cluster running 4.2. It had to be powered down as the building
was going to lose power. Since we brought it back up it has been massively
unstable (hosts constantly switching state, VMs migrating all the time).



I now have one host running (with HE) and all others in maintenance mode. When
I try to activate another host I see storage errors in vdsm.log:



2019-05-07 09:41:00,114+0000 ERROR (monitor/a98c0b4) [storage.Monitor] Error checking domain a98c0b42-47b9-4632-8b54-0ff3bd80d4c2 (monitor:424)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 416, in _checkDomainStatus
    masterStats = self.domain.validateMaster()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 941, in validateMaster
    if not self.validateMasterMount():
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockSD.py", line 1377, in validateMasterMount
    return mount.isMounted(self.getMasterDir())
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 161, in isMounted
    getMountFromTarget(target)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 173, in getMountFromTarget
    for rec in _iterMountRecords():
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 143, in _iterMountRecords
    for rec in _iterKnownMounts():
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 139, in _iterKnownMounts
    yield _parseFstabLine(line)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 81, in _parseFstabLine
    fs_spec = fileUtils.normalize_path(_unescape_spaces(fs_spec))
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileUtils.py", line 94, in normalize_path
    host, tail = address.hosttail_split(path)
  File "/usr/lib/python2.7/site-packages/vdsm/common/network/address.py", line 43, in hosttail_split
    raise HosttailError('%s is not a valid hosttail address:' % hosttail)
HosttailError: :/ is not a valid hosttail address:



Not sure if it's related, but since the restart the hosted_storage domain has
been elected the master domain.



I'm a bit stuck at the moment. My only idea is to remove HE and switch to a 
standalone Engine VM running outside the cluster.



Thanks,



Alan




_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/UDINZK5BQQHXYENSVV3OYFMVLG2YXBNT/





