So, after a week of bashing my head against a wall, we finally tracked it down.
One of the developers was using the hosts for extra processing power, and in the
process he periodically brought up an sshfs mount. When this appeared in
/etc/mnttab it broke vdsm and caused the error below, so it was unrelated to the
power outage.
This is in 4.2. If I can reproduce it in 4.3 I'll file a bug; I think the parser
really should be a bit more robust than this.
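To illustrate what "more robust" could mean here, below is a minimal sketch in Python of a mount-table reader that skips an entry it cannot parse instead of letting the exception abort the whole scan. It is not vdsm's code: `split_host_tail`, `iter_mount_records` and the sample table are hypothetical, and the `:/` entry is only my guess at what the offending line looked like, based on the error message.

```python
def split_host_tail(spec):
    """Simplified stand-in for a host:path splitter (NOT vdsm's implementation)."""
    host, sep, tail = spec.partition(":")
    if not sep or not host or not tail:
        raise ValueError("%r is not a valid host:path spec" % spec)
    return host, tail


def iter_mount_records(lines):
    """Yield (fs_spec, fs_file, fs_vfstype), skipping entries that fail to parse."""
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or truncated line
        fs_spec, fs_file, fs_vfstype = fields[:3]
        # Only remote-looking specs (containing ':' and not an absolute
        # local path) are validated as host:path.
        if ":" in fs_spec and not fs_spec.startswith("/"):
            try:
                split_host_tail(fs_spec)
            except ValueError:
                # Tolerate the foreign entry rather than failing every lookup.
                continue
        yield fs_spec, fs_file, fs_vfstype


# Hypothetical mount table; the third line mimics the unparsable entry.
SAMPLE = """\
/dev/mapper/vg-root / xfs rw 0 0
nfs.example.com:/export /mnt/nfs nfs rw 0 0
:/ /home/dev/remote fuse.sshfs rw 0 0
"""

records = list(iter_mount_records(SAMPLE.splitlines()))
```

With this approach the two well-formed entries are still returned and the odd one is simply ignored. A real fix would probably also want to log the skipped line.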
---- On Tue, 07 May 2019 15:47:45 +0100 Darrell Budic
<mailto:bu...@onholyground.com> wrote ----
Is your setup hyperconverged, and is the storage Gluster-based?
Your error is DNS related, if a bit odd. Have you checked the resolv.conf
configs and confirmed the servers listed there are reachable and responsive?
When your hosts are active, are they able to mount all the storage domains they
need? You should also make sure each HA node can reliably ping your gateway IP;
failures there will cause nodes to bounce.
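As a rough sketch of the resolv.conf check, something like the following Python could list the configured nameservers and probe each one. The function names are made up for illustration, and probing TCP port 53 is an assumption: a server that only answers DNS over UDP would show up as unreachable here even though it works.

```python
import socket


def parse_nameservers(resolv_conf_text):
    """Return the addresses found on 'nameserver' lines, ignoring comments."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop anything after a '#' or ';' comment marker.
        line = line.split("#", 1)[0].split(";", 1)[0]
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def tcp_reachable(host, port=53, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

On a host you would read /etc/resolv.conf, pass its contents to `parse_nameservers`, and call `tcp_reachable` on each address returned.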
A starting place rather than a solution, but these are the first places to look. Good luck!
-Darrell
On May 7, 2019, at 5:14 AM, Alan G <mailto:alan+ov...@griff.me.uk> wrote:
Hi,
We have a dev cluster running 4.2. It had to be powered down as the building
was going to lose power. Since we brought it back up it has been massively
unstable (hosts constantly switching state, VMs migrating all the time).
I now have one host running (with HE) and all others in maintenance mode. When
I try to activate another host I see storage errors in vdsm.log:
2019-05-07 09:41:00,114+0000 ERROR (monitor/a98c0b4) [storage.Monitor] Error checking domain a98c0b42-47b9-4632-8b54-0ff3bd80d4c2 (monitor:424)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 416, in _checkDomainStatus
    masterStats = self.domain.validateMaster()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 941, in validateMaster
    if not self.validateMasterMount():
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockSD.py", line 1377, in validateMasterMount
    return mount.isMounted(self.getMasterDir())
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 161, in isMounted
    getMountFromTarget(target)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 173, in getMountFromTarget
    for rec in _iterMountRecords():
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 143, in _iterMountRecords
    for rec in _iterKnownMounts():
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 139, in _iterKnownMounts
    yield _parseFstabLine(line)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 81, in _parseFstabLine
    fs_spec = fileUtils.normalize_path(_unescape_spaces(fs_spec))
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileUtils.py", line 94, in normalize_path
    host, tail = address.hosttail_split(path)
  File "/usr/lib/python2.7/site-packages/vdsm/common/network/address.py", line 43, in hosttail_split
    raise HosttailError('%s is not a valid hosttail address:' % hosttail)
HosttailError: :/ is not a valid hosttail address:
Not sure if it's related but since the restart the hosted_storage domain has
been elected the master domain.
I'm a bit stuck at the moment. My only idea is to remove HE and switch to a
standalone Engine VM running outside the cluster.
Thanks,
Alan
_______________________________________________
Users mailing list -- mailto:users@ovirt.org
To unsubscribe send an email to mailto:users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/UDINZK5BQQHXYENSVV3OYFMVLG2YXBNT/