On 1/26/2014 3:10 PM, Itamar Heim wrote:
On 01/26/2014 10:08 PM, Ted Miller wrote:
is this gluster storage (guessing since you mentioned a 'volume')
yes (mentioned under "setup" above)
does it have a quorum?
Volume Name: VM2
Type: Replicate
Volume ID: 7bea8d3b-ec2a-4939-8da8-a82e6bda841e
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.41.65.2:/bricks/01/VM2
Brick2: 10.41.65.4:/bricks/01/VM2
Brick3: 10.41.65.4:/bricks/101/VM2
Options Reconfigured:
cluster.server-quorum-type: server
storage.owner-gid: 36
storage.owner-uid: 36
auth.allow: *
user.cifs: off
nfs.disa
(there were reports of split brain on the domain metadata before when
no quorum exists for gluster)
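For reference, a minimal sketch of the quorum knobs involved on a 1x3 replica like VM2 (volume name taken from the output above; the server-side option is already set on this volume, the client-side one is shown only for completeness and was not part of the original setup):

# client-side (afr) quorum: writes need a majority of the 3 bricks
gluster volume set VM2 cluster.quorum-type auto
# server-side quorum, as already shown in "Options Reconfigured" above
gluster volume set VM2 cluster.server-quorum-type server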
after full heal:
[root@office4a ~]$ gluster volume heal VM2 info
Gathering Heal info on volume VM2 has been successful
Brick 10.41.65.2:/bricks/01/VM2
Number of entries: 0
Brick 10.41.65.4:/bricks/01/VM2
Number of entries: 0
Brick 10.41.65.4:/bricks/101/VM2
Number of entries: 0
[root@office4a ~]$ gluster volume heal VM2 info split-brain
Gathering Heal info on volume VM2 has been successful
Brick 10.41.65.2:/bricks/01/VM2
Number of entries: 0
Brick 10.41.65.4:/bricks/01/VM2
Number of entries: 0
Brick 10.41.65.4:/bricks/101/VM2
Number of entries: 0
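(For context, the "full heal" referenced above would typically be kicked off with the stock CLI before re-checking the counts; a sketch, assuming the same VM2 volume:)

# trigger a full self-heal across all three bricks
gluster volume heal VM2 full
# then re-check pending and split-brain entries
gluster volume heal VM2 info
gluster volume heal VM2 info split-brain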
Noticed this in the host's /var/log/messages (while looking for something else).
The loop seems to repeat over and over.
Jan 26 15:35:52 office4a sanlock[3763]: 2014-01-26 15:35:52-0500 14678 [30419]:
read_sectors delta_leader offset 512 rv -90
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
Jan 26 15:35:53 office4a sanlock[3763]: 2014-01-26 15:35:53-0500 14679 [3771]:
s1997 add_lockspace fail result -90
Jan 26 15:35:58 office4a vdsm TaskManager.Task ERROR Task=`89885661-88eb-4ea3-8793-00438735e4ab`::Unexpected
error#012Traceback (most recent call last):#012 File "/usr/share/vdsm/storage/task.py", line 857, in
_run#012 return fn(*args, **kargs)#012 File "/usr/share/vdsm/logUtils.py", line 45, in wrapper#012 res =
f(*args, **kwargs)#012 File "/usr/share/vdsm/storage/hsm.py", line 2111, in getAllTasksStatuses#012
allTasksStatus = sp.getAllTasksStatuses()#012 File "/usr/share/vdsm/storage/securable.py", line 66, in
wrapper#012
raise SecureError()#012SecureError
Jan 26 15:35:59 office4a sanlock[3763]: 2014-01-26 15:35:59-0500 14686 [30495]:
read_sectors delta_leader offset 512 rv -90
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
Jan 26 15:36:00 office4a sanlock[3763]: 2014-01-26 15:36:00-0500 14687 [3772]:
s1998 add_lockspace fail result -90
Jan 26 15:36:00 office4a vdsm TaskManager.Task ERROR Task=`8db9ff1a-2894-407a-915a-279f6a7eb205`::Unexpected error#012Traceback
(most recent call last):#012 File "/usr/share/vdsm/storage/task.py", line 857, in _run#012 return fn(*args,
**kargs)#012 File "/usr/share/vdsm/storage/task.py", line 318, in run#012 return self.cmd(*self.argslist,
**self.argsdict)#012 File "/usr/share/vdsm/storage/sp.py", line 273, in startSpm#012
self.masterDomain.acquireHostId(self.id)#012 File "/usr/share/vdsm/storage/sd.py", line 458, in acquireHostId#012
self._clusterLock.acquireHostId(hostId, async)#012 File "/usr/share/vdsm/storage/clusterlock.py", line 189, in
acquireHostId#012 raise se.AcquireHostIdFailure(self._sdUUID, e)#012AcquireHostIdFailure: Cannot acquire host id:
('0322a407-2b16-40dc-ac67-13d387c6eb4c', SanlockException(90, 'Sanlock lockspace add failure', 'Message too long'))
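The rv -90 / "Message too long" here is errno 90 (EMSGSIZE) from sanlock's read of the lockspace 'ids' file. A quick sanity check, not from the original thread, is whether that file looks intact on the gluster mount and on each brick (paths taken from the volume info and log above; run the getfattr lines on the respective brick hosts):

# size/permissions of the lockspace file as seen through the gluster mount
ls -l /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
# and directly on the bricks; the trusted.afr changelog xattrs show pending/split-brain state
getfattr -d -m . -e hex /bricks/01/VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
getfattr -d -m . -e hex /bricks/101/VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids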