Well, that didn't go so well. I deleted both dom_md/ids and dom_md/leases in the cloned volume, and I still can't import the storage domain. The snapshot was also taken some 4 hours before the attempted import, so I'm surprised the locks haven't expired by themselves...
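For context on why neither waiting nor deleting helped, here is a minimal sketch of two sanlock behaviours, assuming default settings (io_timeout of 10s; check your host's actual configuration):

```shell
#!/bin/sh
# Minimal sketch, assuming sanlock defaults -- not a statement about
# this specific cluster's configuration.

# 1) Leases don't expire on wall-clock time. With the default
#    io_timeout of 10s, a holder is declared failed only after roughly
#    8x that (~80s) *without successful renewals* -- a live host keeps
#    renewing indefinitely, so a 4-hour wait changes nothing.
IO_TIMEOUT=10
echo "renewal declared failed after $((8 * IO_TIMEOUT))s without a successful write"

# 2) Deleting dom_md/ids or dom_md/leases removes the name, not the
#    inode: while sanlock keeps its descriptor open, the inode (and the
#    lease state inside it) survives until the last close.
tmp=$(mktemp)
echo "lease data" > "$tmp"
exec 3< "$tmp"   # hold the file open, as sanlock holds the ids file
rm -f "$tmp"     # delete the path, as removing the dom_md files does
[ -e "$tmp" ] || echo "path gone"
cat <&3          # contents are still readable through the open fd
exec 3<&-        # only closing the descriptor actually frees the inode
```

The second half is exactly Nir's point below about open files: unlinking on the server does nothing while a local daemon still holds the file open.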
2017-02-16 21:58:24,630-03 INFO [org.ovirt.engine.core.bll.storage.connection.AddStorageServerConnectionCommand] (default task-45) [d59bc8c0-3c53-4a34-9d7c-8c982ee14e14] Lock Acquired to object 'EngineLock:{exclusiveLocks='[localhost:data-teste2=<STORAGE_CONNECTION, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2017-02-16 21:58:24,645-03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (default task-45) [d59bc8c0-3c53-4a34-9d7c-8c982ee14e14] START, ConnectStorageServerVDSCommand(HostName = v5.dc0.example.com, StorageServerConnectionManagementVDSParameters:{runAsync='true', hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0', storagePoolId='00000000-0000-0000-0000-000000000000', storageType='GLUSTERFS', connectionList='[StorageServerConnections:{id='null', connection='localhost:data-teste2', iqn='null', vfsType='glusterfs', mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null', iface='null', netIfaceName='null'}]'}), log id: 726df65e
2017-02-16 21:58:26,046-03 INFO [org.ovirt.engine.core.bll.storage.connection.AddStorageServerConnectionCommand] (default task-45) [d59bc8c0-3c53-4a34-9d7c-8c982ee14e14] Lock freed to object 'EngineLock:{exclusiveLocks='[localhost:data-teste2=<STORAGE_CONNECTION, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2017-02-16 21:58:26,206-03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainsListVDSCommand] (default task-52) [85548427-713f-4ffb-a385-a97a7ee4109d] START, HSMGetStorageDomainsListVDSCommand(HostName = v5.dc0.example.com, HSMGetStorageDomainsListVDSCommandParameters:{runAsync='true', hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0', storagePoolId='00000000-0000-0000-0000-000000000000', storageType='null', storageDomainType='Data', path='localhost:data-teste2'}), log id: 79f6cc88
2017-02-16 21:58:27,899-03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetStorageDomainsListVDSCommand] (default task-50) [38e87311-a7a5-49a8-bf18-857dd969cd5f] START, HSMGetStorageDomainsListVDSCommand(HostName = v5.dc0.example.com, HSMGetStorageDomainsListVDSCommandParameters:{runAsync='true', hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0', storagePoolId='00000000-0000-0000-0000-000000000000', storageType='null', storageDomainType='Data', path='localhost:data-teste2'}), log id: 7280d13
2017-02-16 21:58:29,156-03 INFO [org.ovirt.engine.core.bll.storage.connection.RemoveStorageServerConnectionCommand] (default task-56) [1b3826e4-4890-43d4-8854-16f3c573a31f] Lock Acquired to object 'EngineLock:{exclusiveLocks='[localhost:data-teste2=<STORAGE_CONNECTION, ACTION_TYPE_FAILED_OBJECT_LOCKED>, 5e5f6610-c759-448b-a53d-9a456f513681=<STORAGE_CONNECTION, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2017-02-16 21:58:29,168-03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (default task-57) [5e4b20cf-60d2-4ae9-951b-c2693603aa6f] START, DisconnectStorageServerVDSCommand(HostName = v5.dc0.example.com, StorageServerConnectionManagementVDSParameters:{runAsync='true', hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0', storagePoolId='00000000-0000-0000-0000-000000000000', storageType='GLUSTERFS', connectionList='[StorageServerConnections:{id='5e5f6610-c759-448b-a53d-9a456f513681', connection='localhost:data-teste2', iqn='null', vfsType='glusterfs', mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null', iface='null', netIfaceName='null'}]'}), log id: 6042b108
2017-02-16 21:58:29,193-03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (default task-56) [1b3826e4-4890-43d4-8854-16f3c573a31f] START, DisconnectStorageServerVDSCommand(HostName = v5.dc0.example.com, StorageServerConnectionManagementVDSParameters:{runAsync='true', hostId='1a3f10f2-e4ce-44b9-9495-06e445cfa0b0', storagePoolId='00000000-0000-0000-0000-000000000000', storageType='GLUSTERFS', connectionList='[StorageServerConnections:{id='5e5f6610-c759-448b-a53d-9a456f513681', connection='localhost:data-teste2', iqn='null', vfsType='glusterfs', mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null', iface='null', netIfaceName='null'}]'}), log id: 4e9421cf
2017-02-16 21:58:31,398-03 INFO [org.ovirt.engine.core.bll.storage.connection.RemoveStorageServerConnectionCommand] (default task-56) [1b3826e4-4890-43d4-8854-16f3c573a31f] Lock freed to object 'EngineLock:{exclusiveLocks='[localhost:data-teste2=<STORAGE_CONNECTION, ACTION_TYPE_FAILED_OBJECT_LOCKED>, 5e5f6610-c759-448b-a53d-9a456f513681=<STORAGE_CONNECTION, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'

Again, many thanks!
Doug

On 16 February 2017 at 18:53, Doug Ingham <dou...@gmail.com> wrote:
> Hi Nir,
>
> On 16 February 2017 at 13:55, Nir Soffer <nsof...@redhat.com> wrote:
>> On Mon, Feb 13, 2017 at 3:35 PM, Doug Ingham <dou...@gmail.com> wrote:
>> > Hi Sahina,
>> >
>> > On 13 February 2017 at 05:45, Sahina Bose <sab...@redhat.com> wrote:
>> >> Any errors in the gluster mount logs for this gluster volume?
>> >>
>> >> How about "gluster vol heal <volname> info" - does it list any entries to heal?
>> >
>> > After more investigating, I found out that there is a sanlock daemon that runs with VDSM, independently of the HE, so I'd basically have to bring the volume down & wait for the leases to expire/delete them* before I can import the domain.
>> >
>> > *I understand removing /dom_md/leases/ should do the job?
>>
>> No, the issue is probably dom_md/ids accessed by sanlock, but removing files accessed by sanlock will not help; an open file will remain open until sanlock closes the file.
>
> I'm testing this with volume snapshots at the moment, so there are no processes accessing the new volume.
>
>> Did you try to reboot the host before installing it again?
>> If you did and you still have these issues, you probably need to remove the previous installation properly before installing again.
>>
>> Adding Simone to help with uninstalling and reinstalling hosted engine.
>
> The Hosted-Engine database had been corrupted and the restore wasn't running correctly, so I installed a new engine on a new server - no restores or old data. The aim is to import the old storage domain into the new Engine & then import the VMs into the new storage domain.
> My only problem with this is that there appear to be some file-based leases somewhere that, unless I manage to locate & delete them, force me to wait for the leases to time out before I can import the old storage domain. To minimise downtime, I'm trying to avoid having to wait for the leases to time out.
>
> Regards,
> Doug
>
>> Nir
>>
>> >> On Thu, Feb 9, 2017 at 11:57 PM, Doug Ingham <dou...@gmail.com> wrote:
>> >>> Some interesting output from the vdsm log...
>> >>> 2017-02-09 15:16:24,051 INFO (jsonrpc/1) [storage.StorageDomain] Resource namespace 01_img_60455567-ad30-42e3-a9df-62fe86c7fd25 already registered (sd:731)
>> >>> 2017-02-09 15:16:24,051 INFO (jsonrpc/1) [storage.StorageDomain] Resource namespace 02_vol_60455567-ad30-42e3-a9df-62fe86c7fd25 already registered (sd:740)
>> >>> 2017-02-09 15:16:24,052 INFO (jsonrpc/1) [storage.SANLock] Acquiring Lease(name='SDM', path=u'/rhev/data-center/mnt/glusterSD/localhost:data2/60455567-ad30-42e3-a9df-62fe86c7fd25/dom_md/leases', offset=1048576) for host id 1 (clusterlock:343)
>> >>> 2017-02-09 15:16:24,057 INFO (jsonrpc/1) [storage.SANLock] Releasing host id for domain 60455567-ad30-42e3-a9df-62fe86c7fd25 (id: 1) (clusterlock:305)
>> >>> 2017-02-09 15:16:25,149 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call GlusterHost.list succeeded in 0.17 seconds (__init__:515)
>> >>> 2017-02-09 15:16:25,264 INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::ffff:127.0.0.1:55060 (protocoldetector:72)
>> >>> 2017-02-09 15:16:25,270 INFO (Reactor thread) [ProtocolDetector.Detector] Detected protocol stomp from ::ffff:127.0.0.1:55060 (protocoldetector:127)
>> >>> 2017-02-09 15:16:25,271 INFO (Reactor thread) [Broker.StompAdapter] Processing CONNECT request (stompreactor:102)
>> >>> 2017-02-09 15:16:25,271 INFO (JsonRpc (StompReactor)) [Broker.StompAdapter] Subscribe command received (stompreactor:129)
>> >>> 2017-02-09 15:16:25,416 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.getHardwareInfo succeeded in 0.01 seconds (__init__:515)
>> >>> 2017-02-09 15:16:25,419 INFO (jsonrpc/6) [dispatcher] Run and protect: repoStats(options=None) (logUtils:49)
>> >>> 2017-02-09 15:16:25,419 INFO (jsonrpc/6) [dispatcher] Run and protect: repoStats, Return response: {u'e8d04da7-ad3d-4227-a45d-b5a29b2f43e5': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000854128', 'lastCheck': '5.1', 'valid': True}, u'a77b8821-ff19-4d17-a3ce-a6c3a69436d5': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000966556', 'lastCheck': '2.6', 'valid': True}} (logUtils:52)
>> >>> 2017-02-09 15:16:25,447 INFO (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call Host.getStats succeeded in 0.03 seconds (__init__:515)
>> >>> 2017-02-09 15:16:25,450 ERROR (JsonRpc (StompReactor)) [vds.dispatcher] SSL error receiving from <yajsonrpc.betterAsyncore.Dispatcher connected ('::ffff:127.0.0.1', 55060, 0, 0) at 0x7f69c0043cf8>: unexpected eof (betterAsyncore:113)
>> >>> 2017-02-09 15:16:25,812 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call GlusterVolume.list succeeded in 0.10 seconds (__init__:515)
>> >>> 2017-02-09 15:16:25,940 INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::ffff:127.0.0.1:55062 (protocoldetector:72)
>> >>> 2017-02-09 15:16:25,946 INFO (Reactor thread) [ProtocolDetector.Detector] Detected protocol stomp from ::ffff:127.0.0.1:55062 (protocoldetector:127)
>> >>> 2017-02-09 15:16:25,947 INFO (Reactor thread) [Broker.StompAdapter] Processing CONNECT request (stompreactor:102)
>> >>> 2017-02-09 15:16:25,947 INFO (JsonRpc (StompReactor)) [Broker.StompAdapter] Subscribe command received (stompreactor:129)
>> >>> 2017-02-09 15:16:26,058 ERROR (jsonrpc/1) [storage.TaskManager.Task] (Task='02cad901-5fe8-4f2d-895b-14184f67feab') Unexpected error (task:870)
>> >>> Traceback (most recent call last):
>> >>>   File "/usr/share/vdsm/storage/task.py", line 877, in _run
>> >>>     return fn(*args, **kargs)
>> >>>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
>> >>>     res = f(*args, **kwargs)
>> >>>   File "/usr/share/vdsm/storage/hsm.py", line 812, in forcedDetachStorageDomain
>> >>>     self._deatchStorageDomainFromOldPools(sdUUID)
>> >>>   File "/usr/share/vdsm/storage/hsm.py", line 790, in _deatchStorageDomainFromOldPools
>> >>>     dom.acquireClusterLock(host_id)
>> >>>   File "/usr/share/vdsm/storage/sd.py", line 810, in acquireClusterLock
>> >>>     self._manifest.acquireDomainLock(hostID)
>> >>>   File "/usr/share/vdsm/storage/sd.py", line 499, in acquireDomainLock
>> >>>     self._domainLock.acquire(hostID, self.getDomainLease())
>> >>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 362, in acquire
>> >>>     "Cannot acquire %s" % (lease,), str(e))
>> >>> AcquireLockFailure: Cannot obtain lock: u"id=60455567-ad30-42e3-a9df-62fe86c7fd25, rc=5, out=Cannot acquire Lease(name='SDM', path=u'/rhev/data-center/mnt/glusterSD/localhost:data2/60455567-ad30-42e3-a9df-62fe86c7fd25/dom_md/leases', offset=1048576), err=(5, 'Sanlock resource not acquired', 'Input/output error')"
>> >>> 2017-02-09 15:16:26,058 INFO (jsonrpc/1) [storage.TaskManager.Task] (Task='02cad901-5fe8-4f2d-895b-14184f67feab') aborting: Task is aborted: 'Cannot obtain lock' - code 651 (task:1175)
>> >>> 2017-02-09 15:16:26,059 ERROR (jsonrpc/1) [storage.Dispatcher] {'status': {'message': 'Cannot obtain lock: u"id=60455567-ad30-42e3-a9df-62fe86c7fd25, rc=5, out=Cannot acquire Lease(name=\'SDM\', path=u\'/rhev/data-center/mnt/glusterSD/localhost:data2/60455567-ad30-42e3-a9df-62fe86c7fd25/dom_md/leases\', offset=1048576), err=(5, \'Sanlock resource not acquired\', \'Input/output error\')"', 'code': 651}} (dispatcher:77)
>> >>> 2017-02-09 15:16:26,059 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call StorageDomain.detach failed (error 651) in 23.04 seconds (__init__:515)
>> >>>
>> >>> --
>> >>> Doug
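For anyone hitting the same AcquireLockFailure: note that the error ends in err=(5, ..., 'Input/output error'), i.e. sanlock could not even read/write the leases file, which may point at a storage-level problem rather than a held lock. The on-disk lease state can be inspected with sanlock's direct mode. A sketch, with the path taken from the traceback above (the commands are echoed rather than executed so the sketch is safe anywhere; on the affected host, run them directly as root):

```shell
#!/bin/sh
# Sketch: inspect sanlock state around a failed SDM acquire. Adjust
# LEASES to your own mount; this is the path from the traceback.
LEASES='/rhev/data-center/mnt/glusterSD/localhost:data2/60455567-ad30-42e3-a9df-62fe86c7fd25/dom_md/leases'

# Lockspaces and resources currently held by the local sanlock daemon:
echo "sanlock client status"

# On-disk resource records (owner host id, generation, timestamp):
echo "sanlock direct dump $LEASES"
```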
--
Doug
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users