Other details related to sanlock:

2020-11-27 20:25:10 7413 [61860]: verify_leader 1 wrong space name hosted-engin hosted-engine /run/vdsm/storage/de4645fc-f379-4837-916b-a0c2b89927d9/69dc22c2-8e60-43f1-8d18-6aabfbc98581/dfa4e933-2b9c-4057-a4c5-aa4485b070e9
2020-11-27 20:25:10 7413 [61860]: leader1 delta_acquire_begin error -226 lockspace hosted-engine host_id 1
2020-11-27 20:25:10 7413 [61860]: leader2 path /run/vdsm/storage/de4645fc-f379-4837-916b-a0c2b89927d9/69dc22c2-8e60-43f1-8d18-6aabfbc98581/dfa4e933-2b9c-4057-a4c5-aa4485b070e9 offset 0
2020-11-27 20:25:10 7413 [61860]: leader3 m 12212010 v 30004 ss 512 nh 0 mh 1 oi 0 og 0 lv 0
2020-11-27 20:25:10 7413 [61860]: leader4 sn hosted-engin rn  ts 0 cs 23839828
2020-11-27 20:25:11 7414 [57456]: s38 add_lockspace fail result -226
2020-11-27 20:25:19 7421 [57456]: s39 lockspace hosted-engine:1:/run/vdsm/storage/de4645fc-f379-4837-916b-a0c2b89927d9/69dc22c2-8e60-43f1-8d18-6aabfbc98581/dfa4e933-2b9c-4057-a4c5-aa4485b070e9:0
2020-11-27 20:25:19 7421 [62044]: verify_leader 1 wrong space name hosted-engin hosted-engine /run/vdsm/storage/de4645fc-f379-4837-916b-a0c2b89927d9/69dc22c2-8e60-43f1-8d18-6aabfbc98581/dfa4e933-2b9c-4057-a4c5-aa4485b070e9
2020-11-27 20:25:19 7421 [62044]: leader1 delta_acquire_begin error -226 lockspace hosted-engine host_id 1
2020-11-27 20:25:19 7421 [62044]: leader2 path /run/vdsm/storage/de4645fc-f379-4837-916b-a0c2b89927d9/69dc22c2-8e60-43f1-8d18-6aabfbc98581/dfa4e933-2b9c-4057-a4c5-aa4485b070e9 offset 0
2020-11-27 20:25:19 7421 [62044]: leader3 m 12212010 v 30004 ss 512 nh 0 mh 1 oi 0 og 0 lv 0
2020-11-27 20:25:19 7421 [62044]: leader4 sn hosted-engin rn  ts 0 cs 23839828
2020-11-27 20:25:20 7422 [57456]: s39 add_lockspace fail result -226
2020-11-27 20:25:25 7427 [57456]: s40 lockspace hosted-engine:1:/run/vdsm/storage/de4645fc-f379-4837-916b-a0c2b89927d9/69dc22c2-8e60-43f1-8d18-6aabfbc98581/dfa4e933-2b9c-4057-a4c5-aa4485b070e9:0
2020-11-27 20:25:25 7427 [62090]: verify_leader 1 wrong space name hosted-engin hosted-engine /run/vdsm/storage/de4645fc-f379-4837-916b-a0c2b89927d9/69dc22c2-8e60-43f1-8d18-6aabfbc98581/dfa4e933-2b9c-4057-a4c5-aa4485b070e9
2020-11-27 20:25:25 7427 [62090]: leader1 delta_acquire_begin error -226 lockspace hosted-engine host_id 1
2020-11-27 20:25:25 7427 [62090]: leader2 path /run/vdsm/storage/de4645fc-f379-4837-916b-a0c2b89927d9/69dc22c2-8e60-43f1-8d18-6aabfbc98581/dfa4e933-2b9c-4057-a4c5-aa4485b070e9 offset 0
2020-11-27 20:25:25 7427 [62090]: leader3 m 12212010 v 30004 ss 512 nh 0 mh 1 oi 0 og 0 lv 0
2020-11-27 20:25:25 7427 [62090]: leader4 sn hosted-engin rn  ts 0 cs 23839828
2020-11-27 20:25:26 7428 [57456]: s40 add_lockspace fail result -226
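
For what it's worth, the verify_leader lines show the lockspace name stored on disk ("hosted-engin", see "sn hosted-engin" in the leader4 lines) differing from the expected "hosted-engine", which is what the repeated add_lockspace failures with result -226 reflect. The on-disk leader record can be dumped read-only with sanlock itself (a sketch; the path is the one from the log above):

# Dump the leader records of the hosted-engine lease volume (read-only)
sanlock direct dump /run/vdsm/storage/de4645fc-f379-4837-916b-a0c2b89927d9/69dc22c2-8e60-43f1-8d18-6aabfbc98581/dfa4e933-2b9c-4057-a4c5-aa4485b070e9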

Any help is welcome. Thank you,
Marco


On Fri, Nov 27, 2020 at 6:47 PM Marco Marino <mmar...@extraordy.com> wrote:

> Hi,
> I have an oVirt 4.4.3 setup with 2 clusters, a hosted engine, and iSCSI
> storage. The first cluster, composed of 2 servers (host1 and host2), is
> dedicated to the hosted engine; the second cluster is for VMs. There is
> also a SAN with 3 LUNs: one for hosted-engine storage, one for VMs, and
> one unused. The SAN is built on top of a 2-node pacemaker/DRBD cluster
> with a virtual IP used as the iSCSI portal IP. Starting today, after a
> failover of the iSCSI cluster, I'm unable to start the hosted engine. It
> seems there is some problem with storage.
> Currently only one node (host1) is running in the cluster. It seems
> there is some lock on the LVs, but I'm not sure about that; see the
> checks sketched below.
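>
> As a first check for suspected locks on the LVs, something like the
> following could show whether the device-mapper devices are still held
> open (a sketch, using standard lvm2/device-mapper tools; in oVirt block
> domains the VG is named after the storage domain UUID from the logs):
>
> # LVs of the hosted-engine storage domain VG, with their attributes
> lvs -o vg_name,lv_name,attr de4645fc-f379-4837-916b-a0c2b89927d9
> # Open count per dm device; "Open" > 0 means something still uses it
> dmsetup info -c | grep de4645fc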
>
> Here are some details about the problem:
>
> 1. iscsiadm -m session
> iSCSI Transport Class version 2.0-870
> version 6.2.0.878-2
> Target: iqn.2003-01.org.linux-iscsi.s1-node1.x8664:sn.2a734f67d5b1 (non-flash)
> Current Portal: 10.3.8.8:3260,1
> Persistent Portal: 10.3.8.8:3260,1
> **********
> Interface:
> **********
> Iface Name: default
> Iface Transport: tcp
> Iface Initiatorname: iqn.1994-05.com.redhat:4b668221d9a9
> Iface IPaddress: 10.3.8.10
> Iface HWaddress: default
> Iface Netdev: default
> SID: 1
> iSCSI Connection State: LOGGED IN
> iSCSI Session State: LOGGED_IN
> Internal iscsid Session State: NO CHANGE
> *********
> Timeouts:
> *********
> Recovery Timeout: 5
> Target Reset Timeout: 30
> LUN Reset Timeout: 30
> Abort Timeout: 15
> *****
> CHAP:
> *****
> username: <empty>
> password: ********
> username_in: <empty>
> password_in: ********
> ************************
> Negotiated iSCSI params:
> ************************
> HeaderDigest: None
> DataDigest: None
> MaxRecvDataSegmentLength: 262144
> MaxXmitDataSegmentLength: 262144
> FirstBurstLength: 65536
> MaxBurstLength: 262144
> ImmediateData: Yes
> InitialR2T: Yes
> MaxOutstandingR2T: 1
> ************************
> Attached SCSI devices:
> ************************
> Host Number: 7 State: running
> scsi7 Channel 00 Id 0 Lun: 0
> Attached scsi disk sdb State: running
> scsi7 Channel 00 Id 0 Lun: 1
> Attached scsi disk sdc State: running
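>
> The session itself reports LOGGED_IN, so a raw read test of the two
> attached LUNs might also be informative (a sketch; sdb and sdc are the
> devices listed above):
>
> # Size and a small direct read from each attached LUN
> blockdev --getsize64 /dev/sdb
> blockdev --getsize64 /dev/sdc
> dd if=/dev/sdb of=/dev/null bs=1M count=1 iflag=direct
> dd if=/dev/sdc of=/dev/null bs=1M count=1 iflag=direct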
>
> 2. vdsm.log errors:
>
> 2020-11-27 18:37:16,786+0100 INFO  (jsonrpc/0) [api] FINISH getStats error=Virtual machine does not exist: {'vmId': 'f3a1194d-0632-43c6-8e12-7f22518cff87'} (api:129)
> .....
> 2020-11-27 18:37:52,864+0100 INFO  (jsonrpc/4) [vdsm.api] FINISH getVolumeInfo error=(-223, 'Sanlock resource read failure', 'Lease does not exist on storage') from=::1,60880, task_id=138a3615-d537-4e5f-a39c-335269ad0917 (api:52)
> 2020-11-27 18:37:52,864+0100 ERROR (jsonrpc/4) [storage.TaskManager.Task] (Task='138a3615-d537-4e5f-a39c-335269ad0917') Unexpected error (task:880)
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 887, in _run
>     return fn(*args, **kargs)
>   File "<decorator-gen-159>", line 2, in getVolumeInfo
>   File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
>     ret = func(*args, **kwargs)
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 3142, in getVolumeInfo
>     info = self._produce_volume(sdUUID, imgUUID, volUUID).getInfo()
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/volume.py", line 258, in getInfo
>     leasestatus = self.getLeaseStatus()
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/volume.py", line 203, in getLeaseStatus
>     self.volUUID)
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 549, in inquireVolumeLease
>     return self._domainLock.inquire(lease)
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/clusterlock.py", line 464, in inquire
>     sector=self._block_size)
> sanlock.SanlockException: (-223, 'Sanlock resource read failure', 'Lease does not exist on storage')
> 2020-11-27 18:37:52,865+0100 INFO  (jsonrpc/4) [storage.TaskManager.Task] (Task='138a3615-d537-4e5f-a39c-335269ad0917') aborting: Task is aborted: "value=(-223, 'Sanlock resource read failure', 'Lease does not exist on storage') abortedcode=100" (task:1190)
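>
> Given the SanlockException above, the sanlock daemon's own view of its
> lockspaces can be queried directly (a sketch, using the standard sanlock
> client interface):
>
> # Lockspaces and resources sanlock currently holds
> sanlock client status
> # Dump sanlock's internal log, often more detailed than syslog
> sanlock client log_dump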
>
>
> 3. supervdsm.log
> MainProcess|monitor/de4645f::DEBUG::2020-11-27 18:41:25,286::commands::153::common.commands::(start) /usr/bin/taskset --cpu-list 0-11 /usr/sbin/dmsetup remove de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9 (cwd None)
> MainProcess|monitor/de4645f::DEBUG::2020-11-27 18:41:25,293::commands::98::common.commands::(run) FAILED: <err> = b'device-mapper: remove ioctl on de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9  failed: Device or resource busy\nCommand failed.\n'; <rc> = 1
> MainProcess|monitor/de4645f::ERROR::2020-11-27 18:41:25,294::supervdsm_server::97::SuperVdsm.ServerCallback::(wrapper) Error in devicemapper_removeMapping
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/devicemapper.py", line 141, in removeMapping
>     commands.run(cmd)
>   File "/usr/lib/python3.6/site-packages/vdsm/common/commands.py", line 101, in run
>     raise cmdutils.Error(args, p.returncode, out, err)
> vdsm.common.cmdutils.Error: Command ['/usr/sbin/dmsetup', 'remove', 'de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9'] failed with rc=1 out=b'' err=b'device-mapper: remove ioctl on de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9  failed: Device or resource busy\nCommand failed.\n'
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/vdsm/supervdsm_server.py", line 95, in wrapper
>     res = func(*args, **kwargs)
>   File "/usr/lib/python3.6/site-packages/vdsm/supervdsm_api/devicemapper.py", line 29, in devicemapper_removeMapping
>     return devicemapper.removeMapping(deviceName)
>   File "/usr/lib/python3.6/site-packages/vdsm/storage/devicemapper.py", line 143, in removeMapping
>     raise Error("Could not remove mapping: {}".format(e))
> vdsm.storage.devicemapper.Error: Could not remove mapping: Command ['/usr/sbin/dmsetup', 'remove', 'de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9'] failed with rc=1 out=b'' err=b'device-mapper: remove ioctl on de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9  failed: Device or resource busy\nCommand failed.\n'
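>
> The "Device or resource busy" from dmsetup means something still holds
> that mapping open; one way to look for the holder (a sketch):
>
> # Open count of the stuck mapping ("Open" column)
> dmsetup info -c de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9
> # Processes with the device node open, if any
> fuser -v /dev/mapper/de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9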
>
>
> 4. /var/log/messages:
> ...
> Nov 27 18:43:46 host1 journal[20573]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to start necessary monitors
> Nov 27 18:43:46 host1 journal[20573]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 85, in start_monitor
>     response = self._proxy.start_monitor(type, options)
>   File "/usr/lib64/python3.6/xmlrpc/client.py", line 1112, in __call__
>     return self.__send(self.__name, args)
>   File "/usr/lib64/python3.6/xmlrpc/client.py", line 1452, in __request
>     verbose=self.__verbose
>   File "/usr/lib64/python3.6/xmlrpc/client.py", line 1154, in request
>     return self.single_request(host, handler, request_body, verbose)
>   File "/usr/lib64/python3.6/xmlrpc/client.py", line 1166, in single_request
>     http_conn = self.send_request(host, handler, request_body, verbose)
>   File "/usr/lib64/python3.6/xmlrpc/client.py", line 1279, in send_request
>     self.send_content(connection, request_body)
>   File "/usr/lib64/python3.6/xmlrpc/client.py", line 1309, in send_content
>     connection.endheaders(request_body)
>   File "/usr/lib64/python3.6/http/client.py", line 1249, in endheaders
>     self._send_output(message_body, encode_chunked=encode_chunked)
>   File "/usr/lib64/python3.6/http/client.py", line 1036, in _send_output
>     self.send(msg)
>   File "/usr/lib64/python3.6/http/client.py", line 974, in send
>     self.connect()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 74, in connect
>     self.sock.connect(base64.b16decode(self.host))
> FileNotFoundError: [Errno 2] No such file or directory
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>     return action(he)
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>     return he.start_monitoring()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 437, in start_monitoring
>     self._initialize_broker()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 561, in _initialize_broker
>     m.get('options', {}))
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 91, in start_monitor
>     ).format(t=type, o=options, e=e)
> ovirt_hosted_engine_ha.lib.exceptions.RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'network', options: {'addr': '10.3.7.1', 'network_test': 'dns', 'tcp_t_address': '', 'tcp_t_port': ''}]
> Nov 27 18:43:46 host1 journal[20573]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
> Nov 27 18:43:46 host1 systemd[1]: ovirt-ha-agent.service: Main process exited, code=exited, status=157/n/a
> Nov 27 18:43:46 host1 systemd[1]: ovirt-ha-agent.service: Failed with result 'exit-code'.
> ....
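>
> The FileNotFoundError above is the agent failing to connect to the
> ovirt-ha-broker unix socket, so the broker side may be worth checking
> too (a sketch; the socket path is an assumption, adjust to the actual
> installation):
>
> # Is the broker actually running?
> systemctl status ovirt-ha-broker
> # Does the broker's unix socket exist? (path assumed)
> ls -l /var/run/ovirt-hosted-engine-ha/broker.socket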
>
> Please let me know if other logs are needed.
> Thank you
>

