[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-15 Thread Xiubo Li


On 14/12/2022 14:52, Stolte, Felix wrote:
Issue is resolved now. After verifying that all ESXi hosts are 
configured for MRU, I took a closer look at the paths on each host.


`gwcli` reported that the LUN in question was owned by gateway A, but one 
ESXi host used the path to gateway B for I/O. I reconfigured that 
particular host and it's now using the correct path to gateway A. The logs 
are clean now and I/O on that datastore is back to normal.
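For anyone else chasing the same symptom: the per-LUN path policy and the path actually carrying I/O can be checked on each ESXi host with the standard `esxcli` NMP commands. A sketch (`naa.<device-id>` is a placeholder for your LUN's device identifier):

```shell
# Show the path selection policy assigned to each device
esxcli storage nmp device list

# Pin a device to Most Recently Used so it stays on the ALUA-optimized path
esxcli storage nmp device set --device naa.<device-id> --psp VMW_PSP_MRU

# Show which path (and therefore which gateway) the device is using for I/O
esxcli storage nmp path list --device naa.<device-id>
```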


Yeah.

When the ESXi client sends I/O to gateway B, gateway B will try to 
acquire the exclusive lock, and once it succeeds, Ceph will blocklist 
the previous owner, gateway A.

This is why you were seeing the gateways blocklist each other.



This was probably caused by an outage of one of our gateways last week 
(the physical host, not the daemon), where the iSCSI daemon didn't 
shut down cleanly.


One last question though:

From my understanding, "Dynamic Discovery" just creates the "Static 
Discovery" targets for all available gateways. Is it also responsible 
for telling the client which path to use (i.e. which gateway is the 
owner of a LUN)?


With the Linux initiator, I know that multipathd will configure the path 
priorities correctly by combining the configuration reported by the 
gateways with the local multipath settings.


I'm not sure how ESXi behaves here exactly.
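For the Linux side mentioned above, the gateway-aware path priorities are typically driven by a multipath.conf stanza along these lines. This is a sketch based on the upstream Ceph iSCSI initiator documentation; verify the exact values against your distribution's docs:

```
devices {
    device {
        # Identity reported by ceph-iscsi (LIO) RBD-backed LUNs
        vendor                 "LIO-ORG"
        product                "TCMU device"
        # Let the ALUA state reported by the gateways set path priority
        hardware_handler       "1 alua"
        path_grouping_policy   "failover"
        prio                   "alua"
        prio_args              exclusive_pref_bit
        path_checker           "tur"
        failback               60
        fast_io_fail_tmo       25
        no_path_retry          queue
    }
}
```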

BRs

- Xiubo


-------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzender des Aufsichtsrats: MinDir Volker Rieke
Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
Dr. Astrid Lambrecht, Prof. Dr. Frauke Melchior
-------------------------------------------------------------------


Am 13.12.2022 um 13:21 schrieb Xiubo Li :


On 13/12/2022 18:57, Stolte, Felix wrote:

Hi Xiubo,

Thanks for pointing me in the right direction. All involved ESXi hosts 
seem to use the correct policy. I am going to detach the LUN on each 
host one by one until I find the host causing the problem.



From the logs, it looks like the client was switching paths back and forth.

BTW, what policy are you using?

Thanks

- Xiubo


Regards Felix


Am 12.12.2022 um 13:03 schrieb Xiubo Li :

Hi Stolte,

For the VMware config, could you refer to 
https://docs.ceph.com/en/latest/rbd/iscsi-initiator-esx/ ?


Which "Path Selection Policy with ALUA" are you using? ceph-iscsi cannot 
implement true active/active, so if you are using round-robin (RR) I 
would expect behavior like this.


- Xiubo

On 12/12/2022 17:45, Stolte, Felix wrote:

Hi guys,

we are using ceph-iscsi to provide block storage for Microsoft Exchange and 
VMware vSphere. The Ceph docs state that you need to configure the Windows 
iSCSI Initiator for fail-over-only, but there is no such note for VMware. In 
my tcmu-runner logs on both ceph-iscsi gateways I see the following:

2022-12-12 10:36:06.978 33789 [WARN] tcmu_notify_lock_lost:222 
rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:06.993 33789 [INFO] alua_implicit_transition:570 
rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
2022-12-12 10:36:08.064 33789 [WARN] tcmu_rbd_lock:762 
rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
2022-12-12 10:36:09.067 33789 [WARN] tcmu_notify_lock_lost:222 
rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:09.071 33789 [INFO] alua_implicit_transition:570 
rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
2022-12-12 10:36:10.109 33789 [WARN] tcmu_rbd_lock:762 
rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
2022-12-12 10:36:11.104 33789 [WARN] tcmu_notify_lock_lost:222 
rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:11.106 33789 [INFO] alua_implicit_transition:570 
rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
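As an aside (not part of the original mail): the ping-pong pattern above, i.e. repeated lock-lost / lock-acquired pairs for the same image, is easy to detect mechanically. A minimal sketch in Python, assuming the tcmu-runner log format shown above:

```python
import re
from collections import Counter

# Matches tcmu-runner lines like:
#   2022-12-12 10:36:08.064 33789 [WARN] tcmu_rbd_lock:762 rbd/image: Acquired exclusive lock.
LOCK_EVENT = re.compile(r"\[(?:WARN|INFO)\] (\w+):\d+ (\S+):")

def count_lock_events(lines):
    """Count lock-related tcmu-runner events, keyed by (image, function)."""
    counts = Counter()
    for line in lines:
        match = LOCK_EVENT.search(line)
        if match:
            func, image = match.groups()
            counts[(image, func)] += 1
    return counts

def ping_pong_suspects(counts, threshold=2):
    """Images whose exclusive lock was acquired at least `threshold` times."""
    return sorted(
        image
        for (image, func), n in counts.items()
        if func == "tcmu_rbd_lock" and n >= threshold
    )
```

Feeding it the excerpt above flags the image, since the same image re-acquires the exclusive lock every couple of seconds.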

At the 

[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-15 Thread Xiubo Li


On 14/12/2022 16:32, Stolte, Felix wrote:
We have been using tgt for five years and switched to ceph-iscsi (LIO 
framework) two months ago. We observed a massive performance boost, 
though I can't say whether the increase was only due to the different 
software or whether our tgt configuration was not as good as it could 
have been. Personally I prefer the ceph-iscsi configuration; it's way 
easier to set up, and you can create targets, LUNs, etc. either via 
gwcli or the Ceph dashboard.


Years ago, I knew of one user who compared the performance of tgt and 
ceph-iscsi with their products, and they got a similar result to yours.


BRs

- Xiubo



Regards
Felix


Am 13.12.2022 um 23:54 schrieb Joe Comeau :

I am curious about what is happening with your iscsi configuration
Is this a new iscsi config or something that has just cropped up ?
We are using/have been using vmware for 5+ years with iscsi
We are using the kernel iscsi vs tcmu
We are running ALUA and all datastores are setup as RR
We routinely reboot the iscsi gateways - during patching and updates 
and the storage migrates to and from all servers without issue
We usually wait about 10 minutes before a gateway restart, so there 
is not an outage

It has been extremely stable for us
Thanks Joe



[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-15 Thread Xiubo Li



On 15/12/2022 02:46, Joe Comeau wrote:

That's correct - we use the kernel target not tcmu-runner


Okay.

There are some differences in configuration between the kernel target 
and the ceph-iscsi target.


Thanks,

- Xiubo



>>> Xiubo Li  12/13/2022 6:02 PM >>>

On 14/12/2022 06:54, Joe Comeau wrote:
> I am curious about what is happening with your iscsi configuration
> Is this a new iscsi config or something that has just cropped up ?
>
> We are using/have been using vmware for 5+ years with iscsi
> We are using the kernel iscsi vs tcmu
>

Do you mean you are using the kernel target, not the userspace 
ceph-iscsi/tcmu-runner?

> We are running ALUA and all datastores are setup as RR
> We routinely reboot the iscsi gateways - during patching and updates 
and the storage migrates to and from all servers without issue
> We usually wait about 10 minutes before a gateway restart, so there 
is not an outage

>
> It has been extremely stable for us
>
> Thanks Joe
[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-14 Thread Joe Comeau
That's correct - we use the kernel target not tcmu-runner


[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-14 Thread Stolte, Felix
We have been using tgt for five years and switched to ceph-iscsi (LIO 
framework) two months ago. We observed a massive performance boost, though I 
can't say whether the increase was only due to the different software or 
whether our tgt configuration was not as good as it could have been. Personally 
I prefer the ceph-iscsi configuration; it's way easier to set up, and you can 
create targets, LUNs, etc. either via gwcli or the Ceph dashboard.

Regards
Felix

[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-13 Thread Stolte, Felix
Issue is resolved now. After verifying that all ESXi hosts are configured for 
MRU, I took a closer look at the paths on each host.

`gwcli` reported that the LUN in question was owned by gateway A, but one ESXi 
host used the path to gateway B for I/O. I reconfigured that particular host 
and it's now using the correct path to gateway A. The logs are clean now and 
I/O on that datastore is back to normal.

This was probably caused by an outage of one of our gateways last week (the 
physical host, not the daemon), where the iSCSI daemon didn't shut down cleanly.

One last question though:

From my understanding, "Dynamic Discovery" just creates the "Static Discovery" 
targets for all available gateways. Is it also responsible for telling the 
client which path to use (i.e. which gateway is the owner of a LUN)?


[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-13 Thread Xiubo Li



On 14/12/2022 06:54, Joe Comeau wrote:

I am curious about what is happening with your iscsi configuration
Is this a new iscsi config or something that has just cropped up ?
  
We are using/have been using vmware for 5+ years with iscsi

We are using the kernel iscsi vs tcmu
  


Do you mean you are using the kernel target, not the userspace 
ceph-iscsi/tcmu-runner?



We are running ALUA and all datastores are setup as RR
We routinely reboot the iscsi gateways - during patching and updates and the 
storage migrates to and from all servers without issue
We usually wait about 10 minutes before a gateway restart, so there is not an 
outage
  
It has been extremely stable for us
  
Thanks Joe
  





On 12/12/2022 17:45, Stolte, Felix wrote:

Hi guys,

we are using ceph-iscsi to provide block storage for Microsoft Exchange and 
VMware vSphere. The Ceph docs state that you need to configure the Windows 
iSCSI Initiator for fail-over-only, but there is no such note for VMware. In 
my tcmu-runner logs on both ceph-iscsi gateways I see the following:

2022-12-12 10:36:06.978 33789 [WARN] tcmu_notify_lock_lost:222 
rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:06.993 33789 [INFO] alua_implicit_transition:570 
rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
2022-12-12 10:36:08.064 33789 [WARN] tcmu_rbd_lock:762 
rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
2022-12-12 10:36:09.067 33789 [WARN] tcmu_notify_lock_lost:222 
rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:09.071 33789 [INFO] alua_implicit_transition:570 
rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
2022-12-12 10:36:10.109 33789 [WARN] tcmu_rbd_lock:762 
rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
2022-12-12 10:36:11.104 33789 [WARN] tcmu_notify_lock_lost:222 
rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:11.106 33789 [INFO] alua_implicit_transition:570 
rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.

At the same time there are these log entries in ceph.audit.logs:
2022-12-12T10:36:06.731621+0100 mon.mon-k2-1 (mon.1) 3407851 : audit [INF] from='client.? 10.100.8.55:0/2392201639' entity='client.admin' cmd=[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]: dispatch
2022-12-12T10:36:06.731913+0100 mon.mon-e2-1 (mon.0) 783726 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]: dispatch
2022-12-12T10:36:06.905082+0100 mon.mon-e2-1 (mon.0) 783727 : audit [INF] from='client.? ' entity='client.admin' cmd='[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]': finished
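As a side note (not from the thread): the blocklist entries created by these audit messages can be inspected and, once the initiator paths are fixed, removed again with the standard `ceph` CLI. A sketch, reusing the address from the log above:

```shell
# List the current blocklist entries (the gateways' client addresses show up here)
ceph osd blocklist ls

# Remove a stale entry after the path configuration has been corrected
ceph osd blocklist rm 10.100.8.56:0/1598475844
```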

Can someone explain to me what is happening? Why are the gateways 
blocklisting each other? All involved daemons are running version 16.2.10. 
The ceph-iscsi gateways are running on Ubuntu 20.04 with the ceph-iscsi 
package from the Ubuntu repo (all other packages came directly from ceph.com).


regards Felix


[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-13 Thread Joe Comeau
I am curious about what is happening with your iscsi configuration.
Is this a new iscsi config or something that has just cropped up?

We are using/have been using vmware for 5+ years with iscsi.
We are using the kernel iscsi vs tcmu.

We are running ALUA and all datastores are set up as RR.
We routinely reboot the iscsi gateways - during patching and updates - and the 
storage migrates to and from all servers without issue.
We usually wait about 10 minutes before a gateway restart, so there is not an 
outage.

It has been extremely stable for us.

Thanks Joe
 



[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-13 Thread Xiubo Li



On 13/12/2022 18:57, Stolte, Felix wrote:

Hi Xiubo,

Thx for pointing me in the right direction. All involved esx hosts 
seem to use the correct policy. I am going to detach the LUN on each 
host one by one until I find the host causing the problem.



From the logs, it looks like the client was switching paths in turn.

BTW, which policy are you using?

Thanks

- Xiubo
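
While narrowing this down, it can help to see which side currently holds the lock. A hedged sketch of commands one might run from a node with Ceph admin access (the pool/image name is taken from the tcmu-runner logs in this thread; the block is guarded so it is a no-op where the Ceph CLIs are absent):

```shell
# Show the current exclusive-lock holder of the backing RBD image and the
# cluster's blocklist.  Pool/image name comes from the tcmu-runner logs in
# this thread; guarded so this does nothing without the Ceph CLIs.
if command -v rbd >/dev/null 2>&1 && command -v ceph >/dev/null 2>&1; then
    rbd lock ls rbd/mailbox.vmdk_junet_sata   # lock holder = owning gateway
    ceph osd blocklist ls                     # recently evicted clients
else
    echo "ceph/rbd CLI not available on this host"
fi
```

If the lock holder's address keeps flipping between the two gateways' addresses, the initiator-side path policy is the place to look.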


Regards Felix




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io





[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-13 Thread Stolte, Felix
Hi Xiubo,

Thx for pointing me in the right direction. All involved esx hosts seem to use 
the correct policy. I am going to detach the LUN on each host one by one until 
I find the host causing the problem.

Regards Felix





[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-12 Thread Xiubo Li

Hi Stolte,

For the VMware configuration, could you refer to 
https://docs.ceph.com/en/latest/rbd/iscsi-initiator-esx/ ?


Which "Path Selection Policy with ALUA" are you using? ceph-iscsi 
cannot implement true Active/Active, so if you use Round Robin (RR) I 
think it will behave like this.
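
For reference, a hedged sketch of how one might check and pin the Path Selection Policy on an ESXi host. The `esxcli storage nmp` namespaces are real, but the naa. device ID is a placeholder (not from this thread), and the block is guarded so it only runs where `esxcli` exists, i.e. on an ESXi shell:

```shell
# Check and pin the Path Selection Policy on an ESXi host.  The naa. device
# ID below is a placeholder, not from this thread; guarded so this is a
# no-op where esxcli is absent.
if command -v esxcli >/dev/null 2>&1; then
    esxcli storage nmp device list                      # current PSP per device
    esxcli storage nmp device set \
        --device naa.60014050000000000000000000000000 \
        --psp VMW_PSP_MRU                               # pin one LUN to MRU
    esxcli storage nmp satp set \
        --satp VMW_SATP_ALUA --default-psp VMW_PSP_MRU  # default for new ALUA claims
else
    echo "esxcli not available (run this on an ESXi host)"
fi
```

Note the satp default only affects devices claimed after the change; already-claimed LUNs need the per-device `device set`.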


- Xiubo

On 12/12/2022 17:45, Stolte, Felix wrote:

Hi guys,

we are using ceph-iscsi to provide block storage for Microsoft Exchange and 
VMware vSphere. The Ceph docs state that you need to configure the Windows iSCSI 
Initiator for fail-over-only, but there is no such guidance for VMware. In my 
tcmu-runner logs on both ceph-iscsi gateways I see the following:

2022-12-12 10:36:06.978 33789 [WARN] tcmu_notify_lock_lost:222 rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:06.993 33789 [INFO] alua_implicit_transition:570 rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
2022-12-12 10:36:08.064 33789 [WARN] tcmu_rbd_lock:762 rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
2022-12-12 10:36:09.067 33789 [WARN] tcmu_notify_lock_lost:222 rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:09.071 33789 [INFO] alua_implicit_transition:570 rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
2022-12-12 10:36:10.109 33789 [WARN] tcmu_rbd_lock:762 rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
2022-12-12 10:36:11.104 33789 [WARN] tcmu_notify_lock_lost:222 rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
2022-12-12 10:36:11.106 33789 [INFO] alua_implicit_transition:570 rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
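
A quick way to quantify this from a gateway's tcmu-runner log is to count lock acquisitions: once per failover is normal, once every second or two is the ping-pong. A small sketch against two sample lines from the excerpt above (the log format is as shown; the file path is just a scratch location):

```shell
# Count exclusive-lock acquisitions in a tcmu-runner log excerpt; a healthy
# gateway should log this once per failover, not every 1-2 seconds.
cat > /tmp/tcmu-sample.log <<'EOF'
2022-12-12 10:36:08.064 33789 [WARN] tcmu_rbd_lock:762 rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
2022-12-12 10:36:10.109 33789 [WARN] tcmu_rbd_lock:762 rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
EOF
grep -c 'Acquired exclusive lock' /tmp/tcmu-sample.log   # -> 2 for this excerpt
```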

At the same time there are these log entries in ceph.audit.logs:
2022-12-12T10:36:06.731621+0100 mon.mon-k2-1 (mon.1) 3407851 : audit [INF] from='client.? 10.100.8.55:0/2392201639' entity='client.admin' cmd=[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]: dispatch
2022-12-12T10:36:06.731913+0100 mon.mon-e2-1 (mon.0) 783726 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]: dispatch
2022-12-12T10:36:06.905082+0100 mon.mon-e2-1 (mon.0) 783727 : audit [INF] from='client.? ' entity='client.admin' cmd='[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]': finished
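
These audit entries also tell you who is evicting whom: `from=` is the client issuing the blocklist request (the gateway taking the lock) and `addr` is the client being blocklisted (the previous owner). A sketch extracting the blocklisted address from one of the lines above:

```shell
# Extract the blocklisted client address from a ceph.audit.log line; the
# addr identifies the gateway-side librbd client being evicted.
cat > /tmp/audit-sample.log <<'EOF'
2022-12-12T10:36:06.905082+0100 mon.mon-e2-1 (mon.0) 783727 : audit [INF] from='client.? ' entity='client.admin' cmd='[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]': finished
EOF
grep -o '"addr": "[^"]*"' /tmp/audit-sample.log | sort -u
# -> "addr": "10.100.8.56:0/1598475844"
```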

Can someone explain to me what is happening? Why are the gateways blocklisting each 
other? All involved daemons are running version 16.2.10. The ceph-iscsi gateways are 
running on Ubuntu 20.04 with the ceph-iscsi package from the Ubuntu repo (all other 
packages came directly from ceph.com)


regards Felix


