> On 24 Oct 2016, at 17:29, Nick Fisk <n...@fisk.me.uk> wrote:
> 
>> -----Original Message-----
>> From: Yan, Zheng [mailto:uker...@gmail.com]
>> Sent: 24 October 2016 10:19
>> To: Gregory Farnum <gfar...@redhat.com>
>> Cc: Nick Fisk <n...@fisk.me.uk>; Zheng Yan <z...@redhat.com>; Ceph Users 
>> <ceph-users@lists.ceph.com>
>> Subject: Re: [ceph-users] Ceph and TCP States
>> 
>> On Sat, Oct 22, 2016 at 4:14 AM, Gregory Farnum <gfar...@redhat.com> wrote:
>>> On Fri, Oct 21, 2016 at 7:56 AM, Nick Fisk <n...@fisk.me.uk> wrote:
>>>>> -----Original Message-----
>>>>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
>>>>> Behalf Of Haomai Wang
>>>>> Sent: 21 October 2016 15:40
>>>>> To: Nick Fisk <n...@fisk.me.uk>
>>>>> Cc: ceph-users@lists.ceph.com
>>>>> Subject: Re: [ceph-users] Ceph and TCP States
>>>>> 
>>>>> 
>>>>> 
>>>>> On Fri, Oct 21, 2016 at 10:31 PM, Nick Fisk <mailto:n...@fisk.me.uk> 
>>>>> wrote:
>>>>>> -----Original Message-----
>>>>>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com]
>>>>>> On Behalf Of Haomai Wang
>>>>>> Sent: 21 October 2016 15:28
>>>>>> To: Nick Fisk <mailto:n...@fisk.me.uk>
>>>>>> Cc: mailto:ceph-users@lists.ceph.com
>>>>>> Subject: Re: [ceph-users] Ceph and TCP States
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Fri, Oct 21, 2016 at 10:19 PM, Nick Fisk 
>>>>>> <mailto:n...@fisk.me.uk> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> I'm just testing out using a Ceph client in a DMZ behind a FW from
>>>>>> the main Ceph cluster. One thing I have noticed is that if the
>>>>>> state table on the FW is emptied, e.g. by restarting it or just
>>>>>> clearing the state table, then the Ceph client will hang for a
>>>>>> long time, as the TCP session can no longer pass through the FW and
>>>>>> just gets blocked instead.
>>>>>> 
>>>>>> Is this "FW" a Linux firewall or a hardware FW?
>>>>> 
>>>>> pfSense running on dedicated HW. Eventually they will be in an HA pair, so
>>>>> states should persist, but I'm trying to work around this for now.
>>>>> It's a bit annoying having CephFS lock hard for 15 minutes even though the
>>>>> network connection only went down for a few seconds.
>>>>> 
>>>>>    Hmm, I'm not familiar with this FW. From my view, whether an
>>>>> RST packet is sent is decided by the FW. But I think you can try
>>>>> "/proc/sys/net/ipv4/tcp_keepalive_time"; if the FW resets the TCP
>>>>> session, TCP keepalive should detect it and send an RST.
>>>> 
>>>> Yeah, I think that's where the problem lies. Most firewalls tend to
>>>> silently drop denied packets without sending RSTs, so Ceph
>>>> effectively just thinks that it's experiencing packet loss and will never
>>>> retry until the 15 minute timeout period is up. Am I right in
>>>> thinking I can't tune down this parameter for a CephFS kernel client, as it
>>>> doesn't use the ceph.conf file?
>>> 
>>> The kernel client has a lot of mount options and can be configured in
>>> a few ways via debugfs et al; I think there's a setting for the
>>> timeout as well. If you can't find it, I'm sure Zheng knows. :) -Greg
>> 
>> So far, there is no mount option to control the keepalive time for the
>> client-to-MDS connection.
> 
> I think, although I can't be 100% sure, that most of the problem is around 
> client<->mon traffic. I'm pretty sure I saw a timeout to one of the mons 
> flash up on the screen just before everything sprang back into life.
> 

Which kernel version are you using? The kernel client has supported keepalive2
since the 4.3 kernel. keepalive2 is supposed to detect the connection issue.
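
For anyone wanting to check a client quickly, here is a minimal sketch that compares the running kernel release against the 4.3 threshold mentioned above. The helper names (parse_kernel_version, has_keepalive2) are made up for illustration, and the check looks only at the upstream version number: a distribution kernel with backported Ceph patches may report an older version and still have keepalive2, so treat the result as a hint rather than proof.

```python
import platform
import re


def parse_kernel_version(release):
    """Return (major, minor) from a release string such as '4.4.0-45-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def has_keepalive2(release, minimum=(4, 3)):
    """Hypothetical helper: True if the release is at least `minimum`.

    Assumption: only the upstream version number is checked, so vendor
    backports into older kernels are not detected.
    """
    return parse_kernel_version(release) >= minimum


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` prints
    print(release, "keepalive2 expected:", has_keepalive2(release))
```

Run it on the CephFS client itself, since the kernel version that matters is the one doing the mount.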




>> 
>>> 
>>>> 
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> I believe this behaviour can be adjusted by the "ms tcp read
>>>>>> timeout" setting to limit its impact, but I'm wondering if anybody has
>>>>>> any other ideas. I'm also thinking of experimenting with either
>>>>>> stateless FW rules for Ceph or getting the FW to send back RST
>>>>>> packets instead of silently dropping packets.
>>>>>> 
>>>>>> Hmm, I think it depends on the FW.
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Nick
>>>>>> 
>>>>>> _______________________________________________
>>>>>> ceph-users mailing list
>>>>>> mailto:ceph-users@lists.ceph.com
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
