Re: [ceph-users] cephfs kernel client instability

2019-01-28 Thread Martin Palma
Upgrading to 4.15.0-43-generic fixed the problem.

Best,
Martin

On Fri, Jan 25, 2019 at 9:43 PM Ilya Dryomov  wrote:
>
> On Fri, Jan 25, 2019 at 9:40 AM Martin Palma  wrote:
> >
> > > Do you see them repeating every 30 seconds?
> >
> > yes:
> >
> > Jan 25 09:34:37 sdccgw01 kernel: [6306813.737615] libceph: mon4
> > 10.8.55.203:6789 session lost, hunting for new mon
> > Jan 25 09:34:37 sdccgw01 kernel: [6306813.737620] libceph: mon3
> > 10.8.55.202:6789 session lost, hunting for new mon
> > Jan 25 09:34:37 sdccgw01 kernel: [6306813.737728] libceph: mon2
> > 10.8.55.201:6789 session lost, hunting for new mon
> > Jan 25 09:34:37 sdccgw01 kernel: [6306813.739711] libceph: mon1
> > 10.7.55.202:6789 session established
> > Jan 25 09:34:37 sdccgw01 kernel: [6306813.739899] libceph: mon1
> > 10.7.55.202:6789 session established
> > Jan 25 09:34:37 sdccgw01 kernel: [6306813.740015] libceph: mon3
> > 10.8.55.202:6789 session established
> > Jan 25 09:34:43 sdccgw01 kernel: [6306819.881560] libceph: mon2
> > 10.8.55.201:6789 session lost, hunting for new mon
> > Jan 25 09:34:43 sdccgw01 kernel: [6306819.883730] libceph: mon4
> > 10.8.55.203:6789 session established
> > Jan 25 09:34:47 sdccgw01 kernel: [6306823.977566] libceph: mon0
> > 10.7.55.201:6789 session lost, hunting for new mon
> > Jan 25 09:34:47 sdccgw01 kernel: [6306823.980033] libceph: mon1
> > 10.7.55.202:6789 session established
> > Jan 25 09:35:07 sdccgw01 kernel: [6306844.457449] libceph: mon1
> > 10.7.55.202:6789 session lost, hunting for new mon
> > Jan 25 09:35:07 sdccgw01 kernel: [6306844.457450] libceph: mon3
> > 10.8.55.202:6789 session lost, hunting for new mon
> > Jan 25 09:35:07 sdccgw01 kernel: [6306844.457612] libceph: mon1
> > 10.7.55.202:6789 session lost, hunting for new mon
> > Jan 25 09:35:07 sdccgw01 kernel: [6306844.459168] libceph: mon3
> > 10.8.55.202:6789 session established
> > Jan 25 09:35:07 sdccgw01 kernel: [6306844.459537] libceph: mon4
> > 10.8.55.203:6789 session established
> > Jan 25 09:35:07 sdccgw01 kernel: [6306844.459792] libceph: mon4
> > 10.8.55.203:6789 session established
> >
> > > Which kernel are you running?
> >
> > Current running kernel is 4.11.0-13-generic  (Ubuntu 16.04.5 LTS), and
> > the latest that is provided is  4.15.0-43-generic
>
> Looks like https://tracker.ceph.com/issues/23537 indeed.  A kernel
> upgrade will fix it.
>
> Thanks,
>
> Ilya


Re: [ceph-users] cephfs kernel client instability

2019-01-25 Thread Ilya Dryomov
On Fri, Jan 25, 2019 at 9:40 AM Martin Palma  wrote:
>
> > Do you see them repeating every 30 seconds?
>
> yes:
>
> Jan 25 09:34:37 sdccgw01 kernel: [6306813.737615] libceph: mon4
> 10.8.55.203:6789 session lost, hunting for new mon
> Jan 25 09:34:37 sdccgw01 kernel: [6306813.737620] libceph: mon3
> 10.8.55.202:6789 session lost, hunting for new mon
> Jan 25 09:34:37 sdccgw01 kernel: [6306813.737728] libceph: mon2
> 10.8.55.201:6789 session lost, hunting for new mon
> Jan 25 09:34:37 sdccgw01 kernel: [6306813.739711] libceph: mon1
> 10.7.55.202:6789 session established
> Jan 25 09:34:37 sdccgw01 kernel: [6306813.739899] libceph: mon1
> 10.7.55.202:6789 session established
> Jan 25 09:34:37 sdccgw01 kernel: [6306813.740015] libceph: mon3
> 10.8.55.202:6789 session established
> Jan 25 09:34:43 sdccgw01 kernel: [6306819.881560] libceph: mon2
> 10.8.55.201:6789 session lost, hunting for new mon
> Jan 25 09:34:43 sdccgw01 kernel: [6306819.883730] libceph: mon4
> 10.8.55.203:6789 session established
> Jan 25 09:34:47 sdccgw01 kernel: [6306823.977566] libceph: mon0
> 10.7.55.201:6789 session lost, hunting for new mon
> Jan 25 09:34:47 sdccgw01 kernel: [6306823.980033] libceph: mon1
> 10.7.55.202:6789 session established
> Jan 25 09:35:07 sdccgw01 kernel: [6306844.457449] libceph: mon1
> 10.7.55.202:6789 session lost, hunting for new mon
> Jan 25 09:35:07 sdccgw01 kernel: [6306844.457450] libceph: mon3
> 10.8.55.202:6789 session lost, hunting for new mon
> Jan 25 09:35:07 sdccgw01 kernel: [6306844.457612] libceph: mon1
> 10.7.55.202:6789 session lost, hunting for new mon
> Jan 25 09:35:07 sdccgw01 kernel: [6306844.459168] libceph: mon3
> 10.8.55.202:6789 session established
> Jan 25 09:35:07 sdccgw01 kernel: [6306844.459537] libceph: mon4
> 10.8.55.203:6789 session established
> Jan 25 09:35:07 sdccgw01 kernel: [6306844.459792] libceph: mon4
> 10.8.55.203:6789 session established
>
> > Which kernel are you running?
>
> Current running kernel is 4.11.0-13-generic  (Ubuntu 16.04.5 LTS), and
> the latest that is provided is  4.15.0-43-generic

Looks like https://tracker.ceph.com/issues/23537 indeed.  A kernel
upgrade will fix it.
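
On Ubuntu 16.04 that usually means moving to the HWE kernel series
(4.15.x at the time of writing).  A rough sketch, assuming the stock
Ubuntu archives (package names may differ on customized setups):

    sudo apt update
    sudo apt install --install-recommends linux-generic-hwe-16.04
    sudo reboot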

Thanks,

Ilya


Re: [ceph-users] cephfs kernel client instability

2019-01-25 Thread Martin Palma
> Do you see them repeating every 30 seconds?

yes:

Jan 25 09:34:37 sdccgw01 kernel: [6306813.737615] libceph: mon4
10.8.55.203:6789 session lost, hunting for new mon
Jan 25 09:34:37 sdccgw01 kernel: [6306813.737620] libceph: mon3
10.8.55.202:6789 session lost, hunting for new mon
Jan 25 09:34:37 sdccgw01 kernel: [6306813.737728] libceph: mon2
10.8.55.201:6789 session lost, hunting for new mon
Jan 25 09:34:37 sdccgw01 kernel: [6306813.739711] libceph: mon1
10.7.55.202:6789 session established
Jan 25 09:34:37 sdccgw01 kernel: [6306813.739899] libceph: mon1
10.7.55.202:6789 session established
Jan 25 09:34:37 sdccgw01 kernel: [6306813.740015] libceph: mon3
10.8.55.202:6789 session established
Jan 25 09:34:43 sdccgw01 kernel: [6306819.881560] libceph: mon2
10.8.55.201:6789 session lost, hunting for new mon
Jan 25 09:34:43 sdccgw01 kernel: [6306819.883730] libceph: mon4
10.8.55.203:6789 session established
Jan 25 09:34:47 sdccgw01 kernel: [6306823.977566] libceph: mon0
10.7.55.201:6789 session lost, hunting for new mon
Jan 25 09:34:47 sdccgw01 kernel: [6306823.980033] libceph: mon1
10.7.55.202:6789 session established
Jan 25 09:35:07 sdccgw01 kernel: [6306844.457449] libceph: mon1
10.7.55.202:6789 session lost, hunting for new mon
Jan 25 09:35:07 sdccgw01 kernel: [6306844.457450] libceph: mon3
10.8.55.202:6789 session lost, hunting for new mon
Jan 25 09:35:07 sdccgw01 kernel: [6306844.457612] libceph: mon1
10.7.55.202:6789 session lost, hunting for new mon
Jan 25 09:35:07 sdccgw01 kernel: [6306844.459168] libceph: mon3
10.8.55.202:6789 session established
Jan 25 09:35:07 sdccgw01 kernel: [6306844.459537] libceph: mon4
10.8.55.203:6789 session established
Jan 25 09:35:07 sdccgw01 kernel: [6306844.459792] libceph: mon4
10.8.55.203:6789 session established

> Which kernel are you running?

The currently running kernel is 4.11.0-13-generic (Ubuntu 16.04.5 LTS), and
the latest that is provided is 4.15.0-43-generic.

Best,
Martin


Re: [ceph-users] cephfs kernel client instability

2019-01-25 Thread Ilya Dryomov
On Fri, Jan 25, 2019 at 8:37 AM Martin Palma  wrote:
>
> Hi Ilya,
>
> thank you for the clarification. After setting the
> "osd_map_messages_max" to 10 the io errors and the MDS error
> "MDS_CLIENT_LATE_RELEASE" are gone.
>
> The messages of "mon session lost, hunting for new mon" didn't go
> away... can it be that this is related to
> https://tracker.ceph.com/issues/23537

Do you see them repeating every 30 seconds?

Which kernel are you running?
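
A quick way to check both, assuming kernel messages are captured by
journald (plain "dmesg | grep libceph" works as well):

    # how often mon sessions are being lost and re-established
    journalctl -k --since "30 min ago" | grep -E "libceph: mon.*(session lost|session established)"

    # client kernel version
    uname -r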

Thanks,

Ilya


Re: [ceph-users] cephfs kernel client instability

2019-01-24 Thread Martin Palma
Hi Ilya,

thank you for the clarification. After setting "osd_map_message_max"
to 10, the io errors and the MDS warning
"MDS_CLIENT_LATE_RELEASE" are gone.

The messages of "mon session lost, hunting for new mon" didn't go
away... can it be that this is related to
https://tracker.ceph.com/issues/23537

Best,
Martin

On Thu, Jan 24, 2019 at 10:16 PM Ilya Dryomov  wrote:
>
> On Thu, Jan 24, 2019 at 6:21 PM Andras Pataki
>  wrote:
> >
> > Hi Ilya,
> >
> > Thanks for the clarification - very helpful.
> > I've lowered osd_map_messages_max to 10, and this resolves the issue
> > about the kernel being unhappy about large messages when the OSDMap
> > changes.  One comment here though: you mentioned that Luminous uses 40
> > as the default, which is indeed the case.  The documentation for
> > Luminous (and master), however, says that the default is 100.
>
> Looks like that page hasn't been kept up to date.  I'll fix that
> section.
>
> >
> > One other follow-up question on the kernel client about something I've
> > been seeing while testing.  Does the kernel client clean up when the MDS
> > asks due to cache pressure?  On a machine I ran something that touches a
> > lot of files, so the kernel client accumulated over 4 million caps.
> > Many hours after all the activity finished (i.e. many hours after
> > anything accesses ceph on that node) the kernel client still holds
> > millions of caps, and the MDS periodically complains about clients not
> > responding to cache pressure.  How is this supposed to be handled?
> > Obviously asking the kernel to drop caches via /proc/sys/vm/drop_caches
> > does a very thorough cleanup, but something in the middle would be better.
>
> The kernel client sitting on way too many caps for way too long is
> a long standing issue.  Adding Zheng who has recently been doing some
> work to facilitate cap releases and put a limit on the overall cap
> count.
>
> Thanks,
>
> Ilya


Re: [ceph-users] cephfs kernel client instability

2019-01-24 Thread Ilya Dryomov
On Thu, Jan 24, 2019 at 6:21 PM Andras Pataki
 wrote:
>
> Hi Ilya,
>
> Thanks for the clarification - very helpful.
> I've lowered osd_map_messages_max to 10, and this resolves the issue
> about the kernel being unhappy about large messages when the OSDMap
> changes.  One comment here though: you mentioned that Luminous uses 40
> as the default, which is indeed the case.  The documentation for
> Luminous (and master), however, says that the default is 100.

Looks like that page hasn't been kept up to date.  I'll fix that
section.

>
> One other follow-up question on the kernel client about something I've
> been seeing while testing.  Does the kernel client clean up when the MDS
> asks due to cache pressure?  On a machine I ran something that touches a
> lot of files, so the kernel client accumulated over 4 million caps.
> Many hours after all the activity finished (i.e. many hours after
> anything accesses ceph on that node) the kernel client still holds
> millions of caps, and the MDS periodically complains about clients not
> responding to cache pressure.  How is this supposed to be handled?
> Obviously asking the kernel to drop caches via /proc/sys/vm/drop_caches
> does a very thorough cleanup, but something in the middle would be better.

The kernel client sitting on way too many caps for way too long is
a long-standing issue.  Adding Zheng, who has recently been doing some
work to facilitate cap releases and to put a limit on the overall cap
count.

Thanks,

Ilya


Re: [ceph-users] cephfs kernel client instability

2019-01-24 Thread Ilya Dryomov
On Thu, Jan 24, 2019 at 8:16 PM Martin Palma  wrote:
>
> We are experiencing the same issues on clients with CephFS mounted
> using the kernel client and 4.x kernels.
>
> The problem shows up when we add new OSDs, on reboots after
> installing patches, and when changing OSD weights.
>
> Here are the logs of a misbehaving client:
>
> [6242967.890611] libceph: mon4 10.8.55.203:6789 session established
> [6242968.010242] libceph: osd534 10.7.55.23:6814 io error
> [6242968.259616] libceph: mon1 10.7.55.202:6789 io error
> [6242968.259658] libceph: mon1 10.7.55.202:6789 session lost, hunting
> for new mon
> [6242968.359031] libceph: mon4 10.8.55.203:6789 session established
> [6242968.622692] libceph: osd534 10.7.55.23:6814 io error
> [6242968.692274] libceph: mon4 10.8.55.203:6789 io error
> [6242968.692337] libceph: mon4 10.8.55.203:6789 session lost, hunting
> for new mon
> [6242968.694216] libceph: mon0 10.7.55.201:6789 session established
> [6242969.099862] libceph: mon0 10.7.55.201:6789 io error
> [6242969.099888] libceph: mon0 10.7.55.201:6789 session lost, hunting
> for new mon
> [6242969.224565] libceph: osd534 10.7.55.23:6814 io error
>
> In addition to the MON io errors, we also got some OSD io errors.

This isn't surprising -- the kernel client can receive osdmaps from
both monitors and OSDs.

>
> Moreover, when the error occurs, several clients trigger an
> "MDS_CLIENT_LATE_RELEASE" warning on the MDS server.
>
> We are currently running on Luminous 12.2.10 and have around 580 OSDs
> and 5 monitor nodes. The cluster is running on CentOS 7.6.
>
> The ‘osd_map_message_max’ setting is set to the default value of 40.
> But we are still getting these errors.

My advice is the same: set it to 20 or even 10.  The problem is that this
setting is expressed in terms of the number of osdmaps rather than the size
of the resulting message.  I've filed

  http://tracker.ceph.com/issues/38040
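
For example, a sketch of how it could be lowered, assuming ceph.conf is
managed by hand (injectargs changes the running value; the ceph.conf
entry makes it persistent across restarts):

    # ceph.conf (the option is read by the mons and the OSDs,
    # both of which can send osdmaps to clients)
    [global]
        osd map message max = 20

    # apply to the running daemons without a restart
    ceph tell mon.\* injectargs '--osd_map_message_max 20'
    ceph tell osd.\* injectargs '--osd_map_message_max 20'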

Thanks,

Ilya


Re: [ceph-users] cephfs kernel client instability

2019-01-24 Thread Andras Pataki

Hi Ilya,

Thanks for the clarification - very helpful.
I've lowered osd_map_message_max to 10, and this resolves the issue
of the kernel being unhappy about large messages when the OSDMap
changes.  One comment here though: you mentioned that Luminous uses 40 
as the default, which is indeed the case.  The documentation for 
Luminous (and master), however, says that the default is 100.


One other follow-up question on the kernel client about something I've 
been seeing while testing.  Does the kernel client clean up when the MDS 
asks due to cache pressure?  On a machine I ran something that touches a 
lot of files, so the kernel client accumulated over 4 million caps.  
Many hours after all the activity finished (i.e. many hours after 
anything accesses ceph on that node) the kernel client still holds 
millions of caps, and the MDS periodically complains about clients not 
responding to cache pressure.  How is this supposed to be handled?  
Obviously asking the kernel to drop caches via /proc/sys/vm/drop_caches 
does a very thorough cleanup, but something in the middle would be better.
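
For what it's worth, a rough way to observe and nudge this on a client,
assuming debugfs is mounted and the kernel exposes the usual cephfs debug
files (paths and output format vary by kernel version):

    # cap counts held by this kernel client
    cat /sys/kernel/debug/ceph/*/caps

    # drop clean dentries/inodes, which also releases the corresponding
    # cephfs caps (the heavy-handed approach mentioned above)
    sync
    echo 2 > /proc/sys/vm/drop_caches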


Andras


On 1/16/19 1:45 PM, Ilya Dryomov wrote:

On Wed, Jan 16, 2019 at 7:12 PM Andras Pataki
 wrote:

Hi Ilya/Kjetil,

I've done some debugging and tcpdump-ing to see what the interaction
between the kernel client and the mon looks like.  Indeed -
CEPH_MSG_MAX_FRONT defined as 16Mb seems low for the default mon
messages for our cluster (with osd_mon_messages_max at 100).  We have
about 3500 osd's, and the kernel advertises itself as older than

This is too big, especially for a fairly large cluster such as yours.
The default was reduced to 40 in luminous.  Given about 3500 OSDs, you
might want to set it to 20 or even 10.


Luminous, so it gets full map updates.  The FRONT message size on the
wire I saw was over 24Mb.  I'll try setting osd_mon_messages_max to 30
and do some more testing, but from the debugging it definitely seems
like the issue.

Is the kernel driver really not up to date to be considered at least a
Luminous client by the mon (i.e. it has some feature really missing)?  I
looked at the bits, and the MON seems to want is bit 59 in ceph features
shared by FS_BTIME, FS_CHANGE_ATTR, MSG_ADDR2.  Can the kernel client be
used when setting require-min-compat to luminous (either with the 4.19.x
kernel or the Redhat/Centos 7.6 kernel)?  Some background here would be
helpful.

Yes, the kernel client is missing support for that feature bit, however
4.13+ and RHEL 7.5+ _can_ be used with require-min-compat-client set to
luminous.  See

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-May/027002.html

Thanks,

 Ilya



Re: [ceph-users] cephfs kernel client instability

2019-01-24 Thread Martin Palma
We are experiencing the same issues on clients with CephFS mounted
using the kernel client and 4.x kernels.

The problem shows up when we add new OSDs, on reboots after
installing patches, and when changing OSD weights.

Here are the logs of a misbehaving client:

[6242967.890611] libceph: mon4 10.8.55.203:6789 session established
[6242968.010242] libceph: osd534 10.7.55.23:6814 io error
[6242968.259616] libceph: mon1 10.7.55.202:6789 io error
[6242968.259658] libceph: mon1 10.7.55.202:6789 session lost, hunting
for new mon
[6242968.359031] libceph: mon4 10.8.55.203:6789 session established
[6242968.622692] libceph: osd534 10.7.55.23:6814 io error
[6242968.692274] libceph: mon4 10.8.55.203:6789 io error
[6242968.692337] libceph: mon4 10.8.55.203:6789 session lost, hunting
for new mon
[6242968.694216] libceph: mon0 10.7.55.201:6789 session established
[6242969.099862] libceph: mon0 10.7.55.201:6789 io error
[6242969.099888] libceph: mon0 10.7.55.201:6789 session lost, hunting
for new mon
[6242969.224565] libceph: osd534 10.7.55.23:6814 io error

In addition to the MON io errors, we also got some OSD io errors.

Moreover, when the error occurs, several clients trigger an
"MDS_CLIENT_LATE_RELEASE" warning on the MDS server.

We are currently running on Luminous 12.2.10 and have around 580 OSDs
and 5 monitor nodes. The cluster is running on CentOS 7.6.

The ‘osd_map_message_max’ setting is set to the default value of 40.
But we are still getting these errors.
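
For anyone chasing the same MDS_CLIENT_LATE_RELEASE warnings, the MDS side
can be inspected roughly like this (mds.<name> is a placeholder for the
active MDS; session ls includes per-client cap counts on recent Luminous):

    # which health warnings are active and which clients they point at
    ceph health detail

    # per-client session details, run on the active MDS host
    ceph daemon mds.<name> session ls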

Best,
Martin


On Wed, Jan 16, 2019 at 7:46 PM Ilya Dryomov  wrote:
>
> On Wed, Jan 16, 2019 at 7:12 PM Andras Pataki
>  wrote:
> >
> > Hi Ilya/Kjetil,
> >
> > I've done some debugging and tcpdump-ing to see what the interaction
> > between the kernel client and the mon looks like.  Indeed -
> > CEPH_MSG_MAX_FRONT defined as 16Mb seems low for the default mon
> > messages for our cluster (with osd_mon_messages_max at 100).  We have
> > about 3500 osd's, and the kernel advertises itself as older than
>
> This is too big, especially for a fairly large cluster such as yours.
> The default was reduced to 40 in luminous.  Given about 3500 OSDs, you
> might want to set it to 20 or even 10.
>
> > Luminous, so it gets full map updates.  The FRONT message size on the
> > wire I saw was over 24Mb.  I'll try setting osd_mon_messages_max to 30
> > and do some more testing, but from the debugging it definitely seems
> > like the issue.
> >
> > Is the kernel driver really not up to date to be considered at least a
> > Luminous client by the mon (i.e. it has some feature really missing)?  I
> > looked at the bits, and the MON seems to want is bit 59 in ceph features
> > shared by FS_BTIME, FS_CHANGE_ATTR, MSG_ADDR2.  Can the kernel client be
> > used when setting require-min-compat to luminous (either with the 4.19.x
> > kernel or the Redhat/Centos 7.6 kernel)?  Some background here would be
> > helpful.
>
> Yes, the kernel client is missing support for that feature bit, however
> 4.13+ and RHEL 7.5+ _can_ be used with require-min-compat-client set to
> luminous.  See
>
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-May/027002.html
>
> Thanks,
>
> Ilya


Re: [ceph-users] cephfs kernel client instability

2019-01-16 Thread Ilya Dryomov
On Wed, Jan 16, 2019 at 7:12 PM Andras Pataki
 wrote:
>
> Hi Ilya/Kjetil,
>
> I've done some debugging and tcpdump-ing to see what the interaction
> between the kernel client and the mon looks like.  Indeed -
> CEPH_MSG_MAX_FRONT defined as 16Mb seems low for the default mon
> messages for our cluster (with osd_mon_messages_max at 100).  We have
> about 3500 osd's, and the kernel advertises itself as older than

This is too big, especially for a fairly large cluster such as yours.
The default was reduced to 40 in luminous.  Given about 3500 OSDs, you
might want to set it to 20 or even 10.

> Luminous, so it gets full map updates.  The FRONT message size on the
> wire I saw was over 24Mb.  I'll try setting osd_mon_messages_max to 30
> and do some more testing, but from the debugging it definitely seems
> like the issue.
>
> Is the kernel driver really not up to date to be considered at least a
> Luminous client by the mon (i.e. it has some feature really missing)?  I
> looked at the bits, and the MON seems to want is bit 59 in ceph features
> shared by FS_BTIME, FS_CHANGE_ATTR, MSG_ADDR2.  Can the kernel client be
> used when setting require-min-compat to luminous (either with the 4.19.x
> kernel or the Redhat/Centos 7.6 kernel)?  Some background here would be
> helpful.

Yes, the kernel client is missing support for that feature bit, however
4.13+ and RHEL 7.5+ _can_ be used with require-min-compat-client set to
luminous.  See

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-May/027002.html
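
For reference, a sketch of the related commands on a Luminous cluster
(set-require-min-compat-client refuses to proceed if connected clients
do not advertise the required features, unless forced):

    # releases/feature bits reported by currently connected clients
    ceph features

    # current setting
    ceph osd dump | grep min_compat_client

    # raise it (needed e.g. for upmap)
    ceph osd set-require-min-compat-client luminous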

Thanks,

Ilya


Re: [ceph-users] cephfs kernel client instability

2019-01-16 Thread Andras Pataki

Hi Ilya/Kjetil,

I've done some debugging and tcpdump-ing to see what the interaction 
between the kernel client and the mon looks like.  Indeed - 
CEPH_MSG_MAX_FRONT defined as 16Mb seems low for the default mon 
messages for our cluster (with osd_map_message_max at 100).  We have 
about 3500 osd's, and the kernel advertises itself as older than 
Luminous, so it gets full map updates.  The FRONT message size on the 
wire I saw was over 24Mb.  I'll try setting osd_map_message_max to 30 
and do some more testing, but from the debugging it definitely seems 
like the issue.


Is the kernel driver really not up to date enough to be considered at least
a Luminous client by the mon (i.e. is it really missing some feature)?  I
looked at the bits, and the one the MON seems to want is bit 59 of the ceph
features, shared by FS_BTIME, FS_CHANGE_ATTR and MSG_ADDR2.  Can the kernel
client be used when setting require-min-compat-client to luminous (either
with the 4.19.x kernel or the RedHat/CentOS 7.6 kernel)?  Some background
here would be helpful.
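
For checking what the mon actually records for a given client, the mon
admin socket can be queried on a mon host; mon.<id> is a placeholder for
the local mon name, and the output lists MonSession entries with the
advertised feature mask (e.g. "features 0x27018fb86aa42ada (jewel)"):

    ceph daemon mon.<id> sessions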


Thanks for your help and suggestions!

Andras


On 1/16/19 5:20 AM, Ilya Dryomov wrote:

On Wed, Jan 16, 2019 at 1:27 AM Kjetil Joergensen  wrote:

Hi,

you could try reducing "osd map message max", some code paths that end up as 
-EIO (kernel: libceph: mon1 *** io error) is exceeding 
include/linux/ceph/libceph.h:CEPH_MSG_MAX_{FRONT,MIDDLE,DATA}_LEN.

This "worked for us" - YMMV.

Kjetil, how big is your cluster?  Do you remember the circumstances
of when you started seeing these errors?

Andras, please let us know if this resolves the issue.  Decreasing
"osd map message max" for large clusters can help with the overall
memory consumption and is probably a good idea in general, but then
these kernel client limits are pretty arbitrary, so we can look at
bumping them.

Thanks,

 Ilya



Re: [ceph-users] cephfs kernel client instability

2019-01-16 Thread Ilya Dryomov
On Wed, Jan 16, 2019 at 1:27 AM Kjetil Joergensen  wrote:
>
> Hi,
>
> you could try reducing "osd map message max", some code paths that end up as 
> -EIO (kernel: libceph: mon1 *** io error) is exceeding 
> include/linux/ceph/libceph.h:CEPH_MSG_MAX_{FRONT,MIDDLE,DATA}_LEN.
>
> This "worked for us" - YMMV.

Kjetil, how big is your cluster?  Do you remember the circumstances
of when you started seeing these errors?

Andras, please let us know if this resolves the issue.  Decreasing
"osd map message max" for large clusters can help with the overall
memory consumption and is probably a good idea in general, but then
these kernel client limits are pretty arbitrary, so we can look at
bumping them.
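
A rough way to check the current value and to estimate the resulting
message sizes (mon.<id> is a placeholder; incrementals are usually much
smaller than a full map, but can still be large on a big cluster):

    # current value on a running mon
    ceph daemon mon.<id> config get osd_map_message_max

    # size of one full osdmap epoch, as a rough upper bound per map
    ceph osd getmap -o /tmp/osdmap && ls -l /tmp/osdmap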

Thanks,

Ilya


Re: [ceph-users] cephfs kernel client instability

2019-01-15 Thread Kjetil Joergensen
Hi,

you could try reducing "osd map message max"; some code paths end up
returning -EIO (kernel: libceph: mon1 *** io error) when a message exceeds
include/linux/ceph/libceph.h:CEPH_MSG_MAX_{FRONT,MIDDLE,DATA}_LEN.

This "worked for us" - YMMV.
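
For anyone who wants to see the limits referred to above, they are plain
defines in the kernel source tree (path as in mainline; the exact values
may differ between kernel versions):

    grep 'CEPH_MSG_MAX_.*_LEN' include/linux/ceph/libceph.h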

-KJ

On Tue, Jan 15, 2019 at 6:14 AM Andras Pataki 
wrote:

> An update on our cephfs kernel client troubles.  After doing some heavier
> testing with a newer kernel 4.19.13, it seems like it also gets into a bad
> state when it can't connect to monitors (all back end processes are on
> 12.2.8):
>
> Jan 15 08:49:00 mon5 kernel: libceph: mon1 10.128.150.11:6789 session
> established
> Jan 15 08:49:01 mon5 kernel: libceph: mon1 10.128.150.11:6789 io error
> Jan 15 08:49:01 mon5 kernel: libceph: mon1 10.128.150.11:6789 session
> lost, hunting for new mon
> Jan 15 08:49:01 mon5 kernel: libceph: mon0 10.128.150.10:6789 session
> established
> Jan 15 08:49:01 mon5 kernel: libceph: mon0 10.128.150.10:6789 io error
> Jan 15 08:49:01 mon5 kernel: libceph: mon0 10.128.150.10:6789 session
> lost, hunting for new mon
> Jan 15 08:49:04 mon5 kernel: libceph: mon1 10.128.150.11:6789 session
> established
> Jan 15 08:49:04 mon5 kernel: libceph: mon1 10.128.150.11:6789 io error
> Jan 15 08:49:04 mon5 kernel: libceph: mon1 10.128.150.11:6789 session
> lost, hunting for new mon
> Jan 15 08:49:06 mon5 kernel: libceph: mon2 10.128.150.12:6789 session
> established
> Jan 15 08:49:07 mon5 kernel: libceph: mon2 10.128.150.12:6789 io error
> Jan 15 08:49:07 mon5 kernel: libceph: mon2 10.128.150.12:6789 session
> lost, hunting for new mon
> ... repeating forever ...
>
> # uname -r
> 4.19.13
>
> and on the mon node (10.128.150.10) at log level 20, I see that it is
> building/encoding a lot of maps (10.128.36.18 is the client in question):
> 2019-01-15 08:49:01.355017 7fffee40c700 10 mon.cephmon00@0(leader) e40
> _ms_dispatch new session 0x62dc6c00 MonSession(client.36800361
> 10.128.36.18:0/2911716500 is open , features 0x27018fb86aa42ada (jewel))
> features 0x27018fb86aa42ada
> 2019-01-15 08:49:01.355021 7fffee40c700 20 mon.cephmon00@0(leader) e40
> caps
> 2019-01-15 08:49:01.355026 7fffee40c700 10 mon.cephmon00@0(leader).auth
> v58457 preprocess_query auth(proto 0 34 bytes epoch 0) from client.36800361
> 10.128.36.18:0/2911716500
> 
> 2019-01-15 08:49:01.355817 7fffee40c700 10 mon.cephmon00@0(leader).osd
> e1254390 check_osdmap_sub 0x65373340 next 1254102 (onetime)
> 2019-01-15 08:49:01.355819 7fffee40c700  5 mon.cephmon00@0(leader).osd
> e1254390 send_incremental [1254102..1254390] to client.36800361
> 10.128.36.18:0/2911716500
> 2019-01-15 08:49:01.355821 7fffee40c700 10 mon.cephmon00@0(leader).osd
> e1254390 build_incremental [1254102..1254141] with features 27018fb86aa42ada
> 2019-01-15 08:49:01.364859 7fffee40c700 20 mon.cephmon00@0(leader).osd
> e1254390 reencode_incremental_map 1254141 with features 504412504116439552
> 2019-01-15 08:49:01.372131 7fffee40c700 20 mon.cephmon00@0(leader).osd
> e1254390 build_incremental    inc 1254141 1237271 bytes
> 2019-01-15 08:49:01.372180 7fffee40c700 20 mon.cephmon00@0(leader).osd
> e1254390 reencode_incremental_map 1254140 with features 504412504116439552
> 2019-01-15 08:49:01.372187 7fffee40c700 20 mon.cephmon00@0(leader).osd
> e1254390 build_incremental    inc 1254140 260 bytes
> 2019-01-15 08:49:01.380981 7fffee40c700 20 mon.cephmon00@0(leader).osd
> e1254390 reencode_incremental_map 1254139 with features 504412504116439552
> 2019-01-15 08:49:01.387983 7fffee40c700 20 mon.cephmon00@0(leader).osd
> e1254390 build_incremental    inc 1254139 1237351 bytes
> 2019-01-15 08:49:01.388043 7fffee40c700 20 mon.cephmon00@0(leader).osd
> e1254390 reencode_incremental_map 1254138 with features 504412504116439552
> 2019-01-15 08:49:01.388049 7fffee40c700 20 mon.cephmon00@0(leader).osd
> e1254390 build_incremental    inc 1254138 232 bytes
> 2019-01-15 08:49:01.396781 7fffee40c700 20 mon.cephmon00@0(leader).osd
> e1254390 reencode_incremental_map 1254137 with features 504412504116439552
>  ... a lot more of similar messages
> 2019-01-15 08:49:04.210936 7fffee40c700 20 mon.cephmon00@0(leader).osd
> e1254390 reencode_incremental_map 1254382 with features 504412504116439552
> 2019-01-15 08:49:04.211032 7fffee40c700 20 mon.cephmon00@0(leader).osd
> e1254390 build_incremental    inc 1254382 232 bytes
> 2019-01-15 08:49:04.211066 7fffee40c700 10 mon.cephmon00@0(leader) e40
> ms_handle_reset 0x6450f800 10.128.36.18:0/2911716500
> 2019-01-15 08:49:04.211070 7fffee40c700 10 mon.cephmon00@0(leader) e40
> reset/close on session client.36800361 10.128.36.18:0/2911716500
> 2019-01-15 08:49:04.211073 7fffee40c700 10 mon.cephmon00@0(leader) e40
> remove_session 0x62dc6c00 client.36800361 10.128.36.18:0/2911716500
> features 0x27018fb86aa42ada
>
> looks like the client disconnects (either for waiting too long, or for
> some protocol error?).  Any hints on why so many maps need to be reencoded
> (to jewel), or how to improve this behavior would be much appreciated.  

Re: [ceph-users] cephfs kernel client instability

2019-01-15 Thread Andras Pataki
An update on our cephfs kernel client troubles.  After doing some 
heavier testing with a newer kernel 4.19.13, it seems like it also gets 
into a bad state when it can't connect to monitors (all back end 
processes are on 12.2.8):


Jan 15 08:49:00 mon5 kernel: libceph: mon1 10.128.150.11:6789 session 
established

Jan 15 08:49:01 mon5 kernel: libceph: mon1 10.128.150.11:6789 io error
Jan 15 08:49:01 mon5 kernel: libceph: mon1 10.128.150.11:6789 session 
lost, hunting for new mon
Jan 15 08:49:01 mon5 kernel: libceph: mon0 10.128.150.10:6789 session 
established

Jan 15 08:49:01 mon5 kernel: libceph: mon0 10.128.150.10:6789 io error
Jan 15 08:49:01 mon5 kernel: libceph: mon0 10.128.150.10:6789 session 
lost, hunting for new mon
Jan 15 08:49:04 mon5 kernel: libceph: mon1 10.128.150.11:6789 session 
established

Jan 15 08:49:04 mon5 kernel: libceph: mon1 10.128.150.11:6789 io error
Jan 15 08:49:04 mon5 kernel: libceph: mon1 10.128.150.11:6789 session 
lost, hunting for new mon
Jan 15 08:49:06 mon5 kernel: libceph: mon2 10.128.150.12:6789 session 
established

Jan 15 08:49:07 mon5 kernel: libceph: mon2 10.128.150.12:6789 io error
Jan 15 08:49:07 mon5 kernel: libceph: mon2 10.128.150.12:6789 session 
lost, hunting for new mon

... repeating forever ...

# uname -r
4.19.13

and on the mon node (10.128.150.10) at log level 20, I see that it is 
building/encoding a lot of maps (10.128.36.18 is the client in question):
2019-01-15 08:49:01.355017 7fffee40c700 10 mon.cephmon00@0(leader) e40 
_ms_dispatch new session 0x62dc6c00 MonSession(client.36800361 
10.128.36.18:0/2911716500 is open , features 0x27018fb86aa42ada (jewel)) 
features 0x27018fb86aa42ada
2019-01-15 08:49:01.355021 7fffee40c700 20 mon.cephmon00@0(leader) e40  
caps
2019-01-15 08:49:01.355026 7fffee40c700 10 mon.cephmon00@0(leader).auth 
v58457 preprocess_query auth(proto 0 34 bytes epoch 0) from 
client.36800361 10.128.36.18:0/2911716500


2019-01-15 08:49:01.355817 7fffee40c700 10 mon.cephmon00@0(leader).osd 
e1254390 check_osdmap_sub 0x65373340 next 1254102 (onetime)
2019-01-15 08:49:01.355819 7fffee40c700  5 mon.cephmon00@0(leader).osd 
e1254390 send_incremental [1254102..1254390] to client.36800361 
10.128.36.18:0/2911716500
2019-01-15 08:49:01.355821 7fffee40c700 10 mon.cephmon00@0(leader).osd 
e1254390 build_incremental [1254102..1254141] with features 27018fb86aa42ada
2019-01-15 08:49:01.364859 7fffee40c700 20 mon.cephmon00@0(leader).osd 
e1254390 reencode_incremental_map 1254141 with features 504412504116439552
2019-01-15 08:49:01.372131 7fffee40c700 20 mon.cephmon00@0(leader).osd 
e1254390 build_incremental    inc 1254141 1237271 bytes
2019-01-15 08:49:01.372180 7fffee40c700 20 mon.cephmon00@0(leader).osd 
e1254390 reencode_incremental_map 1254140 with features 504412504116439552
2019-01-15 08:49:01.372187 7fffee40c700 20 mon.cephmon00@0(leader).osd 
e1254390 build_incremental    inc 1254140 260 bytes
2019-01-15 08:49:01.380981 7fffee40c700 20 mon.cephmon00@0(leader).osd 
e1254390 reencode_incremental_map 1254139 with features 504412504116439552
2019-01-15 08:49:01.387983 7fffee40c700 20 mon.cephmon00@0(leader).osd 
e1254390 build_incremental    inc 1254139 1237351 bytes
2019-01-15 08:49:01.388043 7fffee40c700 20 mon.cephmon00@0(leader).osd 
e1254390 reencode_incremental_map 1254138 with features 504412504116439552
2019-01-15 08:49:01.388049 7fffee40c700 20 mon.cephmon00@0(leader).osd 
e1254390 build_incremental    inc 1254138 232 bytes
2019-01-15 08:49:01.396781 7fffee40c700 20 mon.cephmon00@0(leader).osd 
e1254390 reencode_incremental_map 1254137 with features 504412504116439552

 ... a lot more of similar messages
2019-01-15 08:49:04.210936 7fffee40c700 20 mon.cephmon00@0(leader).osd 
e1254390 reencode_incremental_map 1254382 with features 504412504116439552
2019-01-15 08:49:04.211032 7fffee40c700 20 mon.cephmon00@0(leader).osd 
e1254390 build_incremental    inc 1254382 232 bytes
2019-01-15 08:49:04.211066 7fffee40c700 10 mon.cephmon00@0(leader) e40 
ms_handle_reset 0x6450f800 10.128.36.18:0/2911716500
2019-01-15 08:49:04.211070 7fffee40c700 10 mon.cephmon00@0(leader) e40 
reset/close on session client.36800361 10.128.36.18:0/2911716500
2019-01-15 08:49:04.211073 7fffee40c700 10 mon.cephmon00@0(leader) e40 
remove_session 0x62dc6c00 client.36800361 10.128.36.18:0/2911716500 
features 0x27018fb86aa42ada


It looks like the client disconnects (either because it waited too long, or
because of some protocol error?).  Any hints on why so many maps need to be
re-encoded (to jewel), or on how to improve this behavior, would be much
appreciated.  We would really like to use the kernel client instead of
ceph-fuse, but this seems to be a stumbling block.
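
For completeness, the client side of these sessions can be watched through
libceph's debugfs files, assuming debugfs is mounted at /sys/kernel/debug
(the exact output format varies by kernel version):

    # which mon the client is currently talking to and what it is waiting for
    cat /sys/kernel/debug/ceph/*/monc

    # which osdmap epoch the client currently holds
    head -1 /sys/kernel/debug/ceph/*/osdmap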


Thanks,

Andras


On 1/3/19 6:49 AM, Andras Pataki wrote:
I wonder if anyone could offer any insight on the issue below, 
regarding the CentOS 7.6 kernel cephfs client connecting to a Luminous 
cluster.  I have since tried a much newer 4.19.13 kernel, which did 
not show the same issue (but 

Re: [ceph-users] cephfs kernel client instability

2019-01-03 Thread Andras Pataki
I wonder if anyone could offer any insight on the issue below, regarding 
the CentOS 7.6 kernel cephfs client connecting to a Luminous cluster.  I 
have since tried a much newer 4.19.13 kernel, which did not show the 
same issue (but unfortunately for various reasons unrelated to ceph, we 
can't go to such a new kernel).


Am I reading it right that the monitor thinks this kernel is old and needs
to prepare maps in some older format for it, and that this takes so long
that the kernel just gives up, or is there perhaps some other communication
protocol error?  It seems like each of these mon sessions only lasts about
half a second; then the client reconnects to another mon, gets the same
result, and so on.  Is there any way around this?


Andras


On 12/26/18 7:55 PM, Andras Pataki wrote:
We've been using ceph-fuse with a pretty good stability record 
(against the Luminous 12.2.8 back end).  Unfortunately ceph-fuse has 
extremely poor small file performance (understandably), so we've been 
testing the kernel client.  The latest RedHat kernel 
3.10.0-957.1.3.el7.x86_64 seems to work pretty well, as long as the 
cluster is running in a completely clean state.  However, it seems 
that as soon as there is something happening to the cluster, the 
kernel client crashes pretty badly.


Today's example: I've reweighted some OSDs to balance the disk usage a 
bit (set nobackfill, reweight the OSDs, check the new hypothetical 
space usage, then unset nobackfill).  As soon as the reweighting 
procedure started, the kernel client went into an endless loop of 
unsuccessful attempts to connect to the mons:


Dec 26 19:28:53 mon5 kernel: libceph: mon0 10.128.150.10:6789 io error
Dec 26 19:28:53 mon5 kernel: libceph: mon0 10.128.150.10:6789 session 
lost, hunting for new mon
Dec 26 19:28:53 mon5 kernel: libceph: mon2 10.128.150.12:6789 session 
established

Dec 26 19:28:58 mon5 kernel: libceph: mon2 10.128.150.12:6789 io error
Dec 26 19:28:58 mon5 kernel: libceph: mon2 10.128.150.12:6789 session 
lost, hunting for new mon
Dec 26 19:28:58 mon5 kernel: libceph: mon1 10.128.150.11:6789 session 
established

Dec 26 19:28:59 mon5 kernel: libceph: mon1 10.128.150.11:6789 io error
Dec 26 19:28:59 mon5 kernel: libceph: mon1 10.128.150.11:6789 session 
lost, hunting for new mon
Dec 26 19:28:59 mon5 kernel: libceph: mon2 10.128.150.12:6789 session 
established

Dec 26 19:28:59 mon5 kernel: libceph: mon2 10.128.150.12:6789 io error
Dec 26 19:28:59 mon5 kernel: libceph: mon2 10.128.150.12:6789 session 
lost, hunting for new mon
Dec 26 19:28:59 mon5 kernel: libceph: mon0 10.128.150.10:6789 session 
established

Dec 26 19:29:00 mon5 kernel: libceph: mon0 10.128.150.10:6789 io error
Dec 26 19:29:00 mon5 kernel: libceph: mon0 10.128.150.10:6789 session 
lost, hunting for new mon
Dec 26 19:29:00 mon5 kernel: libceph: mon2 10.128.150.12:6789 session 
established

Dec 26 19:29:00 mon5 kernel: libceph: mon2 10.128.150.12:6789 io error
Dec 26 19:29:00 mon5 kernel: libceph: mon2 10.128.150.12:6789 session 
lost, hunting for new mon
Dec 26 19:29:00 mon5 kernel: libceph: mon1 10.128.150.11:6789 session 
established

Dec 26 19:29:00 mon5 kernel: libceph: mon1 10.128.150.11:6789 io error
Dec 26 19:29:00 mon5 kernel: libceph: mon1 10.128.150.11:6789 session 
lost, hunting for new mon
Dec 26 19:29:00 mon5 kernel: libceph: mon2 10.128.150.12:6789 session 
established

Dec 26 19:29:01 mon5 kernel: libceph: mon2 10.128.150.12:6789 io error
Dec 26 19:29:01 mon5 kernel: libceph: mon2 10.128.150.12:6789 session 
lost, hunting for new mon
Dec 26 19:29:01 mon5 kernel: libceph: mon0 10.128.150.10:6789 session 
established

Dec 26 19:29:01 mon5 kernel: libceph: mon0 10.128.150.10:6789 io error
Dec 26 19:29:01 mon5 kernel: libceph: mon0 10.128.150.10:6789 session 
lost, hunting for new mon
Dec 26 19:29:01 mon5 kernel: libceph: mon2 10.128.150.12:6789 session 
established

Dec 26 19:29:02 mon5 kernel: libceph: mon2 10.128.150.12:6789 io error
Dec 26 19:29:02 mon5 kernel: libceph: mon2 10.128.150.12:6789 session 
lost, hunting for new mon
Dec 26 19:29:02 mon5 kernel: libceph: mon1 10.128.150.11:6789 session 
established

Dec 26 19:29:02 mon5 kernel: libceph: mon1 10.128.150.11:6789 io error
Dec 26 19:29:02 mon5 kernel: libceph: mon1 10.128.150.11:6789 session 
lost, hunting for new mon

... etc ...

seemingly never recovering.  The cluster is healthy, all other clients 
are successfully doing I/O:


[root@cephmon00 ceph]# ceph -s
  cluster:
    id: d7b33135-0940-4e48-8aa6-1d2026597c2f
    health: HEALTH_WARN
    noout flag(s) set
    1 backfillfull osd(s)
    4 pool(s) backfillfull
    239119058/12419244975 objects misplaced (1.925%)

  services:
    mon: 3 daemons, quorum cephmon00,cephmon01,cephmon02
    mgr: cephmon00(active)
    mds: cephfs-1/1/1 up  {0=cephmds00=up:active}, 1 up:standby
    osd: 3534 osds: 3534 up, 3534 in; 5040 remapped pgs
 flags noout