Upgrading to 4.15.0-43-generic fixed the problem.
Best,
Martin
On Fri, Jan 25, 2019 at 9:43 PM Ilya Dryomov wrote:
On Fri, Jan 25, 2019 at 9:40 AM Martin Palma wrote:
> Do you see them repeating every 30 seconds?

yes:

Jan 25 09:34:37 sdccgw01 kernel: [6306813.737615] libceph: mon4
10.8.55.203:6789 session lost, hunting for new mon
Jan 25 09:34:37 sdccgw01 kernel: [6306813.737620] libceph: mon3
10.8.55.202:6789 session lost, hunting for new mon
Jan 25 09:34:37
On Fri, Jan 25, 2019 at 8:37 AM Martin Palma wrote:
Hi Ilya,
thank you for the clarification. After setting "osd_map_messages_max"
to 10, the I/O errors and the MDS error "MDS_CLIENT_LATE_RELEASE" are
gone.

The "mon session lost, hunting for new mon" messages didn't go
away... could this be related to
https://tracker.ceph.com/is
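For reference, a minimal sketch of how such a setting is typically applied on the monitor hosts. This assumes the Luminous-era option spelling `osd_map_message_max` (whose default the thread mentions as 40); verify the exact name against your release, since the emails above spell it slightly differently:

```ini
# ceph.conf on the monitor hosts -- caps how many OSDMap epochs
# a single MOSDMap message may carry, keeping it under the kernel
# client's fixed front-section limit
[global]
osd_map_message_max = 10
```

At runtime the same value can usually be injected without a restart via `ceph tell mon.* injectargs '--osd_map_message_max 10'` (injectargs syntax varies slightly across releases).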
On Thu, Jan 24, 2019 at 6:21 PM Andras Pataki wrote:
On Thu, Jan 24, 2019 at 8:16 PM Martin Palma wrote:
Hi Ilya,
Thanks for the clarification - very helpful.
I've lowered osd_map_messages_max to 10, and this resolves the issue
about the kernel being unhappy about large messages when the OSDMap
changes. One comment here though: you mentioned that Luminous uses 40
as the default, which is indeed
We are experiencing the same issues on clients with CephFS mounted
using the kernel client and 4.x kernels.
The problem shows up when we add new OSDs, on reboots after
installing patches and when changing the weight.
Here are the logs of a misbehaving client:
[6242967.890611] libceph: mon4 10.8.55.
On Wed, Jan 16, 2019 at 7:12 PM Andras Pataki wrote:
Hi Ilya/Kjetil,
I've done some debugging and tcpdump-ing to see what the interaction
between the kernel client and the mon looks like. Indeed -
CEPH_MSG_MAX_FRONT defined as 16 MB seems low for the default mon
messages for our cluster (with osd_mon_messages_max at 100). We have
about 3500 os
On Wed, Jan 16, 2019 at 1:27 AM Kjetil Joergensen wrote:
Hi,
you could try reducing "osd map message max"; some code paths end up
as -EIO (kernel: libceph: mon1 *** io error) when a message exceeds
include/linux/ceph/libceph.h:CEPH_MSG_MAX_{FRONT,MIDDLE,DATA}_LEN.
This "worked for us" - YMMV.
-KJ
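A back-of-envelope illustration of the limit Kjetil describes. The 16 MB constant comes from the thread (CEPH_MSG_MAX_FRONT in include/linux/ceph/libceph.h); the per-epoch payload size below is an assumed average for a large cluster, chosen for illustration only, not a measured value:

```python
# Sketch: why a burst of OSDMap epochs in one MOSDMap message can
# exceed the kernel client's fixed front-section cap.
CEPH_MSG_MAX_FRONT_LEN = 16 * 1024 * 1024  # 16 MB, per the thread

avg_map_bytes = 200 * 1024  # assumed ~200 KB per OSDMap epoch (illustrative)
for epochs in (100, 40, 10):  # values discussed in the thread
    total = epochs * avg_map_bytes
    verdict = "over" if total > CEPH_MSG_MAX_FRONT_LEN else "under"
    print(f"{epochs} epochs -> {total / 2**20:.1f} MB, {verdict} the cap")
```

Under these assumptions, 100 epochs (~19.5 MB) overflows the 16 MB cap, while 40 (~7.8 MB) and 10 (~2.0 MB) stay under it, which matches the observed effect of lowering the setting.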
On Tue, Jan 15, 2019 at 6:14 AM Andras Pataki wrote:
An update on our cephfs kernel client troubles. After doing some
heavier testing with a newer kernel 4.19.13, it seems like it also gets
into a bad state when it can't connect to monitors (all back end
processes are on 12.2.8):
Jan 15 08:49:00 mon5 kernel: libceph: mon1 10.128.150.11:6789 ses
I wonder if anyone could offer any insight on the issue below, regarding
the CentOS 7.6 kernel cephfs client connecting to a Luminous cluster. I
have since tried a much newer 4.19.13 kernel, which did not show the
same issue (but unfortunately for various reasons unrelated to ceph, we
can't go