Re: [ceph-users] Ceph MDS randomly hangs with no useful error message

2020-01-22 Thread Janek Bevendorff
> I don't find any clue from the backtrace. Please run 'ceph daemon
> mds.xxx dump_historic_ops' and 'ceph daemon mds.xxx perf reset; ceph
> daemon mds.xxx perf dump'. Send the outputs to us.
Hi, I assume you mean 'ceph daemon mds.xxx perf reset _all_'? Here's the output of historic ops
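The requested capture sequence, as a minimal sketch (mds.xxx stands in for the actual daemon name; the commands assume access to the MDS admin socket on the node running the daemon, and 'perf reset all' resets every counter at once):

    # dump the slowest recent operations the MDS has tracked
    ceph daemon mds.xxx dump_historic_ops > historic_ops.json

    # reset all perf counters, let the hang recur, then dump them
    ceph daemon mds.xxx perf reset all
    ceph daemon mds.xxx perf dump > perf_dump.json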

Re: [ceph-users] Ceph MDS randomly hangs with no useful error message

2020-01-20 Thread Yan, Zheng
On Tue, Jan 21, 2020 at 12:09 AM Janek Bevendorff wrote:
> Hi, I did as you asked and created a thread dump with GDB on the
> blocking MDS. Here's the result: https://pastebin.com/pPbNvfdb
I don't find any clue from the backtrace. Please run 'ceph daemon mds.xxx dump_historic_ops' and

Re: [ceph-users] Ceph MDS randomly hangs with no useful error message

2020-01-20 Thread Janek Bevendorff
Hi, I did as you asked and created a thread dump with GDB on the blocking MDS. Here's the result: https://pastebin.com/pPbNvfdb
On 17/01/2020 13:07, Yan, Zheng wrote:
> On Fri, Jan 17, 2020 at 4:47 PM Janek Bevendorff wrote:
> > Hi, We have a CephFS in our cluster with 3 MDS to which > 300
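A thread dump like the one linked above can be produced along these lines (a sketch, assuming gdb and the ceph debug symbols are installed and a single ceph-mds process runs on the host; attaching briefly pauses the daemon):

    gdb --batch -p "$(pidof ceph-mds)" -ex "thread apply all bt" > mds_threads.txt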

Re: [ceph-users] Ceph MDS randomly hangs with no useful error message

2020-01-17 Thread Janek Bevendorff
Thanks. I will do that. Right now, we see quite a bit of lag when listing folders, which is probably due to another client heavily using the system. Unfortunately, it's rather hard to debug at the moment, since the suspected client has to use our Ganesha bridge instead of connecting to the Ceph
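One way to look for a client that is hammering the MDS is to inspect its sessions (a sketch; mds.xxx is a placeholder and the exact output fields vary by release). Note that all NFS clients behind a Ganesha bridge appear to the MDS as Ganesha's single libcephfs session, which is what makes this case hard to attribute:

    # list connected clients with their request load and held capabilities
    ceph daemon mds.xxx session ls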

Re: [ceph-users] Ceph MDS randomly hangs with no useful error message

2020-01-17 Thread Yan, Zheng
On Fri, Jan 17, 2020 at 4:47 PM Janek Bevendorff wrote:
> Hi,
> We have a CephFS in our cluster with 3 MDS to which > 300 clients
> connect at any given time. The FS contains about 80 TB of data and many
> million files, so it is important that metadata operations work
> smoothly even when

[ceph-users] Ceph MDS randomly hangs with no useful error message

2020-01-17 Thread Janek Bevendorff
Hi, We have a CephFS in our cluster with 3 MDS to which > 300 clients connect at any given time. The FS contains about 80 TB of data and many million files, so it is important that metadata operations work smoothly even when listing large directories. Previously, we had massive stability
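For context, the cluster-side view of MDS state in a setup like this comes from the standard Ceph CLI (no cluster-specific assumptions):

    ceph fs status      # MDS ranks, active/standby daemons, and activity
    ceph health detail  # any MDS-related health warnings, e.g. slow metadata requests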