[ceph-users] Re: Persistent problem with slow metadata

2020-08-24 Thread Eugen Block
Hi, there have been several threads about this topic [1]; most likely it's the metadata operations during the cleanup that saturate your disks. The recommended settings seem to be: [osd] osd op queue = wpq, osd op queue cut off = high. This helped us a lot, the number of slow requests has dec…
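The settings recommended in this reply can be put in ceph.conf or, on releases with the centralized config store (Mimic and later), applied with `ceph config set`; a minimal sketch, assuming the cluster runs such a release (note these options generally only take effect after the OSDs restart):

```shell
# ceph.conf fragment from the thread (placed on every OSD host):
#   [osd]
#   osd op queue = wpq
#   osd op queue cut off = high
#
# Equivalent via the centralized config store; restart the OSDs
# afterwards for the new queue implementation to take effect:
ceph config set osd osd_op_queue wpq
ceph config set osd osd_op_queue_cut_off high
```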

[ceph-users] Re: Persistent problem with slow metadata

2020-08-24 Thread Momčilo Medić
Hi Eugen, On Mon, 2020-08-24 at 14:26 +0000, Eugen Block wrote: > Hi, > > there have been several threads about this topic [1], most likely it's > the metadata operation during the cleanup that saturates your disks. > > The recommended settings seem to be: > > [osd] > osd op queue = wpq > o…

[ceph-users] Re: Persistent problem with slow metadata

2020-08-25 Thread Momčilo Medić
Hi friends, I was re-reading the documentation [1] when I noticed that 64 GiB of RAM should suffice even for 1000 clients. That really makes our issue that much more difficult to troubleshoot. There are no assumptions I can make that encompass all of the details I observe. With no assumption…

[ceph-users] Re: Persistent problem with slow metadata

2020-08-25 Thread david.neal
…nd up and testing might be the way to go? Kind regards, Dave -- Original Message -- From: "Momčilo Medić" To: ceph-users@ceph.io Sent: Tuesday, 25 Aug 2020 at 14:36 Subject: [ceph-users] Re: Persistent problem with slow metadata Hi friends, I was re-reading documentation [1] when…

[ceph-users] Re: Persistent problem with slow metadata

2020-08-26 Thread Eugen Block
Hi, root@cephosd01:~# ceph config get mds.cephosd01 osd_op_queue → wpq; root@cephosd01:~# ceph config get mds.cephosd01 osd_op_queue_cut_off → high. Just to make sure: I referred to OSD settings, not MDS settings, maybe check again? I wouldn't focus too much on the MDS service, 64 GB RAM should be enough,…
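Eugen's point is that the check quoted above queried the MDS daemon, while these are OSD options; a sketch of verifying them on an actual OSD (daemon id `osd.0` assumed):

```shell
# Query an OSD daemon rather than the MDS; the queue options are
# read by the OSDs, so this is where the values actually matter:
ceph config get osd.0 osd_op_queue
ceph config get osd.0 osd_op_queue_cut_off
```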

[ceph-users] Re: Persistent problem with slow metadata

2020-08-31 Thread Momčilo Medić
Hi Dave, On Tue, 2020-08-25 at 15:25 +0100, david.neal wrote: > Hi Momo, > > This can be caused by many things apart from the Ceph sw. > > For example I saw this once with the MTU in openvswitch not fully > matching on a few nodes. We realised this using ping between nodes. > For a 9000 MTU:…

[ceph-users] Re: Persistent problem with slow metadata

2020-08-31 Thread Momčilo Medić
Hey Eugen, On Wed, 2020-08-26 at 09:29 +0000, Eugen Block wrote: > Hi, > > > > root@cephosd01:~# ceph config get mds.cephosd01 osd_op_queue > > > wpq > > > root@cephosd01:~# ceph config get mds.cephosd01 > > > osd_op_queue_cut_off > > > high > > just to make sure, I referred to OSD not MDS sett…

[ceph-users] Re: Persistent problem with slow metadata

2020-08-31 Thread Eugen Block
Disks are utilized roughly between 70 and 80 percent. Not sure why operations would slow down when disks are getting more utilization. If that were the case, I'd expect Ceph to issue a warning. It is warning you, that's why you see slow requests. ;-) But just to be clear, by utilization I…
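One common way to measure the disk utilization being discussed here is the %util column of iostat; a sketch, assuming the sysstat package is installed on the OSD hosts:

```shell
# Extended device statistics, refreshed every 5 seconds. The %util
# column is the share of wall-clock time the device was busy
# servicing requests; spinning disks sustained near 100% will queue
# I/O and produce exactly the slow-request warnings in this thread.
iostat -x 5
```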

[ceph-users] Re: Persistent problem with slow metadata

2020-08-31 Thread Momčilo Medić
On Mon, 2020-08-31 at 14:36 +0000, Eugen Block wrote: > > Disks are utilized roughly between 70 and 80 percent. Not sure why > > would operations slow down when disks are getting more utilization. > > If that would be the case, I'd expect Ceph to issue a warning. > > It is warning you, that's why…