Thank you, Denis! We have made most of these changes and are waiting to see what happens.
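For the record, the changes amount to the settings Denis suggests below. A minimal sketch of applying them cluster-wide (assuming ceph config set, which is available from Mimic onward; on older releases the same values would go in ceph.conf):

    ceph config set osd osd_scrub_during_recovery true   # allow scrubbing while recovery is running
    ceph config set osd osd_deep_scrub_interval 1209600  # deep scrub every two weeks (value in seconds)
    ceph config set osd osd_max_scrubs 3                 # up to 3 concurrent scrubs per OSD

The values are the ones from Denis's message, not tuned recommendations; raise osd_max_scrubs with an eye on client latency.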
Thank you,
Ray

-----Original Message-----
From: Denis Polom <denispo...@gmail.com>
Sent: Saturday, March 12, 2022 1:40 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Scrubbing

Hi,

I had a similar problem on my large cluster. What I found, and what helped me solve it: because of bad drives, and drives being replaced too often after scrub errors, there were always recovery operations going on. I set this:

osd_scrub_during_recovery true

and it basically solved my issue. If that isn't enough, you can try changing the interval; I also changed it from the default of once per week to two weeks:

osd_deep_scrub_interval 1209600

And if you want or need to speed scrubbing up to get rid of "not scrubbed in time" PGs, take a look at osd_max_scrubs. The default is 1; when I need to speed things up I set it to 3, and I didn't notice any performance impact.

dp

On 3/11/22 17:32, Ray Cunningham wrote:
> That's what I thought. We looked at the cluster storage nodes and found them all to be at less than 0.2 normalized maximum load.
>
> Our 'normal' BW for client IO according to ceph -s is around 60MB/s-100MB/s. I don't usually look at the IOPS, so I don't have that number right now. We have seen GB/s numbers during repairs, so the cluster can get up there when the workload requires it.
>
> We discovered that this system never got the auto repair setting configured to true, and since we turned that on we have been repairing PGs for the past 24 hours. So maybe we've been bottlenecked by those?
>
> Thank you,
> Ray
>
> -----Original Message-----
> From: norman.kern <norman.k...@gmx.com>
> Sent: Thursday, March 10, 2022 9:27
> To: Ray Cunningham <ray.cunning...@keepertech.com>
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Re: Scrubbing
>
> Ray,
>
> You can use node-exporter + Prometheus + Grafana to collect CPU load statistics, or use the uptime command to get the current load averages.
>
> On 3/10/22 10:51 PM, Ray Cunningham wrote:
>> From the documentation:
>>
>> osd_scrub_load_threshold
>> The normalized maximum load. Ceph will not scrub when the system load (as defined by getloadavg() / number of online CPUs) is higher than this number. Default is 0.5.
>>
>> Does anyone know how I can run getloadavg() / number of online CPUs so I can see what our load is? Is that a ceph command, or an OS command?
>>
>> Thank you,
>> Ray
>>
>> -----Original Message-----
>> From: Ray Cunningham
>> Sent: Thursday, March 10, 2022 7:59 AM
>> To: norman.kern <norman.k...@gmx.com>
>> Cc: ceph-users@ceph.io
>> Subject: RE: [ceph-users] Scrubbing
>>
>> We have 16 storage servers, each with 16TB HDDs and 2TB SSDs for DB/WAL, so we are using BlueStore. The system is running Nautilus 14.2.19 at the moment, with an upgrade scheduled this month. I can't give you a complete ceph config dump as this is an offline customer system, but I can get answers to specific questions.
>>
>> Off the top of my head, we have set:
>>
>> osd_max_scrubs 20
>> osd_scrub_auto_repair true
>> osd_scrub_load_threshold 0.6
>>
>> We do not limit scrub hours.
>>
>> Thank you,
>> Ray
>>
>> -----Original Message-----
>> From: norman.kern <norman.k...@gmx.com>
>> Sent: Wednesday, March 9, 2022 7:28 PM
>> To: Ray Cunningham <ray.cunning...@keepertech.com>
>> Cc: ceph-users@ceph.io
>> Subject: Re: [ceph-users] Scrubbing
>>
>> Ray,
>>
>> Can you provide more information about your cluster (hardware and software configs)?
>>
>> On 3/10/22 7:40 AM, Ray Cunningham wrote:
>>> make any difference.
>>> Do

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
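As a footnote to the osd_scrub_load_threshold question above: the getloadavg() / number-of-online-CPUs figure is an OS-level number, not a Ceph command. A minimal sketch of the same calculation on a Linux storage node (assuming /proc/loadavg and nproc are available):

    # 1-minute load average divided by online CPU count; compare the result
    # against osd_scrub_load_threshold (default 0.5).
    awk -v cpus="$(nproc)" '{ printf "%.2f\n", $1 / cpus }' /proc/loadavg

These are the same numbers uptime prints, and what Python's os.getloadavg() returns if you prefer to script the check.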