Thank you, Denis! We have made most of these changes and are waiting to see 
what happens. 

Thank you,
Ray 

-----Original Message-----
From: Denis Polom <denispo...@gmail.com> 
Sent: Saturday, March 12, 2022 1:40 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Scrubbing

Hi,

I had a similar problem on my large cluster.

Here is what I found that helped me solve it:

Because of bad drives, and replacing drives too often due to scrub errors, there 
were always some recovery operations going on.

I did set this:

osd_scrub_during_recovery true

and it basically solved my issue.
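For reference, a minimal sketch of applying that at runtime via the centralized 
config store (available since Mimic; verify the option against your release's 
docs, and note it can also be set in ceph.conf):

ceph config set osd osd_scrub_during_recovery true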

If that isn't enough, you can try changing the interval.

I also changed it from the default of once per week to two weeks:

osd_deep_scrub_interval 1209600
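That value is just two weeks expressed in seconds (14 days * 86400 s/day = 
1209600). A hedged sketch of making the change through the config store:

ceph config set osd osd_deep_scrub_interval 1209600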

and if you want or need to speed it up to get rid of the "PGs not scrubbed in 
time" warnings, take a look at

osd_max_scrubs

The default is 1; when I need to speed things up I set it to 3, and I didn't 
notice any performance impact.
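
As a rough sketch (again assuming the config store; adjust to however you 
normally manage OSD options), bumping it temporarily and reverting once the 
backlog clears might look like:

ceph config set osd osd_max_scrubs 3
# ... wait for the overdue PGs to be scrubbed, then go back to the default:
ceph config set osd osd_max_scrubs 1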


dp

On 3/11/22 17:32, Ray Cunningham wrote:
> That's what I thought. We looked at the cluster storage nodes and found them 
> all to be less than .2 normalized maximum load.
>
> Our 'normal' BW for client IO according to ceph -s is around 60MB/s-100MB/s. 
> I don't usually look at the IOPs so I don't have that number right now. We 
> have seen GB/s numbers during repairs, so the cluster can get up there when 
> the workload requires.
>
> We discovered that this system never got the auto repair setting configured 
> to true and since we turned that on, we have been repairing PGs for the past 
> 24 hours. So, maybe we've been bottlenecked by those?
>
> Thank you,
> Ray
>   
>
> -----Original Message-----
> From: norman.kern <norman.k...@gmx.com>
> Sent: Thursday, March 10, 2022 9:27
> To: Ray Cunningham <ray.cunning...@keepertech.com>
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Re: Scrubbing
>
> Ray,
>
> You can use node-exporter + Prometheus + Grafana to collect CPU load 
> statistics. You can use the uptime command to get the current values.
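>
> As a quick check without a monitoring stack, a minimal shell sketch (assuming a
> Linux node with /proc/loadavg and coreutils' nproc available) that computes the
> same normalized value the scrub load threshold is compared against:
>
> awk -v cpus="$(nproc)" '{printf "%.2f\n", $1 / cpus}' /proc/loadavg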
>
> On 3/10/22 10:51 PM, Ray Cunningham wrote:
>> From:
>>
>> osd_scrub_load_threshold
>> The normalized maximum load. Ceph will not scrub when the system load (as 
>> defined by getloadavg() / number of online CPUs) is higher than this number. 
>> Default is 0.5.
>>
>> Does anyone know how I can run getloadavg() / number of online CPUs so I can 
>> see what our load is? Is that a ceph command, or an OS command?
>>
>> Thank you,
>> Ray
>>
>>
>> -----Original Message-----
>> From: Ray Cunningham
>> Sent: Thursday, March 10, 2022 7:59 AM
>> To: norman.kern <norman.k...@gmx.com>
>> Cc: ceph-users@ceph.io
>> Subject: RE: [ceph-users] Scrubbing
>>
>>
>> We have 16 Storage Servers each with 16TB HDDs and 2TB SSDs for DB/WAL, so 
>> we are using bluestore. The system is running Nautilus 14.2.19 at the 
>> moment, with an upgrade scheduled this month. I can't give you a complete 
>> ceph config dump as this is an offline customer system, but I can get 
>> answers for specific questions.
>>
>> Off the top of my head, we have set:
>>
>> osd_max_scrubs 20
>> osd_scrub_auto_repair true
>> osd_scrub_load_threshold 0.6
>> We do not limit scrub hours.
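>>
>> For what it's worth, one hedged way to confirm the values an OSD is actually
>> running with is to query its admin socket on the host where it lives (osd.0
>> here is just a placeholder ID; verify the command form against your release):
>>
>> ceph daemon osd.0 config show | grep scrub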
>>
>> Thank you,
>> Ray
>>
>>
>>
>>
>> -----Original Message-----
>> From: norman.kern <norman.k...@gmx.com>
>> Sent: Wednesday, March 9, 2022 7:28 PM
>> To: Ray Cunningham <ray.cunning...@keepertech.com>
>> Cc: ceph-users@ceph.io
>> Subject: Re: [ceph-users] Scrubbing
>>
>> Ray,
>>
>> Can you provide more information about your cluster (hardware and software 
>> configs)?
>>
>> On 3/10/22 7:40 AM, Ray Cunningham wrote:
>>>     make any difference. Do
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io