On Mon, Mar 5, 2018 at 9:56 AM Jonathan D. Proulx <j...@csail.mit.edu> wrote:

> Hi All,
>
> I've recently noticed my deep scrubs are EXTREMELY poorly
> distributed.  They are starting within the 18->06 local time start/stop
> window but are not distributed over enough days, or well distributed
> over the range of days they have.
>
> root@ceph-mon0:~# for date in `ceph pg dump | awk '/active/{print $20}'`;
> do date +%D -d $date; done | sort | uniq -c
> dumped all
>       1 03/01/18
>       6 03/03/18
>    8358 03/04/18
>    1875 03/05/18
>
> So very nearly all 10240 pgs scrubbed last night/this morning.  I've
> been kicking this around for a while since I noticed poor distribution
> over a 7 day range when I was really pretty sure I'd changed that from
> the 7d default to 28d.
>
> Tried kicking it out to 42 days about a week ago with:
>
> ceph tell osd.* injectargs '--osd_deep_scrub_interval 3628800'
>
>
> There were many errors suggesting it could not reread the change and I'd
> need to restart the OSDs but 'ceph daemon osd.0 config show |grep
> osd_deep_scrub_interval' showed the right value so I let it roll for a
> week but the scrubs did not spread out.
>
> So Friday I set that value in ceph.conf and did rolling restarts of
> all OSDs.  Then double-checked the running value on all daemons.
> Checking Sunday, the nightly deep scrubs (based on LAST_DEEP_SCRUB
> voodoo above) show near enough 1/42nd of PGs had been scrubbed
> Saturday night that I thought this was working.
>
> This morning I checked again and got the results above.
>
> I would expect after changing to a 42d scrub cycle I'd see approx 1/42
> of the PGs deep scrub each night until there was a roughly even
> distribution over the past 42 days.
>
> So which thing is broken, my config or my expectations?
>

Sadly, changing the interval settings does not directly change the
scheduling of deep scrubs. Instead, it merely influences whether a PG will
get queued for scrub when it is examined as a candidate, based on how
out-of-date its scrub is. (That is, nothing holistically goes "I need to
scrub 1/n of these PGs every night"; there's a simple task that says "is
this PG's last scrub more than n days old?")
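
To make that concrete, here is a toy model of the per-PG check. It is only a
sketch under loud assumptions: it treats the rule as a bare "age > interval"
test applied once per night, and it ignores the scrub time window, per-OSD
concurrency limits and any randomization the real OSD applies. The function
name and the day-number representation are made up for illustration.

```python
from collections import Counter


def simulate(last_scrub_day, interval_days, nights):
    """Toy model: every night, deep scrub each PG whose last deep
    scrub is more than interval_days old. Returns scrubs per night."""
    last = list(last_scrub_day)
    per_night = Counter()
    for day in range(1, nights + 1):
        for i, stamp in enumerate(last):
            if day - stamp > interval_days:  # the only question asked
                last[i] = day
                per_night[day] += 1
    return per_night


# 10240 PGs that were all last deep scrubbed on day 0 (assumed start state).
counts = simulate([0] * 10240, interval_days=42, nights=100)
print(dict(counts))  # {43: 10240, 86: 10240}
```

In this model a longer interval only moves the clump later; PGs that were
scrubbed together come due together again. On a real cluster the scrub limits
make a clump spill over into the following night or two, but nothing spreads
it across the whole interval.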

Users have shared various scripts on the list for setting up a more even
scrub distribution by fiddling with the settings and poking at specific PGs
to try and smear them out over the whole time period; I'd check archives or
google for those. :)
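
The general idea behind such scripts can be sketched roughly as follows. This
is an illustration, not one of the scripts from the list: pick_batch, the
sample PG IDs and the timestamps are all hypothetical, and collecting the real
last-deep-scrub stamps from 'ceph pg dump' is left out because the output
layout differs between releases.

```python
import math
from datetime import datetime


def pick_batch(stamps, cycle_days):
    """Return the IDs of the ceil(total / cycle_days) PGs with the
    oldest last-deep-scrub stamps, oldest first."""
    batch_size = math.ceil(len(stamps) / cycle_days)
    oldest_first = sorted(stamps, key=stamps.get)
    return oldest_first[:batch_size]


# Hypothetical sample data: PG ID -> time of last deep scrub.
stamps = {
    "1.0": datetime(2018, 3, 4, 23, 10),
    "1.1": datetime(2018, 3, 1, 2, 0),
    "1.2": datetime(2018, 3, 5, 1, 30),
    "1.3": datetime(2018, 3, 3, 22, 45),
}

batch = pick_batch(stamps, cycle_days=2)
for pgid in batch:
    # Print the commands rather than running them.
    print("ceph pg deep-scrub", pgid)
```

Run nightly with cycle_days set to the target cycle length, this asks for the
oldest 1/n of the PGs to be scrubbed ahead of the interval check, which over
one full cycle should smear the timestamps out. Manually requested scrubs are
still queued by the OSDs, so the batch may not all finish in one night.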
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
