Also, realize the deep scrub interval is tracked per PG, and (unfortunately)
the OSD doesn't use a global view of its PGs' deep scrub ages to schedule
them intelligently across that interval. If you really want to force them to
spread out, I believe a few sites have written scripts to do it by turning
off deep scrubs, forcing individual PGs to deep scrub at intervals, and then
enabling deep scrubs again.
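Roughly along these lines (an untested sketch; the one-hour sleep is
arbitrary and should be tuned to how long a deep scrub actually takes on
your hardware):

  ceph osd set nodeep-scrub          # stop automatic deep scrub scheduling
  for pg in $(ceph pg dump pgs_brief 2>/dev/null | awk '/^[0-9]+\./ {print $1}'); do
      ceph pg deep-scrub "$pg"       # ask this one PG to deep scrub now
      sleep 3600                     # space them out, e.g. one per hour
  done
  ceph osd unset nodeep-scrub        # let automatic scheduling resume
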
-Greg

On Wed, Sep 27, 2017 at 6:34 AM David Turner <drakonst...@gmail.com> wrote:

> This isn't an answer, but a suggestion to try to help track it down, as
> I'm not sure what the problem is. Try querying the admin socket for your
> osds and looking through all of their config options and settings for
> something that might explain why you have multiple deep scrubs happening
> on a single osd at the same time.
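>
> For example, something like this (osd.0 is just a placeholder; repeat for
> each osd id):
>
>   ceph daemon osd.0 config show | grep scrub
>   ceph daemon osd.0 config get osd_max_scrubs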
>
> However, if you misspoke and only have 1 deep scrub per osd but multiple
> per node, then what you are seeing is expected behavior.  I believe that
> luminous added a sleep setting for scrub io that might also help.  Looking
> through the admin socket dump of settings for anything scrub-related
> should give you some ideas of things to try.
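>
> I believe the setting in question is osd_scrub_sleep (the number of
> seconds to pause between chunks of scrub work); it can be tried at runtime
> with something like:
>
>   ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
>
> (0.1 is only an example value, and injected settings don't survive an OSD
> restart.)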
>
> On Tue, Sep 26, 2017, 2:04 PM J David <j.david.li...@gmail.com> wrote:
>
>> With “osd max scrubs” set to 1 in ceph.conf, which I believe is also
>> the default, at almost all times, there are 2-3 deep scrubs running.
>>
>> 3 simultaneous deep scrubs is enough to cause a constant stream of:
>>
>> mon.ceph1 [WRN] Health check update: 69 slow requests are blocked > 32
>> sec (REQUEST_SLOW)
>>
>> This seems to correspond with all three deep scrubs hitting the same
>> OSD at the same time, starving out all other I/O requests for that
>> OSD.  But it can happen less frequently and less severely with two or
>> even one deep scrub running.  Nonetheless, consumers of the cluster
>> are not thrilled with regular instances of 30-60 second disk I/Os.
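>>
>> (One way to see that overlap is to cross-reference which PGs are deep
>> scrubbing against which OSDs are reporting blocked requests, e.g.:
>>
>> $ sudo ceph pg dump pgs_brief | grep 'scrubbing+deep'
>> $ sudo ceph health detail
>>
>> The first command lists the acting OSDs of every PG currently being deep
>> scrubbed; the second should show which OSDs the slow requests are on.)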
>>
>> The cluster is five nodes, 15 OSDs, and there is one pool with 512
>> placement groups.  The cluster is running:
>>
>> ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous
>> (rc)
>>
>> All of the OSDs are bluestore, with HDD storage and SSD block.db.
>>
>> Even setting “osd deep scrub interval = 1843200” hasn’t resolved this
>> issue, though it seems to get the number down from 3 to 2, which at
>> least cuts down on the frequency of requests stalling out.  With 512
>> pgs, that should mean that one pg gets deep-scrubbed per hour, and it
>> seems like a deep-scrub takes about 20 minutes.  So what should be
>> happening is that 1/3rd of the time there should be one deep scrub,
>> and 2/3rds of the time there shouldn’t be any.  Yet instead we have
>> 2-3 deep scrubs running at all times.
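>>
>> (For the arithmetic: 1843200 seconds / 3600 = 512 hours, and 512 pgs
>> spread over 512 hours works out to one deep scrub per hour.)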
>>
>> Looking at “ceph pg dump” shows that about 7 deep scrubs get launched per
>> hour:
>>
>> $ sudo ceph pg dump | fgrep active | awk '{print $23" "$24" "$1}' |
>> fgrep 2017-09-26 | sort -rn | head -22
>> dumped all
>> 2017-09-26 16:42:46.781761 0.181
>> 2017-09-26 16:41:40.056816 0.59
>> 2017-09-26 16:39:26.216566 0.9e
>> 2017-09-26 16:26:43.379806 0.19f
>> 2017-09-26 16:24:16.321075 0.60
>> 2017-09-26 16:08:36.095040 0.134
>> 2017-09-26 16:03:33.478330 0.b5
>> 2017-09-26 15:55:14.205885 0.1e2
>> 2017-09-26 15:54:31.413481 0.98
>> 2017-09-26 15:45:58.329782 0.71
>> 2017-09-26 15:34:51.777681 0.1e5
>> 2017-09-26 15:32:49.669298 0.c7
>> 2017-09-26 15:01:48.590645 0.1f
>> 2017-09-26 15:01:00.082014 0.199
>> 2017-09-26 14:45:52.893951 0.d9
>> 2017-09-26 14:43:39.870689 0.140
>> 2017-09-26 14:28:56.217892 0.fc
>> 2017-09-26 14:28:49.665678 0.e3
>> 2017-09-26 14:11:04.718698 0.1d6
>> 2017-09-26 14:09:44.975028 0.72
>> 2017-09-26 14:06:17.945012 0.8a
>> 2017-09-26 13:54:44.199792 0.ec
>>
>> What’s going on here?
>>
>> Why isn’t the limit on scrubs being honored?
>>
>> It would also be great if scrub I/O were surfaced in “ceph status” the
>> way recovery I/O is, especially since it can have such a significant
>> impact on client operations.
>>
>> Thanks!
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
