Also, realize the deep scrub interval is a per-PG setting, and
(unfortunately) the OSD doesn't use a global view of its PGs' deep scrub
ages to try to schedule them intelligently across that interval. If you
really want to force them to spread out evenly, I believe a few sites have
written scripts to do it: turn off deep scrubs, force individual PGs to
deep scrub at staggered intervals, and then enable deep scrubs again.

-Greg
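(A rough sketch of such a script, untested: it assumes the ceph CLI is
authorized on the node it runs from, that PG IDs appear in the first
column of "ceph pg dump", and that the 20-minute PAUSE is only a guess at
one deep scrub's duration:)

#!/bin/bash
# Take deep-scrub scheduling away from the OSDs and pace it ourselves.
PAUSE=1200  # seconds between PGs (~ one 20-minute deep scrub)

# Stop the OSDs from launching deep scrubs on their own schedule.
ceph osd set nodeep-scrub

# Queue a deep scrub of each PG, spaced PAUSE seconds apart. Note that
# "ceph pg deep-scrub" only instructs the PG's primary OSD to scrub;
# the sleep is crude pacing, not a completion check.
for pg in $(ceph pg dump 2>/dev/null | awk '$1 ~ /^[0-9]+\./ {print $1}'); do
    ceph pg deep-scrub "$pg"
    sleep "$PAUSE"
done

# Hand scheduling back to the OSDs.
ceph osd unset nodeep-scrub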
On Wed, Sep 27, 2017 at 6:34 AM David Turner <drakonst...@gmail.com> wrote:

> This isn't an answer, but a suggestion to try to help track it down, as
> I'm not sure what the problem is. Try querying the admin socket for your
> OSDs and look through all of their config options and settings for
> something that might explain why you have multiple deep scrubs happening
> on a single OSD at the same time.
>
> However, if you misspoke and only have 1 deep scrub per OSD but multiple
> per node, then what you are seeing is expected behavior. I believe that
> Luminous added a sleep setting for scrub IO that also might help. Looking
> through the admin socket dump of settings for anything scrub-related
> should give you some ideas of things to try.
>
> On Tue, Sep 26, 2017, 2:04 PM J David <j.david.li...@gmail.com> wrote:
>
>> With "osd max scrubs" set to 1 in ceph.conf, which I believe is also
>> the default, there are 2-3 deep scrubs running at almost all times.
>>
>> Three simultaneous deep scrubs are enough to cause a constant stream of:
>>
>> mon.ceph1 [WRN] Health check update: 69 slow requests are blocked > 32
>> sec (REQUEST_SLOW)
>>
>> This seems to correspond with all three deep scrubs hitting the same
>> OSD at the same time, starving out all other I/O requests for that
>> OSD. But it can happen less frequently and less severely with two or
>> even one deep scrub running. Nonetheless, consumers of the cluster
>> are not thrilled with regular instances of 30-60 second disk I/Os.
>>
>> The cluster is five nodes with 15 OSDs, and there is one pool with 512
>> placement groups. The cluster is running:
>>
>> ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c)
>> luminous (rc)
>>
>> All of the OSDs are bluestore, with HDD storage and SSD block.db.
>>
>> Even setting "osd deep scrub interval = 1843200" hasn't resolved this
>> issue, though it seems to get the number down from 3 to 2, which at
>> least cuts down on the frequency of requests stalling out. 1,843,200
>> seconds is 512 hours, so with 512 pgs that should mean one pg gets
>> deep-scrubbed per hour, and a deep scrub seems to take about 20
>> minutes. So what should be happening is that 1/3rd of the time there
>> should be one deep scrub running, and 2/3rds of the time there
>> shouldn't be any. Yet instead we have 2-3 deep scrubs running at all
>> times.
>>
>> Looking at "ceph pg dump" shows that about 7 deep scrubs get launched
>> per hour:
>>
>> $ sudo ceph pg dump | fgrep active | awk '{print $23" "$24" "$1}' |
>> fgrep 2017-09-26 | sort -rn | head -22
>> dumped all
>> 2017-09-26 16:42:46.781761 0.181
>> 2017-09-26 16:41:40.056816 0.59
>> 2017-09-26 16:39:26.216566 0.9e
>> 2017-09-26 16:26:43.379806 0.19f
>> 2017-09-26 16:24:16.321075 0.60
>> 2017-09-26 16:08:36.095040 0.134
>> 2017-09-26 16:03:33.478330 0.b5
>> 2017-09-26 15:55:14.205885 0.1e2
>> 2017-09-26 15:54:31.413481 0.98
>> 2017-09-26 15:45:58.329782 0.71
>> 2017-09-26 15:34:51.777681 0.1e5
>> 2017-09-26 15:32:49.669298 0.c7
>> 2017-09-26 15:01:48.590645 0.1f
>> 2017-09-26 15:01:00.082014 0.199
>> 2017-09-26 14:45:52.893951 0.d9
>> 2017-09-26 14:43:39.870689 0.140
>> 2017-09-26 14:28:56.217892 0.fc
>> 2017-09-26 14:28:49.665678 0.e3
>> 2017-09-26 14:11:04.718698 0.1d6
>> 2017-09-26 14:09:44.975028 0.72
>> 2017-09-26 14:06:17.945012 0.8a
>> 2017-09-26 13:54:44.199792 0.ec
>>
>> What's going on here?
>>
>> Why isn't the limit on scrubs being honored?
>>
>> It would also be great if scrub I/O were surfaced in "ceph status" the
>> way recovery I/O is, especially since it can have such a significant
>> impact on client operations.
>>
>> Thanks!
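(For reference, the admin-socket check David suggests above could look
like the following; a sketch only, assuming osd.0 runs on the local node,
the default /var/run/ceph socket path, and a purely illustrative 0.1 s
sleep value:)

# Dump the scrub-related settings an OSD is actually running with.
ceph daemon osd.0 config show | grep -i scrub

# The same check via the socket file directly.
ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep -i scrub

# The Luminous scrub sleep can be injected at runtime on every OSD; it
# pauses between scrub chunks, trading scrub speed for client I/O.
ceph tell 'osd.*' injectargs '--osd_scrub_sleep 0.1'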
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com