Re: [ceph-users] How to avoid deep-scrubbing performance hit?

2014-06-11 Thread Dan Van Der Ster
On 10 Jun 2014, at 11:59, Dan Van Der Ster daniel.vanders...@cern.ch wrote: One idea I had was to check the behaviour under different disk I/O schedulers, trying to exploit thread I/O priorities with cfq. So I have a question for the developers about using ionice or ioprio_set to lower the IO …
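ionice and ioprio_set operate per thread, and cfq is the only stock Linux scheduler that honours I/O priority classes, which is what makes the idea workable: the OSD's scrub-reading threads could be deprioritised individually. A minimal sketch of the mechanics (my illustration, not a command from the thread; the thread ID is a placeholder):

    # Given the thread ID of a ceph-osd disk thread, drop it to the
    # lowest best-effort priority so its scrub reads yield to client
    # I/O under cfq. Only cfq honours these priorities.
    TID=12345                  # placeholder: a ceph-osd disk thread ID
    ionice -c 2 -n 7 -p "$TID"

The idle class (-c 3) would yield even harder, but idle-class threads can be starved entirely under sustained client load, so best-effort with a low priority is the safer first experiment.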

Re: [ceph-users] How to avoid deep-scrubbing performance hit?

2014-06-10 Thread Dan Van Der Ster
Hi, I’m just starting to get interested in this topic, since today we’ve found that a weekly peak in latency correlates with a bulk (~30) of deep scrubbing PGs. One idea I had was to check the behaviour under different disk I/O schedulers, trying to exploit thread I/O priorities with cfq. So I have …
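For reference, the scheduler can be inspected and switched per block device at runtime through sysfs (the device name below is a placeholder for the OSD's data disk):

    # The active elevator is shown in brackets, e.g. "noop deadline [cfq]".
    cat /sys/block/sda/queue/scheduler
    # Switch to cfq; thread I/O priorities have no effect under the
    # other stock schedulers.
    echo cfq > /sys/block/sda/queue/scheduler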

Re: [ceph-users] How to avoid deep-scrubbing performance hit?

2014-06-10 Thread Craig Lewis
After doing this, I've found that I'm having problems with a few specific PGs. If I set nodeep-scrub, then manually deep-scrub one specific PG, the responsible OSDs get kicked out. I'm starting a new discussion, subject: "I have PGs that I can't deep-scrub". I'll re-test this correlation after I …
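For anyone reproducing this, the commands involved are (the PG ID is a placeholder):

    ceph osd set nodeep-scrub      # stop the cluster scheduling new deep scrubs
    ceph pg deep-scrub 2.1a        # deep-scrub a single PG by ID (placeholder)
    ceph osd unset nodeep-scrub    # re-enable scheduled deep scrubs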

[ceph-users] How to avoid deep-scrubbing performance hit?

2014-06-09 Thread Craig Lewis
I've correlated a large deep scrubbing operation to cluster stability problems. My primary cluster does a small amount of deep scrubs all the time, spread out over the whole week. It has no stability problems. My secondary cluster doesn't spread them out. It saves them up, and tries to do all …
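The knobs that control this cadence live in the [osd] section of ceph.conf; a sketch using the long-standing defaults (shown for orientation, not settings from this thread):

    [osd]
        ; at most one concurrent scrub per OSD (the default)
        osd max scrubs = 1
        ; light scrubs: at least daily, forced weekly (seconds)
        osd scrub min interval = 86400
        osd scrub max interval = 604800
        ; deep scrubs: weekly
        osd deep scrub interval = 604800

With these intervals, whether deep scrubs end up smeared across the week or bunched into one burst depends on when the PGs were last scrubbed, which is the difference the two clusters above are showing.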

Re: [ceph-users] How to avoid deep-scrubbing performance hit?

2014-06-09 Thread Gregory Farnum
On Mon, Jun 9, 2014 at 3:22 PM, Craig Lewis cle...@centraldesktop.com wrote: I've correlated a large deep scrubbing operation to cluster stability problems. My primary cluster does a small amount of deep scrubs all the time, spread out over the whole week. It has no stability problems. My …

Re: [ceph-users] How to avoid deep-scrubbing performance hit?

2014-06-09 Thread Mike Dawson
Craig, I've struggled with the same issue for quite a while. If your I/O is similar to mine, I believe you are on the right track. For the past month or so, I have been running this cronjob:

    * * * * * for strPg in `ceph pg dump | egrep '^[0-9]\.[0-9a-f]{1,4}' | sort -k20 | awk '{ …
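The archive preview cuts the cronjob off mid-awk. A plausible completion (my reconstruction, not necessarily the original line): every minute, pick the PG whose last-deep-scrub stamp (column 20 of `ceph pg dump` output at the time) is oldest, and deep-scrub it, which smears deep scrubbing evenly over time instead of letting it bunch up.

    # Reconstructed cronjob (assumption): deep-scrub the one PG with
    # the oldest deep-scrub timestamp, once per minute.
    * * * * * for strPg in `ceph pg dump | egrep '^[0-9]\.[0-9a-f]{1,4}' | sort -k20 | awk '{ print $1 }' | head -n 1`; do ceph pg deep-scrub $strPg; done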