All IO drops to ZERO IOPS for 1-15 minutes during the deep-scrub on my cluster.
There is clearly a locking bug!
I have VMs - every day, several times, sometimes on all of them, disk IO
_completely_ stops. The disk queue is growing, 0 IOPS are performed, services are
dying with timeouts... At the sam
On Tue, Jan 28, 2014 at 01:30:46AM -0500, Mike Dawson wrote:
>
> On 1/27/2014 1:45 PM, Sage Weil wrote:
> >There is also
> >
> > ceph osd set noscrub
> >
> >and then later
> >
> > ceph osd unset noscrub
> >
> In my experience scrub isn't nearly as much of a problem as
> deep-scrub. On an IOPS con
On Mon, Jan 27, 2014 at 10:45:48AM -0800, Sage Weil wrote:
> There is also
>
> ceph osd set noscrub
>
> and then later
>
> ceph osd unset noscrub
>
> I forget whether this pauses an in-progress PG scrub or just makes it stop
> when it gets to the next PG boundary.
>
> sage
I bumped into t
On Mon, Jan 27, 2014 at 01:10:23PM -0500, Kyle Bader wrote:
> > Are there any tools we are not aware of for controlling, possibly pausing,
> > deep-scrub and/or getting some progress about the procedure ?
> > Also since I believe it would be a bad practice to disable deep-scrubbing
> > do you
> >
On 1/27/2014 1:45 PM, Sage Weil wrote:
> There is also
>
>   ceph osd set noscrub
>
> and then later
>
>   ceph osd unset noscrub
In my experience scrub isn't nearly as much of a problem as deep-scrub.
On an IOPS-constrained cluster with writes approaching the available
aggregate spindle performance minu
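
Beyond toggling the flags by hand, the scrub pressure described above can be
reduced with OSD scrub tuning in ceph.conf. A hedged sketch - these option
names existed in the emperor era, but the values are illustrative, not
recommendations:

```ini
[osd]
; Never run more than one scrub operation per OSD at a time (the default).
osd max scrubs = 1
; Skip shallow scrubs while the OSD host's load average is above this.
osd scrub load threshold = 0.5
; Stretch the deep-scrub cadence from the default week (604800 s) to
; roughly a month, so fewer PGs are deep-scrubbed on any given day.
osd deep scrub interval = 2592000
```

Note that stretching the interval trades scrub impact against how quickly
latent disk errors are detected.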
There is also
ceph osd set noscrub
and then later
ceph osd unset noscrub
I forget whether this pauses an in-progress PG scrub or just makes it stop
when it gets to the next PG boundary.
sage
On Mon, 27 Jan 2014, Kyle Bader wrote:
> Are there any tools we are not aware of for controlling, possibly pausing,
> deep-scrub and/or getting some progress about the procedure ?
> Also, since I believe it would be bad practice to disable deep-scrubbing, do
> you
> have any recommendations of how to work around (or even solve) this issue ?
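
On the "getting some progress" part of the original question: which PGs are
scrubbing at any moment can be read out of `ceph pg dump --format=json`. A
minimal sketch, assuming the emperor-era JSON layout (a top-level `pg_stats`
list whose entries carry `pgid` and `state`); the sample data below is
invented for illustration:

```python
# Sketch: list the PGs currently scrubbing, from `ceph pg dump` JSON.

def scrubbing_pgs(pg_dump):
    """Return (pgid, state) for every PG whose state mentions scrubbing."""
    return [(pg["pgid"], pg["state"])
            for pg in pg_dump["pg_stats"]
            if "scrub" in pg["state"]]

# Invented sample; on a live cluster you would instead do something like:
#   pg_dump = json.loads(subprocess.check_output(
#       ["ceph", "pg", "dump", "--format=json"]))
sample = {"pg_stats": [
    {"pgid": "0.1a", "state": "active+clean"},
    {"pgid": "0.2b", "state": "active+clean+scrubbing+deep"},
    {"pgid": "0.3c", "state": "active+clean+scrubbing"},
]}

for pgid, state in scrubbing_pgs(sample):
    print(pgid, state)
```

Polling this before and after setting the noscrub flag would also answer
Sage's question about whether an in-progress PG scrub gets interrupted.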
Hello all,
We have been running RADOS in a large-scale, production, public cloud
environment for a few months now, and we are generally happy with it.
However, we experience performance problems when deep scrubbing is active.
We managed to reproduce them in our testing cluster running emperor, ev