> our Ceph cluster suddenly went into a state of OSDs constantly having
> blocked or slow requests, rendering the cluster unusable. This happened
> during normal use, there were no updates, etc.
our cluster seems to have recovered overnight and is back
to normal behaviour this morning.
Hi Michael,
> Sounds like what I was having starting a couple of days ago, played
[...]
yes, that sounds only too familiar. :-(
> Updated to 3.12 kernel and restarted all of the ceph nodes and it's now
> happily churning through a rados -p rbd bench 300 write -t 120 that
Weird - but if that s
Sounds like what I was having starting a couple of days ago, played
around with the conf, taking suspect OSDs in/out and doing full SMART
tests on them that came back perfectly fine, doing network tests that
came back 110MB/s on all channels, doing OSD benches that reported all
OSDs managing 80+
Hey,
What number do you have for a replication factor? With a factor of
three, 1.5k IOPS may be a little bit high for 36 disks, and your OSD ids
look a bit suspicious - there should not be 60+ OSDs based on a
calculation from the numbers below.
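To make the back-of-the-envelope check above explicit, here is a minimal sketch. The 1.5k IOPS, replication factor of three, and 36 disks are the figures quoted in this thread; the ~100-150 IOPS per spindle ballpark is my assumption, not something stated here:

```python
# Rough sanity check: client IOPS vs. what the disks can deliver.
# With replication, each client write is multiplied by the replication
# factor before being spread across the OSD disks.

def per_disk_iops(client_iops: int, replication: int, num_disks: int) -> float:
    """Average write IOPS each disk must absorb."""
    return client_iops * replication / num_disks

# Figures from the thread: 1.5k client IOPS, replication 3, 36 disks.
load = per_disk_iops(1500, 3, 36)
print(f"{load:.0f} IOPS per disk")  # -> 125 IOPS per disk
```

At roughly 125 write IOPS per disk, a cluster of 7200 rpm spindles (typically good for ~100-150 random IOPS each, by my assumption) would be running near its limit, which is why the quoted numbers look high.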
On 11/28/2013 12:45 AM, Oliver Schulz wrote:
> Dear Ceph Experts,
>
Dear Ceph Experts,
our Ceph cluster suddenly went into a state of OSDs constantly having
blocked or slow requests, rendering the cluster unusable. This happened
during normal use, there were no updates, etc.
All disks seem to be healthy (smartctl, iostat, etc.). A complete
hardware reboot includ