Re: [ceph-users] Constant slow / blocked requests with otherwise healthy cluster

2013-11-28 Thread Oliver Schulz
Hi Michael, Sounds like what I was having starting a couple of days ago, played [...] yes, that sounds only too familiar. :-( Updated to 3.12 kernel and restarted all of the ceph nodes and it's now happily churning through a rados -p rbd bench 300 write -t 120 that Weird - but if that

Re: [ceph-users] Constant slow / blocked requests with otherwise healthy cluster

2013-11-28 Thread Oliver Schulz
our Ceph cluster suddenly went into a state of OSDs constantly having blocked or slow requests, rendering the cluster unusable. This happened during normal use; there were no updates, etc. our cluster seems to have recovered overnight and is back to normal behaviour. This morning, everything

[ceph-users] Constant slow / blocked requests with otherwise healthy cluster

2013-11-27 Thread Oliver Schulz
Dear Ceph Experts, our Ceph cluster suddenly went into a state of OSDs constantly having blocked or slow requests, rendering the cluster unusable. This happened during normal use; there were no updates, etc. All disks seem to be healthy (smartctl, iostat, etc.). A complete hardware reboot

Re: [ceph-users] Constant slow / blocked requests with otherwise healthy cluster

2013-11-27 Thread Andrey Korolyov
Hey, What replication factor are you using? With a factor of three, 1.5k IOPS may be a little high for 36 disks, and your OSD IDs look a bit suspicious: there should not be 60+ OSDs, judging from the numbers below. On 11/28/2013 12:45 AM, Oliver Schulz wrote: Dear Ceph Experts,
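
The concern about 1.5k IOPS on 36 disks can be checked with some rough arithmetic. The sketch below is only an illustration, under assumptions that are not stated in the thread: that the 1.5k figure is client write IOPS, that every client write is written once per replica, and that load spreads evenly over all disks. The function name and the `journal_on_same_disk` flag are hypothetical; the flag models journals that share the data disk, which roughly doubles the writes each disk sees.

```python
def per_disk_write_iops(client_iops, replication, disks, journal_on_same_disk=False):
    """Rough estimate of the write IOPS each disk must absorb.

    Assumes an even spread of load and one backend write per replica.
    This is a back-of-the-envelope model, not a Ceph API.
    """
    backend_iops = client_iops * replication
    if journal_on_same_disk:
        # Assumption: each write lands on the journal and then on the data store.
        backend_iops *= 2
    return backend_iops / disks


# Numbers from the thread: 1.5k IOPS, replication factor 3, 36 disks.
print(per_disk_write_iops(1500, 3, 36))
print(per_disk_write_iops(1500, 3, 36, journal_on_same_disk=True))
```

Under these assumptions each disk has to serve about 125 writes per second, or about 250 with co-located journals. A common rule of thumb puts a single spinning disk at very roughly 100 random IOPS, which is consistent with the remark that the load may be on the high side.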

Re: [ceph-users] Constant slow / blocked requests with otherwise healthy cluster

2013-11-27 Thread Michael
Sounds like what I was having starting a couple of days ago, played around with the conf, taking suspect OSDs in/out and doing full SMART tests on them that came back perfectly fine, doing network tests that came back 110MB/s on all channels, doing OSD benches that reported all OSDs managing 80+