On 5/7/14 15:33, Dimitri Maziuk wrote:
> On 05/07/2014 04:11 PM, Craig Lewis wrote:
>> On 5/7/14 13:40, Sergey Malinin wrote:
>>> Check dmesg and SMART data on both nodes. This behaviour is similar to
>>> a failing hdd.
>>
>> It does sound like a failing disk... but there's nothing in dmesg, and
>> smartmontools hasn't emailed me about a failing disk. The same thing is
>> happening to more than 50% of my OSDs, in both nodes.
>
> Check 'iostat -dmx 5 5' (or some other numbers) -- if you see 100%+ disk
> utilization, that could be the dying one.
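For anyone following the thread, the checks being suggested above amount to roughly the following (/dev/sde is simply the disk behind the OSD in question here; substitute your own device):

# kernel-level I/O errors for the device
dmesg | grep -i -E 'sde|error'
# quick SMART health verdict, then the full attribute and error log
smartctl -H /dev/sde
smartctl -a /dev/sde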
A new OSD, osd.10, has started doing this. I currently have all of the
previously advised params active (osd_max_backfills = 1,
osd_recovery_op_priority = 1, osd_recovery_max_active = 1).
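For reference, a rough sketch of how those values can be pushed to a running OSD and then double-checked, assuming the default admin socket path:

ceph tell osd.10 injectargs '--osd-max-backfills 1 --osd-recovery-op-priority 1 --osd-recovery-max-active 1'
ceph --admin-daemon /var/run/ceph/ceph-osd.10.asok config show | grep -E 'osd_max_backfills|osd_recovery_op_priority|osd_recovery_max_active'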
I stopped the daemon, and started watching iostat
root@ceph1c:~# iostat sde -dmx 5
Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
# I started the osd daemon during this next sample
sde         0.00     0.00     7.60    33.20     0.81     0.92    86.55     0.06     1.57     3.58     1.11     0.71     2.88
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sde         0.00     0.00     2.20   336.00     0.01     1.46     8.91     0.07     0.21    17.82     0.09     0.20     6.88
# During this next sample, the ceph-osd daemon started consuming exactly 100% CPU
sde         0.00     0.00     0.40     8.40     0.00     0.36    84.18     0.02     2.00    26.00     0.86     1.18     1.04
sde         0.00     0.00     2.20   336.00     0.01     1.46     8.91     0.07     0.21    17.82     0.09     0.20     6.88
sde         0.00     0.00     0.40     8.40     0.00     0.36    84.18     0.02     2.00    26.00     0.86     1.18     1.04
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sde         0.00     0.00     0.00    18.00     0.00     0.28    31.73     0.02     1.11     0.00     1.11     0.04     0.08
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
<snip repetitive rows>
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sde         0.00     0.00     1.20     0.00     0.08     0.00   132.00     0.02    20.67    20.67     0.00    20.67     2.48
sde         0.00     0.00     0.40     0.00     0.03     0.00   128.00     0.02    46.00    46.00     0.00    46.00     1.84
sde         0.00     0.00     0.20     0.00     0.01     0.00   128.00     0.01    44.00    44.00     0.00    44.00     0.88
sde         0.00     0.00     5.00    15.60     0.41     0.82   121.94     0.03     1.24     4.64     0.15     1.17     2.40
sde         0.00     0.00     0.00    27.40     0.00     0.17    12.44     0.49    17.96     0.00    17.96     0.53     1.44
# The suicide timer hits in this sample or the next, and the daemon restarts
sde         0.00     0.00   113.60   261.20     2.31     1.00    18.08     1.17     3.12    10.15     0.06     1.79    66.96
Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sde         0.00     0.00   176.20   134.60     3.15     1.31    29.40     1.79     5.77    10.12     0.08     3.16    98.16
sde         0.00     0.00   184.40     6.80     3.05     0.07    33.46     1.94    10.15    10.53     0.00     5.10    97.52
sde         0.00     0.00   202.20    28.80     3.60     0.26    34.26     2.06     8.92    10.18     0.06     4.09    94.40
sde         0.00     0.00   193.20    20.80     2.90     0.28    30.43     2.02     9.44    10.43     0.15     4.58    97.92
^C
During the first cycle, there was almost no data being read or written.
During the second cycle, I see what looks like a normal recovery
operation, but the daemon still hits 100% CPU and gets kicked out for
being unresponsive. The third and fourth cycles (not shown) look like
the first cycle.
So this is not a failing disk. 0% disk util and 100% CPU util means
the code is stuck in some sort of fast loop that doesn't need external
input. It could be some legit task that it's not able to complete
before being killed, or it could be a hung lock.
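One way to confirm where it's spinning, assuming gdb and the Ceph debug symbols are installed (substitute the actual PID of the osd.10 process):

# per-thread CPU view inside the ceph-osd process
top -H -p <pid of ceph-osd for osd.10>
# dump a stack trace of every thread while it's pegged at 100%
gdb -p <pid of ceph-osd for osd.10> --batch -ex 'thread apply all bt'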
I'm going to try setting noout and nodown, and see if that helps. I'm
trying to test whether it's some startup operation (leveldb compaction
or something) that can't complete before the other OSDs mark it down.
I'll give that an hour to see what happens. If it's still flapping
after that, I'll unset nodown and disable the daemon for the time being.
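Concretely, that plan in commands (the last line is the Ubuntu/upstart way of stopping the daemon; other distros use the ceph init script instead):

ceph osd set noout
ceph osd set nodown
# ...give it an hour; if osd.10 is still flapping:
ceph osd unset nodown
stop ceph-osd id=10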
--
Craig Lewis
Senior Systems Engineer
Office +1.714.602.1309
Email cle...@centraldesktop.com