We recently had a few Ceph nodes go offline, which required a reboot. I have
been able to get the cluster back to the state listed below; however, it does not
seem to be progressing past 23473/287823588 objects misplaced.
Yesterday about 13% of the data was misplaced. This morning
it had gotten down to 0.008%, but it has not moved past this point in about an hour.
Does anyone see anything in the output below that points to the problem, or
are there any suggestions I can follow to figure out why the
cluster health is not moving beyond this point?
---------------------------------------------------
root@rbd1:~# ceph -s
cluster:
id: 504b5794-34bd-44e7-a8c3-0494cf800c23
health: HEALTH_ERR
crush map has legacy tunables (require argonaut, min is firefly)
23473/287823588 objects misplaced (0.008%)
14 scrub errors
Reduced data availability: 2 pgs inactive
Possible data damage: 8 pgs inconsistent
services:
mon: 3 daemons, quorum hqceph1,hqceph2,hqceph3
mgr: hqceph2(active), standbys: hqceph3
osd: 288 osds: 270 up, 270 in; 2 remapped pgs
rgw: 1 daemon active
data:
pools: 17 pools, 9411 pgs
objects: 95.95M objects, 309TiB
usage: 936TiB used, 627TiB / 1.53PiB avail
pgs: 0.021% pgs not active
23473/287823588 objects misplaced (0.008%)
9369 active+clean
30 active+clean+scrubbing+deep
8 active+clean+inconsistent
2 activating+remapped
2 active+clean+scrubbing
io:
client: 1000B/s rd, 0B/s wr, 0op/s rd, 0op/s wr
root@rbd1:~# ceph health detail
HEALTH_ERR crush map has legacy tunables (require argonaut, min is firefly); 1 osds down; 23473/287823588 objects misplaced (0.008%); 14 scrub errors; Reduced data availability: 3 pgs inactive, 13 pgs peering; Possible data damage: 8 pgs inconsistent; Degraded data redundancy: 408658/287823588 objects degraded (0.142%), 38 pgs degraded
OLD_CRUSH_TUNABLES crush map has legacy tunables (require argonaut, min is firefly)
see http://docs.ceph.com/docs/master/rados/operations/crush-map/#tunables
OSD_DOWN 1 osds down
osd.95 (root=default,host=hqosd8) is down
OBJECT_MISPLACED 23473/287823588 objects misplaced (0.008%)
OSD_SCRUB_ERRORS 14 scrub errors
PG_AVAILABILITY Reduced data availability: 3 pgs inactive, 13 pgs peering
pg 3.b41 is stuck peering for 106.682058, current state peering, last acting [204,190]
pg 3.c33 is stuck peering for 103.403643, current state peering, last acting [228,274]
pg 3.d15 is stuck peering for 128.537454, current state peering, last acting [286,24]
pg 3.fa9 is stuck peering for 106.526146, current state peering, last acting [286,47]
pg 3.fb7 is stuck peering for 105.878878, current state peering, last acting [62,97]
pg 3.13a2 is stuck peering for 106.491138, current state peering, last acting [270,219]
pg 3.1521 is stuck inactive for 170180.165265, current state activating+remapped, last acting [94,186,188]
pg 3.1565 is stuck peering for 106.782784, current state peering, last acting [121,60]
pg 3.157c is stuck peering for 128.557448, current state peering, last acting [128,268]
pg 3.1744 is stuck peering for 106.639603, current state peering, last acting [192,142]
pg 3.1ac8 is stuck peering for 127.839550, current state peering, last acting [221,190]
pg 3.1e24 is stuck peering for 128.201670, current state peering, last acting [118,158]
pg 3.1e46 is stuck inactive for 169121.764376, current state activating+remapped, last acting [87,199,170]
pg 18.36 is stuck peering for 128.554121, current state peering, last acting [204]
pg 21.1ce is stuck peering for 106.582584, current state peering, last acting [266,192]
PG_DAMAGED Possible data damage: 8 pgs inconsistent
pg 3.1ca is active+clean+inconsistent, acting [201,8,180]
pg 3.56a is active+clean+inconsistent, acting [148,240,8]
pg 3.b0f is active+clean+inconsistent, acting [148,260,8]
pg 3.b56 is active+clean+inconsistent, acting [218,8,240]
pg 3.10ff is active+clean+inconsistent, acting [262,8,211]
pg 3.1192 is active+clean+inconsistent, acting [192,8,187]
pg 3.124a is active+clean+inconsistent, acting [123,8,222]
pg 3.1c55 is active+clean+inconsistent, acting [180,8,287]
PG_DEGRADED Degraded data redundancy: 408658/287823588 objects degraded (0.142%), 38 pgs degraded
pg 3.8f is active+undersized+degraded, acting [163,149]
pg 3.ba is active+undersized+degraded, acting [68,280]
pg 3.1aa is active+undersized+degraded, acting [176,211]
pg 3.29e is active+undersized+degraded, acting [241,194]
pg 3.323 is active+undersized+degraded, acting [78,194]
pg 3.343 is active+undersized+degraded, acting [242,144]
pg 3.4ae is active+undersized+degraded, acting [153,237]
pg 3.524 is active+undersized+degraded, acting [252,222]
pg 3.5c9 is active+undersized+degraded, acting [272,252]
pg 3.713 is active+undersized+degraded, acting [273,80]
pg 3.730 is active+undersized+degraded, acting [235,212]
pg 3.88f is active+undersized+degraded, acting [222,285]
pg 3.8cb is active+undersized+degraded, acting [285,20]
pg 3.9a0 is active+undersized+degraded, acting [240,200]
pg 3.c19 is active+undersized+degraded, acting [165,276]
pg 3.ec8 is active+undersized+degraded, acting [158,40]
pg 3.1025 is active+undersized+degraded, acting [258,274]
pg 3.1058 is active+undersized+degraded, acting [38,68]
pg 3.14e4 is active+undersized+degraded, acting [185,39]
pg 3.150c is active+undersized+degraded, acting [138,140]
pg 3.1545 is active+undersized+degraded, acting [222,55]
pg 3.15a6 is active+undersized+degraded, acting [242,272]
pg 3.1620 is active+undersized+degraded, acting [200,164]
pg 3.1710 is active+undersized+degraded, acting [176,285]
pg 3.1792 is active+undersized+degraded, acting [190,11]
pg 3.17bd is active+undersized+degraded, acting [207,15]
pg 3.17da is active+undersized+degraded, acting [5,160]
pg 3.183e is active+undersized+degraded, acting [273,136]
pg 3.197d is active+undersized+degraded, acting [241,139]
pg 3.1a3d is active+undersized+degraded, acting [184,121]
pg 3.1ba6 is active+undersized+degraded, acting [47,249]
pg 3.1c2b is active+undersized+degraded, acting [268,80]
pg 3.1ca2 is active+undersized+degraded, acting [280,152]
pg 3.1cd4 is active+undersized+degraded, acting [2,129]
pg 3.1e13 is active+undersized+degraded, acting [247,114]
pg 12.56 is active+undersized+degraded, acting [54]
pg 18.8 is undersized+degraded+peered, acting [260]
pg 21.9f is active+undersized+degraded, acting [215,201]
--------------------------------------------------------------------------------------------------
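One thing that may be worth checking in the PG_DAMAGED list above is whether a single OSD recurs across the inconsistent PGs. Below is a minimal Python sketch for tallying that, assuming the relevant lines are pasted in as plain text. The names `HEALTH_DETAIL`, `PG_LINE`, and `count_osds` are hypothetical helpers for illustration and are not part of Ceph.

```python
import re
from collections import Counter

# Lines copied verbatim from the PG_DAMAGED section of `ceph health detail`.
HEALTH_DETAIL = """\
pg 3.1ca is active+clean+inconsistent, acting [201,8,180]
pg 3.56a is active+clean+inconsistent, acting [148,240,8]
pg 3.b0f is active+clean+inconsistent, acting [148,260,8]
pg 3.b56 is active+clean+inconsistent, acting [218,8,240]
pg 3.10ff is active+clean+inconsistent, acting [262,8,211]
pg 3.1192 is active+clean+inconsistent, acting [192,8,187]
pg 3.124a is active+clean+inconsistent, acting [123,8,222]
pg 3.1c55 is active+clean+inconsistent, acting [180,8,287]
"""

# Matches "pg <pgid> is <state>, acting [<osd>,<osd>,...]".
PG_LINE = re.compile(r"pg (\S+) is (\S+), acting \[([\d,]+)\]")


def count_osds(text, state_substring="inconsistent"):
    """Count how often each OSD id appears in the acting sets of PGs
    whose state contains state_substring."""
    counts = Counter()
    for match in PG_LINE.finditer(text):
        _pgid, state, acting = match.groups()
        if state_substring in state:
            counts.update(int(osd) for osd in acting.split(","))
    return counts


if __name__ == "__main__":
    for osd, n in count_osds(HEALTH_DETAIL).most_common(3):
        print(f"osd.{osd}: {n} inconsistent pgs")
```

For the eight lines above, this tally shows osd.8 in all eight acting sets. That may or may not be significant, but it seemed worth pointing out.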
Thanks,
Shain
Shain Miley | Director of Platform and Infrastructure | Digital Media |
[email protected]