Hello,

We have an issue on one of our clusters. One node with 9 OSDs was down
for more than 12 hours. During that time the cluster recovered without
problems. When the host came back to the cluster we got two PGs in the
incomplete state. We decided to mark the OSDs on this host as out, but
the two PGs are still incomplete. Trying to query those PGs hangs
forever. We have already tried restarting the OSDs. Is there any way to
solve this issue without losing data? Any help appreciated :)
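
For reference, the steps we took were roughly as follows (osd.109 shown
as an example ID; the restart command depends on the init system in use):

# ceph osd out 109                 (repeated for each OSD on the host)
# ceph pg 3.2929 query             (hangs and never returns)
# service ceph restart osd.109     (or the platform's equivalent)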

# ceph health detail | grep incomplete
HEALTH_WARN 2 pgs incomplete; 2 pgs stuck inactive; 2 pgs stuck unclean;
200 requests are blocked > 32 sec; 2 osds have slow requests;
noscrub,nodeep-scrub flag(s) set
pg 3.2929 is stuck inactive since forever, current state incomplete,
last acting [109,272,83]
pg 3.1683 is stuck inactive since forever, current state incomplete,
last acting [166,329,281]
pg 3.2929 is stuck unclean since forever, current state incomplete, last
acting [109,272,83]
pg 3.1683 is stuck unclean since forever, current state incomplete, last
acting [166,329,281]
pg 3.1683 is incomplete, acting [166,329,281] (reducing pool vms
min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 3.2929 is incomplete, acting [109,272,83] (reducing pool vms min_size
from 2 may help; search ceph.com/docs for 'incomplete')
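
To see what the blocked requests are waiting on, we can dump the
in-flight ops on the acting OSDs via the admin socket (this has to be
run on the host of the OSD in question; osd.109 as an example):

# ceph daemon osd.109 dump_ops_in_flight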

The directory for PG 3.1683 is present on OSD 166 and contains ~8 GB.
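
As a precaution, before trying anything more invasive we could export
that surviving copy with ceph-objectstore-tool (with osd.166 stopped;
the data/journal paths below are just our default layout):

# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-166 \
      --journal-path /var/lib/ceph/osd/ceph-166/journal \
      --pgid 3.1683 --op export --file /root/pg3.1683.export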

We haven't tried setting min_size to 1 yet (we treat it as a last resort).
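
If we do end up going that route, the idea would be to lower it only
temporarily and raise it back once the two PGs peer and recover:

# ceph osd pool set vms min_size 1
  (wait for the two PGs to become active and backfill/recover)
# ceph osd pool set vms min_size 2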



Some cluster info:
# ceph --version

ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)

# ceph -s
     health HEALTH_WARN
            2 pgs incomplete
            2 pgs stuck inactive
            2 pgs stuck unclean
            200 requests are blocked > 32 sec
            noscrub,nodeep-scrub flag(s) set
     monmap e7: 5 mons at
{mon-03=*.2:6789/0,mon-04=*.36:6789/0,mon-05=*.81:6789/0,mon-06=*.0:6789/0,mon-07=*.40:6789/0}
            election epoch 3250, quorum 0,1,2,3,4
mon-06,mon-07,mon-04,mon-03,mon-05
     osdmap e613040: 346 osds: 346 up, 337 in
            flags noscrub,nodeep-scrub
      pgmap v27163053: 18624 pgs, 6 pools, 138 TB data, 39062 kobjects
            415 TB used, 186 TB / 601 TB avail
               18622 active+clean
                   2 incomplete
  client io 9992 kB/s rd, 64867 kB/s wr, 8458 op/s


# ceph osd pool get vms pg_num
pg_num: 16384

# ceph osd pool get vms size
size: 3

# ceph osd pool get vms min_size
min_size: 2


-- 
PS