Hi,

I'm running a Ceph cluster on version 0.80.1
(a38fe1169b6d2ac98b427334c12d7cf81f809b74).

Here's roughly what happened: the cluster was created, then pg_num and
pgp_num were increased step by step to (now) 1024 to get a fitting
number for the amount of OSDs and the replication size. Then (days
later) the size of the rbd pool was increased from 2 to 3, and now the
pool has some degraded PGs:

-------------------------- ceph -s -------------------------------------

root@storage001:~# ceph -s
    cluster 75b96824-9527-4c21-a5b4-01964736463a
     health HEALTH_WARN 44 pgs degraded; 68 pgs stuck unclean; recovery
6529/491939 objects degraded (1.327%)
     monmap e2: 3 mons at
{storage001=10.10.10.51:6789/0,storage002=10.10.10.52:6789/0,storage003=10.10.10.53:6789/0},
election epoch 76, quorum 0,1,2 storage001,storage002,storage003
     osdmap e2336: 16 osds: 16 up, 16 in
      pgmap v273813: 1152 pgs, 3 pools, 667 GB data, 166 kobjects
            1898 GB used, 18301 GB / 20200 GB avail
            6529/491939 objects degraded (1.327%)
                1084 active+clean
                  44 active+degraded
                  24 active+remapped
  client io 7583 B/s wr, 1 op/s

-------------------------- ceph -s -------------------------------------

(also, the pgmap version is still steadily increasing)
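
For reference, the pg_num/pgp_num and size changes described above were
applied roughly like this (just a sketch; pg_num was actually raised in
several smaller steps, not in one go):

ceph osd pool set rbd pg_num 1024
ceph osd pool set rbd pgp_num 1024
# days later:
ceph osd pool set rbd size 3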

ceph health detail output: http://pastebin.com/vVw5vTPv

As you can see in that list, some PGs are stuck degraded with only 2
copies, while others are stuck remapped (they have already made a 3rd
copy, but in the wrong spot). But even the PGs that already went to
size=3 haven't placed a copy on storage003 (no osd in the 10-15 range
shows up), and I don't see why.
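
For anyone who wants to look at individual placements, the up/acting
sets can also be checked directly like this (<pgid> being a placeholder
for an actual pg id):

ceph pg dump_stuck unclean   # lists stuck PGs with their up/acting sets
ceph pg map <pgid>           # shows where a single PG currently maps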


------------------------- OSD tree ------------------------------------

# id    weight  type name       up/down reweight
-1      19.72   root default
-2      9.05            host storage001
1       1.81                    osd.1   up      1
2       1.81                    osd.2   up      1
3       1.81                    osd.3   up      1
4       1.81                    osd.4   up      1
0       1.81                    osd.0   up      1
-3      9.05            host storage002
5       1.81                    osd.5   up      1
6       1.81                    osd.6   up      1
7       1.81                    osd.7   up      1
8       1.81                    osd.8   up      1
9       1.81                    osd.9   up      1
-4      1.62            host storage003
11      0.27                    osd.11  up      1
12      0.27                    osd.12  up      1
13      0.27                    osd.13  up      1
14      0.27                    osd.14  up      1
15      0.27                    osd.15  up      1
10      0.27                    osd.10  up      1

------------------------- OSD tree ------------------------------------

Lastly, here's the recovery_state portion of "ceph pg <pgid> query" for
one of the degraded PGs:

"recovery_state": [
        { "name": "Started\/Primary\/Active",
          "enter_time": "2014-06-10 14:19:33.083132",
          "might_have_unfound": [],
          "recovery_progress": { "backfill_targets": [],
              "waiting_on_backfill": [],
              "last_backfill_started": "0\/\/0\/\/-1",
              "backfill_info": { "begin": "0\/\/0\/\/-1",
                  "end": "0\/\/0\/\/-1",
                  "objects": []},
              "peer_backfill_info": [],
              "backfills_in_flight": [],
              "recovering": [],
              "pg_backend": { "pull_from_peer": [],
                  "pushing": []}},
          "scrub": { "scrubber.epoch_start": "0",
              "scrubber.active": 0,
              "scrubber.block_writes": 0,
              "scrubber.finalizing": 0,
              "scrubber.waiting_on": 0,
              "scrubber.waiting_on_whom": []}},
        { "name": "Started",
          "enter_time": "2014-06-10 14:19:32.076453"}],
  "agent_state": {}}

It's been stuck in this state for 4 days, so it apparently isn't just a
matter of time. Measures already taken that didn't help:

for var in {0..15}; do ceph osd scrub $var; done
# ceph expects an integer after scrub, so * didn't work.

After that completed:
for var in {0..15}; do ceph osd repair $var; done

After the repair operation finished (we couldn't find a single PG that
didn't report "0 fixed"), we tried restarting the OSDs one by one (which
is why the recovery enter_time above is so recent).
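
In case it matters, the restarts looked roughly like this (just a
sketch; the exact restart command depends on the init system, e.g.
"restart ceph-osd id=$id" under upstart):

for id in 0 1 2 3 4; do            # OSDs on storage001; likewise on the other hosts
    service ceph restart osd.$id   # sysvinit-style init script
    sleep 60                       # give peering a moment to settle
done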

Any help would be greatly appreciated.


Regards,
Marc

