Hello,

I have been working with Ceph for only a few weeks and have a small cluster
running in VMs. I did a ceph-ansible rolling_update from mimic to nautilus,
and some of my PGs got stuck in 'active+undersized+remapped+backfilling'
with no progress. All OSDs were up and in (see the ceph osd tree output
below). The PGs had only 2 OSDs in the acting set, yet 3 in the up set. I
don't understand how the acting set can have two and the up set can have
three; if anything, wouldn't the up set be a subset of the acting set?
Anyway, I noticed that the ansible rolling_update set the flags 'noout'
and 'norebalance'. The PG query showed OSD 0 (which was missing from the
acting set) as the backfill target, and "waiting_on_backfill" was empty,
so I'm very confused.
So it wants to backfill OSD 0, and judging by the empty waiting_on_backfill
list it isn't blocked, so what's holding it up? Why is OSD 0 not in the
acting set? (What is the precise definition of acting vs. up?)

ceph osd tree
ID CLASS WEIGHT  TYPE NAME         STATUS REWEIGHT PRI-AFF
-1       0.08817 root default
-5       0.02939     host hostosd1
 0   hdd 0.00980         osd.0         up  1.00000 1.00000
 4   hdd 0.00980         osd.4         up  1.00000 1.00000
 7   hdd 0.00980         osd.7         up  1.00000 1.00000
-3       0.02939     host hostosd2
 1   hdd 0.00980         osd.1         up  1.00000 1.00000
 3   hdd 0.00980         osd.3         up  1.00000 1.00000
 6   hdd 0.00980         osd.6         up  1.00000 1.00000
-7       0.02939     host hostosd3
 2   hdd 0.00980         osd.2         up  1.00000 1.00000
 5   hdd 0.00980         osd.5         up  1.00000 1.00000
 8   hdd 0.00980         osd.8         up  1.00000 1.00000
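
A minimal sketch for checking how the sets reported for PG 1.35 further down
spread across hosts. The OSD-to-host mapping is typed in by hand from the
tree above, and the name hosts_for is just an illustrative helper, not
anything provided by Ceph.

```python
# OSD-to-host mapping copied by hand from the `ceph osd tree` output above.
OSD_HOST = {
    0: "hostosd1", 4: "hostosd1", 7: "hostosd1",
    1: "hostosd2", 3: "hostosd2", 6: "hostosd2",
    2: "hostosd3", 5: "hostosd3", 8: "hostosd3",
}


def hosts_for(osds):
    """Return the sorted list of distinct hosts holding the given OSD ids."""
    return sorted({OSD_HOST[osd] for osd in osds})


up = [5, 6, 0]    # up set reported for PG 1.35
acting = [5, 6]   # acting set reported for PG 1.35

print("up hosts:    ", hosts_for(up))
print("acting hosts:", hosts_for(acting))
```

With these values the up set covers all three hosts, while the acting set
covers only hostosd2 and hostosd3; the copy that is missing is the one on
hostosd1 (osd.0).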


PG Info
1.35 3 0 0 0 0 8388623 0 0 3045 3045 active+undersized+remapped+backfilling 2019-05-09 16:18:02.513033 50'107145 50:108127 [5,6,0] 5 [5,6]
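
To make the mismatch concrete, here is a minimal sketch that compares the up
and acting sets and checks them against backfill_targets. The JSON is an
abbreviated, hand-wrapped fragment of the PG query output that follows; the
helper names are hypothetical, and the sketch assumes backfill targets are
plain OSD ids, as they appear in this output.

```python
import json

# Abbreviated fragment of the PG query output, wrapped in braces so that it
# parses as standalone JSON. Only the fields needed here are kept.
PG_QUERY = json.loads("""
{
  "state": "active+undersized+remapped+backfilling",
  "up": [5, 6, 0],
  "acting": [5, 6],
  "backfill_targets": ["0"]
}
""")


def pending_osds(query):
    """OSDs that are in the up set but not in the acting set."""
    acting = set(query["acting"])
    return [osd for osd in query["up"] if osd not in acting]


def targets_match(query):
    """True if backfill_targets names exactly the OSDs missing from acting."""
    # Assumption: targets are plain OSD ids given as strings, e.g. "0".
    targets = sorted(int(t) for t in query["backfill_targets"])
    return targets == sorted(pending_osds(query))


print(pending_osds(PG_QUERY))    # [0]
print(targets_match(PG_QUERY))   # True
```

For this PG the only OSD in up but not in acting is 0, and it is also the
only backfill target, so the two views of the PG agree with each other.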

PG Query
"state": "active+undersized+remapped+backfilling",
    "snap_trimq": "[]",
    "snap_trimq_len": 0,
    "epoch": 50,
    "up": [
        5,
        6,
        0
    ],
    "acting": [
        5,
        6
    ],
    "backfill_targets": [
        "0"
    ],
    "acting_recovery_backfill": [
        "0",
        "5",
        "6"
    ]

...

"waiting_on_backfill": [],
                "last_backfill_started": "MAX",
                "backfill_info": {
                    "begin": "MAX",
                    "end": "MAX",
                    "objects": []
                },
                "peer_backfill_info": [
                    "0",
                    {
                        "begin": "MAX",
                        "end": "MAX",
                        "objects": []
                    }
                ],
                "backfills_in_flight": [],
                "recovering": [],