Hi,

I got into a weird and unexpected situation today. I added 6 hosts to an existing Pacific cluster (16.2.13, 20 existing OSD hosts across 2 DCs). The new hosts were added directly underneath the root=default subtree; their designated location is one of the two datacenter buckets below that root. Nothing unusual so far, I believe many people use different subtrees to organize their clusters, as we do in ours (and I hadn't seen the issue described below until now).
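
To illustrate the intermediate state (host and datacenter names here are made up, only the structure matters): the new host buckets sat directly under the root, not under either datacenter bucket.

$ ceph osd tree
ID   CLASS  WEIGHT  TYPE NAME
-1           ...    root default
-25          ...        host new-host-1      <- new hosts, not yet in a DC
-26          ...        host new-host-2
...
-2           ...        datacenter dc1
-3           ...            host host01
...
-4           ...        datacenter dc2
...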

The main application is RGW; the main pool is erasure-coded (k=7, m=11). The crush rule looks like this:

rule rule-ec-k7m11 {
        id 1
        type erasure
        min_size 3
        max_size 18
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default class hdd
        step choose indep 2 type datacenter
        step chooseleaf indep 9 type host
        step emit
}
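
For what it's worth, one way to check what such a rule can actually map against the current tree is crushtool's test mode. This is just a sketch (the file name is a placeholder), not something I ran at the time:

# Export the compiled crush map and test rule 1 for 18 chunks (k=7 + m=11).
# Hosts sitting directly under the root are not reachable via
# "step choose indep 2 type datacenter", so short/bad mappings would show up here.
ceph osd getcrushmap -o crushmap.bin
crushtool -i crushmap.bin --test --rule 1 --num-rep 18 \
          --show-mappings --show-bad-mappings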

After almost all peering had finished, the status showed 6 inactive+peering PGs for a while. I had to fail the mgr because it no longer reported correct stats; after that it showed 16 unknown PGs. The application noticed the (unexpected) disruption. After moving the hosts into their designated crush bucket (datacenter) the situation resolved. But I can't make any sense of it: I tried to reproduce it in my lab environment (Quincy), but to no avail. In my tests it behaves as expected, after the new OSDs become active there are remapped PGs, but nothing else happens until I move them to their designated location.
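
Roughly what I did to get out of it (host and datacenter names are placeholders):

# Restart the stuck mgr and check for stuck PGs
ceph mgr fail                 # or "ceph mgr fail <active-mgr>"
ceph pg dump_stuck inactive
# Move each new host bucket into its designated datacenter
ceph osd crush move host21 datacenter=dc1
ceph osd crush move host22 datacenter=dc2
# ... and so on for the remaining hosts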

I know I could have prevented this either by setting osd_crush_initial_weight = 0, then moving the crush buckets into place and reweighting, or by creating the crush buckets in their designated location first, but usually I don't have to bother with these precautions.
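
Roughly what I mean by those two options (host names, OSD ids and weights are placeholders):

# Option 1: let new OSDs come up with zero crush weight, move the host
# buckets into their datacenter, then reweight
ceph config set osd osd_crush_initial_weight 0
# ... deploy the OSDs on the new hosts ...
ceph osd crush move host21 datacenter=dc1
ceph osd crush reweight osd.<id> <weight>    # per OSD, or reweight-subtree per host

# Option 2: create the host buckets in the right place before deploying,
# so the new OSDs register underneath them from the start
ceph osd crush add-bucket host21 host
ceph osd crush move host21 datacenter=dc1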

Does anyone have an explanation? I'd appreciate any comments.

Thanks!
Eugen