It’s not that the limit is *ignored*; sometimes the failure of the subtree
isn’t *detected*. E.g., I’ve seen this happen when a node experienced kernel
weirdness or OOM conditions such that the OSDs didn’t all get marked down at
the same time, so the PGs all started recovering. Admittedly it’s b
Thanks for the reply Anthony.
Those are all considerations I am very much aware of. I'm very curious about
this though:
> mon_osd_down_out_subtree_limit. There are cases where it doesn’t kick in and
> a whole node will attempt to rebalance
In what cases is the limit ignored? Do these excepti
Hi Robert,
thanks for looking at this. The explanation is a different one though.
Today I added disks to the second server that was in exactly the same state as
the other one reported below. I used this opportunity to do a modified reboot +
OSD adding sequence.
To recall the situation, I added