On 9/22/25 14:42, Niklas Hambüchen wrote:
>> Well, if they were away long enough to get "out", then it is somewhat
>> reasonable even for ~5m downtimes.
>
> Right, but what I'm saying is that this is not what happens.
> My reboot or disconnect takes < 5 minutes, and no OSD is `out` afterwards.
>
> When I say "down for 5 minutes", I literally mean that the node goes down,
> comes back up, and I'm sitting in front of its terminal observing that all
> OSDs are `up` and `in`.
>
> Of course your explanation of what happens if it's `out` makes sense, but
> that isn't my scenario; if Ceph had 10 hours to move data off, of course I
> would have to expect at least 10 hours to move data back on. But it only
> has 5 minutes at most to move data off.
Are we talking about a replicated pool or EC here?
And what is your failure domain?
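(If you're not sure, something like the below should show both; "replicated_rule" is just the default rule name here, use whatever `ceph osd pool ls detail` reports for your pool.)

  # pool type (replicated/erasure), size and crush rule per pool
  ceph osd pool ls detail

  # the "type" in the choose/chooseleaf step of the rule is the failure domain
  ceph osd crush rule dump replicated_rule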
What might give insight is the following command:
watch -n 5 ceph pg ls remapped
In the first columns you can see how many objects / how much data are missing
for each PG. Maybe note what the status is for a specific set of OSDs before
a reboot. That might give a clue as to what happens.
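For example, something along these lines (just a sketch, the file names are arbitrary):

  # before the reboot: record the remapped PGs and their object counts
  ceph pg ls remapped > pg_remapped_before.txt

  # ... reboot / disconnect the node, wait until all OSDs are up and in again ...

  # afterwards: record again and compare
  ceph pg ls remapped > pg_remapped_after.txt
  diff pg_remapped_before.txt pg_remapped_after.txt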
There is a difference in recovery strategy between replicated and EC
[1]. Not sure about backfill, but this might be handled in a similar
manner. In a replicated pool I would expect the time to backfill
misplaced objects to be roughly similar to the downtime. A bit longer,
actually, as there might be a few OSDs that have multiple PGs on them and
not enough backfill slots to do all of them in parallel (as was already
mentioned in this thread).
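If you suspect the backfill slots are the limiting factor, you could check and
temporarily raise them, roughly like this. (Sketch only: on recent releases
with the mClock scheduler this setting may be ignored unless you also switch
the recovery profile or set osd_mclock_override_recovery_settings.)

  # how many concurrent backfills each OSD will do
  ceph config get osd osd_max_backfills

  # temporarily allow more parallel backfills cluster-wide
  ceph config set osd osd_max_backfills 3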
Gr. Stefan
[1]: https://docs.ceph.com/en/latest/dev/osd_internals/log_based_pg/