Quoting Janne Johansson (icepic...@gmail.com):
> Yes, when you add a drive (or 10), some PGs decide they should have one or
> more replicas on the new drives. A new, empty copy of the PG is created
> there, and that is what puts the PG into "degraded" mode: where it had 3
> fine active+clean replicas before, it now has 2 active+clean and one that
> needs backfill to get into shape.
> 
> It is a slight mistake to report this the same way as an error, even
> though to the cluster it looks just as if something were broken and needed
> fixing. It gives new ceph admins a sense of urgency or danger, whereas
> adding space to a cluster should be perfectly normal. Ceph could also have
> chosen to add a fourth replica to a repl=3 PG, fill the new empty copy
> from the one that is going away, and so keep 3 working replicas the whole
> time; instead it first discards one replica and then backfills into the
> empty one, which leads to this kind of "error" report.

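For anyone hitting this later: the sequence described above is easy to
follow while it happens. The affected PGs go from active+clean through
states such as active+undersized+degraded+backfill_wait and backfilling,
and back to active+clean once backfill completes. Exact state names and
output vary by release, but roughly:

    ceph -s                   # overall health and degraded object count
    ceph pg dump pgs_brief    # per-PG state (backfill_wait, backfilling, ...)
    ceph osd df               # how data is spreading onto the new OSDs
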
Thanks for the explanation. I agree with you that it would be safer to
backfill to the new copy first instead of assuming the new OSD will be
fine and discarding a perfectly healthy copy right away. We do have
max_size 3 in the CRUSH ruleset ... I wonder if Ceph would behave
differently if we had max_size 4, so that a fourth copy were actually
allowed in the first place ...
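
For reference, the min_size/max_size in question are fields of the CRUSH
rule itself. A typical replicated rule, decompiled with crushtool (names
here are the defaults; ours differs only in max_size), looks roughly like:

    rule replicated_ruleset {
            ruleset 0
            type replicated
            min_size 1
            max_size 3
            step take default
            step chooseleaf firstn 0 type host
            step emit
    }

It can also be inspected without decompiling via "ceph osd crush rule dump",
and changed with the usual getcrushmap / crushtool -d / edit / crushtool -c /
setcrushmap round trip. Whether a max_size of 4 would actually change the
backfill behaviour I don't know; as far as I can tell these fields mainly
gate which pool sizes the rule may be used for.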

Gr. Stefan

-- 
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / i...@bit.nl