Quoting Sean Matheny (s.math...@auckland.ac.nz):

> I tested this out by setting norebalance and norecover, moving the host
> buckets under the rack buckets (all of them), and then unsetting. Ceph
> starts melting down with escalating slow requests, even with backfill
> and recovery parameters set to throttle. I moved the host buckets back
> to the default root bucket, and things mostly came right, but I still
> had some inactive/unknown PGs that I had to restart some OSDs to get
> back to HEALTH_OK.
>
> I'm sure there's a way you can tune things or fade in CRUSH weights or
> something, but I'm happy just moving one at a time.
For big changes like this you can use Dan's upmap trick:

https://www.slideshare.net/Inktank_Ceph/ceph-day-berlin-mastering-ceph-operations-upmap-and-the-mgr-balancer

Python script:

https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py

This way you can pause the process, or get back to HEALTH_OK whenever you
want to.

Gr. Stefan

-- 
| BIT BV  https://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                    +31 318 648 688 / i...@bit.nl
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
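P.S. For anyone finding this in the archives: below is my rough sketch of
how the upmap workflow goes. Host/rack names are made up, and you should
verify the exact steps against the linked talk and script for your Ceph
version. Note that pg-upmap requires `min-compat-client luminous` or newer.

```shell
# 0. One-time prerequisite: upmap needs Luminous-or-newer clients.
ceph osd set-require-min-compat-client luminous

# 1. Stop data movement while the CRUSH map is being edited.
ceph osd set norebalance
ceph osd set nobackfill

# 2. Make the big CRUSH change, e.g. move a host bucket under a rack
#    bucket (hypothetical names).
ceph osd crush move host1 rack=rack1

# 3. Run upmap-remapped.py: it emits pg-upmap-items commands that pin
#    every remapped PG back to the OSDs it currently lives on, so the
#    cluster returns to HEALTH_OK with no pending backfill.
./upmap-remapped.py | sh

# 4. Unset the flags. Nothing moves yet, because the upmap entries
#    cancel out the remapping.
ceph osd unset nobackfill
ceph osd unset norebalance

# 5. Let the mgr balancer remove the upmap entries a few at a time,
#    so data migrates at a controlled pace. Pause whenever you like
#    with 'ceph balancer off'.
ceph balancer mode upmap
ceph balancer on
```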