Quoting Sean Matheny (s.math...@auckland.ac.nz):

> I tested this out by setting norebalance and norecover, moving the host
> buckets under the rack buckets (all of them), and then unsetting. Ceph
> starts melting down with escalating slow requests, even with backfill
> and recovery parameters set to throttle. I moved the host buckets back
> to the default root bucket, and things mostly came right, but I still
> had some inactive/unknown PGs that I had to restart some OSDs to get
> back to HEALTH_OK.
>
> I'm sure there's a way you can tune things or fade in CRUSH weights or
> something, but I'm happy just moving one at a time.
For big changes like this you can use Dan's upmap trick:

https://www.slideshare.net/Inktank_Ceph/ceph-day-berlin-mastering-ceph-operations-upmap-and-the-mgr-balancer

Python script:

https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py

This way you can pause the process, or get back to HEALTH_OK whenever you
want to.

Gr. Stefan

-- 
| BIT BV  https://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                    +31 318 648 688 / i...@bit.nl
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
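P.S. For anyone finding this in the archives: below is my rough sketch of
how the upmap workflow goes. Host/rack names are made up, and you should
verify the exact steps against the linked talk and script for your Ceph
version. Note that pg-upmap requires `min-compat-client luminous` or newer.

```shell
# 0. One-time prerequisite: upmap needs Luminous-or-newer clients.
ceph osd set-require-min-compat-client luminous

# 1. Stop data movement while the CRUSH map is being edited.
ceph osd set norebalance
ceph osd set nobackfill

# 2. Make the big CRUSH change, e.g. move a host bucket under a rack
#    bucket (hypothetical names).
ceph osd crush move host1 rack=rack1

# 3. Run upmap-remapped.py: it emits pg-upmap-items commands that pin
#    every remapped PG back to the OSDs it currently lives on, so the
#    cluster returns to HEALTH_OK with no pending backfill.
./upmap-remapped.py | sh

# 4. Unset the flags. Nothing moves yet, because the upmap entries
#    cancel out the remapping.
ceph osd unset nobackfill
ceph osd unset norebalance

# 5. Let the mgr balancer remove the upmap entries a few at a time,
#    so data migrates at a controlled pace. Pause whenever you like
#    with 'ceph balancer off'.
ceph balancer mode upmap
ceph balancer on
```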