I noticed that on my clusters, even a short 5-minute network outage or a
single-host reboot can cause

    pgs: 5586988/366684639 objects misplaced (1.524%)

which at a recovery speed of

    recovery: 2.2 GiB/s, 676 objects/s

can take hours to recover.
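
As a rough back-of-the-envelope check of the "hours" figure, here is a small sketch using only the numbers quoted above. It assumes the recovery rate stays constant at the reported value, which is a simplification (the rate typically fluctuates during recovery); the function and variable names are made up for illustration.

```python
# Figures copied from the status output quoted above.
misplaced_objects = 5_586_988
total_objects = 366_684_639
recovery_objects_per_s = 676
recovery_gib_per_s = 2.2


def estimate_recovery_hours(objects: int, objects_per_s: float) -> float:
    """Hours needed to move `objects` at a constant rate (an assumption)."""
    return objects / objects_per_s / 3600


# Share of all objects that is misplaced, in percent.
misplaced_pct = 100 * misplaced_objects / total_objects

# Estimated time to recover, assuming the rate does not change.
hours = estimate_recovery_hours(misplaced_objects, recovery_objects_per_s)

# Average object size implied by the two reported recovery rates.
avg_object_mib = recovery_gib_per_s * 1024 / recovery_objects_per_s

print(f"misplaced: {misplaced_pct:.3f}%")
print(f"estimated recovery time: {hours:.1f} h")
print(f"implied average object size: {avg_object_mib:.2f} MiB")
```

So the object rate, not the byte rate, is what puts the estimate in the range of hours.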
I don't understand how this can be. If a host is down for such a short time,
how can rebalancing take this long?
I'm using Ceph 19.2.2 on HDDs, with SSDs as the BlueStore "db" device.
Could it be that new files are written linearly to the HDDs (fast), while
recovery seeks around on the HDDs in random order (slow)?
In any case, this asymmetry is quite annoying.
Can anything be done about it?
Thanks!
Niklas