Dear all,

We have a Ceph cluster with several nodes, each containing 4-6 OSDs. We
run the OS off a USB drive to maximise the use of the drive bays for the
OSDs, and so far everything has been running fine.

Occasionally the OS on the USB drive fails. In that case we normally
replace the drive with a pre-configured one that has a similar OS and Ceph
already installed, so that when the new OS boots up it automatically
detects all the OSDs and starts them. This works fine without any issues.

However, the issue is with recovery. When a node goes down, all of its
OSDs go down with it, and recovery starts moving the PG replicas from the
affected OSDs to other available OSDs, leaving the cluster degraded at,
say, 5%, which is expected. But when we boot the failed node with a new OS
and bring its OSDs back up, more PGs are scheduled for backfilling, and
instead of dropping, the degradation level shoots back up to, for example,
10%, and on some occasions as high as 19%.

In our experience, when a node goes down the cluster degrades to around 5%
and recovery starts, but once we bring the node back up (still running the
same OS), the degradation level drops to below 1% and recovery completes
much faster.

Why doesn't the same behaviour apply in the situation above? The OSD
numbers are the same when the node boots up, and the CRUSH map weight
values are also the same. Only the hostname is different.
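
For reference, the check we do amounts to something like the sketch below
(just an illustration, assuming the standard `ceph osd tree --format json`
output with a "nodes" list): it prints the OSD IDs and CRUSH weights under
each host bucket, so the entries for the old and new hostname can be
compared side by side.

    #!/usr/bin/env python
    # Sketch: dump the CRUSH tree and list, per host bucket, the OSD IDs
    # and their CRUSH weights, so host entries can be compared after a
    # node comes back with a different hostname.
    import json
    import subprocess

    def crush_tree():
        # 'ceph osd tree --format json' returns {"nodes": [...], "stray": [...]}
        out = subprocess.check_output(["ceph", "osd", "tree", "--format", "json"])
        return json.loads(out)

    def main():
        tree = crush_tree()
        nodes = {n["id"]: n for n in tree["nodes"]}
        for node in tree["nodes"]:
            if node.get("type") != "host":
                continue
            osds = [nodes[c] for c in node.get("children", []) if c in nodes]
            weights = ", ".join(
                "osd.%d=%.4f" % (o["id"], o.get("crush_weight", 0.0)) for o in osds
            )
            print("%s: %s" % (node["name"], weights))

    if __name__ == "__main__":
        main()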

Any advice / suggestions?

Looking forward to your reply, thank you.

Cheers.