Hi all,
We have a Luminous 12.2.2 cluster with 3 nodes and I recently added a node
to it.
osd_max_backfills is at its default of 1, so backfilling didn't go very
fast, but that doesn't matter here.
Once it started backfilling, everything looked OK:
~300 PGs in backfill_wait
~10 PGs backfilling (roughly the number of new OSDs)
But I noticed the number of degraded objects increasing a lot. I presume a
PG in backfill_wait state no longer accepts new writes? Hence the
increasing degraded object count?
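(For what it's worth, I'm watching this via the standard status output,
nothing cluster-specific assumed here:)

  ceph -s | grep -E 'degraded|backfill'
  ceph health detail | head -n 20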
So far so good, but once in a while I noticed a random OSD flapping (they
come back up automatically). This isn't because the disk is saturated, but a
driver/controller/kernel incompatibility which 'hangs' the disk for a short
time (scsi abort_task errors in syslog). Investigating further, I noticed
this was already happening before the node expansion.
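(This is roughly how I spot the hangs; log paths will differ per distro:)

  # look for the scsi task aborts around the time an OSD gets marked down
  grep -i abort_task /var/log/syslog
  dmesg -T | grep -iE 'scsi|abort'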
These flapping OSDs result in lots of PG states which are a bit worrying:
109 active+remapped+backfill_wait
80 active+undersized+degraded+remapped+backfill_wait
51 active+recovery_wait+degraded+remapped
41 active+recovery_wait+degraded
27 active+recovery_wait+undersized+degraded+remapped
14 active+undersized+remapped+backfill_wait
4 active+undersized+degraded+remapped+backfilling
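(I generated the breakdown above with something like the following; the
column positions may differ between Ceph versions:)

  # count PGs per state; pgs_brief prints state in column 2
  ceph pg dump pgs_brief 2>/dev/null | awk 'NR>1 {print $2}' | sort | uniq -c | sort -rn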
I think the recovery_wait is more important than the backfill_wait, so I'd
like to prioritize those, since the recovery_wait was triggered by the
flapping OSDs.
Furthermore, shouldn't the undersized ones get absolute priority, or is
that already the case?
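(I know there are force-recovery/force-backfill commands; I believe they
landed in Luminous, so this assumes my 12.2.2 build has them. The PG id
below is just a placeholder:)

  # push one PG to the front of the recovery queue
  ceph pg force-recovery 1.2a3
  # and undo it again
  ceph pg cancel-force-recovery 1.2a3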
I was thinking about setting "nobackfill" to prioritize recovery over
backfilling.
Would that help in this situation? Or would I be making it even worse?
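(Concretely, I mean toggling the cluster flag and clearing it once the
recovery queue drains:)

  ceph osd set nobackfill
  # ... wait for the recovery_wait/degraded PGs to clear ...
  ceph osd unset nobackfill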
PS: I tried increasing the heartbeat values for the OSDs to no avail; they
still get flagged as down once in a while after a hiccup of the driver.
I've injected the following settings into all OSDs and MONs:
osd heartbeat interval 18 (default = 6)
osd heartbeat grace 60 (default = 20)
osd mon heartbeat interval 60 (default = 30)
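(For reference, I injected them roughly like this from an admin node; I set
the grace on the MONs too since I believe they consult it as well:)

  ceph tell osd.* injectargs '--osd-heartbeat-interval 18 --osd-heartbeat-grace 60 --osd-mon-heartbeat-interval 60'
  ceph tell mon.* injectargs '--osd-heartbeat-grace 60'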
Am I adjusting the right settings, or are there other settings to increase
the heartbeat grace?
Do these settings require a restart of the daemons, or is injecting them
sufficient?
PS2: The drives which are flapping are Seagate Enterprise Capacity 10TB
SATA 7.2k disks, model number ST10000NM0086. Are these drives notorious for
this behaviour? Does anyone have experience with these drives in a Ceph
environment?
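(This is how I'm checking the suspect drives; /dev/sdX is a placeholder for
the actual device:)

  smartctl -i /dev/sdX   # model and firmware revision
  smartctl -a /dev/sdX   # full SMART/health report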
Kind regards,
Caspar Smit