Hi Piotr,

Thanks for your answer! I've set nodown and now no OSDs get marked down
anymore :)
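
(For completeness, that was just the standard flag command; the second line is
only there to double-check the flag actually shows up:)

    ceph osd set nodown            # stop the mons from marking OSDs down
    ceph osd dump | grep flags     # "nodown" should now be listed among the flags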

Any tips on when everything is recovered/backfilled and it's time to unset the
nodown flag? Should I shut down all activity to the Ceph cluster before that
moment?

My thinking: if I unset the nodown flag and suddenly a lot of OSDs are flagged
down, it would be better to have no client activity at all, so that when the
OSDs come back up there is nothing left to do (no pending recovery/backfilling).
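
For what it's worth, the sequence I have in mind looks roughly like this
(untested sketch -- the "pause" flag to stop client I/O is just my own idea,
not something you suggested):

    ceph osd set pause        # stop client reads/writes during the switch
    ceph -s                   # wait until recovery/backfill has settled
    ceph osd unset nodown     # let the mons mark the flaky OSDs down again
    ceph osd unset pause      # resume client I/O once things look stable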

Kind regards,
Caspar

2018-06-07 8:47 GMT+02:00 Piotr Dałek <piotr.da...@corp.ovh.com>:

> On 18-06-06 09:29 PM, Caspar Smit wrote:
>
>> Hi all,
>>
>> We have a Luminous 12.2.2 cluster with 3 nodes, and I recently added a
>> node to it.
>>
>> osd-max-backfills is at the default of 1, so backfilling didn't go very
>> fast, but that doesn't matter.
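>>
>> (For reference, it could be bumped at runtime with something along these
>> lines -- I just left it at the default, so this is purely illustrative:)
>>
>>     ceph tell osd.* injectargs '--osd-max-backfills 2'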
>>
>> Once it started backfilling everything looked ok:
>>
>> ~300 pgs in backfill_wait
>> ~10 pgs backfilling (roughly the number of new OSDs)
>>
>> But I noticed the number of degraded objects increasing a lot. I presume a
>> PG in backfill_wait state doesn't accept any new writes anymore, hence the
>> increasing degraded object count?
>>
>> So far so good, but once in a while I noticed a random OSD flapping (they
>> come back up automatically). This isn't because the disks are saturated but
>> because of a driver/controller/kernel incompatibility that 'hangs' the disk
>> for a short time (SCSI abort_task errors in syslog). Investigating further,
>> I noticed this was already happening before the node expansion.
>> These flapping OSDs result in a lot of PG states that are a bit
>> worrying:
>>
>>               109 active+remapped+backfill_wait
>>               80  active+undersized+degraded+remapped+backfill_wait
>>               51  active+recovery_wait+degraded+remapped
>>               41  active+recovery_wait+degraded
>>               27  active+recovery_wait+undersized+degraded+remapped
>>               14  active+undersized+remapped+backfill_wait
>>               4   active+undersized+degraded+remapped+backfilling
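>>
>> (For reference, a breakdown like that can be pulled out of "ceph pg dump"
>> with something along these lines -- treat it as a rough one-liner:)
>>
>>     ceph pg dump pgs_brief 2>/dev/null | awk 'NR>1 {print $2}' | sort | uniq -c | sort -rn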
>>
>> I think the recovery_wait is more important than the backfill_wait, so I'd
>> like to prioritize those, because the recovery_wait was triggered by the
>> flapping OSDs.
>>
>> Furthermore, should the undersized ones get absolute priority, or is that
>> already the case?
>>
>> I was thinking about setting "nobackfill" to prioritize recovery over
>> backfilling.
>> Would that help in this situation, or would I be making things even worse?
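>>
>> (Concretely I mean something like the following -- untested, and I'm unsure
>> about the side effects:)
>>
>>     ceph osd set nobackfill      # pause backfill so recovery gets the disk I/O
>>     ceph osd unset nobackfill    # re-enable backfill once recovery has caught up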
>>
>> PS: I tried increasing the heartbeat values for the OSDs, to no avail; they
>> still get flagged down once in a while after a driver hiccup.
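>>
>> (Roughly along these lines -- osd_heartbeat_grace is the main knob I mean,
>> and the value is only an example:)
>>
>>     ceph tell osd.* injectargs '--osd_heartbeat_grace 35'
>>     # the mons consult the same option when judging failure reports, so it
>>     # presumably needs to be raised on their side as well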
>>
>
> First of all, use the "nodown" flag so OSDs won't be marked down
> automatically, and unset it once everything backfills/recovers and settles
> for good -- note that there might be lingering OSD down reports, so unsetting
> nodown might cause some of the problematic OSDs to be instantly marked down.
>
> Second, since Luminous you can use "ceph pg force-recovery" to ask
> particular PGs to recover first, even if there are other PGs waiting to
> backfill and/or recover.
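>
> For example, something along these lines (the one-liner is an untested
> sketch, adjust as needed):
>
>     ceph pg force-recovery <pgid> [<pgid> ...]
>     # or, for everything currently stuck in recovery_wait:
>     ceph pg dump pgs_brief 2>/dev/null | awk '/recovery_wait/ {print $1}' | \
>         xargs ceph pg force-recovery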
>
> --
> Piotr Dałek
> piotr.da...@corp.ovh.com
> https://www.ovhcloud.com