But the config reference says “high” is already the default value? 
(https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/)

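In case it helps, these commands show what the running OSDs are actually
using for osd_op_queue_cut_off (osd.0 is just an example ID; the daemon
command has to be run on the host where that OSD lives):

  ceph config get osd osd_op_queue_cut_off
  ceph daemon osd.0 config get osd_op_queue_cut_off

If I remember right, the default was changed from low to high in a release
after Nautilus, so a Nautilus cluster may still be running with low even
though the /en/latest/ docs show high.
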
This selects which priority ops will be sent to the strict queue versus the 
normal queue. The low setting sends all replication ops and higher to the 
strict queue, while the high option sends only replication acknowledgment ops 
and higher to the strict queue. Setting this to high should help when a few 
OSDs in the cluster are very busy, especially when combined with wpq in the 
osd_op_queue setting. Without these settings, OSDs that are very busy 
handling replication traffic could starve primary client traffic on those 
OSDs. Requires a restart.
Valid Choices: low, high
Default: high
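
A minimal sketch of applying it together with wpq (assuming the mon config
database is used and the OSDs are managed via systemd; adjust for your
deployment), followed by the restart the description calls for:

  ceph config set osd osd_op_queue wpq
  ceph config set osd osd_op_queue_cut_off high
  # then restart the OSDs, e.g. per host:
  systemctl restart ceph-osd.target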


> On Feb 9, 2021, at 4:42 AM, Milan Kupcevic <milan_kupce...@harvard.edu> wrote:
> 
> On 2/9/21 7:29 AM, Michal Strnad wrote:
>> 
>> we are looking for a proper solution for slow_ops. When a disk fails and
>> the node is restarted, a lot of slow operations appear. Even after the
>> disk (OSD) or the node comes back, most of the slow_ops are still there.
>> The only advice we found on the internet is to restart the monitor, but
>> that is not the right approach. Do you have a better solution? How do you
>> treat slow_ops in your production clusters?
>> 
>> We are running the latest nautilus on all clusters.
>> 
> 
> 
> 
> This config setting should help:
> 
> ceph config set osd osd_op_queue_cut_off high
> 
> 
> 
> -- 
> Milan Kupcevic
> Senior Cyberinfrastructure Engineer at Project NESE
> Harvard University
> FAS Research Computing

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io