[ceph-users] Rocksdb compaction and OSD timeout

J-P Methot Thu, 07 Sep 2023 00:07:07 -0700

Hi,

We're running the latest Pacific release on our production cluster and we've been seeing the dreaded "'OSD::osd_op_tp thread 0x7f346aa64700' had timed out after 15.000000954s" error. We have reason to believe this happens each time the RocksDB compaction process is launched on an OSD. My question is: when the cluster detects that an OSD has timed out, does that interrupt the compaction process? This seems to be what's happening, but it's not immediately obvious. We are currently facing an endless loop of random OSDs timing out, and if the compaction process is interrupted before it finishes, that may explain it.


--
Jean-Philippe Méthot
Senior Openstack system administrator
Administrateur système Openstack sénior
PlanetHoster inc.
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
