Hi everyone, I'm having a really nasty issue since around two days where our cluster report a bunch of SLOW_OPS on one of our OSD as:
https://paste.openstack.org/show/b3DkgnJDVx05vL5o4OmY/ Here is the cluster specification: * Used to store Openstack related data (VMs/Snaphots/Volumes/Swift). * Based on CEPH Nautilus 14.2.8 installed using ceph-ansible. * Use an EC based storage profile. * We have a separate and dedicated frontend and backend 10Gbps network. * We don't have any network issues observed or reported by our monitoring system. Here is our current cluster status: https://paste.openstack.org/show/biVnkm9Yyog3lmSUn0UK/ Here is a detailed view of our cluster status: https://paste.openstack.org/show/bgKCSVuow0JUZITo2Ndj/ My main issue here is that this health alert is starting to fill the Monitor's disk and so trigger a MON_DISK_BIG alert. I'm worried as I'm having a hard time to identify which osd operation is actually slow and especially, which image does it concern and which client is using it. So far I've try: * To match this client ID with any watcher of our stored volumes/vms/snaphots by extracting the whole list and then using the following command: *rbd status <pool>/<image>* Unfortunately none of the watchers is matching my reported client from the OSD on any pool. * * *To map this reported chunk of data to any of our store image using: *ceph osd map <pool>/rbd_data.5.89a4a940aba90b.00000000000000a0* Unfortunately any pool name existing within our cluster give me back an answer with no image information and a different watcher client ID. So my questions are: How can I identify which operation this OSD is trying to achieve as osd_op() is a bit large ^^ ? Does the *snapc *information part within the log relate to snapshot or is that something totally different? How can I identify the related images to this data chunk? Is there official documentation about SLOW_OPS operations code explaining how to read the logs like something that explains which block is PG number, which is the ID of something etc? Thanks a lot everyone and feel free to ask for additional information! G. _______________________________________________ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io