[ceph-users] Re: Ceph OSD reported Slow operations

Eugen Block Wed, 01 Nov 2023 09:36:02 -0700

Hi,

for starters, please add more cluster details such as 'ceph status', 'ceph versions' and 'ceph osd df tree'. Increasing the network to 10G was the right thing to do; you don't get far with 1G under real cluster load. How are the OSDs configured (HDD only, SSD only, or HDD with RocksDB on SSD)? How is the disk utilization?
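
One way to gather the requested output in a single pass is sketched below. It is only a minimal illustration: the helper name `collect_diagnostics` is hypothetical, and it assumes the `ceph` CLI is on the PATH of a host with a working admin keyring. The three commands are the ones named above.

```python
import shutil
import subprocess

# The three commands requested in the reply above.
DIAGNOSTIC_COMMANDS = [
    ["ceph", "status"],
    ["ceph", "versions"],
    ["ceph", "osd", "df", "tree"],
]


def collect_diagnostics(commands=DIAGNOSTIC_COMMANDS):
    """Run each command and return {command string: output or error text}."""
    results = {}
    for cmd in commands:
        name = " ".join(cmd)
        # Skip cleanly if the binary is missing on this host.
        if shutil.which(cmd[0]) is None:
            results[name] = "(skipped: '%s' not found on this host)" % cmd[0]
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
        results[name] = proc.stdout if proc.returncode == 0 else proc.stderr
    return results


if __name__ == "__main__":
    for name, output in collect_diagnostics().items():
        print("### " + name)
        print(output)
```

The printed sections can be pasted into a reply as-is; disk utilization would still need to be checked separately on each OSD host.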


Regards,
Eugen

Quoting prab...@cdac.in:

We have a production setup of 36 OSDs (SAS disks) totalling 180 TB, allocated to a single Ceph cluster with 3 monitors and 3 managers. 830 volumes and VMs were created in OpenStack with Ceph as the backend. On Sep 21, users reported slowness in accessing the VMs. Analysing the logs led us to problems with the SAS disks, network congestion and the Ceph configuration (all default values were in use). We upgraded the network from 1 Gbps to 10 Gbps for both public and cluster networking. There was no change.
The Ceph benchmark showed that 28 of the 36 OSDs reported very low IOPS of 30 to 50, while the remaining ones showed 300+ IOPS.
We gradually started reducing the load on the Ceph cluster, and the volume count is now 650. The slow operations have gradually decreased, but I am aware that this is not a solution.
The Ceph configuration was updated by increasing
osd_journal_size to 10 GB and setting
osd_max_backfills = 1
osd_recovery_max_active = 1
osd_recovery_op_priority = 1
bluestore_cache_trim_max_skip_pinned=10000
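
For reference, the settings listed above could be applied at runtime roughly as follows. This is a dry-run sketch under stated assumptions: it assumes a release with the centralized config store (`ceph config set`), which the thread does not confirm; it assumes `osd_journal_size` is expressed in MB, so 10 GB is written as 10240; and the helper name `config_set_commands` is hypothetical. It only prints the commands and does not execute anything.

```python
# Settings as listed in the message above.
# Assumption: osd_journal_size is given in MB (10 GB -> 10240).
SETTINGS = {
    "osd_journal_size": "10240",
    "osd_max_backfills": "1",
    "osd_recovery_max_active": "1",
    "osd_recovery_op_priority": "1",
    "bluestore_cache_trim_max_skip_pinned": "10000",
}


def config_set_commands(settings, who="osd"):
    """Build one 'ceph config set' command line per option (dry run only)."""
    return ["ceph config set %s %s %s" % (who, key, value)
            for key, value in settings.items()]


if __name__ == "__main__":
    for line in config_set_commands(SETTINGS):
        print(line)
```

Printing rather than executing keeps the sketch safe to run on any host; the lines can be reviewed before being issued against a production cluster.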
Now, one month later, we have hit another issue: the Mgr daemons stopped on all 3 quorum nodes and 16 OSDs went down. We could not determine the reason from the ceph-mon and ceph-mgr logs. Please guide me, as it's a production setup.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


