[ceph-users] Re: 16.2.10: ceph osd perf always shows high latency for a specific OSD

2022-10-07 Thread Zakhar Kirpichenko
Thanks for the suggestions, I will try this. /Z On Fri, 7 Oct 2022 at 18:13, Konstantin Shalygin wrote: > Zakhar, try looking at the top of the slow ops in the daemon socket for this OSD; you > may find 'snapc' operations, for example. From the rbd head you can find the rbd > image, and then check how many sna

[ceph-users] Re: 16.2.10: ceph osd perf always shows high latency for a specific OSD

2022-10-07 Thread Konstantin Shalygin
Zakhar, try looking at the top of the slow ops in the daemon socket for this OSD; you may find 'snapc' operations, for example. From the rbd head you can find the rbd image, and then check how many snapshots are in the chain for this image. More than 10 snaps for one image can increase client ops latency to tens millis
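A minimal sketch of the check suggested above, assuming the slow ops are dumped on the OSD host with `ceph daemon osd.<id> dump_historic_slow_ops` and that each op carries a `description` string in which the snap context appears as `snapc <seq>=[<id>,<id>,...]`. The key names (`Ops`/`ops`, `description`, `duration`), the regular expression, and the function names are assumptions for illustration and may need adjusting for your release.

```python
import json
import re
import subprocess

# Assumed snap context format inside an op description: "snapc <seq>=[<id>,<id>,...]"
# (ids are assumed to be printed in hex).
SNAPC_RE = re.compile(r"snapc\s+[0-9a-f]+=\[([0-9a-f,]*)\]")


def dump_slow_ops(osd_id):
    """Query the OSD admin socket; must run on the host where the OSD lives."""
    out = subprocess.run(
        ["ceph", "daemon", f"osd.{osd_id}", "dump_historic_slow_ops"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def snap_count(description):
    """Return the number of snapshot ids in a description, or None if it has no snapc."""
    match = SNAPC_RE.search(description)
    if match is None:
        return None
    return len([s for s in match.group(1).split(",") if s])


def ops_with_many_snaps(doc, more_than=10):
    """Return (snap_count, duration, description) for ops with more than `more_than` snaps."""
    # Assumption: the op list is stored under "Ops" or "ops" depending on the command.
    ops = doc.get("Ops") or doc.get("ops") or []
    hits = []
    for op in ops:
        desc = op.get("description", "")
        count = snap_count(desc)
        if count is not None and count > more_than:
            hits.append((count, op.get("duration"), desc))
    hits.sort(key=lambda h: h[0], reverse=True)
    return hits
```

For example, `ops_with_many_snaps(dump_slow_ops(12))` would list the slow ops on a hypothetical osd.12 whose snapshot chain is longer than 10. The object name in each description (for instance an `rbd_data.<id>` prefix) can then be matched to an image, and `rbd snap ls` run against that image to confirm the number of snapshots.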

[ceph-users] Re: 16.2.10: ceph osd perf always shows high latency for a specific OSD

2022-10-07 Thread Eugen Block
Hi, I’d look for deep-scrubs on that OSD, those are logged, maybe those timestamps match your observations. Quoting Zakhar Kirpichenko: Thanks for this! The drive doesn't show increased utilization on average, but it does sporadically get more I/O than other drives, usually in short bur

[ceph-users] Re: 16.2.10: ceph osd perf always shows high latency for a specific OSD

2022-10-07 Thread Zakhar Kirpichenko
Thanks for this! The drive doesn't show increased utilization on average, but it does sporadically get more I/O than other drives, usually in short bursts. I am now trying to find a way to trace this to a specific PG, pool and object(s) – not sure if that is possible. /Z On Fri, 7 Oct 2022, 12:

[ceph-users] Re: 16.2.10: ceph osd perf always shows high latency for a specific OSD

2022-10-07 Thread Dan van der Ster
Hi Zakhar, I can back up what Konstantin has reported -- we occasionally have HDDs performing very slowly even though all SMART tests come back clean. Besides `ceph osd perf` showing a high latency, you could see a high %util with iostat. We normally replace those HDDs -- usually by draining and ze

[ceph-users] Re: 16.2.10: ceph osd perf always shows high latency for a specific OSD

2022-10-07 Thread Zakhar Kirpichenko
Unfortunately, that isn't the case: the drive is perfectly healthy and, according to all measurements I did on the host itself, it isn't any different from any other drive on that host size-, health- or performance-wise. The only difference I noticed is that this drive sporadically does more I/O t

[ceph-users] Re: 16.2.10: ceph osd perf always shows high latency for a specific OSD

2022-10-06 Thread Konstantin Shalygin
Hi, When the performance of one drive out of 100 is unusually different, this may mean 'this drive is not like the others' and should be replaced. k > On 7 Oct 2022, at 07:33, Zakhar Kirpichenko wrote: > > Anyone, please?

[ceph-users] Re: 16.2.10: ceph osd perf always shows high latency for a specific OSD

2022-10-06 Thread Zakhar Kirpichenko
Anyone, please? On Thu, 6 Oct 2022 at 14:57, Zakhar Kirpichenko wrote: > Hi, > > I'm having a peculiar "issue" in my cluster, which I'm not sure whether > it's real: a particular OSD always shows significant latency in `ceph osd > perf` report, an order of magnitude higher than any other OSD. >
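The observation in the original message (one OSD reporting latency an order of magnitude above the rest) can be checked mechanically. The sketch below is an illustration only: it assumes the JSON form of `ceph osd perf -f json` contains a list `osd_perf_infos`, either at the top level or nested under `osdstats`, with per-OSD `perf_stats` fields `commit_latency_ms` and `apply_latency_ms`. The function names, the 10x factor, and the 1 ms floor are hypothetical choices, not anything prescribed by Ceph.

```python
import json
import statistics
import subprocess


def load_osd_perf():
    """Run `ceph osd perf -f json` and return the parsed document."""
    out = subprocess.run(
        ["ceph", "osd", "perf", "-f", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def perf_infos(doc):
    # Assumption: some releases nest the list under "osdstats",
    # others expose "osd_perf_infos" at the top level.
    return doc.get("osdstats", doc).get("osd_perf_infos", [])


def latency_outliers(doc, factor=10.0, field="commit_latency_ms"):
    """Return (osd_id, latency_ms) pairs whose latency exceeds factor * median."""
    infos = perf_infos(doc)
    values = [info["perf_stats"][field] for info in infos]
    if not values:
        return []
    # Floor the baseline at 1 ms so a mostly idle cluster does not flag every OSD.
    baseline = max(statistics.median(values), 1.0)
    return [
        (info["id"], info["perf_stats"][field])
        for info in infos
        if info["perf_stats"][field] > factor * baseline
    ]
```

Calling `latency_outliers(load_osd_perf())` a few times over several minutes shows whether the same OSD id keeps appearing. Since `ceph osd perf` reports a recent snapshot of latency, a single sample says little on its own.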