Hello fellow CEPH-users,
We are currently investigating latency spikes in our Ceph (14.2.11) production cluster, which usually occur under heavy load. TL;DR: Do you have an idea where to investigate kv commit latency spikes on a Ceph cluster with an LSI 9300-8i HBA and all-SSD (Intel, Micron) OSDs?
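(For context on how we quantify the spikes: something like the following samples the BlueStore latency counters of a suspect OSD via the admin socket. osd.31 is just an example, and the counter names are taken from our 14.2.11 perf dump, so treat them as an assumption for other releases.)
```
# Sample the BlueStore kv/commit latency counters of one OSD (osd.31 as example).
# Each latency counter exposes avgcount / sum / avgtime, so sampling before and
# after a load test shows how much the average is dragged up by the spikes.
ceph daemon osd.31 perf dump | grep -A 5 -E '"kv_sync_lat"|"kv_commit_lat"|"commit_lat"'

# Optionally reset the counters first so avgtime only covers the test window:
ceph daemon osd.31 perf reset all
```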


The cluster consists of 3 MDS nodes (2 active + 1 standby-replay), 3 MON nodes (each running a MGR + MON daemon) and 4 OSD nodes (each with 8 SSD BlueStore OSDs). All nodes run Ubuntu 18.04 with kernel 5.4 (two OSD servers are still on 4.15, but the spikes are seen on all of the OSD servers).

As the spikes seem to be randomly distributed across time (under load) and OSDs, we traced them and found the following messages on the OSD nodes:
```
bluestore(/var/lib/ceph/osd/ceph-31) log_latency slow operation observed for kv_sync, latency = 5.22298s
bluestore(/var/lib/ceph/osd/ceph-31) log_latency_fn slow operation observed for _txc_committed_kv, latency = 5.5732s, txc = 0x55b1a98d9e00
...
bluestore(/var/lib/ceph/osd/ceph-31) log_latency_fn slow operation observed for _txc_committed_kv, latency = 5.50842s, txc = 0x55b1aa197800
bluestore(/var/lib/ceph/osd/ceph-31) log_latency_fn slow operation observed for _txc_committed_kv, latency = 5.5058s, txc = 0x55b1b7e75c00
```
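As far as we can tell, these warnings are printed whenever an operation exceeds bluestore_log_op_age (5 seconds by default, if we read the option correctly), which lines up with the ~5.5 s values above. The threshold can be checked like this:
```
# Assumption: bluestore_log_op_age is the threshold behind the
# "slow operation observed" messages (default should be 5 seconds on 14.2.x).
ceph config get osd bluestore_log_op_age
# or directly on the OSD via the admin socket:
ceph daemon osd.31 config get bluestore_log_op_age
```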

We found temporally correlated kernel messages, which suggest that it might have something to do with the underlying SSDs:
```
kernel: [3613612.312027] sd 4:0:10:0: attempting task abort!scmd(0x00000000dac86408), outstanding for 31384 ms & timeout 30000 ms
kernel: [3613612.312034] sd 4:0:10:0: [sdg] tag#744 CDB: Write(10) 2a 00 be 11 b8 80 00 00 08 00
kernel: [3613612.312036] scsi target4:0:10: handle(0x0013), sas_address(0x4433221104000000), phy(4)
kernel: [3613612.312038] scsi target4:0:10: enclosure logical id(0x500605b00e70a7b0), slot(7)
kernel: [3613612.312039] scsi target4:0:10: enclosure level(0x0000), connector name( )
kernel: [3613612.312040] sd 4:0:10:0: No reference found at driver, assuming scmd(0x00000000dac86408) might have completed
kernel: [3613612.312042] sd 4:0:10:0: task abort: SUCCESS scmd(0x00000000dac86408)
```
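For anyone who wants to look at the same spot: the affected drive can be inspected right after such an event, e.g. with smartctl (sdg is the device from the messages above; the -d sat switch may be needed because the SATA SSDs sit behind the SAS HBA):
```
# Extended SMART/device statistics for the drive named in the kernel messages.
# /dev/sdg is taken from the log above; -d sat is often required for SATA
# drives attached to a SAS HBA.
smartctl -x /dev/sdg
smartctl -x -d sat /dev/sdg
```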

There are many of these blocks, until the kernel apparently has had enough of them and just resets the device/interface:
```
kernel: [3613612.312267] sd 4:0:10:0: attempting task abort!scmd(0x00000000d7aaff5a), outstanding for 31388 ms & timeout 30000 ms
kernel: [3613612.312269] sd 4:0:10:0: [sdg] tag#520 CDB: Write(10) 2a 00 be 11 b4 e0 00 00 08 00
kernel: [3613612.312269] scsi target4:0:10: handle(0x0013), sas_address(0x4433221104000000), phy(4)
kernel: [3613612.312270] scsi target4:0:10: enclosure logical id(0x500605b00e70a7b0), slot(7)
kernel: [3613612.312271] scsi target4:0:10: enclosure level(0x0000), connector name( )
kernel: [3613612.312272] sd 4:0:10:0: No reference found at driver, assuming scmd(0x00000000d7aaff5a) might have completed
kernel: [3613612.312273] sd 4:0:10:0: task abort: SUCCESS scmd(0x00000000d7aaff5a)
kernel: [3613612.653004] sd 4:0:10:0: Power-on or device reset occurred
kernel: [3613613.254064] sd 4:0:10:0: Power-on or device reset occurred
```
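Something like the following gives a feel for how often this happens and whether it clusters on particular targets (dmesg -T just adds readable timestamps):
```
# Count task aborts and device resets per SCSI target since boot
dmesg -T | grep 'device reset occurred' | grep -oE 'sd [0-9]+:[0-9]+:[0-9]+:[0-9]+' | sort | uniq -c
dmesg -T | grep 'attempting task abort' | grep -oE 'sd [0-9]+:[0-9]+:[0-9]+:[0-9]+' | sort | uniq -c
```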

The OSD nodes are equipped with an "LSI 9300-8i SAS HBA", and we use two types of SSDs: "Intel SSD D3-S4510 Series 1.92 TB" and "Micron 5210 ION 1.92 TB". Since the resets happen on both SSD models, we figured the least common denominator is the HBA, so we upgraded one OSD node to the latest firmware/BIOS. Sadly, this did not solve the issue.
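In case it is relevant, the driver and firmware versions can be cross-checked roughly like this (sas3flash is Broadcom's flash utility for SAS3 HBAs; its presence on the node is an assumption):
```
# Loaded mpt3sas driver version
modinfo mpt3sas | grep -i '^version'
# Firmware/BIOS versions as seen by the kernel at probe time
dmesg | grep -i 'mpt3sas' | grep -iE 'fwversion|bios'
# Firmware/BIOS versions as reported by the flash utility (if installed)
sas3flash -listall
```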


* Does anyone have a similar hardware configuration and issues with it?
* Do you have an idea what could cause this behaviour?
* Or which part should we investigate further?

Thanks for your hints and time reading :)
M