Maybe my data is useful as a comparison? I have the Samsung SM863. Below is what I get from fio directly on the SSD[0] and from an RBD SSD pool with 3x replication[1]. I have also included a comparison with CephFS[3]. It would be nice if there were some sort of manual page describing the Ceph overhead one should generally expect.
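(For reference: the 4k runs below are 4k block size, queue depth 1, 180 s per job. A job file along these lines reproduces that kind of run -- a minimal sketch only, with the ioengine, target path and job layout as assumptions rather than the exact options used; the real run also contained more job groups than the two shown in the output:)

    [global]
    ioengine=libaio    ; assumption: any direct, queue-depth-1 engine should give comparable numbers
    direct=1
    bs=4k
    iodepth=1
    runtime=180
    time_based

    [randwrite-4k-seq]
    rw=randwrite
    filename=/dev/sdX  ; hypothetical target: the raw SSD, a mapped RBD image, or a file on CephFS

    [randread-4k-seq]
    stonewall          ; start only after the previous job has finished
    rw=randread
    filename=/dev/sdX

Point filename at the raw SSD, a mapped RBD image, or a file on CephFS depending on which case you want to measure.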
[0] direct

randwrite-4k-seq: (groupid=1, jobs=1): err= 0: pid=522903: Thu Sep 6 21:04:12 2018
  write: IOPS=17.9k, BW=69.8MiB/s (73.2MB/s)(12.3GiB/180001msec)
    slat (usec): min=4, max=333, avg= 9.94, stdev= 5.00
    clat (nsec): min=1141, max=1131.2k, avg=42560.69, stdev=9074.14
     lat (usec): min=35, max=1137, avg=52.80, stdev= 9.42
    clat percentiles (usec):
     |  1.00th=[   33],  5.00th=[   35], 10.00th=[   35], 20.00th=[   35],
     | 30.00th=[   36], 40.00th=[   36], 50.00th=[   41], 60.00th=[   43],
     | 70.00th=[   49], 80.00th=[   54], 90.00th=[   57], 95.00th=[   58],
     | 99.00th=[   60], 99.50th=[   62], 99.90th=[   67], 99.95th=[   70],
     | 99.99th=[  174]
   bw (  KiB/s): min=34338, max=92268, per=84.26%, avg=60268.13, stdev=12283.36, samples=359
   iops        : min= 8584, max=23067, avg=15066.67, stdev=3070.87, samples=359
  lat (usec)   : 2=0.01%, 10=0.01%, 20=0.01%, 50=71.73%, 100=28.24%
  lat (usec)   : 250=0.01%, 500=0.01%, 750=0.01%
  lat (msec)   : 2=0.01%
  cpu          : usr=12.96%, sys=26.87%, ctx=3218988, majf=0, minf=10962
  IO depths    : 1=116.8%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,3218724,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

randread-4k-seq: (groupid=3, jobs=1): err= 0: pid=523297: Thu Sep 6 21:04:12 2018
  read: IOPS=10.2k, BW=39.7MiB/s (41.6MB/s)(7146MiB/180001msec)
    slat (usec): min=4, max=328, avg=15.39, stdev= 8.62
    clat (nsec): min=1600, max=948792, avg=78946.53, stdev=36246.91
     lat (usec): min=39, max=969, avg=94.75, stdev=37.43
    clat percentiles (usec):
     |  1.00th=[   38],  5.00th=[   40], 10.00th=[   40], 20.00th=[   41],
     | 30.00th=[   41], 40.00th=[   52], 50.00th=[   70], 60.00th=[  110],
     | 70.00th=[  112], 80.00th=[  115], 90.00th=[  125], 95.00th=[  127],
     | 99.00th=[  133], 99.50th=[  135], 99.90th=[  141], 99.95th=[  147],
     | 99.99th=[  243]
   bw (  KiB/s): min=19918, max=49336, per=84.40%, avg=34308.52, stdev=6891.67, samples=359
   iops        : min= 4979, max=12334, avg=8576.75, stdev=1722.92, samples=359
  lat (usec)   : 2=0.01%, 10=0.01%, 20=0.01%, 50=38.06%, 100=19.88%
  lat (usec)   : 250=42.04%, 500=0.01%, 750=0.01%, 1000=0.01%
  cpu          : usr=8.07%, sys=21.59%, ctx=1829588, majf=0, minf=10954
  IO depths    : 1=116.7%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=1829296,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

[1] rbd ssd 3x

randwrite-4k-seq: (groupid=1, jobs=1): err= 0: pid=1448032: Fri May 24 19:41:48 2019
  write: IOPS=655, BW=2620KiB/s (2683kB/s)(461MiB/180001msec)
    slat (usec): min=7, max=120, avg=10.79, stdev= 6.22
    clat (usec): min=897, max=77251, avg=1512.76, stdev=368.36
     lat (usec): min=906, max=77262, avg=1523.77, stdev=368.54
    clat percentiles (usec):
     |  1.00th=[ 1106],  5.00th=[ 1205], 10.00th=[ 1254], 20.00th=[ 1319],
     | 30.00th=[ 1369], 40.00th=[ 1418], 50.00th=[ 1483], 60.00th=[ 1532],
     | 70.00th=[ 1598], 80.00th=[ 1663], 90.00th=[ 1778], 95.00th=[ 1893],
     | 99.00th=[ 2540], 99.50th=[ 2933], 99.90th=[ 3392], 99.95th=[ 4080],
     | 99.99th=[ 6194]
   bw (  KiB/s): min= 1543, max= 2830, per=79.66%, avg=2087.02, stdev=396.14, samples=359
   iops        : min=  385, max=  707, avg=521.39, stdev=99.06, samples=359
  lat (usec)   : 1000=0.06%
  lat (msec)   : 2=97.19%, 4=2.70%, 10=0.04%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%
  cpu          : usr=0.39%, sys=1.13%, ctx=118477, majf=0, minf=50
  IO depths    : 1=116.6%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,117905,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

randread-4k-seq: (groupid=3, jobs=1): err= 0: pid=1450173: Fri May 24 19:41:48 2019
  read: IOPS=1812, BW=7251KiB/s (7425kB/s)(1275MiB/180001msec)
    slat (usec): min=6, max=161, avg=10.25, stdev= 6.37
    clat (usec): min=182, max=23748, avg=538.35, stdev=136.71
     lat (usec): min=189, max=23758, avg=548.86, stdev=137.19
    clat percentiles (usec):
     |  1.00th=[  265],  5.00th=[  310], 10.00th=[  351], 20.00th=[  445],
     | 30.00th=[  494], 40.00th=[  519], 50.00th=[  537], 60.00th=[  562],
     | 70.00th=[  594], 80.00th=[  644], 90.00th=[  701], 95.00th=[  742],
     | 99.00th=[  816], 99.50th=[  840], 99.90th=[  914], 99.95th=[ 1172],
     | 99.99th=[ 2442]
   bw (  KiB/s): min= 4643, max= 7991, per=79.54%, avg=5767.26, stdev=1080.89, samples=359
   iops        : min= 1160, max= 1997, avg=1441.43, stdev=270.23, samples=359
  lat (usec)   : 250=0.57%, 500=31.98%, 750=62.92%, 1000=4.46%
  lat (msec)   : 2=0.05%, 4=0.01%, 10=0.01%, 50=0.01%
  cpu          : usr=1.07%, sys=2.69%, ctx=327838, majf=0, minf=76
  IO depths    : 1=116.9%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=326298,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

[3] cephfs (throughput in kB/s for the 4k random rows, MB/s for the rest)

+--------------+-------------------------+-------------------------+-------------------------+
|              | Cephfs ssd rep. 3       | Cephfs ssd rep. 1       | Samsung MZK7KM480 480GB |
|              | lat    iops   thr       | lat    iops   thr       | lat    iops   thr       |
+--------------+-------------------------+-------------------------+-------------------------+
| 4k r ran.    | 2.78   1781   7297      | 0.54   1809   7412      | 0.09   10.2k  41600     |
| 4k w ran.    | 1.42    700   2871      | 0.8    1238   5071      | 0.05   17.9k  73200     |
| 4k r seq.    | 0.29   3314   13.6      | 0.29   3325   13.6      | 0.05   18k    77.6      |
| 4k w seq.    | 0.04    889   3.64      | 0.56   1761   7.21      | 0.05   18.3k  75.1      |
| 1024k r ran. | 4.3     231   243       | 4.27    233   245       | 2.06   482    506       |
| 1024k w ran. | 0.08    132   139       | 4.34    229   241       | 2.16   460    483       |
| 1024k r seq. | 4.23    235   247       | 4.21    236   248       | 1.98   502    527       |
| 1024k w seq. | 6.99    142   150       | 4.34    229   241       | 2.13   466    489       |
+--------------+-------------------------+-------------------------+-------------------------+

-----Original Message-----
From: Robert Sander [mailto:r.san...@heinlein-support.de]
Sent: Friday, 24 May 2019 15:26
To: ceph-users
Subject: Re: [ceph-users] performance in a small cluster

On 24.05.19 at 14:43, Paul Emmerich wrote:
> 20 MB/s at 4K blocks is ~5000 iops, that's 1250 IOPS per SSD (assuming
> replica 3).
>
> What we usually check in scenarios like these:
>
> * SSD model? Lots of cheap SSDs simply can't handle more than that

The system has been newly created and is not busy at all. We tested a
single SSD without an OSD on top with fio: it can do 50K IOPS read and
16K IOPS write.

> * Get some proper statistics such as OSD latencies, disk IO
> utilization, etc. A benchmark without detailed performance data
> doesn't really help to debug such a problem

Yes, that is correct, we will try to set up a perfdata gathering system.

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Mandatory information per §35a GmbHG: HRB 93818 B / Amtsgericht
Berlin-Charlottenburg, Managing Director: Peer Heinlein -- Registered
office: Berlin

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com