Thanks, everyone, for all of your guidance. To answer the questions:

> Are the OSD nodes connected with 10Gb as well?
Yes

> Are you using SSDs for your index pool?  How many?
Yes, for a node with 39 HDD OSDs we are using 6 Index SSDs

> How big are your objects?
Most tests run at 64K, but I have gone up to 4M with similar results

> Try to increase gradually to say 1gb and measure.
Tried with a 1GB object size and the result was the same, still around 250Mb/s

rados bench gives me ~1Gb/s, about 4 times what I am getting through our
haproxy-balanced endpoint.
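(For reference, the rados number above comes from something along these
lines; the pool name, 64K block size, and 60-second runtime are just
example values rather than our exact invocation:)

    # write 64K objects first so there is data to read back
    rados bench -p testpool 60 write -b 65536 --no-cleanup
    # then measure sequential read throughput from the same pool
    rados bench -p testpool 60 seq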
Our haproxy is version 1.5.18, and it looks like multithreading was only
introduced in 1.8, so that could be the culprit. These CPU stats taken
during the benchmark seem to agree as well; one core (CPU 4) is pegged at
0% idle while the rest are nearly idle:

03:34:15 PM CPU %usr  %nice %sys  %iowait %irq %soft %steal %guest %gnice %idle
03:34:16 PM all 3.23  0.00  4.12  0.00    0.00 0.76  0.00   0.00   0.00   91.88
03:34:16 PM 0   4.00  0.00  2.00  0.00    0.00 1.00  0.00   0.00   0.00   93.00
03:34:16 PM 1   2.06  0.00  2.06  0.00    0.00 0.00  0.00   0.00   0.00   95.88
03:34:16 PM 2   2.04  0.00  3.06  0.00    0.00 0.00  0.00   0.00   0.00   94.90
03:34:16 PM 3   2.00  0.00  2.00  0.00    0.00 0.00  0.00   0.00   0.00   96.00
03:34:16 PM 4   39.00 0.00  58.00 0.00    0.00 3.00  0.00   0.00   0.00   0.00
03:34:16 PM 5   0.99  0.00  1.98  0.00    0.00 0.00  0.00   0.00   0.00   97.03
03:34:16 PM 6   2.04  0.00  3.06  0.00    0.00 1.02  0.00   0.00   0.00   93.88
03:34:16 PM 7   1.01  0.00  2.02  0.00    0.00 0.00  0.00   0.00   0.00   96.97
03:34:16 PM 8   2.25  0.00  1.12  0.00    0.00 5.62  0.00   0.00   0.00   91.01
03:34:16 PM 9   2.02  0.00  2.02  0.00    0.00 0.00  0.00   0.00   0.00   95.96
03:34:16 PM 10  1.02  0.00  2.04  0.00    0.00 1.02  0.00   0.00   0.00   95.92
03:34:16 PM 11  1.98  0.00  1.98  0.00    0.00 0.99  0.00   0.00   0.00   95.05
03:34:16 PM 12  1.03  0.00  0.00  0.00    0.00 0.00  0.00   0.00   0.00   98.97
03:34:16 PM 13  1.02  0.00  0.00  0.00    0.00 0.00  0.00   0.00   0.00   98.98
03:34:16 PM 14  1.08  0.00  2.15  0.00    0.00 1.08  0.00   0.00   0.00   95.70
03:34:16 PM 15  0.00  0.00  0.00  0.00    0.00 0.00  0.00   0.00   0.00   100.00
03:34:16 PM 16  1.04  0.00  3.12  0.00    0.00 1.04  0.00   0.00   0.00   94.79
03:34:16 PM 17  1.00  0.00  0.00  0.00    0.00 0.00  0.00   0.00   0.00   99.00
03:34:16 PM 18  1.04  0.00  1.04  0.00    0.00 1.04  0.00   0.00   0.00   96.88
03:34:16 PM 19  0.99  0.00  0.00  0.00    0.00 0.00  0.00   0.00   0.00   99.01
03:34:16 PM 20  2.04  0.00  3.06  0.00    0.00 1.02  0.00   0.00   0.00   93.88
03:34:16 PM 21  2.02  0.00  0.00  0.00    0.00 0.00  0.00   0.00   0.00   97.98
03:34:16 PM 22  1.04  0.00  2.08  0.00    0.00 2.08  0.00   0.00   0.00   94.79
03:34:16 PM 23  3.00  0.00  7.00  0.00    0.00 0.00  0.00   0.00   0.00   90.00

I'll start by upgrading that, turning on multithreading (roughly as
sketched below), and I'll let you know how it goes.
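A minimal sketch of what I have in mind for haproxy.cfg once we're on
1.8+ (the thread count and CPU pinning below are placeholder values I'd
still have to tune for our 24-core boxes, not something we've tested):

    global
        # multithreading is available from haproxy 1.8 onwards
        nbthread 8
        # optionally pin threads to specific cores, e.g.:
        # cpu-map auto:1/1-8 0-7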

Cheers,
Dylan

On Fri, 2020-09-25 at 19:39 +0000, Dylan Griff wrote:
> 
> Hey folks!
> 
> Just shooting this out there in case someone has some advice. We're
> just setting up RGW object storage for one of our new Ceph clusters (3
> mons, 1072 OSDs, 34 nodes) and doing some benchmarking before letting
> users on it.
> 
> We have a 10Gb network to our two RGW nodes behind a single IP on
> haproxy, and some iperf testing shows I can push that much; latencies
> look okay. However, when using a small cosbench cluster I am unable to
> get more than ~250Mb/s of read throughput in total.
> 
> If I add more nodes to the cosbench cluster, it just spreads the load
> out evenly under the same cap, and I get the same results when running
> two cosbench clusters from different locations. I don't see any obvious
> bottleneck in the RGW server hardware, but since I'm asking for
> assistance, I won't rule out that I'm missing something. I have
> attached one of my cosbench load files with keys removed, but I get
> similar results with different numbers of workers, objects, buckets,
> object sizes, and cosbench drivers.
> 
> Does anyone have pointers on how to nail this bottleneck down? Am I
> wrong to expect more throughput? Let me know if I can provide any other
> info for you.
> 
> Cheers,
> Dylan
> 
> --
> 
> Dylan Griff
> Senior System Administrator
> CLE D063
> RCS - Systems - University of Victoria
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
