Hi Mark, Greg and Kyle,
Sorry for the late response, and thanks for providing directions for me to 
look into.

We have exactly the same setup for OSDs and pool replication (and I even tried 
creating the same number of PGs within the small cluster); however, I can 
still reproduce this consistently.

This is the command I run:
$ rados bench -p perf_40k_PG -b 5000 -t 3 --show-time 10 write

With 24 OSDs:
Average latency: 0.00494123
Max latency:     0.511864
Min latency:     0.002198

With 330 OSDs:
Average latency: 0.00913806
Max latency:     0.021967
Min latency:     0.005456

In terms of the CRUSH rule, we are using the default one. The small cluster 
has 3 OSD hosts (11 + 11 + 2 OSDs), while the large cluster has 30 OSD hosts 
(11 * 30 OSDs).

I have a couple of questions:
 1. Is it possible that the latency is due to our having only a three-layer 
hierarchy (root -> host -> OSD)? Since we are using the straw bucket type (the 
default), which selects in O(N) time, the computation cost grows as the number 
of hosts increases. I suspect not, since per my understanding the computation 
is on the order of microseconds.
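To make the O(N) point concrete, here is a simplified sketch of how a straw 
bucket picks an item (this is not the actual CRUSH code, which uses the 
Jenkins hash and precomputed per-item straw lengths; the hash mixing below is 
a stand-in chosen only to show the shape of the loop):

```python
import hashlib

def straw_select(items, obj_id, replica):
    """Pick one item from a straw bucket: draw a weighted pseudo-random
    'straw' for every item and keep the longest. The scan over all items
    is what makes straw selection O(N) in the bucket size."""
    best, best_straw = None, -1.0
    for name, weight in items.items():
        # Stand-in hash mix (real CRUSH uses crush_hash32 / Jenkins).
        h = hashlib.md5(f"{obj_id}:{replica}:{name}".encode()).digest()
        draw = int.from_bytes(h[:8], "big") / 2**64
        straw = draw * weight
        if straw > best_straw:
            best, best_straw = name, straw
    return best

# 30 equally weighted OSD hosts, as in the large cluster
hosts = {f"host{i}": 1.0 for i in range(30)}
chosen = straw_select(hosts, obj_id=1234, replica=0)
```

Even at 30 hosts the per-placement loop is tiny, which supports the intuition 
that the mapping computation itself is in the microsecond range.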

 2. Is it possible that, because we have more OSDs, the cluster needs to 
maintain far more connections between OSDs, which could potentially slow 
things down?
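As a rough upper bound on question 2 (a back-of-the-envelope sketch, not a 
measurement: in practice each OSD only peers with the OSDs it shares PGs 
with, so real connection counts are well below a full mesh):

```python
def full_mesh_connections(n_osds):
    """Worst-case peer connection count if every OSD connected to every
    other OSD: n choose 2."""
    return n_osds * (n_osds - 1) // 2

for n in (24, 330):
    print(f"{n} OSDs -> at most {full_mesh_connections(n)} connections")
# 24 OSDs  -> at most 276 connections
# 330 OSDs -> at most 54285 connections
```

The ceiling grows quadratically with OSD count, so connection and heartbeat 
overhead is at least plausible as a contributor, even though real peering is 
limited by the PG mapping.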

 3. Is there anything else I might have missed?

Thanks all for the constant help.

Guang  


On 2013-10-22, at 10:22 PM, Guang Yang <yguan...@yahoo.com> wrote:

> Hi Kyle and Greg,
> I will get back to you with more details tomorrow, thanks for the response.
> 
> Thanks,
> Guang
> On 2013-10-22, at 9:37 AM, Kyle Bader <kyle.ba...@gmail.com> wrote:
> 
>> Besides what Mark and Greg said it could be due to additional hops through 
>> network devices. What network devices are you using, what is the network  
>> topology and does your CRUSH map reflect the network topology?
>> 
>> On Oct 21, 2013 9:43 AM, "Gregory Farnum" <g...@inktank.com> wrote:
>> On Mon, Oct 21, 2013 at 7:13 AM, Guang Yang <yguan...@yahoo.com> wrote:
>> > Dear ceph-users,
>> > Recently I deployed a ceph cluster with RadosGW, from a small one (24 
>> > OSDs) to a much bigger one (330 OSDs).
>> >
>> > When using rados bench to test the small cluster (24 OSDs), it showed the 
>> > average latency was around 3ms (object size is 5K), while for the larger 
>> > one (330 OSDs), the average latency was around 7ms (object size 5K), twice 
>> > that of the small cluster.
>> >
>> > The OSDs within the two clusters have the same configuration: SAS disks, 
>> > with two partitions per disk, one for the journal and the other for metadata.
>> >
>> > For PG numbers, the small cluster was tested with a pool having 100 PGs, 
>> > and for the large cluster, the pool has 43333 PGs (as I plan to further 
>> > scale the cluster, I chose a much larger PG count).
>> >
>> > Does my test result make sense? That is, when the PG number and OSD count 
>> > increase, might the latency increase as well?
>> 
>> Besides what Mark said, can you describe your test in a little more
>> detail? Writing/reading, length of time, number of objects, etc.
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
