Thanks to both of you for your replies.

Hi Haomai,
If we compare RDMA with the TCP/IP stack, as I understand it, we can use RDMA to offload the network traffic and reduce CPU usage, which means the other components can use the freed CPU to improve performance metrics such as IOPS?

Hi Deepak,

Let me describe my environment in more detail; I hope you can give me more advice based on it.

[Ceph Cluster]
- 1 pool
- 1 rbd

[Host Daemons (per host)]
- 1 ceph-mon
- 8 ceph-osds
- 1 fio server (compiled with librbd, and librbd is compiled to support RDMA)

[fio config]
[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=rbd
clustername=ceph
runtime=120
iodepth=128
numjobs=6
group_reporting
size=256G
direct=1
ramp_time=5
[r75w25]
bs=4k
rw=randrw
rwmixread=75

In my RDMA experiment, I start the fio client on host 1, and it triggers the 3 fio servers (one on each host) to run the random read/write workload against the specified rbd; the invocation is sketched below.
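For clarity, the distributed run relies on fio's built-in client/server mode, roughly as follows (a sketch: the host list file host.list and the job file name r75w25.fio are placeholders I am assuming here, not names from my actual setup):

# On each of the 3 hosts: start a fio server (it listens on port 8765 by default).
fio --server

# On host 1: host.list names the three fio servers, one address per line;
# each server then runs the job file above against the rbd image.
fio --client=host.list r75w25.fio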
Although I don't specify the public/cluster network addresses in ceph.conf, I guess all traffic within the cluster goes over the 10G network, since I only entered the 10G IP addresses during my manual deployment. Since ceph.conf sets RDMA as the ms_type, I think the connection between fio and the rbd is also over RDMA; a sketch of pinning the networks explicitly follows.
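For reference, a minimal sketch of making both networks explicit in ceph.conf, assuming 10.0.0.0/24 is my 10G (Mellanox) subnet; the 192.168.0.0/24 value for the 1G (Intel) side is a hypothetical placeholder, and splitting the roles this way would also require the mons to bind to addresses on the public subnet:

[global]
# hypothetical 1G (Intel) subnet for client-facing traffic
public network = 192.168.0.0/24
# assumed 10G (Mellanox) subnet for OSD replication and heartbeat traffic
cluster network = 10.0.0.0/24

As deployed, with only 10G addresses entered, both roles effectively share the 10G subnet.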
During the fio run, I observe the following system metrics:

1. System CPU usage
2. NIC (1G) throughput
3. NIC (10G) throughput
4. SSD I/O statistics

Only the CPU is saturated (100%), consumed by the fio servers and the ceph-osds; the other three metrics still have headroom, so I think the bottleneck in my environment is the CPU.

So, based on these observations and the concept of RDMA, I think that using RDMA for the cluster/private network can offload the network traffic within the cluster, reducing CPU usage and freeing more CPU for other components.

If I have any misunderstanding, please correct me. Thanks for your help!

Best Regards,
Hung-Wei Chiu(邱宏瑋)
--
Computer Center, Department of Computer Science
National Chiao Tung University

2017-03-24 2:22 GMT+08:00 Deepak Naidu <dna...@nvidia.com>:

> RDMA is of interest to me, so my comment is below.
>
> >> What surprised me is that the result of RDMA mode is almost the same
> >> as the basic mode, the iops, latency, throughput, etc.
>
> Pardon my knowledge here, but if I read your ceph.conf and your notes
> correctly, it seems that you are using RDMA only for the "cluster/private
> network"? So how do you expect RDMA to improve client
> IOPS/latency/throughput?
>
> --
> Deepak
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of Haomai Wang
> Sent: Thursday, March 23, 2017 4:34 AM
> To: Hung-Wei Chiu (邱宏瑋)
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] The performance of ceph with RDMA
>
> On Thu, Mar 23, 2017 at 5:49 AM, Hung-Wei Chiu (邱宏瑋) <
> hwc...@cs.nctu.edu.tw> wrote:
>
> Hi,
>
> I use the latest Ceph (master branch, updated 2017/03/22) built with
> RDMA and use fio to test its iops/latency/throughput.
>
> In my environment, I set up 3 hosts; the details of each host are below.
>
> OS: Ubuntu 16.04
> Storage: SSD * 4 (256G * 4)
> Memory: 64GB
> NICs: two NICs, one (Intel 1G) for the public network and the other
> (Mellanox 10G) for the private network
>
> There are 3 monitors and 24 osds equally distributed across the 3 hosts,
> which means each host contains 1 mon and 8 osds.
>
> For my experiment, I use two configs, basic and RDMA.
>
> Basic
>
> [global]
> fsid = 0612cc7e-6239-456c-978b-b4df781fe831
> mon initial members = ceph-1,ceph-2,ceph-3
> mon host = 10.0.0.15,10.0.0.16,10.0.0.17
> osd pool default size = 2
> osd pool default pg num = 1024
> osd pool default pgp num = 1024
>
> RDMA
>
> [global]
> fsid = 0612cc7e-6239-456c-978b-b4df781fe831
> mon initial members = ceph-1,ceph-2,ceph-3
> mon host = 10.0.0.15,10.0.0.16,10.0.0.17
> osd pool default size = 2
> osd pool default pg num = 1024
> osd pool default pgp num = 1024
> ms_type = async+rdma
> ms_async_rdma_device_name = mlx4_0
>
> What surprised me is that the result of RDMA mode is almost the same as
> the basic mode: the iops, latency, throughput, etc.
> I also tried different fio parameter patterns, such as the read/write
> ratio and random vs. sequential operations.
> All results are the same.
>
> Yes, most of the latency comes from other components now, although we
> still want to avoid the extra copy on the RDMA side.
>
> So the current RDMA backend just means it can be a choice compared to
> the TCP/IP network; more benefits need to come from other components.
>
> In order to figure out what's going on, I did the following:
>
> 1. Followed this article (https://community.mellanox.com/docs/DOC-2086)
> to verify my RDMA environment.
> 2. To make sure the network traffic is transmitted by RDMA, I dumped the
> traffic within the private network, and the answer is yes, it uses RDMA.
> 3. Modified ms_async_rdma_buffer_size to (256 << 10): no change.
> 4. Modified ms_async_rdma_send_buffers to 2048: no change.
> 5. Modified ms_async_rdma_receive_buffers to 2048: no change.
>
> After the above operations, I guess my Ceph environment may not be set
> up in a way that lets RDMA improve performance.
>
> Does anyone know what kind of Ceph environment (replica size, # of osds,
> # of mons, etc.) is good for RDMA?
>
> Thanks in advance.
>
> Best Regards,
> Hung-Wei Chiu(邱宏瑋)
> --
> Computer Center, Department of Computer Science
> National Chiao Tung University