The first thing to note is that lst reports results in binary units (MiB/s) while iperf reports results in decimal units (Gbps). If you do the conversion, 2055.31 MiB/s = 2155 MB/s, or about 17.2 Gbps, so the gap to your iperf3 number is smaller than the raw figures suggest.
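A quick way to sanity-check the conversion with bc (the figures are the ones quoted above; scale=2 truncates rather than rounds):

    echo "scale=2; 2055.31 * 1048576 / 1000000" | bc          # MiB/s -> MB/s:   2155.14
    echo "scale=2; 2055.31 * 1048576 * 8 / 1000000000" | bc   # MiB/s -> Gbit/s: 17.24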
The other thing to check is the CPU usage. For TCP the CPU usage can be high (a quick way to spot-check this during a run is sketched at the end of this message). You should try RoCE+o2iblnd instead.

Cheers, Andreas

On Nov 26, 2019, at 21:26, Pinkesh Valdria <pinkesh.vald...@oracle.com> wrote:

Hello All,

I created a new Lustre cluster on CentOS 7.6 and I am running lnet_selftest_wrapper.sh to measure throughput on the network. The nodes are connected to each other over 25 Gbps Ethernet, so the theoretical maximum is 25 Gbps * 125 MB/s per Gbps = 3125 MB/s. Using iperf3, I get 22 Gbps (2750 MB/s) between the nodes.

[root@lustre-client-2 ~]# for c in 1 2 4 8 12 16 20 24 ; do echo $c ; ST=lst-output-$(date +%Y-%m-%d-%H:%M:%S) CN=$c SZ=1M TM=30 BRW=write CKSUM=simple LFROM="10.0.3.7@tcp1" LTO="10.0.3.6@tcp1" /root/lnet_selftest_wrapper.sh ; done

When I run lnet_selftest_wrapper.sh (from the Lustre wiki: http://wiki.lustre.org/LNET_Selftest) between 2 nodes, I get a maximum of 2055.31 MiB/s. Is that expected at the LNet level, or can I tune the network and OS kernel further (the tuning I applied is below) to get better throughput?

Result snippet from lnet_selftest_wrapper.sh:

[LNet Rates of lfrom]
[R] Avg: 4112 RPC/s Min: 4112 RPC/s Max: 4112 RPC/s
[W] Avg: 4112 RPC/s Min: 4112 RPC/s Max: 4112 RPC/s
[LNet Bandwidth of lfrom]
[R] Avg: 0.31 MiB/s Min: 0.31 MiB/s Max: 0.31 MiB/s
[W] Avg: 2055.30 MiB/s Min: 2055.30 MiB/s Max: 2055.30 MiB/s
[LNet Rates of lto]
[R] Avg: 4136 RPC/s Min: 4136 RPC/s Max: 4136 RPC/s
[W] Avg: 4136 RPC/s Min: 4136 RPC/s Max: 4136 RPC/s
[LNet Bandwidth of lto]
[R] Avg: 2055.31 MiB/s Min: 2055.31 MiB/s Max: 2055.31 MiB/s
[W] Avg: 0.32 MiB/s Min: 0.32 MiB/s Max: 0.32 MiB/s

Tuning applied:

Ethernet NICs:
ip link set dev ens3 mtu 9000
ethtool -G ens3 rx 2047 tx 2047 rx-jumbo 8191

/etc/sysctl.conf:
net.core.wmem_max=16777216
net.core.rmem_max=16777216
net.core.wmem_default=16777216
net.core.rmem_default=16777216
net.core.optmem_max=16777216
net.core.netdev_max_backlog=27000
kernel.sysrq=1
kernel.shmmax=18446744073692774399
net.core.somaxconn=8192
net.ipv4.tcp_adv_win_scale=2
net.ipv4.tcp_low_latency=1
net.ipv4.tcp_rmem = 212992 87380 16777216
net.ipv4.tcp_sack = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_wmem = 212992 65536 16777216
vm.min_free_kbytes = 65536
net.ipv4.tcp_congestion_control = cubic
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_congestion_control = htcp
net.ipv4.tcp_no_metrics_save = 0

tuned profile:

echo "#
# tuned configuration
#
[main]
summary=Broadly applicable tuning that provides excellent performance across a variety of common server workloads
[disk]
devices=!dm-*, !sda1, !sda2, !sda3
readahead=>4096
[cpu]
force_latency=1
governor=performance
energy_perf_bias=performance
min_perf_pct=100
[vm]
transparent_huge_pages=never
[sysctl]
kernel.sched_min_granularity_ns = 10000000
kernel.sched_wakeup_granularity_ns = 15000000
vm.dirty_ratio = 30
vm.dirty_background_ratio = 10
vm.swappiness=30
" > lustre-performance/tuned.conf

tuned-adm profile lustre-performance

Thanks,
Pinkesh Valdria
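On the CPU-usage point above: a minimal sketch for spot-checking it while the selftest runs, assuming sysstat (mpstat) is installed on the nodes; the exact thread names vary, but with the TCP LND the kernel socket worker threads and ksoftirqd are the usual suspects:

    # Per-core utilization at 2-second intervals, 5 samples; a single core
    # pegged in %soft/%sys is a common ceiling for TCP at these rates.
    mpstat -P ALL 2 5

    # Thread-level snapshot; look for ksoftirqd and the LND worker threads.
    top -b -H -n 1 | head -40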
_______________________________________________ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org