Re: [lustre-discuss] Lnet Self Test

2019-11-27 Thread Pinkesh Valdria
Thanks Andreas for your response.  

 

I ran another LNet selftest with 48 concurrent processes, since the nodes have 
52 physical cores, and I was able to achieve the same throughput (2052.71 MiB/s ≈ 
2152 MB/s).
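
For reference, the 48-process run used the same wrapper parameters and nodes as 
the loop in my original mail below, just with CN=48:

ST=lst-output-$(date +%Y-%m-%d-%H:%M:%S) CN=48 SZ=1M TM=30 BRW=write \
CKSUM=simple LFROM="10.0.3.7@tcp1" LTO="10.0.3.6@tcp1" \
/root/lnet_selftest_wrapper.sh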

 

Is it expected to lose almost 600 MB/s (2750 - 2150 = 600) to overhead when 
running LNet over Ethernet?

 

 

Thanks,

Pinkesh Valdria

Oracle Cloud Infrastructure 

 

 

 

 

From: Andreas Dilger 
Date: Wednesday, November 27, 2019 at 1:25 AM
To: Pinkesh Valdria 
Cc: "lustre-discuss@lists.lustre.org" 
Subject: Re: [lustre-discuss] Lnet Self Test

 

The first thing to note is that lst reports results in binary units (MiB/s) 
while iperf reports results in decimal units (Gbps).  If you do the conversion 
you get 2055.31 MiB/s = 2155 MB/s.
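
Spelled out (1 MiB = 1,048,576 bytes; 1 MB = 1,000,000 bytes):

2055.31 MiB/s x 1,048,576 / 1,000,000 ≈ 2155 MB/s
22 Gbps x 1000 / 8 = 2750 MB/s (the iperf3 figure), so lst is reaching roughly 78% of it.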

 

The other thing to check is the CPU usage. For TCP the CPU usage can be high. 
You should try RoCE+o2iblnd instead.
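
CPU saturation during the test is easy to spot with something like "mpstat -P ALL 2" 
or "top" on both nodes while lst is running. A minimal sketch of what switching 
from ksocklnd (tcp) to o2iblnd would look like, assuming the RoCE-capable 
interface is ens3 and the RDMA stack is already installed and working:

# static configuration, e.g. in /etc/modprobe.d/lustre.conf
options lnet networks="o2ib0(ens3)"

# or at runtime with lnetctl
lnetctl net add --net o2ib0 --if ens3
lnetctl net show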

 

Cheers, Andreas


On Nov 26, 2019, at 21:26, Pinkesh Valdria  wrote:

Hello All, 

 

I created a new Lustre cluster on CentOS 7.6 and I am running 
lnet_selftest_wrapper.sh to measure throughput on the network.  The nodes are 
connected to each other using 25 Gbps Ethernet, so the theoretical max is 25 Gbps * 
125 = 3125 MB/s.  Using iperf3, I get 22 Gbps (2750 MB/s) between the nodes.
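
For reference, the iperf3 measurement between the two nodes looks roughly like 
this (a sketch; the 30-second duration and 4 parallel streams are assumptions, 
not necessarily the exact options used):

# on the 10.0.3.6 node (server side)
iperf3 -s

# on the 10.0.3.7 node (client side)
iperf3 -c 10.0.3.6 -t 30 -P 4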

 

 

[root@lustre-client-2 ~]# for c in 1 2 4 8 12 16 20 24 ; do echo $c ; \
    ST=lst-output-$(date +%Y-%m-%d-%H:%M:%S) CN=$c SZ=1M TM=30 BRW=write \
    CKSUM=simple LFROM="10.0.3.7@tcp1" LTO="10.0.3.6@tcp1" \
    /root/lnet_selftest_wrapper.sh ; done

 

When I run lnet_selftest_wrapper.sh (from the Lustre wiki) between 2 nodes, I get 
a max of 2055.31 MiB/s.  Is that expected at the LNet level?  Or can I further 
tune the network and OS kernel (the tuning I applied is below) to get better 
throughput?

 

 

 

Result Snippet from lnet_selftest_wrapper.sh

 

[LNet Rates of lfrom]
[R] Avg: 4112 RPC/s Min: 4112 RPC/s Max: 4112 RPC/s
[W] Avg: 4112 RPC/s Min: 4112 RPC/s Max: 4112 RPC/s
[LNet Bandwidth of lfrom]
[R] Avg: 0.31 MiB/s Min: 0.31 MiB/s Max: 0.31 MiB/s
[W] Avg: 2055.30 MiB/s Min: 2055.30 MiB/s Max: 2055.30 MiB/s
[LNet Rates of lto]
[R] Avg: 4136 RPC/s Min: 4136 RPC/s Max: 4136 RPC/s
[W] Avg: 4136 RPC/s Min: 4136 RPC/s Max: 4136 RPC/s
[LNet Bandwidth of lto]
[R] Avg: 2055.31 MiB/s Min: 2055.31 MiB/s Max: 2055.31 MiB/s
[W] Avg: 0.32 MiB/s Min: 0.32 MiB/s Max: 0.32 MiB/s

 

 

Tuning applied: 

Ethernet NICs:
ip link set dev ens3 mtu 9000
ethtool -G ens3 rx 2047 tx 2047 rx-jumbo 8191
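
To confirm the MTU and ring sizes actually took effect on ens3 (a quick check):

ip link show ens3      # should report "mtu 9000"
ethtool -g ens3        # shows pre-set maximums vs. current ring settings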

 

 

less /etc/sysctl.conf
net.core.wmem_max=16777216
net.core.rmem_max=16777216
net.core.wmem_default=16777216
net.core.rmem_default=16777216
net.core.optmem_max=16777216
net.core.netdev_max_backlog=27000
kernel.sysrq=1
kernel.shmmax=18446744073692774399
net.core.somaxconn=8192
net.ipv4.tcp_adv_win_scale=2
net.ipv4.tcp_low_latency=1
net.ipv4.tcp_rmem = 212992 87380 16777216
net.ipv4.tcp_sack = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_wmem = 212992 65536 16777216
vm.min_free_kbytes = 65536
net.ipv4.tcp_congestion_control = cubic
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_congestion_control = htcp
net.ipv4.tcp_no_metrics_save = 0
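
Note that sysctl reads this file top to bottom, so where a key appears twice 
(net.ipv4.tcp_timestamps, net.ipv4.tcp_congestion_control) the last value wins. 
A quick way to reload and spot-check the settings:

sysctl -p /etc/sysctl.conf
sysctl net.ipv4.tcp_congestion_control net.ipv4.tcp_timestamps net.core.rmem_max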

 

 

 

echo "#

# tuned configuration

#

[main]

summary=Broadly applicable tuning that provides excellent performance across a 
variety of common server workloads

 

[disk]

devices=!dm-*, !sda1, !sda2, !sda3

readahead=>4096

 

[cpu]

force_latency=1

governor=performance

energy_perf_bias=performance

min_perf_pct=100

[vm]

transparent_huge_pages=never

[sysctl]

kernel.sched_min_granularity_ns = 1000

kernel.sched_wakeup_granularity_ns = 1500

vm.dirty_ratio = 30

vm.dirty_background_ratio = 10

vm.swappiness=30

" > lustre-performance/tuned.conf

 

tuned-adm profile lustre-performance
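
Assuming lustre-performance/ above ends up under /etc/tuned/ as the profile 
directory, the switch can be double-checked with:

tuned-adm active     # should list lustre-performance as the current active profile
tuned-adm verify     # verifies the profile's settings are actually applied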

 

 

Thanks,

Pinkesh Valdria

 

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
