On 8/22/17 9:22 AM, Mannthey, Keith wrote:
>
> You may want to file a Jira ticket if the ko2iblnd-opa settings were being
> automatically used on your Mellanox setup.  That is not expected.
>
Yes, they are automatically used on my Mellanox setup, and the
ko2iblnd-probe script does not seem to be working properly.
>
>  
>
> On another note: as you note, your NVMe backend is much faster than the QDR
> link speed.  You may want to look at using the new Multi-Rail LNet
> feature to boost network bandwidth.  You can add a second QDR HCA/port
> and get more LNet bandwidth from your OSS server.   It is a new feature
> that is a bit of work to use, but if you are chasing bandwidth it might
> be worth the effort.
>
I have a dual-port InfiniBand card, so I was thinking of bonding the
ports to get more bandwidth. Is this what you mean when you talk about
the Multi-Rail feature boost?
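
For reference, LNet Multi-Rail (available from Lustre 2.10) is configured
per interface with lnetctl rather than with IPoIB/kernel bonding. A minimal
sketch, assuming the second port shows up as ib1 and is kept on the same
o2ib5 network:

# configure LNet dynamically, then register both IB interfaces as NIs
lnetctl lnet configure
lnetctl net add --net o2ib5 --if ib0
lnetctl net add --net o2ib5 --if ib1
# verify that both NIs are listed
lnetctl net show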

thanks

Rick


>  
>
> Thanks,
>
> Keith
>
>  
>
> From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of Chris Horn
> Sent: Monday, August 21, 2017 12:40 PM
> To: Riccardo Veraldi <riccardo.vera...@cnaf.infn.it>; Arman Khalatyan <arm2...@gmail.com>
> Cc: lustre-discuss@lists.lustre.org
> Subject: Re: [lustre-discuss] Lustre poor performance
>
>  
>
> The ko2iblnd-opa settings are tuned specifically for Intel OmniPath.
> Take a look at the /usr/sbin/ko2iblnd-probe script to see how OPA
> hardware is detected and the “ko2iblnd-opa” settings get used.
>
>  
>
> Chris Horn
>
>  
>
> From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of Riccardo Veraldi <riccardo.vera...@cnaf.infn.it>
> Date: Saturday, August 19, 2017 at 5:00 PM
> To: Arman Khalatyan <arm2...@gmail.com>
> Cc: "lustre-discuss@lists.lustre.org" <lustre-discuss@lists.lustre.org>
> Subject: Re: [lustre-discuss] Lustre poor performance
>
>  
>
> I ran my LNet self-test again, and this time, adding --concurrency=16,
> I can use all of the IB bandwidth (3.5 GB/sec).
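>
> For reference, a sketch of that change relative to the lnet_test.sh
> script quoted further down in this thread (same group names assumed):
>
> lst add_test --batch bulk_rw --concurrency=16 --from readers --to servers \
> brw read check=simple size=1M
> lst add_test --batch bulk_rw --concurrency=16 --from writers --to servers \
> brw write check=full size=1M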
>
> The only thing I do not understand is why ko2iblnd.conf is not loaded
> properly, and why I had to remove the alias in the config file to get
> the proper peer_credits settings loaded.
>
> thanks to everyone for helping
>
> Riccardo
>
> On 8/19/17 8:54 AM, Riccardo Veraldi wrote:
>
>
>     I found out that ko2iblnd is not getting its settings from
>     /etc/modprobe.d/ko2iblnd.conf:
>     alias ko2iblnd-opa ko2iblnd
>     options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64
>     credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32
>     fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
>
>     install ko2iblnd /usr/sbin/ko2iblnd-probe
>
>     but if I modify ko2iblnd.conf like this, then the settings are loaded:
>
>     options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024
>     concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048
>     fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
>
>     install ko2iblnd /usr/sbin/ko2iblnd-probe
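>
>     A quick way to confirm the options actually took effect, assuming the
>     module exposes its parameters in sysfs as usual, is to read them back
>     once ko2iblnd is loaded, for example:
>
>     cat /sys/module/ko2iblnd/parameters/peer_credits
>     cat /sys/module/ko2iblnd/parameters/concurrent_sends
>     cat /sys/module/ko2iblnd/parameters/map_on_demand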
>
>     LNet tests show better behaviour, but I would still expect more
>     than this.
>     Is it possible to tune the parameters in /etc/modprobe.d/ko2iblnd.conf
>     so that the Mellanox ConnectX-3 works more efficiently?
>
>     [LNet Rates of servers]
>     [R] Avg: 2286     RPC/s Min: 0        RPC/s Max: 4572     RPC/s
>     [W] Avg: 3322     RPC/s Min: 0        RPC/s Max: 6643     RPC/s
>     [LNet Bandwidth of servers]
>     [R] Avg: 625.23   MiB/s Min: 0.00     MiB/s Max: 1250.46  MiB/s
>     [W] Avg: 1035.85  MiB/s Min: 0.00     MiB/s Max: 2071.69  MiB/s
>     [LNet Rates of servers]
>     [R] Avg: 2286     RPC/s Min: 1        RPC/s Max: 4571     RPC/s
>     [W] Avg: 3321     RPC/s Min: 1        RPC/s Max: 6641     RPC/s
>     [LNet Bandwidth of servers]
>     [R] Avg: 625.55   MiB/s Min: 0.00     MiB/s Max: 1251.11  MiB/s
>     [W] Avg: 1035.05  MiB/s Min: 0.00     MiB/s Max: 2070.11  MiB/s
>     [LNet Rates of servers]
>     [R] Avg: 2291     RPC/s Min: 0        RPC/s Max: 4581     RPC/s
>     [W] Avg: 3329     RPC/s Min: 0        RPC/s Max: 6657     RPC/s
>     [LNet Bandwidth of servers]
>     [R] Avg: 626.55   MiB/s Min: 0.00     MiB/s Max: 1253.11  MiB/s
>     [W] Avg: 1038.05  MiB/s Min: 0.00     MiB/s Max: 2076.11  MiB/s
>     session is ended
>     ./lnet_test.sh: line 17: 23394 Terminated              lst stat servers
>
>
>
>
>     On 8/19/17 4:20 AM, Arman Khalatyan wrote:
>
>         Just a minor comment:
>
>         You should push up the performance of your nodes; they are not
>         running at their maximum CPU frequencies, so all tests might be
>         inconsistent. To get the most out of IB, run the following:
>
>         tuned-adm profile latency-performance
>
>         for more options use:
>
>         tuned-adm list
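>
>         To confirm the profile took effect, something like this can be
>         used (any governor/frequency check works):
>
>         tuned-adm active
>         grep MHz /proc/cpuinfo | sort -u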
>
>          
>
>         It will be interesting to see the difference.
>
>          
>
>         On 19.08.2017 at 3:57 AM, "Riccardo Veraldi"
>         <riccardo.vera...@cnaf.infn.it> wrote:
>
>             Hello Keith and Dennis, these are the tests I ran.
>
>               * obdfilter-survey shows that I can saturate the disk
>                 performance; the NVMe/ZFS backend is performing very
>                 well and is faster than my InfiniBand network.
>
>             pool          alloc   free   read  write   read  write
>             ------------  -----  -----  -----  -----  -----  -----
>             drpffb-ost01  3.31T  3.19T      3  35.7K  16.0K  7.03G
>               raidz1      3.31T  3.19T      3  35.7K  16.0K  7.03G
>                 nvme0n1       -      -      1  5.95K  7.99K  1.17G
>                 nvme1n1       -      -      0  6.01K      0  1.18G
>                 nvme2n1       -      -      0  5.93K      0  1.17G
>                 nvme3n1       -      -      0  5.88K      0  1.16G
>                 nvme4n1       -      -      1  5.95K  7.99K  1.17G
>                 nvme5n1       -      -      0  5.96K      0  1.17G
>             ------------  -----  -----  -----  -----  -----  -----
>
>             These are the test results:
>
>             Fri Aug 18 16:54:48 PDT 2017 Obdfilter-survey for case=disk from drp-tst-ffb01
>             ost  1 sz 10485760K rsz 1024K obj    1 thr    1 write 7633.08 SHORT rewrite 7558.78 SHORT read 3205.24 [3213.70, 3226.78]
>             ost  1 sz 10485760K rsz 1024K obj    1 thr    2 write 7996.89 SHORT rewrite 7903.42 SHORT read 5264.70 SHORT
>             ost  1 sz 10485760K rsz 1024K obj    2 thr    2 write 7718.94 SHORT rewrite 7977.84 SHORT read 5802.17 SHORT
>
>               * LNet self-test, and here I see the problems. For
>                 reference, 172.21.52.[83,84] are the two OSSes and
>                 172.21.52.86 is the reader/writer. Here is the script
>                 that I ran:
>
>             #!/bin/bash
>             export LST_SESSION=$$
>             lst new_session read_write
>             lst add_group servers 172.21.52.[83,84]@o2ib5
>             lst add_group readers 172.21.52.86@o2ib5
>             lst add_group writers 172.21.52.86@o2ib5
>             lst add_batch bulk_rw
>             lst add_test --batch bulk_rw --from readers --to servers \
>             brw read check=simple size=1M
>             lst add_test --batch bulk_rw --from writers --to servers \
>             brw write check=full size=1M
>             # start running
>             lst run bulk_rw
>             # display server stats for 30 seconds
>             lst stat servers & sleep 30; kill $!
>             # tear down
>             lst end_session
>
>              
>
>             Here are the results:
>
>             SESSION: read_write FEATURES: 1 TIMEOUT: 300 FORCE: No
>             172.21.52.[83,84]@o2ib5 are added to session
>             172.21.52.86@o2ib5 are added to session
>             172.21.52.86@o2ib5 are added to session
>             Test was added successfully
>             Test was added successfully
>             bulk_rw is running now
>             [LNet Rates of servers]
>             [R] Avg: 1751     RPC/s Min: 0        RPC/s Max: 3502     RPC/s
>             [W] Avg: 2525     RPC/s Min: 0        RPC/s Max: 5050     RPC/s
>             [LNet Bandwidth of servers]
>             [R] Avg: 488.79   MiB/s Min: 0.00     MiB/s Max: 977.59   MiB/s
>             [W] Avg: 773.99   MiB/s Min: 0.00     MiB/s Max: 1547.99  MiB/s
>             [LNet Rates of servers]
>             [R] Avg: 1718     RPC/s Min: 0        RPC/s Max: 3435     RPC/s
>             [W] Avg: 2479     RPC/s Min: 0        RPC/s Max: 4958     RPC/s
>             [LNet Bandwidth of servers]
>             [R] Avg: 478.19   MiB/s Min: 0.00     MiB/s Max: 956.39   MiB/s
>             [W] Avg: 761.74   MiB/s Min: 0.00     MiB/s Max: 1523.47  MiB/s
>             [LNet Rates of servers]
>             [R] Avg: 1734     RPC/s Min: 0        RPC/s Max: 3467     RPC/s
>             [W] Avg: 2506     RPC/s Min: 0        RPC/s Max: 5012     RPC/s
>             [LNet Bandwidth of servers]
>             [R] Avg: 480.79   MiB/s Min: 0.00     MiB/s Max: 961.58   MiB/s
>             [W] Avg: 772.49   MiB/s Min: 0.00     MiB/s Max: 1544.98  MiB/s
>             [LNet Rates of servers]
>             [R] Avg: 1722     RPC/s Min: 0        RPC/s Max: 3444     RPC/s
>             [W] Avg: 2486     RPC/s Min: 0        RPC/s Max: 4972     RPC/s
>             [LNet Bandwidth of servers]
>             [R] Avg: 479.09   MiB/s Min: 0.00     MiB/s Max: 958.18   MiB/s
>             [W] Avg: 764.19   MiB/s Min: 0.00     MiB/s Max: 1528.38  MiB/s
>             [LNet Rates of servers]
>             [R] Avg: 1741     RPC/s Min: 0        RPC/s Max: 3482     RPC/s
>             [W] Avg: 2513     RPC/s Min: 0        RPC/s Max: 5025     RPC/s
>             [LNet Bandwidth of servers]
>             [R] Avg: 484.59   MiB/s Min: 0.00     MiB/s Max: 969.19   MiB/s
>             [W] Avg: 771.94   MiB/s Min: 0.00     MiB/s Max: 1543.87  MiB/s
>             session is ended
>             ./lnet_test.sh: line 17:  4940 Terminated              lst stat servers
>
>             So it looks like LNet is really underperforming, running at
>             half or less of the InfiniBand capability.
>             How can I find out what is causing this?
>
>             Running tests with the InfiniBand perftest tools, I get good
>             results:
>
>              
>
>             ************************************
>             * Waiting for client to connect... *
>             ************************************
>
>             
> ---------------------------------------------------------------------------------------
>                                 Send BW Test
>              Dual-port       : OFF        Device         : mlx4_0
>              Number of qps   : 1        Transport type : IB
>              Connection type : RC        Using SRQ      : OFF
>              RX depth        : 512
>              CQ Moderation   : 100
>              Mtu             : 2048[B]
>              Link type       : IB
>              Max inline data : 0[B]
>              rdma_cm QPs     : OFF
>              Data ex. method : Ethernet
>             
> ---------------------------------------------------------------------------------------
>              local address: LID 0x07 QPN 0x020f PSN 0xacc37a
>              remote address: LID 0x0a QPN 0x020f PSN 0x91a069
>             
> ---------------------------------------------------------------------------------------
>              #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
>             (ib_send_bw printed "Conflicting CPU frequency values detected ...
>             CPU Frequency is not max." before every row below; the repeated
>             warnings are omitted here for readability.)
>              2          1000             0.00               11.99              6.285330
>              4          1000             0.00               28.26              7.409324
>              8          1000             0.00               54.47              7.139164
>              16         1000             0.00               113.13             7.413889
>              32         1000             0.00               226.07             7.407811
>              64         1000             0.00               452.12             7.407465
>              128        1000             0.00               845.45             6.925918
>              256        1000             0.00               1746.93            7.155406
>              512        1000             0.00               2766.93            5.666682
>              1024       1000             0.00               3516.26            3.600646
>              2048       1000             0.00               3630.93            1.859035
>              4096       1000             0.00               3702.39            0.947813
>              8192       1000             0.00               3724.82            0.476777
>              16384      1000             0.00               3731.21            0.238798
>              32768      1000             0.00               3735.32            0.119530
>              65536      1000             0.00               3736.98            0.059792
>              131072     1000             0.00               3737.80            0.029902
>              262144     1000             0.00               3738.43            0.014954
>              524288     1000             0.00               3738.50            0.007477
>              1048576    1000             0.00               3738.65            0.003739
>              2097152    1000             0.00               3738.65            0.001869
>              4194304    1000             0.00               3738.80            0.000935
>              8388608    1000             0.00               3738.76            0.000467
>             ---------------------------------------------------------------------------------------
>
>             The RDMA modules are loaded:
>
>             rpcrdma                90366  0
>             rdma_ucm               26837  0
>             ib_uverbs              51854  2 ib_ucm,rdma_ucm
>             rdma_cm                53755  5 rpcrdma,ko2iblnd,ib_iser,rdma_ucm,ib_isert
>             ib_cm                  47149  5 rdma_cm,ib_srp,ib_ucm,ib_srpt,ib_ipoib
>             iw_cm                  46022  1 rdma_cm
>             ib_core               210381  15 rdma_cm,ib_cm,iw_cm,rpcrdma,ko2iblnd,mlx4_ib,ib_srp,ib_ucm,ib_iser,ib_srpt,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib,ib_isert
>             sunrpc                334343  17 nfs,nfsd,rpcsec_gss_krb5,auth_rpcgss,lockd,nfsv4,rpcrdma,nfs_acl
>
>             I do not know where to look to make LNet perform faster.
>             I am running my ib0 interface in connected mode with a
>             65520-byte MTU.
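>
>             For what it is worth, the IPoIB mode and MTU can be
>             double-checked with, for example:
>
>             cat /sys/class/net/ib0/mode    # should print "connected"
>             ip link show ib0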
>
>             Any hint will be much appreciated
>
>             thank you
>
>             Rick
>
>              
>
>              
>
>              
>
>             On 8/18/17 9:05 AM, Mannthey, Keith wrote:
>
>                 I would suggest a few other tests to help isolate where
>                 the issue might be.
>
>                  
>
>                 1. What is the single-thread "dd" write speed? (A sketch is
>                 included after item 3 below.)
>
>                  
>
>                 2. lnet_selftest: Please see "Chapter 28. Testing Lustre
>                 Network Performance (LNet Self-Test)" in the Lustre manual
>                 if this is a new test for you.
>
>                 This will help show how much LNet bandwidth you have from
>                 your single client. There are tunables in the LNet layer
>                 that can affect things. Which QDR HCA are you using?
>
>                  
>
>                 3. obdfilter-survey: Please see "29.3. Testing OST
>                 Performance (obdfilter-survey)" in the Lustre manual. This
>                 test will help demonstrate what the backend NVMe/ZFS setup
>                 can do at the OBD layer in Lustre.
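>
>                 For reference, minimal sketches of both tests (the client
>                 mount point /mnt/lustre and the OST name drpffb-OST0001 are
>                 assumptions; adjust them to the actual setup):
>
>                 # item 1: single-thread dd write from the client, bypassing the page cache
>                 dd if=/dev/zero of=/mnt/lustre/ddtest.bin bs=1M count=10000 oflag=direct
>                 rm -f /mnt/lustre/ddtest.bin
>
>                 # item 3: obdfilter-survey run locally on the OSS, roughly matching
>                 # the obj/thr/size combinations reported later in this thread
>                 nobjlo=1 nobjhi=2 thrlo=1 thrhi=2 size=10240 case=disk \
>                 targets="drpffb-OST0001" obdfilter-survey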
>
>                  
>
>                 Thanks,
>
>                  Keith 
>
>                 -----Original Message-----
>
>                 From: lustre-discuss 
> [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of Riccardo Veraldi
>
>                 Sent: Thursday, August 17, 2017 10:48 PM
>
>                 To: Dennis Nelson <dnel...@ddn.com>; lustre-discuss@lists.lustre.org
>
>                 Subject: Re: [lustre-discuss] Lustre poor performance
>
>                  
>
>                 this is my lustre.conf
>
>                  
>
>                 [drp-tst-ffb01:~]$ cat /etc/modprobe.d/lustre.conf
>                 options lnet networks=o2ib5(ib0),tcp5(enp1s0f0)
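>
>                 Once the lnet module is loaded, the configured NIDs can be
>                 double-checked with, for example:
>
>                 lctl list_nids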
>
>                  
>
>                 Data transfer is over InfiniBand.
>
>                  
>
>                 ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 65520
>
>                         inet 172.21.52.83  netmask 255.255.252.0  broadcast 172.21.55.255
>
>                  
>
>                  
>
>                 On 8/17/17 10:45 PM, Riccardo Veraldi wrote:
>
>                     On 8/17/17 9:22 PM, Dennis Nelson wrote:
>
>                         It appears that you are running iozone on a single 
> client?  What kind of network is tcp5?  Have you looked at the network to 
> make sure it is not the bottleneck?
>
>                          
>
>                     Yes, the data transfer is on the ib0 interface, and I did a
>                     memory-to-memory test through InfiniBand QDR, resulting in
>                     3.7 GB/sec.
>
>                     TCP is used to connect to the MDS. It is tcp5 to
>                     differentiate it from my many other Lustre clusters; I
>                     could have called it tcp, but it makes no difference
>                     performance-wise.
>
>                     Yes, I ran the test from one single node, and I also ran
>                     the same test locally on a zpool identical to the one on
>                     the Lustre OSS.
>
>                     I have 4 identical servers, each of them with the same
>                     NVMe disks:
>
>                      
>
>                     server1: OSS - OST1 Lustre/ZFS  raidz1
>
>                      
>
>                     server2: OSS - OST2 Lustre/ZFS  raidz1
>
>                      
>
>                     server3: local ZFS raidz1
>
>                      
>
>                     server4: Lustre client
>
>                      
>
>                      
>
>                      
>
>

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
