Hi Uwe,

We had a similar problem in the past, and our conclusion was that the in-kernel 
OFED drivers (provided by the distribution) likely don't have all the patches 
required for SR-IOV to work correctly in VMs (when using VFs on IB HBAs with 
KVM). When using SR-IOV, Mellanox OFED is required at least in the VMs, 
otherwise you get these random network errors.
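
A quick way to check which stack a VM is actually running is something along 
these lines (ofed_info only exists once MOFED is installed):

    ofed_info -s                        # MOFED prints its version string, e.g. MLNX_OFED_LINUX-4.9-...
    modinfo mlx4_core | grep -i version # MOFED builds of mlx4_core report a MOFED version here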

MOFED 4.9 is an LTS release for ConnectX-3 HBAs and should work well with 
Lustre 2.12.5 on your VMs. In my opinion it is definitely worth the effort in 
the long term.
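
If you go that route, the sequence is roughly the following (file names and 
paths are from memory, adjust to your exact MOFED tarball and kernel):

    # on each VM
    tar xzf MLNX_OFED_LINUX-4.9-*-rhel7.8-x86_64.tgz
    cd MLNX_OFED_LINUX-4.9-*
    ./mlnxofedinstall --add-kernel-support
    reboot

    # then rebuild the Lustre 2.12.5 client against MOFED instead of the
    # in-kernel OFED, e.g. when building from source:
    ./configure --with-o2ib=/usr/src/ofa_kernel/default
    make rpms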

Best,

Stephane

> On Aug 13, 2020, at 5:20 AM, Uwe Sauter <uwe.sauter...@gmail.com> wrote:
> 
> Dear all,
> 
> (TL;DR at the bottom)
> 
> I have the following situation:
> 
> 
> +----------------+
> |                +--------------+                +----------------------------------------------+
> | Lustre servers |              |                |  Virtualization host:                        |
> |    @ o2ib20    |    +---------+--------+       |  * Proxmox 6.2, up-to-date                   |
> |                |    |                  |       |  ** Debian 10.5 based                        |
> +----------------+    |      o2ib20      |       |  ** Ubuntu based kernel 5.4.44-2-pve         |
>                       |  10.148.0.0/16   |       |  * ConnectX-3 (MCX354A-FCBT)                 |
>                       |                  |       |  ** 15 VFs configured                        |
>                       +---------+--------+       |  ** SR-IOV                                   |
>                                 |                |  * OFED provided by distribution             |
>          +----------------------+                |                                              |
>          |                                       |  Virtual machines and LNET routers:          |
> +--------+-------+                               |  * CentOS 7.8 based                          |
> |  LNET router   |                               |  * OFED provided by CentOS                   |
> +--------+-------+                               |  * Lustre 2.12.5                             |
>          |                                       |  * Kernel 3.10.0-1127.18.2.el7               |
>          +----------------------+                |                                              |
>                                 |                |                                              |
>                       +---------+--------+       |                                              |
>                       |                  +----------------+--------------+--------------+       |
>                       |      o2ib43      |       |        |              |              |       |
> +----------------+    |  10.225.0.0/16   |       | +------+-----+ +------+-----+ +------+-----+ |
> |                |    |                  |       | |            | |            | |            | |
> | Lustre servers |    +---------+--------+       | |    VM 1    | |    VM 2    | |    VM 3    | |
> |    @ o2ib43    |              |                | |  @ o2ib43  | |  @ o2ib43  | |  @ o2ib43  | |
> |                +--------------+                | +------------+ +------------+ +------------+ |
> +----------------+                               |                                              |
>                                                  +----------------------------------------------+
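> 
> For reference, on ConnectX-3 the VFs are typically enabled via mlx4_core 
> module options on the host; a sketch of that part of the setup (probe_vf is 
> only an example value here):
> 
>     # /etc/modprobe.d/mlx4_core.conf on the virtualization host
>     options mlx4_core num_vfs=15 probe_vf=0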
> 
> 
> Lustre @ o2ib20 is a Sonexion appliance based on CentOS 7.2 and Lustre 
> version 2.11.0.300_cray_43_gd35e657_dirty.
> 
> Lustre @ o2ib43 is a CentOS 7.6 based setup with kernel 
> 3.10.0-957.1.3.el7_lustre and Lustre version lustre-2.10.7.1nec-1.el7.x86_64.
> 
> The issue I currently see is that once more than one VM is running on the 
> virtualization host, access to the Lustre file system behind the LNET routers 
> gets stuck.
> 
> The errors I can see on the VM are, e.g.:
> 
> [ 1297.470192] LustreError: 2477:0:(events.c:200:client_bulk_callback()) event type 1, status -5, desc ffff89365cebb800
> [ 1297.472058] LustreError: 2478:0:(events.c:200:client_bulk_callback()) event type 1, status -5, desc ffff89365cebb800
> [ 1297.473909] Lustre: 2490:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1597316593/real 1597316593]  req@ffff89365cee0900 x1674906532108800/t0(0) o4->snx11167-OST001a-osc-ffff893468b2f800@10.148.240.33@o2ib20:6/4 lens 488/448 e 0 to 1 dl 1597316688 ref 2 fl Rpc:eX/0/ffffffff rc 0/-1
> [ 1297.479055] Lustre: snx11167-OST001a-osc-ffff893468b2f800: Connection to snx11167-OST001a (at 10.148.240.33@o2ib20) was lost; in progress operations using this service will wait for recovery to complete
> [ 1299.470205] LustreError: 2478:0:(events.c:200:client_bulk_callback()) event type 1, status -5, desc ffff89365cebb800
> [ 1299.472403] LustreError: 2477:0:(events.c:200:client_bulk_callback()) event type 1, status -5, desc ffff89365cebb800
> [ 1299.474395] Lustre: 2490:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1597316595/real 1597316595]  req@ffff89365cee0900 x1674906532108800/t0(0) o4->snx11167-OST001a-osc-ffff893468b2f800@10.148.240.33@o2ib20:6/4 lens 488/448 e 0 to 1 dl 1597316690 ref 2 fl Rpc:eX/2/ffffffff rc 0/-1
> [ 1299.479830] Lustre: snx11167-OST001a-osc-ffff893468b2f800: Connection to snx11167-OST001a (at 10.148.240.33@o2ib20) was lost; in progress operations using this service will wait for recovery to complete
> [ 1299.496826] Lustre: snx11167-OST001a-osc-ffff893468b2f800: Connection restored to 10.148.240.33@o2ib20 (at 10.148.240.33@o2ib20)
> [ 1301.470102] LustreError: 2478:0:(events.c:200:client_bulk_callback()) event type 1, status -5, desc ffff89365cebb800
> [ 1301.472096] LustreError: 2477:0:(events.c:200:client_bulk_callback()) event type 1, status -5, desc ffff89365cebb800
> [ 1301.474135] Lustre: 2490:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1597316597/real 1597316597]  req@ffff89365cee0900 x1674906532108800/t0(0) o4->snx11167-OST001a-osc-ffff893468b2f800@10.148.240.33@o2ib20:6/4 lens 488/448 e 0 to 1 dl 1597316692 ref 2 fl Rpc:eX/2/ffffffff rc 0/-1
> [ 1301.479772] Lustre: snx11167-OST001a-osc-ffff893468b2f800: Connection to snx11167-OST001a (at 10.148.240.33@o2ib20) was lost; in progress operations using this service will wait for recovery to complete
> [ 1301.483576] LNetError: 2486:0:(lib-move.c:1999:lnet_handle_find_routed_path()) no route to 10.148.240.33@o2ib20 from <?>
> 
> 
> 
> Access to the Lustre file system that is on the same IB fabric is still 
> possible, so I suspect that this is somehow related to LNET routing.
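> 
> For reference, a routed client setup like this normally only needs the usual 
> lnet module options plus the route to the remote net; a sketch (the interface 
> name and router NID below are placeholders, not the real values):
> 
>     # /etc/modprobe.d/lustre.conf on the VMs
>     options lnet networks="o2ib43(ib0)" routes="o2ib20 <router-nid>@o2ib43"
> 
>     # quick checks from a VM
>     lnetctl route show
>     lctl ping <router-nid>@o2ib43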
> 
> If I run LNET selftests as explained at http://wiki.lustre.org/LNET_Selftest 
> between one of the LNET routers and the VMs, I can see that RPCs get dropped.
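> 
> The selftest follows that page and boils down to roughly the following (NIDs 
> below are placeholders for one VM and one LNET router on o2ib43):
> 
>     modprobe lnet_selftest          # on both nodes
>     export LST_SESSION=$$
>     lst new_session rw_test
>     lst add_group vms     <vm-nid>@o2ib43
>     lst add_group routers <router-nid>@o2ib43
>     lst add_batch bulk_rw
>     lst add_test --batch bulk_rw --from vms --to routers brw read size=1M
>     lst run bulk_rw
>     lst stat vms routers            # this is where the dropped RPCs show up
>     lst end_session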
> 
> 
> Access from a client running on native hardware is possible for both file 
> systems.
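> 
> A simple way to compare the two cases is an LNET-level ping of a routed server 
> NID (e.g. the OST NID from the log above) from a bare-metal client and from a 
> VM:
> 
>     lctl ping 10.148.240.33@o2ib20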
> 
> 
> 
> Does anyone have a comparable setup? What kind of logs are needed to debug 
> this? I'll gladly provide any info…
> 
> 
> TL;DR
> * native access is possible within the same IB fabric as well as when routed 
>   between different fabrics
> * if only one VM is running, access to both file systems is possible, too
> * if more than one VM is running on the same virtualization host, access is 
>   only possible to the file system attached to the same fabric as the VMs
> * access to the routed file system gets stuck
> 
> 
> 
> Any help is appreciated.
> 
> Thanks,
> 
>  Uwe Sauter
> 

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
