Stephane,
thank you for the hint. I will set up a test VM with Mellanox OFED and let you
know if the issue disappears.
Cheers,
Uwe
On 14.08.20 at 06:40, Stephane Thiell wrote:
Hi Uwe,
We had a similar problem in the past and our conclusion was that the in-kernel
OFED drivers (provided by the distribution) likely don't have all the patches
required for SR-IOV to work correctly in VMs (when using VFs on IB HBAs and
KVM). Mellanox OFED is required at least on the VMs when using SR-IOV;
otherwise you get these random network errors.
MOFED 4.9 is an LTS release for ConnectX-3 HBAs and should work well with
Lustre 2.12.5 on your VMs. In my opinion it's definitely worth the effort in
the long term.
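Roughly the procedure we use on our VMs, as a sketch (the tarball name is a
placeholder for whatever MOFED 4.9 build matches your distribution):

  # Show which OFED stack is active; this prints the MOFED version string,
  # or fails if only the in-kernel drivers are installed.
  ofed_info -s

  # Install MOFED 4.9 LTS, rebuilding its kernel modules against the
  # running (non-stock) kernel.
  tar xzf MLNX_OFED_LINUX-4.9-*.tgz
  cd MLNX_OFED_LINUX-4.9-*/
  ./mlnxofedinstall --add-kernel-support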
Best,
Stephane
On Aug 13, 2020, at 5:20 AM, Uwe Sauter <uwe.sauter...@gmail.com> wrote:
Dear all,
(TL;DR at the bottom)
I have the following situation:
Topology:

Lustre servers @ o2ib20 --- o2ib20 (10.148.0.0/16) --- LNET router ---
o2ib43 (10.225.0.0/16) --- Lustre servers @ o2ib43

The virtualization host and its guests (VM 1, VM 2, VM 3, all @ o2ib43) are
attached to the o2ib43 fabric.

Virtualization host:
* Proxmox 6.2, up-to-date
** Debian 10.5 based
** Ubuntu based kernel 5.4.44-2-pve
* ConnectX-3 (MCX354A-FCBT)
** 15 VFs configured
** SR-IOV
* OFED provided by distribution

Virtual machines and LNET routers:
* CentOS 7.8 based
* OFED provided by CentOS
* Lustre 2.12.5
* Kernel 3.10.0-1127.18.2.el7
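For anyone reproducing this: a ConnectX-3 with 15 VFs is typically set up
along these lines (a sketch; the mst device name is an example and may differ
on your system):

  # Enable SR-IOV and 15 VFs in the ConnectX-3 firmware (Mellanox mft tools):
  mst start
  mlxconfig -d /dev/mst/mt4099_pci_cr0 set SRIOV_EN=1 NUM_OF_VFS=15

  # Have mlx4_core create the VFs, e.g. in /etc/modprobe.d/mlx4_core.conf:
  options mlx4_core num_vfs=15 probe_vf=0

  # After a reboot the VFs show up as additional PCI functions:
  lspci | grep -i mellanox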
Lustre @ o2ib20 is a Sonexion appliance based on CentOS 7.2 and Lustre version
2.11.0.300_cray_43_gd35e657_dirty.
Lustre @ o2ib43 is a CentOS 7.6 based setup with kernel 3.10.0-957.1.3.el7_lustre
and Lustre version lustre-2.10.7.1nec-1.el7.x86_64.
The issue I currently see is that once more than one VM is running on the
virtualization host, access to the Lustre file system behind the LNET routers
gets stuck.
The errors I see on a VM are, e.g.:
[ 1297.470192] LustreError: 2477:0:(events.c:200:client_bulk_callback()) event type 1, status -5, desc ffff89365cebb800
[ 1297.472058] LustreError: 2478:0:(events.c:200:client_bulk_callback()) event type 1, status -5, desc ffff89365cebb800
[ 1297.473909] Lustre: 2490:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1597316593/real 1597316593] req@ffff89365cee0900 x1674906532108800/t0 (0) o4->snx11167-OST001a-osc-ffff893468b2f800@10.148.240.33@o2ib20:6/4 lens 488/448 e 0 to 1 dl 1597316688 ref 2 fl Rpc:eX/0/ffffffff rc 0/-1
[ 1297.479055] Lustre: snx11167-OST001a-osc-ffff893468b2f800: Connection to snx11167-OST001a (at 10.148.240.33@o2ib20) was lost; in progress operations using this service will wait for recovery to complete
[ 1299.470205] LustreError: 2478:0:(events.c:200:client_bulk_callback()) event type 1, status -5, desc ffff89365cebb800
[ 1299.472403] LustreError: 2477:0:(events.c:200:client_bulk_callback()) event type 1, status -5, desc ffff89365cebb800
[ 1299.474395] Lustre: 2490:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1597316595/real 1597316595] req@ffff89365cee0900 x1674906532108800/t0 (0) o4->snx11167-OST001a-osc-ffff893468b2f800@10.148.240.33@o2ib20:6/4 lens 488/448 e 0 to 1 dl 1597316690 ref 2 fl Rpc:eX/2/ffffffff rc 0/-1
[ 1299.479830] Lustre: snx11167-OST001a-osc-ffff893468b2f800: Connection to snx11167-OST001a (at 10.148.240.33@o2ib20) was lost; in progress operations using this service will wait for recovery to complete
[ 1299.496826] Lustre: snx11167-OST001a-osc-ffff893468b2f800: Connection restored to 10.148.240.33@o2ib20 (at 10.148.240.33@o2ib20)
[ 1301.470102] LustreError: 2478:0:(events.c:200:client_bulk_callback()) event type 1, status -5, desc ffff89365cebb800
[ 1301.472096] LustreError: 2477:0:(events.c:200:client_bulk_callback()) event type 1, status -5, desc ffff89365cebb800
[ 1301.474135] Lustre: 2490:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1597316597/real 1597316597] req@ffff89365cee0900 x1674906532108800/t0 (0) o4->snx11167-OST001a-osc-ffff893468b2f800@10.148.240.33@o2ib20:6/4 lens 488/448 e 0 to 1 dl 1597316692 ref 2 fl Rpc:eX/2/ffffffff rc 0/-1
[ 1301.479772] Lustre: snx11167-OST001a-osc-ffff893468b2f800: Connection to snx11167-OST001a (at 10.148.240.33@o2ib20) was lost; in progress operations using this service will wait for recovery to complete
[ 1301.483576] LNetError: 2486:0:(lib-move.c:1999:lnet_handle_find_routed_path()) no route to 10.148.240.33@o2ib20 from <?>
Access to the Lustre file system that is on the same IB fabric is still
possible, so I suspect that this is somehow related to LNET routing.
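To narrow it down, these are the kinds of checks I can run on a VM (the
router NID below is a placeholder):

  # Local NIDs of the VM:
  lctl list_nids

  # Configured LNET routes and whether the router is considered up:
  lnetctl route show --verbose

  # Reachability of the router, and of a server NID behind it:
  lctl ping <router-nid>@o2ib43
  lctl ping 10.148.240.33@o2ib20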
If I run LNET selftests as explained at http://wiki.lustre.org/LNET_Selftest
between one of the LNET routers and the VMs, I can see that RPCs get dropped.
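A minimal session along the lines of the wiki (NIDs are placeholders for one
router and one VM):

  export LST_SESSION=$$
  lst new_session rpc_check
  lst add_group routers <router-nid>@o2ib43
  lst add_group vms <vm-nid>@o2ib43
  lst add_batch bulk
  lst add_test --batch bulk --from vms --to routers brw read check=simple size=1M
  lst run bulk
  lst stat vms routers   # dropped RPCs and errors show up here
  lst end_session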
Access from a client running on native hardware is possible for both file
systems.
Does anyone have a comparable setup? What kind of logs are needed to debug
this? I'll gladly provide any info…
TL;DR
* native access is possible in the same IB fabric as well as when being routed
between different fabrics
* if only one VM is running, then access is possible to both file systems, too
* if more VMs are running on the same virtualization host, then access is only
possible to the file system attached to the same fabric as the VMs
* access to the routed file system gets stuck
Any help is appreciated.
Thanks,
Uwe Sauter
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org