Here is the reason: it's a CentOS 7.5 kernel bug.
https://bugs.centos.org/view.php?id=15193
On 9/10/18 11:05 PM, Riccardo Veraldi wrote:
Hello,
I installed a new Lustre system where the MDS and OSSes run version 2.10.5.
The Lustre clients run 2.10.1 and 2.9.0.
When I try to mount the filesystem, it fails with these errors:
OSS:
Sep 10 22:39:46 psananehoss01 kernel: LNetError: 10055:0:(o2iblnd_cb.c:2513:kiblnd_passive_connect()) Can't accept 172.21.52.33@o2ib2: -22
Sep 10 22:39:46 psananehoss01 kernel: LNet: 10055:0:(o2iblnd_cb.c:2212:kiblnd_reject()) Error -22 sending reject
Client:
Sep 10 22:41:26 psana101 kernel: LNetError: 336:0:(o2iblnd_cb.c:2726:kiblnd_rejected()) 172.21.52.90@o2ib2 rejected: consumer defined fatal error
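For reading the errors above: the trailing -22 is a negated Linux errno. A minimal Python sketch, assuming the OSS log line is joined onto one line, that pulls out the peer NID and names the errno (the regex and variable names are illustrative, not part of any Lustre tool):

```python
import errno
import re

# OSS log line from the report above, joined onto a single line.
LOG = ("Sep 10 22:39:46 psananehoss01 kernel: LNetError: "
       "10055:0:(o2iblnd_cb.c:2513:kiblnd_passive_connect()) "
       "Can't accept 172.21.52.33@o2ib2: -22")

# Capture the peer NID (address@network) and the negative return code.
match = re.search(r"(\S+@\S+): (-\d+)$", LOG)
nid = match.group(1)
code = -int(match.group(2))

# Kernel return codes are negated errno values; look up the symbolic name.
name = errno.errorcode[code]
print(nid, name)
```

EINVAL only says the connection request was rejected as invalid; it does not by itself identify the cause.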
I am afraid this is the consequence of a mixed configuration.
On the client side, Lustre is configured in /etc/modprobe.d/lustre.conf:
options lnet networks=o2ib2(ib0),tcp0(enp6s0),tcp1(enp6s0),tcp2(enp6s0)
On the OSS side I am using lnet.conf:
ip2nets:
  - net-spec: o2ib2
    interfaces:
        0: ib0
  - net-spec: tcp2
    interfaces:
        0: enp8s0f0
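To compare the two configuration styles side by side, here is a rough Python sketch that renders the ip2nets entries above as a static modprobe-style line. The helper name to_modprobe_line and the dict layout are hypothetical, and the result is only assumed to be the closest static counterpart of this lnet.conf, not a verified equivalent:

```python
# ip2nets entries from the OSS lnet.conf above, written as plain Python data.
ip2nets = [
    {"net-spec": "o2ib2", "interfaces": {0: "ib0"}},
    {"net-spec": "tcp2", "interfaces": {0: "enp8s0f0"}},
]

def to_modprobe_line(entries):
    """Hypothetical helper: build an 'options lnet networks=...' line."""
    parts = []
    for entry in entries:
        ifaces = entry["interfaces"]
        # Interfaces are keyed by index; emit them in index order.
        names = ",".join(ifaces[i] for i in sorted(ifaces))
        parts.append(f'{entry["net-spec"]}({names})')
    return "options lnet networks=" + ",".join(parts)

line = to_modprobe_line(ip2nets)
print(line)
```

Note that the client line lists o2ib2, tcp0, tcp1 and tcp2, while the OSS side defines only o2ib2 and tcp2.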
I assumed that peers would be discovered and added to LNet
automatically.
Should I revert to a static lustre.conf on the OSS side too?
I have several Lustre clients, and I cannot add all of them to a peers
section inside lnet.conf on the OSS side.
Any hints are very welcome.
Thank you,
Rick
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org