September 5, 2020 1:04 AM, "Mohr Jr, Richard Frank" <rm...@utk.edu> wrote:
> So your server has both tcp and o2ib NIDs, and you want the server to route
> requests from tcp clients to other resources on the o2ib network. But when you
> mount Lustre, you want the client to use the server’s o2ib NID instead of
> mounting with the server’s tcp NID.

Correct.

Actually, the pair of LNet routers themselves also serve as MDS and OSS, and
there are 4 more MDS/OSS nodes that are only on o2ib, serving yet another file
system. With the extraneous peer added by route add, the LNet router prints the
following kernel message:
> LNetError: 34250:0:(lib-move.c:4259:lnet_parse()) 10.4.7.145@tcp, src 
> 10.4.7.145@tcp: Bad dest nid 10.1.4.24@o2ib (it's my nid but on a different 
> network)

This can be worked around by manually adding the routers as peers with both
NIDs prior to route add; whether o2ib or tcp is used as the primary NID does
not seem to matter. I also just discovered that running lnetctl discover
against the router's TCP NID, either before or after route add, also yields a
usable LNet. After discovering the latter workaround, I implemented it using a
systemd drop-in for the lnet.service unit.
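
For readers who want to try this, the two workarounds can be sketched roughly
as follows. This is only an illustration under assumptions: the router TCP NID
192.0.2.10@tcp is a made-up placeholder (it does not appear in this thread),
10.1.4.24@o2ib is the router NID taken from the kernel message above, and the
helper names run, workaround_peer_add and workaround_discover are hypothetical.
By default the script only prints the lnetctl commands instead of running them.

```shell
#!/bin/sh
# Sketch only: NIDs below are assumptions, replace them with your own.
ROUTER_TCP_NID="${ROUTER_TCP_NID:-192.0.2.10@tcp}"    # placeholder, not from the thread
ROUTER_O2IB_NID="${ROUTER_O2IB_NID:-10.1.4.24@o2ib}"  # router NID seen in the kernel message
DRY_RUN="${DRY_RUN:-1}"                               # 1 = print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Workaround 1: declare the router as a single peer with both NIDs,
# then add the route to the o2ib network through its TCP NID.
workaround_peer_add() {
    run lnetctl peer add --prim_nid "$ROUTER_TCP_NID" --nid "$ROUTER_O2IB_NID"
    run lnetctl route add --net o2ib --gateway "$ROUTER_TCP_NID"
}

# Workaround 2: add the route, then run discovery against the router's
# TCP NID (the order of these two steps reportedly does not matter).
workaround_discover() {
    run lnetctl route add --net o2ib --gateway "$ROUTER_TCP_NID"
    run lnetctl discover "$ROUTER_TCP_NID"
}
```

Sourcing the file and calling workaround_discover with DRY_RUN=1 prints the two
commands for review; setting DRY_RUN=0 on a tcp client would execute them. For
the systemd variant, one plausible shape (an assumption, since the exact
drop-in is not shown here) is a file under /etc/systemd/system/lnet.service.d/
whose [Service] section runs the lnetctl discover command through
ExecStartPost=.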

Best regards.
Angelos Ching

E: angelosch...@clustertech.com
P: +852-2655-6138
A: 210-213, Lake Side 1, Science Park, Hong Kong
W: http://clustertech.com