September 5, 2020 1:04 AM, "Mohr Jr, Richard Frank" <rm...@utk.edu> wrote:
> So your server has both tcp and o2ib NIDs, and you want the server to route
> requests from tcp clients to other resources on the o2ib network. But when
> you mount Lustre, you want the client to use the server’s o2ib NID instead
> of mounting with the server’s tcp NID.

Correct. Actually, the pair of LNet routers are themselves also serving as MDS and OSS, and there are 4 more MDS/OSS that are only on o2ib, serving yet another file system.

With the extraneous peer added by route add, the LNet router prints the following kernel message:

> LNetError: 34250:0:(lib-move.c:4259:lnet_parse()) 10.4.7.145@tcp, src
> 10.4.7.145@tcp: Bad dest nid 10.1.4.24@o2ib (it's my nid but on a different
> network)

This can be worked around by manually adding the routers as peers with both NIDs prior to route add; whether o2ib or tcp is used as the primary NID does not seem to matter. I also just discovered that performing an lnetctl discover on the router's TCP NID, either before or after route add, also yields a usable LNet.

After discovering the latter workaround, I've implemented it using a systemd drop-in for the lnet.service unit.

Best regards,
Angelos Ching

E: angelosch...@clustertech.com
P: +852-2655-6138
A: 210-213, Lake Side 1, Science Park, Hong Kong
W: http://clustertech.com

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
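
For readers who want to try the lnetctl discover workaround, a drop-in along these lines might serve as a starting point. It is only a sketch under stated assumptions, not the exact file used in the message above: the drop-in file name is hypothetical, lnetctl is assumed to be installed at /usr/sbin/lnetctl, and 192.0.2.10@tcp is a placeholder standing in for the router's real TCP NID, which the message does not give.

```ini
# /etc/systemd/system/lnet.service.d/10-discover-router.conf
# Hypothetical file name; systemd reads any *.conf file in lnet.service.d
# as a drop-in for lnet.service.

[Service]
# Run peer discovery against the router's TCP NID once LNet has started.
# 192.0.2.10@tcp is a placeholder; substitute the router's actual TCP NID.
# Assumes lnetctl lives at /usr/sbin/lnetctl on this system.
ExecStartPost=/usr/sbin/lnetctl discover 192.0.2.10@tcp
```

After creating the file, run systemctl daemon-reload and restart lnet.service for it to take effect. With a pair of routers, one ExecStartPost line per router TCP NID would presumably be needed. Since the message reports that discovery works either before or after route add, the ordering relative to route configuration should not be critical.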