Hi Rick,

If I don't manually add the "Lnet router + Server" peers as multi-rail enabled 
peers before running route add, the route add command creates a non-multi-rail 
peer with only the TCP NID for each "Lnet router + Server" (as seen in lines 
76-83 of https://pastebin.com/h3wHyCM7). The existence of those 2 peers then 
interferes with normal Lnet communication, and the server-side kernel prints 
the message "Bad dest nid n.n.n.n@o2ib (it's my nid but on a different 
network)".
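
For reference, the order that works on the client can be sketched as a small 
shell function. The peer and route commands are the ones quoted further down 
in this thread; the interface name eth0 and the LNETCTL override variable are 
assumptions added only for illustration.

```shell
#!/bin/sh
# Sketch of the working order on the client: peers first, then routes.
# LNETCTL is a hypothetical override so the sequence can be dry-run,
# e.g. LNETCTL="echo lnetctl".
LNETCTL="${LNETCTL:-lnetctl}"

configure_client() {
    # eth0 is an assumed interface name; substitute the real Ethernet device.
    $LNETCTL net add --net tcp --if eth0 || return 1

    # Peers first: declare both NIDs of each "Lnet router + Server" together.
    $LNETCTL peer add --nid 10.1.4.24@o2ib,10.4.7.24@tcp || return 1
    $LNETCTL peer add --nid 10.1.4.25@o2ib,10.4.7.25@tcp || return 1

    # Routes second: by now the gateway NIDs already belong to known peers.
    $LNETCTL route add --net o2ib --gateway 10.4.7.24@tcp || return 1
    $LNETCTL route add --net o2ib --gateway 10.4.7.25@tcp || return 1
}
```

Calling configure_client after an lnet restart with an empty /etc/lnet.conf 
runs the commands in that order and stops at the first one that fails.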

This is also what happens when lnet.conf is imported by lnetctl: if lnetctl 
imports the peers before the routes, no extraneous peer entries are created 
and everything works as expected (as shown in the output at line 16). If 
lnetctl imports the routes before the peers, the scenario described in the 
previous paragraph occurs and leaves Lnet unusable on the client. The order in 
which lnetctl imports each section depends on the order in which the sections 
appear in the YAML file.
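
As a rough sketch of a file with the sections in the working order, the 
snippet below writes an example config with peer placed before route. The key 
names follow the layout I would expect from lnetctl export, but please treat 
them as an assumption and compare against your own export. The file path, the 
eth0 interface, and the choice of the o2ib NID as primary nid are likewise 
assumptions for illustration.

```shell
#!/bin/sh
# Write an example config whose sections appear in the order net, peer, route.
# LNET_CONF_EXAMPLE is a hypothetical path used only for this sketch.
LNET_CONF_EXAMPLE="${LNET_CONF_EXAMPLE:-./lnet.conf.example}"

cat > "$LNET_CONF_EXAMPLE" <<'EOF'
net:
    - net type: tcp
      local NI(s):
        - interfaces:
              0: eth0
peer:
    - primary nid: 10.1.4.24@o2ib
      Multi-Rail: True
      peer ni:
        - nid: 10.1.4.24@o2ib
        - nid: 10.4.7.24@tcp
    - primary nid: 10.1.4.25@o2ib
      Multi-Rail: True
      peer ni:
        - nid: 10.1.4.25@o2ib
        - nid: 10.4.7.25@tcp
route:
    - net: o2ib
      gateway: 10.4.7.24@tcp
    - net: o2ib
      gateway: 10.4.7.25@tcp
EOF

# On the client, the file would then be loaded with:
#   lnetctl import "$LNET_CONF_EXAMPLE"
```

Swapping the peer and route blocks in this file should reproduce the failing 
case described above.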

Best regards,
Angelos Ching

E: angelosch...@clustertech.com
P: +852-2655-6138
A: 210-213, Lake Side 1, Science Park, Hong Kong
W: http://clustertech.com

September 4, 2020 11:06 PM, "Mohr Jr, Richard Frank" <rm...@utk.edu> wrote:

>> On Sep 4, 2020, at 12:11 AM, Angelos Ching <angelosch...@clustertech.com> 
>> wrote:
>> 
>> All steps below carried out on Lustre client:
>> 
>> 1. Restart lnet service with empty /etc/lnet.conf
>> 2. lnetctl net add: TCP network using Ethernet
>> 3. lnetctl peer add: 2 peers with "Lnet router + server"@o2ib,tcp NIDs
> 
> The commands you ran were:
> 
> [root@access2 ~]# lnetctl peer add --nid 10.1.4.24@o2ib,10.4.7.24@tcp
> [root@access2 ~]# lnetctl peer add --nid 10.1.4.25@o2ib,10.4.7.25@tcp
> 
> Commands like this can be used when a node has a multirail setup, such as
> when it has multiple interfaces on the same network. But for your routers, it
> looks like the tcp network is available to the client and the o2ib network is
> available to the server. Since those interfaces are not on the same network,
> you don’t need to add both of them as a peer.
> 
>> 4. lnetctl route add: 2 gateways to o2ib network using "Lnet router +
>> server"@TCP NID
> 
> [root@access2 ~]# lnetctl route add --net o2ib --gateway 10.4.7.24@tcp
> [root@access2 ~]# lnetctl route add --net o2ib --gateway 10.4.7.25@tcp
> 
> These should be the only commands you need to run to configure your routing.
> 
> -Rick
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
