We currently run Lustre over an InfiniBand network. We would like to allow Ethernet-only clients (i.e. no InfiniBand) to use the same Lustre file system.

The MDT and OSTs were all created using only the MGS InfiniBand interface (i.e. mkfs.lustre --mgsnode=10.2.1...@ib0).

We have modified the OSTs and MDS to use both the ib and tcp networks via the lnet module options:
>options lnet networks=o2ib0(ib0),tcp(eth0)

For the ethernet clients, we specify only the tcp network for the lnet module:
>options lnet networks=tcp(eth0)

For the InfiniBand clients, we specify only the ib network for the lnet module:
>options lnet networks=o2ib(ib0)

From an InfiniBand client, we mount Lustre using the MDS InfiniBand IP address, e.g.:
mount -t lustre 10.2.1...@ib0:/lustre0   /lustre0

On an Ethernet client, mounting Lustre using the MDS tcp interface fails, e.g.:
mount -t lustre 10.1.1...@tcp0:/lustre0   /lustre0

From the syslog messages (included at the end of this message), it appears the MGS tells the Ethernet client to use the ib0 interface to find the MDT.
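
To make the suspected failure concrete, here is a rough Python sketch of the check I think the client is failing. It is only a simplified model, not Lustre's actual logic: the function names are made up for illustration, it assumes a network name without a trailing number means network 0 (tcp is tcp0, o2ib is o2ib0), and it ignores LNet routers entirely.

```python
import re


def normalize(net):
    """Treat a network name without a trailing number as network 0."""
    return net if net[-1].isdigit() else net + "0"


def local_networks(option):
    """Parse an lnet 'networks=' value such as 'o2ib0(ib0),tcp(eth0)'."""
    # Drop the interface lists in parentheses, then split on commas.
    stripped = re.sub(r"\([^)]*\)", "", option)
    return {normalize(n.strip()) for n in stripped.split(",") if n.strip()}


def can_reach(nid, option):
    """True if the NID's network is one this node is directly attached to."""
    _, _, net = nid.rpartition("@")
    return normalize(net) in local_networks(option)


# Ethernet client (tcp only) handed the MDT's o2ib NID by the MGS:
print(can_reach("10.2.1...@o2ib", "tcp(eth0)"))
# InfiniBand client handed the same NID:
print(can_reach("10.2.1...@o2ib", "o2ib(ib0)"))
```

If this model is right, the tcp-only client can talk to the MGS over tcp but has no local network matching the only NID recorded for the MDT, which would match the "No NID found" error.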

So I am missing some configuration for the tcp network. My best guess is that we need to use the "tunefs.lustre" command to add the MGS tcp NID to the MDT. However, this command won't execute while Lustre is mounted.

Any suggestions on how to add the tcp network to our existing Lustre ib network?

Thanks,
--Chris

Jan  7 17:25:41 bulldogk kernel: LustreError: 31704:0:(events.c:454:ptlrpc_uuid_to_peer()) No NID found for 10.2.1...@o2ib
Jan  7 17:25:41 bulldogk kernel: LustreError: 31704:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot find peer 10.2.1...@o2ib!
Jan  7 17:25:41 bulldogk kernel: LustreError: 31704:0:(ldlm_lib.c:321:client_obd_setup()) can't add initial connection
Jan  7 17:25:41 bulldogk kernel: LustreError: 31704:0:(obd_config.c:332:class_setup()) setup lustre0-MDT0000-mdc-ffff81022d67c800 failed (-2)
Jan  7 17:25:41 bulldogk kernel: LustreError: 8819:0:(connection.c:144:ptlrpc_put_connection()) NULL connection
Jan  7 17:25:41 bulldogk kernel: LustreError: 31704:0:(obd_config.c:1071:class_config_llog_handler()) Err -2 on cfg command:
Jan  7 17:25:41 bulldogk kernel: Lustre:    cmd=cf003 0:lustre0-MDT0000-mdc  1:lustre0-MDT0000_UUID  2:10.2.1...@o2ib
Jan  7 17:25:41 bulldogk kernel: LustreError: 15c-8: mgc10.1.1...@tcp: The configuration from log 'lustre0-client' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Jan  7 17:25:41 bulldogk kernel: LustreError: 31704:0:(llite_lib.c:1061:ll_fill_super()) Unable to process log: -2
Jan  7 17:25:41 bulldogk kernel: LustreError: 31704:0:(obd_config.c:399:class_cleanup()) Device 2 not setup
Jan  7 17:25:41 bulldogk kernel: LustreError: 31704:0:(ldlm_request.c:986:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
Jan  7 17:25:41 bulldogk kernel: LustreError: 31704:0:(ldlm_request.c:1575:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Jan  7 17:25:41 bulldogk kernel: Lustre: client ffff81022d67c800 umount complete
Jan  7 17:25:41 bulldogk kernel: LustreError: 31704:0:(obd_mount.c:1951:lustre_fill_super()) Unable to mount  (-2)