We currently use lustre over an infiniband network. We would like to
allow ethernet clients (i.e. clients with no infiniband) to use the same
lustre FS. The MDT and OSTs were all created using only the MGS infiniband
interface (i.e. mkfs.lustre --mgsnode=10.2.1...@ib0).
We have modified the OSTs and MDS to use both the ib and tcp networks via
the lustre lnet module options:
>options lnet networks=o2ib0(ib0),tcp(eth0)
For the ethernet clients, we specify only the tcp network for the lnet
module:
>options lnet networks=tcp(eth0)
For the infiniband clients, we use only the ib network in lnet module:
>options lnet networks=o2ib(ib0)
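As a sanity check on the three settings above, here is a minimal sketch that
lists which LNET networks two nodes have in common. The helper names
(lnet_nets, shared_nets) are made up for illustration and are not part of
lustre; the sketch assumes a missing network number means 0, so "tcp" is
treated the same as "tcp0" and "o2ib" the same as "o2ib0".

```shell
#!/bin/sh
# Hypothetical helpers, not part of lustre: compare two "networks=" values.

# Print one normalized LNET network name per line.
# Interface lists in parentheses are dropped, and a name without a
# trailing number gets "0" appended (assumption: tcp == tcp0).
lnet_nets() {
    printf '%s\n' "$1" \
        | sed 's/([^)]*)//g' \
        | tr ',' '\n' \
        | sed -e 's/^ *//' -e 's/ *$//' -e '/^$/d' -e 's/[^0-9]$/&0/'
}

# Print the networks that appear in both "networks=" values.
shared_nets() {
    for net in $(lnet_nets "$1"); do
        lnet_nets "$2" | grep -Fqx "$net" && printf '%s\n' "$net"
    done
    return 0
}

servers='o2ib0(ib0),tcp(eth0)'
eth_client='tcp(eth0)'
ib_client='o2ib(ib0)'

echo "servers / ethernet client:   $(shared_nets "$servers" "$eth_client")"
echo "servers / infiniband client: $(shared_nets "$servers" "$ib_client")"
```

If the assumption holds, each client shares exactly one network with the
servers, which suggests the lnet options themselves are consistent.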
From an infiniband client, we mount lustre using the MDS infiniband IP
address, e.g.:
mount -t lustre 10.2.1...@ib0:/lustre0 /lustre0
On an ethernet client, mounting lustre using the MDS tcp interface
fails, e.g.:
mount -t lustre 10.1.1...@tcp0:/lustre0 /lustre0
From the syslog messages, it appears the MGS tells the ethernet client
to use the ib0 interface to find the MDT (see attached syslog messages).
So I am missing some configuration for the tcp network. My best guess is
that we need to use the "tunefs.lustre" command to add the MGS tcp NID to
the MDT. However, this command won't execute while lustre is mounted.
Any suggestions on how to add the tcp network to our existing lustre ib
network?
Thanks,
--Chris
Jan 7 17:25:41 bulldogk kernel: LustreError: 31704:0:(events.c:454:ptlrpc_uuid_to_peer()) No NID found for 10.2.1...@o2ib
Jan 7 17:25:41 bulldogk kernel: LustreError: 31704:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot find peer 10.2.1...@o2ib!
Jan 7 17:25:41 bulldogk kernel: LustreError: 31704:0:(ldlm_lib.c:321:client_obd_setup()) can't add initial connection
Jan 7 17:25:41 bulldogk kernel: LustreError: 31704:0:(obd_config.c:332:class_setup()) setup lustre0-MDT0000-mdc-ffff81022d67c800 failed (-2)
Jan 7 17:25:41 bulldogk kernel: LustreError: 8819:0:(connection.c:144:ptlrpc_put_connection()) NULL connection
Jan 7 17:25:41 bulldogk kernel: LustreError: 31704:0:(obd_config.c:1071:class_config_llog_handler()) Err -2 on cfg command:
Jan 7 17:25:41 bulldogk kernel: Lustre: cmd=cf003 0:lustre0-MDT0000-mdc 1:lustre0-MDT0000_UUID 2:10.2.1...@o2ib
Jan 7 17:25:41 bulldogk kernel: LustreError: 15c-8: mgc10.1.1...@tcp: The configuration from log 'lustre0-client' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Jan 7 17:25:41 bulldogk kernel: LustreError: 31704:0:(llite_lib.c:1061:ll_fill_super()) Unable to process log: -2
Jan 7 17:25:41 bulldogk kernel: LustreError: 31704:0:(obd_config.c:399:class_cleanup()) Device 2 not setup
Jan 7 17:25:41 bulldogk kernel: LustreError: 31704:0:(ldlm_request.c:986:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
Jan 7 17:25:41 bulldogk kernel: LustreError: 31704:0:(ldlm_request.c:1575:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Jan 7 17:25:41 bulldogk kernel: Lustre: client ffff81022d67c800 umount complete
Jan 7 17:25:41 bulldogk kernel: LustreError: 31704:0:(obd_mount.c:1951:lustre_fill_super()) Unable to mount (-2)
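
To pull the offending NIDs out of a longer syslog, something like the
following sketch may help. It only matches the "No NID found for" message
seen in the log above; the function name unresolved_nids, the sample NID
192.0.2.10@o2ib, and the /var/log/messages path are assumptions for
illustration, not values from our system.

```shell
#!/bin/sh
# Hypothetical helper: read syslog text on stdin and print each distinct
# NID reported in a "No NID found for <nid>" message.
unresolved_nids() {
    sed -n 's/.*No NID found for \([^ ]*\).*/\1/p' | sort -u
}

# Example with a made-up log line (the NID is a placeholder):
sample='Jan 7 17:25:41 host kernel: LustreError: 1:0:(events.c:454:ptlrpc_uuid_to_peer()) No NID found for 192.0.2.10@o2ib'

for nid in $(printf '%s\n' "$sample" | unresolved_nids); do
    # The part after "@" is the LNET network the client was told to use.
    echo "unresolved NID: $nid (network: ${nid##*@})"
done

# On a real client (log path is an assumption):
#   unresolved_nids < /var/log/messages
```

In our case the network part comes out as o2ib, which the tcp-only client
does not have configured.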