> On Oct 16, 2018, at 7:04 AM, Mark Roper <markro...@gmail.com> wrote:
> 
> I have successfully set up a Lustre filesystem that is multi-homed on two 
> different TCP NIDs, using the following configuration.
> Mount MGS & MDT
> 
>    sudo lnetctl lnet configure
>    sudo lnetctl net del --net tcp
>    sudo lnetctl net add --net tcp0 --if eth1
>    sudo lnetctl net add --net tcp1 --if eth0
>    sudo zpool create -O canmount=off -o ashift=12 mdtPool0 /dev/device1
>    sudo mkfs.lustre --mgs \
>       --mdt \
>       --servicenode 10.0.1.109@tcp0,172.30.0.228@tcp1 \
>       --backfstype=zfs --fsname=demo --index=0 mdtPool0/mdt0 /dev/device1
>    sudo sh -c 'echo "$(hostname) - demo:MDT0000 zfs:mdtPool0/mdt0" >> /etc/ldev.conf'
>    sudo service lustre start
> 
> Mount an OST
> 
>    sudo lnetctl lnet configure
>    sudo lnetctl net del --net tcp
>    sudo lnetctl net add --net tcp0 --if eth1
>    sudo lnetctl net add --net tcp1 --if eth0
>    sudo zpool create -O canmount=off -o ashift=12 ostPool0 /dev/device1
>    sudo mkfs.lustre --reformat --ost --backfstype=zfs --fsname=demo --index=0 \
>        --servicenode 10.0.6.156@tcp0,172.30.0.250@tcp1 \
>        --mgsnode=172.30.0.228@tcp1 ostPool0/ost0 /dev/device1
>    sudo sh -c 'echo "$(hostname) - demo:OST0000 zfs:ostPool0/ost0" >> /etc/ldev.conf'
>    sudo service lustre start
> 
> I can mount and use this file system on either tcp0 or tcp1.  However, there 
> are times when network traffic on my tcp1 network is blocked.  If the tcp1 
> LNET network is blocked while I run mkfs.lustre for an OST, the mount of 
> the OST fails.

That is because you specified only one NID for the mgsnode option, and that NID 
uses tcp1.  If tcp1 is not available, the OSS doesn’t know how to contact the 
MGS (which runs on the MDS here) to register the OST.  Have you tried using 
“--mgsnode=10.0.1.109@tcp0,172.30.0.228@tcp1” to see if that works?
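
As a rough sketch of that suggestion, the script below assembles the OST format 
command with both MGS NIDs and only prints it (nothing is executed).  The NIDs, 
pool and device names are copied from the original message; the variable names 
are just for illustration, and I have not tested this on a live system.

```shell
# Dry-run sketch: build the OST format command with both MGS NIDs and print it.
# Values are taken from the original message; variable names are illustrative.
MGS_NIDS="10.0.1.109@tcp0,172.30.0.228@tcp1"   # MGS reachable on tcp0 and tcp1
OST_NIDS="10.0.6.156@tcp0,172.30.0.250@tcp1"   # this OSS on tcp0 and tcp1

# Inside double quotes, backslash-newline is a line continuation.
CMD="mkfs.lustre --reformat --ost --backfstype=zfs --fsname=demo --index=0 \
--servicenode=$OST_NIDS --mgsnode=$MGS_NIDS ostPool0/ost0 /dev/device1"

# Print rather than execute; run the printed command by hand once it looks right.
echo "sudo $CMD"
```

Before formatting, running "lctl ping 10.0.1.109@tcp0" from the OSS is a quick 
way to confirm that the MGS answers on the tcp0 NID.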

> If I attempt to use mkfs.lustre to go back and update the MGS & MDT 
> service nodes to support both LNET NIDs after mounting the OSTs, the command 
> succeeds, but the file system is not mountable from the client.

You can’t use mkfs.lustre to update service node NIDs once the file system is 
formatted.  You would need to perform a writeconf or use the “lctl replace_nids” 
command.  (See the “Changing a Server NID” section of the Lustre manual.)
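
For reference, here is a rough outline of how the lctl route (the subcommand is 
spelled "replace_nids") might look for the combined MGS/MDT in this thread.  It 
only prints the steps.  The mount point /mnt/mdt is a placeholder I made up, 
and the ordering follows my reading of the manual, so please check it against 
the manual for your Lustre version before running anything.

```shell
# Dry-run sketch: print a possible replace_nids sequence; nothing is executed.
FSNAME="demo"
MDT_DATASET="mdtPool0/mdt0"                     # from the original message
MDT_MOUNTPOINT="/mnt/mdt"                       # hypothetical mount point
MDT_NIDS="10.0.1.109@tcp0,172.30.0.228@tcp1"    # both NIDs of the MGS/MDS node

print_plan() {
    echo "# 1. Unmount all clients and stop every target (MDT and OSTs)."
    echo "# 2. On the MGS/MDS node, start only the MGS:"
    echo "mount -t lustre -o nosvc $MDT_DATASET $MDT_MOUNTPOINT"
    echo "# 3. Record the new NIDs for the MDT in the configuration logs:"
    echo "lctl replace_nids $FSNAME-MDT0000 $MDT_NIDS"
    echo "# 4. Stop the MGS, then restart the targets and clients as usual:"
    echo "umount $MDT_MOUNTPOINT"
}

print_plan
```

The writeconf route ("tunefs.lustre --writeconf" on every target while the 
whole file system is stopped) regenerates the configuration logs instead, so 
it touches all of the targets rather than just the one whose NIDs changed.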

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu

