Can you expand on this part: "However, there are times when network
traffic on my tcp1 network is blocked.  If the tcp1 LNET network is
network blocked while running mkfs.lustre"?  I'm not an expert by any
stretch, but this sounds like a recipe for disaster.
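
For what it's worth, the rc = -110 in the quoted log looks like the Linux errno for a timeout, which would fit the OST failing to reach the MGS over a blocked tcp1. The OST in the quoted setup was formatted with only --mgsnode=172.30.0.228@tcp1, so registration presumably has no other path to try. A minimal sketch for decoding the code from a log line is below; lustre_rc_name is a hypothetical helper of mine, not a Lustre tool, and it only covers a few common Linux errno values.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): map the "rc = -N" value in a
# Lustre kernel log line to a Linux errno name. Only a few codes are listed.
lustre_rc_name() {
    # Extract the digits that follow "rc = -" in the given line.
    rc=$(printf '%s\n' "$1" | sed -n 's/.*rc = -\([0-9][0-9]*\).*/\1/p')
    case "$rc" in
        110) echo ETIMEDOUT ;;
        111) echo ECONNREFUSED ;;
        113) echo EHOSTUNREACH ;;
        "")  echo "no rc found" ;;
        *)   echo "errno $rc" ;;
    esac
}

# Example with the first error line from the quoted journalctl output.
lustre_rc_name 'kernel: LustreError: 15f-b: demo-OST0000: cannot register this server with the MGS: rc = -110. Is the MGS running?'
```

Before mounting, running "lctl ping 172.30.0.228@tcp1" from the OSS should show whether the MGS NID is reachable at that moment. Giving the OST both MGS NIDs (--mgsnode=10.0.1.109@tcp0,172.30.0.228@tcp1) might also let it register over tcp0 while tcp1 is blocked, but I have not tested that on this setup.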
On Tue, Oct 16, 2018 at 7:05 AM Mark Roper <[email protected]> wrote:
>
> Lustre Community,
>
> I have successfully set up a Lustre filesystem that is multi-homed on two 
> different TCP NIDs, using the following configuration.
>
> Mount MGS & MDT
>
>    sudo lnetctl lnet configure
>    sudo lnetctl net del --net tcp
>    sudo lnetctl net add --net tcp0 --if eth1
>    sudo lnetctl net add --net tcp1 --if eth0
>    sudo zpool create -O canmount=off -o ashift=12 mdtPool0 /dev/device1
>    sudo mkfs.lustre --mgs \
>       --mdt \
>       --servicenode 10.0.1.109@tcp0,172.30.0.228@tcp1 \
>       --backfstype=zfs --fsname=demo --index=0 mdtPool0/mdt0 /dev/device1
>    sudo sh -c 'echo "$(hostname) - demo:MDT0000 zfs:mdtPool0/mdt0" >> /etc/ldev.conf'
>    sudo service lustre start
>
> Mount an OST
>
>    sudo lnetctl lnet configure
>    sudo lnetctl net del --net tcp
>    sudo lnetctl net add --net tcp0 --if eth1
>    sudo lnetctl net add --net tcp1 --if eth0
>    sudo zpool create -O canmount=off -o ashift=12 ostPool0 /dev/device1
>    sudo mkfs.lustre --reformat --ost --backfstype=zfs --fsname=demo --index=0 \
>        --servicenode 10.0.6.156@tcp0,172.30.0.250@tcp1 \
>        --mgsnode=172.30.0.228@tcp1 ostPool0/ost0 /dev/device1
>    sudo sh -c 'echo "$(hostname) - demo:OST0000 zfs:ostPool0/ost0" >> /etc/ldev.conf'
>    sudo service lustre start
>
> I can mount and use this file system on either tcp0 or tcp1.  However, there 
> are times when network traffic on my tcp1 network is blocked.  If the tcp1 
> LNET network is network blocked while running mkfs.lustre for an OST, the 
> mount of the OST fails.  Running journalctl -xe yields:
>
> kernel: LustreError: 15f-b: demo-OST0000: cannot register this server with the MGS: rc = -110. Is the MGS running?
> kernel: LustreError: 5798:0:(obd_mount_server.c:1936:server_fill_super()) Unable to start targets: -110
> kernel: LustreError: 5798:0:(obd_mount_server.c:1586:server_put_super()) no obd demo-OST0000
> kernel: LustreError: 5798:0:(obd_mount_server.c:132:server_deregister_mount()) demo-OST0000 not registered
> kernel: Lustre: server umount demo-OST0000 complete
> kernel: LustreError: 5798:0:(obd_mount.c:1599:lustre_fill_super()) Unable to mount  (-110)
>
>
> If I exclude the tcp1 servicenode when I mount the MDT, I am able to mount 
> the OSTs on both tcp0 and tcp1.  If I attempt to use mkfs.lustre to go back 
> and update the mgs & mdt servernodes to support both LNET nids after mounting 
> the OSTs, the command succeeds, but the file system is not mountable from the 
> client.
>
> Is there a way to reliably stand up a filesystem in this configuration such
> that the mkfs.lustre command succeeds and the tcp1 LNET network is
> functional once the network traffic is no longer blocked?  Or is it required
> that all LNET networks be functional at the time that the server components'
> mkfs.lustre commands are run?
>
> Many thanks!
>
> Mark Roper
> _______________________________________________
> lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org