Re: [lustre-discuss] Multiple MGS interfaces config
On 09/27/2015 08:59 PM, Exec Unerd wrote:
>> I'm not sure if I have understood your setup correctly.
> In this case, the clients are a combination of all three: some are o2ib
> only, some tcp only, and some o2ib+tcp with tcp as failover.
>
> It sounds like I need a combination of configurations, one for the OSSes
> and one for each client type.
>
> So if I used this parameter in the OST,
> --mgsnode="172.16.10.1@o2ib0,192.168.10.1@tcp0"
>
> Then configured the modprobe.d/lustre.conf appropriately on the clients
> tcp: options lnet networks="tcp0(ixgbe1)"
> o2ib: options lnet networks="o2ib0(ib1)"
> both: options lnet networks="o2ib0(ib1),tcp0(ixgbe1)"
>
> And use these mount parameters:
> tcp: mount -v -t lustre 192.168.10.1@tcp0:/testfs /mnt/testfs
> o2ib: mount -v -t lustre 172.16.10.1@o2ib0:/testfs /mnt/testfs
> both: mount -v -t lustre 172.16.10.1@o2ib0,192.168.10.1@tcp0:/testfs

I think here it should be a colon between the two MGS nids:

mount -v -t lustre 172.16.10.1@o2ib0:192.168.10.1@tcp0:/testfs

> /mnt/testfs
>
> Everything should be happy?
>
> On Thu, Sep 24, 2015 at 9:12 AM, Martin Hecht wrote:
> [...]

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] Multiple MGS interfaces config
>> I think here it should be a colon between the two MGS nids:
>> mount -v -t lustre 172.16.10.1@o2ib0:192.168.10.1@tcp0:/testfs

That's part of my problem. The Lustre 2.x manual says that comma-delimited
NIDs are on the same host, but colon-delimited NIDs are on separate hosts.
Is that just for lustre.conf & mkfs.lustre, or is it for mount operations
as well?

In this case, my MGS node has a TCP and an IB rail to accommodate the
different clients, so I'd use a comma, right?

On Mon, Sep 28, 2015 at 7:07 AM, Martin Hecht wrote:
> [...]
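The separator rule quoted from the manual above can be sketched as a small shell helper. This is only an illustration, assuming the rule also holds for the mount device string (the mount.lustre man page describes the device as colon-separated MGS nodes, each given as a comma-separated NID list). The function name mgs_device is hypothetical and not part of Lustre; the addresses are the ones used in this thread.

```shell
#!/bin/sh
# mgs_device is a hypothetical helper, not a Lustre command.
# Usage: mgs_device FSNAME HOST1_NIDS [HOST2_NIDS ...]
# Each HOSTn_NIDS argument is the comma-separated NID list of ONE MGS host.
# Separate hosts (failover partners) are joined with ':'.
mgs_device() {
    fsname=$1
    shift
    spec=""
    for host_nids in "$@"; do
        if [ -z "$spec" ]; then
            spec=$host_nids
        else
            spec="$spec:$host_nids"
        fi
    done
    printf '%s:/%s\n' "$spec" "$fsname"
}

# One MGS host with an IB rail and a TCP rail: comma between its NIDs.
mgs_device testfs "172.16.10.1@o2ib0,192.168.10.1@tcp0"

# Two MGS hosts, each with both rails: colon between the hosts.
mgs_device testfs "172.16.10.1@o2ib0,192.168.10.1@tcp0" \
                  "172.16.10.2@o2ib0,192.168.10.2@tcp0"
```

The printed string is what would go between "mount -v -t lustre" and the mount point. Under this reading, a single dual-homed MGS gets a comma, and a colon would only appear once a second MGS host is involved.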
Re: [lustre-discuss] Multiple MGS interfaces config
> I'm not sure if I have understood your setup correctly.

In this case, the clients are a combination of all three: some are o2ib
only, some tcp only, and some o2ib+tcp with tcp as failover.

It sounds like I need a combination of configurations, one for the OSSes
and one for each client type.

So if I used this parameter in the OST,
--mgsnode="172.16.10.1@o2ib0,192.168.10.1@tcp0"

Then configured the modprobe.d/lustre.conf appropriately on the clients
tcp: options lnet networks="tcp0(ixgbe1)"
o2ib: options lnet networks="o2ib0(ib1)"
both: options lnet networks="o2ib0(ib1),tcp0(ixgbe1)"

And use these mount parameters:
tcp: mount -v -t lustre 192.168.10.1@tcp0:/testfs /mnt/testfs
o2ib: mount -v -t lustre 172.16.10.1@o2ib0:/testfs /mnt/testfs
both: mount -v -t lustre 172.16.10.1@o2ib0,192.168.10.1@tcp0:/testfs /mnt/testfs

Everything should be happy?

On Thu, Sep 24, 2015 at 9:12 AM, Martin Hecht wrote:
> [...]
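The three client cases listed above can be collected in one place, so that the lustre.conf line and the mount command for a given client type stay in step. A minimal sketch: client_plan is a hypothetical helper name, the interface names (ib1, ixgbe1) and addresses are those from the message, and the comma in the "both" case assumes both NIDs belong to the same MGS host.

```shell
#!/bin/sh
# client_plan is a hypothetical helper that restates the three client cases
# from the message above. It only prints text; it does not touch the system.
client_plan() {
    case $1 in
        tcp)
            nets='tcp0(ixgbe1)'
            mgs='192.168.10.1@tcp0' ;;
        o2ib)
            nets='o2ib0(ib1)'
            mgs='172.16.10.1@o2ib0' ;;
        both)
            nets='o2ib0(ib1),tcp0(ixgbe1)'
            # Assumption: both NIDs belong to one MGS host, hence the comma.
            mgs='172.16.10.1@o2ib0,192.168.10.1@tcp0' ;;
        *)
            echo "usage: client_plan tcp|o2ib|both" >&2
            return 1 ;;
    esac
    echo "lustre.conf: options lnet networks=\"$nets\""
    echo "mount:       mount -v -t lustre $mgs:/testfs /mnt/testfs"
}

client_plan both
```

Calling client_plan tcp or client_plan o2ib prints the single-network variants.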
Re: [lustre-discuss] Multiple MGS interfaces config
On 09/23/2015 02:39 AM, Exec Unerd wrote:
> My environment has both TCP and IB clients, so my Lustre config has to
> accommodate both, but I'm having a hard time figuring out the proper syntax
> for it. Theoretically, I should be able to use comma-separated interfaces
> in the mgsnode parameter like this:
>
> --mgsnode=192.168.10.1@tcp0,172.16.10.1@o2ib
> --mgsnode=192.168.10.2@tcp0,172.16.10.2@o2ib

I think this should work:

--mgsnode=192.168.10.1@tcp0 --mgsnode=172.16.10.1@o2ib --mgsnode=192.168.10.2@tcp0 --mgsnode=172.16.10.2@o2ib

At least that's how it works with a multirail ib network (where you would
replace tcp0 by o2ib1). The mount command would contain all 4 nids, but if
the client can't connect via tcp it waits until it reaches a timeout and
then tries the next one. If in addition the MGS is failed over to the
second server, I guess it takes three timeouts until the client succeeds in
connecting.

> The problem is, this doesn't work for all clients all the time ...
> randomly. It would work, then it wouldn't. Googling, I found some known
> defects saying that the comma delimiter didn't work as per the manual and
> recommending alternate syntaxes like using the colon instead of a comma. I
> know what the manuals *say* about the syntax, I'm just having trouble
> getting it to work.

I'm not sure if I have understood your setup correctly. You have ib clients
and you have other hosts which are connected via tcp, right? Or do the
clients have both, with the tcp network as a fallback solution in case the
ib doesn't work properly (network flooded, SM crashed or the like)?

When you say it doesn't work on a particular client, can you lctl ping one
of the nids in this situation? Or can you ping in the other direction, from
the server to the client? And if at least one of the pings succeeds, can you
suddenly mount afterwards?

> This seems to affect only the TCP clients; at least I haven't seen it
> affect any of the IB clients. It may be a comma parsing problem or
> something else.
>
> I have two questions for the group:
>
>    1. Is there a known-working method for using both TCP and IB interface
>    NIDs for the MGS in this manner?
>    2. What's the best way to trace the TCP client interactions to see where
>    it's breaking down?
>
> Versions in use:
> kernel: 2.6.32-504.23.4.el6.x86_64
> lustre: lustre-2.7.58-2.6.32_504.23.4.el6.x86_64_g051c25b.x86_64
> zfs: zfs-0.6.4-76_g87abfcb.el6.x86_64
>
> My lustre.conf contents:
> options lnet networks="o2ib0(ib1),tcp0(ixgbe1)"

ip2nets could be an alternative here, especially if not all clients have
both interfaces.
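To illustrate the ip2nets suggestion above: with ip2nets, each node compares its own IP addresses against a list of patterns and brings up only the LNet networks whose pattern matches, so one lustre.conf line can serve tcp-only, o2ib-only and dual-homed clients. A minimal sketch follows. The helper name ip2nets_line is hypothetical, and the client subnets 172.16.10.* (IPoIB) and 192.168.10.* (Ethernet) are assumptions derived from the server addresses in this thread, not something the posters stated.

```shell
#!/bin/sh
# ip2nets_line is a hypothetical helper: it joins "network(interface) pattern"
# rules with "; " and prints a modprobe options line for the lnet module.
ip2nets_line() {
    rules=""
    for rule in "$@"; do
        if [ -z "$rules" ]; then
            rules=$rule
        else
            rules="$rules; $rule"
        fi
    done
    printf 'options lnet ip2nets="%s"\n' "$rules"
}

# Assumed subnets: IPoIB addresses in 172.16.10.*, Ethernet in 192.168.10.*
ip2nets_line "o2ib0(ib1) 172.16.10.*" "tcp0(ixgbe1) 192.168.10.*"
```

The printed line would replace the per-client "options lnet networks=..." line in modprobe.d/lustre.conf. A client that has only an address in 192.168.10.* would then come up with tcp0 alone.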
Re: [lustre-discuss] Multiple MGS interfaces config
> My environment has both TCP and IB clients, so my Lustre config has to
> accommodate both, but I'm having a hard time figuring out the proper syntax
> for it. Theoretically, I should be able to use comma-separated interfaces
> in the mgsnode parameter like this:
>
> --mgsnode=192.168.10.1@tcp0,172.16.10.1@o2ib
> --mgsnode=192.168.10.2@tcp0,172.16.10.2@o2ib
>
> The problem is, this doesn't work for all clients all the time ...
> randomly. It would work, then it wouldn't. Googling, I found some known
> defects saying that the comma delimiter didn't work as per the manual and
> recommending alternate syntaxes like using the colon instead of a comma. I
> know what the manuals *say* about the syntax, I'm just having trouble
> getting it to work.
>
> This seems to affect only the TCP clients; at least I haven't seen it
> affect any of the IB clients. It may be a comma parsing problem or
> something else.
>
> I have two questions for the group:
>
>    1. Is there a known-working method for using both TCP and IB interface
>    NIDs for the MGS in this manner?

I used quotes with comma-delimited listing when formatting osts
eg) mkfs.lustre --verbose --ost --index=0 --fsname="testfs" --mgsnode="172.16.10.1@o2ib0,192.168.10.1@tcp0"

When mounting on a multi-homed client, you can use both mgs addresses to
give some failover support:
mount -v -t lustre 172.16.10.1@o2ib0,192.168.10.1@tcp0:/testfs /mnt/testfs

FYI, I also have dual-home OSS servers, so I also use comma-delimited list
for the --servicenode parameter in mkfs.lustre.

>    2. What's the best way to trace the TCP client interactions to see where
>    it's breaking down?

If lnet is running on the client, you can try "lctl ping"
eg) lctl ping 172.16.10.1@o2ib

I believe a lustre mount uses ipoib for initial handshake with a mds o2ib
interfaces. You should make sure regular ping over ipoib is working before
mounting lustre.

> Versions in use:
> kernel: 2.6.32-504.23.4.el6.x86_64
> lustre: lustre-2.7.58-2.6.32_504.23.4.el6.x86_64_g051c25b.x86_64
> zfs: zfs-0.6.4-76_g87abfcb.el6.x86_64
>
> My lustre.conf contents:
> options lnet networks="o2ib0(ib1),tcp0(ixgbe1)"

chris hunter
chris.hun...@yale.edu
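The "lctl ping" check above can be run over every MGS NID in one go, which makes it easier to see whether only the tcp NIDs fail from a given client. A minimal sketch: ping_nids is a hypothetical helper name, and the LCTL variable is an assumption added here so the lctl binary can be replaced by a stub on a machine without Lustre.

```shell
#!/bin/sh
# ping_nids is a hypothetical helper: it runs "lctl ping NID" for each
# argument and prints one ok/FAIL line per NID.
# LCTL overrides the lctl command (assumption, for dry runs and testing).
ping_nids() {
    failed=0
    for nid in "$@"; do
        if "${LCTL:-lctl}" ping "$nid" >/dev/null 2>&1; then
            echo "ok   $nid"
        else
            echo "FAIL $nid"
            failed=1
        fi
    done
    # Non-zero exit status if any NID did not answer.
    return "$failed"
}

# Example, on a client with the lnet module loaded:
#   ping_nids 192.168.10.1@tcp0 172.16.10.1@o2ib0 \
#             192.168.10.2@tcp0 172.16.10.2@o2ib0
```

A tcp-only client is expected to report FAIL for the o2ib NIDs, since it has no o2ib network configured. The interesting case is a tcp NID that fails from a tcp client.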
Re: [lustre-discuss] Multiple MGS interfaces config
On 09/24/2015 05:33 PM, Chris Hunter wrote:
> [...]
>> 2. What's the best way to trace the TCP client interactions to see
>> where it's breaking down?
> If lnet is running on the client, you can try "lctl ping"
> eg) lctl ping 172.16.10.1@o2ib
>
> I believe a lustre mount uses ipoib for initial handshake with a mds
> o2ib interfaces. You should make sure regular ping over ipoib is
> working before mounting lustre.

If the client and the server are on the same network, yes, it's a good
starting point. But it's not a prerequisite. In general you can have an
lnet router in-between or have different ip subnets for ipoib, so you
can't ping on the ipoib layer, but you can still lctl ping the whole
path (although you could verify that you can ip ping to the next hop at
least).

We also have a case in which we tried to block ipoib completely with
iptables, but we still could lctl ping, even after rebooting the host
and ensuring that the firewall was up before loading the lnet module.
So, I doubt that ipoib is needed at all for establishing the o2ib
connection.
[lustre-discuss] Multiple MGS interfaces config
My environment has both TCP and IB clients, so my Lustre config has to
accommodate both, but I'm having a hard time figuring out the proper syntax
for it. Theoretically, I should be able to use comma-separated interfaces
in the mgsnode parameter like this:

--mgsnode=192.168.10.1@tcp0,172.16.10.1@o2ib
--mgsnode=192.168.10.2@tcp0,172.16.10.2@o2ib

The problem is, this doesn't work for all clients all the time ...
randomly. It would work, then it wouldn't. Googling, I found some known
defects saying that the comma delimiter didn't work as per the manual and
recommending alternate syntaxes like using the colon instead of a comma. I
know what the manuals *say* about the syntax, I'm just having trouble
getting it to work.

This seems to affect only the TCP clients; at least I haven't seen it
affect any of the IB clients. It may be a comma parsing problem or
something else.

I have two questions for the group:

   1. Is there a known-working method for using both TCP and IB interface
   NIDs for the MGS in this manner?
   2. What's the best way to trace the TCP client interactions to see where
   it's breaking down?

Versions in use:
kernel: 2.6.32-504.23.4.el6.x86_64
lustre: lustre-2.7.58-2.6.32_504.23.4.el6.x86_64_g051c25b.x86_64
zfs: zfs-0.6.4-76_g87abfcb.el6.x86_64

My lustre.conf contents:
options lnet networks="o2ib0(ib1),tcp0(ixgbe1)"