Hi Ben,

Both networks have a netmask of 255.255.255.0.

Thanks
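P.S. A minimal sketch of the subnet check behind the netmask question, assuming Python's standard `ipaddress` module. `same_subnet` is a hypothetical helper name, and the addresses are the ones that appear in the `lctl ping` output quoted below. This only confirms the IP-level layout; it says nothing about how LNet itself is configured.

```python
import ipaddress


def same_subnet(addr_a: str, addr_b: str, netmask: str) -> bool:
    """Return True if both IPv4 addresses fall in the same network under netmask.

    Hypothetical helper for illustration only; not part of Lustre or LNet.
    """
    # strict=False lets us pass a host address and have the host bits masked off.
    net_a = ipaddress.ip_network(f"{addr_a}/{netmask}", strict=False)
    net_b = ipaddress.ip_network(f"{addr_b}/{netmask}", strict=False)
    return net_a == net_b


if __name__ == "__main__":
    mask = "255.255.255.0"
    # Client and MGS addresses on the tcp network.
    print(same_subnet("192.168.111.102", "192.168.111.52", mask))
    # Client tcp address against the MGS IB address: different networks.
    print(same_subnet("192.168.111.102", "192.168.200.52", mask))
```

With a 255.255.255.0 mask the two tcp addresses share a network, while the tcp and IB addresses do not.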
On Mon, Jul 18, 2016 at 10:08 AM, Ben Evans <[email protected]> wrote:

> What do your netmasks look like on each network?
>
> From: lustre-discuss <[email protected]> on behalf of sohamm <[email protected]>
> Date: Monday, July 18, 2016 at 1:56 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: [lustre-discuss] lustre-discuss Digest, Vol 124, Issue 17
>
> Hi Thomas
> Below are the results of the commands you suggested.
>
> *From Client*
> [root@dev1 ~]# lctl ping 192.168.200.52@o2ib
> failed to ping 192.168.200.52@o2ib: Input/output error
> [root@dev1 ~]# lctl ping 192.168.111.52@tcp
> 12345-0@lo
> 12345-192.168.200.52@o2ib
> 12345-192.168.111.52@tcp
> [root@dev1 ~]# mount -t lustre 192.168.111.52@tcp:/mylustre /lustre
> mount.lustre: mount 192.168.111.52@tcp:/mylustre at /lustre failed: Input/output error
> Is the MGS running?
> mount: mounting 192.168.111.52@tcp:/mylustre on /lustre failed: Invalid argument
>
> cat /var/log/messages | tail
> Jul 18 01:37:04 dev1 user.warn kernel: [2250504.401397] ib1: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
> Jul 18 01:37:26 dev1 user.warn kernel: [2250526.257309] LNet: No route to 12345-192.168.200.52@o2ib via <?> (all routers down)
> Jul 18 01:37:36 dev1 user.warn kernel: [2250536.481862] ib1: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
> Jul 18 01:41:53 dev1 user.warn kernel: [2250792.947299] LNet: No route to 12345-192.168.200.52@o2ib via <?> (all routers down)
>
>
> *From MGS*
> [root@lustre_mgs01_vm03 ~]# lctl ping 192.168.111.102@tcp
> 12345-0@lo
> 12345-192.168.111.102@tcp
>
> Please let me know what else I can try. It looks like I am missing something
> with the IB config. Do I need a router set up as part of LNet?
> If I am able to ping the MGS from the client on the tcp network, shouldn't it
> still work?
>
> Thanks
>
>
> On Sun, Jul 17, 2016 at 1:07 PM, <[email protected]> wrote:
>
>> Today's Topics:
>>
>>    1. llapi_file_get_stripe() and /proc/fs/lustre/osc/ entries (John Bauer)
>>    2. luster client mount issues (sohamm)
>>    3. Re: luster client mount issues (Thomas Roth)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Sat, 16 Jul 2016 15:11:22 -0500
>> From: John Bauer <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Subject: [lustre-discuss] llapi_file_get_stripe() and /proc/fs/lustre/osc/ entries
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset="utf-8"; Format="flowed"
>>
>> I am using *llapi_file_get_stripe()* to get the OST indexes that a file
>> is striped on. That part is working fine. But there are multiple Lustre
>> file systems on the node, resulting in multiple *OST0000* entries in the
>> directory /proc/fs/lustre/osc. Is there something in the *struct
>> lov_user_ost_data* or *struct lov_user_md* that would indicate which of
>> the following directories pertains to the file's OST?
>>
>> dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp1-OST0000-osc-ffff880287ae4c00
>> dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp2-OST0000-osc-ffff881034d99000
>> dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp6-OST0000-osc-ffff881003cd7800
>> dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp7-OST0000-osc-ffff880ffe051c00
>> dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp8-OST0000-osc-ffff880ffe054c00
>> dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp9-OST0000-osc-ffff880fcf179400
>>
>> Thanks
>>
>> --
>> I/O Doctors, LLC
>> 507-766-0378
>> [email protected]
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Sat, 16 Jul 2016 14:34:35 -0700
>> From: sohamm <[email protected]>
>> To: [email protected]
>> Subject: [lustre-discuss] luster client mount issues
>> Message-ID: <cakgc+ebq+mcdbsrc7ft4gd+zmz6fbazhavhsqtpgoshyrjq...@mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hi
>>
>> I am trying to mount lustre client. Below are steps and necessary
>> information surrounding the issue. Please let me know if i am missing
>> something
>>
>> Thanks
>> Div
>>
>> *Mgs:*
>>
>> [root@lustre_mgs01_vm03 ~]# cat /etc/modprobe.d/lustre.conf
>> options lnet networks=o2ib(ib0),tcp0(eth0)
>>
>> [root@lustre_mgs01_vm03 ~]# modprobe lnet
>> [root@lustre_mgs01_vm03 ~]# lsmod | grep lnet
>> lnet 449065 0
>> libcfs 405839 1 lnet
>> [root@lustre_mgs01_vm03 ~]# lctl network up
>> LNET configured
>> [root@lustre_mgs01_vm03 ~]# lctl list_nids
>> 192.168.200.52@o2ib
>> 192.168.111.52@tcp
>>
>> *On Client:*
>> I am able to ping MGS on both tcp and ib network
>>
>> [root@dev1~]# ping 192.168.111.52
>> PING 192.168.111.52 (192.168.111.52) 56(84) bytes of data.
>> 64 bytes from 192.168.111.52: icmp_req=1 ttl=64 time=5.81 ms
>> 64 bytes from 192.168.111.52: icmp_req=2 ttl=64 time=0.802 ms
>> 64 bytes from 192.168.111.52: icmp_req=3 ttl=64 time=0.780 ms
>> ^C
>> --- 192.168.111.52 ping statistics ---
>> 3 packets transmitted, 3 received, 0% packet loss, time 2000ms
>> rtt min/avg/max/mdev = 0.780/2.464/5.811/2.366 ms
>>
>> [root@dev1 ~]# ping 192.168.200.52
>> PING 192.168.200.52 (192.168.200.52) 56(84) bytes of data.
>> 64 bytes from 192.168.200.52: icmp_req=1 ttl=64 time=24.4 ms
>> 64 bytes from 192.168.200.52: icmp_req=2 ttl=64 time=2.14 ms
>> 64 bytes from 192.168.200.52: icmp_req=3 ttl=64 time=0.782 ms
>> 64 bytes from 192.168.200.52: icmp_req=4 ttl=64 time=9.30 ms
>> ^C
>> --- 192.168.200.52 ping statistics ---
>> 4 packets transmitted, 4 received, 0% packet loss, time 3005ms
>>
>>
>> *client mount commands*
>>
>> mount -t lustre 192.168.111.52@tcp:/mylustre /lustre ( or)
>> mount -t lustre 192.168.111.52@tcp0:/mylustre /lustre ( or)
>> mount -t lustre 192.168.200.52@ob2:/mylustre /lustre
>>
>>
>> *cat /var/log/messages | tail -40*
>>
>> Jul 16 17:03:17 dev1 user.err kernel: [2133277.466013] LustreError: 162-5: Missing mount data: check that /sbin/mount.lustre is installed.
>>
>> Jul 16 17:03:17 dev1 user.err kernel: [2133277.466064] LustreError: 13627:0:(obd_mount.c:1325:lustre_fill_super()) Unable to mount (-22)
>>
>> Jul 16 17:03:23 dev1 user.warn kernel: [2133282.680519] Lustre: 12364:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1468702998/real 1468702998] req@ffff8801e0bc3c00 x1539427524411444/t0(0) o250->MGC192.168.111.52
>>
>> Jul 16 17:03:24 dev1 user.err kernel: [2133283.680193] LustreError: 13628:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff8801e0bc7000 x1539427524411448/t0(0) o101->MGC192.168.111.52@tcp@192.168.111.52@tcp:26/25 lens 328/344 e 0 to 0 dl
>>
>> Jul 16 17:03:31 dev1 user.err kernel: [2133290.760978] LustreError: 13657:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff8801b7159800 x1539427524411456/t0(0) o101->MGC192.168.111.52@tcp@192.168.111.52@tcp:26/25 lens 328/344 e 0 to 0 dl
>>
>> Jul 16 17:03:43 dev1 user.warn kernel: [2133302.681412] Lustre: 12364:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1468703023/real 1468703023] req@ffff8801d6bfc800 x1539427524411460/t0(0) o250->MGC192.168.111
>>
>> Jul 16 17:04:08 dev1 user.warn kernel: [2133327.681402] Lustre: 12364:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1468703048/real 1468703048] req@ffff8801d6bfec00 x1539427524411464/t0(0) o250->MGC192.168.111
>>
>> Jul 16 17:04:15 dev1 user.err kernel: [2133334.680175] LustreError: 13628:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff8801e0bc7000 x1539427524411452/t0(0) o101->MGC192.168.111.52@tcp@192.168.111.52@tcp:26/25 lens 328/344 e 0 to 0 dl
>>
>> Jul 16 17:04:15 dev1 user.err kernel: [2133334.680316] LustreError: 15c-8: MGC192.168.111.52@tcp: The configuration from log 'mylustre-client' failed (-5).
>> This may be the result of communication errors between this node and the MGS, a bad configuration, or other e
>>
>> Jul 16 17:04:15 dev1 user.err kernel: [2133334.680357] LustreError: 13628:0:(llite_lib.c:1046:ll_fill_super()) Unable to process log: -5
>>
>> Jul 16 17:04:15 dev1 user.warn kernel: [2133334.680881] Lustre: Unmounted mylustre-client
>>
>> Jul 16 17:04:15 dev1 user.err kernel: [2133334.731730] LustreError: 13628:0:(obd_mount.c:1325:lustre_fill_super()) Unable to mount (-5)
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Sun, 17 Jul 2016 10:19:18 +0200
>> From: Thomas Roth <[email protected]>
>> To: <[email protected]>
>> Subject: Re: [lustre-discuss] luster client mount issues
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset="windows-1252"; format=flowed
>>
>> Hi,
>>
>> try 'lctl ping' from your clients to the MDS to check if you get through
>> on lnet, e.g.
>>
>> lctl ping 192.168.200.52@o2ib
>>
>> or
>>
>> lctl ping 192.168.111.52@tcp
>>
>>
>> and vice versa from the MDS to the clients' nids.
>>
>> Regards,
>> Thomas
>>
>> --
>> --------------------------------------------------------------------
>> Thomas Roth
>> Department: HPC
>> Location: SB3 1.262
>> Phone: +49-6159-71 1453 Fax: +49-6159-71 2986
>>
>> GSI Helmholtzzentrum für Schwerionenforschung GmbH
>> Planckstraße 1
>> 64291 Darmstadt
>> www.gsi.de
>>
>> Gesellschaft mit beschränkter Haftung
>> Sitz der Gesellschaft: Darmstadt
>> Handelsregister: Amtsgericht Darmstadt, HRB 1528
>>
>> Geschäftsführung: Professor Dr. Karlheinz Langanke
>> Ursula Weyrich
>> Jörg Blaurock
>>
>> Vorsitzender des Aufsichtsrates: St Dr. Georg Schütte
>> Stellvertreter: Ministerialdirigent Dr. Rolf Bernhardt
>>
>>
>> ------------------------------
>>
>> End of lustre-discuss Digest, Vol 124, Issue 17
>> ***********************************************
>>
>
>
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
