[lustre-discuss] luster client mount issues

2016-07-16 Thread sohamm
Hi

I am trying to mount a Lustre client. Below are the steps and the relevant
information surrounding the issue. Please let me know if I am missing
something.

Thanks
Div

*Mgs:*

[root@lustre_mgs01_vm03 ~]# cat /etc/modprobe.d/lustre.conf

options lnet networks=o2ib(ib0),tcp0(eth0)



[root@lustre_mgs01_vm03 ~]# modprobe lnet

[root@lustre_mgs01_vm03 ~]# lsmod | grep lnet

lnet                  449065  0

libcfs                405839  1 lnet

[root@lustre_mgs01_vm03 ~]# lctl network up

LNET configured

[root@lustre_mgs01_vm03 ~]# lctl list_nids

192.168.200.52@o2ib

192.168.111.52@tcp

*On Client:*
I am able to ping the MGS on both the tcp and ib networks

[root@dev1 ~]# ping 192.168.111.52

PING 192.168.111.52 (192.168.111.52) 56(84) bytes of data.

64 bytes from 192.168.111.52: icmp_req=1 ttl=64 time=5.81 ms

64 bytes from 192.168.111.52: icmp_req=2 ttl=64 time=0.802 ms

64 bytes from 192.168.111.52: icmp_req=3 ttl=64 time=0.780 ms

^C

--- 192.168.111.52 ping statistics ---

3 packets transmitted, 3 received, 0% packet loss, time 2000ms

rtt min/avg/max/mdev = 0.780/2.464/5.811/2.366 ms

[root@dev1 ~]# ping 192.168.200.52

PING 192.168.200.52 (192.168.200.52) 56(84) bytes of data.

64 bytes from 192.168.200.52: icmp_req=1 ttl=64 time=24.4 ms

64 bytes from 192.168.200.52: icmp_req=2 ttl=64 time=2.14 ms

64 bytes from 192.168.200.52: icmp_req=3 ttl=64 time=0.782 ms

64 bytes from 192.168.200.52: icmp_req=4 ttl=64 time=9.30 ms

^C

--- 192.168.200.52 ping statistics ---

4 packets transmitted, 4 received, 0% packet loss, time 3005ms


*client mount commands*

mount -t lustre 192.168.111.52@tcp:/mylustre /lustre ( or)

mount -t lustre 192.168.111.52@tcp0:/mylustre /lustre ( or)

mount -t lustre 192.168.200.52@ob2:/mylustre /lustre


*cat /var/log/messages | tail -40*

Jul 16 17:03:17 dev1 user.err kernel: [2133277.466013] LustreError: 162-5:
Missing mount data: check that /sbin/mount.lustre is installed.

Jul 16 17:03:17 dev1 user.err kernel: [2133277.466064] LustreError:
13627:0:(obd_mount.c:1325:lustre_fill_super()) Unable to mount  (-22)

Jul 16 17:03:23 dev1 user.warn kernel: [2133282.680519] Lustre:
12364:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
timed out for slow reply: [sent 1468702998/real 1468702998]
req@8801e0bc3c00 x1539427524411444/t0(0) o250->MGC192.168.111.52

Jul 16 17:03:24 dev1 user.err kernel: [2133283.680193] LustreError:
13628:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired
req@8801e0bc7000 x1539427524411448/t0(0) o101->MGC192.168.111.52@tcp
@192.168.111.52@tcp:26/25 lens 328/344 e 0 to 0 dl

Jul 16 17:03:31 dev1 user.err kernel: [2133290.760978] LustreError:
13657:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired
req@8801b7159800 x1539427524411456/t0(0) o101->MGC192.168.111.52@tcp
@192.168.111.52@tcp:26/25 lens 328/344 e 0 to 0 dl

Jul 16 17:03:43 dev1 user.warn kernel: [2133302.681412] Lustre:
12364:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
failed due to network error: [sent 1468703023/real 1468703023]
req@8801d6bfc800 x1539427524411460/t0(0) o250->MGC192.168.111

Jul 16 17:04:08 dev1 user.warn kernel: [2133327.681402] Lustre:
12364:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
failed due to network error: [sent 1468703048/real 1468703048]
req@8801d6bfec00 x1539427524411464/t0(0) o250->MGC192.168.111

Jul 16 17:04:15 dev1 user.err kernel: [214.680175] LustreError:
13628:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired
req@8801e0bc7000 x1539427524411452/t0(0) o101->MGC192.168.111.52@tcp
@192.168.111.52@tcp:26/25 lens 328/344 e 0 to 0 dl

Jul 16 17:04:15 dev1 user.err kernel: [214.680316] LustreError: 15c-8:
MGC192.168.111.52@tcp: The configuration from log 'mylustre-client' failed
(-5). This may be the result of communication errors between this node and
the MGS, a bad configuration, or other e

Jul 16 17:04:15 dev1 user.err kernel: [214.680357] LustreError:
13628:0:(llite_lib.c:1046:ll_fill_super()) Unable to process log: -5

Jul 16 17:04:15 dev1 user.warn kernel: [214.680881] Lustre: Unmounted
mylustre-client

Jul 16 17:04:15 dev1 user.err kernel: [214.731730] LustreError:
13628:0:(obd_mount.c:1325:lustre_fill_super()) Unable to mount  (-5)
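
For reference when reading the log above: the numbers in parentheses are negative Linux errno values passed back by the kernel. The following is only a minimal shell sketch that decodes the three codes that appear in this thread (-22, -5 and, in a later message, -2). The function name lustre_rc_name is made up for illustration and is not part of Lustre.

```shell
# Decode the few kernel return codes seen in this thread.
# lustre_rc_name is a hypothetical helper, not a Lustre tool.
lustre_rc_name() {
    # Strip a leading minus sign so "-22" and "22" are treated alike.
    case "${1#-}" in
        2)  echo "ENOENT (No such file or directory)" ;;
        5)  echo "EIO (Input/output error)" ;;
        22) echo "EINVAL (Invalid argument)" ;;
        *)  echo "unknown ($1)" ;;
    esac
}

# Example: the final "Unable to mount  (-5)" line above.
lustre_rc_name -5
```

The names match the messages that mount itself prints for the same failures ("Input/output error", "Invalid argument").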
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] luster client mount issues

2016-07-17 Thread Thomas Roth

Hi,

try 'lctl ping' from your clients to the MDS to check if you get through on 
lnet, e.g.

lctl ping 192.168.200.52@o2ib

or

lctl ping 192.168.111.52@tcp


and vice versa from the MDS to the clients' nids.
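
The check suggested above can be wrapped in a small loop so that every NID is tried in one go. This is only a sketch: it assumes that "lctl ping" exits non-zero when the ping fails, and the LCTL variable is a hypothetical override that is there so the loop can be tried with a stub instead of the real lctl.

```shell
# Run "lctl ping" against each NID given as an argument and report the result.
# Assumption: "lctl ping" returns a non-zero exit status on failure.
# LCTL is a hypothetical override for the lctl binary (useful for dry runs).
ping_nids() {
    lctl_cmd="${LCTL:-lctl}"
    failed=0
    for nid in "$@"; do
        if $lctl_cmd ping "$nid" >/dev/null 2>&1; then
            echo "ok   $nid"
        else
            echo "FAIL $nid"
            failed=1
        fi
    done
    # Non-zero if at least one NID did not answer.
    return "$failed"
}

# On the client, against the MGS NIDs from this thread:
#   ping_nids 192.168.200.52@o2ib 192.168.111.52@tcp
# On the MGS, against whatever "lctl list_nids" prints on the client.
```

Running it in both directions matters because a NID that answers one way can still be unreachable the other way.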

Regards,
Thomas



--

Thomas Roth
Department: HPC
Location: SB3 1.262
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
64291 Darmstadt
www.gs

Re: [lustre-discuss] luster client mount issues

2016-07-20 Thread sohamm
Hi

Any guidance/help on this is greatly appreciated.

Thanks

On Mon, Jul 18, 2016 at 7:25 PM, sohamm wrote:

> Hi Ben
> Both networks have a netmask of 255.255.255.0
>
> Thanks
>
> On Mon, Jul 18, 2016 at 10:08 AM, Ben Evans wrote:
>
>> What do your netmasks look like on each network?
>>
>> From: lustre-discuss on behalf of sohamm
>> Date: Monday, July 18, 2016 at 1:56 AM
>> To: "lustre-discuss@lists.lustre.org"
>> Subject: Re: [lustre-discuss] lustre-discuss Digest, Vol 124, Issue 17
>> Subject: Re: [lustre-discuss] lustre-discuss Digest, Vol 124, Issue 17
>>
>> Hi Thomas
>> Below are the results of the commands you suggested.
>>
>> *From Client*
>> [root@dev1 ~]# lctl ping 192.168.200.52@o2ib
>> failed to ping 192.168.200.52@o2ib: Input/output error
>> [root@dev1 ~]# lctl ping 192.168.111.52@tcp
>> 12345-0@lo
>> 12345-192.168.200.52@o2ib
>> 12345-192.168.111.52@tcp
>> [root@dev1 ~]# mount -t lustre 192.168.111.52@tcp:/mylustre /lustre
>> mount.lustre: mount 192.168.111.52@tcp:/mylustre at /lustre failed:
>> Input/output error
>> Is the MGS running?
>> mount: mounting 192.168.111.52@tcp:/mylustre on /lustre failed: Invalid
>> argument
>>
>> cat /var/log/messages | tail
>> Jul 18 01:37:04 dev1 user.warn kernel: [2250504.401397] ib1: multicast
>> join failed for ff12:401b::::::, status -22
>> Jul 18 01:37:26 dev1 user.warn kernel: [2250526.257309] LNet: No route to
>> 12345-192.168.200.52@o2ib via  (all routers down)
>> Jul 18 01:37:36 dev1 user.warn kernel: [2250536.481862] ib1: multicast
>> join failed for ff12:401b::::::, status -22
>> Jul 18 01:41:53 dev1 user.warn kernel: [2250792.947299] LNet: No route to
>> 12345-192.168.200.52@o2ib via  (all routers down)
>>
>>
>> *From MGS*
>> [root@lustre_mgs01_vm03 ~]# lctl ping 192.168.111.102@tcp
>> 12345-0@lo
>> 12345-192.168.111.102@tcp
>>
>> Please let me know what else I can try. It looks like I am missing something
>> in the IB config. Do I need a router set up as part of LNet?
>> If I am able to ping the MGS from the client on the tcp network, shouldn't
>> the mount still work?
>>
>> Thanks
>>
>>

Re: [lustre-discuss] luster client mount issues

2016-07-21 Thread Martin Hecht

Re: [lustre-discuss] luster client mount issues

2016-07-25 Thread sohamm

Re: [lustre-discuss] luster client mount issues

2016-07-28 Thread Mohr Jr, Richard Frank (Rick Mohr)
Is the client supposed to have an IB interface configured, or is it just 
supposed to mount over ethernet?

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu




Re: [lustre-discuss] luster client mount issues

2016-07-28 Thread sohamm
Hi Rick
The client is configured for the IB interface.
In my understanding, I can specify the network of choice in the mount
command. I tried both tcp and ib. I am still checking the configurations
as suggested in the forum and will get back with my findings.

Thanks



Re: [lustre-discuss] luster client mount issues

2016-07-28 Thread Andrus, Brian Contractor
Are you running IPoIB?
Can you do "lsmod | grep lnet"? Also, ensure you have the right network settings
in your /etc/modprobe.d/lnet.conf file (or wherever you may have defined the
networks).


Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238







Re: [lustre-discuss] luster client mount issues

2016-08-01 Thread Mohr Jr, Richard Frank (Rick Mohr)

> On Jul 28, 2016, at 9:54 PM, sohamm  wrote:
> 
> Client is configured for IB interface. 

So it looks like there might be something wrong with the LNet config on the 
client then.  Based on the output from “lctl ping” that you ran from the 
server, the client only reported a NID on the tcp network.
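
The observation above comes from reading the reply to "lctl ping": each line is the LNet process ID 12345, a dash, and one NID of the peer. A minimal sketch of pulling the NIDs out of that output follows. The helper name peer_nids is made up for illustration, and the sample input is the reply the MGS received from the client earlier in this thread.

```shell
# Print the NIDs a peer reported in "lctl ping" output read from stdin.
# Each line looks like "12345-<nid>"; the loopback entry 0@lo is skipped.
# peer_nids is a hypothetical helper name, not a Lustre command.
peer_nids() {
    sed -n 's/^12345-//p' | grep -v '@lo$'
}

# The reply the MGS got when it pinged the client in this thread:
printf '%s\n' '12345-0@lo' '12345-192.168.111.102@tcp' | peer_nids
```

Only a tcp NID comes out, with no o2ib NID next to it, which is the basis for suspecting the LNet configuration on the client.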

> in my understanding i can specific the network of choice in the mount 
> command. tried both tcp and ib.

That is true, but if the client and server both have interfaces on two
different networks (like ethernet and IB), there can be some subtle issues.
When you specify the NID of the MGS to mount the file system, the client
retrieves information about the MDS/OSS servers from that MGS. This
information includes the NIDs on which the MDS/OSS servers listen for
requests. If a client sees that a server has a NID on tcp0 and a NID on o2ib0,
and the client also has NIDs on tcp0 and o2ib0, then the client sees two
paths to the same server and will just pick one of them (which might not be
the one you want). And if the path it chooses happens to be down, it won't
matter that the other path is up.

(Now, I should make a disclaimer about the above statements.  I believe that is 
how it worked on Lustre versions like 1.8 and 2.4.  I have not tried this with 
newer Lustre versions, so the behavior could be different.  I also have not 
experimented with anything like specifying weights for LNet routes, so I don’t 
know if that could be used to prefer one interface over another.)
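
To see whether the two-path situation described above applies, one can list the LNet networks on which both the client and a server have a NID. The sketch below does only that and does not model how Lustre actually picks a path. It relies on "tcp" and "tcp0" (likewise "o2ib" and "o2ib0") naming the same network. The function names are made up for illustration, and the client IB NID mentioned in the trailing comment is hypothetical, since the thread never shows one.

```shell
# Normalize the network part of a NID: "tcp" and "tcp0" are the same network.
lnet_net() {
    net="${1##*@}"
    case "$net" in
        *[0-9]) echo "$net" ;;
        *)      echo "${net}0" ;;
    esac
}

# Print each network on which both NID lists have an entry.
# $1: space-separated client NIDs, $2: space-separated server NIDs.
shared_nets() {
    for c in $1; do
        for s in $2; do
            if [ "$(lnet_net "$c")" = "$(lnet_net "$s")" ]; then
                lnet_net "$c"
            fi
        done
    done | sort -u
}

# Client NID as reported in this thread, MGS NIDs from "lctl list_nids":
shared_nets "192.168.111.102@tcp" "192.168.200.52@o2ib 192.168.111.52@tcp"
```

With the NIDs reported in this thread only tcp0 is shared. If the client also had an o2ib NID (say a hypothetical 192.168.200.102@o2ib), both networks would be listed and the ambiguity described above could come into play.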

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu



Re: [lustre-discuss] luster client mount issues

2016-08-16 Thread sohamm
Hi All,
I was able to get the lustre client mounted successfully.

I was getting this error whenever I tried to mount the client via the tcp or
o2ib network.

[867363.885584] LustreError: 162-5: Missing mount data: check that
/sbin/mount.lustre is installed.
[867363.885637] LustreError: 13214:0:(obd_mount.c:1325:lustre_fill_super())
Unable to mount  (-22)
[867363.885659] LustreError: 13214:0:(obd_mount.c:1325:lustre_fill_super())
Skipped 1 previous similar message
[867364.107157] LustreError: 15c-8: MGC192.168.111.52@tcp: The
configuration from log 'mylustre-client' failed (-2). This may be the
result of communication errors between this node and the MGS, a bad
configuration, or other errors. See the syslog for more information.
[867364.107209] LustreError: 13215:0:(llite_lib.c:1046:ll_fill_super())
Unable to process log: -2
[867364.107729] Lustre: Unmounted mylustre-client
MGC192.168.111.52@tcp

But when I checked on the MGS with cat /proc/fs/lustre/devices, I could not
see any entry for MGC192.168.111.52@tcp. Only MGC192.168.200.52@o2ib was
present. Some of the disks were also missing, which led me to look into the
health of my disks. To my surprise, the iSCSI disks in the zpool were
"Degraded". I am not sure what caused that. So I reconfigured the entire setup
and was able to mount the client via o2ib without any issues. A couple of
other tweaks I made this time:

1. Disabled firewalld and ufw (earlier I had added an exception to allow
the iSCSI disks).
2. Used the following command to format the MGS and MDT (both in a single
command):
mkfs.lustre --fsname=lustre --mgs --mdt --backfstype=zfs mgs01/data --index 0

Earlier I used the two commands below:
mkfs.lustre --mgs --backfstype=zfs mds1_1/mgs
mkfs.lustre --mdt --backfstype=zfs --fsname=mylustre --index=1 --mgsnode=192.168.200.52@o2ib mds1_1/mdt1

3. Modified lustre.conf to list the preferred network first. I am not sure
if that is how it is supposed to work.
options lnet networks=o2ib(ib0),tcp0(eth0) if I want to connect over o2ib
options lnet networks=tcp0(eth0),o2ib(ib0) if I want to connect over tcp
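
Since the degraded zpool was what finally explained the failure, a quick health check on the servers before chasing LNet problems may save time. The sketch below filters the output of "zpool list -H -o name,health" and prints every pool that is not ONLINE. It assumes the ZFS command-line tools are installed on the server. The helper name unhealthy_pools and the pool name ost01 are made up for illustration, while mgs01 is the pool from the mkfs.lustre command above.

```shell
# Read "name<TAB>health" lines on stdin (as printed by
# "zpool list -H -o name,health") and print pools that are not ONLINE.
# unhealthy_pools is a hypothetical helper name.
unhealthy_pools() {
    awk '$2 != "ONLINE" { print $1 " " $2 }'
}

# Typical use on a server (requires the ZFS tools):
#   zpool list -H -o name,health | unhealthy_pools
# The Lustre devices known to the node can then be listed with:
#   lctl dl

# Dry run with sample data; ost01 is a made-up pool name:
printf 'mgs01\tONLINE\nost01\tDEGRADED\n' | unhealthy_pools
```

No output means every pool reported ONLINE.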

Thanks all for your help. Hopefully what I learned will help others.

