Hi,
I think your client doesn't have an o2ib LNet configured. If it did, its
o2ib NID would appear in the output of lctl ping from the MGS, even when
pinging over the tcp LNet, but that output lists only 0@lo and the tcp NID.
In your /etc/modprobe.d/lustre.conf, o2ib is bound to the ib0 interface,
but the kernel messages in your /var/log/messages refer to ib1.
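A quick way to check which LNets a node advertises is to look at the
network part of each NID it reports. The following is only a sketch:
has_lnet is a hypothetical helper name, and it assumes the one-NID-per-line
format shown in the lctl ping output quoted below.

```shell
# Hypothetical helper: succeeds if the NID list read from stdin
# (e.g. the output of "lctl ping" or "lctl list_nids") contains
# at least one NID on the LNet named in $1.
has_lnet() {
    # NIDs look like 12345-192.168.200.52@o2ib or 192.168.111.52@tcp;
    # match the network name at the end of the line.
    grep -q "@$1\$"
}
```

On the client you could then run something like
"lctl list_nids | has_lnet o2ib || echo 'no o2ib NID configured'".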
If it is a dual-port card with only one port in use, the easiest fix would
be to plug the cable into the other port. (If both ports are connected,
things get a bit more complicated. The Lustre manual has examples of
multi-rail configurations using several LNets, but that may be going too
far here.)
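If moving the cable is not an option, the other way round should work as
well: bind o2ib to ib1 in the module options. This is a minimal sketch,
assuming ib1 is the cabled port and carries the client's address on the
192.168.200.x network (neither is confirmed in this thread). Any tcp entry
already present in your networks list would stay as it is.

```
# /etc/modprobe.d/lustre.conf on the client
# Assumption: ib1 is the port that is actually cabled and has an IP address.
options lnet networks="o2ib(ib1)"
```

The Lustre/LNet modules have to be unloaded and loaded again (for example
with lustre_rmmod) before the changed option takes effect.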
As for the attempt to mount via tcp (or tcp0, which is the same): I think
the problem is that the file system configuration on the MGS doesn't
contain the tcp NIDs, and/or the routes are not configured correctly.
Judging by the "No route to 12345-192.168.200.52@o2ib" messages, the tcp
mount still makes the client try o2ib for its connections to the MDS and
OSSes. So I would recommend getting o2ib working first and looking at tcp0
at a later stage (if you need it at all; native o2ib performs better).
Last but not least, I noticed a typo in your client mount command:
mount -t lustre 192.168.200.52@ob2:/mylustre /lustre
The network name has to be "o2ib" here, too, i.e. 192.168.200.52@o2ib.
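Typos like this are easy to miss, so a small sanity check before mounting
may help. This is only a sketch: check_mount_nid is a hypothetical helper,
and it knows just the two LNet types used in this thread (tcp and o2ib),
so it would have to be extended for other network types.

```shell
# Hypothetical helper: checks that a mount source of the form
# <address>@<lnet>:/<fsname> names a tcp or o2ib LNet,
# optionally followed by a network number (tcp0, o2ib1, ...).
check_mount_nid() {
    case "$1" in
        *@tcp:/*|*@tcp[0-9]*:/*|*@o2ib:/*|*@o2ib[0-9]*:/*)
            return 0 ;;
        *)
            echo "unrecognized LNet in '$1'" >&2
            return 1 ;;
    esac
}
```

For example, "check_mount_nid 192.168.200.52@ob2:/mylustre" fails, while
the same call with @o2ib succeeds.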
best regards,
Martin
On 07/20/2016 08:09 PM, sohamm wrote:
> Hi
>
> Any guidance/help on this is greatly appreciated.
>
> Thanks
>
> On Mon, Jul 18, 2016 at 7:25 PM, sohamm wrote:
>
>> Hi Ben
>> Both the networks have netmasks of value 255.255.255.0
>>
>> Thanks
>>
>> On Mon, Jul 18, 2016 at 10:08 AM, Ben Evans wrote:
>>
>>> What do your netmasks look like on each network?
>>>
>>> From: lustre-discuss on behalf
>>> of sohamm
>>> Date: Monday, July 18, 2016 at 1:56 AM
>>> To: "lustre-discuss@lists.lustre.org"
>>> Subject: Re: [lustre-discuss] lustre-discuss Digest, Vol 124, Issue 17
>>>
>>> Hi Thomas
>>> Below are the results of the commands you suggested.
>>>
>>> *From Client*
>>> [root@dev1 ~]# lctl ping 192.168.200.52@o2ib
>>> failed to ping 192.168.200.52@o2ib: Input/output error
>>> [root@dev1 ~]# lctl ping 192.168.111.52@tcp
>>> 12345-0@lo
>>> 12345-192.168.200.52@o2ib
>>> 12345-192.168.111.52@tcp
>>> [root@dev1 ~]# mount -t lustre 192.168.111.52@tcp:/mylustre /lustre
>>> mount.lustre: mount 192.168.111.52@tcp:/mylustre at /lustre failed:
>>> Input/output error
>>> Is the MGS running?
>>> mount: mounting 192.168.111.52@tcp:/mylustre on /lustre failed: Invalid
>>> argument
>>>
>>> cat /var/log/messages | tail
>>> Jul 18 01:37:04 dev1 user.warn kernel: [2250504.401397] ib1: multicast
>>> join failed for ff12:401b::::::, status -22
>>> Jul 18 01:37:26 dev1 user.warn kernel: [2250526.257309] LNet: No route to
>>> 12345-192.168.200.52@o2ib via (all routers down)
>>> Jul 18 01:37:36 dev1 user.warn kernel: [2250536.481862] ib1: multicast
>>> join failed for ff12:401b::::::, status -22
>>> Jul 18 01:41:53 dev1 user.warn kernel: [2250792.947299] LNet: No route to
>>> 12345-192.168.200.52@o2ib via (all routers down)
>>>
>>>
>>> *From MGS*
>>> [root@lustre_mgs01_vm03 ~]# lctl ping 192.168.111.102@tcp
>>> 12345-0@lo
>>> 12345-192.168.111.102@tcp
>>>
>>> Please let me know what else i can try. Looks like i am missing something
>>> with the ib config? Do i need router setup as part of lnet ?
>>> if i am able to ping mgs from client on the tcp network, it should still
>>> work ?
>>>
>>> Thanks
>>>
>>>
>>> On Sun, Jul 17, 2016 at 1:07 PM,
To: "lustre-discuss@lists.lustre.org"
Subject: [lustre-discuss] llapi_file_get_stripe() and
/proc/fs/lustre/osc/entries
Message-ID: <03ceaaa0-b004-ae43-eaa1-437da2a5b...@iodoctors.com>
Content-Type: text/plain; charset="utf-8"; Format="flowed"
I am using