So, after a few weeks I picked this back up. It turned out that the order of
the interfaces in modprobe.conf caused this, i.e.:
options lnet networks=o2ib(ib0),tcp(eth0)
did not work while:
options lnet networks=tcp(eth0),o2ib(ib0)
worked just fine.
Is this a bug?
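
For anyone comparing configs across nodes, here is a rough Python sketch that pulls the networks out of an "options lnet" line in the order they are listed, so the two variants above can be diffed mechanically. The function name parse_lnet_networks is made up for illustration and is not part of Lustre; it only reads the text of the line and says nothing about how LNet itself treats the order.

```python
import re

def parse_lnet_networks(line):
    """Return (network, interfaces) pairs in the order they are listed.

    Hypothetical helper for illustration only, not a Lustre API.
    'interfaces' is the raw text inside the parentheses, or '' if none.
    """
    match = re.search(r"networks=(\S+)", line)
    if match is None:
        return []
    value = match.group(1).strip("\"'")
    # Each entry looks like net(iface[,iface...]); the parentheses are optional.
    return re.findall(r"([A-Za-z0-9]+)(?:\(([^()]*)\))?", value)

failing = parse_lnet_networks("options lnet networks=o2ib(ib0),tcp(eth0)")
working = parse_lnet_networks("options lnet networks=tcp(eth0),o2ib(ib0)")
print(failing)
print(working)
```

Both lines name the same networks and interfaces; only the ordering differs.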
John White
High
While I've never used Lustre on IB, I have seen clients with very similar
symptoms before when the network appears to be functioning properly. You should
verify that jumbo frames are getting passed properly between your networks (or
switch ports if both hosts are on the same network segment). If
Unfortunately, in this case there is a clear network path and no firewalls. A
quick telnet test at least confirms that something is listening on the port and
closes the connection pretty quickly. If networking were the problem, wouldn't I
still see connection errors for the @tcp NID?
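
The telnet check can be scripted roughly as below, as a minimal sketch only. The helper name tcp_port_open is made up, and the default port 988 assumes the stock LNet acceptor port; substitute whatever your servers actually use. A successful connect only shows that something accepts TCP connections on that port, not that LNet traffic gets through.

```python
import socket

def tcp_port_open(host, port=988, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper; 988 is assumed to be the LNet acceptor port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Usage would be something like tcp_port_open("mds-host"), where "mds-host" is a placeholder for the server's TCP hostname.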
Jo
On 2010-03-26, at 17:45, John White wrote:
> We've got a new client we're trying to get to mount an existing
> file system. The host cluster is set up with 2 NIDs for the MDT
> (o2ib, tcp), same with the client. When I try mounting via tcp
> (mount -t lustre -o flock n0006.lus...@tcp:
Hello Folks,
We've got a new client we're trying to get to mount an existing file
system. The host cluster is set up with 2 NIDs for the MDT (o2ib, tcp), same
with the client. When I try mounting via tcp (mount -t lustre -o flock
n0006.lus...@tcp:/vulcan /clusterfs/vulcan/pscratch), it