Hey Celine,
Thanks for gathering all this info! So the RDMA connections work fine
with everything _but_ NFS/RDMA. And errno 103 is ECONNABORTED: the
connection was aborted, possibly by the server, since the client logs
no failures.
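In case it helps when reading the dmesg output, here is a minimal sketch for decoding the error code in the "closed (-103)" lines. The function name errno_name is just something made up for illustration, and the table only covers a few of the generic Linux errno values in the connection-error range, so treat it as a convenience rather than a complete lookup:

```shell
#!/bin/sh
# Map a few generic Linux errno values to their symbolic names.
# Accepts the value with or without the leading minus sign the kernel logs.
errno_name() {
    n=${1#-}                      # strip a leading "-" (e.g. -103 -> 103)
    case "$n" in
        103) echo ECONNABORTED ;; # software caused connection abort
        104) echo ECONNRESET ;;   # connection reset by peer
        110) echo ETIMEDOUT ;;    # connection timed out
        111) echo ECONNREFUSED ;; # connection refused
        113) echo EHOSTUNREACH ;; # no route to host
        *)   echo "unknown($n)" ;;
    esac
}

errno_name -103
```

Anything outside this small table comes back as unknown(N) and is best looked up in the errno headers on the system itself.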
More below:
Celine Bourde wrote:
Hi Steve,
This email summarizes the situation:
Standard mount -> OK
---------------------
[r...@twind ~]# mount -o rw 192.168.0.215:/vol0 /mnt/
Command works fine.
rdma mount -> FAILS
--------------------
[r...@twind ~]# mount -o rdma,port=2050 192.168.0.215:/vol0 /mnt/
The command hangs! I have to press Ctrl+C to kill the process.
or
[r...@twind ofa_kernel-1.4.1]# strace mount.nfs 192.168.0.215:/vol0 /mnt/ -o rdma,port=2050
[..]
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(610),
sin_addr=inet_addr("127.0.0.1")}, 16) = 0
fcntl(3, F_SETFL, O_RDWR) = 0
sendto(3,
"-3\245\357\0\0\0\0\0\0\0\2\0\1\206\270\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0"...,
40, 0, {sa_family=AF_INET, sin_port=htons(610),
sin_addr=inet_addr("127.0.0.1")}, 16) = 40
poll([{fd=3, events=POLLIN}], 1, 3000) = 1 ([{fd=3, revents=POLLIN}])
recvfrom(3, "-3\245\357\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0",
8800, MSG_DONTWAIT, {sa_family=AF_INET, sin_port=htons(610),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 24
close(3) = 0
mount("192.168.0.215:/vol0", "/mnt", "nfs", 0,
"rdma,port=2050,addr=192.168.0.215"
..same problem
[r...@twind tmp]# dmesg
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
rpcrdma: connection to 192.168.0.215:2050 on mlx4_0, memreg 5 slots 32 ird 16
rpcrdma: connection to 192.168.0.215:2050 closed (-103)
Is there anything logged on the server side?
Also, can you try this again, but on both systems do this before
attempting the mount:
echo 32767 > /proc/sys/sunrpc/rpc_debug
This will enable all the RPC trace points and add a bunch of logging to
/var/log/messages.
Maybe that will show us something. I think the server is aborting the
connection for some reason.
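If it is easier, the debug step can be wrapped so the extra logging is only on while the mount attempt runs. This is a rough sketch, not a tested procedure: with_rpc_debug and RPC_DEBUG_FILE are made-up names, and it assumes debugging was off beforehand, since it writes 0 back afterwards instead of restoring the previous value. It needs to run as root on each system.

```shell
#!/bin/sh
# Hypothetical helper: enable sunrpc debugging around a single command.
# RPC_DEBUG_FILE may be overridden (e.g. for a dry run against a scratch file);
# the default is the sysctl file used in the echo command above.
RPC_DEBUG_FILE=${RPC_DEBUG_FILE:-/proc/sys/sunrpc/rpc_debug}

with_rpc_debug() {
    wrd_mask=$1
    shift
    # Turn debugging on with the requested mask.
    echo "$wrd_mask" > "$RPC_DEBUG_FILE" || return 1
    # Run the command under test (for example the mount) and keep its status.
    "$@"
    wrd_status=$?
    # Assumption: debugging was off before, so switch it off again with 0.
    echo 0 > "$RPC_DEBUG_FILE"
    return "$wrd_status"
}
```

With MASK set to the value from the echo command, it would be used as: with_rpc_debug "$MASK" mount -o rdma,port=2050 192.168.0.215:/vol0 /mnt/ (the mount will probably still hang and need Ctrl+C, so you may have to write 0 to rpc_debug by hand afterwards).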
Steve.