<DISCLAIMER>
Warning: semi-long mail.
</DISCLAIMER>
I have a somewhat strange problem with knfsd, lockd and statd. I have
a bunch of machines: some running Linux, some running HP-UX 10.20, some
running Solaris 7 and some running Digital UNIX V4.0D. The problem is
locking a file across NFS (fcntl) between a (HPUX|DUX)-box and my
linux-box. If I lock from a (solaris|linux)-box to my linux-box,
everything works (and yes, lockd and statd are running ;). I've tried to
give as much info as I can.
I've tested with this simple program:
------------------------------------------------------------
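#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    struct flock lock;
    int fd, tmp;

    if (argc < 2) { exit(1); }
    fd = open(argv[1], O_RDWR);
    printf("FD: %d\n", fd);

    /* Lock the whole file; unset fields would otherwise hold garbage. */
    memset(&lock, 0, sizeof(lock));
    lock.l_type = F_WRLCK;
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;    /* 0 means "to end of file" */

    tmp = fcntl(fd, F_SETLKW, &lock);
    if (tmp < 0) {
        perror("lock");
    }
    printf("TMP: %d\n", tmp);
    return 0;
}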
------------------------------------------------------------
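A non-blocking variant of the test can make it easier to see which error a given client gets back, since F_SETLKW may simply hang. The following is only a sketch: the helper name try_write_lock is made up for illustration and is not part of the program I actually ran. It uses F_SETLK and returns the errno value instead of waiting.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not from the original test program): try a
 * non-blocking write lock on the whole file.
 * Returns 0 on success, otherwise the errno value of the failing call. */
int try_write_lock(const char *path)
{
    struct flock lock;
    int fd, err = 0;

    assert(path != NULL);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return errno;

    memset(&lock, 0, sizeof(lock));
    lock.l_type = F_WRLCK;
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;             /* 0 means "to end of file" */

    if (fcntl(fd, F_SETLK, &lock) < 0) {
        err = errno;
        fprintf(stderr, "lock %s: %s (errno %d)\n",
                path, strerror(err), err);
    }

    close(fd);                  /* closing the descriptor drops the lock */
    return err;
}
```

Calling try_write_lock() on a file under the NFS mount from each client and comparing the returned value (for instance EACCES versus ENOLCK) should show whether the HPUX and DUX clients fail in the same way.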
I export the files from my linux-box like this (/etc/exports):
------------------------------------------------------------
/net/hugin/vol/mail (rw) @diku
------------------------------------------------------------
I do get an error in syslog when I try to lock from HPUX or DUX. It
reads (in the syslog of the linux-server):
------------------------------------------------------------
May 11 21:42:50 hugin kernel: fh_verify: mailspool/hall permission \
failure, acc=4, error=13
------------------------------------------------------------
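For reference, the number 13 matches both NFSERR_ACCES in the NFS protocol and EACCES ("Permission denied") on Linux; which of the two the kernel prints here is an assumption on my part, and I have not decoded the acc value. A small sketch for pulling the two numbers out of such a line follows; parse_fh_verify is a made-up helper name, and the input is the log message joined onto one line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the acc= and error= fields from a
 * knfsd "fh_verify: ... permission failure" syslog line.
 * Returns 1 if both fields were found, 0 otherwise. */
int parse_fh_verify(const char *line, int *acc, int *error)
{
    const char *p;

    assert(line != NULL && acc != NULL && error != NULL);

    p = strstr(line, "acc=");
    if (p == NULL)
        return 0;

    return sscanf(p, "acc=%d, error=%d", acc, error) == 2;
}

/* Print the fields together with the local strerror() text.
 * Treating error as an errno value is an assumption (see above). */
void report_fh_verify(const char *line)
{
    int acc, error;

    if (parse_fh_verify(line, &acc, &error))
        printf("acc=%d error=%d (%s)\n", acc, error, strerror(error));
    else
        printf("no acc/error fields found\n");
}
```

Feeding it the line above via report_fh_verify() prints the access mask and the error number next to the local error string, which is handy when grepping a long syslog for these failures.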
The linux boxes are redhat-6.1 with the default kernel 2.2.12 (rpm
build 20). The nfs is knfsd-1.4.7 and knfsd-clients-1.4.7 (rpm build
7). Maybe I should mention that files are created in the
/var/lib/nfs/sm/ directory, with the right IPs (even for the failed
locks).
Now, in order to provide even further documentation, here is a
tcpdump-session of a failed lock from a HPUX-box to a linux-box:
------------------------------------------------------------
22:09:34.526293 eth0 < skirner.diku.dk.1998905672 > hugin.diku.dk.nfs: 128 getattr [|nfs]
22:09:34.526398 eth0 > hugin.diku.dk.nfs > skirner.diku.dk.1998905672: reply ok 96 getattr REG 100600 ids 3026/125 sz 420
22:09:34.527543 eth0 < skirner.diku.dk.1998905673 > hugin.diku.dk.nfs: 128 getattr [|nfs]
22:09:34.527608 eth0 > hugin.diku.dk.nfs > skirner.diku.dk.1998905673: reply ok 96 getattr REG 100600 ids 3026/125 sz 420
22:09:34.530435 eth0 < skirner.diku.dk.719 > hugin.diku.dk.1041: udp 156
22:09:34.530575 eth0 > hugin.diku.dk.794 > skirner.diku.dk.3128: udp 88
22:09:34.530611 eth0 > hugin.diku.dk.1041 > skirner.diku.dk.719: udp 24
22:09:39.518360 eth0 > arp who-has skirner.diku.dk tell hugin.diku.dk (0:90:27:61:e6:5a)
22:09:39.518577 eth0 < arp reply skirner.diku.dk is-at 0:60:b0:a4:2d:36 (0:90:27:61:e6:5a)
22:09:43.655227 eth0 < skirner.diku.dk.821138 > hugin.diku.dk.nfs: 40 null
22:09:43.655278 eth0 > hugin.diku.dk.nfs > skirner.diku.dk.821138: reply ok 24 null
------------------------------------------------------------
But when I run it between two linux-boxes, I get this:
------------------------------------------------------------
22:10:27.327087 eth0 < nanna.diku.dk.2186153216 > hugin.diku.dk.nfs: 40 null
22:10:27.327166 eth0 > arp who-has nanna.diku.dk tell hugin.diku.dk (0:90:27:61:e6:5a)
22:10:27.327286 eth0 < arp reply nanna.diku.dk is-at 0:d0:b7:e:46:a4 (0:90:27:61:e6:5a)
22:10:27.327302 eth0 > hugin.diku.dk.nfs > nanna.diku.dk.2186153216: reply ok 24 null
22:10:27.327485 eth0 < nanna.diku.dk.1022 > hugin.diku.dk.sunrpc: udp 144
22:10:27.327870 eth0 > hugin.diku.dk.sunrpc > nanna.diku.dk.1022: udp 28
22:10:27.328455 eth0 < nanna.diku.dk.1022 > hugin.diku.dk.638: udp 148
22:10:27.330261 eth0 > hugin.diku.dk.638 > nanna.diku.dk.1022: udp 60
22:10:27.331862 eth0 < nanna.diku.dk.3351726386 > hugin.diku.dk.nfs: 160 getattr [|nfs]
22:10:27.331936 eth0 > hugin.diku.dk.nfs > nanna.diku.dk.3351726386: reply ok 96 getattr DIR 42775 ids 0/125 sz 4096
22:10:27.333288 eth0 < nanna.diku.dk.3402058034 > hugin.diku.dk.nfs: 152 lookup [|nfs]
22:10:27.333394 eth0 > hugin.diku.dk.nfs > nanna.diku.dk.3402058034: reply ok 128 lookup fh Unknown/1
22:10:27.334010 eth0 < nanna.diku.dk.1315 > hugin.diku.dk.sunrpc: udp 56
22:10:27.334364 eth0 > hugin.diku.dk.sunrpc > nanna.diku.dk.1315: udp 28
22:10:27.334565 eth0 < nanna.diku.dk.798 > hugin.diku.dk.1041: udp 228
22:10:27.335582 eth0 > hugin.diku.dk.1041 > nanna.diku.dk.798: udp 36
22:10:27.336042 eth0 < nanna.diku.dk.798 > hugin.diku.dk.1041: udp 212
22:10:27.336102 eth0 > hugin.diku.dk.1041 > nanna.diku.dk.798: udp 36
------------------------------------------------------------
There seems to be more nfs/rpc communication when the locking works.
In particular, the linux client talks to the portmapper (sunrpc) on
hugin, which the HPUX client does not do in the failed trace.
BTW everything is mounted with amd (am-utils-6.0.2). The amd
configuration is the same on all machines (except arch):
------------------------------------------------------------
[ global ]
arch = i686linux
auto_dir = /net
local_domain = diku.dk
log_file = syslog
log_options = all,noinfo,nomap,nostats,nouser
plock = yes
print_version = no
restart_mounts = yes
unmount_on_exit = yes
[ /home ]
map_type = nis
map_name = amd.home
map_options = cache:=all,sync
[ /vol ]
map_type = nis
map_name = amd.vol
map_options = cache:=all,sync
------------------------------------------------------------
Can anybody help?
PS: If you got this far, I admire your patience.
--
Christoffer