<DISCLAIMER>

Warning: semi-long mail.

</DISCLAIMER>

I have a somewhat strange problem with knfsd, lockd and statd. I have
a bunch of machines: some running Linux, some HPUX 10.20, some
Solaris 7 and some Digital UNIX V4.0D. The problem is that locking a
file over NFS (fcntl) from an (HPUX|DUX) box to my Linux box fails.
If I lock from a (Solaris|Linux) box to my Linux box, everything
works (and yes, lockd and statd are running ;). I've tried to give as
much info as I can.

I've tested with this simple program:

------------------------------------------------------------
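#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    struct flock lock;
    int fd, tmp;

    if (argc < 2) {
        exit(1);
    }

    fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open");
        exit(1);
    }
    printf("FD: %d\n", fd);

    /* Lock the whole file: zero every field, then set the type. */
    memset(&lock, 0, sizeof(lock));
    lock.l_type = F_WRLCK;
    lock.l_whence = SEEK_SET;   /* l_start = 0, l_len = 0: to end of file */

    /* Blocking request for a write lock. */
    tmp = fcntl(fd, F_SETLKW, &lock);
    if (tmp < 0) {
        perror("lock");
    }

    printf("TMP: %d\n", tmp);
    return 0;
}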
------------------------------------------------------------



I export the files from my linux-box like this (/etc/exports):
------------------------------------------------------------
/net/hugin/vol/mail (rw) @diku
------------------------------------------------------------

I do get an error in the syslog of the Linux server when I try to lock
from HPUX or DUX. It reads:

------------------------------------------------------------
May 11 21:42:50 hugin kernel: fh_verify: mailspool/hall permission \
failure, acc=4, error=13
------------------------------------------------------------

The Linux boxes are redhat-6.1 with the default kernel 2.2.12 (rpm
build 20). NFS is knfsd-1.4.7 and knfsd-clients-1.4.7 (rpm build 7).
Maybe I should mention that files are created in the /var/lib/nfs/sm/
directory with the right IPs (even for the failed locks).

To provide further documentation, here is a tcpdump session of a
failed lock from an HPUX box to a Linux box:

------------------------------------------------------------
22:09:34.526293 eth0 < skirner.diku.dk.1998905672 > hugin.diku.dk.nfs: 128 getattr [|nfs]
22:09:34.526398 eth0 > hugin.diku.dk.nfs > skirner.diku.dk.1998905672: reply ok 96 getattr REG 100600 ids 3026/125 sz 420
22:09:34.527543 eth0 < skirner.diku.dk.1998905673 > hugin.diku.dk.nfs: 128 getattr [|nfs]
22:09:34.527608 eth0 > hugin.diku.dk.nfs > skirner.diku.dk.1998905673: reply ok 96 getattr REG 100600 ids 3026/125 sz 420
22:09:34.530435 eth0 < skirner.diku.dk.719 > hugin.diku.dk.1041: udp 156
22:09:34.530575 eth0 > hugin.diku.dk.794 > skirner.diku.dk.3128: udp 88
22:09:34.530611 eth0 > hugin.diku.dk.1041 > skirner.diku.dk.719: udp 24
22:09:39.518360 eth0 > arp who-has skirner.diku.dk tell hugin.diku.dk (0:90:27:61:e6:5a)
22:09:39.518577 eth0 < arp reply skirner.diku.dk is-at 0:60:b0:a4:2d:36 (0:90:27:61:e6:5a)
22:09:43.655227 eth0 < skirner.diku.dk.821138 > hugin.diku.dk.nfs: 40 null
22:09:43.655278 eth0 > hugin.diku.dk.nfs > skirner.diku.dk.821138: reply ok 24 null
------------------------------------------------------------

But when I run it between two Linux boxes, I get this:

------------------------------------------------------------
22:10:27.327087 eth0 < nanna.diku.dk.2186153216 > hugin.diku.dk.nfs: 40 null
22:10:27.327166 eth0 > arp who-has nanna.diku.dk tell hugin.diku.dk (0:90:27:61:e6:5a)
22:10:27.327286 eth0 < arp reply nanna.diku.dk is-at 0:d0:b7:e:46:a4 (0:90:27:61:e6:5a)
22:10:27.327302 eth0 > hugin.diku.dk.nfs > nanna.diku.dk.2186153216: reply ok 24 null
22:10:27.327485 eth0 < nanna.diku.dk.1022 > hugin.diku.dk.sunrpc: udp 144
22:10:27.327870 eth0 > hugin.diku.dk.sunrpc > nanna.diku.dk.1022: udp 28
22:10:27.328455 eth0 < nanna.diku.dk.1022 > hugin.diku.dk.638: udp 148
22:10:27.330261 eth0 > hugin.diku.dk.638 > nanna.diku.dk.1022: udp 60
22:10:27.331862 eth0 < nanna.diku.dk.3351726386 > hugin.diku.dk.nfs: 160 getattr [|nfs]
22:10:27.331936 eth0 > hugin.diku.dk.nfs > nanna.diku.dk.3351726386: reply ok 96 getattr DIR 42775 ids 0/125 sz 4096
22:10:27.333288 eth0 < nanna.diku.dk.3402058034 > hugin.diku.dk.nfs: 152 lookup [|nfs]
22:10:27.333394 eth0 > hugin.diku.dk.nfs > nanna.diku.dk.3402058034: reply ok 128 lookup fh Unknown/1
22:10:27.334010 eth0 < nanna.diku.dk.1315 > hugin.diku.dk.sunrpc: udp 56
22:10:27.334364 eth0 > hugin.diku.dk.sunrpc > nanna.diku.dk.1315: udp 28
22:10:27.334565 eth0 < nanna.diku.dk.798 > hugin.diku.dk.1041: udp 228
22:10:27.335582 eth0 > hugin.diku.dk.1041 > nanna.diku.dk.798: udp 36
22:10:27.336042 eth0 < nanna.diku.dk.798 > hugin.diku.dk.1041: udp 212
22:10:27.336102 eth0 > hugin.diku.dk.1041 > nanna.diku.dk.798: udp 36
------------------------------------------------------------

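There seems to be more NFS/RPC communication when the locking works:
the working trace includes portmapper (sunrpc) queries and two
request/reply exchanges with port 1041, while the failing trace shows
only one request to port 1041, answered with a 24-byte reply.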

BTW, everything is mounted with amd (am-utils-6.0.2). The
configuration is the same on all machines (except arch):

------------------------------------------------------------
[ global ]
arch =                  i686linux
auto_dir =              /net
local_domain =          diku.dk
log_file =              syslog
log_options =           all,noinfo,nomap,nostats,nouser
plock =                 yes
print_version =         no
restart_mounts =        yes
unmount_on_exit =       yes
[ /home ]
map_type =              nis
map_name =              amd.home
map_options =           cache:=all,sync
[ /vol ]
map_type =              nis
map_name =              amd.vol
map_options =           cache:=all,sync

------------------------------------------------------------

Can anybody help?

PS: If you got this far, I admire your patience.

-- 
        Christoffer