On 21.07.12 07:51, Gene Heskett wrote:
> On Saturday 21 July 2012 06:55:06 Erik Christiansen did opine:
> > and seems to be reasonably talkative, even when things are going
> > well.

> Not so much, but maybe a clue from "service nfs-kernel-server restart":
> 
> Jul 21 06:53:30 coyote kernel: [69826.321565] nfsd: last server has exited, 
> flushing export cache

Since most of your mounts aren't working, let's check that this doesn't
mean that all the nfs daemons have left the building. A quick

$ ps -ef | grep nfsd

shows bunches of them here, and I've always made sure there were at
least 4 of them running on a server. You get nowhere if they're gone,
and I have had it happen, both on HPUX and Solaris boxes, back when I
thought I knew how this stuff works.
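If you'd rather have a number than eyeball the ps listing, here is a rough sketch of the same check. The function names count_nfsd and check_nfsd are made up for illustration, and the threshold of 4 is only the habit mentioned above, not a hard requirement.

```shell
# Count nfsd kernel threads from a list of command names, one per line.
# Expects input shaped like the output of: ps -e -o comm=
count_nfsd() {
    # -x matches whole lines only, so "nfsd4" or "rpc.nfsd" are not counted.
    # grep -c exits non-zero on a zero count, hence the "|| true".
    grep -c -x 'nfsd' || true
}

# Complain if fewer than the expected number of threads are alive.
# $1 = minimum acceptable count (hypothetical default of 4).
check_nfsd() {
    n=$(count_nfsd)
    if [ "$n" -lt "${1:-4}" ]; then
        echo "only $n nfsd threads running"
        return 1
    fi
    echo "$n nfsd threads running"
}
```

Something like "ps -e -o comm= | check_nfsd 4" on the server would then say straight away whether the daemons are still in the building.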

> Jul 21 06:53:31 coyote kernel: [69827.534764] svc: failed to register 
> lockdv1 RPC service (errno 97).
> Jul 21 06:53:31 coyote kernel: [69827.535528] NFSD: Using 
> /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
> Jul 21 06:53:31 coyote kernel: [69827.535545] NFSD: starting 90-second 
> grace period
> 
> What is this lockdv1 RPC service?

Yeah. It has a real guilty look, doesn't it? Just looking at it, I guess
it's a version 1 lock daemon, which the log entry is telling us is a Remote
Procedure Call service. (RPC is an ancient unix protocol for making
procedure calls on other machines across the network.
If you do a "man -a rpc", you'll see that you could use it to get at
nfs and portmapper services from a C program.) Incidentally, nfs needs
the portmapper running too, IIRC:

$ ps -ef | egrep portmap
daemon     701     1  0 16:22 ?        00:00:00 portmap

The registration failure doesn't have to mean much, though. I have the
same here:

Jul 21 17:39:34 ratatosk kernel: [ 4638.217979] svc: failed
to register lockdv1 RPC service (errno 97).

I'm happy with this:

$ ps -ef | egrep '(lockd|statd)'       # egrep, not just grep. ;-)
root        15     2  0 16:22 ?        00:00:00 [kblockd/0]
root      3073     1  0 17:39 ?        00:00:00 rpc.statd -L
root      3340     2  0 17:39 ?        00:00:00 [lockd]

NFS does need lockd and statd to work properly, AFAIR, but it doesn't
have to be lockdv1, I figure.
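For what it's worth, errno 97 on Linux is EAFNOSUPPORT ("Address family not supported by protocol"), so my guess is that one registration attempt failed while others went through. One way to confirm is to ask the portmapper what actually got registered. A small sketch, assuming the usual five-column "rpcinfo -p" listing where lockd appears as "nlockmgr" and statd as "status"; the helper name has_rpc_service is made up:

```shell
# Succeed if the named RPC service appears in `rpcinfo -p` output on stdin.
# $1 = service name as shown in the last column, e.g. nlockmgr or status.
has_rpc_service() {
    # NR > 1 skips the "program vers proto port service" header line.
    awk -v svc="$1" 'NR > 1 && $5 == svc { found = 1 } END { exit !found }'
}
```

On the server, "rpcinfo -p | has_rpc_service nlockmgr && echo lockd registered" should then print its message if any version of the lock manager made it into the portmapper's table.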

...

> > How does your DNS respond to that hostname, if you try a
> > "dig coyote.coyote.den", or if not, does at least
> > "ping coyote.coyote.den " pick up the right IP address?
> 
> And this is nuking futz:

... Crook DNS results cropped to shorten things.

> Totally AFU!

OK, we don't want to rely on DNS to resolve those hostnames. ;-)

... Good ping results wuz here.

> My hosts file:
> 127.0.0.1     localhost
> 192.168.71.3  coyote.coyote.den       coyote
> 192.168.71.1  router.coyote.den       router
> 192.168.71.4  shop.coyote.den         shop
> 192.168.71.5  lathe.coyote.den        lathe
> 192.168.71.6  lappy.coyote.den        lappy
> 
> # The following lines are desirable for IPv6 capable hosts
> ::1     localhost ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> ff00::0 ip6-mcastprefix
> ff02::1 ip6-allnodes
> ff02::2 ip6-allrouters
> ff02::3 ip6-allhosts
> 
> And resolv.conf:
> nameserver 192.168.71.1
> domain coyote.coyote.den
> search        hosts,dns
> 
> Where the router is the gateway, which fwds the dns requests to one of the 
> shentel servers,  209.55.24.10  or 209.55.27.13

That's similar to what I have here, and a ping checks /etc/hosts, but
dig doesn't, although I have:

$ more /etc/host.conf 
# The "order" line is only used by old versions of the C library.
order hosts,bind
multi on

And there's a likely explanation for why dig doesn't check /etc/hosts
here either, because that "C library" is the resolver library, and dig
queries the nameserver directly rather than going through it.
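To see what the hosts-file half of the lookup would answer, with DNS out of the picture entirely, "getent hosts coyote.coyote.den" is the real tool, since it goes through the same resolver library that ping uses. The sketch below only imitates the /etc/hosts part of that, so it can be pointed at any copy of a hosts file; the function name lookup_hosts is made up for illustration.

```shell
# Print the address that a hosts-format file gives for a name or alias.
# $1 = hostname to look up, $2 = hosts file (defaults to /etc/hosts).
lookup_hosts() {
    awk -v name="$1" '
        { sub(/#.*/, "") }          # drop comments before matching
        {
            # Field 1 is the address; every later field is a name or alias.
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "${2:-/etc/hosts}"
}
```

With the hosts file quoted above, "lookup_hosts coyote" and "lookup_hosts coyote.coyote.den" should both come back with 192.168.71.3, whatever the router's DNS thinks.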

... More good ping results elided.

> I do not have a local to this machine dns server (adns, dnsmasq, etc) 
> installed, and just installed dnswalk to see what it says:
> root@coyote:/var# dnswalk -adilrfFm coyote.coyote.den.
> Checking coyote.coyote.den.
> BAD: SOA record not found for coyote.coyote.den.
> !BAD: coyote.coyote.den. has NO authoritative nameservers!
> !BAD: All zone transfer attempts of coyote.coyote.den. failed!
> !0 failures, 0 warnings, 3 errors.

To make sure we only have to debug nfs, what about trying in
/etc/exports:

/my/shared/filesystem  192.168.71.0/24(rw)

(I've forgotten the names of what you're exporting)
Now we don't have to resolve any hostnames, which eliminates another
potential cause of the observed failure. The other mount attributes
won't hurt, but they're both defaults now, IIUC.
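If there's any doubt about which clients that network/mask form covers, the arithmetic is easy enough to check by hand. A sketch only: the helper names ip_to_int and ip_in_cidr are made up, and it assumes plain dotted-quad IPv4 addresses without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() (
    IFS=.
    set -- $1                       # split the address on dots
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if address $1 falls inside network $2 (given as a.b.c.d/bits).
ip_in_cidr() (
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")      # part before the slash
    bits=${2#*/}                    # part after the slash
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
)
```

For example, "ip_in_cidr 192.168.71.4 192.168.71.0/24 && echo covered" should print "covered" for shop, and likewise for every other address in the hosts file above. After editing /etc/exports, "exportfs -ra" re-reads it, and "showmount -e" on the server lists what is actually being exported.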

> And of course there is no /var/named directory.  I figured the hosts files 
> should handle the local stuff & if it's not in the hosts file, send it to 
> the gateway, my router, which is a Buffalo Hi-Power running genuine dd-wrt.
> 
> So despite the dig results, I should be covered.  Humm, "hostname" returns 
> the alias!
> root@coyote:/var# hostname
> coyote

"Should be", but what happens with "192.168.71.0/24(rw)" in
/etc/exports, I wonder?

> Should it not be returning the FQDN?

Try: $ hostname --fqdn

> So there are 3 head scratchers above. 
> 
> I just ran hostname and domainname in 'set' mode, they now return correct 
> answers, but restarting nfs-kernel-server and autofs still returns the same 
> results, total failure.

See sig. ;-)

> 
> Thanks Erik

No worries. By the time this is fixed, I figure I'll have learnt
something too. (I'll have a look at that dnswalk for starters. :-)

Erik

-- 
Sometimes you have to outsmart this stuff, it works for Murphy you know.
                                        - Gene Heskett, on emc-users ML.

