I poked around at this today and figured it out, sort of.
When I did 'gluster volume set named nfs.register-with-portmap on'
originally, I only had rpcbind running on two of my four servers.
Gluster NFS started up on all four, but only two of them registered
with rpcbind/portmap. It seems that if rpcbind is not running when you
set 'register-with-portmap on', registration still doesn't work even
after you cycle gluster.
So, I started up rpcbind, did a 'register-with-portmap off', followed by
'register-with-portmap on' and it works now.
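
The sequence above can be sketched as a small shell script. This is only a sketch, assuming RHEL6-style 'service' commands; the function names are made up for illustration and are not part of gluster. The check simply looks for the mountd (100005) and nfs (100003) program numbers in 'rpcinfo -p' output.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of gluster.

# Reads 'rpcinfo -p' output on stdin; succeeds only if both mountd (100005)
# and nfs (100003) are registered with the portmapper.
nfs_registered() {
    awk '$1 == 100005 { mountd = 1 }
         $1 == 100003 { nfs = 1 }
         END { exit !(mountd && nfs) }'
}

# Start rpcbind first, then toggle the option so Gluster NFS re-registers.
# $1 is the volume name (assumed to be an existing, started volume).
reregister_with_portmap() {
    service rpcbind start &&
    gluster volume set "$1" nfs.register-with-portmap off &&
    gluster volume set "$1" nfs.register-with-portmap on
}
```

Usage would be something like 'reregister_with_portmap named', followed by 'rpcinfo -p | nfs_registered && echo registered' on each node.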
When I diffed the before and after nfs-server.vol files, I saw this:
[root@dresproddns01 ~]# diff nfs-vol-1 /etc/glusterd/nfs/nfs-server.vol
143c143
< option rpc.register-with-portmap off
---
> option rpc.register-with-portmap on
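
Without a saved copy to diff against, the same information can be pulled straight out of the generated vol file on each node. A minimal sketch, assuming the vol file path shown in the diff above; 'portmap_option' is a hypothetical helper name.

```shell
# Prints the value(s) of 'option rpc.register-with-portmap' found in a
# generated vol file. $1 defaults to the path shown in the diff above
# (an assumption; other installs may keep the file elsewhere).
portmap_option() {
    awk '$1 == "option" && $2 == "rpc.register-with-portmap" { print $3 }' \
        "${1:-/etc/glusterd/nfs/nfs-server.vol}"
}
```

Running 'portmap_option' on every node and comparing the answers shows which nodes actually have the option enabled in the running configuration, regardless of what 'gluster volume info' reports.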
Apparently, if rpcbind is not running, the option does not get enabled
properly. There is an error in nfs.log, but it's hard to find, especially
if the node you manage the cluster from isn't the node with the issue.
It also isn't clear that it's broken, even if you cycle gluster (and
even though the gluster volume configuration says 'register-with-portmap
on'). Does the 'gluster volume set' command have the ability to get
success/fail information back from each node? It also appears that
'register-with-portmap' is applied to all volumes, even if you just
enable it on one. Is there a cluster-wide place to 'set' options?
Here are the errors from nfs.log:
[2012-03-05 19:51:22.517368] E [rpcsvc.c:2771:nfs_rpcsvc_program_register_portmap] 0-nfsrpc: Could not register with portmap
[2012-03-05 19:51:22.517420] E [rpcsvc.c:2861:nfs_rpcsvc_program_register] 0-nfsrpc: portmap registration of program failed
[2012-03-05 19:51:22.517428] E [rpcsvc.c:2874:nfs_rpcsvc_program_register] 0-nfsrpc: Program registration failed: MOUNT3, Num: 100005, Ver: 3, Port: 38465
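
Since the error only shows up in the log of the affected node, one way to spot it is to filter each node's nfs.log for the registration failures. A minimal sketch, assuming each log entry sits on a single line in the actual file and that the message text matches the entries shown here; 'failed_registrations' is a hypothetical helper name.

```shell
# Reads a Gluster NFS log on stdin and prints the program details from every
# "Program registration failed" entry, e.g. "MOUNT3, Num: 100005, ...".
# Prints nothing if the log contains no such entry.
failed_registrations() {
    sed -n 's/.*Program registration failed: //p'
}
```

Locally this would be 'failed_registrations < /var/log/glusterfs/nfs.log'. To cover the whole cluster from one node, something like 'ssh dresproddns01 cat /var/log/glusterfs/nfs.log | failed_registrations' per host should do, assuming ssh access between the nodes.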
David
On 3/5/12 2:05 PM, Bryan Whitehead wrote:
Is selinux running? iptables?
Can you http://pastie.org/ the nfs.log in /var/log/glusterfs ?
On Mon, Mar 5, 2012 at 3:59 AM, David Coulson <da...@davidcoulson.net> wrote:
Yep.
[root@dresproddns01 ~]# service glusterd stop
Stopping glusterd: [ OK ]
[root@dresproddns01 ~]# ps ax | grep nfs
120494 pts/0 S+ 0:00 grep nfs
2167119 ? S 0:00 [nfsiod]
[root@dresproddns01 ~]# service rpcbind stop
Stopping rpcbind: [ OK ]
[root@dresproddns01 ~]# rpcinfo -p
rpcinfo: can't contact portmapper: RPC: Remote system error - No such file or directory
[root@dresproddns01 ~]# service rpcbind start
Starting rpcbind: [ OK ]
[root@dresproddns01 ~]# service glusterd start
Starting glusterd: [ OK ]
[root@dresproddns01 ~]# rpcinfo -p
program vers proto port service
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
Note that I waited a short while between the last two steps. FYI,
this is RHEL6 (the two systems that work are RHEL6 too, so I'm not
sure it matters much).
On 3/5/12 3:27 AM, Bryan Whitehead wrote:
Did you start the portmap service before you started gluster?
On Sun, Mar 4, 2012 at 11:53 AM, David Coulson <da...@davidcoulson.net> wrote:
I've four systems with multiple 4-way replica volumes. I'm
migrating a number of volumes from FUSE to NFS for
performance reasons.
My first two hosts seem to work nicely, but the other two
won't start the NFS services properly. I looked through the
nfs.log, but it doesn't give any indication of why it did not
register with rpcbind. I'm presuming I've got a
misconfiguration on two of the systems, but there isn't a
clear indication of what is not working.
Here is an example from a host which does not work:
[root@dresproddns01 ~]# rpcinfo -p
program vers proto port service
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
[root@dresproddns01 ~]# ps ax | grep nfs
2167119 ? S 0:00 [nfsiod]
2738268 ? Ssl 0:00 /opt/glusterfs/3.2.5/sbin/glusterfs -f /etc/glusterd/nfs/nfs-server.vol -p /etc/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log
2934228 pts/0 S+ 0:00 grep nfs
[root@dresproddns01 ~]# netstat -ntlp | grep 2738268
tcp 0 0 0.0.0.0:38465 0.0.0.0:* LISTEN 2738268/glusterfs
tcp 0 0 0.0.0.0:38466 0.0.0.0:* LISTEN 2738268/glusterfs
tcp 0 0 0.0.0.0:38467 0.0.0.0:* LISTEN 2738268/glusterfs
[root@dresproddns01 ~]# gluster volume info svn
Volume Name: svn
Type: Replicate
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: rhesproddns01:/gluster/svn
Brick2: rhesproddns02:/gluster/svn
Brick3: dresproddns01:/gluster/svn
Brick4: dresproddns02:/gluster/svn
Options Reconfigured:
performance.client-io-threads: 1
performance.flush-behind: on
network.ping-timeout: 5
performance.stat-prefetch: 1
nfs.disable: off
nfs.register-with-portmap: on
auth.allow: 10.250.53.*,10.252.248.*,169.254.*,127.0.0.1
performance.cache-size: 256Mb
performance.write-behind-window-size: 128Mb
The only obvious difference from a host that does work is this:
[root@rhesproddns01 named]# rpcinfo -p
program vers proto port service
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100005 3 tcp 38465 mountd
100005 1 tcp 38466 mountd
100003 3 tcp 38467 nfs
Any ideas where to look for errors?
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users