I poked around at this today and figured it out - sort of.

When I originally ran 'gluster volume set named nfs.register-with-portmap on', I only had rpcbind running on two of my four servers. Gluster NFS started up on all four, but only the two with rpcbind actually registered with rpcbind/portmap. It seems that if rpcbind is not running when you set 'register-with-portmap on', the registration never happens - even after you cycle gluster.

So, I started up rpcbind, did a 'register-with-portmap off' followed by 'register-with-portmap on', and it works now.
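For anyone else who hits this, the recovery boils down to the following (a sketch - 'named' is my volume name, and the run() wrapper just echoes the commands when DRY_RUN=1 so you can review them before actually running anything):

```shell
#!/bin/sh
# Recovery sketch: rpcbind must be running *before* the option is set,
# so bring it up first, then toggle the option off and back on to force
# Gluster NFS to re-register with portmap. DRY_RUN=1 (the default here)
# only prints each command; set DRY_RUN=0 to execute them for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run service rpcbind start
run gluster volume set named nfs.register-with-portmap off
run gluster volume set named nfs.register-with-portmap on
```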

When I diffed the before and after nfs-server.vol files, I saw this:

[root@dresproddns01 ~]# diff nfs-vol-1 /etc/glusterd/nfs/nfs-server.vol
143c143
<     option rpc.register-with-portmap off
---
>     option rpc.register-with-portmap on


Apparently if rpcbind is not running, the option does not get applied properly. There is an error in nfs.log, but it's hard to find, especially if the node you manage the cluster from isn't the node with the issue. It also isn't obvious that registration is broken, even after you cycle gluster and even though the gluster volume configuration says 'register-with-portmap on'. Does the 'gluster volume set' command have any way to get success/fail information back from each node? It also appears that 'register-with-portmap' is applied to all volumes, even if you just enable it on one - is there a cluster-wide place to 'set' options?
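In the absence of per-node feedback from 'gluster volume set', something like this is the workaround I've settled on (a sketch - the four hostnames are my bricks; program 100005 is MOUNT3 and 100003 is NFS3, which only show up in a peer's portmapper once registration has succeeded):

```shell
#!/bin/sh
# Check each peer's portmapper for the Gluster NFS registrations.
# A healthy node lists programs 100005 (MOUNT3) and 100003 (NFS3) in
# 'rpcinfo -p'; a broken one lists only the portmapper itself.
check_host() {
    if rpcinfo -p "$1" 2>/dev/null | grep -qE '^[[:space:]]*10000[35][[:space:]]'; then
        echo "$1: registered"
    else
        echo "$1: NOT registered with portmap"
    fi
}

for h in rhesproddns01 rhesproddns02 dresproddns01 dresproddns02; do
    check_host "$h"
done
```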

[2012-03-05 19:51:22.517368] E [rpcsvc.c:2771:nfs_rpcsvc_program_register_portmap] 0-nfsrpc: Could not register with portmap
[2012-03-05 19:51:22.517420] E [rpcsvc.c:2861:nfs_rpcsvc_program_register] 0-nfsrpc: portmap registration of program failed
[2012-03-05 19:51:22.517428] E [rpcsvc.c:2874:nfs_rpcsvc_program_register] 0-nfsrpc: Program registration failed: MOUNT3, Num: 100005, Ver: 3, Port: 38465


David


On 3/5/12 2:05 PM, Bryan Whitehead wrote:
Is selinux running? iptables?

Can you http://pastie.org/ the nfs.log in /var/log/glusterfs ?

On Mon, Mar 5, 2012 at 3:59 AM, David Coulson <da...@davidcoulson.net> wrote:

    Yep.

    [root@dresproddns01 ~]# service glusterd stop
    Stopping glusterd:                                         [  OK  ]

    [root@dresproddns01 ~]# ps ax | grep nfs
     120494 pts/0    S+     0:00 grep nfs
    2167119 ?        S      0:00 [nfsiod]
    [root@dresproddns01 ~]# service rpcbind stop
    Stopping rpcbind:                                          [  OK  ]

    [root@dresproddns01 ~]# rpcinfo -p
    rpcinfo: can't contact portmapper: RPC: Remote system error - No
    such file or directory
    [root@dresproddns01 ~]# service rpcbind start
    Starting rpcbind:                                          [  OK  ]
    [root@dresproddns01 ~]# service glusterd start
    Starting glusterd:                                         [  OK  ]

    [root@dresproddns01 ~]# rpcinfo -p
       program vers proto   port  service
        100000    4   tcp    111  portmapper
        100000    3   tcp    111  portmapper
        100000    2   tcp    111  portmapper
        100000    4   udp    111  portmapper
        100000    3   udp    111  portmapper
        100000    2   udp    111  portmapper

    Note that I waited a short while between the last two steps. FYI,
    this is RHEL6 (the two systems that work are RHEL6 too, so I'm not
    sure it matters much).


    On 3/5/12 3:27 AM, Bryan Whitehead wrote:
    did you start portmap service before you started gluster?
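    (For the archives: on RHEL6 SysV init you can sanity-check that ordering with something like this - a sketch, assuming both services are chkconfig-managed in runlevel 3, where the two-digit S-priority in the runlevel directory decides start order:)

```shell
#!/bin/sh
# Extract the SysV start priority of a service from a runlevel directory
# listing, e.g. S13rpcbind -> 13. rpcbind needs a *lower* number than
# glusterd so that portmap is up before Gluster NFS tries to register.
prio() {
    # $1 = service name, stdin = output of 'ls /etc/rc3.d/'
    sed -n "s/^S\([0-9][0-9]\)$1\$/\1/p"
}

listing=$(ls /etc/rc3.d/ 2>/dev/null)
rpc=$(echo "$listing" | prio rpcbind)
glu=$(echo "$listing" | prio glusterd)
if [ -n "$rpc" ] && [ -n "$glu" ] && [ "$rpc" -lt "$glu" ]; then
    echo "ok: rpcbind (S$rpc) starts before glusterd (S$glu)"
else
    echo "check ordering: rpcbind=${rpc:-missing} glusterd=${glu:-missing}"
fi
```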

    On Sun, Mar 4, 2012 at 11:53 AM, David Coulson
    <da...@davidcoulson.net> wrote:

        I have four systems with multiple 4-way replica volumes. I'm
        migrating a number of volumes from FUSE to NFS for
        performance reasons.

        My first two hosts seem to work nicely, but the other two
        won't start the NFS services properly. I looked through the
        nfs.log, but it doesn't give any indication of why it did not
        register with rpcbind. I'm presuming I've got a
        misconfiguration on two of the systems, but there isn't a
        clear indication of what is not working.

        Here is an example from a host which does not work:

        [root@dresproddns01 ~]# rpcinfo -p
           program vers proto   port  service
            100000    4   tcp    111  portmapper
            100000    3   tcp    111  portmapper
            100000    2   tcp    111  portmapper
            100000    4   udp    111  portmapper
            100000    3   udp    111  portmapper
            100000    2   udp    111  portmapper
        [root@dresproddns01 ~]# ps ax | grep nfs
        2167119 ?        S      0:00 [nfsiod]
        2738268 ?        Ssl    0:00
        /opt/glusterfs/3.2.5/sbin/glusterfs -f
        /etc/glusterd/nfs/nfs-server.vol -p
        /etc/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log
        2934228 pts/0    S+     0:00 grep nfs
        [root@dresproddns01 ~]# netstat -ntlp | grep 2738268
        tcp        0      0 0.0.0.0:38465    0.0.0.0:*    LISTEN    2738268/glusterfs
        tcp        0      0 0.0.0.0:38466    0.0.0.0:*    LISTEN    2738268/glusterfs
        tcp        0      0 0.0.0.0:38467    0.0.0.0:*    LISTEN    2738268/glusterfs

        [root@dresproddns01 ~]# gluster volume info svn

        Volume Name: svn
        Type: Replicate
        Status: Started
        Number of Bricks: 4
        Transport-type: tcp
        Bricks:
        Brick1: rhesproddns01:/gluster/svn
        Brick2: rhesproddns02:/gluster/svn
        Brick3: dresproddns01:/gluster/svn
        Brick4: dresproddns02:/gluster/svn
        Options Reconfigured:
        performance.client-io-threads: 1
        performance.flush-behind: on
        network.ping-timeout: 5
        performance.stat-prefetch: 1
        nfs.disable: off
        nfs.register-with-portmap: on
        auth.allow: 10.250.53.*,10.252.248.*,169.254.*,127.0.0.1
        performance.cache-size: 256Mb
        performance.write-behind-window-size: 128Mb

        Only obvious difference with a host which does work is this:

        [root@rhesproddns01 named]# rpcinfo -p
           program vers proto   port  service
            100000    4   tcp    111  portmapper
            100000    3   tcp    111  portmapper
            100000    2   tcp    111  portmapper
            100000    4   udp    111  portmapper
            100000    3   udp    111  portmapper
            100000    2   udp    111  portmapper
            100005    3   tcp  38465  mountd
            100005    1   tcp  38466  mountd
            100003    3   tcp  38467  nfs


        Any ideas where to look for errors?


        _______________________________________________
        Gluster-users mailing list
        Gluster-users@gluster.org
        http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


