Re: [Gluster-users] Fwd: nfs-ganesha HA with arbiter volume

Jiffin Tony Thottan Mon, 21 Sep 2015 23:15:53 -0700


On 21/09/15 21:21, Tiemen Ruiten wrote:

Whoops, replied off-list.

Additionally, I noticed that the generated corosync config is not valid, as there is no interface section:


/etc/corosync/corosync.conf

totem {
    version: 2
    secauth: off
    cluster_name: rd-ganesha-ha
    transport: udpu
}

nodelist {
    node {
        ring0_addr: cobalt
        nodeid: 1
    }
    node {
        ring0_addr: iron
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_syslog: yes
}


Maybe Kaleb can help you out.


---------- Forwarded message ----------
From: Tiemen Ruiten <t.rui...@rdmedia.com>
Date: 21 September 2015 at 17:16
Subject: Re: [Gluster-users] nfs-ganesha HA with arbiter volume
To: Jiffin Tony Thottan <jthot...@redhat.com>

Could you point me to the latest documentation? I've been struggling to find something up-to-date. I believe I have all the prerequisites:


- shared storage volume exists and is mounted
- all nodes in hosts files
- Gluster-NFS disabled
- corosync, pacemaker and nfs-ganesha RPMs installed

Anything I missed?

Everything has been installed by RPM, so it is in the default locations:
/usr/libexec/ganesha/ganesha-ha.sh
/etc/ganesha/ganesha.conf (empty)
/etc/ganesha/ganesha-ha.conf


Looks fine to me.

After I started the pcsd service manually, nfs-ganesha could be enabled successfully, but there was no virtual IP present on the interfaces and, looking at the system log, I noticed corosync failed to start:
- on the host where I issued the gluster nfs-ganesha enable command:

Sep 21 17:07:18 iron systemd: Starting NFS-Ganesha file server...
Sep 21 17:07:19 iron systemd: Started NFS-Ganesha file server.
Sep 21 17:07:19 iron rpc.statd[2409]: Received SM_UNMON_ALL request from iron.int.rdmedia.com while not monitoring any hosts
Sep 21 17:07:20 iron systemd: Starting Corosync Cluster Engine...
Sep 21 17:07:20 iron corosync[3426]: [MAIN ] Corosync Cluster Engine ('2.3.4'): started and ready to provide service.
Sep 21 17:07:20 iron corosync[3426]: [MAIN ] Corosync built-in features: dbus systemd xmlconf snmp pie relro bindnow
Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] Initializing transport (UDP/IP Unicast).
Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] The network interface [10.100.30.38] is now up.
Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine loaded: corosync configuration map access [0]
Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name: cmap
Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine loaded: corosync configuration service [1]
Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name: cfg
Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name: cpg
Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine loaded: corosync profile loading service [4]
Sep 21 17:07:20 iron corosync[3427]: [QUORUM] Using quorum provider corosync_votequorum
Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name: votequorum
Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name: quorum
Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] adding new UDPU member {10.100.30.38}
Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] adding new UDPU member {10.100.30.37}
Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] A new membership (10.100.30.38:104) was formed. Members joined: 1
Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Sep 21 17:07:20 iron corosync[3427]: [QUORUM] Members[1]: 1
Sep 21 17:07:20 iron corosync[3427]: [MAIN ] Completed service synchronization, ready to provide service.
Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] A new membership (10.100.30.37:108) was formed. Members joined: 1
Sep 21 17:08:21 iron corosync: Starting Corosync Cluster Engine (corosync): [FAILED]
Sep 21 17:08:21 iron systemd: corosync.service: control process exited, code=exited status=1
Sep 21 17:08:21 iron systemd: Failed to start Corosync Cluster Engine.
Sep 21 17:08:21 iron systemd: Unit corosync.service entered failed state.


- on the other host:

Sep 21 17:07:19 cobalt systemd: Starting Preprocess NFS configuration...
Sep 21 17:07:19 cobalt systemd: Starting RPC Port Mapper.
Sep 21 17:07:19 cobalt systemd: Reached target RPC Port Mapper.
Sep 21 17:07:19 cobalt systemd: Starting Host and Network Name Lookups.
Sep 21 17:07:19 cobalt systemd: Reached target Host and Network Name Lookups.
Sep 21 17:07:19 cobalt systemd: Starting RPC bind service...
Sep 21 17:07:19 cobalt systemd: Started Preprocess NFS configuration.
Sep 21 17:07:19 cobalt systemd: Started RPC bind service.
Sep 21 17:07:19 cobalt systemd: Starting NFS status monitor for NFSv2/3 locking....
Sep 21 17:07:19 cobalt rpc.statd[2662]: Version 1.3.0 starting
Sep 21 17:07:19 cobalt rpc.statd[2662]: Flags: TI-RPC
Sep 21 17:07:19 cobalt systemd: Started NFS status monitor for NFSv2/3 locking..
Sep 21 17:07:19 cobalt systemd: Starting NFS-Ganesha file server...
Sep 21 17:07:19 cobalt systemd: Started NFS-Ganesha file server.
Sep 21 17:07:19 cobalt kernel: warning: `ganesha.nfsd' uses 32-bit capabilities (legacy support in use)
Sep 21 17:07:19 cobalt logger: setting up rd-ganesha-ha
Sep 21 17:07:19 cobalt rpc.statd[2662]: Received SM_UNMON_ALL request from cobalt.int.rdmedia.com while not monitoring any hosts
Sep 21 17:07:19 cobalt logger: setting up cluster rd-ganesha-ha with the following cobalt iron
Sep 21 17:07:20 cobalt systemd: Stopped Pacemaker High Availability Cluster Manager.
Sep 21 17:07:20 cobalt systemd: Stopped Corosync Cluster Engine.
Sep 21 17:07:20 cobalt systemd: Reloading.
Sep 21 17:07:20 cobalt systemd: [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue 'RemoveOnStop' in section 'Socket'
Sep 21 17:07:20 cobalt systemd: [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue 'RemoveOnStop' in section 'Socket'
Sep 21 17:07:20 cobalt systemd: Reloading.
Sep 21 17:07:20 cobalt systemd: [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue 'RemoveOnStop' in section 'Socket'
Sep 21 17:07:20 cobalt systemd: [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue 'RemoveOnStop' in section 'Socket'
Sep 21 17:07:20 cobalt systemd: Starting Corosync Cluster Engine...
Sep 21 17:07:20 cobalt corosync[2816]: [MAIN ] Corosync Cluster Engine ('2.3.4'): started and ready to provide service.
Sep 21 17:07:20 cobalt corosync[2816]: [MAIN ] Corosync built-in features: dbus systemd xmlconf snmp pie relro bindnow
Sep 21 17:07:20 cobalt corosync[2817]: [TOTEM ] Initializing transport (UDP/IP Unicast).
Sep 21 17:07:20 cobalt corosync[2817]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] The network interface [10.100.30.37] is now up.
Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine loaded: corosync configuration map access [0]
Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name: cmap
Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine loaded: corosync configuration service [1]
Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name: cfg
Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name: cpg
Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine loaded: corosync profile loading service [4]
Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Using quorum provider corosync_votequorum
Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name: votequorum
Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name: quorum
Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] adding new UDPU member {10.100.30.37}
Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] adding new UDPU member {10.100.30.38}
Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] A new membership (10.100.30.37:100) was formed. Members joined: 1
Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Members[1]: 1
Sep 21 17:07:21 cobalt corosync[2817]: [MAIN ] Completed service synchronization, ready to provide service.
Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] A new membership (10.100.30.37:108) was formed. Members joined: 1
Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Members[1]: 1
Sep 21 17:07:21 cobalt corosync[2817]: [MAIN ] Completed service synchronization, ready to provide service.
Sep 21 17:08:50 cobalt systemd: corosync.service operation timed out. Terminating.
Sep 21 17:08:50 cobalt corosync: Starting Corosync Cluster Engine (corosync):
Sep 21 17:08:50 cobalt systemd: Failed to start Corosync Cluster Engine.
Sep 21 17:08:50 cobalt systemd: Unit corosync.service entered failed state.
Sep 21 17:08:55 cobalt logger: warning: pcs property set no-quorum-policy=ignore failed
Sep 21 17:08:55 cobalt logger: warning: pcs property set stonith-enabled=false failed
Sep 21 17:08:55 cobalt logger: warning: pcs resource create nfs_start ganesha_nfsd ha_vol_mnt=/var/run/gluster/shared_storage --clone failed
Sep 21 17:08:56 cobalt logger: warning: pcs resource delete nfs_start-clone failed
Sep 21 17:08:56 cobalt logger: warning: pcs resource create nfs-mon ganesha_mon --clone failed
Sep 21 17:08:56 cobalt logger: warning: pcs resource create nfs-grace ganesha_grace --clone failed
Sep 21 17:08:57 cobalt logger: warning pcs resource create cobalt-cluster_ip-1 ocf:heartbeat:IPaddr ip= cidr_netmask=32 op monitor interval=15s failed
Sep 21 17:08:57 cobalt logger: warning: pcs resource create cobalt-trigger_ip-1 ocf:heartbeat:Dummy failed
Sep 21 17:08:57 cobalt logger: warning: pcs constraint colocation add cobalt-cluster_ip-1 with cobalt-trigger_ip-1 failed
Sep 21 17:08:57 cobalt logger: warning: pcs constraint order cobalt-trigger_ip-1 then nfs-grace-clone failed
Sep 21 17:08:57 cobalt logger: warning: pcs constraint order nfs-grace-clone then cobalt-cluster_ip-1 failed
Sep 21 17:08:57 cobalt logger: warning pcs resource create iron-cluster_ip-1 ocf:heartbeat:IPaddr ip= cidr_netmask=32 op monitor interval=15s failed
Sep 21 17:08:57 cobalt logger: warning: pcs resource create iron-trigger_ip-1 ocf:heartbeat:Dummy failed
Sep 21 17:08:57 cobalt logger: warning: pcs constraint colocation add iron-cluster_ip-1 with iron-trigger_ip-1 failed
Sep 21 17:08:57 cobalt logger: warning: pcs constraint order iron-trigger_ip-1 then nfs-grace-clone failed
Sep 21 17:08:58 cobalt logger: warning: pcs constraint order nfs-grace-clone then iron-cluster_ip-1 failed
Sep 21 17:08:58 cobalt logger: warning: pcs constraint location cobalt-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1 failed
Sep 21 17:08:58 cobalt logger: warning: pcs constraint location cobalt-cluster_ip-1 prefers iron=1000 failed
Sep 21 17:08:58 cobalt logger: warning: pcs constraint location cobalt-cluster_ip-1 prefers cobalt=2000 failed
Sep 21 17:08:58 cobalt logger: warning: pcs constraint location iron-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1 failed
Sep 21 17:08:58 cobalt logger: warning: pcs constraint location iron-cluster_ip-1 prefers cobalt=1000 failed
Sep 21 17:08:58 cobalt logger: warning: pcs constraint location iron-cluster_ip-1 prefers iron=2000 failed
Sep 21 17:08:58 cobalt logger: warning pcs cluster cib-push /tmp/tmp.nXTfyA1GMR failed
Sep 21 17:08:58 cobalt logger: warning: scp ganesha-ha.conf to cobalt failed

BTW, I'm using CentOS 7. There are multiple network interfaces on the servers; could that be a problem?
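
In the pcs warnings above, the two cluster_ip resources are created with an empty `ip=` value. A minimal sketch for spotting such empty parameters in a saved log, assuming the `logger: warning` line format shown above; the function name `empty_params` and the regular expressions are illustrative and not part of ganesha-ha.sh or pcs:

```python
import re

# A parameter name followed by "=" and then whitespace or end of text,
# i.e. a key that was passed without a value.
EMPTY_PARAM = re.compile(r"(?<![\w-])([A-Za-z_][\w-]*)=(?=\s|$)")


def empty_params(log_text):
    """Map each failed pcs command to the parameters it passed with no value."""
    found = {}
    for line in log_text.splitlines():
        # Some lines read "warning pcs ..." and others "warning: pcs ...".
        m = re.search(r"logger: warning:? (pcs .*) failed$", line.strip())
        if not m:
            continue
        names = EMPTY_PARAM.findall(m.group(1))
        if names:
            found[m.group(1)] = names
    return found


# Two lines copied from the log above.
SAMPLE = """\
Sep 21 17:08:57 cobalt logger: warning pcs resource create cobalt-cluster_ip-1 ocf:heartbeat:IPaddr ip= cidr_netmask=32 op monitor interval=15s failed
Sep 21 17:08:57 cobalt logger: warning: pcs resource create cobalt-trigger_ip-1 ocf:heartbeat:Dummy failed
"""

print(empty_params(SAMPLE))
```

Only the first sample line is reported, because it is the only one with a parameter that has no value.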

On 21 September 2015 at 11:48, Jiffin Tony Thottan <jthot...@redhat.com> wrote:
    On 21/09/15 13:56, Tiemen Ruiten wrote:
    Hello Soumya, Kaleb, list,

    On Friday I created the gluster_shared_storage volume manually, and
    I have now also tried it with the command you supplied, but both give
    the same result:

    from etc-glusterfs-glusterd.vol.log on the node where I issued
    the command:

    [2015-09-21 07:59:47.756845] I [MSGID: 106474]
    [glusterd-ganesha.c:403:check_host_list] 0-management: ganesha
    host found Hostname is cobalt
    [2015-09-21 07:59:48.071755] I [MSGID: 106474]
    [glusterd-ganesha.c:349:is_ganesha_host] 0-management: ganesha
    host found Hostname is cobalt
    [2015-09-21 07:59:48.653879] E [MSGID: 106470]
    [glusterd-ganesha.c:264:glusterd_op_set_ganesha] 0-management:
    Initial NFS-Ganesha set up failed

    As far as I understand from the logs, it called setup_cluster()
    [which calls the `ganesha-ha.sh` script], but the script failed.
    Can you please provide the following details:
    - Location of the ganesha-ha.sh file?
    - Location of the ganesha-ha.conf and ganesha.conf files?


    Also, can you cross-check whether all the prerequisites for the HA
    setup are satisfied?

    --
    With Regards,
    Jiffin
    [2015-09-21 07:59:48.653912] E [MSGID: 106123]
    [glusterd-syncop.c:1404:gd_commit_op_phase] 0-management: Commit
    of operation 'Volume (null)' failed on localhost : Failed to set
    up HA config for NFS-Ganesha. Please check the log file for details
    [2015-09-21 07:59:45.402458] I [MSGID: 106006]
    [glusterd-svc-mgmt.c:323:glusterd_svc_common_rpc_notify]
    0-management: nfs has disconnected from glusterd.
    [2015-09-21 07:59:48.071578] I [MSGID: 106474]
    [glusterd-ganesha.c:403:check_host_list] 0-management: ganesha
    host found Hostname is cobalt

    from etc-glusterfs-glusterd.vol.log on the other node:

    [2015-09-21 08:12:50.111877] E [MSGID: 106062]
    [glusterd-op-sm.c:3698:glusterd_op_ac_unlock] 0-management:
    Unable to acquire volname
    [2015-09-21 08:14:50.548087] E [MSGID: 106062]
    [glusterd-op-sm.c:3635:glusterd_op_ac_lock] 0-management: Unable
    to acquire volname
    [2015-09-21 08:14:50.654746] I [MSGID: 106132]
    [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs
    already stopped
    [2015-09-21 08:14:50.655095] I [MSGID: 106474]
    [glusterd-ganesha.c:403:check_host_list] 0-management: ganesha
    host found Hostname is cobalt
    [2015-09-21 08:14:51.287156] E [MSGID: 106062]
    [glusterd-op-sm.c:3698:glusterd_op_ac_unlock] 0-management:
    Unable to acquire volname


    from etc-glusterfs-glusterd.vol.log on the arbiter node:

    [2015-09-21 08:18:50.934713] E [MSGID: 101075]
    [common-utils.c:3127:gf_is_local_addr] 0-management: error in
    getaddrinfo: Name or service not known
    [2015-09-21 08:18:51.504694] E [MSGID: 106062]
    [glusterd-op-sm.c:3698:glusterd_op_ac_unlock] 0-management:
    Unable to acquire volname

    I have put the hostnames of all servers in my /etc/hosts file,
    including the arbiter node.
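
The getaddrinfo error logged on the arbiter node indicates that some name failed to resolve there. A minimal sketch for checking name resolution on each node, assuming Python is available on the servers; `unresolved_hosts` is a hypothetical helper, and the host names in the usage comment are the ones mentioned in this thread:

```python
import socket


def unresolved_hosts(hostnames, resolver=socket.getaddrinfo):
    """Return the names for which the resolver raises socket.gaierror."""
    failed = []
    for name in hostnames:
        try:
            # Port None: we only care whether the name resolves at all.
            resolver(name, None)
        except socket.gaierror:
            failed.append(name)
    return failed


# Usage on a node (run it on every server, including the arbiter):
#     print(unresolved_hosts(["cobalt", "iron", "neon"]))
```

The `resolver` argument exists only so the function can be exercised without touching the real resolver; by default it goes through the same system lookup path (hosts file, DNS) as any other getaddrinfo call.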


    On 18 September 2015 at 16:52, Soumya Koduri <skod...@redhat.com> wrote:

        Hi Tiemen,

        One of the prerequisites for setting up nfs-ganesha HA is to
        create and mount the shared_storage volume. Use the CLI command
        below for that:

        "gluster volume set all cluster.enable-shared-storage enable"

        It creates the volume and mounts it on all the nodes
        (including the arbiter node). Note that this volume must be
        mounted on all the nodes of the gluster storage pool (even if,
        as in this case, a node is not part of the nfs-ganesha cluster).

        So instead of manually creating those directory paths, please
        use the above CLI command and try re-configuring the setup.
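
To confirm the result on each node, here is a minimal sketch that looks for the mount point in /proc/mounts-style text. The mount path is the one used elsewhere in this thread; `is_mounted` is a hypothetical helper, and the sample mount line is illustrative rather than taken from a real system:

```python
import os


def is_mounted(path, mounts_text):
    """True if `path` is listed as a mount point in /proc/mounts-style text."""
    wanted = os.path.normpath(path)
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 2 of each /proc/mounts line is the mount point.
        if len(fields) >= 2 and os.path.normpath(fields[1]) == wanted:
            return True
    return False


# Illustrative line only; the real device and options will differ.
SAMPLE_MOUNTS = "iron:/gluster_shared_storage /run/gluster/shared_storage fuse.glusterfs rw 0 0\n"

print(is_mounted("/run/gluster/shared_storage", SAMPLE_MOUNTS))

# Usage on a node. realpath() is applied because /var/run is commonly a
# symlink to /run, and /proc/mounts lists the resolved path:
#     with open("/proc/mounts") as f:
#         path = os.path.realpath("/var/run/gluster/shared_storage")
#         print(is_mounted(path, f.read()))
```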

        Thanks,
        Soumya

        On 09/18/2015 07:29 PM, Tiemen Ruiten wrote:

            Hello Kaleb,

            I don't:

            # Name of the HA cluster created.
            # must be unique within the subnet
            HA_NAME="rd-ganesha-ha"
            #
            # The gluster server from which to mount the shared data volume.
            HA_VOL_SERVER="iron"
            #
            # N.B. you may use short names or long names; you may not use IP addrs.
            # Once you select one, stay with it as it will be mildly unpleasant to
            # clean up if you switch later on. Ensure that all names - short and/or
            # long - are in DNS or /etc/hosts on all machines in the cluster.
            #
            # The subset of nodes of the Gluster Trusted Pool that form the ganesha
            # HA cluster. Hostname is specified.
            HA_CLUSTER_NODES="cobalt,iron"
            #HA_CLUSTER_NODES="server1.lab.redhat.com,server2.lab.redhat.com,..."
            #
            # Virtual IPs for each of the nodes specified above.
            VIP_server1="10.100.30.101"
            VIP_server2="10.100.30.102"
            #VIP_server1_lab_redhat_com="10.0.2.1"
            #VIP_server2_lab_redhat_com="10.0.2.2"

            Hosts cobalt and iron are the data nodes; the arbiter
            IP/hostname (neon) isn't mentioned anywhere in this config file.
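
A minimal sketch of a consistency check for a config like the one above. It assumes, based on the commented-out examples in the file, that each entry in HA_CLUSTER_NODES is expected to have a matching VIP_<name> variable with dots replaced by underscores; treating hyphens the same way is an assumption. `parse_shell_vars` and `missing_vips` are hypothetical helpers, not part of ganesha-ha.sh:

```python
import re


def parse_shell_vars(text):
    """Collect NAME="value" assignments, skipping comment lines."""
    pattern = re.compile(r'^\s*([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"\s*$')
    result = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        m = pattern.match(line)
        if m:
            result[m.group(1)] = m.group(2)
    return result


def missing_vips(conf_text):
    """Return the cluster nodes that have no non-empty VIP_<node> variable."""
    conf = parse_shell_vars(conf_text)
    nodes = [n.strip() for n in conf.get("HA_CLUSTER_NODES", "").split(",")]
    missing = []
    for node in filter(None, nodes):
        # Assumed naming rule: dots (and hyphens) become underscores.
        key = "VIP_" + re.sub(r"[-.]", "_", node)
        if not conf.get(key):
            missing.append(node)
    return missing


# Active (uncommented) settings from the config quoted above.
SAMPLE = '''
HA_NAME="rd-ganesha-ha"
HA_VOL_SERVER="iron"
HA_CLUSTER_NODES="cobalt,iron"
VIP_server1="10.100.30.101"
VIP_server2="10.100.30.102"
'''

print(missing_vips(SAMPLE))  # -> ['cobalt', 'iron']
```

With the values quoted above it reports both nodes, since the VIP variables are named after server1 and server2 rather than cobalt and iron.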


            On 18 September 2015 at 15:56, Kaleb S. KEITHLEY <kkeit...@redhat.com> wrote:

                On 09/18/2015 09:46 AM, Tiemen Ruiten wrote:
                > Hello,
                >
                > I have a Gluster cluster with a single replica 3, arbiter 1 volume (so
                > two nodes with actual data, one arbiter node). I would like to setup
                > NFS-Ganesha HA for this volume but I'm having some difficulties.
                >
                > - I needed to create a directory /var/run/gluster/shared_storage
                > manually on all nodes, or the command 'gluster nfs-ganesha enable' would
                > fail with the following error:
                > [2015-09-18 13:13:34.690416] E [MSGID: 106032]
                > [glusterd-ganesha.c:708:pre_setup] 0-THIS->name: mkdir() failed on path
                > /var/run/gluster/shared_storage/nfs-ganesha, [No such file or directory]
                >
                > - Then I found out that the command connects to the arbiter node as
                > well, but obviously I don't want to set up NFS-Ganesha there. Is it
                > actually possible to setup NFS-Ganesha HA with an arbiter node? If it's
                > possible, is there any documentation on how to do that?
                >

                Please send the /etc/ganesha/ganesha-ha.conf file you're using.

                Probably you have included the arbiter in your HA config; that would be
                a mistake.

                --

                Kaleb




            --
            Tiemen Ruiten
            Systems Engineer
            R&D Media


            _______________________________________________
            Gluster-users mailing list
            Gluster-users@gluster.org
            http://www.gluster.org/mailman/listinfo/gluster-users