Re: [Gluster-users] Enabling Halo sets volume RO

2017-11-08 Thread Jon Cope
Thank you! That did the trick. 

For anyone else who encounters this, here are the settings that worked for me: 

cluster.quorum-type fixed 
cluster.quorum-count 2 
cluster.halo-enabled yes 
cluster.halo-min-replicas 2 
cluster.halo-max-latency 10 
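
For reference, these can be applied with the usual volume set commands (gv0 is 
the volume name from my original post): 

# gluster volume set gv0 cluster.quorum-type fixed
# gluster volume set gv0 cluster.quorum-count 2
# gluster volume set gv0 cluster.halo-enabled yes
# gluster volume set gv0 cluster.halo-min-replicas 2
# gluster volume set gv0 cluster.halo-max-latency 10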

- Original Message -

| From: "Mohammed Rafi K C" 
| To: "Jon Cope" , gluster-users@gluster.org
| Sent: Wednesday, November 8, 2017 3:34:07 AM
| Subject: Re: [Gluster-users] Enabling Halo sets volume RO

[Gluster-users] Enabling Halo sets volume RO

2017-11-07 Thread Jon Cope
Hi all, 

I'm taking a stab at deploying a storage cluster to explore the Halo AFR 
feature and I'm running into some trouble. In GCE, I have 4 instances, each with 
one 10GB brick. 2 instances are in the US and the other 2 are in Asia (with the 
hope that this will drive up latency sufficiently). The bricks make up a 
replica-4 volume. Before I enable halo, I can mount the volume and read/write files. 
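
For context, the volume was created along these lines (gv0 and gce-node1 show 
up in the mount output further down; the other hostnames and the brick paths 
are placeholders): 

# gluster volume create gv0 replica 4 gce-node1:/bricks/gv0 \
>     gce-node2:/bricks/gv0 gce-node3:/bricks/gv0 gce-node4:/bricks/gv0
# gluster volume start gv0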

The issue is that when I set `cluster.halo-enabled yes`, I can no longer write 
to the volume: 

[root@jcope-rhs-g2fn vol]# touch /mnt/vol/test1 
touch: setting times of ‘test1’: Read-only file system 

This can be fixed by turning halo off again. While halo is enabled and writes 
fail with the above message, the mount still shows the volume as r/w: 

[root@jcope-rhs-g2fn vol]# mount 
gce-node1:gv0 on /mnt/vol type fuse.glusterfs 
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
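
While the volume is in this state, the effective quorum/halo options can be 
dumped for comparison (gv0 being the volume from the mount output above): 

# gluster volume get gv0 all | grep -E 'quorum|halo'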
 


Thanks in advance, 
-Jon 


Setup info: 
CentOS Linux release 7.4.1708 (Core) 
4 GCE instances (2 US, 2 Asia) 
1 10GB brick per instance 
replica 4 volume 

Packages: 


glusterfs-client-xlators-3.12.1-2.el7.x86_64 
glusterfs-cli-3.12.1-2.el7.x86_64 
python2-gluster-3.12.1-2.el7.x86_64 
glusterfs-3.12.1-2.el7.x86_64 
glusterfs-api-3.12.1-2.el7.x86_64 
glusterfs-fuse-3.12.1-2.el7.x86_64 
glusterfs-server-3.12.1-2.el7.x86_64 
glusterfs-libs-3.12.1-2.el7.x86_64 
glusterfs-geo-replication-3.12.1-2.el7.x86_64 

Logs, beginning when halo is enabled: 


[2017-11-07 22:20:15.029298] W [MSGID: 101095] [xlator.c:213:xlator_dynload] 
0-xlator: /usr/lib64/glusterfs/3.12.1/xlator/nfs/server.so: cannot open shared 
object file: No such file or directory 
[2017-11-07 22:20:15.204241] W [MSGID: 101095] 
[xlator.c:162:xlator_volopt_dynload] 0-xlator: 
/usr/lib64/glusterfs/3.12.1/xlator/nfs/server.so: cannot open shared object 
file: No such file or directory 
[2017-11-07 22:20:15.232176] I [MSGID: 106600] 
[glusterd-nfs-svc.c:163:glusterd_nfssvc_reconfigure] 0-management: 
nfs/server.so xlator is not installed 
[2017-11-07 22:20:15.235481] I [MSGID: 106132] 
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already 
stopped 
[2017-11-07 22:20:15.235512] I [MSGID: 106568] 
[glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: quotad service is 
stopped 
[2017-11-07 22:20:15.235572] I [MSGID: 106132] 
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped 
[2017-11-07 22:20:15.235585] I [MSGID: 106568] 
[glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: bitd service is 
stopped 
[2017-11-07 22:20:15.235638] I [MSGID: 106132] 
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already 
stopped 
[2017-11-07 22:20:15.235650] I [MSGID: 106568] 
[glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: scrub service is 
stopped 
[2017-11-07 22:20:15.250297] I [run.c:190:runner_log] 
(-->/usr/lib64/glusterfs/3.12.1/xlator/mgmt/glusterd.so(+0xde17a) 
[0x7fc23442117a] 
-->/usr/lib64/glusterfs/3.12.1/xlator/mgmt/glusterd.so(+0xddc3d) 
[0x7fc234420c3d] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fc23f915da5] 
) 0-management: Ran script: /var/lib 
/glusterd/hooks/1/set/post/S30samba-set.sh --volname=gv0 -o 
cluster.halo-enabled=yes --gd-workdir=/var/lib/glusterd 
[2017-11-07 22:20:15.255777] I [run.c:190:runner_log] 
(-->/usr/lib64/glusterfs/3.12.1/xlator/mgmt/glusterd.so(+0xde17a) 
[0x7fc23442117a] 
-->/usr/lib64/glusterfs/3.12.1/xlator/mgmt/glusterd.so(+0xddc3d) 
[0x7fc234420c3d] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fc23f915da5] 
) 0-management: Ran script: /var/lib 
/glusterd/hooks/1/set/post/S32gluster_enable_shared_storage.sh --volname=gv0 -o 
cluster.halo-enabled=yes --gd-workdir=/var/lib/glusterd 
[2017-11-07 22:20:47.420098] W [MSGID: 101095] [xlator.c:213:xlator_dynload] 
0-xlator: /usr/lib64/glusterfs/3.12.1/xlator/nfs/server.so: cannot open shared 
object file: No such file or directory 
[2017-11-07 22:20:47.595960] W [MSGID: 101095] 
[xlator.c:162:xlator_volopt_dynload] 0-xlator: 
/usr/lib64/glusterfs/3.12.1/xlator/nfs/server.so: cannot open shared object 
file: No such file or directory 
[2017-11-07 22:20:47.631833] I [MSGID: 106600] 
[glusterd-nfs-svc.c:163:glusterd_nfssvc_reconfigure] 0-management: 
nfs/server.so xlator is not installed 
[2017-11-07 22:20:47.635109] I [MSGID: 106132] 
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already 
stopped 
[2017-11-07 22:20:47.635136] I [MSGID: 106568] 
[glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: quotad service is 
stopped 
[2017-11-07 22:20:47.635201] I [MSGID: 106132] 
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped 
[2017-11-07 22:20:47.635216] I [MSGID: 106568] 
[glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: bitd service is 
stopped 
[2017-11-07 22:20:47.635284] I [MSGID: 106132] 
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already 
stopped 
[2017-11-07 22:20:47.635297] I [MSGID: 106568] 
[glusterd-svc-mgmt.c:229

Re: [Gluster-users] glusterd service fails to start from AWS AMI

2014-03-05 Thread Jon Cope
Hi Carlos,

Thanks for the input.  You guessed right, I'm using the latest RH distro.

As for the IP/hostname resolution, that should have been accounted for by using 
the elastic IPs.  When an instance within the private network attempts to 
resolve the public DNS name containing an elastic IP, it gets pointed to the 
private IP address of the machine.  The /etc/hosts file contains these elastic 
IPs and ssh / ping work fine.  I'm also not sure whether a hostname resolution 
problem should cause a failure on start.  Does someone have an answer to that?
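
For what it's worth, this is roughly how I've been checking resolution on the 
new instances; the forward lookup should hit the /etc/hosts entry and the 
reverse lookup of the private address should return the same name (hostname 
and IP below are just examples): 

# getent hosts node1.ec2
# getent hosts 10.0.0.11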



- Original Message -
From: "Carlos Capriotti" 
To: "Jon Cope" 
Cc: gluster-users@gluster.org
Sent: Tuesday, March 4, 2014 4:29:31 PM
Subject: Re: [Gluster-users] glusterd service fails to start from AWS AMI

I don't want to sound simplistic, but this seems to be name resolution/network
related.

Again, I DO know your email ends with redhat.com, but just to make sure:
what distro is Gluster running on? I have never dealt with Amazon's platform,
so ignorance here is abundant.

The reason I am asking is that I am stress-testing my first (on
premises) install, and I ran into a problem that I am choosing to ignore
for now, but will have to solve in the future: DNS resolution stops working
after a while.

I am using CentOS 6.5, with Gluster 3.4.2. I have a bonded NIC, made out of
two physical ones and a third NIC for management.

I realized that, despite the fact that I have manually configured all
interfaces, disabled user control (maybe it's this), disabled NM access to
them, and even tried to update resolv.conf, after a reboot name resolution
does not work.

While the NICs were working with NM and/or DHCP, all went fine, but after
tailoring my ifcfg-* files, DNS went south.

You said your name resolution does work. Maybe add an entry to your hosts file
just to test?
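
Something along these lines on every node, with your real private IPs (the 
addresses and names below are made up):

# cat /etc/hosts
10.0.0.11   node1.ec2
10.0.0.12   node2.ec2
10.0.0.13   node3.ec2
10.0.0.14   node4.ec2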

Another thought would be using 3.4.2, instead of 3.4.0.

Just wanted to share.

KR,

Carlos


On Tue, Mar 4, 2014 at 10:45 PM, Jon Cope  wrote:

> Hello all.
>
> I have a working replica 2 cluster (4 nodes) up and running happily over
> Amazon EC2.  My end goal is to create AMIs of each machine and then quickly
> reproduce the same, but new, cluster from those AMIs.  Essentially, I'd
> like a cluster "template".
>
> -Assigned original instances' Elastic IPs to new machines to reduce
> resolution issues.
> -Passwordless SSH works on initial boot across all machines
> -Node1: has no evident issue.  Starts with glusterd running.
> -Node1: 'gluster peer status' returns correct public DNS / hostnames for
> peer nodes.  Status: (Disconnected)  --since the service is off on them
>
> Since my goal is to create a cluster template, reinstalling gluster for
> each node, though it'll probably work, isn't an optimal solution.
>
> Thank You
>
> #  Node2: etc-glusterfs-glusterd.vol.log
> #  Begins at 'service glusterd start' command entry
>
> [2014-03-04 21:20:30.532138] I [glusterfsd.c:2024:main]
> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version
> 3.4.0.44rhs (/usr/sbin/glusterd --pid-file=/var/run/glusterd.pid)
> [2014-03-04 21:20:30.539331] I [glusterd.c:1020:init] 0-management: Using
> /var/lib/glusterd as working directory
> [2014-03-04 21:20:30.542578] I [socket.c:3485:socket_init]
> 0-socket.management: SSL support is NOT enabled
> [2014-03-04 21:20:30.542603] I [socket.c:3500:socket_init]
> 0-socket.management: using system polling thread
> [2014-03-04 21:20:30.543203] C [rdma.c:4099:gf_rdma_init]
> 0-rpc-transport/rdma: Failed to get IB devices
> [2014-03-04 21:20:30.543342] E [rdma.c:4990:init] 0-rdma.management:
> Failed to initialize IB Device
> [2014-03-04 21:20:30.543375] E [rpc-transport.c:320:rpc_transport_load]
> 0-rpc-transport: 'rdma' initialization failed
> [2014-03-04 21:20:30.543471] W [rpcsvc.c:1387:rpcsvc_transport_create]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2014-03-04 21:20:37.116571] I
> [glusterd-store.c:1388:glusterd_restore_op_version] 0-glusterd: retrieved
> op-version: 2
> [2014-03-04 21:20:37.120082] E
> [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key:
> brick-0
> [2014-03-04 21:20:37.120118] E
> [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key:
> brick-1
> [2014-03-04 21:20:37.120137] E
> [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key:
> brick-2
> [2014-03-04 21:20:37.120154] E
> [glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key:
> brick-3
> [2014-03-04 21:20:37.761785] I
> [glusterd-handler.c:2886:glusterd_friend_add] 0-management: connect
> returned 0
> [2014-03-04 21:20:37.765059] I
> [glusterd-handler.c:2886:glusterd_friend_add] 0-management: connect
> returned 0
> [2014-03-0

[Gluster-users] glusterd service fails to start from AWS AMI

2014-03-04 Thread Jon Cope
Hello all.

I have a working replica 2 cluster (4 nodes) up and running happily over Amazon 
EC2.  My end goal is to create AMIs of each machine and then quickly reproduce 
the same, but new, cluster from those AMIs.  Essentially, I'd like a cluster 
"template".

-Assigned original instances' Elastic IPs to new machines to reduce resolution 
issues.
-Passwordless SSH works on initial boot across all machines
-Node1: has no evident issue.  Starts with glusterd running.
-Node1: 'gluster peer status' returns correct public DNS / hostnames for peer 
nodes.  Status: (Disconnected)  --since the service is off on them

Since my goal is to create a cluster template, reinstalling gluster for each 
node, though it'll probably work, isn't an optimal solution.
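
One check that may be relevant on the freshly launched instances: the brick 
hostnames glusterd recorded before the AMI was taken, which it must resolve 
again at startup (path assumes the default /var/lib/glusterd working 
directory, which the log below also shows):

# grep -h '^hostname=' /var/lib/glusterd/vols/*/bricks/*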

Thank You

#  Node2: etc-glusterfs-glusterd.vol.log
#  Begins at 'service glusterd start' command entry

[2014-03-04 21:20:30.532138] I [glusterfsd.c:2024:main] 0-/usr/sbin/glusterd: 
Started running /usr/sbin/glusterd version 3.4.0.44rhs (/usr/sbin/glusterd 
--pid-file=/var/run/glusterd.pid)
[2014-03-04 21:20:30.539331] I [glusterd.c:1020:init] 0-management: Using 
/var/lib/glusterd as working directory
[2014-03-04 21:20:30.542578] I [socket.c:3485:socket_init] 0-socket.management: 
SSL support is NOT enabled
[2014-03-04 21:20:30.542603] I [socket.c:3500:socket_init] 0-socket.management: 
using system polling thread
[2014-03-04 21:20:30.543203] C [rdma.c:4099:gf_rdma_init] 0-rpc-transport/rdma: 
Failed to get IB devices
[2014-03-04 21:20:30.543342] E [rdma.c:4990:init] 0-rdma.management: Failed to 
initialize IB Device
[2014-03-04 21:20:30.543375] E [rpc-transport.c:320:rpc_transport_load] 
0-rpc-transport: 'rdma' initialization failed
[2014-03-04 21:20:30.543471] W [rpcsvc.c:1387:rpcsvc_transport_create] 
0-rpc-service: cannot create listener, initing the transport failed
[2014-03-04 21:20:37.116571] I 
[glusterd-store.c:1388:glusterd_restore_op_version] 0-glusterd: retrieved 
op-version: 2
[2014-03-04 21:20:37.120082] E 
[glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key: brick-0
[2014-03-04 21:20:37.120118] E 
[glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key: brick-1
[2014-03-04 21:20:37.120137] E 
[glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key: brick-2
[2014-03-04 21:20:37.120154] E 
[glusterd-store.c:1905:glusterd_store_retrieve_volume] 0-: Unknown key: brick-3
[2014-03-04 21:20:37.761785] I [glusterd-handler.c:2886:glusterd_friend_add] 
0-management: connect returned 0
[2014-03-04 21:20:37.765059] I [glusterd-handler.c:2886:glusterd_friend_add] 
0-management: connect returned 0
[2014-03-04 21:20:37.767677] I [glusterd-handler.c:2886:glusterd_friend_add] 
0-management: connect returned 0
[2014-03-04 21:20:37.767783] I [rpc-clnt.c:974:rpc_clnt_connection_init] 
0-management: setting frame-timeout to 600
[2014-03-04 21:20:37.767850] I [socket.c:3485:socket_init] 0-management: SSL 
support is NOT enabled
[2014-03-04 21:20:37.767866] I [socket.c:3500:socket_init] 0-management: using 
system polling thread
[2014-03-04 21:20:37.772356] I [rpc-clnt.c:974:rpc_clnt_connection_init] 
0-management: setting frame-timeout to 600
[2014-03-04 21:20:37.772441] I [socket.c:3485:socket_init] 0-management: SSL 
support is NOT enabled
[2014-03-04 21:20:37.772459] I [socket.c:3500:socket_init] 0-management: using 
system polling thread
[2014-03-04 21:20:37.776131] I [rpc-clnt.c:974:rpc_clnt_connection_init] 
0-management: setting frame-timeout to 600
[2014-03-04 21:20:37.776185] I [socket.c:3485:socket_init] 0-management: SSL 
support is NOT enabled
[2014-03-04 21:20:37.776201] I [socket.c:3500:socket_init] 0-management: using 
system polling thread
[2014-03-04 21:20:37.780363] E 
[glusterd-store.c:2548:glusterd_resolve_all_bricks] 0-glusterd: resolve brick 
failed in restore
[2014-03-04 21:20:37.780395] E [xlator.c:423:xlator_init] 0-management: 
Initialization of volume 'management' failed, review your volfile again
[2014-03-04 21:20:37.780410] E [graph.c:292:glusterfs_graph_init] 0-management: 
initializing translator failed
[2014-03-04 21:20:37.780422] E [graph.c:479:glusterfs_graph_activate] 0-graph: 
init failed
[2014-03-04 21:20:37.780723] W [glusterfsd.c:1097:cleanup_and_exit] 
(-->/usr/sbin/glusterd(main+0x6b1) [0x406a91] 
(-->/usr/sbin/glusterd(glusterfs_volumes_init+0xb7) [0x405247] 
(-->/usr/sbin/glusterd(glusterfs_process_volfp+0x106) [0x405156]))) 0-: 
received signum (0), shutting down


Re: [Gluster-users] not in 'Peer in Cluster' state

2014-02-18 Thread Jon Cope
Hi Kaushal,

Thanks for the input.  I gave it a go and it produced identical results.  I 
was, however, pointed towards an article that's led me to a solution, linked 
below.  As I understand it, by assigning each node an elastic IP, you cement 
the public DNS name (now containing the elastic IP), preventing AWS from 
changing it during reboot.  Querying the public DNS name from inside EC2 
returns the private IP address, while a query from outside EC2 returns the 
elastic IP.  Gluster seems happy with this, so I am too.
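
A quick way to see that split-horizon behaviour (the public DNS name below is 
only a placeholder): run this from inside EC2 and it returns the node's 
private address; run it from outside and it returns the elastic IP: 

# dig +short ec2-203-0-113-10.compute-1.amazonaws.com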

Regards,
Jon

http://alestic.com/2009/06/ec2-elastic-ip-internal

- Original Message -
From: "Kaushal M" 
To: "Jon Cope" 
Cc: gluster-users@gluster.org
Sent: Saturday, February 15, 2014 5:40:32 AM
Subject: Re: [Gluster-users]  not in 'Peer in Cluster' state

Peer status showing node1's elastic IP suggests that you probed the
other peers from node1. This would mean that the other peers don't
know node1's hostname. Even though you've edited the hosts file on
the peers, a reverse resolution of node1's IP wouldn't return the
hostnames you've set. Gluster uses reverse resolution to match
hostnames when it doesn't have a straight match in the peer list.

To recover from this, just probe node1 from another peer. Do '#
gluster peer probe node1.ec2' from another peer. This will update
gluster's peer list to contain the name node1.ec2. After this, other
operations will continue successfully.
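
A minimal sequence, run on any peer other than node1: 

# gluster peer probe node1.ec2
# gluster peer status

After the probe, node1 should show up by its hostname instead of the elastic IP.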

~kaushal

On Sat, Feb 15, 2014 at 5:23 AM, Jon Cope  wrote:
> Hi all,
>
> I'm attempting to create a 4-node cluster on EC2.  I'm fairly new to this 
> and so may not be seeing something obvious.
>
> - Established passwordless SSH between nodes.
> - edited /etc/sysconfig/network HOSTNAME=node#.ec2 to satisfy FQDN
> - mounted xfs /dev/xvdh /mnt/brick1
> - stopped iptables
>
>
> The error I'm getting occurs when invoking the following, where <volname> is 
> the volume name:
>
> # gluster volume create <volname> replica 2 node1.ec2:/mnt/brick1 
> node2.ec2:/mnt/brick1 node3.ec2:/mnt/brick1 node4.ec2:/mnt/brick1
> # volume create: <volname>: failed: Host node1.ec2 is not in 'Peer in Cluster' 
> state
>
> Checking peer status of node1.ec2 from node{2..4}.ec2 produces the following. 
>  Note that node1.ec2's elastic IP appears instead of the FQDN; not sure if 
> that's relevant or not.
>
> [root@node2 ~]# gluster peer status
> Number of Peers: 3
>
> Hostname: node4.ec2
> Uuid: ab2bcdd8-2c0b-439d-b685-3be457988abc
> State: Peer in Cluster (Connected)
>
> Hostname: node3.ec2
> Uuid: 4f128213-3549-494a-af04-822b5e2f2b96
> State: Peer in Cluster (Connected)
>
> Hostname: ###.##.##.### #node1.ec2 elastic IP
> Uuid: 09d81803-e5e1-43b1-9faf-e94f730acc3e
> State: Peer in Cluster (Connected)
>
> The error as it appears in vim etc-glusterfs-glusterd.vol.log:
>
> [2014-02-14 23:28:44.634663] E 
> [glusterd-utils.c:5351:glusterd_new_brick_validate] 0-management: Host 
> node1.ec2 is not in 'Peer in Cluster' state
> [2014-02-14 23:28:44.634699] E 
> [glusterd-volume-ops.c:795:glusterd_op_stage_create_volume] 0-management: 
> Host node1.ec2 is not in 'Peer in Cluster' state
> [2014-02-14 23:28:44.634718] E [glusterd-syncop.c:890:gd_stage_op_phase] 
> 0-management: Staging of operation 'Volume Create' failed on localhost : Host 
> node1.ec2 is not in 'Peer in Cluster' state
>
> Can someone suggest a possible cause of this error or point me in a viable 
> direction?
>
>
>


[Gluster-users] not in 'Peer in Cluster' state

2014-02-14 Thread Jon Cope
Hi all,

I'm attempting to create a 4-node cluster on EC2.  I'm fairly new to this 
and so may not be seeing something obvious.  

- Established passwordless SSH between nodes.
- edited /etc/sysconfig/network HOSTNAME=node#.ec2 to satisfy FQDN
- mounted xfs /dev/xvdh /mnt/brick1
- stopped iptables


The error I'm getting occurs when invoking the following, where <volname> is the 
volume name:
 
# gluster volume create <volname> replica 2 node1.ec2:/mnt/brick1 
node2.ec2:/mnt/brick1 node3.ec2:/mnt/brick1 node4.ec2:/mnt/brick1
# volume create: <volname>: failed: Host node1.ec2 is not in 'Peer in Cluster' 
state

Checking peer status of node1.ec2 from node{2..4}.ec2 produces the following.  
Note that node1.ec2's elastic IP appears instead of the FQDN; not sure if 
that's relevant or not.

[root@node2 ~]# gluster peer status
Number of Peers: 3

Hostname: node4.ec2
Uuid: ab2bcdd8-2c0b-439d-b685-3be457988abc
State: Peer in Cluster (Connected)

Hostname: node3.ec2
Uuid: 4f128213-3549-494a-af04-822b5e2f2b96
State: Peer in Cluster (Connected)

Hostname: ###.##.##.### #node1.ec2 elastic IP
Uuid: 09d81803-e5e1-43b1-9faf-e94f730acc3e
State: Peer in Cluster (Connected)

The error as it appears in vim etc-glusterfs-glusterd.vol.log:

[2014-02-14 23:28:44.634663] E 
[glusterd-utils.c:5351:glusterd_new_brick_validate] 0-management: Host 
node1.ec2 is not in 'Peer in Cluster' state
[2014-02-14 23:28:44.634699] E 
[glusterd-volume-ops.c:795:glusterd_op_stage_create_volume] 0-management: Host 
node1.ec2 is not in 'Peer in Cluster' state
[2014-02-14 23:28:44.634718] E [glusterd-syncop.c:890:gd_stage_op_phase] 
0-management: Staging of operation 'Volume Create' failed on localhost : Host 
node1.ec2 is not in 'Peer in Cluster' state

Can someone suggest a possible cause of this error or point me in a viable 
direction?





[Gluster-users] rhs-hadoop-install Fails to create volume

2014-02-12 Thread Jon Cope

Hi All,

I'm trying to configure a Gluster/Hadoop volume in a 4-node EC2 cluster using 
the automated configuration process in:

rhs-hadoop-install-0_65-2.el6rhs.noarch.rpm
rhs-hadoop-2.1.6-2.noarch.rpm
command "./install /dev/SomeDevice"

I begin with 4 nodes, each with an attached and formatted EBS volume named 
/dev/xvdh.  Using "./install.sh /dev/SomeDevice", the script successfully:
1. creates a dir on each node called /mnt/brick1
2. uses mkfs.xfs on each device
3. mounts each filesystem to /mnt/brick1
4. edits /etc/fstab accordingly
5. probes peers listed in /usr/share/rhs-hadoop-install/hosts
6. attempts to create the volume (a rough manual equivalent is sketched after this list)
7. explodes (see below)
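
If it helps, this is approximately the volume create that step 6 appears to 
attempt, reconstructed from the brick list in the error output below (the 
replica count is my guess; the script may lay the volume out differently): 

# gluster volume create HadoopVol replica 2 node1.ec2:/mnt/brick1/HadoopVol \
>     node2.ec2:/mnt/brick1/HadoopVol node3.ec2:/mnt/brick1/HadoopVol \
>     node4.ec2:/mnt/brick1/HadoopVol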

The nodes themselves appear okay.  Hostnames are all qualified...

I've run out of ideas.  Does anyone have anything?



--Begin cluster configuration --


-- Cleaning up (un-mounting, deleting volume, etc.)
  -- un-mounting /mnt/glusterfs on all nodes...
  -- from node node1.ec2:
   stopping HadoopVol volume...
   deleting HadoopVol volume...
  -- from node node1.ec2:
   detaching all other nodes from trusted pool...
  -- on all nodes:
   rm /mnt/glusterfs...
   umount /mnt/brick1...
   rm /mnt/brick1 and /mnt/brick1/mapredlocal...

-- Setting up brick and volume mounts, creating and starting volume
  -- on all nodes:
   mkfs.xfs /dev/xvdh...
   mkdir /mnt/brick1, /mnt/glusterfs and /mnt/brick1/mapredlocal...
   append mount entries to /etc/fstab...
   mount /mnt/brick1...
  -- from node node1.ec2:
   creating trusted pool...
   creating HadoopVol volume...
   starting HadoopVol volume...
   ERROR: Volume "HadoopVol" creation failed with error 1
  Bricks=" node1.ec2:/mnt/brick1/HadoopVol 
node2.ec2:/mnt/brick1/HadoopVol node3.ec2:/mnt/brick1/HadoopVol 
node4.ec2:/mnt/brick1/HadoopVol"

 All nodes appear okay.

[root@node1 rhs-hadoop-install]# gluster peer status
Number of Peers: 3

Hostname: node2.ec2
Uuid: 888d8c52-dcec-42c4-96a8-e7fbf1e04de0
State: Peer in Cluster (Connected)

Hostname: node3.ec2
Uuid: 34d0c158-3021-4187-94d1-63adaa1a3a3d
State: Peer in Cluster (Connected)

Hostname: node4.ec2
Uuid: 2d9ae6c0-9dc1-4080-ab0b-dfd12e3f108e
State: Peer in Cluster (Connected)