Re: [Gluster-users] [Gluster-devel] Gluster on an ARM system

2011-08-12 Thread Pavan T C

On Friday 12 August 2011 08:48 AM, Emmanuel Dreyfus wrote:

John Mark Walker <jwal...@gluster.com> wrote:


I've CC'd the gluster-devel list in the hopes that someone there can help
you out. However, my understanding is that it will take some significant
porting to get GlusterFS to run in any production capacity on ARM.


What ARM specific problems have been identified?



The biggest issue, IMO, will be that of endianness.
GlusterFS has been run only on Intel/AMD architectures, AFAIK; I have not 
heard of any SPARC installations. That means the code has been tested 
only on little-endian architectures. The worst problems come in 
when entities of different endianness interact.


However, there is another side to this. From what I know, ARM is 
actually a bi-endian processor. If the ARM core has the system control 
co-processor, the endianness of the ARM processor can be controlled by 
software. So if we make ARM work as a little-endian processor, we 
should be fine even in a mixed environment. But then, ARM is a 32-bit 
processor, and I am unsure/ignorant of the stability of GlusterFS on 32-bit.


If we can solve the two major issues mentioned above, viz., endianness and 
the stability of GlusterFS on 32-bit, we should theoretically be able to get 
GlusterFS working on ARM without any other major work.
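To make the endianness point concrete: the fix is to pin everything that crosses the wire to one byte order, regardless of the host CPU. A minimal sketch of the idea in Python (illustrative only, not Gluster's actual serialization code; if I recall correctly Gluster's RPC layer is XDR-based, and XDR already mandates big-endian on the wire):

```python
import struct

def pack_header(call_id: int, proc: int) -> bytes:
    # '!' forces network (big-endian) byte order, independent of the host
    # CPU, so a little-endian x86 peer and a big-endian peer agree on the
    # exact bytes produced.
    return struct.pack('!II', call_id, proc)

def unpack_header(data: bytes) -> tuple:
    # Decode with the same fixed byte order on the receiving side.
    return struct.unpack('!II', data)

# The same integers always serialize to the same bytes on any host:
wire = pack_header(42, 7)
assert wire == b'\x00\x00\x00\x2a\x00\x00\x00\x07'
assert unpack_header(wire) == (42, 7)
```

As long as both ends agree on the wire format like this, the endianness of the individual hosts stops mattering.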


Again, I cannot vouch for the above statement; these are just my thoughts 
from what I know.


Pavan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] Stripe+replicate

2011-08-12 Thread Reinis Rozitis

Hello,
is this ( http://gluster.org/pipermail/gluster-users/2011-July/008223.html ) 
still true regarding the 3.3.0 beta, or should I check out Git?

Also, while it is possible to create such setups manually in the client volfile, will more complex striped+replicated+distributed 
setups (for example, a stripe across 6 or more nodes, each stripe having 3 replicas, distributed over 12 servers) be supported, or is it 
better to stay away from something like that?


What's the suggested way to store large ~500 GB files reliably, so that a replica failing and 
having to be resynced doesn't bring the cluster down?
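For intuition on what stripe buys you with files this large: a striped volume places fixed-size blocks of a single file round-robin across its bricks, so one huge file is spread over several servers. A toy sketch of that placement rule (the 128 KB block size and the plain round-robin are assumptions for illustration, not the exact logic of Gluster's stripe translator):

```python
STRIPE_BLOCK = 128 * 1024  # assumed stripe block size; configurable in practice

def brick_for_offset(offset: int, n_bricks: int) -> int:
    # Block k of the file lives on brick k mod n_bricks, so consecutive
    # blocks of one file rotate across all bricks in the stripe set.
    return (offset // STRIPE_BLOCK) % n_bricks

# With a 3-brick stripe, consecutive 128 KB blocks rotate over the bricks:
assert brick_for_offset(0, 3) == 0
assert brick_for_offset(128 * 1024, 3) == 1
assert brick_for_offset(5 * 128 * 1024, 3) == 2
```

The flip side is that losing any one brick in a stripe set damages every file striped over it, which is why stripe is usually combined with replication for data you care about.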



thx in advance
rr 




Re: [Gluster-users] Gluster server and multi-homing, again

2011-08-12 Thread Greg_Swift


gluster-users-boun...@gluster.org wrote on 08/11/2011 02:22:11 PM:

 On Thu, Aug 11, 2011 at 12:18 PM, Mohit Anchlia
 mohitanch...@gmail.com wrote:
  Run one glusterd on server1.  Have server1-eth1 and server1-eth2
  refer to the two interfaces.  Use gluster peer probe server1-eth1
  AND gluster peer probe server1-eth2 (?)
 
  My concern is that this will be attempting to add the same server to
 
  Don't they have different IP unless I misunderstood your
  configuration. Are the NIC bonded?

 No, no bonding.  I mean one server with two IP addresses.  So
 server1-eth1 might be 192.168.0.37 and server1-eth2 might be
 192.168.1.37, but they both connect to the same server (just on
 different interfaces).

  If you want to try out you will also need to do peer probe with eth1
  and eth2 IPs.

 If both IPs refer to the same server, they will connect to the same
 glusterd process, unless I find a way to launch two such processes and
 bind each to a separate address...



To clarify: you effectively have one server and one set of clients, but want
the clients to be able to access the server over both IPs, with each IP
mapping to only a portion (a brick) of the combined volume?

If I interpreted incorrectly, take the rest with a grain of salt.

I do not think what you want is really possible the way you are describing
it, primarily because gluster uses a hashing algorithm to distribute files
across bricks, which makes targeting a single brick in a volume pretty close
to impossible.  To accomplish that you would need separate volumes.
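To illustrate why targeting one brick is so hard: the distribute translator hashes the file name into a 32-bit space, and each brick owns a slice of that space, so placement follows from the name alone, not from which interface or mount a client used. A toy stand-in for that scheme (CRC32 here purely for illustration; the real DHT uses a different hash and per-directory layouts):

```python
import zlib

def brick_for_file(name: str, n_bricks: int) -> int:
    # Hash the file name into the 32-bit ring, then find which
    # equal-sized slice of the ring (i.e. which brick) it falls into.
    h = zlib.crc32(name.encode('utf-8')) & 0xFFFFFFFF
    slice_width = (1 << 32) // n_bricks
    return min(h // slice_width, n_bricks - 1)

# Placement is deterministic per name, but effectively arbitrary across names:
assert brick_for_file('a.txt', 4) == brick_for_file('a.txt', 4)
assert all(0 <= brick_for_file(n, 4) < 4 for n in ('a', 'b', 'c', 'd'))
```

Since the client, not the admin, computes where each file lands, there is no knob to steer particular files at a particular brick within one volume.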

Once you have them split into separate volumes you can accomplish what you
want (having clients use specific interfaces for specific disks), but it
will require you to forgo the recommended mounting method for one of the
two volumes. For the primary interface you can mount normally from the
clients, but for the second one you will have to download the vol file to
the clients and change the IPs in it to those of the second interface.

We used to do the manual editing; it was a pain. We reworked our system
to use the default method, and it works a lot better.

We do still have a failover scenario where we have to do the manual file
change, because gluster doesn't natively support alternate
interface/hostnames in the config file. Hopefully it will one day.

-greg



[Gluster-users] (no subject)

2011-08-12 Thread Jürgen Maurer

Sent from Samsung Mobile


[Gluster-users] GlusterFS 3.1.6

2011-08-12 Thread John Mark Walker
Not sure if I announced it here, but GlusterFS 3.1.6 has been released.

If you're on the 3.1.x series, and you've been waiting for a fix for the gfid 
issue, this is it:

http://community.gluster.org/p/glusterfs-3-1-6-released/


Thanks,
John Mark
Gluster Community Guy


[Gluster-users] New GlusterFS Documentation

2011-08-12 Thread John Mark Walker
We have recently (~1 month ago) revamped some of our documentation with new 
installation and administration guides for GlusterFS 3.2. If you're new to 
GlusterFS, you may have already seen it, but if you are an old-timer you may 
not have noticed.

Please feel free to peruse the new guides:

Installation guide -
wiki: 
http://www.gluster.com/community/documentation/index.php/Gluster_3.2_Filesystem_Installation_Guide
PDF: 
http://download.gluster.com/pub/gluster/glusterfs/3.2/3.2.0/Gluster_FS_3.2_Installation_Guide.pdf

Administration guide -
wiki: 
http://www.gluster.com/community/documentation/index.php/Gluster_3.2_Filesystem_Administration_Guide
PDF: 
http://download.gluster.com/pub/gluster/glusterfs/3.2/3.2.2/Gluster_FS_3.2_Admin_Guide.pdf

Please take a look and let us know if this documentation meets your needs and 
how we should change it going forward.

We have also begun the task of writing new documentation and adding to the mix 
of currently available docs. We know that there's more to do yet in terms of 
performance tuning, extending GlusterFS, and other advanced topics.

What other topics would you like to see covered?

Thanks,
John Mark Walker
Gluster Community Guy


Re: [Gluster-users] Replace brick of a dead node

2011-08-12 Thread Rajat Chopra

Thank you Harsha for the quick response.

Unfortunately, the infrastructure is in the cloud, so I can't get the dead 
node's disk.
Since I have replication on, there is no downtime, as the brick on the second 
node serves well; but I want the redundancy/replication to be restored with the 
introduction of a new node (#3) in the cluster.

I would hope there is a gluster command to just forget about the dead node's 
brick, pick up the new brick, and start replicating/serving from the new 
location (in conjunction with the one existing brick on node #2). Is that 
the self-heal feature? I am using v3.1.1 as of now.

Rajat




- Original Message -
From: Harshavardhana har...@gluster.com
To: Rajat Chopra rcho...@redhat.com
Cc: gluster-users@gluster.org
Sent: Friday, August 12, 2011 2:06:14 PM
Subject: Re: [Gluster-users] Replace brick of a dead node

 I have a two node cluster, with two bricks replicated, one on each node.
 Lets say one of the node dies and is unreachable.

If you have the disk from the dead node, then all you have to do is plug
it into the new system and run the following commands:

gluster volume replace-brick volname old-brick new-brick start
gluster volume replace-brick volname old-brick new-brick commit

You don't have to migrate the data; this works as expected.

Since you have a replica you wouldn't see downtime, but mind you,
self-heal will kick in; as of 3.2 it is blocking. Wait for 3.3 and you
have non-blocking self-healing capabilities.

 I want to be able to spin a new node and replace the dead node's brick to a 
 location on the new node.

This is out of Gluster's hands; if you already have a mechanism to
decommission a brick and reattach it on the new node, then the above steps
are fairly simple.

Go ahead and try it, it should work.

-Harsha


Re: [Gluster-users] Replace brick of a dead node

2011-08-12 Thread Mohit Anchlia
On Fri, Aug 12, 2011 at 2:35 PM, Rajat Chopra rcho...@redhat.com wrote:

 Thank you Harsha for the quick response.

 Unfortunately, the infrastructure is in the cloud. So, I cant get the dead 
 node's disk.
 Since I have replication 'ON', there is no downtime as the brick on the 
 second node serves well, but I want the redundancy/replication to be restored 
 with the introduction of a new node (#3) in the cluster.

One way is 
http://gluster.com/community/documentation/index.php/Gluster_3.2:_Brick_Restoration_-_Replace_Crashed_Server

The other way is to use replace-brick; you should be able to use it even
if the node is dead.





Re: [Gluster-users] Gluster on an ARM system

2011-08-12 Thread Devon Miller
For what it's worth, I've been running 3.2.0 for about 4 months now on ARM
processors  (Globalscale SheevaPlug (armv5tel) running Debian squeeze). I
have 4 volumes, each running 2 bricks in replicated mode. I haven't seen
anything like this.

dcm

On Fri, Aug 12, 2011 at 7:24 AM, Charles Williams ch...@itadmins.netwrote:

 As discussed with avati in IRC, I am able to set up a user account on the
 ARM box. I have also done a bit more tracing and have attached an strace
 of glusterd from startup through peer probe to the core dump.

 chuck

 On 08/11/2011 08:50 PM, John Mark Walker wrote:
  Hi Charles,
 
  We have plans in the future to work on an ARM port, but that won't come
 to fruition for some time.
 
  I've CC'd the gluster-devel list in the hopes that someone there can help
 you out. However, my understanding is that it will take some significant
 porting to get GlusterFS to run in any production capacity on ARM.
 
  Once we have more news on the ARM front, I'll be happy to share it here
 and elsewhere.
 
  Please send all responses to gluster-devel, as that is the proper place
 for this conversation.
 
  Thanks,
  John Mark Walker
  Gluster Community Guy
 
  
  From: gluster-users-boun...@gluster.org [
 gluster-users-boun...@gluster.org] on behalf of Charles Williams [
 ch...@itadmins.net]
  Sent: Thursday, August 11, 2011 3:48 AM
  To: gluster-users@gluster.org
  Subject: Re: [Gluster-users] Gluster on an ARM system
 
  OK, running glusterd on the ARM box with gdb and then doing a gluster
  peer probe zmn1 I get the following from gdb when glusterd core dumps:
 
  [2011-08-11 12:46:35.326998] D
  [glusterd-utils.c:2627:glusterd_friend_find_by_hostname] 0-glusterd:
  Friend zmn1 found.. state: 0
 
  Program received signal SIGSEGV, Segmentation fault.
  0x4008e954 in rpc_transport_connect (this=0x45c48, port=0) at
  rpc-transport.c:810
  810 ret = this->ops->connect (this, port);
  (gdb)
 
 
  On 08/11/2011 10:49 AM, Charles Williams wrote:
  sorry,
 
  that last lines of the debug info should be:
 
  [2011-08-11 10:38:21.499022] D
  [glusterd-utils.c:2627:glusterd_friend_find_by_hostname] 0-glusterd:
  Friend zmn1 found.. state: 0
  Segmentation fault (core dumped)
 
 
 
  On 08/11/2011 10:46 AM, Charles Williams wrote:
  Hey all,
 
  So I went ahead and did a test install on my QNAP TS412U (ARM based)
 and
  all went well with the build and install. The problems started
 afterwards.
 
  QNAP (ARM server) config:
 
  volume management-zmn1
  type mgmt/glusterd
  option working-directory /opt/etc/glusterd
  option transport-type socket
  option transport.address-family inet
  option transport.socket.keepalive-time 10
  option transport.socket.keepalive-interval 2
  end-volume
 
 
  zmn1 (Dell PowerEdge) config:
 
  volume management
  type mgmt/glusterd
  option working-directory /etc/glusterd
  option transport-type socket
  option transport.address-family inet
  option transport.socket.keepalive-time 10
  option transport.socket.keepalive-interval 2
  end-volume
 
 
  When I tried to do a peer probe from the QNAP server to add the first
  server into the cluster glusterd seg faulted with a core dump:
 
  [2011-08-11 10:38:21.457839] I
  [glusterd-handler.c:623:glusterd_handle_cli_probe] 0-glusterd: Received
  CLI probe req zmn1 24007
  [2011-08-11 10:38:21.459508] D
  [glusterd-utils.c:213:glusterd_is_local_addr] 0-glusterd: zmn1 is not
 local
  [2011-08-11 10:38:21.460162] D
  [glusterd-utils.c:2675:glusterd_friend_find_by_hostname] 0-glusterd:
  Unable to find friend: zmn1
  [2011-08-11 10:38:21.460682] D
  [glusterd-utils.c:2675:glusterd_friend_find_by_hostname] 0-glusterd:
  Unable to find friend: zmn1
  [2011-08-11 10:38:21.460766] I
  [glusterd-handler.c:391:glusterd_friend_find] 0-glusterd: Unable to
 find
  hostname: zmn1
  [2011-08-11 10:38:21.460843] I
  [glusterd-handler.c:3417:glusterd_probe_begin] 0-glusterd: Unable to
  find peerinfo for host: zmn1 (24007)
  [2011-08-11 10:38:21.460943] D
  [glusterd-utils.c:3080:glusterd_sm_tr_log_init] 0-: returning 0
  [2011-08-11 10:38:21.461017] D
  [glusterd-utils.c:3169:glusterd_peerinfo_new] 0-: returning 0
  [2011-08-11 10:38:21.461199] D
 
 [glusterd-handler.c:3323:glusterd_transport_inet_keepalive_options_build]
 0-glusterd:
  Returning 0
  [2011-08-11 10:38:21.465952] D
 [rpc-clnt.c:914:rpc_clnt_connection_init]
  0-management-zmn1: defaulting frame-timeout to 30mins
  [2011-08-11 10:38:21.466146] D [rpc-transport.c:672:rpc_transport_load]
  0-rpc-transport: attempt to load file
  /opt/lib/glusterfs/3.2.2/rpc-transport/socket.so
  [2011-08-11 10:38:21.466346] D
  [rpc-transport.c:97:__volume_option_value_validate] 0-management-zmn1:
  no range check required for 'option transport.socket.keepalive-time 10'
  [2011-08-11 10:38:21.466460] D
  [rpc-transport.c:97:__volume_option_value_validate] 0-management-zmn1:
  no range check required for 'option 

Re: [Gluster-users] Gluster on an ARM system

2011-08-12 Thread John Mark Walker
Based on the discussions here, I think I should create an ARM resource 
page/section on the wiki.

Looks like there's more community activity than I thought.

-JM




Re: [Gluster-users] Replace brick of a dead node

2011-08-12 Thread Harshavardhana
 Since I have replication 'ON', there is no downtime as the brick on the 
 second node serves well, but I want the redundancy/replication to be restored 
 with the introduction of a new node (#3) in the cluster.


Exactly. If it's in the cloud, then the disk (be it EBS volumes) can be
reattached to the new server, and you can do a replace-brick even
when the old brick is dead/unreachable.

If there are no EBS volumes, then there should still be a mechanism to
reattach the brick storage associated with that instance.

 I would hope there is a gluster command to just forget about the dead node's 
 brick, and pick up the new brick and start replicating/serving from the new

Gluster cannot decide this on its own, as it has no awareness of whether it
is running on cloud, bare metal, or a KVM setup. So right now the above
procedure stands.

-Harsha


Re: [Gluster-users] Replace brick of a dead node

2011-08-12 Thread Harshavardhana
On Fri, Aug 12, 2011 at 4:03 PM, Harshavardhana har...@gluster.com wrote:
 Since I have replication 'ON', there is no downtime as the brick on the 
 second node serves well, but I want the redundancy/replication to be 
 restored with the introduction of a new node (#3) in the cluster.


 Exactly if its in Cloud then the disk be it EBS blocks which can be
 reattached back to the new server and you can do a replace-brick even
 when the old-brick is dead/unreachable.

By old-brick here I meant 'old instance', not 'disk' per se.

-Harsha


Re: [Gluster-users] Replace brick of a dead node

2011-08-12 Thread Rajat Chopra

I am afraid the replace-brick procedure does not work well if the node is 
dead. Here is the (long-ish) step-by-step procedure for the dead end that I 
run into...

 [node-1 $] service glusterd start
 [node-1 $] gluster volume create my-vol replica 2 node-1:/srv-node-1-first 
node-1:/srv-node-1-second
 [node-1 $] gluster volume start my-vol
# this started my gluster service on the first node, with two bricks replicated 
but sourced from the same node
# next I add a new node and replace one of the bricks with a new brick location 
on the second node
# the purpose is to achieve failover redundancy
 [node-2 $] service glusterd start
 [node-1 $] gluster peer probe node-2
 [node-2 $] gluster peer probe node-1
 [node-2 $] gluster volume replace-brick my-vol node-1:/srv-node-1-second 
node-2:/srv-node-2-third start
# this starts the replace operation and after a while I can do volume info from 
either node
 [node-2 $] gluster volume info
Volume Name: my-vol
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: node-1:/srv-node-1-first
Brick2: node-2:/srv-node-2-third

# all good so far... now node-1 dies (no EBS, no disk, no data... just not 
reachable; it's a private cloud and the machine running the VM had a hardware 
failure)
# gluster keeps serving well from node-2 to all the clients
# now I want to move the node-1 brick to another brick on node-2 so that I 
can pass it on to new nodes later

# so according to the suggestion, I ran replace-brick command
[node-2 $] gluster volume replace-brick my-vol node-1:/srv-node-1-first 
node-2:/srv-node-2-fourth start
# the command succeeds without errors, so I check status...
[node-2 $] gluster volume replace-brick my-vol node-1:/srv-node-1-first 
node-2:/srv-node-2-fourth status
# this command is supposed to return the status, but it returns nothing
# I check with gluster volume info on node-2
[node-2 $] gluster volume info
No volumes present
# wha?? where did my volume go?
# Note that all this while my mounted client has kept working fine, so no downtime

Since 'gluster volume info' returned 'No volumes present', I assume the 
procedure does not work. Is there something wrong in my procedure, or was 
it not supposed to work anyway?

I am using v3.1.1.

Again, I really appreciate the help, but I seem to be stuck. The earlier email 
suggested the procedure in the following link:
  [ 
http://gluster.com/community/documentation/index.php/Gluster_3.2:_Brick_Restoration_-_Replace_Crashed_Server
 ]
It seems like a better way of replacing dead nodes, but then it would seem that 
I can't replace the brick from the dead node with a newly created path on an 
existing node, because the hostname has to match. That is fine too, if it's a 
requirement, but can I assume that it will work with 3.1.1, or do I have to 
upgrade to 3.2 for it?

Thanks again for the assistance.
Rajat






Re: [Gluster-users] Replace brick of a dead node

2011-08-12 Thread Mohit Anchlia
On Fri, Aug 12, 2011 at 4:49 PM, Rajat Chopra rcho...@redhat.com wrote:

 I am afraid the 'replace-brick' procedure does not work well if the node is 
 dead. Here is the (long-ish) step-wise procedure for the dead-end that I run 
 into...

  [node-1 $] service glusterd start
  [node-1 $] gluster volume create my-vol replica 2 node-1:/srv-node-1-first 
 node-1:/srv-node-1-second
  [node-1 $] gluster volume start my-vol
 # this began my gluster service on first node with two bricks replicated but 
 sourcing from the same node
 # next I add a new node and replace one of the bricks with a new brick 
 location on second node
 # the purpose is to achieve failover redundancy
  [node-2 $] service glusterd start
  [node-1 $] gluster peer probe node-2
  [node-2 $] gluster peer probe node-1
  [node-2 $] gluster volume replace-brick my-vol node-1:/srv-node-1-second 
 node-2:/srv-node-2-third start
 # this starts the replace operation and after a while I can do volume info 
 from either node
  [node-2 $] gluster volume info
 Volume Name: my-vol
 Type: Replicate
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp
 Bricks:
 Brick1: node-1:/srv-node-1-first
 Brick2: node-2:/srv-node-2-third

 # all good so far... now node-1 dies (no EBS, no disk, no data... just not 
 reachable.. its a pvt cloud and the machine running the vm had a hardware 
 failure)
 # good gluster serves well from node-2 to all the clients nicely too
 # now I want to replace the node-1 brick to another brick in node-2 so that I 
 can pass it on to new nodes later

 # so according to the suggestion, I ran replace-brick command
 [node-2 $] gluster volume replace-brick my-vol node-1:/srv-node-1-first 
 node-2:/srv-node-2-fourth start


Did you run '$ gluster volume replace-brick my-vol
node-1:/srv-node-1-first node-2:/srv-node-2-fourth commit'?

If not, try running this additional commit command; it will make the
necessary changes to the config. Don't check status first; run commit
right after start, since the dead node is not around.

 # the command succeeds without errors, so I check status...
 [node-2 $] gluster volume replace-brick my-vol node-1:/srv-node-1-first 
 node-2:/srv-node-2-fourth status
 # this command is supposed to return the status, but it returns nothing
 # I check with gluster volume info on node-2
 [node-2 $] gluster volume info
 No volumes present
 # wha?? where did my volume go?
 # Note that all this while.. my mounted client is working fine, so no downtime

 Since 'gluster volume info' returned with 'No volumes present', I assume that 
 the procedure does not work. Is there something wrong in my procedure, or was 
 it not supposed to work anyway?

 I am using v3.1.1


Re: [Gluster-users] Replace brick of a dead node

2011-08-12 Thread Marcel Pennewiß
On Friday 12 August 2011 22:58:22 Rajat Chopra wrote:
 Is there a way I can achieve that without any downtime?

Maybe [1] could help?

[1] http://bugs.gluster.com/show_bug.cgi?id=2506#c3 

best regards,
Marcel