Ok, a smaller test case for the release-3.3 branch. I can't seem to remove a
brick without somehow breaking the volume:

[14:53:46] [email protected]:~# mkdir /test
[14:55:23] [email protected]:~# cd /test/
[14:55:26] [email protected]:/test# mkdir b1
[14:55:28] [email protected]:/test# mkdir b2
[14:55:29] [email protected]:/test# mkdir b3
[14:55:31] [email protected]:/test# gluster volume create marctest replica 3 
fs-5.mseeger:/test/b1 fs-5.mseeger:/test/b2 fs-5.mseeger:/test/b3
Multiple bricks of a replicate volume are present on the same server. This 
setup is not optimal.
Do you still want to continue creating the volume?  (y/n) y
Creation of volume marctest has been successful. Please start the volume to 
access data.
[14:56:07] [email protected]:/test# gluster volume start marctest
Starting volume marctest has been successful

[14:57:40] [email protected]:/test# gluster volume info marctest
 
Volume Name: marctest
Type: Replicate
Volume ID: a25ee38b-156c-4ea0-87d6-0522af615c72
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: fs-5.mseeger:/test/b1
Brick2: fs-5.mseeger:/test/b2
Brick3: fs-5.mseeger:/test/b3

[14:57:44] [email protected]:/test# gluster volume remove-brick marctest 
replica 2 fs-5.mseeger:/test/b3 start
Remove Brick start unsuccessful

[14:57:52] [email protected]:/test# gluster volume info marctest
 
Volume Name: marctest
Type: Distributed-Replicate
Volume ID: a25ee38b-156c-4ea0-87d6-0522af615c72
Status: Started
Number of Bricks: 1 x 2 = 3
Transport-type: tcp
Bricks:
Brick1: fs-5.mseeger:/test/b1
Brick2: fs-5.mseeger:/test/b2
Brick3: fs-5.mseeger:/test/b3

[14:58:03] [email protected]:/test# gluster volume remove-brick marctest 
replica 2 fs-5.mseeger:/test/b3 start
number of bricks provided (1) is not valid. need at least 2 (or 2xN)

[14:58:56] [email protected]:/test# gluster volume stop marctest
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) 
y
Stopping volume marctest has been successful

[15:01:22] [email protected]:/test# gluster volume start marctest
Starting volume marctest has been unsuccessful
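The "need at least 2 (or 2xN)" error above seems to come from the CLI-side
replica-count validation: after the half-applied change the volume already
reports replica 2, so removing bricks without changing the replica count is
only accepted in multiples of the replica. A rough Python model of that check
(my reading of the 3.3 behaviour, not the actual glusterd code; the function
name is made up):

```python
def validate_remove_brick(total_bricks, replica, new_replica, remove_count):
    """Rough model (not actual glusterd code) of the remove-brick
    replica-count validation on the 3.3 branch."""
    if new_replica == replica:
        # Replica count unchanged: whole replica sets must be removed.
        if remove_count % replica != 0:
            return "number of bricks provided (%d) is not valid. " \
                   "need at least %d (or %dxN)" % (remove_count, replica, replica)
    else:
        # Lowering the replica count: one brick per subvolume must go
        # for each step the replica count drops.
        subvols = total_bricks // replica
        if remove_count != subvols * (replica - new_replica):
            return "invalid brick count for replica change"
    return "ok"

# The volume now claims replica 2 with 3 bricks; asking to remove a single
# brick while still passing "replica 2" trips the 2xN requirement:
print(validate_remove_brick(3, 2, 2, 1))
```

Note that the original request (replica 3 down to 2, removing one brick) passes
this model, which matches the logs: the first attempt got past validation and
only failed at commit time.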

These are the log file entries for the initial removal:

[2013-06-11 14:57:44.498903] I 
[glusterd-handler.c:866:glusterd_handle_cli_get_volume] 0-glusterd: Received 
get vol req
[2013-06-11 14:57:52.758892] I 
[glusterd-brick-ops.c:601:glusterd_handle_remove_brick] 0-glusterd: Received 
rem brick req
[2013-06-11 14:57:52.758892] I 
[glusterd-brick-ops.c:642:glusterd_handle_remove_brick] 0-management: request 
to change replica-count to 2
[2013-06-11 14:57:52.758892] I 
[glusterd-utils.c:857:glusterd_volume_brickinfo_get_by_brick] 0-: brick: 
fs-5.mseeger:/test/b3
[2013-06-11 14:57:52.758892] I 
[glusterd-utils.c:814:glusterd_volume_brickinfo_get] 0-management: Found brick
[2013-06-11 14:57:52.758892] I [glusterd-utils.c:285:glusterd_lock] 0-glusterd: 
Cluster lock held by 7c798980-5413-484c-ac33-aeb873acec7d
[2013-06-11 14:57:52.758892] I [glusterd-handler.c:463:glusterd_op_txn_begin] 
0-management: Acquired local lock
[2013-06-11 14:57:52.758892] I 
[glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC 
from uuid: f2bb435f-5db3-4ea9-b640-fc5aab3fdf76
[2013-06-11 14:57:52.758892] I 
[glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 
1 peers
[2013-06-11 14:57:52.758892] I 
[glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from 
uuid: f2bb435f-5db3-4ea9-b640-fc5aab3fdf76
[2013-06-11 14:57:52.758892] I 
[glusterd-op-sm.c:3487:glusterd_bricks_select_remove_brick] 0-management: force 
flag is not set
[2013-06-11 14:57:52.758892] I 
[glusterd-utils.c:857:glusterd_volume_brickinfo_get_by_brick] 0-: brick: 
fs-5.mseeger:/test/b3
[2013-06-11 14:57:52.758892] I 
[glusterd-utils.c:814:glusterd_volume_brickinfo_get] 0-management: Found brick
[2013-06-11 14:57:52.768892] I 
[glusterd-brick-ops.c:1590:glusterd_op_remove_brick] 0-management: changing 
replica count 3 to 2 on volume marctest
[2013-06-11 14:57:52.768892] E 
[glusterd-volgen.c:2158:volgen_graph_build_clients] 0-: volume inconsistency: 
total number of bricks (3) is not divisible with number of bricks per cluster 
(2) in a multi-cluster setup
[2013-06-11 14:57:52.768892] E 
[glusterd-volgen.c:3286:glusterd_create_volfiles_and_notify_services] 
0-management: Could not generate trusted client volfiles
[2013-06-11 14:57:52.768892] W 
[glusterd-brick-ops.c:1609:glusterd_op_remove_brick] 0-management: failed to 
create volfiles
[2013-06-11 14:57:52.768892] E 
[glusterd-op-sm.c:2350:glusterd_op_ac_send_commit_op] 0-management: Commit 
failed
[2013-06-11 14:57:52.768892] I 
[glusterd-op-sm.c:2254:glusterd_op_modify_op_ctx] 0-management: op_ctx 
modification not required
[2013-06-11 14:57:52.768892] I 
[glusterd-rpc-ops.c:607:glusterd3_1_cluster_unlock_cbk] 0-glusterd: Received 
ACC from uuid: f2bb435f-5db3-4ea9-b640-fc5aab3fdf76
[2013-06-11 14:57:52.768892] I [glusterd-op-sm.c:2653:glusterd_op_txn_complete] 
0-glusterd: Cleared local lock
[2013-06-11 14:58:03.018878] I 
[glusterd-handler.c:866:glusterd_handle_cli_get_volume] 0-glusterd: Received 
get vol req
[2013-06-11 14:58:56.278813] I 
[glusterd-brick-ops.c:601:glusterd_handle_remove_brick] 0-glusterd: Received 
rem brick req
[2013-06-11 14:58:56.278813] I 
[glusterd-brick-ops.c:642:glusterd_handle_remove_brick] 0-management: request 
to change replica-count to 2
[2013-06-11 14:58:56.278813] W 
[glusterd-brick-ops.c:319:gd_rmbr_validate_replica_count] 0-management: number 
of bricks provided (1) is not valid. need at least 2 (or 2xN)
[2013-06-11 14:58:56.278813] E 
[glusterd-brick-ops.c:844:glusterd_handle_remove_brick] 0-: number of bricks 
provided (1) is not valid. need at least 2 (or 2xN)
[2013-06-11 15:01:05.688935] I 
[glusterd-volume-ops.c:354:glusterd_handle_cli_stop_volume] 0-glusterd: 
Received stop vol reqfor volume marctest

On Jun 11, 2013, at 3:01 PM, Bobby Jacob <[email protected]> wrote:

> Hi All,
> 
> I'm using the following glusterFS version:
>       glusterfs 3.3.1 built on Oct 11 2012
> I was successfully able to remove bricks from a 4-replica volume by reducing
> the replica count to 3. My "gluster volume status" displayed the volume as a
> 3-node Replicate volume. Further, I removed another brick by reducing the
> replica count to 2.
> 
> Later, I added another brick using add-brick and increased the replica count
> to 3. ALL WORKED FINE FOR ME!!
> 
> Here are the commands I used:
> 1) gluster volume remove-brick Cloud-data replica 3 GSNODE01:/mnt/brick1
> (Changed Replica count from 4 to 3)
> 2) gluster volume remove-brick Cloud-data replica 2 GSNODE01:/mnt/brick2
> (Changed Replica count from 3 to 2)
> 3) gluster volume add-brick Cloud-data replica 3 GSNODE01:/brick4
> (Changed Replica count from 2 to 3)
> 
> Thanks & Regards,
> 
> Bobby Jacob
> Senior Technical Systems Engineer | eGroup
> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Marc Seeger
> Sent: Tuesday, June 11, 2013 3:42 PM
> To: [email protected]
> Subject: [Gluster-users] Removing bricks from a replicated setup completely
> breaks volume on Gluster 3.3
> 
> Version: # glusterfs 3.3git built on Jun  7 2013 14:38:02 (branch
> release-3.3)
> 
> Initial setup: A replicated volume with 3 bricks
> Goal: Remove one of the bricks from it.
> Outcome: A completely broken volume
> 
> 
> ------------- Volume info -------------
> 
> [email protected]:~# gluster volume info
> 
> Volume Name: test-fs-cluster-1
> Type: Replicate
> Volume ID: 752e7ffd-04bb-4234-8d16-d1f49ef510b7
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: fs-14.example.com:/mnt/brick21
> Brick2: fs-15.example.com:/mnt/brick20
> Brick3: fs-14.example.com:/mnt/brick33
> 
> 
> ------------- Trying to remove a brick -------------
> 
> fields-config-gluster.rb[5035]: Using commandline: gluster volume remove-brick test-fs-cluster-1 replica 2 fs-14.example.com:/mnt/brick33 start
> fields-config-gluster.rb[5035]: Command returned exit code 255: gluster volume remove-brick test-fs-cluster-1 replica 2 fs-14.example.com:/mnt/brick33 start
> stdout was:
> 
> stderr was:
> Remove Brick start unsuccessful
> 
> 
> 
> 
> ------------- Volume turned Distributed-Replicate -------------
> 
> [12:23:37] [email protected]:~# gluster volume info
> 
> Volume Name: test-fs-cluster-1
> Type: Distributed-Replicate
> Volume ID: 752e7ffd-04bb-4234-8d16-d1f49ef510b7
> Status: Started
> Number of Bricks: 1 x 2 = 3
> Transport-type: tcp
> Bricks:
> Brick1: fs-14.example.com:/mnt/brick21
> Brick2: fs-15.example.com:/mnt/brick20
> Brick3: fs-14.example.com:/mnt/brick33
> 
> 
> ------------- Trying to remove brick again -------------
> 
> [12:26:20] [email protected]:~# gluster volume remove-brick test-fs-cluster-1 replica 2 fs-14.example.com:/mnt/brick33 start
> number of bricks provided (1) is not valid. need at least 2 (or 2xN)
> 
> ------------- Trying to stop volume -------------
> 
> [12:28:34] [email protected]:~# gluster volume stop test-fs-cluster-1
> Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
> Stopping volume test-fs-cluster-1 has been successful
> 
> 
> ------------- Trying to start volume again -------------
> 
> [12:29:03] [email protected]:~# gluster volume start test-fs-cluster-1
> Starting volume test-fs-cluster-1 has been unsuccessful
> 
> ------------- Trying to stop volume again -------------
> 
> [12:29:49] [email protected]:~# gluster volume stop test-fs-cluster-1
> Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
> Volume test-fs-cluster-1 is not in the started state
> 
> ------------- Trying to delete volume -------------
> 
> [12:29:55] [email protected]:~# gluster volume delete test-fs-cluster-1
> Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
> Volume test-fs-cluster-1 has been started. Volume needs to be stopped before deletion.
> 
> ------------- Checking volume info -------------
> 
> # gluster volume info
> 
> Volume Name: test-fs-cluster-1
> Type: Distributed-Replicate
> Volume ID: 752e7ffd-04bb-4234-8d16-d1f49ef510b7
> Status: Started
> Number of Bricks: 1 x 2 = 3
> Transport-type: tcp
> Bricks:
> Brick1: fs-14.example.com:/mnt/brick21
> Brick2: fs-15.example.com:/mnt/brick20
> Brick3: fs-14.example.com:/mnt/brick33
> 
> ------------- Trying to stop volume again -------------
> 
> [12:30:50] [email protected]:~# gluster volume stop test-fs-cluster-1
> Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
> Volume test-fs-cluster-1 is not in the started state
> 
> 
> 
> ------------- Restarting glusterfs-server -------------
> 
> [12:38:05] [email protected]:~# /etc/init.d/glusterfs-server restart
> glusterfs-server start/running, process 6426
> 
> ------------- Volume switched back to "Replicate" -------------
> 
> [12:38:33] [email protected]:~# gluster volume info
> 
> Volume Name: test-fs-cluster-1
> Type: Replicate
> Volume ID: 752e7ffd-04bb-4234-8d16-d1f49ef510b7
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: fs-14.example.com:/mnt/brick21
> Brick2: fs-15.example.com:/mnt/brick20
> Brick3: fs-14.example.com:/mnt/brick33
> 
> 
> ------------- Trying to stop volume again -------------
> 
> [12:38:39] [email protected]:~# gluster volume stop test-fs-cluster-1
> Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
> Volume test-fs-cluster-1 is not in the started state
> 
> 
> 
> Any idea what's up with that?
> 
> Cheers,
> Marc
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
> 

_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
