Re: [Gluster-users] Rebalancing newly added bricks
> Hi,
>
> Rebalance will abort itself if it cannot reach any of the nodes. Are all
> the bricks still up and reachable?
>
> Regards,
> Nithya

Yes, the bricks appear to be fine. I restarted the rebalance and the process
is moving along again:

# gluster vol rebalance tank status
     Node   Rebalanced-files     size    scanned   failures   skipped        status   run time in h:m:s
---------   ----------------   ------   --------   --------   -------   -----------   -----------------
localhost             226973   14.9TB    1572952          0         0   in progress            44:26:48
  serverB                  0   0Bytes     631667          0         0     completed             37:2:14
volume rebalance: tank: success

# df -hP |grep data
/dev/mapper/gluster_vg-gluster_lv1_data   60T   24T   36T   40%   /gluster_bricks/data1
/dev/mapper/gluster_vg-gluster_lv2_data   60T   24T   36T   40%   /gluster_bricks/data2
/dev/mapper/gluster_vg-gluster_lv3_data   60T   17T   43T   29%   /gluster_bricks/data3
/dev/mapper/gluster_vg-gluster_lv4_data   60T   17T   43T   29%   /gluster_bricks/data4
/dev/mapper/gluster_vg-gluster_lv5_data   60T   19T   41T   31%   /gluster_bricks/data5
/dev/mapper/gluster_vg-gluster_lv6_data   60T   19T   41T   31%   /gluster_bricks/data6

Thanks,

HB
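[Editor's note: a minimal sketch of the checks worth running before restarting a
rebalance after a ping-timeout disconnect like the one in the logs above. The
"42 seconds" in the log matches the default network.ping-timeout; raising it is
an assumption, only appropriate if the bricks are healthy but slow to answer
under rebalance load.]

# Confirm every brick process is online before restarting
gluster volume status tank | grep '^Brick'

# The log's "has not responded in the last 42 seconds" matches the default
# client ping timeout; check the current value
gluster volume get tank network.ping-timeout

# (Assumption) raise it modestly if disconnects recur under heavy rebalance I/O
gluster volume set tank network.ping-timeout 60

# Then restart and monitor (force was used earlier in this thread)
gluster volume rebalance tank start force
gluster volume rebalance tank status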
Re: [Gluster-users] Rebalancing newly added bricks
On Sat, 14 Sep 2019 at 01:25, Herb Burnswell wrote:

> Hi,
>
> Well our rebalance seems to have failed. Here is the output:
>

Hi,

Rebalance will abort itself if it cannot reach any of the nodes. Are all the
bricks still up and reachable?

Regards,
Nithya
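[Editor's note: a quick way to answer the "are all the bricks still up and
reachable" question from both nodes - a minimal sketch using standard
commands; nothing here is specific to this setup beyond the volume name.]

# Peers first: each node should show "Peer in Cluster (Connected)"
gluster peer status

# Then brick processes: the Online column should read "Y" for all twelve bricks
gluster volume status tank | grep '^Brick'

# And basic network reachability from each node to the other
ping -c 3 serverB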
Re: [Gluster-users] Rebalancing newly added bricks
Hi,

Well our rebalance seems to have failed. Here is the output:

# gluster vol rebalance tank status
     Node   Rebalanced-files     size    scanned   failures   skipped      status   run time in h:m:s
---------   ----------------   ------   --------   --------   -------   ---------   -----------------
localhost            1348706   57.8TB    2234439          9         6      failed            190:24:3
  serverB                  0   0Bytes          7          0         0   completed            63:47:55
volume rebalance: tank: success

# gluster vol status tank
Status of volume: tank
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick serverA:/gluster_bricks/data1         49162     0          Y       20318
Brick serverB:/gluster_bricks/data1         49166     0          Y       3432
Brick serverA:/gluster_bricks/data2         49163     0          Y       20323
Brick serverB:/gluster_bricks/data2         49167     0          Y       3435
Brick serverA:/gluster_bricks/data3         49164     0          Y       4625
Brick serverA:/gluster_bricks/data4         49165     0          Y       4644
Brick serverA:/gluster_bricks/data5         49166     0          Y       5088
Brick serverA:/gluster_bricks/data6         49167     0          Y       5128
Brick serverB:/gluster_bricks/data3         49168     0          Y       22314
Brick serverB:/gluster_bricks/data4         49169     0          Y       22345
Brick serverB:/gluster_bricks/data5         49170     0          Y       22889
Brick serverB:/gluster_bricks/data6         49171     0          Y       22932
Self-heal Daemon on localhost               N/A       N/A        Y       6202
Self-heal Daemon on serverB                 N/A       N/A        Y       22981

Task Status of Volume tank
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : eec64343-8e0d-4523-ad05-5678f9eb9eb2
Status               : failed

# df -hP |grep data
/dev/mapper/gluster_vg-gluster_lv1_data   60T   31T   29T   52%   /gluster_bricks/data1
/dev/mapper/gluster_vg-gluster_lv2_data   60T   31T   29T   51%   /gluster_bricks/data2
/dev/mapper/gluster_vg-gluster_lv3_data   60T   15T   46T   24%   /gluster_bricks/data3
/dev/mapper/gluster_vg-gluster_lv4_data   60T   15T   46T   24%   /gluster_bricks/data4
/dev/mapper/gluster_vg-gluster_lv5_data   60T   15T   45T   25%   /gluster_bricks/data5
/dev/mapper/gluster_vg-gluster_lv6_data   60T   15T   45T   25%   /gluster_bricks/data6

The rebalance log on serverA shows a disconnect from serverB:

[2019-09-08 15:41:44.285591] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-tank-client-10: server :49170 has not responded in the last 42 seconds, disconnecting.
[2019-09-08 15:41:44.285739] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 0-tank-client-10: disconnected from tank-client-10. Client process will keep trying to connect to glusterd until brick's port is available
[2019-09-08 15:41:44.286023] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7ff986e8b132] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7ff986c5299e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7ff986c52aae] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7ff986c54220] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2b0)[0x7ff986c54ce0] ) 0-tank-client-10: forced unwinding frame type(GlusterFS 3.3) op(FXATTROP(34)) called at 2019-09-08 15:40:44.040333 (xid=0x7f8cfac)

Does this type of failure cause data corruption? What is the best course of
action at this point?

Thanks,

HB

On Wed, Sep 11, 2019 at 11:58 PM Strahil wrote:

> Hi Nithya,
>
> Thanks for the detailed explanation.
> It makes sense.
>
> Best Regards,
> Strahil Nikolov
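[Editor's note: on the corruption question above, a hedged sketch of checks
that help answer it. The brick name in the first command is an assumption
based on port 49170 in the status output; the log path is the usual default
for rebalance logs.]

# tank-client-10 timed out on port 49170, which the status output maps to
# serverB:/gluster_bricks/data5 (assumption based on the port match)
gluster volume status tank serverB:/gluster_bricks/data5

# See whether the disconnect left any entries pending self-heal on the replicas
gluster volume heal tank info

# Look for error-level entries and failed migrations in the rebalance log on
# serverA (default log location; adjust if your logs live elsewhere)
grep -c ' E ' /var/log/glusterfs/tank-rebalance.log
grep -i 'migrate' /var/log/glusterfs/tank-rebalance.log | grep -i 'fail' | tail -20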
Re: [Gluster-users] Rebalancing newly added bricks
On Wed, 11 Sep 2019 at 09:47, Strahil wrote:

> Hi Nithya,
>
> I just reminded about your previous e-mail which left me with the
> impression that old volumes need that. This is the one I mean:
>
> >It looks like this is a replicate volume. If that is the case then yes,
> >you are running an old version of Gluster for which this was the default
> >behaviour.
>
> Best Regards,
> Strahil Nikolov

Hi Strahil,

I'm providing a little more detail here which I hope will explain things.

Rebalance was always a volume-wide operation - a *rebalance start* operation
will start rebalance processes on all nodes of the volume. However, different
processes would behave differently. In earlier releases, all nodes would crawl
the bricks and update the directory layouts, but only one node in each
replica/disperse set would actually migrate files, so the rebalance status
would only show one node doing any "work" (scanning, rebalancing, etc.).
That one node would, however, process all the files in its replica sets.
Rerunning rebalance on other nodes would make no difference, as it will
always be the same node that ends up migrating files. So, for instance, for a
replicate volume with server1:/brick1, server2:/brick2 and server3:/brick3 in
that order, only the rebalance process on server1 would migrate files.

In newer releases, all 3 nodes would migrate files.

The rebalance status does not capture the directory operations of fixing
layouts, which is why it looks like the other nodes are not doing anything.

Hope this helps.

Regards,
Nithya
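[Editor's note: to make the "first brick in each replica set" point concrete,
here is a small hypothetical helper (not part of Gluster) that lists a
volume's bricks grouped into replica pairs in definition order. On the older
releases described above, the host of the first brick in each pair is the
node whose rebalance process migrates the files for that pair.]

#!/bin/bash
# List the bricks of a replica-2 volume grouped into their replica sets.
# Assumes a replica count of 2, as with the "tank" volume in this thread;
# for replica 3, add a third "-" to the paste command.
VOL=${1:-tank}

gluster volume info "$VOL" \
    | awk '/^Brick[0-9]+:/ {print $2}' \
    | paste - - \
    | nl -w2 -s': '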
Re: [Gluster-users] Rebalancing newly added bricks
On Sat, 7 Sep 2019 at 00:03, Strahil Nikolov wrote:

> As it was mentioned, you might have to run rebalance on the other node -
> but it is better to wait until this node is over.
>
> Best Regards,
> Strahil Nikolov

Hi Strahil,

Rebalance does not need to be run on the other node - the operation is a
volume-wide one. Only a single node per replica set would migrate files in
the version used in this case.

Regards,
Nithya
Re: [Gluster-users] Rebalancing newly added bricks
On Thu, Sep 5, 2019 at 9:56 PM Nithya Balachandran wrote:

> What is the output of the rebalance status command?
> Can you check if there are any errors in the rebalance logs on the node
> on which you see rebalance activity?
> If there are a lot of small files on the volume, the rebalance is expected
> to take time.
>
> Regards,
> Nithya
>

My apologies, that was a typo. I meant to say:

"The rebalance is running and the brick percentages are NOW adjusting as
expected"

I did expect the rebalance to take several days. The rebalance log is not
showing any errors. Status output:

# gluster vol rebalance tank status
     Node   Rebalanced-files     size    scanned   failures   skipped        status   run time in h:m:s
---------   ----------------   ------   --------   --------   -------   -----------   -----------------
localhost            1251320   35.5TB    2079527          0         0   in progress            139:9:46
  serverB                  0   0Bytes          7          0         0     completed            63:47:55
volume rebalance: tank: success

Thanks again for the guidance.

HB
Re: [Gluster-users] Rebalancing newly added bricks
On Thu, 5 Sep 2019 at 02:41, Herb Burnswell wrote:

> Thanks for the replies. The rebalance is running and the brick
> percentages are not adjusting as expected:
>
> # df -hP |grep data
> /dev/mapper/gluster_vg-gluster_lv1_data   60T    49T   11T   83%   /gluster_bricks/data1
> /dev/mapper/gluster_vg-gluster_lv2_data   60T    49T   11T   83%   /gluster_bricks/data2
> /dev/mapper/gluster_vg-gluster_lv3_data   60T   4.6T   55T    8%   /gluster_bricks/data3
> /dev/mapper/gluster_vg-gluster_lv4_data   60T   4.6T   55T    8%   /gluster_bricks/data4
> /dev/mapper/gluster_vg-gluster_lv5_data   60T   4.6T   55T    8%   /gluster_bricks/data5
> /dev/mapper/gluster_vg-gluster_lv6_data   60T   4.6T   55T    8%   /gluster_bricks/data6
>
> At the current pace it looks like this will continue to run for another
> 5-6 days.
>
> I appreciate the guidance..
>

What is the output of the rebalance status command?

Can you check if there are any errors in the rebalance logs on the node on
which you see rebalance activity?

If there are a lot of small files on the volume, the rebalance is expected to
take time.

Regards,
Nithya
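[Editor's note: a concrete way to run the log check suggested above - a
minimal sketch assuming the default log location,
/var/log/glusterfs/<volume>-rebalance.log, on the node doing the migration.]

LOG=/var/log/glusterfs/tank-rebalance.log

# Count error-level entries and show the most recent ones
grep -c ' E ' "$LOG"
grep ' E ' "$LOG" | tail -20

# Look specifically for failed file migrations
grep -i 'migrate' "$LOG" | grep -i 'fail' | tail -20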
Re: [Gluster-users] Rebalancing newly added bricks
Thanks for the replies. The rebalance is running and the brick percentages
are not adjusting as expected:

# df -hP |grep data
/dev/mapper/gluster_vg-gluster_lv1_data   60T    49T   11T   83%   /gluster_bricks/data1
/dev/mapper/gluster_vg-gluster_lv2_data   60T    49T   11T   83%   /gluster_bricks/data2
/dev/mapper/gluster_vg-gluster_lv3_data   60T   4.6T   55T    8%   /gluster_bricks/data3
/dev/mapper/gluster_vg-gluster_lv4_data   60T   4.6T   55T    8%   /gluster_bricks/data4
/dev/mapper/gluster_vg-gluster_lv5_data   60T   4.6T   55T    8%   /gluster_bricks/data5
/dev/mapper/gluster_vg-gluster_lv6_data   60T   4.6T   55T    8%   /gluster_bricks/data6

At the current pace it looks like this will continue to run for another
5-6 days.

I appreciate the guidance..

HB

On Mon, Sep 2, 2019 at 9:08 PM Nithya Balachandran wrote:

> It looks like this is a replicate volume. If that is the case then yes,
> you are running an old version of Gluster for which this was the default
> behaviour.
>
> Regards,
> Nithya
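[Editor's note: since the remaining runtime is being estimated from brick
usage, a small hypothetical monitoring loop like the one below can log usage
and rebalance status over time; the output path and interval are arbitrary
choices.]

#!/bin/bash
# Record brick fill levels and rebalance status once an hour so progress
# toward the 5-6 day estimate can be tracked. Stop it once the rebalance
# reports "completed" on all nodes.
VOL=tank
OUT=/var/tmp/${VOL}-rebalance-progress.log

while true; do
    {
        date
        df -hP | grep gluster_bricks
        gluster volume rebalance "$VOL" status
        echo
    } >> "$OUT"
    sleep 3600
done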
Re: [Gluster-users] Rebalancing newly added bricks
On Sat, 31 Aug 2019 at 22:59, Herb Burnswell wrote:

> Thank you for the reply.
>
> I started a rebalance with force on serverA as suggested. Now I see
> 'activity' on that node:
>
> # gluster vol rebalance tank status
>      Node   Rebalanced-files     size   scanned   failures   skipped        status   run time in h:m:s
> ---------   ----------------   ------   -------   --------   -------   -----------   -----------------
> localhost               6143    6.1GB      9542          0         0   in progress               0:4:5
>   serverB                  0   0Bytes         7          0         0   in progress               0:4:5
> volume rebalance: tank: success
>
> But I am not seeing any activity on serverB. Is this expected? Does the
> rebalance need to run on each node even though it says both nodes are 'in
> progress'?
>
> Thanks,
>
> HB

It looks like this is a replicate volume. If that is the case then yes, you
are running an old version of Gluster for which this was the default
behaviour.

Regards,
Nithya
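[Editor's note: for readers hitting the same question, a quick sketch of how
to confirm the two facts being inferred here - the installed release and the
volume layout. Both commands are standard; only the volume name is specific
to this thread.]

# Installed Gluster release (3.8.15 in this thread)
gluster --version | head -1

# Volume layout: a Distributed-Replicate type with "N x 2" bricks confirms
# the replica-2 sets being discussed
gluster volume info tank | grep -E '^(Type|Number of Bricks)'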
Re: [Gluster-users] Rebalancing newly added bricks
Thank you for the reply.

I started a rebalance with force on serverA as suggested. Now I see
'activity' on that node:

# gluster vol rebalance tank status
     Node   Rebalanced-files     size   scanned   failures   skipped        status   run time in h:m:s
---------   ----------------   ------   -------   --------   -------   -----------   -----------------
localhost               6143    6.1GB      9542          0         0   in progress               0:4:5
  serverB                  0   0Bytes         7          0         0   in progress               0:4:5
volume rebalance: tank: success

But I am not seeing any activity on serverB. Is this expected? Does the
rebalance need to run on each node even though it says both nodes are 'in
progress'?

Thanks,

HB

On Sat, Aug 31, 2019 at 4:18 AM Strahil wrote:

> The rebalance status shows 0 Bytes.
>
> Maybe you should try with 'gluster volume rebalance <volname> start force' ?
>
> Best Regards,
> Strahil Nikolov
>
> Source:
> https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#rebalancing-volumes
[Gluster-users] Rebalancing newly added bricks
All,

RHEL 7.5
Gluster 3.8.15
2 Nodes: serverA & serverB

I am not deeply knowledgeable about Gluster and its administration, but we
have a 2 node cluster that's been running for about a year and a half. All
has worked fine to date. Our main volume has consisted of two 60TB bricks on
each of the cluster nodes. As we reached capacity on the volume we needed to
expand, so we've added four new 60TB bricks to each of the cluster nodes.
The bricks are now seen, and the total size of the volume is as expected:

# gluster vol status tank
Status of volume: tank
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick serverA:/gluster_bricks/data1         49162     0          Y       20318
Brick serverB:/gluster_bricks/data1         49166     0          Y       3432
Brick serverA:/gluster_bricks/data2         49163     0          Y       20323
Brick serverB:/gluster_bricks/data2         49167     0          Y       3435
Brick serverA:/gluster_bricks/data3         49164     0          Y       4625
Brick serverA:/gluster_bricks/data4         49165     0          Y       4644
Brick serverA:/gluster_bricks/data5         49166     0          Y       5088
Brick serverA:/gluster_bricks/data6         49167     0          Y       5128
Brick serverB:/gluster_bricks/data3         49168     0          Y       22314
Brick serverB:/gluster_bricks/data4         49169     0          Y       22345
Brick serverB:/gluster_bricks/data5         49170     0          Y       22889
Brick serverB:/gluster_bricks/data6         49171     0          Y       22932
Self-heal Daemon on localhost               N/A       N/A        Y       22981
Self-heal Daemon on serverA.example.com     N/A       N/A        Y       6202

After adding the bricks we ran a rebalance from serverA as:

# gluster volume rebalance tank start

The rebalance completed:

# gluster volume rebalance tank status
               Node   Rebalanced-files     size   scanned   failures   skipped      status   run time in h:m:s
-------------------   ----------------   ------   -------   --------   -------   ---------   -----------------
          localhost                  0   0Bytes         0          0         0   completed              3:7:10
serverA.example.com                  0   0Bytes         0          0         0   completed               0:0:0
volume rebalance: tank: success

However, when I run a df, the two original bricks still show all of the
consumed space (this is the same on both nodes):

# df -hP
Filesystem                                 Size   Used  Avail  Use%  Mounted on
/dev/mapper/vg0-root                       5.0G   625M   4.4G   13%  /
devtmpfs                                    32G      0    32G    0%  /dev
tmpfs                                       32G      0    32G    0%  /dev/shm
tmpfs                                       32G    67M    32G    1%  /run
tmpfs                                       32G      0    32G    0%  /sys/fs/cgroup
/dev/mapper/vg0-usr                         20G   3.6G    17G   18%  /usr
/dev/md126                                1014M   228M   787M   23%  /boot
/dev/mapper/vg0-home                       5.0G    37M   5.0G    1%  /home
/dev/mapper/vg0-opt                        5.0G    37M   5.0G    1%  /opt
/dev/mapper/vg0-tmp                        5.0G    33M   5.0G    1%  /tmp
/dev/mapper/vg0-var                         20G   2.6G    18G   13%  /var
/dev/mapper/gluster_vg-gluster_lv1_data     60T    59T   1.1T   99%  /gluster_bricks/data1
/dev/mapper/gluster_vg-gluster_lv2_data     60T    58T   1.3T   98%  /gluster_bricks/data2
/dev/mapper/gluster_vg-gluster_lv3_data     60T   451M    60T    1%  /gluster_bricks/data3
/dev/mapper/gluster_vg-gluster_lv4_data     60T   451M    60T    1%  /gluster_bricks/data4
/dev/mapper/gluster_vg-gluster_lv5_data     60T   451M    60T    1%  /gluster_bricks/data5
/dev/mapper/gluster_vg-gluster_lv6_data     60T   451M    60T    1%  /gluster_bricks/data6
localhost:/tank                            355T   116T   239T   33%  /mnt/tank

We were thinking that the used space would be distributed across the now 6
bricks after rebalance. Is that not what a rebalance does? Is this expected
behavior?

Can anyone provide some guidance as to what the behavior here is, and whether
there is anything that we need to do at this point?

Thanks in advance,

HB

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
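[Editor's note: for anyone landing on this thread with the same task, a
condensed, hedged recap of the expansion workflow that emerges from the
discussion. The brick paths follow the naming used above, and "start force"
is included because in this thread a plain "start" finished without moving
any data.]

# Add the new bricks in replica pairs (replica-2 volume), e.g. for data3:
gluster volume add-brick tank \
    serverA:/gluster_bricks/data3 serverB:/gluster_bricks/data3

# Kick off the data migration; in this thread plain "start" completed with
# 0 bytes moved, and "start force" is what actually migrated files
gluster volume rebalance tank start force

# Poll until every node reports "completed", then re-check brick usage
gluster volume rebalance tank status
df -hP | grep gluster_bricks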