Hi,
On Mon, 4 Feb 2019 at 16:39, mohammad kashif <kashif.a...@gmail.com> wrote:

> Hi Nithya
>
> Thanks for replying so quickly. It is very much appreciated.
>
> There are lots of "[No space left on device]" errors which I cannot
> understand, as there is plenty of space on all of the nodes.
>
This means that Gluster could not find sufficient space for the file. Would
you be willing to share your rebalance log file?

Please provide the following information (a sketch of the commands for
gathering these follows after the quoted thread below):
- The gluster version
- The gluster volume info for the volume
- How full are the individual bricks of the volume?

> A little bit of background will be useful in this case. I had a cluster of
> seven nodes of varying capacity (73, 73, 73, 46, 46, 46, 46 TB). The
> cluster was almost 90% full, so every node had roughly 8 to 15 TB of free
> space. I added two new nodes with 100 TB each and ran fix-layout, which
> completed successfully.
>
> After that I started the remove-brick operation. I don't think that at any
> point any of the nodes was 100% full. Looking at my Ganglia graph, there
> was always at least 5 TB available on every node.
>
> I was keeping an eye on the remove-brick status; for a very long time
> there were no failures, and then at some point these 17000 failures
> appeared and the count stayed there.
>
> Thanks
>
> Kashif
>
> On Mon, Feb 4, 2019 at 5:09 AM Nithya Balachandran <nbala...@redhat.com>
> wrote:
>
>> Hi,
>>
>> The status shows quite a few failures. Please check the rebalance logs
>> to see why that happened. We can decide what to do based on the errors.
>> Once you run a commit, the brick will no longer be part of the volume
>> and you will not be able to access those files via the client.
>> Do you have sufficient space on the remaining bricks for the files on
>> the removed brick?
>>
>> Regards,
>> Nithya
>>
>> On Mon, 4 Feb 2019 at 03:50, mohammad kashif <kashif.a...@gmail.com>
>> wrote:
>>
>>> Hi
>>>
>>> I have a pure distributed gluster volume with nine nodes and am trying
>>> to remove one node. I ran:
>>>
>>> gluster volume remove-brick atlasglust nodename:/glusteratlas/brick007/gv0 start
>>>
>>> It completed, but with around 17000 failures:
>>>
>>>     Node       Rebalanced-files   size     scanned   failures   skipped   status      run time in h:m:s
>>>     --------   ----------------   ------   -------   --------   -------   ---------   ------------------
>>>     nodename   4185858            27.5TB   6746030   17488      0         completed   405:15:34
>>>
>>> I can see that there is still 1.5 TB of data on the node which I was
>>> trying to remove.
>>>
>>> I am not sure what to do now. Should I run the remove-brick command
>>> again so that the files which failed can be retried?
>>>
>>> Or should I run commit first and then try to remove the node again?
>>>
>>> Please advise, as I don't want to lose any files.
>>>
>>> Thanks
>>>
>>> Kashif
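For reference, a minimal sketch of commands for collecting the information
requested above. It assumes the volume name atlasglust and the brick path
from the thread; the rebalance log location shown is the usual default and
may differ on your installation:

    # Gluster version on each node
    gluster --version

    # Volume layout and options
    gluster volume info atlasglust

    # Per-brick disk usage as reported by Gluster
    gluster volume status atlasglust detail

    # Or check a brick filesystem directly on the node that holds it
    df -h /glusteratlas/brick007/gv0

    # The remove-brick/rebalance log is usually written under /var/log/glusterfs/
    # on the node being drained; search it for the ENOSPC errors
    grep -i "No space left on device" /var/log/glusterfs/atlasglust-rebalance.log | tail -n 20
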
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users