Re: [Gluster-users] Rebalancing with no new bricks.
Thanks Jeff,

Through some happy coincidences, a few brutal threats, and some executive
liquidations, the problem was averted. We dropped 40TB in an hour and the
storage load over the bricks equalized as it did so. So we're good for a
while.

However, I'll look into your suggestions (THANKS!!) and experiment with
them on some test files.

Re: SvK's suggestions, I often do on-brick file manipulations to resolve
problems and especially do recursive ops. It's not a user-level solution,
but it does allow sysadmin flexibility. I.e., we use a small util to fork
recursive 'du's to each brick and then sum them afterwards. It's easily
10-1000x faster than doing it via gluster, especially on our ZOT file dirs.

As is the usual case, it was written for our specific case, but if anyone
wants it as a skel for their instance, happy to share.

hjm

On Wed, Jun 12, 2013 at 6:04 AM, Jeff Darcy wrote:

> On 06/11/2013 05:55 PM, Harry Mangalam wrote:
> > I understand that gluster does not allow gluster-rebalancing in the same
> > config (ie, without adding a brick), but would moving some offending
> > files to another fs and then copying them back tend to rebalance the FS
> > by distributing the copied files more evenly?
>
> The problem is that no files will move unless the layout (the hash-range
> information used to place files) changes, and the layout won't change
> unless the set of bricks does, so a rebalance without adding/removing
> bricks would end up being a no-op. However, there are two other ways
> this kind of imbalance can be addressed.
>
> Firstly, we can observe that GlusterFS will still find files even if
> they're in the "wrong" place according to the layout. It is possible to
> move files manually from one brick to another, so long as appropriate
> care is taken e.g. to ensure that all extended attributes are preserved.
> The problem with this approach is that such careful placement might be
> undone the next time you do a rebalance for other reasons.
>
> The second approach avoids that problem, but is a bit trickier in other
> ways. There is a feature that allows a user to define their own layout
> for a directory, so that it will be left alone by rebalance. There's
> even a script in your friendly neighborhood source tree
> (extras/rebalance.py) that will use this to set a layout that's weighted
> according to the size of bricks (or free space) instead of just treating
> all as equal. You can use that as an example of how to construct and
> apply a valid user-defined layout, plus other techniques that are likely
> to be useful. Lastly, we have a feature that setting the fake
> "distribute.migrate-data" xattr on a file will cause us to re-evaluate
> where it should go and migrate it to the correct brick if necessary.
> Since you seem to be dealing with a relatively small number of files
> that need to be moved, it shouldn't be too hard to combine this little
> bag of tricks into a solution that meets your needs. Just let me know
> if you'd like me to assist.

--
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
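
The per-brick 'du' utility mentioned in the message above was not posted to the list, so the following is only a rough sketch of the idea, not Harry's tool. It assumes every brick is reachable as a local directory path (on a real cluster the bricks usually live on different servers), it sums apparent file sizes rather than allocated blocks, and the names `brick_usage` and `volume_usage` are made up for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def brick_usage(brick_root, subdir=""):
    """Sum apparent file sizes (bytes) under subdir on a single brick."""
    total = 0
    start = os.path.join(brick_root, subdir)
    # os.walk yields nothing if the directory is absent on this brick.
    for dirpath, dirnames, filenames in os.walk(start):
        # Skip GlusterFS's internal .glusterfs tree: it holds hard links
        # to the same data, which would otherwise be counted twice.
        dirnames[:] = [d for d in dirnames if d != ".glusterfs"]
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; leave it out
    return total


def volume_usage(brick_roots, subdir=""):
    """Scan all bricks concurrently; return (total, per-brick sizes)."""
    workers = max(1, len(brick_roots))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sizes = list(pool.map(lambda b: brick_usage(b, subdir), brick_roots))
    return sum(sizes), dict(zip(brick_roots, sizes))
```

Calling `volume_usage(["/bricks/b1", "/bricks/b2"], "some/dir")` (hypothetical paths) would return the combined size plus a per-brick breakdown, which also shows how unevenly a directory is spread. On a replicated volume each replica would be counted, so the total would need dividing by the replica count.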
Re: [Gluster-users] Rebalancing with no new bricks.
On Wed, 12 Jun 2013 09:57:15 -0400 Jeff Darcy wrote:

> On 06/12/2013 09:46 AM, Stephan von Krawczynski wrote:
> > The true question is indeed: why does he need "tricks" at all to come to
> > something obvious for humans: a way of distributing files over the
> > glusterfs so that "full" means really all bricks are full.
>
> That's my view too. I keep trying.
>
> > 3) The data must be left accessible even if glusterfs is not used on the
> > bricks any longer - without copying "back".
>
> This part is already true in practically all cases. You can ignore the
> .glusterfs directory and extra xattrs, or nuke them, and you have a
> perfectly normal file/directory structure that's usable as-is. The
> exceptions are if you use striping or erasure coding, but (like RAID)
> those are fundamentally ways of slicing and dicing data across storage
> units so some reassembly would be necessary.

I only tried to list _the_ major advantages glusterfs could/should have
over almost all other competitors. Of special importance are "1)" and "3)"
because they allow everyone to "go and try" without having to fiddle around
with tons of data. "2)" is convenience, something a good piece of software
should deliver :-)

--
Regards,
Stephan
Re: [Gluster-users] Rebalancing with no new bricks.
On 06/12/2013 09:46 AM, Stephan von Krawczynski wrote:

> The true question is indeed: why does he need "tricks" at all to come to
> something obvious for humans: a way of distributing files over the
> glusterfs so that "full" means really all bricks are full.

That's my view too. I keep trying.

> 3) The data must be left accessible even if glusterfs is not used on the
> bricks any longer - without copying "back".

This part is already true in practically all cases. You can ignore the
.glusterfs directory and extra xattrs, or nuke them, and you have a
perfectly normal file/directory structure that's usable as-is. The
exceptions are if you use striping or erasure coding, but (like RAID) those
are fundamentally ways of slicing and dicing data across storage units so
some reassembly would be necessary.
Re: [Gluster-users] Rebalancing with no new bricks.
On Wed, 12 Jun 2013 09:04:30 -0400 Jeff Darcy wrote:

> [...]
> that need to be moved, it shouldn't be too hard to combine this little
> bag of tricks into a solution that meets your needs. Just let me know
> if you'd like me to assist.

The true question is indeed: why does he need "tricks" at all to come to
something obvious for humans: a way of distributing files over the
glusterfs so that "full" means really all bricks are full. It cannot be the
right way to design software (for humans) so that they have to adapt to the
software. Instead the software should be able to adapt to the users' needs
and situation. It is very obvious today that bricks can be of different
size. In fact I always thought it would be a big advantage of glusterfs to
be able to use what's already there and make more out of it (just as linux
did from the first day on). Which means for me:

1) It must be easy to deploy to an already filled fileserver => no need to
   copy data over onto the "new" glusterfs (soft migration).
2) Whatever the layout of the bricks, glusterfs must be able to follow the
   obvious: if there is space left then use it.
3) The data must be left accessible even if glusterfs is not used on the
   bricks any longer - without copying "back".

--
Regards,
Stephan
Re: [Gluster-users] Rebalancing with no new bricks.
On 06/11/2013 05:55 PM, Harry Mangalam wrote:

> I understand that gluster does not allow gluster-rebalancing in the same
> config (ie, without adding a brick), but would moving some offending
> files to another fs and then copying them back tend to rebalance the FS
> by distributing the copied files more evenly?

The problem is that no files will move unless the layout (the hash-range
information used to place files) changes, and the layout won't change
unless the set of bricks does, so a rebalance without adding/removing
bricks would end up being a no-op. However, there are two other ways this
kind of imbalance can be addressed.

Firstly, we can observe that GlusterFS will still find files even if
they're in the "wrong" place according to the layout. It is possible to
move files manually from one brick to another, so long as appropriate care
is taken e.g. to ensure that all extended attributes are preserved. The
problem with this approach is that such careful placement might be undone
the next time you do a rebalance for other reasons.

The second approach avoids that problem, but is a bit trickier in other
ways. There is a feature that allows a user to define their own layout for
a directory, so that it will be left alone by rebalance. There's even a
script in your friendly neighborhood source tree (extras/rebalance.py) that
will use this to set a layout that's weighted according to the size of
bricks (or free space) instead of just treating all as equal. You can use
that as an example of how to construct and apply a valid user-defined
layout, plus other techniques that are likely to be useful. Lastly, we have
a feature that setting the fake "distribute.migrate-data" xattr on a file
will cause us to re-evaluate where it should go and migrate it to the
correct brick if necessary.
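
To make the weighted-layout idea above concrete, here is a minimal sketch of the arithmetic only. It is not the code from extras/rebalance.py and it does not write anything to a volume. It assumes the hash space is 32 bits wide and that a layout is one contiguous range per brick; the function name `weighted_layout` and the brick names are hypothetical.

```python
HASH_SPACE = 2 ** 32  # assumption: placement hashes are 32-bit values


def weighted_layout(brick_weights):
    """Split [0, HASH_SPACE) into contiguous ranges proportional to weight.

    brick_weights maps a brick name to a positive weight, e.g. its size or
    its free space in bytes. Returns {brick: (start, end)}, bounds inclusive.
    """
    if not brick_weights:
        raise ValueError("at least one brick is required")
    if any(w <= 0 for w in brick_weights.values()):
        raise ValueError("every weight must be positive")
    total = sum(brick_weights.values())
    items = list(brick_weights.items())
    layout = {}
    start = 0
    cumulative = 0
    for index, (brick, weight) in enumerate(items):
        cumulative += weight
        if index == len(items) - 1:
            # The last brick takes whatever remains, so rounding can
            # never leave a gap at the top of the hash space.
            end = HASH_SPACE
        else:
            end = int(cumulative * HASH_SPACE // total)
        layout[brick] = (start, end - 1)
        start = end
    return layout
```

With weights `{"brick1": 1, "brick2": 3}` the first brick gets a quarter of the hash space and the second gets the rest, so new files would land on the larger brick about three times as often.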
Since you seem to be dealing with a relatively small number of files that
need to be moved, it shouldn't be too hard to combine this little bag of
tricks into a solution that meets your needs. Just let me know if you'd
like me to assist.
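
For a small list of files, the migrate-data trick described above could be driven by a loop like the sketch below. Treat it as an illustration under assumptions: the xattr name is taken verbatim from the message, but the value `b"force"` is a guess, and the exact name and value may differ between GlusterFS versions. The paths would presumably be given through a client mount rather than directly on a brick. `request_migration` and its `setter` parameter are made-up names; `setter` defaults to Python's `os.setxattr` (Linux only) and can be swapped out for a dry run.

```python
import os


def request_migration(paths, xattr_name="distribute.migrate-data",
                      xattr_value=b"force", setter=None):
    """Set the migration-trigger xattr on each path.

    Returns a dict mapping each failed path to the OSError it raised.
    xattr_name is quoted from the email; xattr_value is an assumption.
    """
    if setter is None:
        setter = os.setxattr  # real call; only available on Linux
    failures = {}
    for path in paths:
        try:
            setter(path, xattr_name, xattr_value)
        except OSError as exc:
            # Keep going so one bad file does not stop the batch.
            failures[path] = exc
    return failures
```

Passing a recording function as `setter` shows which files would be touched before anything is changed on the volume.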