Re: [Gluster-users] Rebalancing with no new bricks.

2013-06-12 Thread Harry Mangalam
Thanks Jeff,

Thru some happy coincidences, a few brutal threats, and some executive
liquidations, the problem was averted.  We dropped 40TB in an hour and the
storage load over the bricks equalized as it did so.  So we're good for a
while.  However, I'll look into your suggestions (THANKS!!) and experiment
with them on some test files.

Re: SvK's suggestions, I often do on-brick file manipulations to
resolve problems, especially recursive ops. It's not a user-level
solution, but it does allow sysadmin flexibility. E.g., we use a small util
to fork recursive 'du's to each brick and then sum them afterwards.  It's
easily 10-1000x faster than doing it via gluster, especially on our ZOT
file dirs. As is the usual case, it was written for our specific case, but if
anyone wants it as a skeleton for their own instance, happy to share.
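
For anyone who wants the general shape of the per-brick 'du' idea, here is a
minimal sketch in Python. It is not the util mentioned above: the names
tree_bytes and summed_usage are made up for illustration, it walks directories
with os.walk rather than forking 'du', and it assumes every brick is reachable
as a local path (on a multi-server volume you would run the walk on each
server and sum the results). It skips the top-level .glusterfs directory,
which holds hard links to the same file data.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def tree_bytes(root):
    """Sum the apparent sizes (lstat st_size) of all files under root."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip GlusterFS's internal directory at the brick root so that
        # hard-linked data is not counted twice.
        if dirpath == root and ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # File vanished or is unreadable; ignore it in this sketch.
                continue
    return total


def summed_usage(brick_paths, subdir=""):
    """Walk the same subdir on every brick in parallel, then sum.

    brick_paths are hypothetical local brick roots, e.g. ["/bricks/b1", ...].
    Returns (per-brick dict, grand total in bytes).
    """
    roots = [os.path.join(brick, subdir) for brick in brick_paths]
    with ThreadPoolExecutor(max_workers=max(1, len(roots))) as pool:
        per_brick = list(pool.map(tree_bytes, roots))
    return dict(zip(brick_paths, per_brick)), sum(per_brick)
```

Calling summed_usage(["/bricks/b1", "/bricks/b2"], "some/dir") (paths are
placeholders) gives the per-brick byte counts plus their sum, which also shows
at a glance how unevenly a directory is spread over the bricks.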

hjm



On Wed, Jun 12, 2013 at 6:04 AM, Jeff Darcy  wrote:

> On 06/11/2013 05:55 PM, Harry Mangalam wrote:
> > I understand that gluster does not allow gluster-rebalancing in the same
> > config (ie, without adding a brick), but would moving some offending
> > files to another fs and then copying them back tend to rebalance the FS
> > by distributing the copied files more evenly?
>
> The problem is that no files will move unless the layout (the hash-range
> information used to place files) changes, and the layout won't change
> unless the set of bricks does, so a rebalance without adding/removing
> bricks would end up being a no-op.  However, there are two other ways
> this kind of imbalance can be addressed.
>
> Firstly, we can observe that GlusterFS will still find files even if
> they're in the "wrong" place according to the layout.  It is possible to
> move files manually from one brick to another, so long as appropriate
> care is taken e.g. to ensure that all extended attributes are preserved.
> The problem with this approach is that such careful placement might be
> undone the next time you do a rebalance for other reasons.
>
> The second approach avoids that problem, but is a bit trickier in other
> ways.  There is a feature that allows a user to define their own layout
> for a directory, so that it will be left alone by rebalance.  There's
> even a script in your friendly neighborhood source tree
> (extras/rebalance.py) that will use this to set a layout that's weighted
> according to the size of bricks (or free space) instead of just treating
> all as equal.  You can use that as an example of how to construct and
> apply a valid user-defined layout, plus other techniques that are likely
> to be useful.  Lastly, we have a feature that setting the fake
> "distribute.migrate-data" xattr on a file will cause us to re-evaluate
> where it should go and migrate it to the correct brick if necessary.
> Since you seem to be dealing with a relatively small number of files
> that need to be moved, it shouldn't be too hard to combine this little
> bag of tricks into a solution that meets your needs.  Just let me know
> if you'd like me to assist.
>
>
>


-- 
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Rebalancing with no new bricks.

2013-06-12 Thread Stephan von Krawczynski
On Wed, 12 Jun 2013 09:57:15 -0400
Jeff Darcy  wrote:

> On 06/12/2013 09:46 AM, Stephan von Krawczynski wrote:
> > The true question is indeed: why does he need "tricks" at all to come to
> > something obvious for humans: a way of distributing files over the glusterfs
> > so that "full" means really all bricks are full.
> 
> That's my view too.  I keep trying.
> 
> > 3) The data must be left accessible even if glusterfs is not used on the
> > bricks any longer - without copying "back".
> 
> This part is already true in practically all cases.  You can ignore the
> .glusterfs directory and extra xattrs, or nuke them, and you have a perfectly
> normal file/directory structure that's usable as-is.  The exceptions are if
> you use striping or erasure coding, but (like RAID) those are fundamentally
> ways of slicing and dicing data across storage units so some reassembly would
> be necessary.

I only tried to list _the_ major advantages glusterfs could/should have over
almost all other competitors.
Of special importance is "1)" and "3)" because it allows everyone to "go and
try" without having to fiddle around with tons of data.
"2)" is convenience, something a good piece of software should deliver :-)

-- 
Regards,
Stephan


Re: [Gluster-users] Rebalancing with no new bricks.

2013-06-12 Thread Jeff Darcy

On 06/12/2013 09:46 AM, Stephan von Krawczynski wrote:
> The true question is indeed: why does he need "tricks" at all to come to
> something obvious for humans: a way of distributing files over the glusterfs
> so that "full" means really all bricks are full.

That's my view too.  I keep trying.

> 3) The data must be left accessible even if glusterfs is not used on the
> bricks any longer - without copying "back".

This part is already true in practically all cases.  You can ignore the
.glusterfs directory and extra xattrs, or nuke them, and you have a perfectly
normal file/directory structure that's usable as-is.  The exceptions are if you
use striping or erasure coding, but (like RAID) those are fundamentally ways of
slicing and dicing data across storage units so some reassembly would be necessary.




Re: [Gluster-users] Rebalancing with no new bricks.

2013-06-12 Thread Stephan von Krawczynski
On Wed, 12 Jun 2013 09:04:30 -0400
Jeff Darcy  wrote:

> [...]
> that need to be moved, it shouldn't be too hard to combine this little
> bag of tricks into a solution that meets your needs.  Just let me know
> if you'd like me to assist.

The true question is indeed: why does he need "tricks" at all to come to
something obvious for humans: a way of distributing files over the glusterfs
so that "full" means really all bricks are full.
It cannot be the right way to design software (for humans) so that they have
to adapt to the software. Instead the software should be able to adapt to the
users' needs and situation. It is very obvious today that bricks can be of
different size.

In fact I always thought it would be a big advantage of glusterfs to be able
to use what's already there and make more out of it (just as linux did from
the first day on).
Which means for me:
1) It must be easy to deploy to an already filled fileserver => no need to
copy data over onto the "new" glusterfs (soft migration).
2) Whatever layout the bricks have, glusterfs must be able to follow the
obvious: if there is space left then use it.
3) The data must be left accessible even if glusterfs is not used on the
bricks any longer - without copying "back".

-- 
Regards,
Stephan


Re: [Gluster-users] Rebalancing with no new bricks.

2013-06-12 Thread Jeff Darcy
On 06/11/2013 05:55 PM, Harry Mangalam wrote:
> I understand that gluster does not allow gluster-rebalancing in the same
> config (ie, without adding a brick), but would moving some offending
> files to another fs and then copying them back tend to rebalance the FS
> by distributing the copied files more evenly?  

The problem is that no files will move unless the layout (the hash-range
information used to place files) changes, and the layout won't change
unless the set of bricks does, so a rebalance without adding/removing
bricks would end up being a no-op.  However, there are two other ways
this kind of imbalance can be addressed.
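
To make the "no-op" point concrete, here is a toy model in Python. It is only
a sketch under stated assumptions: the hash is zlib.crc32 as a stand-in (not
the hash GlusterFS actually uses), the layout is a plain list of ranges rather
than anything stored on disk, and equal_layout and place are illustrative
names, not GlusterFS functions.

```python
import zlib

HASH_SPACE = 2 ** 32  # toy model: a 32-bit hash space split among bricks


def equal_layout(bricks):
    """Give each brick one contiguous, equal-sized slice of the hash space."""
    step = HASH_SPACE // len(bricks)
    layout = []
    for i, brick in enumerate(bricks):
        start = i * step
        # The last brick absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == len(bricks) - 1 else start + step - 1
        layout.append((start, end, brick))
    return layout


def place(name, layout):
    """Return the brick whose hash range contains the hash of the file name."""
    h = zlib.crc32(name.encode("utf-8"))  # stand-in hash, an assumption
    for start, end, brick in layout:
        if start <= h <= end:
            return brick
    raise ValueError("layout does not cover the hash space")
```

In this model placement depends only on the file name and the layout. The
same set of bricks yields the same ranges, so re-running place() for every
file returns the brick it is already on; only a different brick list (and
hence different ranges) can make a file move.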

Firstly, we can observe that GlusterFS will still find files even if
they're in the "wrong" place according to the layout.  It is possible to
move files manually from one brick to another, so long as appropriate
care is taken e.g. to ensure that all extended attributes are preserved.
The problem with this approach is that such careful placement might be
undone the next time you do a rebalance for other reasons.

The second approach avoids that problem, but is a bit trickier in other
ways.  There is a feature that allows a user to define their own layout
for a directory, so that it will be left alone by rebalance.  There's
even a script in your friendly neighborhood source tree
(extras/rebalance.py) that will use this to set a layout that's weighted
according to the size of bricks (or free space) instead of just treating
all as equal.  You can use that as an example of how to construct and
apply a valid user-defined layout, plus other techniques that are likely
to be useful.  Lastly, we have a feature that setting the fake
"distribute.migrate-data" xattr on a file will cause us to re-evaluate
where it should go and migrate it to the correct brick if necessary.
Since you seem to be dealing with a relatively small number of files
that need to be moved, it shouldn't be too hard to combine this little
bag of tricks into a solution that meets your needs.  Just let me know
if you'd like me to assist.
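
As a rough illustration of what "weighted according to the size of bricks"
means, the following Python sketch splits a 32-bit hash space in proportion to
per-brick weights. It is not the algorithm from extras/rebalance.py and it
does not write anything to a volume; weighted_layout and the brick names are
hypothetical, and the weights could be capacities or free space, whichever
you choose to feed in.

```python
HASH_SPACE = 2 ** 32  # toy model: a 32-bit hash space split among bricks


def weighted_layout(brick_weights):
    """Split the hash space in proportion to each brick's weight.

    brick_weights maps a brick name to a positive number, e.g. its size
    or free space in bytes. Returns a list of (start, end, brick) ranges.
    """
    if not brick_weights or any(w <= 0 for w in brick_weights.values()):
        raise ValueError("need at least one brick and positive weights")
    total = sum(brick_weights.values())
    items = list(brick_weights.items())
    layout = []
    start = 0
    cumulative = 0
    for i, (brick, weight) in enumerate(items):
        cumulative += weight
        if i == len(items) - 1:
            end = HASH_SPACE - 1  # last range always closes the space
        else:
            end = (HASH_SPACE * cumulative) // total - 1
        layout.append((start, end, brick))
        start = end + 1
    return layout
```

With weights of 1 and 3, the second brick gets three quarters of the hash
space, so in this model roughly three quarters of new file names would hash
to it.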

