On 5/13/2010 6:24 PM, Craig Carl wrote:
Jeff -
Thanks for your email. I think I've got a grasp of your
environment now and I understand the problem. If we create a
"/gluster/small_files" and a "/gluster/large_files", your users are
unlikely to respect the distinction, plus it is a management nightmare,
right?
If you have time I'd like your help writing a feature request that
would implement what you need. Something like -
Gluster should provide the option of distributing files based on size
to different volumes.
This distribution should be transparent to users.
This distribution only needs to happen the first time a file is written.
The Gluster administrator should have the ability to provide a file
size range for each volume.
The different volumes could be different types: mirror, stripe, mirror
& distribute, etc.
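
A minimal sketch of the size-range idea above, in Python. The volume
names, the 1 MiB cutoff, and the function name are hypothetical, and it
assumes the file size is already known when the file is placed. It is
not Gluster code or configuration syntax.

```python
# Hypothetical per-volume size ranges: (min_bytes, max_bytes, volume).
# max_bytes is exclusive; None means "no upper limit".
SIZE_RANGES = [
    (0, 1 << 20, "small_files"),     # files under 1 MiB
    (1 << 20, None, "large_files"),  # files of 1 MiB and up
]


def volume_for_size(size):
    """Return the name of the volume whose size range contains `size`."""
    for low, high, volume in SIZE_RANGES:
        if size >= low and (high is None or size < high):
            return volume
    raise ValueError("no volume configured for size %d" % size)
```
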
What have I missed?
Craig
That would be one solution. I would target another that I suspect is
probably simpler:
Gluster should provide the option of pseudo-randomizing the distribution
of file stripes across volumes, so that all small files do not end up on
the same subvolume of a cluster/stripe.
This distribution should be transparent to users.
This distribution only needs to happen the first time a file is written
and may be based on the file name hash (a la cluster/distribute).
The net behavior could be such that small files (less than the stripe
block-size) would have the same data distribution pattern as they would
have with cluster/distribute, while larger files (greater than the
stripe block-size) would have their later blocks distributed
round-robin from that starting place.
Given that the code already exists for distributing files based on
namehash in cluster/distribute, I think this could be an easier feature
to add.
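
As a rough illustration of the behavior described above, here is a
minimal Python sketch. The function names are made up for this example,
and CRC32 is only a stand-in for whatever hash cluster/distribute
actually uses; this is not Gluster code.

```python
import zlib


def start_subvolume(name, n_subvols):
    """Pick the subvolume for a file's first block from its name hash.

    CRC32 is an assumed stand-in hash, chosen only for illustration.
    """
    return zlib.crc32(name.encode("utf-8")) % n_subvols


def block_placement(name, size, block_size, n_subvols):
    """Return the subvolume index for each stripe block of a file.

    Block 0 lands on the hashed starting subvolume; later blocks go
    round-robin from there. An empty file still gets one placement.
    """
    start = start_subvolume(name, n_subvols)
    n_blocks = max(1, -(-size // block_size))  # ceiling division
    return [(start + i) % n_subvols for i in range(n_blocks)]
```

A file smaller than the stripe block-size gets a single entry, the
hashed subvolume, which matches the cluster/distribute-style placement.
A larger file starts at the same hashed subvolume and wraps around the
remaining ones in order.
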
Jeff
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users