> To what extent, is Gluster a good choice for the "many small files scenario",
> as opposed to HDFS? Last I checked, hdfs would consume humongous memory
> resources if the cluster has many small files, given its architecture. There
> are some hackish solutions on top HDFS for the case of many small files
> rather than huge files, but it would be nice to find a file system that
> matches that scenario well as is. So I wonder how would Gluster do when
> files are typically small.

We're not as bad as HDFS, but it's still not what I'd call a good scenario for us. We have good space efficiency for small files, and we don't have a single-metadata-server SPOF either, but the price we pay is a performance hit on creates (and renames). There are several efforts under way to improve this, but there's only so much we can do when directory contents must be consistent across the volume despite being spread across many bricks (or replica sets). More details on those efforts are here:

http://www.gluster.org/community/documentation/index.php/Features/Feature_Smallfile_Perf
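
If you want to see how much the create and rename cost matters for your own workload, a rough way is to time those two operations on a directory of the volume in question. The following is only a minimal sketch, not an official benchmark: the function name `time_small_file_ops`, the file count, and the 4 KiB file size are all assumptions chosen for illustration, and the mount point mentioned in the comment is hypothetical.

```python
import os
import tempfile
import time


def time_small_file_ops(directory, count=1000, size=4096):
    """Create, then rename, `count` files of `size` bytes in `directory`.

    Returns the elapsed wall-clock seconds for each phase.
    """
    payload = b"x" * size

    # Phase 1: create many small files.
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(directory, f"f{i:06d}.tmp"), "wb") as fh:
            fh.write(payload)
    create_s = time.perf_counter() - start

    # Phase 2: rename every file within the same directory.
    start = time.perf_counter()
    for i in range(count):
        os.rename(
            os.path.join(directory, f"f{i:06d}.tmp"),
            os.path.join(directory, f"f{i:06d}.dat"),
        )
    rename_s = time.perf_counter() - start

    return {"create_s": create_s, "rename_s": rename_s}


if __name__ == "__main__":
    count = 1000
    # A local temporary directory is used here so the script runs anywhere.
    # To test a real volume, pass a directory on its mount point instead
    # (for example a hypothetical /mnt/glustervol/bench).
    with tempfile.TemporaryDirectory() as d:
        for phase, secs in time_small_file_ops(d, count=count).items():
            print(f"{phase}: {secs:.3f}s ({count / secs:.0f} ops/s)")
```

Running it once against a local disk and once against the mounted volume gives a crude comparison. It is single-threaded and ignores caching, so treat the numbers as indicative only.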