Your reply makes perfect sense to me. I remember that self-heal happens at
file read time; does that mean opening a file for read is also a global
operation? Are you saying there's no way to copy 30 million files to our
66-node GlusterFS cluster for parallel processing other than waiting for
half a month? Can I somehow disable self-heal and get a speedup?
Things are turning out pretty badly for me.
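Concretely, I'm guessing the knobs would be the self-heal options of the
replicate (AFR) translator in our volfile, something like the sketch below
(option names taken from the cluster/replicate translator; the volume and
brick names here are placeholders, not our actual config):

```
# Hypothetical replicate section with self-heal switched off.
# Risk: replicas that go out of sync will no longer be repaired on access.
volume replicate-0
  type cluster/replicate
  option data-self-heal off      # don't heal file contents on open/read
  option metadata-self-heal off  # don't heal ownership/permissions/times
  option entry-self-heal off     # don't heal directory entries
  subvolumes brick-a brick-b
end-volume
```

If that does speed up the copy, I assume we'd have to turn the options back
on (or trigger a manual heal) once the data is loaded.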
- Wei
Mark Mielke wrote:
On 09/28/2009 10:35 AM, Wei Dong wrote:
Hi All,
I noticed a very weird phenomenon when copying data (200KB image files)
to our GlusterFS storage. When I run only one client, it copies roughly
20 files per second; as soon as I start a second client on another
machine, the copy rate of the first client immediately degrades to 5
files per second. When I stop the second client, the first client
immediately speeds back up to the original 20 files per second. When I
run 15 clients, the aggregate throughput is about 8 files per second,
much worse than running only one client. Neither CPU nor network is
saturated. My volume file is attached.
The servers run on a 66-node cluster and the clients on a 15-node
cluster.
We have 33x2 servers and at most 15 client machines, so each server
serves fewer than 0.5 clients on average. I cannot think of a reason for
a distributed system to behave like this; there must be some kind of
central access point.
Although there is probably room for the GlusterFS folk to optimize...
You should assume directory write operations involve the whole cluster,
and creating a file is a directory write operation. Think of what it may
have to do across the cluster: self-heal the directory, make sure the
name is right and not already in use on every replica, and so on. Once
you get to reads and writes of a particular file's data, the load should
be distributed.
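If creates are the cluster-wide part, one workaround sketch (hypothetical
paths, file names, and variables — not something I've tested on your
cluster) is to give each client machine its own target subdirectory and a
disjoint slice of the file list, so concurrent creates aren't all fighting
over the same directory:

```shell
#!/bin/sh
# Run one copy of this per client machine, with CLIENT_ID set to 0..14.
# Assumes filelist.txt holds one source path per line and /mnt/gluster
# is the GlusterFS mount point (both placeholders).
N_CLIENTS=15
CLIENT_ID=${CLIENT_ID:-0}

dest=/mnt/gluster/incoming/client-$CLIENT_ID
mkdir -p "$dest"

# Each client takes every N-th line of the list, so the slices are disjoint.
awk -v n="$N_CLIENTS" -v id="$CLIENT_ID" 'NR % n == id' filelist.txt |
while read -r f; do
  cp "$f" "$dest/"
done
```

Whether this actually helps depends on how much of the contention is
per-directory versus truly global, so treat it as an experiment.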
Cheers,
mark
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users