Re: [Gluster-users] any configuration guidelines?

2009-07-30 Thread Wei Dong
Thanks a lot for your insightful reply; it really clarifies a lot of 
things.  I think the DHT-AFR-DHT configuration makes a lot of sense.


- Wei Dong


Harald Stürzebecher wrote:

Hi,

2009/7/28 Wei Dong :
> Hi All,
>
> We've been using GlusterFS 2.0.1 on our lab cluster to host a large number
> of small images for distributed processing with Hadoop and it has been
> working fine without human intervention for a couple of months.  Thanks for
> the wonderful project -- it's the only freely available cluster filesystem
> that fits our needs.
>
> What keeps bothering me is the extremely high flexibility of GlusterFS.
> There's simply so many ways to achieve the same goal that I don't know
> which is the best.  So I'm writing to ask if there are some general
> guidelines of configuration to improve both data safety and performance.

AFAIK, there are some general guidelines in the GlusterFS documentation.
IMHO, sometimes it takes careful reading or some experimentation to find them.
Some examples have been discussed on the mailing list.

> Specifically, we have 66 machines (in two racks) with 4 x 1.5TB disks /
> machine.  We want to aggregate all the available disk space into a single
> shared directory with 3 replicas.  Following are some of the potential
> configurations.
>
> *  Each node exports 4 directories, so there are 66x4 = 264 directories to
> the client.  We then first group those directories into threes with AFR,
> making 88 replicated directories, and then aggregate them with DHT.  When
> configuring AFR, we can either put the three replicas on different
> machines, or two on the same machine and the third on another machine.

I'd put the three replicas on three different machines - three
machines are less likely to fail at the same time than just two.

One setup on my list of setups to evaluate would be a DHT - AFR - DHT
configuration:
- aggregate the four disks on each server into a single volume, export
only that volume
- on the clients, group those 66 volumes into threes with AFR and
aggregate with DHT
That would reduce the client config file from 264 imported volumes to
66, reducing the complexity of the configuration and the number of open
connections.

> *  Each node first aggregates three disks (forget about the 4th for
> simplicity) and exports a replicated directory.  The client side then
> aggregates the 66 single replicated directories into one.

That might mean that access to some of the data is lost if one node
fails, since all three copies of a file would sit on disks in the same
machine - not what I'd accept from a replicated setup.

> * When grouping the aggregated directories on the client side, we can use
> some kind of hierarchy.  For example, the 66 directories are first aggregated
> into groups of N each with DHT, and then the 66/N groups are again
> aggregated with DHT.

Doesn't that just make the setup more complicated?

> *  We don't do the grouping on the client side.  Rather, we use some
> intermediate server to first aggregate small groups of directories with DHT
> and export them as a single directory.

The network connection of the intermediate server might become a
bottleneck, limiting performance.
The intermediate server might become a single point of failure.

> * We can also put AFR after DHT
> ..
>
> To make things more complicated, the 66 machines are separated into two
> racks with only a 4-gigabit inter-rack connection, so the directories
> exported by the servers are not all equal from a particular client's
> point of view.

A workaround might be to create two intermediate volumes that each
perform better when accessed from one of the racks and use NUFA to
create the single volume.

Keeping replicated data local to one rack would improve performance,
but the failure of one complete rack (e.g. power line failure,
inter-rack networking) would block access to half of your data.

Getting a third rack and a much faster inter-rack connection would
improve performance and protect better against failures - just place
the three copies of a file on different racks. ;-)

> I'm wondering if someone on the mailing list could provide me with some
> advice.

Plan, build, test ... repeat until satisfied :-)

Optional: share your solution, with benchmarks


IMHO, there won't be a single "best" solution.


Harald Stürzebecher




Re: [Gluster-users] any configuration guidelines?

2009-07-30 Thread Harald Stürzebecher
Hi,

2009/7/28 Wei Dong :
> Hi All,
>
> We've been using GlusterFS 2.0.1 on our lab cluster to host a large number
> of small images for distributed processing with Hadoop and it has been
> working fine without human intervention for a couple of months.  Thanks for
> the wonderful project -- it's the only freely available cluster filesystem
> that fits our needs.
>
> What keeps bothering me is the extremely high flexibility of GlusterFS.
>  There's simply so many ways to achieve the same goal that I don't know
> which is the best.  So I'm writing to ask if there are some general
> guidelines of configuration to improve both data safety and performance.

AFAIK, there are some general guidelines in the GlusterFS documentation.
IMHO, sometimes it takes careful reading or some experimentation to find them.
Some examples have been discussed on the mailing list.

> Specifically, we have 66 machines (in two racks) with 4 x 1.5TB disks /
> machine.  We want to aggregate all the available disk space into a single
> shared directory with 3 replicas.  Following are some of the potential
> configurations.
>
> *  Each node exports 4 directories, so there are 66x4 = 264 directories to
> the client.  We then first group those directories into threes with AFR,
> making 88 replicated directories, and then aggregate them with DHT.  When
> configuring AFR, we can either put the three replicas on different
> machines, or two on the same machine and the third on another machine.

I'd put the three replicas on three different machines - three
machines are less likely to fail at the same time than just two.

One setup on my list of setups to evaluate would be a DHT - AFR - DHT
configuration:
- aggregate the four disks on each server into a single volume, export
only that volume
- on the clients, group those 66 volumes into threes with AFR and
aggregate with DHT
That would reduce the client config file from 264 imported volumes to
66, reducing the complexity of the configuration and the number of open
connections.
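
For illustration, in 2.0.x volfile syntax that could look roughly like
the sketch below. This is untested and the hostnames (node01 ... node66),
brick paths and volume names are made up; the replica sets would of
course have to be laid out so that no two members of a set live on the
same machine:

# server.vol, one per node - a sketch only
volume disk1
  type storage/posix
  option directory /data/disk1
end-volume
# ... disk2, disk3 and disk4 are defined the same way ...

volume export                       # DHT across the node's four disks
  type cluster/distribute
  subvolumes disk1 disk2 disk3 disk4
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.export.allow *   # tighten this outside a trusted lab
  subvolumes export
end-volume

# client.vol - only the first replica set is shown
volume node01
  type protocol/client
  option transport-type tcp
  option remote-host node01
  option remote-subvolume export
end-volume
# ... node02 through node66 are defined the same way ...

volume afr1                         # three copies on three different machines
  type cluster/replicate
  subvolumes node01 node02 node03
end-volume
# ... afr2 through afr22 cover the remaining nodes ...

volume global                       # single namespace over the 22 replica sets
  type cluster/distribute
  subvolumes afr1 afr2 afr3         # ... and so on through afr22
end-volume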

> *  Each node first aggregates three disks (forget about the 4th for
> simplicity) and exports a replicated directory.  The client side then
> aggregates the 66 single replicated directories into one.

That might mean that access to some of the data is lost if one node
fails, since all three copies of a file would sit on disks in the same
machine - not what I'd accept from a replicated setup.

> * When grouping the aggregated directories on the client side, we can use
> some kind of hierarchy.  For example, the 66 directories are first aggregated
> into groups of N each with DHT, and then the 66/N groups are again
> aggregated with DHT.

Doesn't that just make the setup more complicated?

> *  We don't do the grouping on the client side.  Rather, we use some
> intermediate server to first aggregate small groups of directories with DHT
> and export them as a single directory.

The network connection of the intermediate server might become a
bottleneck, limiting performance.
The intermediate server might become a single point of failure.

> * We can also put AFR after DHT
> ..
>
> To make things more complicated, the 66 machines are separated into two
> racks with only a 4-gigabit inter-rack connection, so the directories
> exported by the servers are not all equal from a particular client's
> point of view.

A workaround might be to create two intermediate volumes that each
perform better when accessed from one of the racks and use NUFA to
create the single volume.
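
In volfile terms, the NUFA part might look something like the fragment
below on a client in rack 1 - again only a sketch, assuming cluster/nufa
and its local-volume-name option behave as described in the 2.0
documentation, with all volume names made up (rack1 and rack2 would be
DHT volumes over the replica sets hosted in the respective rack):

volume rack1
  type cluster/distribute
  subvolumes afr1 afr2 afr3         # ... replica sets hosted in rack 1
end-volume

volume rack2
  type cluster/distribute
  subvolumes afr12 afr13            # ... replica sets hosted in rack 2
end-volume

volume global
  type cluster/nufa
  option local-volume-name rack1    # clients in rack 2 would put rack2 here
  subvolumes rack1 rack2
end-volume

Note that NUFA only steers where new files are created; reads still have
to cross the inter-rack link when a file happens to live in the other rack.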

Keeping replicated data local to one rack would improve performance,
but the failure of one complete rack (e.g. power line failure,
inter-rack networking) would block access to half of your data.

Getting a third rack and a much faster inter-rack connection would
improve performance and protect better against failures - just place
the three copies of a file on different racks. ;-)

> I'm wondering if someone on the mailing list could provide me with some
> advice.

Plan, build, test ... repeat until satisfied :-)

Optional: share your solution, with benchmarks


IMHO, there won't be a single "best" solution.


Harald Stürzebecher


Re: [Gluster-users] any configuration guidelines?

2009-07-29 Thread Nathan Stratton

On Wed, 29 Jul 2009, Liam Slusser wrote:

> The preferred way is using the client and not the backend server.  There is
> some documentation somewhere about it - I'll see if I can dig it up.

The downside is that I am using this for Xen, so I need to disable
direct-io. This makes client-side recovery painfully slow compared to
server side.
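
If I remember the 2.0.x client options correctly, that is done at mount
time, along these lines (the volfile path and mount point are
placeholders; check "glusterfs --help" for the exact flag):

# sketch from memory, untested - disables direct-io for VM image workloads
glusterfs --disable-direct-io-mode -f /etc/glusterfs/client.vol /mnt/glusterfs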


-Nathan


Re: [Gluster-users] any configuration guidelines?

2009-07-29 Thread Liam Slusser
On Wed, Jul 29, 2009 at 1:22 PM, Nathan Stratton wrote:

> On Tue, 28 Jul 2009, Wei Dong wrote:
>
>> Hi All,
>>
>> We've been using GlusterFS 2.0.1 on our lab cluster to host a large number
>> of small images for distributed processing with Hadoop and it has been
>> working fine without human intervention for a couple of months.  Thanks for
>> the wonderful project -- it's the only freely available cluster filesystem
>> that fits our needs.
>>
>> What keeps bothering me is the extremely high flexibility of GlusterFS.
>> There's simply so many ways to achieve the same goal that I don't know which
>> is the best.  So I'm writing to ask if there are some general guidelines of
>> configuration to improve both data safety and performance.
>>
>
> Totally understand - I am facing many of the same issues. I am not sure if I
> should be doing replicate / distribute in the frontend client config or the
> backend server configs.
>
>
> -Nathan


The preferred way is using the client and not the backend server.  There is
some documentation somewhere about it - I'll see if I can dig it up.

ls


Re: [Gluster-users] any configuration guidelines?

2009-07-29 Thread Nathan Stratton

On Tue, 28 Jul 2009, Wei Dong wrote:


> Hi All,
>
> We've been using GlusterFS 2.0.1 on our lab cluster to host a large number of
> small images for distributed processing with Hadoop and it has been working
> fine without human intervention for a couple of months.  Thanks for the
> wonderful project -- it's the only freely available cluster filesystem that
> fits our needs.
>
> What keeps bothering me is the extremely high flexibility of GlusterFS.
> There's simply so many ways to achieve the same goal that I don't know which
> is the best.  So I'm writing to ask if there are some general guidelines of
> configuration to improve both data safety and performance.


Totally understand - I am facing many of the same issues. I am not sure if
I should be doing replicate / distribute in the frontend client config or
the backend server configs.



-Nathan


[Gluster-users] any configuration guidelines?

2009-07-28 Thread Wei Dong

Hi All,

We've been using GlusterFS 2.0.1 on our lab cluster to host a large 
number of small images for distributed processing with Hadoop and it has 
been working fine without human intervention for a couple of months.  
Thanks for the wonderful project -- it's the only freely available 
cluster filesystem that fits our needs.


What keeps bothering me is the extremely high flexibility of GlusterFS.
There's simply so many ways to achieve the same goal that I don't know 
which is the best.  So I'm writing to ask if there are some general 
guidelines of configuration to improve both data safety and performance.


Specifically, we have 66 machines (in two racks) with 4 x 1.5TB disks / 
machine.  We want to aggregate all the available disk space into a 
single shared directory with 3 replicas.  Following are some of the
potential configurations.


*  Each node exports 4 directories, so there are 66x4 = 264 directories 
to the client.  We then first group those directories into threes with 
AFR, making 88 replicated directories, and then aggregate them with 
DHT.  When configuring AFR, we can either put the three replicas on
different machines, or two on the same machine and the third on another
machine.


*  Each node first aggregates three disks (forget about the 4th for 
simplicity) and exports a replicated directory.  The client side then 
aggregates the 66 single replicated directories into one.


* When grouping the aggregated directories on the client side, we can 
use some kind of hierarchy.  For example, the 66 directories are first
aggregated into groups of N each with DHT, and then the 66/N groups are 
again aggregated with DHT.


*  We don't do the grouping on the client side.  Rather, we use some 
intermediate server to first aggregate small groups of directories with 
DHT and export them as a single directory.


* We can also put AFR after DHT
..

To make things more complicated, the 66 machines are separated into two
racks with only a 4-gigabit inter-rack connection, so the directories
exported by the servers are not all equal from a particular client's
point of view.


I'm wondering if someone on the mailing list could provide me with some 
advice.


Thanks a lot.

- Wei