Re: [Lustre-discuss] Rule of thumb for setting up lustre resources...

2008-06-17 Thread Brian J. Murrell
On Tue, 2008-06-17 at 09:40 -0400, Mark True wrote:
 Thanks so much for the prompt response, I do have a couple of
 questions for clarification:
 
 Does the hardware makeup of the OSS affect the speed of the OSTs?

Of course.  An OST is only going to go as fast as the hardware that it's
made up of.  If you put a slow disk in an OSS, the OST is going to be
slow.

 If so, what is likely to be the bottleneck in an OSS.

There is no one right answer to that.  You have to get together with
your hardware vendor and explain your use scenario (i.e. for an OSS) and
have them spec out some hardware that meets the use-case.  If you will
be spec'ing your own hardware then you need to grab the technical
specifications for all of the hardware you are proposing using and
understand their performance aspects.  If you don't feel confident in
being able to do the latter, then I would suggest you do the former.

In general, an OSS is I/O bound.  You need to provide enough I/O
capacity between the disk and network through which the data will
travel.
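To make that concrete, here is a minimal sketch of the idea (the function name and all figures are made up for illustration, not taken from any Lustre tool): an OSS delivers only as much as the slowest leg of its I/O path, not the sum of its parts.

```python
def oss_throughput_mb_s(disk_bw_each, n_disks, bus_bw, net_bw):
    """Effective throughput of one OSS in MB/s: the minimum of the
    aggregate disk bandwidth, the bus bandwidth, and the network link."""
    return min(disk_bw_each * n_disks, bus_bw, net_bw)

# Three 200 MB/s OSTs behind a 1000 MB/s bus and an ~1100 MB/s network
# link: the disks are the bottleneck, so the OSS tops out at 600 MB/s.
print(oss_throughput_mb_s(200, 3, 1000, 1100))  # 600
```

Swap in faster disks (say 400 MB/s each) and the same OSS becomes bus-bound at 1000 MB/s, which is the point of understanding every leg of the path.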

 Say we have an OSS with 3 OSTs attached; is that different from
 having three OSSs with 1 OST apiece?

That depends on whether that OSS with the 3 OSTs attached has the I/O
capacity to do full-out I/O to all three disks.  As I've said before,
this is basically an exercise in understanding the capacity of your
entire I/O path from OST to client and sizing to meet that capacity.

 Also, does the OSS have as much of a performance impact as the speed
 of the OST?

The OSS hosts the OST, so you can't really compare the performance
impact of one vs. the other.

 What is the recommended max number of OSTs per OSS?

As you have probably gathered, that is *completely* dependent on the
hardware you are using for OSSes and OSTs and there is no one answer
that fits all hardware.  You just want to make sure you don't create
bottlenecks, again by understanding the capacity of the various paths
from OST (disk) to client.

 If I am able to determine the max capabilities of an OST/OSS is it
 safe to
 assume that the increase in performance scales linearly as I increase
 the
 number of OSS/OSTs?

Yes, raw bandwidth will grow pretty linearly.  It is up to your
applications and use-cases to take advantage of that, though.
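That linear growth, and the ceilings it eventually hits, can be sketched like this (hypothetical numbers; "fabric_bw" stands in for whatever shared limit, such as the interconnect or the clients themselves, caps the aggregate):

```python
def aggregate_bw(n_oss, per_oss_bw, fabric_bw=float("inf")):
    """Aggregate bandwidth grows linearly with the OSS count until some
    shared resource (fabric, clients) becomes the ceiling."""
    return min(n_oss * per_oss_bw, fabric_bw)

# 600 MB/s per OSS, 4000 MB/s shared ceiling: linear until n = 8.
for n in (1, 2, 4, 8):
    print(n, aggregate_bw(n, 600, 4000))
```

Note the application-side caveat: a single client writing one unstriped file talks to one OST, so it sees per-OST bandwidth no matter how many OSSes exist; the aggregate only materializes when files are striped or many clients run concurrently.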

b.




___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Rule of thumb for setting up lustre resources...

2008-06-16 Thread Brian J. Murrell
On Sat, 2008-06-14 at 14:22 -0400, Mark True wrote:
 
 Hello!

Hi.

 A If increasing the number of OSTs increases throughput, is there a
 relationship that can be used to determine how many OSTs we're likely
 to need at the outset to establish a baseline minimum throughput.

Of course.

 For example, if I want to get 3 GB/s sustained throughput, how many
 OSTs will facilitate this?

That is _completely_ dependent on your hardware configuration.  If you
are adding an OST identical to an existing one, you can simply use the
speed of the existing OST to determine how much more the new OST will
add.  But be very careful of ceilings.  You can of course only add so
many OSTs before you start to hit other resource limitations such as bus
bandwidth in the OSS and network bandwidth of the OSS's interconnect,
etc.  In short, you need to understand the performance capability of all
of your components to come up with an overall design that meets your
performance goals and scales to future goals.
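As a first-cut sizing exercise for a target like the 3 GB/s in the question (all figures hypothetical, and ignoring every bottleneck except OST speed and the per-OSS network link), the arithmetic looks roughly like:

```python
import math

def osts_needed(target_mb_s, per_ost_mb_s):
    """First-cut OST count for a throughput target, assuming identical
    OSTs and no other bottleneck in the path."""
    return math.ceil(target_mb_s / per_ost_mb_s)

def osses_needed(n_osts, osts_per_oss, oss_net_mb_s, per_ost_mb_s):
    """Enough OSSes that neither the OST count nor the per-OSS
    bandwidth ceiling caps the target."""
    by_count = math.ceil(n_osts / osts_per_oss)
    # Each OSS can only move min(disks, network) to the wire.
    per_oss = min(osts_per_oss * per_ost_mb_s, oss_net_mb_s)
    by_bw = math.ceil(n_osts * per_ost_mb_s / per_oss)
    return max(by_count, by_bw)

# 3 GB/s target with 200 MB/s OSTs, 4 OSTs per OSS, ~900 MB/s links:
print(osts_needed(3000, 200))         # 15 OSTs
print(osses_needed(15, 4, 900, 200))  # 4 OSSes (each tops out at 800 MB/s)
```

This is back-of-the-envelope only; real designs also have to account for the bus, RAID overhead, and the clients' side of the fabric, as discussed above.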

 B Does the MGS and MDS have to be separate for best performance, or
 can they be consolidated into one server without causing too much
 hardship

I'd tend to say that most people put them on the same server.  For
anything but toy installations, however, we strongly suggest you put
the MGS and MDT on separate devices.

 C  Right now I am looking at a model where I am connecting all the
 OSTs, and the MDS/MGS together using infiniband,

Just to keep the nomenclature straight, an OST is a device (i.e. a disk)
in/attached to an OSS.  An OSS is the server that serves OSTs.

 and connecting the storage via fibrechannel.   Is this the ideal
 solution or am I going in the wrong direction.  

That sounds suitable.

b.





Re: [Lustre-discuss] Rule of thumb for setting up lustre resources...

2008-06-16 Thread Klaus Steden

Hi Mark,

See my comments inline below.

cheers,
Klaus

On 6/14/08 11:22 AM, Mark True [EMAIL PROTECTED] did etch on stone
tablets:

 
 Hello!
 
 I am new to the list, but I have been researching Lustre for quite some time
 and finally have an occasion to use it.  I am trying to do some capacity
 planning and I am wondering if there are some general rules of thumb for
 configuring a Lustre environment.
 
 Specifically:
 
 A If increasing the number of OSTs increases throughput, is there a
 relationship that can be used to determine how many OSTs we're likely to need
  at the outset to establish a baseline minimum throughput.  For example, if I
  want to get 3 GB/s sustained throughput, how many OSTs will facilitate this?
 
 B Does the MGS and MDS have to be separate for best performance, or can they
 be consolidated into one server without causing too much hardship
 

 C  Right now I am looking at a model where I am connecting all the OSTs, and
 the MDS/MGS together using infiniband, and connecting the storage via
 fibrechannel.   Is this the ideal solution or am I going in the wrong
 direction.  

This is a good solution and will give you good performance overall.
You can also mix different storage and network technologies within the
same storage environment, and it should remain relatively transparent.
I've got a cluster that handles both FC storage and iSCSI storage, but
I know there are people out there using DRBD, and I'm dying to try
Infiniband-based storage as well.  Anything that presents a block
device to an OSS should be suitable for use with Lustre, but some will
perform better than others.

Bottom line, I think, is to pick the best technology for your price
range and performance needs.  Infiniband + FC is pretty much the top
of the mountain, though.
 
 D Just wondering what clustering software people use on the front end with
 Lustre typically, if they are going to be using this as a filesystem for some
 kind of HPC environment, what is the most popular clustering technology for
 this.
 
Our CFS clusters are all organized as part of ROCKS clusters. I know a
number of people on this list are on the ROCKS list, so there's good
cross-pollination between technologies. It's a mature cluster architecture
designed for HPC, and bundles a number of useful solutions and tools onboard
(MPI, SGE, Torque, distributed compilers, visualization, etc.). It's also
relatively easy to integrate with Lustre, as you can simply drop the
pre-built Lustre RPMs into the cluster installer and be ready to go in
a few minutes.

 E Does Heartbeat install next to whatever HPC clustering technology you have?

I'm using Linux-HA, and it wasn't built into my cluster software distro, but
it was easy enough to drop into the mix, and as of late last year had native
disk support for Lustre file systems.
 
 Thanks, and I hope that I can soon be someone who contributes rather than just
 asking questions :)
 
 --Mark T.
 
 
 


[Lustre-discuss] Rule of thumb for setting up lustre resources...

2008-06-15 Thread Mark True
Hello!

I am new to the list, but I have been researching Lustre for quite some time
and finally have an occasion to use it.  I am trying to do some capacity
planning and I am wondering if there are some general rules of thumb for
configuring a Lustre environment.

Specifically:

A If increasing the number of OSTs increases throughput, is there a
relationship that can be used to determine how many OSTs we're likely to
need at the outset to establish a baseline minimum throughput.  For
example, if I want to get 3 GB/s sustained throughput, how many OSTs
will facilitate this?

B Does the MGS and MDS have to be separate for best performance, or can
they be consolidated into one server without causing too much hardship

C  Right now I am looking at a model where I am connecting all the OSTs,
and the MDS/MGS together using infiniband, and connecting the storage via
fibrechannel.   Is this the ideal solution or am I going in the wrong
direction.

D Just wondering what clustering software people use on the front end with
Lustre typically, if they are going to be using this as a filesystem for
some kind of HPC environment, what is the most popular clustering technology
for this.

E Does Heartbeat install next to whatever HPC clustering technology you
have?

Thanks, and I hope that I can soon be someone who contributes rather than
just asking questions :)

--Mark T.