On Mon, Jul 28, 2003 at 10:36:27AM -0500, DP Support wrote:
> Matthew,
> That explains it perfectly!  However, let me throw this on the pile and let me have 
> your opinion on it:
> 1) Hardware load balancing is an option (or so I found out from the people that 
> built the server) .. Using ATA133 RAID.  Being this is an option, should I hardware 
> LB or software LB?

Load balancing is a different thing to RAID.  As to whether you should
use hardware RAID: probably yes, as the hardware RAID will take some
of the load off the main CPU thus maximizing performance.  Plus some
HW RAID can implement write caching using non-volatile memory, which
means that an unexpected power cut can't damage the file system or
cause data loss.  That grade of RAID controller tends to be fairly
expensive though.  However, it's not cut and dried that HW RAID will
out-perform SW RAID, particularly when doing RAID 0 or RAID 1.  You'll
have to experiment to find out which works better.

NB: load balancing usually refers to having two or more web servers
serve up the same data.  You can implement it in a hackish way using
an Apache proxy as described in the Apache mod_rewrite docs.  For
higher-throughput situations, though, a specialist piece of kit such
as Cisco Arrowpoint or Alteon AceDirector is generally used.
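
To make the distinction from RAID concrete, here's a minimal sketch
(in Python) of what a load balancer does at its very simplest: hand
successive requests to each backend web server in turn.  The host
names and the make_balancer helper are made up for illustration; this
is not how mod_rewrite or the dedicated boxes actually implement it.

```python
from itertools import cycle

# Hypothetical backend web servers that all serve the same content.
BACKENDS = ["www1.example.com", "www2.example.com"]


def make_balancer(backends):
    """Return a function that picks the next backend in round-robin order."""
    ring = cycle(backends)

    def pick():
        return next(ring)

    return pick


pick = make_balancer(BACKENDS)
# Four incoming requests get spread over the two backends in turn.
assignments = [pick() for _ in range(4)]
print(assignments)
```

Nothing here touches the disks at all, which is the point: it's a
separate layer from anything RAID does.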
> 2) Personally, I like the idea of having 400 GB available.  This server is a web / 
> database / email / dns server.  However, I'm desperately trying to see this in a 
> more advanced light than "WOW, 400 gigs! WOOHOO!" :)  That being said, is RAID 1 the 
> best way to go?  If it slows down writes, will it be noticeable?  The one thing I 
> DON'T want is a noticeable loss of speed.  Is the RAID becoming corrupt in RAID 0 a 
> common thing? I generally don't add to servers, I buy new ones.. So when I outgrow 
> this production machine, it will get demoted to something else and I'll buy a bigger 
> better production server. (meaning the chances of me adding additional harddisks is 
> unlikely)

No, you shouldn't notice any slowdown, and RAID 1 (mirroring) would be
strongly recommended for a production server, especially if it's at a
remote site.  The write performance won't be an issue: the gains you
make on read performance will far outweigh any penalties.
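
If it helps to picture it, here's a toy in-memory model of a mirror
in Python.  It's only a sketch: the ToyMirror class and its methods
are invented for illustration, and alternating reads between the
disks is a simplification of what a real RAID 1 implementation does.

```python
class ToyMirror:
    """Toy in-memory model of a mirror; not a real RAID driver."""

    def __init__(self, n_disks=2):
        self.disks = [{} for _ in range(n_disks)]  # block number -> data
        self._next_reader = 0

    def write(self, block, data):
        # Every write has to land on every disk in the mirror.
        for disk in self.disks:
            disk[block] = data

    def read(self, block):
        # Reads alternate between disks, standing in for the way a real
        # mirror can spread read requests over both drives.
        disk_index = self._next_reader
        self._next_reader = (self._next_reader + 1) % len(self.disks)
        return disk_index, self.disks[disk_index][block]

    def fail_disk(self, disk_index):
        # Simulate one drive dying; the survivor still holds every block.
        del self.disks[disk_index]
        self._next_reader = 0


mirror = ToyMirror()
mirror.write(7, b"hello")
first = mirror.read(7)          # served by disk 0
second = mirror.read(7)         # served by disk 1
mirror.fail_disk(0)
after_failure = mirror.read(7)  # still readable from the surviving disk
print(first, second, after_failure)
```

The write loop is the extra cost you pay; the read path and the
surviving copy after a failure are what you get in return.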

RAID becoming corrupt is about as likely as a regular filesystem
becoming corrupt.  Generally it takes some sort of hardware failure to
cause it.  Bad memory in a hardware RAID controller can be
particularly aggravating, but it's a fairly unusual occurrence.
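
Where RAID 0 and RAID 1 really differ is in what happens when a whole
drive dies: a stripe is lost if any one disk fails, a mirror only if
all of them do.  Here's a back-of-envelope sketch in Python.  It
assumes every drive fails independently with the same probability
over some period, and the 5% figure is made up purely for
illustration; drives sharing a case and a power supply aren't really
independent, so treat the numbers as a feel for the trend only.

```python
def raid0_loss_probability(p_disk, n_disks):
    """RAID 0 loses everything if ANY one disk in the stripe fails."""
    return 1.0 - (1.0 - p_disk) ** n_disks


def raid1_loss_probability(p_disk, n_disks):
    """RAID 1 loses data only if ALL disks in the mirror fail."""
    return p_disk ** n_disks


P_DISK = 0.05  # assumed per-disk failure probability, illustration only

for n in (1, 2, 4):
    print(n, raid0_loss_probability(P_DISK, n), raid1_loss_probability(P_DISK, n))
```

With one disk the two figures are the same; as disks are added the
RAID 0 figure climbs and the RAID 1 figure shrinks.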
> 3) Does it matter that I'm planning an offsite location?  Essentially, I'll backup 
> all the web / email / db stuff to a server offsite.  Although my backup server isn't 
> as big as my production server (right now), I don't have 400 gigs of crap. I figure 
> when I've outgrown my backup server, I can simply replace it.

If the machine is offsite, and particularly if it's used for critical
services, then you should concentrate on making it as resilient as
possible, if only to minimize the amount of time you have to spend
travelling to the hosting center.  It's a judgement call though -- all
of these niceties like HW RAID cost money, and it's basically up to
you to decide if it's worth it.
> Many thanks for all your help.. It's greatly appreciated!

No problem.


> > On Sun, Jul 27, 2003 at 06:20:27PM -0500, Duane Stark wrote:
> > 
> > > To preface this, I'm not OS retarded - just BSD retarded ;)  I haven't had to 
> > > mess with my current BSD server since I bought it, and now I have purchased a 
> > > new p4 3.0(something), 2 gig ram, 2 IDE 200gig HD's to replace it..
> > > 
> > > Here is my question:
> > > 
> > > How do I set up these multiple drives?  What does the "industry" recommend when 
> > > it comes to setting them up?  Should I set BSD up to think it's one data source 
> > > (so 400gig) and then run from that? Or do I set up 1 drive to hold my 
> > > web/mail/mysql, and the other to do something else?
> > > 
> > > I'm totally lost, so any help would be greatly appreciated.. PLEASE don't assume 
> > > I know what you're talking about, because it's a given that I don't! heh :)
> > 
> > The only possible answer is "it depends".  With disks there are 3
> > characteristics that you can trade off against one another depending
> > on your needs: resilience, available space and access speed.
> > There's also a fourth consideration, which may affect your choice but
> > has little effect on the day-to-day operation of the system: the
> > amount of time and effort you're prepared to put into doing
> > sys-adminly things.
> > 
> > Now, you've only got two disks, so that immediately rules out any
> > choices involving RAID 5.  You make no mention of any sort of hardware
> > RAID controller, so I'll assume that isn't a possibility either.
> > 
> > That leaves essentially 3 choices:
> > 
> >    i) No RAID at all.  This scores highly on the ease of admin, as
> >       it's the default way things are set up by sysinstall.  Just
> >       partition the disks, put filesystems on them and set up
> >       /etc/fstab so the partitions get mounted in appropriate
> >       locations.  I'll take this as the baseline to compare the other
> >       setups to.
> > 
> >   ii) RAID 0 or disk striping.  This creates one synthetic 400GB
> >       partition from your two actual drives, by writing alternate
> >       blocks of data to each drive.  The block size is configurable:
> >       at one extreme you could make the block size the same as the raw
> >       disk size, in which case you'd end up appending one disk to the
> >       end of the other.  However, the greatest advantage occurs when
> >       the block size is roughly the amount of data the system can
> >       read from the drive in one gulp.  This spreads the load of any
> >       IO evenly over the two drives and should maximize performance.
> > 
> >       The bad news is that if either of the disks becomes faulty, then
> >       all of the disk space on your system will be unavailable.  As
> >       you add disks to the stripe, this problem becomes more and more
> >       acute, so this setup is generally not used very much except in
> >       combination with RAID 5 or RAID 1 to give higher resilience.
> > 
> >  iii) RAID 1 or mirroring.  Each drive contains a complete copy of all
> >       of the data, maintained in parallel.  The advantages are
> >       improved resilience -- the system should just keep chugging
> >       along merrily even if one of the drives self-destructs -- and
> >       improved IO performance on reads -- writes have to go to both
> >       drives, which takes only slightly longer than writing to a
> >       single drive, but reads can go to either drive which gives you
> >       much better performance.  (The biggest factor is the
> >       milliseconds it takes to position the head and wait for the
> >       drive to turn round until the correct block is under the head.
> >       Talking between the CPU, RAM and the disk electronics takes of
> >       the order of microseconds.)
> > 
> >       The bad news is that you've got to sacrifice half of your
> >       potentially available disk capacity.  However, assuming that the
> >       resulting space is adequate for your needs, a mirrored root disk
> >       setup is pretty standard for server machines.
> > 
> > Either of ii) and iii) will probably entail your learning about
> > vinum(8) as the best available mechanism for doing software RAID on
> > FreeBSD.  The alternatives are not that hot: ccd(4) is pretty ancient
> > and doesn't offer any means of recovering a mirrored partition other
> > than backup and re-install should one drive fail.  I've heard that
> > NetBSD's raidframe stuff is being ported to FreeBSD, but I don't
> > think it's ready for primetime use yet.
> > 
> > See
> > http://www.freebsd.org/doc/en_US.ISO8859-1/articles/vinum/index.html
> > for a thorough introduction to vinum bootstrapping,
> > http://www.vinumvm.org/ for general information and
> > http://org.netbase.org/vinum-mirrored.html for a quick HOWTO on
> > setting up a bootable vinum root drive.
> > 
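
For what it's worth, the striping described under ii) in the quoted
message boils down to simple arithmetic.  Here's a minimal Python
sketch of how a logical block number might map onto a (disk, block on
that disk) pair.  The stripe_location function and its parameter
names are mine, for illustration only; this is not vinum's actual
implementation or on-disk layout.

```python
def stripe_location(logical_block, n_disks=2, blocks_per_stripe=1):
    """Map a logical block number to (disk index, block number on that disk)."""
    stripe = logical_block // blocks_per_stripe  # which stripe unit we're in
    offset = logical_block % blocks_per_stripe   # position inside that unit
    disk = stripe % n_disks                      # units alternate across disks
    block_on_disk = (stripe // n_disks) * blocks_per_stripe + offset
    return disk, block_on_disk


# Small stripe units: consecutive blocks alternate between the two disks.
layout = [stripe_location(b) for b in range(6)]
print(layout)

# Stripe unit as big as a (pretend 100-block) disk: the second disk is
# simply appended to the end of the first.
appended = stripe_location(150, n_disks=2, blocks_per_stripe=100)
print(appended)
```

The second call is the "one extreme" case: with the stripe unit equal
to the disk size, logical block 150 lands on block 50 of the second
disk and nothing gets interleaved.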
