On Tuesday 15 May 2007 12:26:08 am Kevin Wise wrote:
> I'd really like one piece of hardware
> that does both.  In my mind this would save me maintenance (fewer
> patches to apply) and maybe even cost.  Any comments?

I can see value in separating firewall and other functionality, but I 
personally use one system for both, for just this reason.

> Is 512 MB of RAM enough?

Plenty.  I have 1 GiB in mine, but that's mostly because I had extra RAM lying 
around from upgrading another box.

> Should I get 
> hardware RAID or software RAID?  In terms of importance to me,
> reliability is second only to cost.  I don't want my files to disappear
> because my single RAID controller failed and the drive is unreadable by
> another controller.

I use software RAID primarily for this reason, but there are other reasons as 
well.  A big one is flexibility.  With Linux MD RAID you can mix and match 
drives of different types and sizes with no problem, and you can use as many 
disks as you can pack into the box.  
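For example, md is perfectly happy striping across a PATA partition and a
couple of SATA partitions in the same array (device names here are made up
for illustration):

    # RAID 5 across one PATA (hdX) and two SATA (sdX) partitions
    mdadm --create /dev/md0 --level=5 --raid-devices=3 \
        /dev/hda5 /dev/sda5 /dev/sdb5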

I also made use of MD RAID's flexibility to make adding new disks easier.  I 
partitioned my drives into small (50GB) pieces and constructed multiple 
arrays (each array element on a different disk, obviously), then combined the 
RAID arrays into a large storage pool with LVM.  That way, when I need to add
another disk, I can add it to the running system like this (rough commands
sketched below):

1.  Pick one physical volume (which is a RAID array) and use pvmove to migrate 
all of the data off of it.
2.  Remove the array from the volume group.
3.  Destroy the array and rebuild it, adding another partition from the new 
disk.
4.  Add the resulting (larger) physical volume back into the volume group.
5.  Go back to step 1, until all arrays have been upgraded.
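
Roughly, one iteration of that loop looks like this (the md device, VG name,
partitions, and member counts are placeholders, not my actual layout):

    # 1. Migrate all data off one physical volume (a RAID array)
    pvmove /dev/md3
    # 2. Remove that array from the volume group
    vgreduce storage /dev/md3
    pvremove /dev/md3
    # 3. Destroy the array and rebuild it with a partition from the new disk
    mdadm --stop /dev/md3
    mdadm --create /dev/md3 --level=5 --raid-devices=4 \
        /dev/hda5 /dev/hdc5 /dev/sda5 /dev/sdc5
    # 4. Put the (now larger) physical volume back into the volume group
    pvcreate /dev/md3
    vgextend storage /dev/md3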

This approach takes a long time, but it's perfectly safe -- after a power
outage, pvmove picks up right where it left off (yes, I have firsthand
experience) -- and the system continues running and serving files the whole
time.  Last time I did this, I wrote a script to perform the operations.  The
script took about 30 minutes to write and about four days to run.

Supposedly, someone is looking into giving MD the native ability to add
another drive to an existing RAID 5 array, which would make the partitioning
+ LVM approach less necessary, but it hasn't happened yet.

One other thing to consider with your RAID configuration is hot spare vs.
RAID 6.  I use a hot spare, but I'm planning on rebuilding my system with
RAID 6 (one partition array at a time).  The odds of two drives failing at
once might seem negligibly small, but I had a scare a few weeks ago: one of
the RAID 5 drives failed, and while the system was rebuilding onto the hot
spare, another drive had a transient error -- caused, I think, by a SATA
controller driver bug, but I can't be sure.

The problem with RAID 5 is that rebuilding a degraded array is very
I/O-intensive -- it reads every sector of every surviving drive -- so if
another drive has any latent problems, they'll probably crop up then, at the
worst possible time.
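
If you want to keep a rebuild from hammering the disks quite so hard (at the
cost of a longer rebuild window), you can watch it and throttle it; the
numbers below are just examples:

    # Watch rebuild progress
    cat /proc/mdstat
    # Cap and floor the resync rate, in KB/s per device
    echo 10000 > /proc/sys/dev/raid/speed_limit_max
    echo 1000 > /proc/sys/dev/raid/speed_limit_min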

I think I did the best thing I could have done -- I immediately shut the
machine down (and told the kids the video server was down, possibly for good)
and thought things over for a full week.  I realized that if I could forcibly
reconstruct each array with the exact sequence of drives that were running 
when the second failure occurred, I might be able to get it back.  Luckily, 
mdadm had e-mailed me the contents of /proc/mdstat, and that had the 
information I needed.
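
In case it's useful to anyone else: those e-mails come from mdadm's monitor
mode.  A minimal setup looks something like this (the address is obviously
made up, and the config path varies by distro):

    # /etc/mdadm.conf (or /etc/mdadm/mdadm.conf on Debian-ish systems)
    MAILADDR you@example.com

    # Run the monitor; most distros start this from an init script
    mdadm --monitor --scan --daemonise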

So I powered the machine back up, forcibly rebuilt an array (still in degraded 
mode) with --assume-clean, then added the spare and crossed my fingers while 
it recalculated parity and changed to non-degraded mode.  When that worked, I 
repeated the process with each of the other arrays, then held my breath while
I reactivated LVM and ran fsck on the file systems.  It worked, and I didn't
lose anything.
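
For anyone who ends up in the same boat, the scary step looks roughly like
this.  The device names and order here are placeholders; the whole point is
that you have to recreate the array with the *exact* original device order
(and chunk size), with "missing" in the failed slot:

    # Recreate the array in place without touching data or parity blocks
    mdadm --create /dev/md2 --level=5 --raid-devices=5 --assume-clean \
        /dev/hda6 /dev/hdc6 missing /dev/sda6 /dev/sdb6
    # Then add the spare back and let it reconstruct the missing member
    mdadm --add /dev/md2 /dev/sdc6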

After that harrowing experience, I made two decisions:

1)  I need to be more diligent about backing up my important data.  I had most
of it backed up, but not all of it.
2)  I'm going to move to RAID 6 so that I can take two *simultaneous* disk 
failures and not lose anything.  That's better than RAID 5 with a hot spare, 
and much better than RAID 5 without a spare.

BTW, my system has 4 PATA and 2 SATA drives:

3 x 200 GB PATA
1 x 250 GB PATA
2 x 500 GB SATA

I have four PATA channels (two on the mobo, two on a PCI card), so each PATA
drive is a master on its own channel, for better performance.

I use 200 GB of five of the six drives for the main RAID 5 arrays, so I have 
800 GB of usable storage there.  One of the 200s is the hot spare.  The 500 
GB drives have 300 left over, so I mirrored that, for another 300 GB usable.  
All of that storage is in one big 1.1 TB volume.  The 50 GB left over on the 
250 GB drive is in a separate volume group, with bits carved out for various 
temp storage uses.  So I'm "wasting" 200 GB (hot spare) + 200 GB (RAID 5
parity) + 300 GB (mirror copy) = 700 GB.

I'm soon going to add another 500.  When I do, I'll add 200 GB of it to the
existing RAID 5 arrays (converting them to RAID 6 and incorporating the
current hot spare as an active disk), and I guess I'll have to change the
mirrored 300s to a RAID 5.  That'll get me to 1.65 TB usable of 2.35 TB
total.  I
figure I'll go to 8 disks before I start replacing the small 200s, mainly 
because my server case has room for 8.
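
When I get around to it, rebuilding one of those arrays as RAID 6 should look
roughly like this (fake device names again, and each array still needs the
pvmove/vgreduce dance first):

    # Seven 50 GB members: five data + two parity, instead of 4+1 plus a spare
    mdadm --create /dev/md3 --level=6 --raid-devices=7 \
        /dev/hda5 /dev/hdc5 /dev/hde5 /dev/hdg5 /dev/sda5 /dev/sdb5 /dev/sdc5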

To support such a large number of drives I had to get a bigger PSU and some 
extra fans to keep everything cool.  BTW, an underpowered PSU causes very 
strange, intermittent drive failures :-)

> Also seems like a waste to buy new ATA drives
> (are they even available any more?). 

Sure, and they're priced basically the same as SATA drives.

> Another option of course is to buy 
> a SATA controller card.  Any idea how much that might cost?

They're cheap.  $20 or so from Newegg or the like.

        Shawn.
_______________________________________________
Ldsoss mailing list
Ldsoss@lists.ldsoss.org
http://lists.ldsoss.org/mailman/listinfo/ldsoss
