On Jul 13, 2006, at 3:03 PM, mos wrote:
> At 03:45 PM 7/12/2006, Jon Frisby wrote:
>> This REALLY should be an academic concern. Either you have a system
>> that can tolerate the failure of a drive, or you do not. The failure
>> rate is pretty much irrelevant: you can train incredibly
>> non-technical (inexpensive) people to respond to a pager and
>> hot-swap a bad drive.
>>
>> If you are in the position where the typical failure rate of a class
>> of drive is of concern to you, then either: A) you have a different
>> problem causing all your drives to fail ultra-fast (heat, electrical
>> noise, etc.), or B) you haven't adequately designed your storage
>> subsystem.

> It all depends on how valuable your uptime is. If you could double or
> triple the time between hard disk failures, most people would pay
> extra for that, so they buy SCSI drives. You wouldn't take your
> family car and race it in the Indy 500, would you? After a few laps
> at 150 mph (if you can get it going that fast), it will seize up, so
> you pull into the pit and do what? Get another family car and drive
> that? And keep doing that until you finish the race? Downtime is
> extremely expensive and embarrassing. Just talk to the guys at
> FastMail, who have had 2 outages even with hardware RAID in place.
> Recovery doesn't always work as smoothly as you think it should.

Again: Either your disk subsystem can TOLERATE (read: CONTINUE
OPERATING IN THE FACE OF) a drive failure, or it cannot. If you can't
hot-swap a dead drive, your system can't tolerate the failure of a
drive.

Your analogy is flawed. The fact that companies like Google run with
incredibly good uptimes on cheap, commodity hardware (including IDE
drives!) demonstrates as much.

SCSI drives WILL NOT improve your uptime by a factor of 2x or 3x.
Using a hot-swappable disk subsystem and having hot spares WILL.
Designing your systems without needless single points of failure WILL.
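
(To make the pager point concrete, here's a minimal monitoring sketch,
assuming Linux software RAID (md): it scans /proc/mdstat for a degraded
array. The script and its alerting stub are illustrations only, not
anything MySQL-specific -- wire the print into whatever actually pages
your on-call person.)

    import re

    def degraded_arrays(mdstat_path="/proc/mdstat"):
        """Return the names of md arrays that report a failed member."""
        degraded = []
        current = None
        with open(mdstat_path) as f:
            for line in f:
                m = re.match(r"^(md\d+)\s*:", line)
                if m:
                    current = m.group(1)
                # md prints member status like [UU_U]; '_' marks a dead disk.
                status = re.search(r"\[([U_]+)\]", line)
                if current and status and "_" in status.group(1):
                    degraded.append(current)
        return degraded

    if __name__ == "__main__":
        for array in degraded_arrays():
            # Replace this print with whatever pages your operator.
            print("%s is degraded -- hot-swap the failed drive" % array)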

> Software RAID? Are you serious? No way!

You make a compelling case for your position, but I'm afraid I still
disagree with you. *cough*

If you're using RAID10, or another form of RAID that doesn't involve
computing parity (and the "write hole" that accompanies it), there's
little need for hardware support. Hardware RAID won't make things
dramatically faster unless you spend a ton of money on cache -- in
which case you should seriously consider a SAN for the myriad other
benefits it provides. The "reliability" added by hardware RAID with a
battery-backed cache is pretty negligible if you're doing your I/O
right (i.e., you've made sure your drives aren't lying when they say a
write has completed, AND you're using fsync -- which MySQL does).
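
(In case the fsync point isn't clear, here's a minimal sketch of the
write-then-fsync pattern I mean, in Python rather than the C that MySQL
actually uses; the file name is just an example. fsync() only buys you
durability if the drive honors cache-flush requests instead of
acknowledging writes it hasn't committed.)

    import os

    def durable_append(path, data):
        """Append bytes and force them to stable storage before returning."""
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        try:
            os.write(fd, data)
            # fsync() flushes the OS buffers and asks the drive to flush its
            # write cache. A drive that lies about write completion silently
            # voids this guarantee -- and no amount of RAID fixes that.
            os.fsync(fd)
        finally:
            os.close(fd)

    durable_append("example-redo.log", b"committed transaction\n")
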
-JF