Machine 1: $2000
Machine 2: $2000
Machine 3: $2000

Knowing how to rig them together and maintain them in a fully fault-
tolerant way: priceless.


(Sorry for the off-topic post, I couldn't resist).

-- Mark Lewis

On Wed, 2006-02-15 at 09:19 -0800, Craig A. James wrote:
> Jeremy Haile wrote:
> > We are a small company looking to put together the most cost-effective
> > solution for our production database environment.  Currently in
> > production, Postgres 8.1 is running on this machine:
> > 
> > Dell 2850
> > 2 x 3.0 GHz Xeon, 800 MHz FSB, 2 MB cache
> > 4 GB DDR2 400 MHz
> > 2 x 73 GB 10K SCSI RAID 1 (for xlog and OS)
> > 4 x 146 GB 10K SCSI RAID 10 (for postgres data)
> > Perc4ei controller
> > 
> > ... I sent our scenario to our sales team at Dell and they came back with
> > all manner of SAN, DAS, and other configurations costing as much as $50k.
> 
> Given what you've told us, a $50K machine is not appropriate.
> 
> Instead, think about a simple system with several clones of the database and 
> a load-balancing web server, even if one machine could handle your load.  If 
> a machine goes down, the load balancer automatically switches to another.
> 
> Look at the MTBF figures of two hypothetical machines:
> 
>  Machine 1: Costs $2,000, MTBF of 2 years, takes two days to fix on average.
>  Machine 2: Costs $50,000, MTBF of 100 years (!), takes one hour to fix on 
> average.
> 
> Now go out and buy three of the $2,000 machines.  Use a load-balancing 
> front-end web server that can send requests in round-robin fashion to a 
> "server farm".  Clone your database.  In fact, clone the load balancer too 
> so that all three machines have all software and databases installed.  Call 
> these machines A, B, and C.
> 
> At any given time, your Machine A is your web front end, serving requests to 
> databases on A, B and C.  If B or C goes down, no problem - the system keeps 
> running.  If A goes down, you switch the IP address of B or C and make it 
> your web front end, and you're back in business in a few minutes.
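>
> The front-end logic really is that simple.  Here's a rough sketch in Python 
> (the hostnames and the TCP-connect health check are just placeholders; in 
> practice you'd let your web server or an existing proxy do this for you):
>
>     import itertools, socket
>
>     DB_HOSTS = ["db-a", "db-b", "db-c"]   # the three cloned databases
>     _rotation = itertools.cycle(DB_HOSTS)
>
>     def is_alive(host, port=5432, timeout=1.0):
>         """Placeholder health check: can we open a TCP connection?"""
>         try:
>             socket.create_connection((host, port), timeout).close()
>             return True
>         except socket.error:
>             return False
>
>     def pick_backend():
>         """Round-robin over the clones, skipping any that are down."""
>         for _ in range(len(DB_HOSTS)):
>             host = next(_rotation)
>             if is_alive(host):
>                 return host
>         raise RuntimeError("all database clones are down")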
> 
> Now compare the reliability -- in order for this system to be disabled, you'd 
> have to have ALL THREE computers fail at the same time.  With an MTBF of two 
> years and a repair time of two days, each machine has about 99.73% uptime.  
> The "MTBF", that is, the expected time until all three machines fail 
> simultaneously, is well over 100,000 years!  Of course this is silly; 
> machines don't last that long.  But it illustrates the point:  redundancy 
> beats reliability (which is why RAID is so useful).
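>
> For anyone who wants to check that arithmetic, here is the back-of-the-envelope 
> version in Python (it treats uptime as MTBF/(MTBF+MTTR) and assumes the three 
> machines fail independently, so take the exact figures with a grain of salt):
>
>     MTBF_DAYS = 2 * 365.0   # the $2,000 machine fails about every 2 years
>     MTTR_DAYS = 2.0         # and takes about 2 days to fix
>
>     # Fraction of the time a single machine is up
>     uptime = MTBF_DAYS / (MTBF_DAYS + MTTR_DAYS)
>     print("per-machine uptime: %.3f%%" % (100 * uptime))         # ~99.727%
>
>     # Chance that all three are down at the same moment
>     p_all_down = (1 - uptime) ** 3
>
>     # Naive expected time between "all three down at once" moments
>     years = (1 / p_all_down) / 365
>     print("all three down at once: about every %.0f years" % years)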
> 
> All for $6,000.
> 
> Craig
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
