Depending on your needs and the size of your cluster, out-of-band management can be of significant interest. It is a pretty simple cost/benefit analysis that trades your sysops time (which is probably the equivalent of $50-150 per hour, fully loaded and accounting for opportunity cost) against the cost of IPMI cards. If it takes an extra hour of time to actually go to the data center per event, and possibly another hour because the data center is a lousy place to work, then the IPMI card is probably about break-even. In our case, it is more than an hour of inconvenience and our systems guy has lots of things to do, so the boards are a no-brainer.
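To make that trade concrete, here is a minimal sketch of the break-even calculation. The card price, hourly rate, and hours-per-trip figures are illustrative assumptions, not real quotes:

```python
# Rough break-even for adding IPMI cards: sysops time saved per
# data-center trip vs. the per-node cost of the card.
# All input values are illustrative assumptions.

def ipmi_break_even_events(card_cost, hourly_rate, hours_per_event):
    """Number of remote-hands events at which an IPMI card pays for itself."""
    return card_cost / (hourly_rate * hours_per_event)

# Fully loaded sysops time at $100/hr, 2 hours lost per trip
# (travel plus working in a noisy, cold data center).
events = ipmi_break_even_events(card_cost=200, hourly_rate=100, hours_per_event=2)
print(f"Card pays for itself after {events:.1f} events")  # 1.0 events
```

With anything like those numbers, a single avoided trip per node over the machine's lifetime covers the card.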
You don't say here what size the disks are. Dual disks are a good idea for any number of reasons. I just saw a price this morning of about $170 for a 2TB drive and about half that for a 1TB drive, so make sure you are doing at least that well.

You are specifying only 4GB of RAM. I would count that as severely underpowering your machine. My own preference is to put 4-8x that much RAM on a machine with one or two quad core CPUs and four drives. That still fits in a 1U chassis and will out-perform several of the boxes that you are describing, although perhaps not exactly on a $/cycle even trade-off.

There are also some very sweet twin setups where you get two beefy machines in a single 1U slot. Very nice. For instance, you can put two dual CPU quad core Nehalem machines with 48GB each and a bunch of disk into 1U for about $14K, including paying somebody to set up the machine and a 3 year maintenance contract. You should be able to do this yourself for $12K or less, and this is equivalent to somewhere between 6 and 30 of the nodes that you are spec'ing (2 x 2 x 4 cores vs 4 cores = 4x, but round up because of fancier processors; 96GB vs 4GB = 24x). Cut off another $1-2K because this is an older quote and 2TB drives have suddenly gotten much cheaper as well.

On Wed, Sep 30, 2009 at 3:46 PM, Kevin Sweeney <ke...@yieldex.com> wrote:

> I really appreciate everyone's input. We've been going back and forth on the
> server size issue here. There are a few reasons we shot for the $1k price,
> one because we wanted to be able to compare our datacenter costs vs. the
> cloud costs. Another is that we have spec'd out a fast Intel node with
> over-the-counter parts. We have a hard time justifying the dual-processor
> costs and really don't see the need for the big server extras like
> out-of-band management and redundancy.
>
> This is our proposed config, feel free to criticize :)
>
> Supermicro 512L-260 Chassis   $90
> Supermicro X8SIL              $160
> Heatsink                      $22
> Intel 3460 Xeon               $350
> Samsung 7200 RPM SATA2        2 x $85
> 2GB Non-ECC DIMM              4 x $65
>
> This totals $1052. Doesn't this seem like a reasonable setup? Isn't the
> purpose of a hadoop cluster to build cheap, fast, replaceable nodes?
>
> On Wed, Sep 30, 2009 at 9:06 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
> > 2TB drives are just now dropping to parity with 1TB on a $/GB basis.
> >
> > If you want space rather than speed, this is a good option. If you want
> > speed rather than space, more spindles and smaller disks are better.
> > Ironically, 500GB drives now often cost more than 1TB drives (that is $,
> > not $/GB).
> >
> > On Wed, Sep 30, 2009 at 7:33 AM, Patrick Angeles
> > <patrickange...@gmail.com> wrote:
> >
> > > We went with 2 x Nehalems, 4 x 1TB drives and 24GB RAM. The RAM might
> > > be overkill... but it's DDR3 so you get either 12 or 24GB. Each box has
> > > 16 virtual cores so 12GB might not have been enough. These boxes are
> > > around $4k each, but can easily outperform any $1K box dollar per
> > > dollar (and performance per watt).
> > >
> > > If you're extremely I/O bound, you can get single-socket configurations
> > > with the same amount of drive spindles for really cheap (~$2k for
> > > single proc, 8-12GB RAM, 4x1TB drives).
> > >
> > > On Wed, Sep 30, 2009 at 10:19 AM, stephen mulcahy
> > > <stephen.mulc...@deri.org> wrote:
> > >
> > > > Todd Lipcon wrote:
> > > >
> > > > > Most people building new clusters at this point seem to be leaning
> > > > > towards dual quad core Nehalem with 4x1TB 7200RPM SATA and at least
> > > > > 8G RAM.
> > > >
> > > > We went with a similar configuration for a recently purchased cluster
> > > > but opted for dual quad core Opterons (Shanghai) rather than Nehalems
> > > > and invested the difference in more memory per node (16GB). Nehalem
> > > > seem to perform very well on some benchmarks but that performance
> > > > comes at a premium. I guess it depends on your planned use of the
> > > > cluster but in a lot of cases more memory may be better spent,
> > > > especially if you plan on running things like HBase on the cluster
> > > > also (which we do).
> > > >
> > > > -stephen
> > > >
> > > > --
> > > > Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
> > > > NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
> > > > http://di2.deri.ie http://webstar.deri.ie http://sindice.com
> >
> > --
> > Ted Dunning, CTO
> > DeepDyve
>
> --
> Kevin Sweeney
> Systems Engineer
> Yieldex -- www.yieldex.com
> (303) 999-7045

--
Ted Dunning, CTO
DeepDyve