Well... If you're not running HBase, minimal swapping hurts you less, so you could push the number of slots and oversubscribe. The only thing I would suggest is that you monitor your system closely as you adjust the number of slots.
You have to admit though, it's fun to tune the cluster. :-)

On Oct 3, 2012, at 12:09 PM, J. Rottinghuis <jrottingh...@gmail.com> wrote:

> Of course it all depends...
> But something like this could work:
>
> Leave 1-2 GB for the kernel, pagecache, tools, overhead etc.
> Plan 3-4 GB for Datanode and Tasktracker each
>
> Plan 2.5-3 GB per slot. Depending on the kinds of jobs, you may need more
> or less memory per slot.
> Have 2-3 times as many mappers as reducers (depending on the kinds of jobs
> you run).
>
> As Michael pointed out the ratio of cores (hyperthreads) per disk matters.
>
> With those initial rules of thumb you'd arrive somewhere between
> 10 mappers + 5 reducers
> and
> 9 mappers + 4 reducers
>
> Try, test, measure, adjust, rinse, repeat.
>
> Cheers,
>
> Joep
>
> On Tue, Oct 2, 2012 at 8:42 PM, Alexander Pivovarov <apivova...@gmail.com> wrote:
>
>> All configs are per node.
>> No HBase, only Hive and Pig installed
>>
>> On Tue, Oct 2, 2012 at 9:40 PM, Michael Segel <michael_se...@hotmail.com> wrote:
>>
>>> I think he's saying that it's 24 maps 8 reducers per node and at 48GB that
>>> could be too many mappers.
>>> Especially if they want to run HBase.
>>>
>>> On Oct 2, 2012, at 8:14 PM, hadoopman <hadoop...@gmail.com> wrote:
>>>
>>>> Only 24 map and 8 reduce tasks for 38 data nodes? Are you sure that's
>>>> right? Sounds VERY low for a cluster that size.
>>>>
>>>> We have only 10 c2100's and are running I believe 140 map and 70 reduce
>>>> slots so far with pretty decent performance.
>>>>
>>>> On 10/02/2012 12:55 PM, Alexander Pivovarov wrote:
>>>>> 38 data nodes + 2 Name Nodes
>>>>>>>
>>>>>>> Data Node:
>>>>>>> Dell PowerEdge C2100 series
>>>>>>> 2 x XEON x5670
>>>>>>> 48 GB RAM ECC (12x4GB 1333MHz)
>>>>>>> 12 x 2 TB 7200 RPM SATA HDD (with hot swap) JBOD
>>>>>>> Intel Gigabit ET Dual port PCIe x4
>>>>>>> Redundant Power Supply
>>>>>>> Hadoop CDH3
>>>>>>> max map tasks 24
>>>>>>> max reduce tasks 8
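
For anyone who wants to play with the numbers, here is a rough sketch of the memory arithmetic behind Joep's rules of thumb. The function names (slots_per_node, committed_gb) and the way the slots are rounded and split are my own assumptions, not anything from Hadoop itself, so the results land near his 10+5 / 9+4 figures rather than exactly on them.

```python
def slots_per_node(total_gb, os_gb, daemon_gb, per_slot_gb, map_reduce_ratio):
    """Return (map_slots, reduce_slots) for one worker node.

    Hypothetical helper: subtracts OS overhead and the two daemons
    (DataNode + TaskTracker), then divides what is left by the per-slot budget.
    """
    available_gb = total_gb - os_gb - 2 * daemon_gb
    if available_gb <= 0:
        raise ValueError("no memory left for task slots")
    total_slots = int(available_gb // per_slot_gb)
    # Split so that mappers are roughly map_reduce_ratio times the reducers.
    reduce_slots = max(1, round(total_slots / (map_reduce_ratio + 1)))
    map_slots = total_slots - reduce_slots
    return map_slots, reduce_slots


def committed_gb(map_slots, reduce_slots, per_slot_gb, os_gb, daemon_gb):
    """Memory the node would need if every slot used its full budget."""
    return (map_slots + reduce_slots) * per_slot_gb + os_gb + 2 * daemon_gb


if __name__ == "__main__":
    # Low end of each range: 1 GB OS, 3 GB per daemon, 2.5 GB per slot.
    print(slots_per_node(48, 1, 3, 2.5, 2))
    # High end of each range: 2 GB OS, 4 GB per daemon, 3 GB per slot.
    print(slots_per_node(48, 2, 4, 3, 2))
    # The original 24 map + 8 reduce setting, assuming 2.5 GB per slot.
    print(committed_gb(24, 8, 2.5, 1, 3))
```

Under those assumed budgets the original 24 + 8 setting commits well over the 48 GB in the box, which is the oversubscription being discussed. On MRv1 (what CDH3 ships) the per-node maxima are normally set with mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum in mapred-site.xml; treat whatever the sketch prints as a starting point to test and measure, not a recommendation.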