Re: performance not great, or did I miss something?
Thus spake Allen Wittenauer::
> On 8/8/08 1:25 PM, James Graham (Greywolf) [EMAIL PROTECTED] wrote:
>> 226GB of available disk space on each one;
>> 4 processors (2 x dualcore)
>> 8GB of RAM each.
>
> Some simple stuff:
>
> (Assuming SATA):
> Are you using AHCI?
> Do you have the write cache enabled?

I will investigate this...

> Is the topologyProgram providing proper results?

The whowhat, now?

> Is DNS performing as expected? Is it fast?

DNS seems appropriately configured...

> How many tasks per node?

four, I think, each of map and reduce.

> How much heap does your name node have? Is it going into garbage collection or swapping?

Maybe GC; no swapping (our systems do not have swap allocated).

--
James Graham (Greywolf) | 650.930.1138|925.768.4053 *
[EMAIL PROTECTED] |
Check out what people are saying about SearchMe! -- click below
http://www.searchme.com/stack/109aa
performance not great, or did I miss something?
Greetings,

I'm very, very new to this (as you could probably tell from my other postings).

I have 20 nodes available as a cluster, less one as the namenode and one as the jobtracker (unless I can use them too). Specs are:

226GB of available disk space on each one;
4 processors (2 x dualcore)
8GB of RAM each.

The RandomWriter takes just over 17 minutes to complete; the Sorter takes well over three to four hours to complete on only about a half terabyte of data.

This is certainly not the speed or power I had been led to expect from Hadoop, so I am guessing I have some things tuned wrong (actually, I'm certain some are tuned wrong, as during the reduce phase I'm seeing processes die from lack of memory...).

Given the above hardware specs, what should I expect as a theoretical maximum throughput? Machines 3-10 are on 1GbE, machines 11-20 are on a second 1GbE, connected by a mutual 1GbE upstream (another switch).

--
James Graham (Greywolf) | 650.930.1138|925.768.4053 *
[EMAIL PROTECTED] |
Check out what people are saying about SearchMe! -- click below
http://www.searchme.com/stack/109aa
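
As a rough way to frame the throughput question, the figures quoted in the message can be turned into aggregate rates. This is a back-of-envelope sketch, not a benchmark. It assumes "half a terabyte" means 500 GB, that RandomWriter produced that same volume, that the Sorter run took 3.5 hours, and that 18 of the 20 nodes are workers. The uplink estimate further assumes a uniform shuffle over a full-duplex link and ignores HDFS replication traffic. All names in the code are illustrative.

```python
# Back-of-envelope rates from the numbers in the message.
# ASSUMPTIONS (not stated in the thread): 500 GB of data for both jobs,
# a 3.5 hour Sorter run, 18 worker nodes, uniform shuffle, full-duplex uplink.

DATA_BYTES = 500 * 10**9          # "about a half terabyte", taken as 500 GB
WORKERS = 18                      # 20 nodes less namenode and jobtracker
GBE_BYTES_PER_S = 10**9 / 8       # 1GbE line rate: 125 MB/s before overhead


def aggregate_mb_per_s(data_bytes, seconds):
    """Cluster-wide rate in MB/s for moving data_bytes in the given time."""
    return data_bytes / seconds / 10**6


def uplink_seconds(data_bytes, rack_a, rack_b, link_bytes_per_s):
    """Minimum time for the shuffle traffic that must cross the inter-switch link.

    With a uniform shuffle, the share of data flowing from rack A to rack B
    is (a / n) * (b / n); the reverse direction carries the same amount and,
    on a full-duplex link, moves in parallel.
    """
    total = rack_a + rack_b
    one_way_bytes = data_bytes * (rack_a / total) * (rack_b / total)
    return one_way_bytes / link_bytes_per_s


random_writer = aggregate_mb_per_s(DATA_BYTES, 17 * 60)
sorter = aggregate_mb_per_s(DATA_BYTES, 3.5 * 3600)
uplink_min = uplink_seconds(DATA_BYTES, 8, 10, GBE_BYTES_PER_S) / 60

print("RandomWriter: %.0f MB/s aggregate, %.1f MB/s per worker"
      % (random_writer, random_writer / WORKERS))
print("Sorter:       %.0f MB/s aggregate, %.1f MB/s per worker"
      % (sorter, sorter / WORKERS))
print("Uplink floor for the shuffle: %.1f minutes" % uplink_min)
```

Under these assumptions the shared uplink alone accounts for minutes rather than hours, which is consistent with the suspicion in the message that tuning (for example the out-of-memory reduce tasks) is the larger factor.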
Re: performance not great, or did I miss something?
On 8/8/08 1:25 PM, James Graham (Greywolf) [EMAIL PROTECTED] wrote:
> 226GB of available disk space on each one;
> 4 processors (2 x dualcore)
> 8GB of RAM each.

Some simple stuff:

(Assuming SATA):
Are you using AHCI?
Do you have the write cache enabled?

Is the topologyProgram providing proper results?

Is DNS performing as expected? Is it fast?

How many tasks per node?

How much heap does your name node have? Is it going into garbage collection or swapping?
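
For readers unfamiliar with the topology program mentioned above: Hadoop can call an external script (configured through the `topology.script.file.name` property in releases of this era) with one or more host names or IP addresses as arguments, and expects one rack path per argument on standard output. The following is a minimal sketch of such a script for the two-switch layout described in the thread. The host names `node03` through `node20` and the rack names `/switch-a` and `/switch-b` are hypothetical; a real script would need to map whatever names or IP addresses the cluster actually passes in.

```python
#!/usr/bin/env python
# Minimal rack-topology script sketch for a two-switch cluster.
# ASSUMPTIONS: host names node03..node20 and the rack labels below are
# made up for illustration; they do not come from the thread.
import sys

DEFAULT_RACK = "/default-rack"   # returned for any host not in the map


def build_rack_map():
    """Map machines 3-10 to one switch and 11-20 to the other."""
    rack_map = {}
    for n in range(3, 11):
        rack_map["node%02d" % n] = "/switch-a"
    for n in range(11, 21):
        rack_map["node%02d" % n] = "/switch-b"
    return rack_map


RACK_MAP = build_rack_map()


def resolve(names):
    """Return one rack path per input name, in the same order."""
    return [RACK_MAP.get(name, DEFAULT_RACK) for name in names]


if __name__ == "__main__":
    # One rack path per argument, separated by whitespace.
    print(" ".join(resolve(sys.argv[1:])))
```

Running the script by hand with a few host names is a quick way to check whether it is "providing proper results": every worker should come back with the rack of the switch it is plugged into, not the default rack.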