George, 

Did you try monitoring the CPU in real time to see any patterns? Are there any 
CPU spikes? Either way, the load averages should be compared against the number of 
cores/hyperthreads. How many cores do you see in top? (I personally prefer 
htop to get a real-time overview.) If you have 2 CPUs with 2 cores 
each, the load should stay at or below 4. However, if the cluster is 
idle, it should go as low as 0.2 - 0.3. If the real-time utilization doesn't 
match what Ganglia reports, I'd check whether Ganglia is set up correctly.
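
To make the comparison concrete, here is a minimal Python sketch that divides 
each load average by the number of logical cores the OS reports. The function 
names and the 0.7 / 1.0 thresholds are my own illustrative assumptions, not 
anything Ganglia or Hadoop defines, and os.getloadavg() is only available on 
Unix-like systems. Keep in mind that on Linux the load average also counts 
processes blocked in uninterruptible I/O, so it can be high while the CPUs 
look idle.

```python
import os


def load_per_core(load_avg, cores):
    """Return the load average divided by the number of logical cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_avg / cores


def describe(load_avg, cores):
    """Classify a load average; the thresholds are illustrative assumptions."""
    ratio = load_per_core(load_avg, cores)
    if ratio < 0.7:
        return "headroom"
    if ratio <= 1.0:
        return "busy"
    return "saturated"


def main():
    # os.cpu_count() reports logical cores (hyperthreads included).
    cores = os.cpu_count() or 1
    # os.getloadavg() returns the 1, 5 and 15 minute load averages.
    for label, load in zip(("1 min", "5 min", "15 min"), os.getloadavg()):
        print(f"{label}: load {load:.2f} on {cores} cores "
              f"({load_per_core(load, cores):.2f} per core, {describe(load, cores)})")


if __name__ == "__main__":
    main()
```

With 4 cores, a load of 0.3 comes out as "headroom" and a load of 6.0 as 
"saturated", which is the per-core reading I'd apply to the Ganglia graphs.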

Cosmin 


On Aug 18, 2010, at 1:11 AM, George Stathis wrote:

> No takers? :-) Am I missing something too obvious?
> 
> On Tue, Aug 17, 2010 at 2:03 PM, George Stathis <[email protected]> wrote:
> 
>> Hello,
>> 
>> We have just set up a new cluster on EC2 using Hadoop 0.20.2 and HBase
>> 0.20.3. Our small setup as of right now consists of one master and four
>> slaves with a replication factor of 2:
>> 
>> Master: xLarge instance with 2 CPUs and 17.5 GB RAM - runs 1 namenode, 1
>> secondarynamenode, 1 jobtracker, 1 hbasemaster, 1 zookeeper (uses its own
>> dedicated EBS drive)
>> Slaves: xLarge instance with 2 CPUs and 17.5 GB RAM each - run 1 datanode,
>> 1 tasktracker, 1 regionserver
>> 
>> We have also installed Ganglia to monitor the cluster stats as we are about
>> to run some performance tests but, right out of the box, we are noticing
>> high system loads (especially on the master node) without any activity
>> happening on the cluster. Of course, the CPUs are not being utilized at all,
>> but Ganglia is reporting almost all nodes in the red as the 1, 5 and 15
>> minute load times are all above 100% most of the time (i.e. there are more
>> than two processes at a time competing for the 2 CPUs time).
>> 
>> Question1: is this normal?
>> Question2: does it matter since each process barely uses any of the CPU
>> time?
>> 
>> Thank you in advance and pardon the noob questions.
>> 
>> -GS
>> 
