Hi,

The significant factor in cluster loading is memory, not CPU. Hadoop views the cluster only in terms of memory and cares nothing about CPU utilization or disk saturation. If you configure too many task slots per TaskTracker, you risk memory overcommit, where the Linux OOM killer comes out of the closet and randomly kills processes, which will ultimately take out the box.

Disk saturation is another issue: it contributes to DataNode timeouts, which lead to a downward spiral where the NameNode decides the DataNode is down and starts re-replicating its blocks, adding to the overall I/O load and contributing to further DataNode timeouts. Pegged DataNode CPUs contribute as well, but not as much. Generally, if you carve up your cluster properly, you won't have CPU over-utilization issues; quite the opposite.
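As a rough illustration of the overcommit math Bill describes, here is a minimal sketch of a slot-sizing calculation. All numbers and the helper name are illustrative assumptions, not Hadoop defaults: the idea is that the sum of daemon heaps, an OS reserve, and the worst-case child-task heaps must fit in physical RAM.

```python
# Hypothetical sizing sketch: estimate how many task slots a slave node can
# host before committed memory exceeds physical RAM. Figures are assumptions
# for illustration only.

def max_safe_task_slots(ram_mb, datanode_heap_mb, tasktracker_heap_mb,
                        os_reserve_mb, child_heap_mb):
    """Largest slot count whose worst-case heap usage still fits in RAM."""
    available = ram_mb - datanode_heap_mb - tasktracker_heap_mb - os_reserve_mb
    return max(available // child_heap_mb, 0)

# Example: 16 GB node, 1 GB each for the DataNode and TaskTracker daemons,
# 2 GB reserved for the OS and page cache, 1 GB heap per child task.
slots = max_safe_task_slots(16384, 1024, 1024, 2048, 1024)
print(slots)  # 12 combined map+reduce slots before overcommit risk
```

Going past this count does not fail immediately; it fails the first time enough tasks reach their peak heap at once, which is exactly when the OOM killer shows up.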
-Bill

-----Original Message-----
From: Amandeep Khurana [mailto:ama...@gmail.com]
Sent: Mon 1/16/2012 10:21 PM
To: common-user@hadoop.apache.org
Cc: hadoop-u...@lucene.apache.org
Subject: Re: How to find out whether a node is Overloaded from Cpu utilization ?

Arun,

I don't think you'll hear a fixed number. Having said that, I have seen CPU pegged at 95% during jobs with the cluster working perfectly fine.

On the slaves, if you have nothing else going on, Hadoop runs only TaskTrackers and DataNodes. Those two daemons are relatively lightweight in CPU terms for the most part, so you can afford to let your tasks take up a high percentage.

Hope that helps.

-Amandeep

On Tue, Jan 17, 2012 at 2:16 PM, ArunKumar <arunk...@gmail.com> wrote:
> Hi Guys!
>
> When we get the CPU utilization value of a node in a Hadoop cluster, what
> percent value can be considered overloaded?
> For example:
>
> CPU utilization    Node status
> 85%                Overloaded
> 20%                Normal
>
> Arun
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/How-to-find-out-whether-a-node-is-Overloaded-from-Cpu-utilization-tp3665289p3665289.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
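Amandeep's point that a pegged CPU is not by itself an overload signal suggests a different heuristic: compare the runnable load to the core count. The threshold and function below are illustrative assumptions, not anything from Hadoop; a node at 95% CPU can still be "normal" if work is not queueing.

```python
# Illustrative rule of thumb, not a Hadoop metric: a node is "overloaded"
# when its 1-minute load average exceeds its core count (tasks are queueing
# for CPU), regardless of the raw utilization percentage.

def node_status(load_avg_1m, cores):
    """Classify a node by comparing runnable load against available cores."""
    return "overloaded" if load_avg_1m > cores else "normal"

print(node_status(3.8, 4))   # CPU may read ~95%, but the node keeps up
print(node_status(12.5, 4))  # three runnable tasks per core: falling behind
```

On a live slave you could feed this from the first field of /proc/loadavg; sustained disk-wait time matters too, per Bill's note, and is not captured by CPU load alone.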