Yes, I mentioned below we're running RHEL.
In this case, when I went to add the node, I ran "hadoop mradmin
-refreshNodes" (as user hadoop) and the master node went completely
haywire: the system load jumped to 60 ("top" was frozen on the console)
and the machine required a hard reboot.
Whether or not the slave node I added had errors in its *.xml
configuration, this should never happen. At least, I would like it if
it never happened again ;-)
We're running:
java version "1.6.0_39"
Java(TM) SE Runtime Environment (build 1.6.0_39-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode)
Hadoop v1.0.1
Perhaps we ran into a bug? I know we need to upgrade, but we're being
very cautious about changes to the production environment; we take an
"if it works, don't fix it" approach.
Thanks,
Forrest
On 9/16/13 5:04 PM, Vinod Kumar Vavilapalli wrote:
I assume you are on Linux. Also assuming that your tasks are so
resource intensive that they are taking down nodes. You should enable
limits per task, see
http://hadoop.apache.org/docs/stable/cluster_setup.html#Memory+monitoring
What it does is force jobs to declare their resource requirements up
front, and the TaskTrackers (TTs) then enforce those limits.
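For Hadoop 1.x, these limits live in mapred-site.xml. A minimal sketch
(the property names come from the 1.x memory-monitoring feature; the
values are illustrative placeholders, not recommendations):

```xml
<!-- mapred-site.xml: per-slot memory limits (values illustrative) -->
<property>
  <name>mapred.cluster.map.memory.mb</name>
  <value>2048</value>  <!-- size of one map slot, in MB -->
</property>
<property>
  <name>mapred.cluster.reduce.memory.mb</name>
  <value>2048</value>  <!-- size of one reduce slot, in MB -->
</property>
<property>
  <name>mapred.cluster.max.map.memory.mb</name>
  <value>4096</value>  <!-- largest memory a map task may request -->
</property>
<property>
  <name>mapred.cluster.max.reduce.memory.mb</name>
  <value>4096</value>  <!-- largest memory a reduce task may request -->
</property>
<!-- per-job defaults; tasks exceeding their limit are killed by the TT -->
<property>
  <name>mapred.job.map.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapred.job.reduce.memory.mb</name>
  <value>2048</value>
</property>
```

With these set, a runaway task gets killed by the TaskTracker instead
of dragging the whole node down.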
HTH
+Vinod Kumar Vavilapalli
Hortonworks Inc.
http://hortonworks.com/
On Sep 16, 2013, at 1:35 PM, Forrest Aldrich wrote:
We recently experienced a couple of situations that brought one or
more Hadoop nodes down (unresponsive). One was related to a bug in
a utility we use (ffmpeg) that was resolved by compiling a new
version. The next, today, occurred after attempting to join a new
node to the cluster.
A basic start of the (local) TaskTracker and DataNode did not work,
so, based on documentation I found, I issued hadoop mradmin
-refreshNodes, which was to be followed by hadoop dfsadmin
-refreshNodes. The load average immediately jumped to 60 and the
master (which also runs a slave) became unresponsive.
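For concreteness, the sequence I was attempting (as I understand it
from the docs; I may well have it wrong) was:

```
# On the master, after adding the new host to the include files
# referenced by dfs.hosts / mapred.hosts (if those are configured):
hadoop mradmin -refreshNodes    # JobTracker re-reads its host lists
hadoop dfsadmin -refreshNodes   # NameNode re-reads its host lists

# On the new slave node, start the local daemons:
hadoop-daemon.sh start datanode
hadoop-daemon.sh start tasktracker
```

It was during the first refresh that the load spiked.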
Seems to me that this should never happen. But, looking around, I
saw an article from Spotify which mentioned the need to set certain
resource limits on the JVM as well as on the system itself (via
limits.conf; we run RHEL). I (and we) are fairly new to Hadoop, so
some of these issues are new to us.
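Based on that article, I gather the per-user limits would go in
/etc/security/limits.conf, something like the following (the values
are our guesses, not tested recommendations):

```
# /etc/security/limits.conf: raise open-file and process limits
# for the user that runs the Hadoop daemons (values illustrative)
hadoop  soft  nofile  32768
hadoop  hard  nofile  32768
hadoop  soft  nproc   32768
hadoop  hard  nproc   32768
```

Is something along these lines what people typically run in production?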
I wonder if some of the experts here might be able to comment on this
issue - perhaps point out settings and other measures we can take to
prevent this sort of incident in the future.
Our setup is not complicated. We have 3 Hadoop nodes; the first is
also a master and a slave (and has more resources, too). Our
underlying system splits work into tasks for ffmpeg (which is another
issue, as it tends to eat resources, but so far, after a recompile,
we are good). We have two more hardware nodes to add shortly.
Thanks!