Hello,

I set up a Hadoop 2.7.2 cluster on Ubuntu 16.04 with OpenJDK 8. After running TeraGen from the examples jar:

hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar teragen 10000000000 /user/mitya/terasort-input

I see that many NodeManagers are no longer running. In the NodeManager log file I get:

2016-05-15 11:30:29,254 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1463300801042_0001_01_003114 is : 143
2016-05-15 11:30:29,254 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1463300801042_0001_01_001872 is : 143
2016-05-15 11:30:29,254 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1463300801042_0001_01_002959 is : 143
2016-05-15 11:30:29,260 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
2016-05-15 11:30:29,361 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Applications still running : [application_1463300801042_0001]
2016-05-15 11:30:29,363 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Waiting for Applications to be Finished
2016-05-15 11:30:29,449 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_002959 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,449 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_001872 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,449 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_001220 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_003114 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_002181 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_001532 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_002649 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_002493 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_003450 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_000596 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_002805 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_000001 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_001376 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_002337 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_003269 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_002026 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1463300801042_0001_01_000440 transitioned from RUNNING to EXITED_WITH_FAILURE
2016-05-15 11:30:29,450 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1463300801042_0001 transitioned from RUNNING to FINISHING_CONTAINERS_WAIT
2016-05-15 11:30:29,451 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1463300801042_0001_01_002959
2016-05-15 11:30:29,470 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1463300801042_0001_01_001872
2016-05-15 11:30:29,476 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: RECEIVED SIGNAL 15: SIGTERM

Here is how the NodeManager was started (ps output):

mapred 5469 1 99 11:38 ? 00:00:13 /usr/lib/jvm/java-8-openjdk-amd64/bin/java -Dproc_nodemanager -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop -Dyarn.log.dir=/var/log/hadoop -Dhadoop.log.file=yarn-mapred-nodemanager.log -Dyarn.log.file=yarn-mapred-nodemanager.log -Dyarn.home.dir= -Dyarn.id.str=mapred -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/usr/local/hadoop/lib/native -Dyarn.policy.file=hadoop-policy.xml -server -Dhadoop.log.dir=/var/log/hadoop -Dyarn.log.dir=/var/log/hadoop -Dhadoop.log.file=yarn-mapred-nodemanager.log -Dyarn.log.file=yarn-mapred-nodemanager.log -Dyarn.home.dir=/usr/local/hadoop -Dhadoop.home.dir=/usr/local/hadoop -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/usr/local/hadoop/lib/native -classpath /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar:/contrib/capacity-scheduler/*.jar:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/etc/hadoop/nm-config/log4j.properties org.apache.hadoop.yarn.server.nodemanager.NodeManager

So I have two questions:

1) How can I figure out what is wrong with a simple TeraGen job?
2) Why does the NodeManager itself (a Hadoop system process) exit? Even if the user-submitted job is broken, why does the NodeManager process itself stop?

Thanks in advance!
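
P.S. For reference, here is a small sketch of how I am reading the container exit code 143 from the log. It assumes the usual shell convention that an exit code above 128 means "128 + signal number"; the helper name describe_exit_code is mine and is not part of Hadoop. Under that assumption, 143 corresponds to signal 15, which would match the "RECEIVED SIGNAL 15: SIGTERM" line logged by the NodeManager.

```python
import signal


def describe_exit_code(code):
    """Interpret an exit code using the 128 + N convention (an assumption,
    not something taken from the Hadoop sources)."""
    if code > 128:
        signum = code - 128
        try:
            # signal.Signals maps a signal number to its symbolic name.
            name = signal.Signals(signum).name
        except ValueError:
            return "exit code %d (no known signal %d)" % (code, signum)
        return "exit code %d = 128 + %d (%s)" % (code, signum, name)
    return "exit code %d (not a signal-style code)" % code


if __name__ == "__main__":
    # 143 is the value reported for the failed containers in the log.
    print(describe_exit_code(143))
```

If that reading is right, the containers did not crash on their own but were terminated from outside, which is part of why I am asking about the NodeManager shutdown.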