[ https://issues.apache.org/jira/browse/YARN-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109048#comment-16109048 ]
abhishek bharani commented on YARN-6914:
----------------------------------------

Below is the information from NM Logs:

2017-08-01 10:19:50,510 ERROR org.apache.spark.network.util.LevelDBProvider: error opening leveldb file /usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb. Creating new file, will not be able to recover state for existing applications
org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb/LOCK: No such file or directory
    at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
    at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
    at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
    at org.apache.spark.network.util.LevelDBProvider.initLevelDB(LevelDBProvider.java:48)
    at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:116)
    at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:94)
    at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.<init>(ExternalShuffleBlockHandler.java:65)
    at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:166)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:143)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:261)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:495)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:543)
2017-08-01 10:19:50,511 WARN org.apache.spark.network.util.LevelDBProvider: error deleting /usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb
2017-08-01 10:19:50,511 INFO org.apache.hadoop.service.AbstractService: Service spark_shuffle failed in state INITED; cause: java.io.IOException: Unable to create state store
java.io.IOException: Unable to create state store
    at org.apache.spark.network.util.LevelDBProvider.initLevelDB(LevelDBProvider.java:77)
    at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:116)
    at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:94)
    at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.<init>(ExternalShuffleBlockHandler.java:65)
    at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:166)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:143)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:261)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:495)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:543)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb/LOCK: No such file or directory
    at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
    at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
    at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
    at org.apache.spark.network.util.LevelDBProvider.initLevelDB(LevelDBProvider.java:75)
    ... 15 more
2017-08-01 10:19:50,513 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Using ResourceCalculatorPlugin : null

> Application application_1501553373419_0001 failed 2 times due to AM Container
> for appattempt_1501553373419_0001_000002 exited with exitCode: -1000
> --------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-6914
>                 URL: https://issues.apache.org/jira/browse/YARN-6914
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.7.3
>         Environment: Mac OS
>            Reporter: abhishek bharani
>            Priority: Critical
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> I am getting below error while running
> spark-shell --master yarn
> Application application_1501553373419_0001 failed 2 times due to AM Container
> for appattempt_1501553373419_0001_000002 exited with exitCode: -1000
> For more detailed output, check application tracking
> page:http://abhisheks-mbp:8088/cluster/app/application_1501553373419_0001Then,
> click on links to logs of each attempt.
> Diagnostics: null
> Failing this attempt. Failing the application.
> Below are the contents of yarn-site.xml :
> <configuration>
> <!-- Site specific YARN configuration properties -->
>     <property>
>         <name>yarn.nodemanager.aux-services</name>
>         <value>mapreduce_shuffle</value>
>     </property>
>     <property>
>         <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
>         <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>     </property>
>     <property>
>         <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
>         <value>org.apache.spark.network.yarn.YarnShuffleService</value>
>     </property>
>     <property>
>         <name>yarn.log-aggregation-enable</name>
>         <value>true</value>
>     </property>
>     <property>
>         <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
>         <value>3600</value>
>     </property>
>     <property>
>         <name>yarn.resourcemanager.hostname</name>
>         <value>localhost</value>
>     </property>
>     <property>
>         <name>yarn.resourcemanager.resourcetracker.address</name>
>         <value>${yarn.resourcemanager.hostname}:8025</value>
>         <description>Enter your ResourceManager hostname.</description>
>     </property>
>     <property>
>         <name>yarn.resourcemanager.scheduler.address</name>
>         <value>${yarn.resourcemanager.hostname}:8035</value>
>         <description>Enter your ResourceManager hostname.</description>
>     </property>
>     <property>
>         <name>yarn.resourcemanager.address</name>
>         <value>${yarn.resourcemanager.hostname}:8055</value>
>         <description>Enter your ResourceManager hostname.</description>
>     </property>
>     <property>
>         <description>The http address of the RM web application.</description>
>         <name>yarn.resourcemanager.webapp.address</name>
>         <value>${yarn.resourcemanager.hostname}:8088</value>
>     </property>
> I tried many solutions but none of them is working :
> 1.Added property
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage
> to yarn-site.xml with value as 98.5
> 2.added below property to yarn-site.xml
> yarn.nodemanager.aux-services.spark_shuffle.class
> org.apache.spark.network.yarn.YarnShuffleService
> 3.Added property in spark-defaults.conf
> spark.yarn.jars=hdfs://localhost:50010/users/spark/jars/*.jar
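
The `LOCK: No such file or directory` error in the NM log suggests that the parent directory of `registeredExecutors.ldb` may be missing or not writable by the user running the NodeManager. That is an assumption; the log does not confirm it. A minimal Python sketch of such a check is below. The function name `check_state_store_parent` is hypothetical, and the path is the one printed in the log.

```python
import os


def check_state_store_parent(db_path):
    """Report whether the directory that should contain db_path is usable.

    Hypothetical helper: it only inspects the filesystem and does not
    touch LevelDB, YARN, or Spark.
    """
    parent = os.path.dirname(os.path.normpath(db_path))
    return {
        "parent": parent,
        # LevelDB has to create <db_path>/LOCK, so the parent must exist.
        "parent_exists": os.path.isdir(parent),
        # Write and search permission are needed to create entries in it.
        "parent_writable": os.access(parent, os.W_OK | os.X_OK),
        "db_exists": os.path.exists(db_path),
    }


if __name__ == "__main__":
    # Path copied from the NodeManager log.
    path = "/usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb"
    for key, value in check_state_store_parent(path).items():
        print(f"{key}: {value}")
```

It should be run as the same user that starts the NodeManager, since the writability result depends on the calling user. If `parent_exists` is false, creating the directory before restarting the NodeManager may be worth trying.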