Nicolas Fraison created SPARK-13622: ---------------------------------------
Summary: Can't create leveldb file for YARN shuffle service Key: SPARK-13622 URL: https://issues.apache.org/jira/browse/SPARK-13622 Project: Spark Issue Type: Bug Components: YARN Affects Versions: 1.6.0 Environment: cdh 5.5.0 Reporter: Nicolas Fraison Priority: Minor After activating the spark external shuffle service for YARN in order to have dynamic ressource allocation I face those level db creation issues on the nodemanager: 2016-03-02 03:13:59,692 ERROR org.apache.spark.network.shuffle.ExternalShuffleBlockResolver: error opening leveldb file file:/data/yarn/cache/yarn/nm-local-dir/registeredExecutors.ldb. Creating new file, will not be able to recover state for existing applications org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /usr/lib/hadoop-yarn/file:/data/yarn/cache/yarn/nm-local-dir/registeredExecutors.ldb/LOCK: Aucun fichier ou dossier de ce type at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200) at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218) at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168) at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:100) at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:81) at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.<init>(ExternalShuffleBlockHandler.java:56) at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:128) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:143) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:236) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:255) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:474) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:521) 2016-03-02 03:13:59,694 WARN org.apache.spark.network.shuffle.ExternalShuffleBlockResolver: error deleting file:/data/yarn/cache/yarn/nm-local-dir/registeredExecutors.ldb 2016-03-02 03:13:59,694 ERROR org.apache.spark.network.yarn.YarnShuffleService: Failed to initialize external shuffle service java.io.IOException: Unable to create state store at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:129) at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:81) at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.<init>(ExternalShuffleBlockHandler.java:56) at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:128) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:143) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:236) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:255) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:474) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:521) Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /usr/lib/hadoop-yarn/file:/data/yarn/cache/yarn/nm-local-dir/registeredExecutors.ldb/LOCK: Aucun fichier ou dossier de ce type at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200) at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218) at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168) at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:127) ... 14 more On the yarn-site.xml config file I used URI to set path: <property> <description>List of directories to store localized files in.</description> <name>yarn.nodemanager.local-dirs</name> <value>file:///data/yarn/cache/${user.name}/nm-local-dir</value> </property> If I removed the scheme for this config in order to have /data/yarn/cache/${user.name}/nm-local-dir the nodemanager not face this issue and the level db is well created. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org