[ https://issues.apache.org/jira/browse/SPARK-23843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhoutai.zt resolved SPARK-23843.
--------------------------------
    Resolution: Invalid

> Deploy yarn meets incorrect LOCALIZED_CONF_DIR
> ----------------------------------------------
>
>                 Key: SPARK-23843
>                 URL: https://issues.apache.org/jira/browse/SPARK-23843
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 2.3.0
>         Environment: spark-2.3.0-bin-hadoop2.7
>            Reporter: zhoutai.zt
>            Priority: Major
>
> We have implemented a new Hadoop-compatible filesystem and are running Spark on it.
> The command is:
> {quote}./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --executor-memory 1G --num-executors 1 /home/hadoop/app/spark-2.3.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.0.jar 10
> {quote}
> The result is:
> {quote}Exception in thread "main" org.apache.spark.SparkException: Application application_1522399820301_0020 finished with failed status
> at org.apache.spark.deploy.yarn.Client.run(Client.scala:1159)
> {quote}
> We set the log level to DEBUG and found:
> {quote}2018-04-02 09:36:09,603 DEBUG org.apache.spark.deploy.yarn.Client: __app__.jar -> resource \{ scheme: "dfs" host: "f-63a47d43wh98.cn-neimeng-env10-d01.dfs.aliyuncs.com" port: 10290 file: "/user/hadoop/.sparkStaging/application_1522399820301_0006/spark-examples_2.11-2.3.0.jar" } size: 1997548 timestamp: 1522632978000 type: FILE visibility: PRIVATE
> 2018-04-02 09:36:09,603 DEBUG org.apache.spark.deploy.yarn.Client: __spark_libs__ -> resource \{ scheme: "dfs" host: "f-63a47d43wh98.cn-neimeng-env10-d01.dfs.aliyuncs.com" port: 10290 file: "/user/hadoop/.sparkStaging/application_1522399820301_0006/__spark_libs__924155631753698276.zip" } size: 242801307 timestamp: 1522632977000 type: ARCHIVE visibility: PRIVATE
> 2018-04-02 09:36:09,603 DEBUG org.apache.spark.deploy.yarn.Client: __spark_conf__ -> resource \{ port: -1 file: "/user/hadoop/.sparkStaging/application_1522399820301_0006/__spark_conf__.zip" } size:
> 185531 timestamp: 1522632978000 type: ARCHIVE visibility: PRIVATE
> {quote}
> As shown, the entries for __app__.jar and __spark_libs__ are correct, but the
> entry for __spark_conf__ has no scheme and no host, and its port is -1.
> Looking at the source code, addResource is called in two places in Client.scala:
> {code:java}
> val destPath = copyFileToRemote(destDir, localPath, replication, symlinkCache)
> val destFs = FileSystem.get(destPath.toUri(), hadoopConf)
> distCacheMgr.addResource(
>   destFs, hadoopConf, destPath, localResources, resType, linkname, statCache,
>   appMasterOnly = appMasterOnly)
> {code}
> {code:java}
> val remoteConfArchivePath = new Path(destDir, LOCALIZED_CONF_ARCHIVE)
> val remoteFs = FileSystem.get(remoteConfArchivePath.toUri(), hadoopConf)
> sparkConf.set(CACHED_CONF_ARCHIVE, remoteConfArchivePath.toString())
>
> val localConfArchive = new Path(createConfArchive().toURI())
> copyFileToRemote(destDir, localConfArchive, replication, symlinkCache, force = true,
>   destName = Some(LOCALIZED_CONF_ARCHIVE))
>
> // Manually add the config archive to the cache manager so that the AM is launched with
> // the proper files set up.
> distCacheMgr.addResource(
>   remoteFs, hadoopConf, remoteConfArchivePath, localResources, LocalResourceType.ARCHIVE,
>   LOCALIZED_CONF_DIR, statCache, appMasterOnly = false)
> {code}
> As the source code shows, the two destination paths are constructed differently:
> the first is the value returned by copyFileToRemote, while the second is built
> directly from destDir with new Path(destDir, LOCALIZED_CONF_ARCHIVE).
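
How the path is constructed matters because of what ends up in its URI. The following is a minimal sketch using only java.net.URI, not Hadoop's Path or YARN's ConverterUtils. It shows that a path string with no scheme parses to a null scheme, a null host and port -1, while the same path resolved against a filesystem URI carries all three. The class name, the host namenode.example and the app_0001 directory are placeholders. Resolving against a base URI only stands in for whatever qualification the real code performs; that correspondence is an assumption.

```java
import java.net.URI;

public class UriComponentsSketch {
    // Renders the URI parts as scheme;host;port;path.
    // Missing scheme or host show up as "null", a missing port as -1.
    static String describe(URI uri) {
        return uri.getScheme() + ";" + uri.getHost() + ";" + uri.getPort() + ";" + uri.getPath();
    }

    // Fills in scheme and authority from a base filesystem URI.
    // This is a stand-in for path qualification, not Hadoop's implementation.
    static URI qualify(URI defaultFs, URI path) {
        return defaultFs.resolve(path);
    }

    public static void main(String[] args) {
        // Placeholder values, not taken from the report.
        URI defaultFs = URI.create("dfs://namenode.example:10290/");
        URI bare = URI.create("/user/hadoop/.sparkStaging/app_0001/__spark_conf__.zip");

        System.out.println("bare:      " + describe(bare));
        System.out.println("qualified: " + describe(qualify(defaultFs, bare)));
    }
}
```

If the staging directory itself is not fully qualified, any path derived from it with a plain parent/child join would inherit the same gap.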
> This is confirmed by debug logging that we added ourselves:
> {quote}2018-04-02 15:18:46,357 ERROR org.apache.hadoop.yarn.util.ConverterUtils: getYarnUrlFromURI URI:/user/root/.sparkStaging/application_1522399820301_0020/__spark_conf__.zip
> 2018-04-02 15:18:46,357 ERROR org.apache.hadoop.yarn.util.ConverterUtils: getYarnUrlFromURI URL:null; null;-1;null;/user/root/.sparkStaging/application_1522399820301_0020/__spark_conf__.zip{quote}
> Log messages on the YARN NM:
> {quote}2018-04-02 09:36:11,958 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Failed to parse resource-request
> java.net.URISyntaxException: Expected scheme name at index 0: :///user/hadoop/.sparkStaging/application_1522399820301_0006/__spark_conf__.zip
> {quote}
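
The NodeManager error can be reproduced in isolation. The sketch below assumes the offending string arises when an empty (rather than null) scheme and authority are handed to the five-argument java.net.URI constructor. Whether the NodeManager takes exactly this route is not confirmed by the report. The class name and the sample path and host are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class EmptySchemeSketch {
    // Builds a URI from its parts and reports either the resulting text
    // or the parser's complaint if the assembled string is rejected.
    static String tryBuild(String scheme, String authority, String path) {
        try {
            return "ok: " + new URI(scheme, authority, path, null, null);
        } catch (URISyntaxException e) {
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Placeholder path, not taken from the report.
        String path = "/user/hadoop/.sparkStaging/app_0001/__spark_conf__.zip";

        // Null parts are simply omitted, leaving a plain path.
        System.out.println(tryBuild(null, null, path));
        // Empty parts are still emitted as ":" and "//", giving ":///user/...".
        System.out.println(tryBuild("", "", path));
        // A complete scheme and authority parse normally.
        System.out.println(tryBuild("dfs", "namenode.example:10290", path));
    }
}
```

The second call is the interesting one: the assembled string begins with a colon, so the parser finds no scheme name at index 0, which matches the shape of the exception in the NM log.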