[GitHub] spark pull request: [WIP] In yarn.ClientBase spark.yarn.dist.* do ...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/969 [WIP] In yarn.ClientBase spark.yarn.dist.* do not work

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark yarn_ClientBase

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/969.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #969

commit 836248956ff3ef17d44cea37b357e3616b054d64
Author: witgo
Date: 2014-06-04T16:50:12Z

    yarn.ClientBase spark.yarn.dist.* do not work

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] spark pull request: [WIP] In yarn.ClientBase spark.yarn.dist.* do ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/969#issuecomment-45120602 Merged build started.
[GitHub] spark pull request: [WIP] In yarn.ClientBase spark.yarn.dist.* do ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/969#issuecomment-45120576 Merged build triggered.
[GitHub] spark pull request: [WIP] In yarn.ClientBase spark.yarn.dist.* do ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/969#issuecomment-45124529 Please provide more description of the problem and how to reproduce it, and open a JIRA.
[GitHub] spark pull request: [WIP] In yarn.ClientBase spark.yarn.dist.* do ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/969#issuecomment-45125353 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15449/
[GitHub] spark pull request: [WIP] In yarn.ClientBase spark.yarn.dist.* do ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/969#issuecomment-45125352 Merged build finished. All automated tests passed.
[GitHub] spark pull request: [WIP] In yarn.ClientBase spark.yarn.dist.* do ...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/969#discussion_r13417150

```diff
--- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala ---
@@ -220,10 +220,21 @@ trait ClientBase extends Logging {
       }
     }

+    def getArg(arg: String, envVar: String, sysProp: String): String = {
+      if (arg != null && !arg.isEmpty) {
+        arg
+      } else if (System.getenv(envVar) != null && !System.getenv(envVar).isEmpty) {
+        System.getenv(envVar)
+      } else {
+        sparkConf.getOption(sysProp).orNull
+      }
+    }
     var cachedSecondaryJarLinks = ListBuffer.empty[String]
-    val fileLists = List( (args.addJars, LocalResourceType.FILE, true),
-      (args.files, LocalResourceType.FILE, false),
-      (args.archives, LocalResourceType.ARCHIVE, false) )
+    val fileLists = List((args.addJars, LocalResourceType.FILE, true),
+      (getArg(args.files, "SPARK_YARN_DIST_FILES", "spark.yarn.dist.files"),
+        LocalResourceType.FILE, false),
+      (getArg(args.archives, "SPARK_YARN_DIST_ARCHIVES", "spark.yarn.dist.archives"),
```

--- End diff --

I don't think env variables and conf entries should be handled here like this. YarnClientSchedulerBackend already deals with the env variable and command line option for client mode. It seems that SparkSubmit might be missing code to handle the env variable for cluster mode, though. Probably better to fix it there, and leave this code to deal only with the command line args (which are already correctly parsed).
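The fallback order under discussion (non-empty command-line argument, then non-empty environment variable, then Spark conf entry) can be sketched in isolation. This is a hedged sketch, not Spark code: `ArgPrecedence.resolve` and its explicit `env`/`conf` parameters are hypothetical stand-ins for `System.getenv` and `sparkConf.getOption`, so the logic can be exercised without a running Spark.

```scala
// Minimal sketch of the precedence implemented by the proposed getArg helper:
// a non-empty command-line argument wins, then a non-empty environment
// variable, then the conf entry. Values are passed in explicitly rather than
// read from the real environment or a SparkConf.
object ArgPrecedence {
  def resolve(arg: String, env: Option[String], conf: Option[String]): Option[String] =
    Option(arg).filterNot(_.isEmpty)     // command-line arg, if set and non-empty
      .orElse(env.filterNot(_.isEmpty))  // else env variable, if set and non-empty
      .orElse(conf)                      // else the conf entry (may be None)
}
```

As vanzin notes, the open design question is not the precedence itself but where it lives: resolving env variables and conf entries arguably belongs in SparkSubmit / YarnClientSchedulerBackend, leaving ClientBase to consume already-parsed arguments.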
[GitHub] spark pull request: [WIP] In yarn.ClientBase spark.yarn.dist.* do ...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/969#issuecomment-45224988

Spark configuration `conf/spark-defaults.conf` =>
```
spark.yarn.dist.archives /toona/conf
spark.executor.extraClassPath ./conf
spark.driver.extraClassPath ./conf
```
HDFS directory `hadoop dfs -cat /toona/conf/toona.conf` =>
```
redis.num=4
```
- The following command fails:
```shell
YARN_CONF_DIR=/etc/hadoop/conf ./bin/spark-submit --num-executors 2 --driver-memory 2g --executor-memory 2g --master yarn-cluster --class toona.DeployTest toona-assembly.jar
```
The test code is as follows:
```scala
package toona

import com.typesafe.config.Config
import com.typesafe.config.ConfigFactory

object DeployTest {
  def main(args: Array[String]) {
    val conf = ConfigFactory.load("toona.conf")
    val redisNum = conf.getInt("redis.num") // This throws a `ConfigException`
    assert(redisNum == 4)
  }
}
```