[GitHub] spark pull request #15627: [SPARK-18099][YARN] Fail if same files added to d...
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/15627#discussion_r86930468

    --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -600,10 +600,14 @@ private[spark] class Client(
             val (_, localizedPath) = distribute(file, resType = resType)
             if (addToClasspath) {
               if (localizedPath != null) {
    -            cachedSecondaryJarLinks += localizedPath
    +            cachedSecondaryJarLinks += localizedPath
               }
             } else {
    -          require(localizedPath !=null)
    +          if (localizedPath != null) {
    --- End diff --

    I guess here is `localizedPath == null` ?

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15627: [SPARK-18099][YARN] Fail if same files added to d...
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/15627
[GitHub] spark pull request #15627: [SPARK-18099][YARN] Fail if same files added to d...
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/15627#discussion_r85948721

    --- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala ---
    @@ -282,6 +282,37 @@ class ClientSuite extends SparkFunSuite with Matchers with BeforeAndAfterAll
         }
       }

    +  test("distribute archive multiple times") {
    +    val libs = Utils.createTempDir()
    +    val jarsDir = new File(libs, "jars")
    +    assert(jarsDir.mkdir())
    --- End diff --

    I don't see jarsDir being used anywhere either
[GitHub] spark pull request #15627: [SPARK-18099][YARN] Fail if same files added to d...
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/15627#discussion_r85948423

    --- Diff: yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala ---
    @@ -282,6 +282,37 @@ class ClientSuite extends SparkFunSuite with Matchers with BeforeAndAfterAll
         }
       }

    +  test("distribute archive multiple times") {
    +    val libs = Utils.createTempDir()
    +    val jarsDir = new File(libs, "jars")
    +    assert(jarsDir.mkdir())
    +    new FileOutputStream(new File(libs, "RELEASE")).close()
    +    val userLib1 = Utils.createTempDir()
    +    val userLib2 = Utils.createTempDir()
    +
    +    val jar1 = TestUtils.createJarWithFiles(Map(), jarsDir)
    --- End diff --

    not used anywhere
[GitHub] spark pull request #15627: [SPARK-18099][YARN] Fail if same files added to d...
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/15627#discussion_r85552081

    --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -598,8 +598,12 @@ private[spark] class Client(
         ).foreach { case (flist, resType, addToClasspath) =>
           flist.foreach { file =>
             val (_, localizedPath) = distribute(file, resType = resType)
    -        if (addToClasspath && localizedPath != null) {
    -          cachedSecondaryJarLinks += localizedPath
    +        if (addToClasspath) {
    +          if (localizedPath != null) {
    +            cachedSecondaryJarLinks += localizedPath
    +          }
    +        } else {
    +          require(localizedPath !=null)
    --- End diff --

    Let's change the error to an illegal argument exception. Also, let's add a
    comment here to indicate that duplicate jars are OK because of the Spark 2.0
    jar install; everything else shouldn't have multiples of the same
    jar/file/archive.
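The branch discussed in this comment could look roughly like the following. This is a minimal sketch, not the merged patch. It assumes, as the diff suggests, that `distribute` yields a null `localizedPath` when the resource was already added to the distributed cache. The object `DuplicateCheckSketch`, the method `recordOrFail`, and the exception message are hypothetical; only `addToClasspath`, `localizedPath`, and `cachedSecondaryJarLinks` come from the diff. Note that Scala's `require` already throws an `IllegalArgumentException` (with a "requirement failed" message), so the explicit `throw` here mainly buys a message that names the offending file.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for the branch in Client.scala; not the actual patch.
object DuplicateCheckSketch {
  // Same name as the buffer in the diff, modelled here as a plain list of links.
  val cachedSecondaryJarLinks = ListBuffer.empty[String]

  // Assumption: localizedPath is null when the resource was already distributed.
  def recordOrFail(file: String, localizedPath: String, addToClasspath: Boolean): Unit = {
    if (addToClasspath) {
      // Jars may legitimately repeat (the review comment attributes this to
      // the Spark 2.0 jar install), so a duplicate jar is skipped silently.
      if (localizedPath != null) {
        cachedSecondaryJarLinks += localizedPath
      }
    } else if (localizedPath == null) {
      // Files and archives must not be added to the cache twice.
      throw new IllegalArgumentException(
        s"Attempt to add ($file) multiple times to the distributed cache.")
    }
  }
}
```

With this shape, a repeated jar leaves `cachedSecondaryJarLinks` unchanged, while a repeated file or archive aborts with an error that identifies the duplicate.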
[GitHub] spark pull request #15627: [SPARK-18099][YARN] Fail if same files added to d...
GitHub user kishorvpatil opened a pull request:

    https://github.com/apache/spark/pull/15627

    [SPARK-18099][YARN] Fail if same files added to distributed cache for --files and --archives

    ## What changes were proposed in this pull request?

    During spark-submit, the YARN distributed cache can be instructed to add
    the same file under both --files and --archives. This code change ensures
    that the Spark YARN distributed cache behaviour is retained, i.e. it warns
    and fails if the same file is mentioned in both --files and --archives.

    ## How was this patch tested?

    Manually tested:
    1. If the same jar is mentioned in --jars and --files, it will continue to
       submit the job. Basically, the functionality of [SPARK-14423] #12203 is
       unchanged.
    2. If the same file is mentioned in --files and --archives, it will fail
       to submit the job.

    Please review https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark before opening a pull request.

    … under archives and files

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kishorvpatil/spark spark18099

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15627.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #15627

commit 9bb16236ad7bbb982e0ffaa73899ebc11df9e6ee
Author: Kishor Patil
Date:   2016-10-25T18:19:46Z

    Dist cache yarn during submit should throw error for adding same file under archives and files
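The rule in the pull request title can be illustrated with a small standalone check. This is only a sketch under stated assumptions: the object `DistCacheArgsSketch` and its methods `overlap` and `validate` are hypothetical, the inputs are taken as already-split path lists rather than the comma-separated strings given on the command line, and the check runs up front, whereas the patch under review detects the duplicate while distributing each resource in `Client.scala`.

```scala
// Hypothetical helper illustrating the rule from the PR title; not Spark code.
object DistCacheArgsSketch {
  /** Paths that appear under both --files and --archives, in --files order. */
  def overlap(files: Seq[String], archives: Seq[String]): Seq[String] = {
    val archiveSet = archives.toSet
    files.distinct.filter(archiveSet.contains)
  }

  /** Throws if any path was passed under both options. */
  def validate(files: Seq[String], archives: Seq[String]): Unit = {
    val duplicates = overlap(files, archives)
    if (duplicates.nonEmpty) {
      throw new IllegalArgumentException(
        s"Attempt to add (${duplicates.mkString(", ")}) multiple times to the distributed cache.")
    }
  }
}
```

For example, `DistCacheArgsSketch.validate(Seq("conf.txt"), Seq("data.zip"))` returns normally, while passing `data.zip` in both lists raises an `IllegalArgumentException`.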