[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237746374

--- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
@@ -1134,39 +1130,40 @@ class SparkSubmitSuite
     val hadoopConf = new Configuration()
     updateConfWithFakeS3Fs(hadoopConf)
-    val tmpDir = Utils.createTempDir()
-    val pyFile = File.createTempFile("tmpPy", ".egg", tmpDir)
+    withTempDir { tmpDir =>
+      val pyFile = File.createTempFile("tmpPy", ".egg", tmpDir)

-    val args = Seq(
-      "--class", UserClasspathFirstTest.getClass.getName.stripPrefix("$"),
-      "--name", "testApp",
-      "--master", "yarn",
-      "--deploy-mode", "client",
-      "--py-files", s"s3a://${pyFile.getAbsolutePath}",
-      "spark-internal"
-    )
+      val args = Seq(
+        "--class", UserClasspathFirstTest.getClass.getName.stripPrefix("$"),
+        "--name", "testApp",
+        "--master", "yarn",
+        "--deploy-mode", "client",
+        "--py-files", s"s3a://${pyFile.getAbsolutePath}",
+        "spark-internal"
+      )

-    val appArgs = new SparkSubmitArguments(args)
-    val (_, _, conf, _) = submit.prepareSubmitEnvironment(appArgs, conf = Some(hadoopConf))
+      val appArgs = new SparkSubmitArguments(args)
+      val (_, _, conf, _) = submit.prepareSubmitEnvironment(appArgs, conf = Some(hadoopConf))

-    conf.get(PY_FILES.key) should be (s"s3a://${pyFile.getAbsolutePath}")
-    conf.get("spark.submit.pyFiles") should (startWith("/"))
+      conf.get(PY_FILES.key) should be(s"s3a://${pyFile.getAbsolutePath}")
--- End diff --

Ditto. Technically it would be better to use `assert` and avoid the infix notation, but I don't think we have to do that here.

---

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
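To make the stylistic point above concrete, here is a minimal standalone sketch of the two assertion styles the reviewer contrasts. The `conf` map is a hypothetical stand-in for the SparkConf read in the test; it is not Spark code.

```scala
object AssertStyles {
  def main(args: Array[String]): Unit = {
    // Hypothetical stand-in for the SparkConf produced by prepareSubmitEnvironment.
    val conf = Map("spark.submit.pyFiles" -> "/tmp/tmpPy.egg")

    // ScalaTest infix matcher style (requires mixing in org.scalatest.Matchers):
    //   conf("spark.submit.pyFiles") should startWith("/")

    // Plain assert style, which the reviewer considers technically preferable:
    assert(conf("spark.submit.pyFiles").startsWith("/"))
  }
}
```

Both check the same condition; the plain `assert` avoids the implicit-conversion machinery that the infix matcher syntax relies on.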
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237746228

--- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
@@ -985,37 +985,38 @@ class SparkSubmitSuite
     val hadoopConf = new Configuration()
     updateConfWithFakeS3Fs(hadoopConf)
-    val tmpDir = Utils.createTempDir()
-    val file = File.createTempFile("tmpFile", "", tmpDir)
-    val pyFile = File.createTempFile("tmpPy", ".egg", tmpDir)
-    val mainResource = File.createTempFile("tmpPy", ".py", tmpDir)
-    val tmpJar = TestUtils.createJarWithFiles(Map("test.resource" -> "USER"), tmpDir)
-    val tmpJarPath = s"s3a://${new File(tmpJar.toURI).getAbsolutePath}"
+    withTempDir { tmpDir =>
+      val file = File.createTempFile("tmpFile", "", tmpDir)
+      val pyFile = File.createTempFile("tmpPy", ".egg", tmpDir)
+      val mainResource = File.createTempFile("tmpPy", ".py", tmpDir)
+      val tmpJar = TestUtils.createJarWithFiles(Map("test.resource" -> "USER"), tmpDir)
+      val tmpJarPath = s"s3a://${new File(tmpJar.toURI).getAbsolutePath}"

-    val args = Seq(
-      "--class", UserClasspathFirstTest.getClass.getName.stripPrefix("$"),
-      "--name", "testApp",
-      "--master", "yarn",
-      "--deploy-mode", "client",
-      "--jars", tmpJarPath,
-      "--files", s"s3a://${file.getAbsolutePath}",
-      "--py-files", s"s3a://${pyFile.getAbsolutePath}",
-      s"s3a://$mainResource"
+      val args = Seq(
+        "--class", UserClasspathFirstTest.getClass.getName.stripPrefix("$"),
+        "--name", "testApp",
+        "--master", "yarn",
+        "--deploy-mode", "client",
+        "--jars", tmpJarPath,
+        "--files", s"s3a://${file.getAbsolutePath}",
+        "--py-files", s"s3a://${pyFile.getAbsolutePath}",
+        s"s3a://$mainResource"
       )

-    val appArgs = new SparkSubmitArguments(args)
-    val (_, _, conf, _) = submit.prepareSubmitEnvironment(appArgs, conf = Some(hadoopConf))
+      val appArgs = new SparkSubmitArguments(args)
+      val (_, _, conf, _) = submit.prepareSubmitEnvironment(appArgs, conf = Some(hadoopConf))

-    // All the resources should still be remote paths, so that YARN client will not upload again.
-    conf.get("spark.yarn.dist.jars") should be (tmpJarPath)
--- End diff --

I wouldn't change those spaces alone, though. Let's leave them as they were.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237732426

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala ---
@@ -66,6 +66,18 @@ private[sql] trait SQLTestUtils extends SparkFunSuite with SQLTestUtilsBase with
     }
   }

+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   */
+  protected override def withTempDir(f: File => Unit): Unit = {
+    super.withTempDir { dir =>
--- End diff --

Yeah, this is what I expect. Thanks for doing it!
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237732369

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala ---
@@ -66,6 +66,18 @@ private[sql] trait SQLTestUtils extends SparkFunSuite with SQLTestUtilsBase with
     }
   }

+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
--- End diff --

nit: unnecessary blank line
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237617706

--- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
@@ -494,13 +494,12 @@ class SparkSubmitSuite
   }

   test("launch simple application with spark-submit with redaction") {
-    val testDir = Utils.createTempDir()
-    testDir.deleteOnExit()
-    val testDirPath = new Path(testDir.getAbsolutePath())
     val unusedJar = TestUtils.createJarWithClasses(Seq.empty)
     val fileSystem = Utils.getHadoopFileSystem("/",
       SparkHadoopUtil.get.newConfiguration(new SparkConf()))
-    try {
+    withTempDir { testDir =>
+      testDir.deleteOnExit()
--- End diff --

`deleteOnExit` is redundant for directories created by `Utils.createTempDir`. That method already tracks directories to clean up on exit. (Also, `deleteOnExit` does not work for non-empty directories.)
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237559695

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala ---
@@ -66,6 +66,20 @@ private[sql] trait SQLTestUtils extends SparkFunSuite with SQLTestUtilsBase with
     }
   }

+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   */
+  protected override def withTempDir(f: File => Unit): Unit = {
+    val dir = Utils.createTempDir().getCanonicalFile
+    try f(dir) finally {
--- End diff --

Why not call the super method with a function that calls `f`, then `waitForTasksToFinish()`?
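The suggestion amounts to delegating to the inherited helper instead of duplicating its try/finally. Here is a self-contained sketch of that shape; the trait and method names are illustrative stand-ins, not Spark's actual classes, and the real `waitForTasksToFinish` polls the SparkContext rather than setting a flag:

```scala
import java.io.File
import java.nio.file.Files

trait BaseSuite {
  // Plays the role of the inherited withTempDir.
  def withTempDir(f: File => Unit): Unit = {
    val dir = Files.createTempDirectory("test").toFile
    try f(dir) finally dir.delete()
  }
}

trait SqlSuite extends BaseSuite {
  var waited = false
  def waitForTasksToFinish(): Unit = { waited = true } // stand-in for the real wait

  // The override wraps f so the extra wait runs before the super
  // method deletes the directory; no try/finally is duplicated.
  override def withTempDir(f: File => Unit): Unit = {
    super.withTempDir { dir =>
      f(dir)
      waitForTasksToFinish()
    }
  }
}

object Demo extends SqlSuite {
  def main(args: Array[String]): Unit = {
    withTempDir { dir => assert(dir.isDirectory) }
    assert(waited) // the wrapped wait ran inside the super call
  }
}
```

This is the shape the PR ultimately adopted, as the later `super.withTempDir { dir =>` diff shows.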
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237559321

--- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
@@ -494,13 +494,12 @@ class SparkSubmitSuite
   }

   test("launch simple application with spark-submit with redaction") {
-    val testDir = Utils.createTempDir()
-    testDir.deleteOnExit()
-    val testDirPath = new Path(testDir.getAbsolutePath())
     val unusedJar = TestUtils.createJarWithClasses(Seq.empty)
     val fileSystem = Utils.getHadoopFileSystem("/",
       SparkHadoopUtil.get.newConfiguration(new SparkConf()))
-    try {
+    withTempDir { testDir =>
+      testDir.deleteOnExit()
--- End diff --

Although I think this is redundant for temp dirs, you could put this in the `Utils.createTempDir` method and take it out of places like this.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237364160

--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
     logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
   }
 }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
--- End diff --

I'm not talking about details like which class to override, just the idea. Why wouldn't overriding work?
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237362372

--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
     logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
   }
 }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
--- End diff --

Sorry, maybe I didn't make it clear. `withTempDir` is defined in `SQLTestUtilsBase`, not in `SQLTestUtils`. If we want to override `withTempDir`, we first have to move it from `SQLTestUtilsBase` to `SQLTestUtils` and then override it, but that way other functions get involved as well.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237347450

--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
     logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
   }
 }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
--- End diff --

OK, how about we override `withTempDir` in `SQLTestUtils` with this extra logic?
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237344805

--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
     logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
   }
 }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
--- End diff --

I'm not sure we need `assert(spark.sparkContext.statusTracker.getExecutorInfos.map(_.numRunningTasks()).sum == 0)`; after all, `protected def spark: SparkSession` is defined in `SQLTestData`. Unless we construct a `waitForTasksToFinish` that looks like:

```
protected def waitForTasksToFinish(): Unit = {
  eventually(timeout(10.seconds)) {
  }
}
```
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237338904

--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
     logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
   }
 }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
--- End diff --

Shall we also do `waitForTasksToFinish` in `withCreateTempDir`?
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237333755

--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
     logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
   }
 }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
--- End diff --

Currently, `withTempDir` and `withCreateTempDir` are somewhat different.

**withTempDir**
```
protected def withTempDir(f: File => Unit): Unit = {
  val dir = Utils.createTempDir().getCanonicalFile
  try f(dir) finally {
    // wait for all tasks to finish before deleting files
    waitForTasksToFinish()
    Utils.deleteRecursively(dir)
  }
}

protected def waitForTasksToFinish(): Unit = {
  eventually(timeout(10.seconds)) {
    assert(spark.sparkContext.statusTracker
      .getExecutorInfos.map(_.numRunningTasks()).sum == 0)
  }
}
```

**withCreateTempDir**
```
protected def withCreateTempDir(f: File => Unit): Unit = {
  val dir = Utils.createTempDir()
  try f(dir) finally {
    Utils.deleteRecursively(dir)
  }
}
```

Thanks.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237059126

--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
     logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
   }
 }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
--- End diff --

If we have this function in `SparkFunSuite`, why do we need to define it again in `SQLTestUtils`?
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r237058872

--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
     logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
   }
 }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
--- End diff --

I think this is the most general place we can find...
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r236922709

--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
     logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
   }
 }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
--- End diff --

`trait SQLTestUtils extends SparkFunSuite with SQLTestUtilsBase with PlanTest`

If `SparkFunSuite` and `SQLTestUtilsBase` both use the same name `withTempDir`, could that cause a naming conflict? Thanks.
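The naming-conflict worry can be checked with a toy example. In Scala, when two unrelated traits each define a concrete method with the same signature, the class mixing both fails to compile unless it provides an override, which can then pick one implementation explicitly via `super[...]`. This sketch uses illustrative trait names, not Spark's actual classes:

```scala
object TraitConflictDemo {
  trait CoreSuite {
    def withTempDir(): String = "core"   // stands in for SparkFunSuite's version
  }
  trait SqlBase {
    def withTempDir(): String = "sql"    // stands in for SQLTestUtilsBase's version
  }

  // object Bad extends CoreSuite with SqlBase
  //   => does not compile: conflicting members, an override is required.

  object Resolved extends CoreSuite with SqlBase {
    // The mixing type resolves the clash explicitly.
    override def withTempDir(): String = super[SqlBase].withTempDir()
  }

  def main(args: Array[String]): Unit = {
    assert(Resolved.withTempDir() == "sql")
  }
}
```

So the collision is a compile error rather than silent "contamination", and an explicit override in `SQLTestUtils` is exactly how it gets resolved.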
Github user heary-cao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r236919725

--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
     logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
   }
 }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
+    val dir = Utils.createTempDir()
--- End diff --

I'm not sure we need to call `.getCanonicalFile` again; it feels a little redundant. Looking at `Utils.createTempDir()` --> `createDirectory` --> `dir.getCanonicalFile`, `.getCanonicalFile` has already been called. Thanks.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r236912228

--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
     logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
   }
 }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
+    val dir = Utils.createTempDir()
--- End diff --

Is it better to call `.getCanonicalFile`, too?
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23151#discussion_r236912182

--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
     logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
   }
 }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
--- End diff --

Is there any reason not to use `withTempDir` as the function name, like the other modules do?
GitHub user heary-cao opened a pull request:

    https://github.com/apache/spark/pull/23151

[SPARK-26180][CORE][TEST] Add a withCreateTempDir function to the SparkCore test case

## What changes were proposed in this pull request?

Currently, the common `withTempDir` function is used in Spark SQL test cases to handle the `val dir = Utils.createTempDir()` / `Utils.deleteRecursively(dir)` pattern. Unfortunately, the `withTempDir` function cannot be used in Spark Core test cases. This PR adds a common `withCreateTempDir` function to clean up the Spark Core test cases. Thanks.

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/heary-cao/spark withCreateTempDir

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23151.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #23151

commit d29d4ece73dd17ee3ba5e1e85e2d35096f524810
Author: caoxuewen
Date: 2018-11-27T03:00:08Z

    Add a withCreateTempDir function to the SparkCore test case
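The create/use/delete pattern the PR factors out is the classic loan pattern. Below is a self-contained sketch of its shape; it substitutes `java.nio.file.Files` and a local recursive delete for Spark's `Utils.createTempDir`/`Utils.deleteRecursively` so it runs on its own, and the names mirror (but are not) the PR's code:

```scala
import java.io.File
import java.nio.file.Files

object WithTempDirSketch {
  // Loan pattern: create the resource, lend it to f, and guarantee
  // cleanup in finally even if f throws.
  def withCreateTempDir(f: File => Unit): Unit = {
    val dir = Files.createTempDirectory("spark-test").toFile
    try f(dir) finally deleteRecursively(dir)
  }

  // Minimal stand-in for Utils.deleteRecursively.
  private def deleteRecursively(file: File): Unit = {
    Option(file.listFiles()).foreach(_.foreach(deleteRecursively))
    file.delete()
  }

  def main(args: Array[String]): Unit = {
    var seen: File = null
    withCreateTempDir { dir =>
      seen = dir
      assert(dir.isDirectory)
      new File(dir, "data.txt").createNewFile()
    }
    assert(!seen.exists()) // directory and contents removed after the body returns
  }
}
```

The payoff in the test suites is that each test body shrinks to the `withCreateTempDir { dir => ... }` call, with no per-test cleanup code to forget.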