[GitHub] spark pull request #19294: [SPARK-21549][CORE] Respect OutputFormats with no...

szhem Wed, 20 Sep 2017 12:59:28 -0700

Github user szhem commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19294#discussion_r140076564
  
    --- Diff: core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala ---
    @@ -568,6 +568,51 @@ class PairRDDFunctionsSuite extends SparkFunSuite with SharedSparkContext {
         assert(FakeWriterWithCallback.exception.getMessage contains "failed to write")
       }
     
    +  test("saveAsNewAPIHadoopDataset should use current working directory " +
    +    "for files to be committed to an absolute output location when empty output path specified") {
    +    val pairs = sc.parallelize(Array((new Integer(1), new Integer(2))), 1)
    +
    +    val job = NewJob.getInstance(new Configuration(sc.hadoopConfiguration))
    +    job.setOutputKeyClass(classOf[Integer])
    +    job.setOutputValueClass(classOf[Integer])
    +    job.setOutputFormatClass(classOf[NewFakeFormat])
    +    val jobConfiguration = job.getConfiguration
    +
    +    val fs = FileSystem.get(jobConfiguration)
    +    fs.setWorkingDirectory(new Path(getClass.getResource(".").toExternalForm))
    +    try {
    +      // just test that the job does not fail with
    +      // java.lang.IllegalArgumentException: Can not create a Path from a null string
    +      pairs.saveAsNewAPIHadoopDataset(jobConfiguration)
    +    } finally {
    +      // close to prevent filesystem caching across different tests
    +      fs.close()
    --- End diff --
    
    I was counting on indirect filesystem caching, so that the filesystem
    instance would be exactly the same in the test and in `SparkHadoopWriter`;
    calling `newInstance` rules that out. I've now updated the PR not to use
    the filesystem at all.
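
For context, here is a minimal sketch of the caching behaviour I was relying on. It assumes hadoop-common is on the classpath and that the filesystem cache is left at its default (enabled) setting; the object name `FsCacheSketch` and the method `probe` are only for illustration and are not part of the PR.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Illustrative only: compares FileSystem.get (cached) with
// FileSystem.newInstance (bypasses the cache).
object FsCacheSketch {

  /** Returns (get shares one instance, newInstance shares that instance). */
  def probe(): (Boolean, Boolean) = {
    val conf = new Configuration()

    // With caching enabled, repeated get() calls for the same scheme,
    // authority and user are expected to hand back the same object.
    val cachedA = FileSystem.get(conf)
    val cachedB = FileSystem.get(conf)

    // newInstance() always creates a separate object, so state set on
    // the cached instance is not carried over to it.
    val fresh = FileSystem.newInstance(conf)
    try {
      // A working directory set through cachedA is held by the shared
      // cached instance, so other callers of get() see it too.
      cachedA.setWorkingDirectory(new Path(System.getProperty("java.io.tmpdir")))
      (cachedA eq cachedB, fresh eq cachedA)
    } finally {
      // Close only the private instance; closing the cached one would
      // affect every other holder of it.
      fresh.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val (shared, freshShared) = probe()
    println(s"get() shares an instance: $shared")
    println(s"newInstance() shares it: $freshShared")
  }
}
```

The point is that state such as the working directory lives on the instance, so a test can only influence code that obtains the very same cached object.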



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
