GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/20616
[SPARK-23434][SQL] Spark should not warn `metadata directory` for a HDFS file path

## What changes were proposed in this pull request?

When Spark reads a file path (e.g. `people.json`), it logs a misleading warning while looking up `people.json/_spark_metadata`. The root cause of this situation is the difference between `LocalFileSystem` and `DistributedFileSystem`: `LocalFileSystem.exists()` returns `false`, but `DistributedFileSystem.exists()` raises an exception.

```scala
scala> spark.version
res0: String = 2.4.0-SNAPSHOT

scala> spark.read.json("file:///usr/hdp/current/spark-client/examples/src/main/resources/people.json").show
+----+-------+
| age|   name|
+----+-------+
|null|Michael|
|  30|   Andy|
|  19| Justin|
+----+-------+

scala> spark.read.json("hdfs:///tmp/people.json")
18/02/15 05:00:48 WARN streaming.FileStreamSink: Error while looking for metadata directory.
18/02/15 05:00:48 WARN streaming.FileStreamSink: Error while looking for metadata directory.
```

After this PR,

```scala
scala> spark.read.json("hdfs:///tmp/people.json").show
+----+-------+
| age|   name|
+----+-------+
|null|Michael|
|  30|   Andy|
|  19| Justin|
+----+-------+
```

## How was this patch tested?

Manual.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-23434

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20616.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20616

----
commit a14ff6974446b8e692b03c3e3f1cab52693cc6c4
Author: Dongjoon Hyun <dongjoon@...>
Date:   2018-02-15T05:13:24Z

    [SPARK-23434][SQL] Spark should not warn `metadata directory` for a HDFS file path

----
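
For readers who want the root cause from the description in executable form, here is a minimal, self-contained sketch. It does not use the Hadoop API and is not the patch itself, since the notification does not include the diff. `SimpleFs`, `LocalLikeFs`, `HdfsLikeFs`, and both `hasMetadata*` helpers are hypothetical names, and the "path ends in `.json` means regular file" rule is a toy assumption. The guarded variant shows one possible way to avoid the warning: skip the `_spark_metadata` lookup when the input path is not a directory.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for a file system; NOT the Hadoop FileSystem API.
trait SimpleFs {
  // Toy assumption: anything ending in ".json" is a regular file.
  def isDirectory(path: String): Boolean = !path.endsWith(".json")
  def exists(path: String): Boolean
}

// Models the LocalFileSystem behavior described in the PR:
// looking up a child of a regular file simply returns false.
object LocalLikeFs extends SimpleFs {
  def exists(path: String): Boolean = false
}

// Models the DistributedFileSystem behavior described in the PR:
// the same lookup raises an exception instead of returning false.
object HdfsLikeFs extends SimpleFs {
  def exists(path: String): Boolean =
    throw new java.io.IOException(s"cannot look up $path")
}

object MetadataCheck {
  val MetadataDir = "_spark_metadata"

  // Unguarded lookup: any exception from exists() surfaces as a warning.
  def hasMetadataUnguarded(fs: SimpleFs, path: String): Boolean =
    try fs.exists(s"$path/$MetadataDir")
    catch {
      case NonFatal(_) =>
        println("WARN Error while looking for metadata directory.")
        false
    }

  // Guarded lookup (an assumed approach, not confirmed by the source):
  // a regular file cannot contain a metadata directory, so exists()
  // is never called for it and nothing is logged.
  def hasMetadataGuarded(fs: SimpleFs, path: String): Boolean =
    fs.isDirectory(path) && fs.exists(s"$path/$MetadataDir")
}
```

With these definitions, `MetadataCheck.hasMetadataUnguarded(HdfsLikeFs, "hdfs:///tmp/people.json")` prints the warning and returns `false`, while `MetadataCheck.hasMetadataGuarded(HdfsLikeFs, "hdfs:///tmp/people.json")` returns `false` silently because `&&` short-circuits before `exists` is called.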