Aniket Bhatnagar created SPARK-18273:
----------------------------------------
Summary: DataFrameReader.load takes a lot of time to start the job if a lot of file/dir paths are passed
Key: SPARK-18273
URL: https://issues.apache.org/jira/browse/SPARK-18273
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 2.0.1
Reporter: Aniket Bhatnagar

If the paths Seq parameter contains many elements, DataFrameReader.load takes a long time to start the job because it checks whether each path exists using fs.exists. There should be a boolean configuration option to disable the path existence check, and that option should be passed as a parameter to the DataSource.resolveRelation call.
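
The cost described above can be illustrated with a small model that does not depend on Spark. This is only a sketch under stated assumptions, not Spark's implementation: resolve_paths, its check_exists flag, and CountingExists are hypothetical names invented for the illustration, and os.path.exists stands in for fs.exists. It shows that validation costs one existence call per path, and that a boolean switch removes those calls entirely.

```python
import os
from typing import Callable, List, Sequence


def resolve_paths(
    paths: Sequence[str],
    check_exists: bool = True,
    exists: Callable[[str], bool] = os.path.exists,
) -> List[str]:
    """Hypothetical stand-in for the path validation done before a load.

    With check_exists=True, every path costs one existence call; on a
    remote file system that is one round trip per path. With
    check_exists=False, validation is skipped, so a missing path would
    only surface later, when the data is actually read.
    """
    if not check_exists:
        return list(paths)
    # One existence call per path: this is the part that scales with
    # the number of paths passed in.
    missing = [p for p in paths if not exists(p)]
    if missing:
        raise FileNotFoundError(f"Path does not exist: {missing[0]}")
    return list(paths)


class CountingExists:
    """Wraps os.path.exists and counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, path: str) -> bool:
        self.calls += 1
        return os.path.exists(path)
```

Passing a CountingExists instance as the exists argument makes the difference visible: with the check enabled, the counter grows by the number of paths; with it disabled, the counter does not change. The trade-off of the proposed option is that a nonexistent path is no longer reported before the job starts.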