[ https://issues.apache.org/jira/browse/SPARK-19340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Reza Safi updated SPARK-19340:
------------------------------
Description: If you try to open a file whose name matches a pattern like
{noformat}
"*{*}*.*"
{noformat}
or
{noformat}
"*[*]*.*"
{noformat}
using the CSV format, you will get "org.apache.spark.sql.AnalysisException: Path does not exist", whether the file is local or on HDFS. This bug can be reproduced on master and all other Spark 2 branches.

To reproduce:
# Create a file such as "test{00-1}.txt" in a local directory (for example /Users/reza/test/test{00-1}.txt)
# Run spark-shell
# Execute this command:
val df = spark.read.option("header", "false").csv("/Users/reza/test/*.txt")

You will see the following stack trace:
{noformat}
org.apache.spark.sql.AnalysisException: Path does not exist: file:/Users/reza/test/test\{00-01\}.txt;
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:367)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:360)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
  at scala.collection.immutable.List.flatMap(List.scala:344)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:360)
  at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.readText(CSVFileFormat.scala:208)
  at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:63)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$7.apply(DataSource.scala:174)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$7.apply(DataSource.scala:174)
  at scala.Option.orElse(Option.scala:289)
  at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:173)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:377)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:158)
  at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:423)
  at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:360)
  ... 48 elided
{noformat}

> Opening a file in CSV format will result in an exception if the filename
> contains special characters
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-19340
>                 URL: https://issues.apache.org/jira/browse/SPARK-19340
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0, 2.0.1, 2.1.0, 2.2.0
>            Reporter: Reza Safi
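The failure arises because the path resolution treats "{", "}", "[" and "]" as glob metacharacters, so a literal file name like test{00-1}.txt is interpreted as a (non-matching) pattern instead of a concrete path. As a minimal sketch of a possible caller-side workaround — EscapeGlob is a hypothetical helper, not part of Spark's or Hadoop's API — one can backslash-escape those characters before passing a literal path to the reader:

```java
// Hypothetical helper (not part of Spark or Hadoop): backslash-escape the
// characters that glob matching treats as metacharacters ({ } [ ] * ? \),
// so a literal file name such as "test{00-1}.txt" is not interpreted as a
// glob pattern when handed to spark.read().csv(...).
public class EscapeGlob {
    private static final String GLOB_META = "{}[]*?\\";

    public static String escape(String path) {
        StringBuilder sb = new StringBuilder(path.length());
        for (char c : path.toCharArray()) {
            if (GLOB_META.indexOf(c) >= 0) {
                sb.append('\\');  // prefix each metacharacter with a backslash
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("/Users/reza/test/test{00-1}.txt"));
        // prints /Users/reza/test/test\{00-1\}.txt
    }
}
```

A caller would then pass the escaped string to the CSV reader when the path is meant literally rather than as a pattern; this only sketches the escaping rule and does not address the underlying bug in how DataSource resolves already-qualified paths.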
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org