Siying Dong created SPARK-43343:
-----------------------------------

             Summary: Spark Streaming is not able to read a .txt file whose name has [] special character
                 Key: SPARK-43343
                 URL: https://issues.apache.org/jira/browse/SPARK-43343
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 3.4.0
            Reporter: Siying Dong

* For example, if a directory contains the file /path/abc[123] and a user loads the directory as stream input with spark.readStream.format("text").load("/path"), the query throws an exception saying that no path matches /path/abc[123]. Spark treats abc[123] as a glob pattern that matches only files named abc1, abc2, or abc3.
* Upon investigation, this is due to how [getBatch|https://github.com/databricks/runtime/blob/3af402d23620a0952e151d96c3184d2233217c87/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala#L269] works in the FileStreamSource. In `FileStreamSource` we already check file pattern matching and find all matching file names. However, in DataSource we check for glob characters again and try to expand them [here|https://github.com/databricks/runtime/blob/3af402d23620a0952e151d96c3184d2233217c87/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L274].
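
To illustrate why the second glob expansion fails, here is a minimal sketch of the bracket semantics. It uses Python's standard fnmatch and glob modules as a stand-in for the glob matching Spark performs; it is not the Spark or Hadoop code path, and the candidate file names are made up for the example. The escaping style is also specific to Python and may differ from how other glob implementations escape brackets.

```python
import fnmatch
import glob

# Literal file name from the report; the other candidates are hypothetical.
name = "abc[123]"
candidates = ["abc1", "abc2", "abc3", "abc[123]"]

# Read as a glob pattern, [123] is a character class: exactly one of 1, 2, 3.
# The literal file name therefore does not match its own "pattern".
matches = [c for c in candidates if fnmatch.fnmatchcase(c, name)]

# Escaping the glob metacharacters makes the pattern match only the literal name.
escaped = glob.escape(name)
literal_matches = [c for c in candidates if fnmatch.fnmatchcase(c, escaped)]

print(matches)
print(escaped)
print(literal_matches)
```

Under these assumptions, a path that has already been resolved to a concrete file must be treated as a literal (or escaped) before anything interprets it as a glob again, which is the step that appears to be missing between FileStreamSource and DataSource.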