Genmao Yu created SPARK-18960:
---------------------------------

             Summary: Avoid double reading file which is being copied.
                 Key: SPARK-18960
                 URL: https://issues.apache.org/jira/browse/SPARK-18960
             Project: Spark
          Issue Type: Bug
          Components: SQL, Structured Streaming
    Affects Versions: 2.0.2
            Reporter: Genmao Yu

In HDFS, when we copy a file into a target directory, a temporary {{._COPYING_}} file exists there for a period of time. The duration depends on the file size. If we do not skip this temporary file, we may read the same data twice: once from the temporary file while the copy is in progress, and again from the final file after the copy completes.
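
The skip described above can be illustrated with a minimal sketch. This is not Spark's actual implementation; the helper names {{is_in_progress_copy}} and {{readable_files}} are hypothetical, and the sketch assumes the temporary file carries the {{._COPYING_}} suffix that the Hadoop FS shell appends while a copy is in flight.

```python
# Assumption: in-flight copies are named "<final name>._COPYING_",
# the suffix used by the Hadoop FS shell for -put / -cp.
COPYING_SUFFIX = "._COPYING_"


def is_in_progress_copy(file_name: str) -> bool:
    """Return True if the name looks like a temporary in-flight copy (hypothetical helper)."""
    return file_name.endswith(COPYING_SUFFIX)


def readable_files(file_names):
    """Keep only the files that are safe to read, preserving the listing order."""
    return [name for name in file_names if not is_in_progress_copy(name)]


if __name__ == "__main__":
    # Made-up directory listing taken while one copy is still running.
    listing = ["part-00000.json", "part-00001.json._COPYING_"]
    print(readable_files(listing))
```

With a filter of this kind, the data in {{part-00001.json}} would be picked up only once, after the copy finishes and the file appears under its final name.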