Mike Dias created SPARK-26875:
---------------------------------

             Summary: Add an option on FileStreamSource to include modified files
                 Key: SPARK-26875
                 URL: https://issues.apache.org/jira/browse/SPARK-26875
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 2.4.0
            Reporter: Mike Dias

The current behavior checks only the filename to determine whether a file should be processed. I propose adding an option to also test whether the file's timestamp is later than the last time it was processed, as an indication that the file has been modified and has different content. This is useful when the source producer sometimes overwrites files with new content.
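
To make the proposal concrete, here is a minimal Python sketch of the decision being described. It is a simplified model, not Spark's actual FileStreamSource code: the class `SeenFiles`, its methods, and the flag name `include_modified` are all hypothetical names chosen for illustration, and timestamps are assumed to be modification times in milliseconds.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SeenFiles:
    """Toy model of tracking which files a streaming source has processed."""

    # Hypothetical option: when True, a known path is re-processed if its
    # modification time is later than the one recorded at last processing.
    include_modified: bool = False
    # Maps file path -> modification time (ms) recorded when it was processed.
    _seen: Dict[str, int] = field(default_factory=dict)

    def should_process(self, path: str, mtime_ms: int) -> bool:
        last = self._seen.get(path)
        if last is None:
            # Never seen this filename: always process it.
            return True
        # Filename already seen: only the proposed option re-admits it,
        # and only when the timestamp has strictly advanced.
        return self.include_modified and mtime_ms > last

    def mark_processed(self, path: str, mtime_ms: int) -> None:
        self._seen[path] = mtime_ms


# Filename-only behavior: an overwritten file is skipped.
current = SeenFiles()
current.mark_processed("/data/a.json", 1000)
skipped = not current.should_process("/data/a.json", 2000)

# Proposed behavior: the same overwrite is picked up again.
proposed = SeenFiles(include_modified=True)
proposed.mark_processed("/data/a.json", 1000)
reprocessed = proposed.should_process("/data/a.json", 2000)
```

Keeping the flag off by default in this sketch reflects the assumption that existing pipelines should keep the filename-only behavior unless they opt in.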