[ https://issues.apache.org/jira/browse/SPARK-18974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15785930#comment-15785930 ]
Shixiong Zhu commented on SPARK-18974:
--------------------------------------

Do you want to try Structured Streaming? Its FileStreamSource accepts files up to 7 days old by default.

> FileInputDStream could not detected files which moved to the directory
> -----------------------------------------------------------------------
>
>                 Key: SPARK-18974
>                 URL: https://issues.apache.org/jira/browse/SPARK-18974
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 1.6.3, 2.0.2
>            Reporter: Adam Wang
>
> FileInputDStream uses the modification time to find new files, but if a file is moved into
> the directory its modification time does not change, so
> FileInputDStream cannot detect such files.
> I think one way to fix this bug is to read the access_time and check against it, but that needs
> a Set that remembers all old files, which would be very inefficient for directories with
> a lot of files.
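
The behaviour the reporter describes can be reproduced without Spark. The following is a minimal sketch in plain Python, not Spark's actual implementation: `find_new_files`, `demo`, the directory names, and the one-minute window are all hypothetical and chosen only for illustration. It models a scanner that accepts files whose modification time falls inside a recent window, and shows that a file moved into the watched directory keeps its old modification time and is therefore skipped.

```python
import os
import shutil
import tempfile
import time


def find_new_files(directory, window_start):
    """Return names of files whose modification time is at or after window_start.

    Simplified stand-in for a mod-time based scanner; not Spark code.
    """
    return sorted(
        name
        for name in os.listdir(directory)
        if os.path.getmtime(os.path.join(directory, name)) >= window_start
    )


def demo():
    with tempfile.TemporaryDirectory() as root:
        staging = os.path.join(root, "staging")
        watched = os.path.join(root, "watched")
        os.mkdir(staging)
        os.mkdir(watched)

        # Write a file outside the watched directory and backdate its
        # modification time by one hour.
        staged = os.path.join(staging, "moved.txt")
        with open(staged, "w") as f:
            f.write("old data\n")
        one_hour_ago = time.time() - 3600
        os.utime(staged, (one_hour_ago, one_hour_ago))

        # Assumed scan window: only files modified in the last minute count.
        window_start = time.time() - 60

        # Moving the file does not update its modification time.
        shutil.move(staged, os.path.join(watched, "moved.txt"))

        # A file written directly into the watched directory gets a fresh
        # modification time.
        with open(os.path.join(watched, "created.txt"), "w") as f:
            f.write("new data\n")

        detected = find_new_files(watched, window_start)
        present = sorted(os.listdir(watched))
        return detected, present


if __name__ == "__main__":
    detected, present = demo()
    print("present: ", present)
    print("detected:", detected)
```

Both files are present in the watched directory, but only the one created in place passes the modification-time filter. Widening the accepted age window, as the comment above suggests for Structured Streaming's FileStreamSource, trades this blind spot for the need to remember which files were already processed.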