[ https://issues.apache.org/jira/browse/SPARK-18974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15781279#comment-15781279 ]
Shixiong Zhu commented on SPARK-18974:
--------------------------------------

Not sure about HDFS, but I just checked my local file system on Mac. Renaming a file doesn't change `access_time` there, either.

{code}
$ stat -x a.txt
  File: "a.txt"
  Size: 2 FileType: Regular File
  Mode: (0644/-rw-r--r--) ...
Device: 1,4 Inode: 437388576 Links: 1
Access: Tue Dec 27 13:10:13 2016
Modify: Tue Dec 27 13:10:13 2016
Change: Tue Dec 27 13:10:13 2016
$ sleep 3
$ mv a.txt b.txt
$ stat -x b.txt
  File: "b.txt"
  Size: 2 FileType: Regular File
  Mode: (0644/-rw-r--r--) ...
Device: 1,4 Inode: 437388576 Links: 1
Access: Tue Dec 27 13:10:13 2016
Modify: Tue Dec 27 13:10:13 2016
Change: Tue Dec 27 13:10:13 2016
{code}

> FileInputDStream could not detected files which moved to the directory
> -----------------------------------------------------------------------
>
>                 Key: SPARK-18974
>                 URL: https://issues.apache.org/jira/browse/SPARK-18974
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>   Affects Versions: 1.6.3, 2.0.2
>            Reporter: Adam Wang
>
> FileInputDStream uses the modification time to find new files, but if a file
> is moved into the directory, its modification time does not change, so
> FileInputDStream cannot detect these files.
> I think one way to fix this bug is to read access_time and check that instead,
> but this needs a Set of files to track all old files, which would be very
> inefficient for directories with many files.
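
The rename check from the comment above can also be reproduced without the macOS-specific `stat -x`. The following is only a minimal sketch, assuming a local file system and a rename within a single directory. The helper name `timestamps_across_rename` and the file names are made up for illustration, and it says nothing about how HDFS behaves.

```python
import os
import tempfile
import time


def timestamps_across_rename(delay_seconds=1.0):
    """Create a file, rename it in place, and return (stat_before, stat_after).

    Hypothetical helper for illustration; not part of Spark or Hadoop.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "a.txt")
        dst = os.path.join(tmp, "b.txt")

        with open(src, "w") as f:
            f.write("x\n")
        before = os.stat(src)

        # Wait so that any timestamp update caused by the rename would be visible.
        time.sleep(delay_seconds)
        os.rename(src, dst)
        after = os.stat(dst)

    return before, after


if __name__ == "__main__":
    before, after = timestamps_across_rename()
    print("same inode:      ", before.st_ino == after.st_ino)
    print("mtime unchanged: ", before.st_mtime_ns == after.st_mtime_ns)
    print("atime unchanged: ", before.st_atime_ns == after.st_atime_ns)
    # ctime behaviour on rename differs between file systems, so it is only reported.
    print("ctime unchanged: ", before.st_ctime_ns == after.st_ctime_ns)
```

If the modification time is reported as unchanged, a detector that compares only modification times against a time window would skip the moved file, which is the behaviour described in the issue.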