[ https://issues.apache.org/jira/browse/APEXMALHAR-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308763#comment-15308763 ]

ASF GitHub Bot commented on APEXMALHAR-2103:
--------------------------------------------

Github user DT-Priyanka commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-malhar/pull/300#discussion_r65274187

    --- Diff: library/src/main/java/com/datatorrent/lib/io/fs/FileSplitterInput.java ---
    @@ -375,11 +374,18 @@ public void run()
               lastScannedInfo = null;
               numDiscoveredPerIteration = 0;
               for (String afile : files) {
    -            String filePath = new File(afile).getAbsolutePath();
    -            LOG.debug("Scan started for input {}", filePath);
    -            Map<String, Long> lastModifiedTimesForInputDir;
    -            lastModifiedTimesForInputDir = referenceTimes.get(filePath);
    -            scan(new Path(afile), null, lastModifiedTimesForInputDir);
    +            Path filePath = new Path(afile);
    +            LOG.debug("Scan started for input {}", filePath.toString());
    +            Map<String, Long> lastModifiedTimesForInputDir = null;
    +            if (fs.exists(filePath)) {
    +              FileStatus fileStatus = fs.getFileStatus(filePath);
    +              if (fileStatus.isDirectory()) {
    +                lastModifiedTimesForInputDir = referenceTimes.get(fileStatus.getPath().toString());
    +              } else {
    +                lastModifiedTimesForInputDir = referenceTimes.get(fileStatus.getPath().getParent().toString());
    --- End diff --

    This is not right. If the user has given the inputs /home/myDir and
    /home/myDir/file1.txt, the scan of the second input, i.e.
    /home/myDir/file1.txt, will overwrite the reference times for the
    input /home/myDir.

> scanner issues in FileSplitterInput class
> -----------------------------------------
>
>                 Key: APEXMALHAR-2103
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2103
>             Project: Apache Apex Malhar
>          Issue Type: Bug
>            Reporter: Chaitanya
>            Assignee: Chaitanya
>
> Issue: FileSplitter continuously emits file metadata even though there is
> only a single file.
> Observation: For the same file, the keys used when updating and accessing
> the referenceTimes map in FileSplitterInput and TimeBasedScanner are
> different. Because of this, the oldestTimeModification is always null in
> TimeBasedScanner.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
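
The key collision described in the review comment can be reproduced in isolation. The following is only a sketch under stated assumptions, not the Malhar code: it uses java.nio.file.Path instead of Hadoop's Path, the class ReferenceTimesKeyDemo and its methods are hypothetical, and it assumes that each scan stores one map of modification times per key. inputBasedKey is shown purely for contrast and is not a fix proposed in the thread.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; none of these names exist in Apex Malhar.
public class ReferenceTimesKeyDemo {

  // Key derivation as in the diff under review:
  // a directory keys on itself, a file keys on its parent directory.
  public static String parentBasedKey(Path input, boolean isDirectory) {
    return isDirectory ? input.toString() : input.getParent().toString();
  }

  // Contrast case (an assumption, not from the thread): key on the input path as given.
  public static String inputBasedKey(Path input) {
    return input.toString();
  }

  // Simulates scanning the directory input and then the file input,
  // storing one map of modification times per key (assumed behaviour).
  public static Map<String, Map<String, Long>> scanBoth(boolean useParentKey) {
    Path dir = Paths.get("/home/myDir");
    Path file = Paths.get("/home/myDir/file1.txt");

    // Made-up modification times for two files under the directory input.
    Map<String, Long> dirTimes = new HashMap<>();
    dirTimes.put("file1.txt", 100L);
    dirTimes.put("file2.txt", 200L);

    // The file input only knows about itself.
    Map<String, Long> fileTimes = new HashMap<>();
    fileTimes.put("file1.txt", 100L);

    Map<String, Map<String, Long>> referenceTimes = new HashMap<>();
    referenceTimes.put(useParentKey ? parentBasedKey(dir, true) : inputBasedKey(dir), dirTimes);
    // With parent-based keys this put reuses the directory's key and replaces its entry.
    referenceTimes.put(useParentKey ? parentBasedKey(file, false) : inputBasedKey(file), fileTimes);
    return referenceTimes;
  }

  public static void main(String[] args) {
    String dirKey = Paths.get("/home/myDir").toString();
    System.out.println("parent-based, files under dir key: " + scanBoth(true).get(dirKey).keySet());
    System.out.println("input-based, files under dir key:  " + scanBoth(false).get(dirKey).keySet());
  }
}
```

With parent-based keys both inputs resolve to the same key, so the entry recorded for /home/myDir no longer contains file2.txt after the second scan. With one key per configured input the two entries stay separate.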