> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 161
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line161>
> >
> >     nit: spurious parenthesis before lastSeenParentDirMTime
> 
> Attila Simon wrote:
>     The condition was described in the javadoc; unfortunately it is ugly but
>     needed.

How about this?

  List<File> getMatchingFiles() {
    long now = System.currentTimeMillis();
    long currentParentDirMTime = parentDir.lastModified();
    // Only check a maximum of once per second.
    if (!cachePatternMatching ||
        (currentParentDirMTime > lastSeenParentDirMTime &&
         TimeUnit.SECONDS.toMillis(TimeUnit.MILLISECONDS.toSeconds(now))
             > lastCheckedTime)) {
      lastMatchedFiles = getMatchingFilesNoCache();
      Collections.sort(lastMatchedFiles,
          new TailFile.CompareByLastModifiedTime());
      lastSeenParentDirMTime = currentParentDirMTime;
      lastCheckedTime =
          TimeUnit.SECONDS.toMillis(TimeUnit.MILLISECONDS.toSeconds(now));
    }
    return lastMatchedFiles;
  }

Except that we should replace the sorting with a helper function that only runs 
stat() once per item.
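
A minimal sketch of what such a helper could look like, assuming nothing beyond 
the JDK. The class and method names (SortByMTimeOnce, sortByLastModified) are 
hypothetical and not part of the patch. The idea is to read lastModified() once 
per file into a map and have the comparator consult only that map, so the sort 
never goes back to the filesystem:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SortByMTimeOnce {

  /**
   * Hypothetical helper: returns a copy of the given list sorted by
   * modification time (oldest first), calling lastModified() exactly
   * once per distinct file instead of once per comparison.
   */
  static List<File> sortByLastModified(List<File> files) {
    // Snapshot every mtime up front: one stat() per file.
    final Map<File, Long> mtimes = new HashMap<File, Long>();
    for (File f : files) {
      mtimes.put(f, f.lastModified());
    }
    List<File> sorted = new ArrayList<File>(files);
    // The comparator only reads the snapshot, never the filesystem.
    Collections.sort(sorted, new Comparator<File>() {
      @Override
      public int compare(File a, File b) {
        return Long.compare(mtimes.get(a), mtimes.get(b));
      }
    });
    return sorted;
  }
}
```

Besides saving stat() calls, sorting against a snapshot keeps the comparator 
consistent even if a file is modified while the sort is running.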


- Mike


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48161/#review136086
-----------------------------------------------------------


On June 13, 2016, 2:14 p.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48161/
> -----------------------------------------------------------
> 
> (Updated June 13, 2016, 2:14 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2918
>     https://issues.apache.org/jira/browse/FLUME-2918
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> The way the TailDir source checks which files should be tracked was improved. 
> The existing implementation caused unnecessarily high CPU usage for huge 
> (50K+ files) directories. This fix lets users eliminate the continuous listing 
> of the parent directory (on each Source.process invocation) and introduces a 
> more performant method for listing and matching files.
> 
> used java.nio.file.DirectoryStream to filter files
> made pattern match calculation optionally cached
> added junit tests
> added javadoc
> added license
> 
> 
> Diffs
> -----
> 
>   flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/ReliableTaildirEventReader.java 5b6d465 
>   flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java PRE-CREATION 
>   flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirSource.java 8816327 
>   flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirSourceConfigurationConstants.java 6165276 
>   flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java PRE-CREATION 
>   flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirSource.java f9e614c 
> 
> Diff: https://reviews.apache.org/r/48161/diff/
> 
> 
> Testing
> -------
> 
> mvn clean install -DskipTests -> built
> junit tests for flume-taildir-source module -> passed
> 
> 
> Thanks,
> 
> Attila Simon
> 
>
