> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 113
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line113>
> >
> >     Why not set these default values set in the class definition i.e. 
> > private long lastSeenParentDirMTime = -1 rather than specifically 
> > initializing them in the constructor? It's less code and more obvious what 
> > is happening.

This value is not relevant to the instance since it will be replaced instantly 
by the first iteration, but I'm fine with changing it.


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 150
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line150>
> >
> >     Why are we providing a "sorted by last (cached) modification time" as a 
> > guarantee here? Are we using that guarantee somehow? See below on the cost 
> > of that guarantee.

This guarantee was part of the original implementation (my change didn't make 
it worse). I wanted to be non-intrusive.


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 156
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line156>
> >
> >     Can this method ever return null? If so, let's make sure that doesn't 
> > happen. It should return Collections.emptyList() if it can't find anything.

It is guaranteed that it doesn't return null (it is in tha javadoc as well). 
Prerequisites of this guarantee is that lastMachedFiles starts with en empty 
list and getMatchingFilesNoCache won't change its guarantee that it returns not 
null. I think because both are private the additional explicit check for 
if(result == null) would be unneccessary.


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 79
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line79>
> >
> >     Add comments here for the variables that need explanation. For example, 
> > what are the units on these timestamps? Milliseconds?

comment added to member declaration


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 86
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line86>
> >
> >     nit: semi valid? Sounds like inventing terminology. If so, avoid. Also, 
> > s/consist/consisting/

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 103
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line103>
> >
> >     why final?

final removed


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 124
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line124>
> >
> >     grammar nit: s/which are matching regex pattern passed as object 
> > instantiation/that match the regex pattern passed in during object 
> > instantiation/

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 125
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line125>
> >
> >     grammar nit: s/frequently/frequent/

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 127
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line127>
> >
> >     nit: s/instruct/trigger/ here and below

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 161
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line161>
> >
> >     nit: spurious parenthesis before lastSeenParentDirMTime

the condition was described in the javadoc, unfortunately it is ugly but needed


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 164
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line164>
> >
> >     This sorting function seems problematic to me. It can call stat() up to 
> > n^2 times (assuming quicksort). Shouldn't we get the last modified time of 
> > each file in the list once and then sort it? Why are we sorting, anyway?
> >     
> >     By the way, I just checked the OpenJDK source code and indeed every 
> > File.lastModified() maps to FileSystem.getLastModifiedTime(f) which maps to 
> > stat(2). See 
> > https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/solaris/native/java/io/UnixFileSystem_md.c#L205

Sorting was part of the original implementation (my change didn't make it 
worse). I wanted to be non-intrusive.


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 165
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line165>
> >
> >     typo: s/lastMachedFiles/lastMatchedFiles/

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 180
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line180>
> >
> >     nit: weakly consistent? Sounds like invented terminology. Let's just 
> > say not consistent and then describe the guarantees we're providing.

Weakly consistent was defined here I only referenced it:
https://docs.oracle.com/javase/7/docs/api/java/nio/file/DirectoryStream.html
Although DirectoryStream is referenced from javadoc of 
getMatchingFilesNoCache(), I added another link to each occasion to make it 
clear.


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java,
> >  line 205
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line205>
> >
> >     Hmm. What is the purpose of this?

My idea was to add another configuration option to use the original logic. This 
didn't happen eventually so removing this fn


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 38
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line38>
> >
> >     style nit: just use one empty line here

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 47
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line47>
> >
> >     nit: add a line comment summarizing this helper function's job (just a 
> > short sentence is ok, like "Append a line to the specified file.")

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 49
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line49>
> >
> >     style nit: leave spaces around your brackets. We use the java style 
> > guide in Flume (admittedly not as consistently as I wish we did, though)

Related to coding style nits: is there any importable flume related style 
configuration for intellij somewhere I haven't found already? If yes could you 
please point me to its direction?


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 55
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line55>
> >
> >     style nit: space after the comma

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 59
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line59>
> >
> >     nit: rename to filesToNames() or something similar

done filesToNames


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 62
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line62>
> >
> >     Just curious... how is it possible to get a null here and why can't you 
> > prevent it? Seems unnecessary to allow returning a list with null values in 
> > it.

good catch, corrected. (@Nullable was part of the code generated by intellij 
and I just added handling nulls automatically)


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 63
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line63>
> >
> >     style nit: space after if and also space after ==

removed null check


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 64
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line64>
> >
> >     style nit: use curly braces for the whole thing if you've got an else 
> > clause

removed if/else as no need for null check


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 74
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line74>
> >
> >     nit: Why + ""?

changed to .toString()


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 100
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line100>
> >
> >     nit: s/a file were/a file was/
> >     
> >     here and on the next line

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 105
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line105>
> >
> >     s/a files/a file/

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 106
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line106>
> >
> >     nit: avoid copy / paste. Use a final variable to hold this repeated 
> > assertion message, here and below.

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 187
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line187>
> >
> >     nit: s/doesntexists/doesntexist/

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirSource.java,
> >  line 80
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404561#file1404561line80>
> >
> >     nit: spell out "To"

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 214
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line214>
> >
> >     nit: remove extra lines (1 blank line should be enough for readability)

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 198
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line198>
> >
> >     nit: space after the comma

done


> On June 7, 2016, 1:44 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java,
> >  line 194
> > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line194>
> >
> >     style nit: spaces around the +

done


- Attila


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48161/#review136086
-----------------------------------------------------------


On June 2, 2016, 5:08 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48161/
> -----------------------------------------------------------
> 
> (Updated June 2, 2016, 5:08 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2918
>     https://issues.apache.org/jira/browse/FLUME-2918
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> The way TailDir source checks which files should be tracked was improved. 
> Existing implementation caused unneccessary high CPU usage for huge (+50K 
> files) directories. This fix allows users to eliminate continous listing of 
> parent directory (on each Source.process invocation) and introduce a more 
> performant method for listing&matching files.
> 
> used java.nio.file.DirectoryStream to filter files
> made pattern match calculation optionally cached
> added junit tests
> added javadoc
> added license
> 
> 
> Diffs
> -----
> 
>   
> flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/ReliableTaildirEventReader.java
>  5b6d465 
>   
> flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java
>  PRE-CREATION 
>   
> flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirSource.java
>  8816327 
>   
> flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirSourceConfigurationConstants.java
>  6165276 
>   
> flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java
>  PRE-CREATION 
>   
> flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirSource.java
>  f9e614c 
> 
> Diff: https://reviews.apache.org/r/48161/diff/
> 
> 
> Testing
> -------
> 
> mvn clean install -DskipTests -> built
> junit tests for flume-taildir-source module -> passed
> 
> 
> Thanks,
> 
> Attila Simon
> 
>

Reply via email to