Re: Review Request 60568: Log Feeder input config attribute "tail" should be clearer

Robert Nettleton Wed, 05 Jul 2017 09:03:31 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60568/#review179640
-----------------------------------------------------------



Ship it!




Ship It!

- Robert Nettleton


On June 30, 2017, 3:13 p.m., Miklos Gergely wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60568/
> -----------------------------------------------------------
> 
> (Updated June 30, 2017, 3:13 p.m.)
> 
> 
> Review request for Ambari, Oliver Szabo and Robert Nettleton.
> 
> 
> Bugs: AMBARI-21387
>     https://issues.apache.org/jira/browse/AMBARI-21387
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> tail = true should mean that only the latest log file is loaded
> tail = false should mean that the previous rolled over log files are loaded 
> too, but not followed
> 
> AbstractInputFile must be prepared for previous rolled over files to check in 
> even when the next part is actively handled, as the checking in is done when 
> the data was moved to solr, so all file related fields were moved to maps, so 
> the checkin can identify which file's status is checked in.
> 
> Log files are sorted alphabetically after identified, thus ensureing that the 
> first one is the active one. In practice they were sorted already, but the 
> File.listFiles function doesn't ensures it according to it's documentation, 
> so it's better to do so explicitly.
> 
> The thread that is responsible for deleting unused checkpoint files should do 
> one more check before deleting a checkpoint file because the file it belongs 
> to is no more there, or it's file key doesn't matches the one in the 
> checkpoint file: first it should look up if there is any other file in the 
> same folder which have the same file key, in which case it is assumed that 
> the file was renamed since. As the code that's looking for a checkpoint file 
> to determine the next line to load does that by the name of the checkpoint 
> file, which is generated from the file key it will successfuly identify it 
> irrelevant of the name of the file.
> 
> I know, it's a bit complicated :) But I couldn't put it into words less 
> complicated.
> 
> 
> Also fixed Log Search Config ZK bug not to handle not node data change 
> related events.
> 
> 
> Diffs
> -----
> 
>   
> ambari-logsearch/ambari-logsearch-config-zookeeper/src/main/java/org/apache/ambari/logsearch/config/zookeeper/LogSearchConfigZK.java
>  26375e1 
>   
> ambari-logsearch/ambari-logsearch-logfeeder/src/main/java/org/apache/ambari/logfeeder/input/AbstractInputFile.java
>  ab50eb7 
>   
> ambari-logsearch/ambari-logsearch-logfeeder/src/main/java/org/apache/ambari/logfeeder/input/Input.java
>  c36f96b 
>   
> ambari-logsearch/ambari-logsearch-logfeeder/src/main/java/org/apache/ambari/logfeeder/input/InputFile.java
>  fc40ca4 
>   
> ambari-logsearch/ambari-logsearch-logfeeder/src/main/java/org/apache/ambari/logfeeder/input/InputManager.java
>  19894ae 
>   
> ambari-logsearch/ambari-logsearch-logfeeder/src/main/java/org/apache/ambari/logfeeder/input/InputS3File.java
>  2b19503 
>   
> ambari-logsearch/ambari-logsearch-logfeeder/src/test/java/org/apache/ambari/logfeeder/input/InputManagerTest.java
>  e9bbe7e 
> 
> 
> Diff: https://reviews.apache.org/r/60568/diff/1/
> 
> 
> Testing
> -------
> 
> Tested on local vagrant cluster, by loading previous files again and again, 
> and by renaming rolled over files. In both cases no duplicate data were 
> loaded, checkpoints were handled correctly.
> 
> 
> Thanks,
> 
> Miklos Gergely
> 
>

Re: Review Request 60568: Log Feeder input config attribute "tail" should be clearer

Reply via email to