[ 
https://issues.apache.org/jira/browse/FLUME-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791115#comment-13791115
 ] 

Phil Scala commented on FLUME-2066:
-----------------------------------

@Hari - Shutting down could be an option, but I'd rather not get a call at 
3AM.  I like Roshan's idea of a dead-letter queue.  This is very much like a 
queueing system where you do not poison the queue, but move the offending 
message to a safe place and move on.  Renaming the file to 
ERROR_lastLine#processed.COMPLETED is an easy solution.

I linked this to FLUME-2119, which touches the same area of code. I have a 
patch for that, but it does not implement any of this renaming discussion.
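
A minimal sketch of the rename idea, for illustration only. It is not taken 
from either patch. The class and method names (DeadLetterSketch, 
quarantineName, quarantine) are hypothetical, and the exact name layout 
(ERROR_<last line processed>_<original name>.COMPLETED) is one assumed 
reading of the naming suggested above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Flume.
final class DeadLetterSketch {

    // Assumed layout: ERROR_<last line processed>_<original name>.COMPLETED
    static String quarantineName(String originalName, long lastLineProcessed) {
        return "ERROR_" + lastLineProcessed + "_" + originalName + ".COMPLETED";
    }

    // Moves the offending file aside within the same directory.
    // Files.move without REPLACE_EXISTING throws FileAlreadyExistsException
    // if the target name is already taken, so nothing is overwritten.
    static Path quarantine(Path file, long lastLineProcessed) throws IOException {
        String name = quarantineName(file.getFileName().toString(), lastLineProcessed);
        return Files.move(file, file.resolveSibling(name));
    }
}
```

Keeping the last processed line number in the file name would let an operator 
see how far the source got before the failure without digging through logs.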

> Spool directory source can get stuck in a "Serializer has been closed" loop 
> when retireCurrentFile throws an exception
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLUME-2066
>                 URL: https://issues.apache.org/jira/browse/FLUME-2066
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.4.0, v1.3.1
>            Reporter: Phil Scala
>            Assignee: Phil Scala
>
> The following two Java files have similar code and are affected by this 
> issue: 
> 1.3.1: SpoolingFileLineReader.java 
> 1.4: ReliableSpoolingFileEventReader.java 
> retireCurrentFile has a single caller (readLines in 1.3.1 and readEvents in 
> 1.4): 
> {code:java} 
> retireCurrentFile(); 
> currentFile = getNextFile(); 
> if (!currentFile.isPresent()) { 
>   return Collections.emptyList(); 
> } 
> {code} 
> If retireCurrentFile throws an exception after closing the reader (there are 
> a few causes for an exception to be raised, which are described below), then 
> currentFile still points to the file it attempted to retire. This causes 
> subsequent calls to readLines/readEvents to raise a "Serializer has been 
> closed" exception. At this point the application needs to be shut down in 
> order to rectify the problem. If Flume is left running for a while, the logs 
> are littered with the error, so you have to go back to the first error logged 
> to understand what happened. 
> *Exceptions raised in "retireCurrentFile()"* 
> IllegalStateException when the file modified date changes 
> IllegalStateException when the size changes 
> IllegalStateException when renaming the current file and the target file 
> already exists (with different sizes) 
> IllegalStateException when renaming the current file and the target file 
> already exists [non-Windows] 
> FlumeException when renameTo does not return true. 
> The documentation does say: 
> *Warning This channel expects that only immutable, uniquely named files are 
> dropped in the spooling directory. If duplicate names are used, or files are 
> modified while being read, the source will fail with an error message *
> I am not sure, however, whether the intention was to get caught in the 
> "Serializer has been closed" loop. Three possible solutions: 
> 1. Re-spool the retired file. This will cause duplicates and could get caught 
> in a loop of constantly spooling this file. 
> 2. Log an error and continue spooling the next files. 
> 3. Shut down. 
> I like option 2.
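
Option 2 from the description above can be sketched roughly as follows. This 
is a self-contained simulation, not the Flume code or a proposed patch: 
SkipOnRetireFailureSketch and SpoolFile are hypothetical stand-ins, each 
readLines call returns a whole file instead of a batch, java.util.Optional 
stands in for the Optional used in the quoted snippet, and the errors list 
stands in for the logger.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Optional;

// Hypothetical simulation of "log an error and continue"; not Flume code.
final class SkipOnRetireFailureSketch {

    // Stand-in for a spooled file; failOnRetire simulates a failed rename
    // or a file that was modified while being read.
    static final class SpoolFile {
        final String name;
        final List<String> lines;
        final boolean failOnRetire;

        SpoolFile(String name, List<String> lines, boolean failOnRetire) {
            this.name = name;
            this.lines = lines;
            this.failOnRetire = failOnRetire;
        }
    }

    private final Deque<SpoolFile> pending;
    private Optional<SpoolFile> currentFile = Optional.empty();
    final List<String> errors = new ArrayList<>(); // stand-in for a logger

    SkipOnRetireFailureSketch(List<SpoolFile> files) {
        this.pending = new ArrayDeque<>(files);
    }

    private void retireCurrentFile() {
        SpoolFile f = currentFile.get();
        if (f.failOnRetire) {
            throw new IllegalStateException("simulated retire failure: " + f.name);
        }
    }

    private Optional<SpoolFile> getNextFile() {
        return Optional.ofNullable(pending.poll());
    }

    List<String> readLines() {
        if (currentFile.isPresent()) {
            try {
                retireCurrentFile();
            } catch (RuntimeException e) {
                // Record the failure once and fall through, so the reader
                // never keeps pointing at the file it failed to retire.
                errors.add(e.getMessage());
            }
        }
        currentFile = getNextFile();
        if (!currentFile.isPresent()) {
            return Collections.emptyList();
        }
        return currentFile.get().lines;
    }
}
```

The point of the sketch is only that currentFile is always advanced after a 
retire attempt, so a single failure produces one logged error rather than a 
repeating "Serializer has been closed" loop.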



