[ https://issues.apache.org/jira/browse/FLUME-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997131#comment-13997131 ]

Hari Shreedharan commented on FLUME-2245:
-----------------------------------------

This patch does not really need the changes in the HDFSDataStream and 
HDFSCompressedStream classes. We should just catch the exception thrown by 
flush() and still attempt the close. If the close fails, it will get 
rescheduled anyway.
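
For illustration only, a minimal sketch of that flow (this is not the actual
BucketWriter/HDFSWriter code; a plain OutputStream stands in for the HDFS
stream wrapper, and the reschedule-on-failed-close is assumed to live in the
caller's existing retry logic):

    import java.io.IOException;
    import java.io.OutputStream;

    public class FlushThenCloseSketch {

        // Stand-in for the underlying HDFS stream wrapper.
        private final OutputStream writer;

        public FlushThenCloseSketch(OutputStream writer) {
            this.writer = writer;
        }

        public void flushThenClose() {
            try {
                writer.flush();
            } catch (IOException e) {
                // Swallow the flush failure here so that close() is still
                // attempted (in Flume the error would be logged).
            }
            try {
                writer.close();
            } catch (IOException e) {
                // Nothing special on a failed close: the close is expected
                // to be rescheduled by the caller's retry mechanism.
            }
        }
    }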

> HDFS files with errors unable to close
> --------------------------------------
>
>                 Key: FLUME-2245
>                 URL: https://issues.apache.org/jira/browse/FLUME-2245
>             Project: Flume
>          Issue Type: Bug
>            Reporter: Juhani Connolly
>         Attachments: flume.log.1133, flume.log.file
>
>
> This is running on a snapshot of Flume-1.5 with the git hash 
> 99db32ccd163daf9d7685f0e8485941701e1133d.
> When a datanode goes unresponsive for a significant amount of time (for 
> example, during a long GC pause), an append failure occurs, followed by 
> repeated timeouts appearing in the log and a failure to close the stream. 
> The relevant section of the logs is attached (starting where the errors 
> first appear).
> The same log repeats periodically, consistently running into a 
> TimeoutException.
> Restarting Flume (or presumably just the HDFSSink) resolves the issue.
> The probable cause is discussed in the comments.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
