[ https://issues.apache.org/jira/browse/FLUME-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997254#comment-13997254 ]
Brock Noland commented on FLUME-2245:
-------------------------------------

Hi,

Yes, I was able to test the BucketWriter change (and only that) and I found it fixed this issue.

Note: I used kill -STOP on the DN to reproduce.

> HDFS files with errors unable to close
> --------------------------------------
>
>                 Key: FLUME-2245
>                 URL: https://issues.apache.org/jira/browse/FLUME-2245
>             Project: Flume
>          Issue Type: Bug
>            Reporter: Juhani Connolly
>         Attachments: flume.log.1133, flume.log.file
>
>
> This is running on a snapshot of Flume 1.5 with the git hash 99db32ccd163daf9d7685f0e8485941701e1133d.
> When a datanode goes unresponsive for a significant amount of time (for example, a big GC), an append failure will occur, followed by repeated timeouts appearing in the log and a failure to close the stream. The relevant section of the logs is attached (where it first starts appearing).
> The same log repeats periodically, consistently running into a TimeoutException.
> Restarting Flume (or presumably just the HDFSSink) solves the issue.
> Probable cause in comments.
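For context on the shape of "the BucketWriter change" being tested here: the general pattern is a close() that is bounded by a timeout and retried, rather than a single blocking close() that can hang indefinitely while a DataNode is paused (e.g. by kill -STOP or a long GC). The sketch below is illustrative only, not the actual patch; the class name, timeout, and retry count are made up for the example (Flume's HDFS sink exposes a similar hdfs.callTimeout setting, but these exact names do not come from the codebase).

{code:java}
import java.io.Closeable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/**
 * Illustrative sketch only, not the FLUME-2245 patch: a close() bounded by a
 * timeout and retried, so an unresponsive DataNode cannot hang the sink forever.
 */
public class BoundedClose {

  // Hypothetical values, made up for the example.
  private static final long CLOSE_TIMEOUT_MS = 10_000;
  private static final int MAX_CLOSE_RETRIES = 3;

  private final ExecutorService executor = Executors.newSingleThreadExecutor();

  /** Tries to close the stream, bounding each attempt by a timeout. */
  public boolean closeWithRetries(Closeable stream) {
    for (int attempt = 1; attempt <= MAX_CLOSE_RETRIES; attempt++) {
      Future<Void> pending = executor.submit(() -> {
        stream.close(); // the call that hangs while the DN is unresponsive
        return null;
      });
      try {
        pending.get(CLOSE_TIMEOUT_MS, TimeUnit.MILLISECONDS);
        return true; // close completed within the timeout
      } catch (TimeoutException e) {
        pending.cancel(true); // interrupt the stuck close and retry
      } catch (ExecutionException e) {
        // close() threw (e.g. an IOException from a dead pipeline); retry
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false; // give up for now; the caller can schedule another attempt
  }
}
{code}

Presumably the real sink also reschedules a failed close rather than abandoning it, so the file is eventually closed once the DataNode resumes.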