Thanks, Hari, for your help with this. Appreciate it.

We will work towards upgrading to CDH 4.2.1 soon, and hopefully this issue will 
be resolved.

~Rahul.


________________________________
 From: Hari Shreedharan <hshreedha...@cloudera.com>
To: "user@flume.apache.org" <user@flume.apache.org> 
Sent: Monday, May 13, 2013 7:58 PM
Subject: Re: IOException with HDFS-Sink:flushOrSync
 


The patch also made it to Hadoop 2.0.3.

On Monday, May 13, 2013, Hari Shreedharan wrote:

>Looks like CDH 4.2.1 does have that patch: 
>http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.2.1.CHANGES.txt 
>(but it was not in CDH 4.1.2)
>
>
>
>
>Hari
>
>
>-- 
>Hari Shreedharan
>
>
>On Monday, May 13, 2013 at 7:23 PM, Rahul Ravindran wrote:
>>We are using CDH 4.1.2 - Hadoop version 2.0.0. Looks like CDH 4.2.1 also uses 
>>the same Hadoop version. Any suggestions on any mitigations?
>>
>>Sent from my phone. Excuse the terseness.
>>
>>On May 13, 2013, at 7:12 PM, Hari Shreedharan <hshreedha...@cloudera.com> 
>>wrote:
>>
>>
>>>What version of Hadoop are you using? Looks like you are getting hit by 
>>>https://issues.apache.org/jira/browse/HADOOP-6762.
>>>
>>>
>>>
>>>
>>>Hari
>>>
>>>
>>>-- 
>>>Hari Shreedharan
>>>
>>>
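
For context: the ClosedByInterruptException in the logs further down is standard 
java.nio behavior. Interrupting a thread that is blocked on an interruptible 
channel does not just abort that one call; it closes the channel itself, so every 
other thread sharing it starts failing too. HADOOP-6762 fixed the fallout of this 
for the IPC client's shared connection. Below is a minimal, self-contained Java 
sketch of the underlying JDK behavior; the class name and scratch-file path are 
made up, and this is not Flume or Hadoop code.

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.ClosedByInterruptException;
    import java.nio.channels.FileChannel;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class InterruptClosesChannel {
        public static void main(String[] args) throws Exception {
            // FileChannel is an InterruptibleChannel: interrupting a thread
            // blocked in an I/O operation on it closes the whole channel.
            final FileChannel channel = FileChannel.open(
                    Paths.get("/tmp/interrupt-demo.dat"),  // hypothetical path
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);

            Thread writer = new Thread(new Runnable() {
                public void run() {
                    ByteBuffer buf = ByteBuffer.allocate(1024);
                    try {
                        while (true) {
                            buf.clear();
                            channel.write(buf);  // throws once interrupted
                        }
                    } catch (ClosedByInterruptException e) {
                        System.out.println("writer: interrupted during I/O");
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            });
            writer.start();
            Thread.sleep(100);
            writer.interrupt();  // closes the channel for every user of it
            writer.join();

            // Prints "false": the channel is gone for all threads -- the same
            // effect HADOOP-6762 addressed for the shared RPC connection.
            System.out.println("channel still open? " + channel.isOpen());
        }
    }
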
>>>On Monday, May 13, 2013 at 6:50 PM, Matt Wise wrote:
>>>>So we've just had this happen twice on two different Flume machines... we're 
>>>>using the HDFS sink as well, but ours is writing to an S3N:// URL. Both 
>>>>times our sink stopped working and the filechannel clogged up immediately, 
>>>>causing serious problems. A restart of Flume worked -- but the filechannel 
>>>>was so backed up at that point that it took a good long while to get Flume 
>>>>started up again properly.
>>>>
>>>>
>>>>Anyone else seeing this behavior?
>>>>
>>>>
>>>>(oh, and we're running flume 1.3.0)
>>>>
>>>>On May 7, 2013, at 8:42 AM, Rahul Ravindran <rahu...@yahoo.com> wrote:
>>>>
>>>>>Hi,
>>>>>   We have noticed this a few times now: we appear to get an 
>>>>>IOException from HDFS, and this stops draining the channel until the Flume 
>>>>>process is restarted. Below are the logs; namenode-v01-00b is the active 
>>>>>namenode (namenode-v01-00a is standby). We are using Quorum Journal 
>>>>>Manager for our Namenode HA, but no Namenode failover was initiated. 
>>>>>If this is an expected error, should Flume handle it and gracefully 
>>>>>retry (thereby not requiring a restart)?
>>>>>Thanks,
>>>>>~Rahul.
>>>>>
>>>>>
>>>>>07 May 2013 06:35:02,494 WARN  [hdfs-hdfs-sink4-call-runner-2] 
>>>>>(org.apache.flume.sink.hdfs.BucketWriter.append:378)  - Caught IOException 
>>>>>writing to HDFSWriter (IOException flush:java.io.IOException: Failed on 
>>>>>local exception: java.nio.channels.ClosedByInterruptException; Host 
>>>>>Details : local host is: "flumefs-v01-10a.a.com/10.40.85.170"; destination 
>>>>>host is: "namenode-v01-00a.a.com":8020; ). Closing file 
>>>>>(hdfs://nameservice1/user/br/data_platform/eventstream/event/flumefs-v01-10a-4//event.1367891734983.tmp)
>>>>> and rethrowing exception.
>>>>>07 May 2013 06:35:02,494 WARN  [hdfs-hdfs-sink4-call-runner-2] 
>>>>>(org.apache.flume.sink.hdfs.BucketWriter.append:384)  - Caught IOException 
>>>>>while closing file 
>>>>>(hdfs://nameservice1/user/br/data_platform/eventstream/event/flumefs-v01-10a-4//event.1367891734983.tmp).
>>>>> Exception follows.
>>>>>java.io.IOException: IOException flush:java.io.IOException: Failed on 
>>>>>local exception: java.nio.channels.ClosedByInterruptException; Host 
>>>>>Details : local host is: "flumefs-v01-10a.a.com/10.40.85.170"; destination 
>>>>>host is: "namenode-v01-00a.a.com":8020;
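
On Rahul's question about graceful retry: below is a rough sketch of the "close 
the bad .tmp file, open a fresh one, retry the write" pattern he is asking about. 
This is NOT Flume's actual BucketWriter logic; the class name, file naming, and 
retry bound are all invented for illustration, and only the Hadoop FileSystem 
calls (create, write, hflush, close) are real API.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Hypothetical sketch only -- not Flume's BucketWriter.
    public class RetryingHdfsWriter {
        private static final int MAX_ATTEMPTS = 3;  // made-up bound

        private final FileSystem fs;
        private final Path dir;
        private FSDataOutputStream out;

        public RetryingHdfsWriter(Configuration conf, Path dir) throws IOException {
            this.fs = FileSystem.get(dir.toUri(), conf);
            this.dir = dir;
            this.out = openNewFile();
        }

        private FSDataOutputStream openNewFile() throws IOException {
            // One .tmp file per (re)open, loosely mirroring the sink's naming.
            return fs.create(new Path(dir, "event." + System.currentTimeMillis() + ".tmp"));
        }

        /** Write and flush one record, reopening a fresh file on IOException. */
        public synchronized void append(byte[] record) throws IOException {
            IOException last = null;
            for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
                try {
                    out.write(record);
                    out.hflush();  // Syncable flush, Hadoop 2.x API
                    return;
                } catch (IOException e) {
                    last = e;  // close best-effort, then reopen and retry
                    try { out.close(); } catch (IOException ignored) { }
                    out = openNewFile();
                }
            }
            throw last;  // give up; the caller can roll back its transaction
        }
    }

Note the trade-off: a record that was written but not yet flushed may land in both 
the old file and the new one, so a retry like this can duplicate events, which is 
consistent with Flume's at-least-once delivery. Whether any retry helps also 
depends on the failure; in the HADOOP-6762 case the real fix is the upgrade 
discussed at the top of the thread.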
