In summary, although the flume-agent JVM doesn't exit, once an HDFS IOException occurs because a .tmp file was deleted out from under it, the agent never recovers: the HDFS sink stops writing any further output for events coming in from the syslog source.
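To make concrete what I mean by "recover", here is a rough Java sketch of the behavior I would expect, written purely as an illustration (this is not Flume's actual BucketWriter code, and the Writer/WriterFactory types are hypothetical stand-ins): when an append or sync fails with an IOException because the underlying file is gone, the sink should abandon the dead handle and roll to a fresh file rather than staying wedged.

    import java.io.IOException;

    // Illustrative sketch only -- not Flume's actual HDFS sink code. Writer and
    // WriterFactory are hypothetical stand-ins for "an open bucket file" and
    // "something that can open a new one".
    public class RollOnErrorWriter {

      public interface Writer {
        void write(byte[] record) throws IOException;
        void sync() throws IOException;
        void close() throws IOException;
      }

      public interface WriterFactory {
        Writer open() throws IOException;
      }

      private final WriterFactory factory;
      private Writer current;

      public RollOnErrorWriter(WriterFactory factory) throws IOException {
        this.factory = factory;
        this.current = factory.open();
      }

      public void append(byte[] record) throws IOException {
        try {
          current.write(record);
          current.sync();
        } catch (IOException e) {
          // The file underneath us is gone (e.g. the .tmp file was deleted):
          // drop the dead handle, roll to a new file, and retry the record once
          // so the sink keeps draining the channel instead of failing every
          // subsequent event.
          closeQuietly(current);
          current = factory.open();
          current.write(record);
          current.sync();
        }
      }

      private static void closeQuietly(Writer w) {
        try {
          w.close();
        } catch (IOException ignored) {
          // the handle is already dead; nothing useful to do
        }
      }
    }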
The only Apache JIRA I could find that is even remotely related to this HDFS sink issue is FLUME-2007, whose fix we didn't have. I tested by pulling the FLUME-2007 patch into flume-1.4.0:

https://github.com/apache/flume/commit/5b5470bd5d3e94842032009c36788d4ae346674b
https://issues.apache.org/jira/browse/FLUME-2007

But it doesn't solve this issue. Should I open a new JIRA ticket? A sketch of the configuration in question and a standalone reproduction follow the quoted message below.

Thanks,
Suhas.

On Fri, Oct 11, 2013 at 4:13 PM, Suhas Satish <[email protected]> wrote:

> Hi, I have the following Flume configuration file, flume-syslog.conf
> (attached).
>
> 1.) I launch it with:
>
> bin/flume-ng agent -n agent -c conf -f conf/flume-syslog.conf
>
> 2.) Generate log output using loggen (provided by syslog-ng):
>
> loggen -I 30 -s 300 -r 900 localhost 13073
>
> 3.) I verify that Flume output is generated under /flume_import/ on the
> Hadoop cluster. It generates output of the form:
>
> -rwxr-xr-x 3 root root 139235 2013-10-11 14:35
> /flume_import/2013/10/14/logdata-2013-10-14-35-45.1381527345384.tmp
> -rwxr-xr-x 3 root root 138095 2013-10-11 14:35
> /flume_import/2013/10/14/logdata-2013-10-14-35-46.1381527346543.tmp
> -rwxr-xr-x 3 root root 135795 2013-10-11 14:35
> /flume_import/2013/10/14/logdata-2013-10-14-35-47.1381527347670.tmp
>
> 4.) Delete one of the Flume output files while loggen is still running and
> Flume is generating the sink output:
>
> hadoop fs -rmr
> /flume_import/2013/10/14/logdata-2013-10-14-35-47.1381527347670.tmp
>
> 5.) This gives me the following exception in the Flume log. Although the
> Flume agent JVM continues to run, it does not generate any more output
> files from syslog-ng until the Flume agent JVM is restarted. Is Flume
> expected to behave like this, or should it handle the IOException
> gracefully and continue to log syslog output to other output directories?
>
> 10 Oct 2013 16:55:42,092 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor]
> (org.apache.flume.sink.hdfs.BucketWriter.append:430) - Caught IOException
> while closing file
> (maprfs:///flume_import/2013/10/16//logdata-2013-10-16-50-03.1381449008596.tmp).
> Exception follows.
> java.io.IOException: 2049.112.5249612
> /flume_import/2013/10/16/logdata-2013-10-16-50-03.1381449008596.tmp (Stale
> file handle)
>         at com.mapr.fs.Inode.throwIfFailed(Inode.java:269)
>         at com.mapr.fs.Inode.flushJniBuffers(Inode.java:402)
>         at com.mapr.fs.Inode.syncInternal(Inode.java:478)
>         at com.mapr.fs.Inode.syncUpto(Inode.java:484)
>         at com.mapr.fs.MapRFsOutStream.sync(MapRFsOutStream.java:244)
>         at com.mapr.fs.MapRFsDataOutputStream.sync(MapRFsDataOutputStream.java:68)
>         at org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:946)
>         at org.apache.flume.sink.hdfs.HDFSSequenceFile.sync(HDFSSequenceFile.java:107)
>         at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:356)
>         at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:353)
>         at org.apache.flume.sink.hdfs.BucketWriter$8$1.run(BucketWriter.java:536)
>         at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:160)
>         at org.apache.flume.sink.hdfs.BucketWriter.access$1000(BucketWriter.java:56)
>         at org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:533)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(T
>
> 6.) I found the following related post, though I'm not sure it covers the
> same issue:
>
> http://mail-archives.apache.org/mod_mbox/flume-user/201305.mbox/%[email protected]%3E
> Can anyone comment?
>
> Thanks,
> Suhas.
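For reference, since the flume-syslog.conf attachment from the quoted message isn't shown inline, here is a minimal sketch of the kind of configuration being described. The property names are standard Flume 1.4 syslog-source and HDFS-sink settings, but the specific values (source type, channel sizing, path escapes, roll settings) are assumptions based on the steps above, not the actual attachment.

    # Sketch of a flume-syslog.conf along the lines described above;
    # values are assumptions, not the actual attachment.
    agent.sources = syslog-src
    agent.channels = mem-ch
    agent.sinks = hdfs-sink

    agent.sources.syslog-src.type = syslogtcp
    agent.sources.syslog-src.host = localhost
    agent.sources.syslog-src.port = 13073
    agent.sources.syslog-src.channels = mem-ch

    agent.channels.mem-ch.type = memory
    agent.channels.mem-ch.capacity = 10000

    agent.sinks.hdfs-sink.type = hdfs
    agent.sinks.hdfs-sink.channel = mem-ch
    agent.sinks.hdfs-sink.hdfs.path = /flume_import/%Y/%m/%d
    agent.sinks.hdfs-sink.hdfs.filePrefix = logdata
    agent.sinks.hdfs-sink.hdfs.fileType = SequenceFile
    agent.sinks.hdfs-sink.hdfs.rollInterval = 1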
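To separate the filesystem behavior from Flume itself, the following standalone sketch (my own, not taken from any Flume code) shows the same failure mode with the plain Hadoop 1.x SequenceFile API: a writer keeps appending to a file that has been deleted out from under it, and the next syncFs() is expected to fail with an IOException. The path is hypothetical, and the exact exception text depends on the underlying filesystem (MapR-FS in the log above reports "Stale file handle").

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    // Standalone illustration, assuming a Hadoop 1.x client on the classpath.
    public class StaleHandleRepro {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path tmp = new Path("/flume_import/repro/logdata-test.tmp"); // hypothetical path

        SequenceFile.Writer writer =
            SequenceFile.createWriter(fs, conf, tmp, LongWritable.class, Text.class);
        writer.append(new LongWritable(1L), new Text("first record"));
        writer.syncFs(); // succeeds while the file still exists

        // Simulate step 4 above: the in-progress .tmp file is deleted externally,
        // e.g. by "hadoop fs -rmr <file>".
        fs.delete(tmp, false);

        writer.append(new LongWritable(2L), new Text("second record"));
        writer.syncFs(); // expected to throw an IOException now that the file is gone
      }
    }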
