Sounds like the expected behavior to me based on the message, though it's a little confusing that it surfaces as an IOException.

Somewhat related, we had our idleTimeout probably set too low, so the files would close pretty often. This was causing a memory leak for us; from what I can tell, this is due to FLUME-1864. So I think it may be a good idea to bump up the idleTimeout if you're constantly closing idle files. I could be wrong though; I would defer to the developers. :) (There's a rough sketch of that kind of config at the bottom of this message.)

On Wed, May 22, 2013 at 8:58 AM, Paul Chavez <[email protected]> wrote:

> This thread reminded me to check my configs since I use a low idleTimeout
> and bucket events by hour. Turned out I still had the default rollInterval
> set, so I disabled that and updated my configs.
>
> Now I see a lot of exceptions logged as warnings in the log immediately
> following an idleTimeout:
>
> 8:55:40.663 AM INFO org.apache.flume.sink.hdfs.BucketWriter
> Closing idle bucketWriter
> /flume/WebLogs/datekey=20130522/hour=08/FlumeData.1369238128886.tmp
> 8:55:40.675 AM INFO org.apache.flume.sink.hdfs.BucketWriter
> Renaming
> /flume/WebLogs/datekey=20130522/hour=08/FlumeData.1369238128886.tmp to
> /flume/WebLogs/datekey=20130522/hour=08/FlumeData.1369238128886
> 8:55:40.677 AM WARN org.apache.flume.sink.hdfs.HDFSEventSink
> HDFS IO error
> java.io.IOException: This bucket writer was closed due to idling and this
> handle is thus no longer valid
>   at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:391)
>   at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:729)
>   at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:727)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>   at java.lang.Thread.run(Thread.java:662)
>
> Given these are logged at WARN, I have been assuming they are benign errors.
> Is that assumption correct?
>
> thanks,
> Paul Chavez
>
> ------------------------------
> From: Connor Woodson [mailto:[email protected]]
> Sent: Tuesday, May 21, 2013 2:13 PM
> To: [email protected]
> Subject: Re: HDFSEventSink Memory Leak Workarounds
>
> The other property you will want to look at is maxOpenFiles, which is
> the number of file/paths held in memory at one time.
>
> If you search for the email thread with subject "hdfs.idleTimeout ,what's
> it used for ?" from back in January, you will find a discussion along these
> lines. As a quick summary: if rollInterval is not set to 0, you should
> avoid using idleTimeout and should set maxOpenFiles to a reasonable number
> (the default is 500, which is too large; I think that default is changed for
> 1.4).
>
> - Connor
>
>
> On Tue, May 21, 2013 at 9:59 AM, Tim Driscoll <[email protected]> wrote:
>
>> Hello,
>>
>> We have a Flume Agent (version 1.3.1) set up using the HDFSEventSink. We
>> were noticing that we were running out of memory after a few days of
>> running, and believe we had pinpointed it to an issue with using the
>> hdfs.idleTimeout setting. I believe this is fixed in 1.4 per FLUME-1864.
>>
>> Our planned workaround was to just remove the idleTimeout setting, which
>> worked, but brought up another issue. Since we are partitioning our data
>> by timestamp, at midnight we rolled over to a new bucket/partition, opened
>> new bucket writers, and left the current bucket writers open. Ideally the
>> idleTimeout would clean this up.
>> So instead of a slow, steady leak, we're encountering a 100MB leak every day.
>>
>> Short of upgrading Flume, does anyone know of a configuration workaround
>> for this? Currently we just bumped up the heap memory and I'm having to
>> restart our agents every few days, which obviously isn't ideal.
>>
>> Is anyone else seeing issues like this? Or how do others use the HDFS
>> sink to continuously write large amounts of logs from multiple source
>> hosts? I can get more in-depth about our setup/environment if necessary.
>>
>> Here's a snippet of one of our 4 HDFS sink configs:
>> agent.sinks.rest-xaction-hdfs-sink.type = hdfs
>> agent.sinks.rest-xaction-hdfs-sink.channel = rest-xaction-chan
>> agent.sinks.rest-xaction-hdfs-sink.hdfs.path = /user/svc-neb/rest_xaction_logs/date=%Y-%m-%d
>> agent.sinks.rest-xaction-hdfs-sink.hdfs.rollCount = 0
>> agent.sinks.rest-xaction-hdfs-sink.hdfs.rollSize = 0
>> agent.sinks.rest-xaction-hdfs-sink.hdfs.rollInterval = 3600
>> agent.sinks.rest-xaction-hdfs-sink.hdfs.idleTimeout = 300
>> agent.sinks.rest-xaction-hdfs-sink.hdfs.batchSize = 1000
>> agent.sinks.rest-xaction-hdfs-sink.hdfs.filePrefix = %{host}
>> agent.sinks.rest-xaction-hdfs-sink.hdfs.fileSuffix = .avro
>> agent.sinks.rest-xaction-hdfs-sink.hdfs.fileType = DataStream
>> agent.sinks.rest-xaction-hdfs-sink.serializer = avro_event
>>
>> -Tim
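
For what it's worth, here's a rough sketch of the workaround discussed above, based on Tim's snippet: drop hdfs.idleTimeout entirely, keep the time-based roll, and cap hdfs.maxOpenFiles at a small number as Connor suggests. The maxOpenFiles value below is just a guess, and I haven't verified this against 1.3.1, so treat it as a starting point rather than a known-good config:

    agent.sinks.rest-xaction-hdfs-sink.type = hdfs
    agent.sinks.rest-xaction-hdfs-sink.channel = rest-xaction-chan
    agent.sinks.rest-xaction-hdfs-sink.hdfs.path = /user/svc-neb/rest_xaction_logs/date=%Y-%m-%d
    # keep time-based rolling so files still get closed without idleTimeout
    agent.sinks.rest-xaction-hdfs-sink.hdfs.rollCount = 0
    agent.sinks.rest-xaction-hdfs-sink.hdfs.rollSize = 0
    agent.sinks.rest-xaction-hdfs-sink.hdfs.rollInterval = 3600
    # no hdfs.idleTimeout line at all, to sidestep the FLUME-1864 leak on 1.3.1
    # cap how many bucket writers are kept in memory (value is illustrative only)
    agent.sinks.rest-xaction-hdfs-sink.hdfs.maxOpenFiles = 50
    # remaining properties (batchSize, filePrefix, fileSuffix, fileType, serializer) as in Tim's snippet

One caveat: with filePrefix = %{host} there is one writer per host per partition, so maxOpenFiles needs to be at least as large as the number of files active at the same time, or writers will be closed and reopened constantly.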
