[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2015-10-19 Thread tzachi (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14963850#comment-14963850
 ] 

tzachi commented on FLUME-2307:
---

Hi guys, this is still happening with Flume 1.5.0-cdh5.4.2. As mentioned above, 
it is also not consistent. It works as expected for a few days (meaning old 
files are deleted after 2 checkpoints), and then for some unknown reason it 
stops deleting old files, until the disk gets full and the logs start shouting 
"Usable space exhausted". I am using 2 different file channels (with 2 
different sinks) and both data directories experience the same issue (lots of 
old files piling up).

> Remove Log writetimeout
> ---
>
> Key: FLUME-2307
> URL: https://issues.apache.org/jira/browse/FLUME-2307
> Project: Flume
>  Issue Type: Bug
>  Components: Channel
>Affects Versions: v1.4.0
>Reporter: Steve Zesch
>Assignee: Hari Shreedharan
> Fix For: v1.5.0
>
> Attachments: FLUME-2307-1.patch, FLUME-2307.patch
>
>
> I've observed Flume failing to clean up old log data in FileChannels. The 
> amount of old log data can range anywhere from tens to hundreds of GB. I was 
> able to confirm that the channels were in fact empty. This behavior always 
> occurs after lock timeouts when attempting to put, take, rollback, or commit 
> to a FileChannel. Once the timeout occurs, Flume stops cleaning up the old 
> files. I was able to confirm that the Log's writeCheckpoint method was still 
> being called and successfully obtaining a lock from tryLockExclusive(), but I 
> was not able to confirm removeOldLogs being called. The application log did 
> not include "Removing old file: log-xyz" for the old files which the Log 
> class would output if they were correctly being removed. I suspect the lock 
> timeouts were due to high I/O load at the time.
> Some stack traces:
> {code}
> org.apache.flume.ChannelException: Failed to obtain lock for writing to the log. Try increasing the log write timeout value. [channel=fileChannel]
> at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doPut(FileChannel.java:478)
> at org.apache.flume.channel.BasicTransactionSemantics.put(BasicTransactionSemantics.java:93)
> at org.apache.flume.channel.BasicChannelSemantics.put(BasicChannelSemantics.java:80)
> at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:189)
> org.apache.flume.ChannelException: Failed to obtain lock for writing to the log. Try increasing the log write timeout value. [channel=fileChannel]
> at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doCommit(FileChannel.java:594)
> at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
> at dataxu.flume.plugins.avro.AsyncAvroSink.process(AsyncAvroSink.java:548)
> at dataxu.flume.plugins.ClassLoaderFlumeSink.process(ClassLoaderFlumeSink.java:33)
> at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
> at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
> at java.lang.Thread.run(Thread.java:619)
> org.apache.flume.ChannelException: Failed to obtain lock for writing to the log. Try increasing the log write timeout value. [channel=fileChannel]
> at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doRollback(FileChannel.java:621)
> at org.apache.flume.channel.BasicTransactionSemantics.rollback(BasicTransactionSemantics.java:168)
> at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:194)
> at dataxu.flume.plugins.avro.AvroSource.appendBatch(AvroSource.java:209)
> at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.avro.ipc.specific.SpecificResponder.respond(SpecificResponder.java:91)
> at org.apache.avro.ipc.Responder.respond(Responder.java:151)
> at org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.messageReceived(NettyServer.java:188)
> at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:75)
> at org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:173)
> at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
> at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:792)
> at 
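A quick way to check whether cleanup is actually happening is to look for the
"Removing old file" message the description mentions. The log path below is a
placeholder, not taken from this thread; use wherever your agent's log output
goes:

{code}
# Count how often the file channel reported removing an old data file.
grep -c "Removing old file" /var/log/flume-ng/flume.log
{code}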

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-11-12 Thread Nina Safonova (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208414#comment-14208414
 ] 

Nina Safonova commented on FLUME-2307:
--

Flume 1.5.0-cdh5.1.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 14be91ec816bac5a91c321b9e8620ffb04acf04c
Compiled by jenkins on Sat Jul 12 09:17:48 PDT 2014
From source with checksum bf4451b17198a612fea60ad6f5420bbc

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-11-11 Thread Nina Safonova (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207623#comment-14207623
 ] 

Nina Safonova commented on FLUME-2307:
--

The OS is CentOS 6.5, the Flume version is 1.5 as I mentioned above, and all 
the logs are also posted above.
I waited long enough and no old files were deleted. It doesn't behave this way 
all the time: from the start until some random moment it works as expected and 
cleans up old files, but at some point it simply stops doing so, and at some 
later point the disk runs out of space.

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-11-11 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207640#comment-14207640
 ] 

Hari Shreedharan commented on FLUME-2307:
-

This does not seem like it is due to the log write timeout. Can you delete all 
of the checkpoint files (all of them) and force a full replay?

Did you delete the files before or after stopping the agent?
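For anyone following along, a minimal sketch of what "delete the checkpoint 
files and force a full replay" can look like on a CDH-packaged install. The 
service name and directories below are placeholders, not values taken from 
this thread; use the checkpointDir from your own agent configuration:

{code}
# Sketch only - adjust the service name and paths to your installation.
# 1. Stop the agent first so nothing is writing to the channel.
sudo service flume-ng-agent stop

# 2. Remove everything under the channel's checkpoint directory
#    (checkpoint, checkpoint.meta, inflightputs, inflighttakes, ...).
#    Leave the data directories alone.
sudo rm -rf /var/lib/flume-ng/file-channel/checkpoint/*

# 3. Restart the agent. With no checkpoint on disk, the file channel performs
#    a full replay of the data files to rebuild its queue state; fully taken
#    log files should then be removed after subsequent checkpoints.
sudo service flume-ng-agent start
{code}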

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-11-11 Thread Nina Safonova (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207643#comment-14207643
 ] 

Nina Safonova commented on FLUME-2307:
--

I didn't try deleting just the checkpoint files.
To keep processing data I deleted all the files (data and checkpoints) after I 
stopped the agent.

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-11-11 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207645#comment-14207645
 ] 

Hari Shreedharan commented on FLUME-2307:
-

From your logs, I don't see the actual cause of this issue. Did you delete the 
inflight* files too?

Deleting the data files while keeping any of the checkpoint files can cause 
unpredictable issues.

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-11-11 Thread Nina Safonova (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207648#comment-14207648
 ] 

Nina Safonova commented on FLUME-2307:
--

I didn't delete any files before I started to experience this issue. After I 
ran out of disk space I tried restarting the agent. When that didn't help clean 
up the old files, I stopped the agent and deleted all the files manually.

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-11-11 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207651#comment-14207651
 ] 

Hari Shreedharan commented on FLUME-2307:
-

From the logs you have here, I can't really see what the issue is. What is the 
exact version of Flume you are running? 

`flume-ng version` can give you that info.

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-11-10 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205212#comment-14205212
 ] 

Hari Shreedharan commented on FLUME-2307:
-

Which version of Flume? OS details and also stacktrace/logs?

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-11-10 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205225#comment-14205225
 ] 

Hari Shreedharan commented on FLUME-2307:
-

Your checkpoint interval is set to 27+ hours (that value is in seconds). Flume 
deletes old files only at each checkpoint. In fact, once the events in a file 
have been removed, the file itself is deleted on the 2nd checkpoint after that 
one (not the one immediately after). So in your case it would need to wait 54 
hours before deleting the file. Even after a replay it waits for another 
checkpoint before deleting the file. Did you wait that long and verify?

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-11-10 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205230#comment-14205230
 ] 

Hari Shreedharan commented on FLUME-2307:
-

Sorry - my mistake. That is in millis. But you should still account for the 2 
checkpoints.
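
To make the units concrete: the interval being discussed is the file channel's
checkpointInterval property, which is given in milliseconds (default 30000,
i.e. a checkpoint every 30 seconds). A minimal sketch of the relevant
configuration; the agent/channel names and paths are hypothetical examples,
not values taken from this thread:

{code}
# Hypothetical fragment of an agent configuration - names and paths are examples.
agent.channels.fileChannel.type = file
agent.channels.fileChannel.checkpointDir = /var/lib/flume-ng/file-channel/checkpoint
agent.channels.fileChannel.dataDirs = /var/lib/flume-ng/file-channel/data

# checkpointInterval is in milliseconds; 30000 = checkpoint every 30 seconds.
# A value around 100000000 (27+ hours) means an emptied data file is only
# deleted on the 2nd checkpoint after it empties - roughly the 54 hours
# mentioned above.
agent.channels.fileChannel.checkpointInterval = 30000
{code}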

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-11-07 Thread Jeff Lord (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202215#comment-14202215
 ] 

Jeff Lord commented on FLUME-2307:
--

Should we re-open this issue?
Not sure if it is still occurring or what.

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-11-07 Thread Nina Safonova (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202342#comment-14202342
 ] 

Nina Safonova commented on FLUME-2307:
--

It is still occurring.


[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-08-13 Thread Nina Safonova (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095973#comment-14095973
 ] 

Nina Safonova commented on FLUME-2307:
--

Hi guys, we recently migrated to Flume 1.5 and we are experiencing a similar 
issue: at some point Flume stopped removing old files, but it kept creating 
new ones, so eventually we ran out of disk space. In the log I see many messages 
for the same log-686 (it is exactly the one at which Flume stopped removing old 
logs):

12 Aug 2014 05:24:49,956 INFO  [Log-BackgroundWorker-channel1] 
(org.apache.flume.channel.file.LogFile$RandomReader.close:504)  - Closing 
RandomReader /local/flume-ng/data/channel1/log-686

I see no "Removing old file: /local/flume-ng/data/channel1/log-686" message, 
while for all the previous (and removed) logs I see:

12 Aug 2014 05:09:49,607 INFO  [Log-BackgroundWorker-channel1] 
(org.apache.flume.channel.file.Log.removeOldLogs:1060)  - Removing old file: 
/local/flume-ng/data/channel1/log-685
12 Aug 2014 05:09:49,715 INFO  [Log-BackgroundWorker-channel1] 
(org.apache.flume.channel.file.Log.removeOldLogs:1060)  - Removing old file: 
/local/flume-ng/data/channel1/log-685.meta

Our configuration is:

tracer.channels.channel1.type = FILE
tracer.channels.channel1.checkpointDir = /local/flume-ng/checkpoints/channel1
tracer.channels.channel1.dataDirs = /local/flume-ng/data/channel1
tracer.channels.channel1.transactionCapacity = 5000
tracer.channels.channel1.checkpointInterval = 10
tracer.channels.channel1.maxFileSize = 2097152000
tracer.channels.channel1.capacity = 1600
tracer.channels.channel1.write-timeout = 60

This happened twice during the last 2 days.
I was able to debug it once and found that in 
org.apache.flume.channel.file.Log.removeOldLogs(SortedSet<Integer> fileIDs) the 
fileIDs passed contain just log-686 and the latest file (which keeps increasing). 
Nothing is in pendingDeletes. There are 26 entries in idLogFileMap, but none of 
them is deleted because minFileID is 686 and all other files have a greater ID. 
Why is this happening?

Thanks
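
To make the observation above concrete, here is a minimal sketch of the pruning behaviour being described, assuming logic along the lines of removeOldLogs (the class and method below are illustrative only, not Flume's actual Log internals): a data file is deleted only when its ID is lower than the smallest ID still referenced by the checkpoint, so once the referenced set shrinks to {686, latest} the minimum stays at 686 and log-687 and newer are never eligible.

{code}
// Illustrative sketch (hypothetical names, not the real org.apache.flume.channel.file.Log).
// Files are removed only when their ID falls below the smallest referenced ID.
// With referencedFileIds = {686, <latest>}, minFileId stays at 686 and nothing
// newer than log-686 is ever deleted, matching the behaviour reported above.
import java.io.File;
import java.util.Map;
import java.util.SortedSet;

class LogPruningSketch {
  static void removeOldLogs(SortedSet<Integer> referencedFileIds,
                            Map<Integer, File> idLogFileMap) {
    int minFileId = referencedFileIds.first();
    for (Map.Entry<Integer, File> entry : idLogFileMap.entrySet()) {
      if (entry.getKey() < minFileId) {
        // Only files strictly older than every referenced file qualify;
        // removal from idLogFileMap and deletion of the .meta file are elided.
        entry.getValue().delete();
      }
    }
  }
}
{code}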


[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-08-13 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096268#comment-14096268
 ] 

Hari Shreedharan commented on FLUME-2307:
-

Can you find out how many events are still in the channel using the metrics?
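
One way to get those numbers is Flume's built-in JSON monitoring: if the agent is started with -Dflume.monitoring.type=http and -Dflume.monitoring.port=34545 (the port is an arbitrary choice for this example), the per-channel counters such as ChannelSize and ChannelFillPercentage are exposed at the /metrics endpoint. A rough sketch of reading them, under that assumption:

{code}
// Rough sketch: assumes the agent exposes HTTP JSON monitoring on port 34545
// (-Dflume.monitoring.type=http -Dflume.monitoring.port=34545).
// The printed JSON contains entries such as
//   "CHANNEL.channel1": { "ChannelSize": "...", "ChannelFillPercentage": "..." }
import java.io.InputStream;
import java.net.URL;
import java.util.Scanner;

public class ChannelMetricsCheck {
  public static void main(String[] args) throws Exception {
    URL metrics = new URL("http://localhost:34545/metrics");
    try (InputStream in = metrics.openStream();
         Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
      System.out.println(scanner.hasNext() ? scanner.next() : "{}");
    }
  }
}
{code}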


[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-08-13 Thread Nina Safonova (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096325#comment-14096325
 ] 

Nina Safonova commented on FLUME-2307:
--

Unfortunately I already restarted Flume. But the channel was operating normally: 
sinks were reading from it, sources were writing to it. After the restart no old 
logs were deleted, so I deleted them manually. Here is the restart log with the 
channel1-related info:

12 Aug 2014 21:36:38,022 INFO  [conf-file-poller-0] 
(org.apache.flume.channel.DefaultChannelFactory.create:40)  - Creating instance 
of channel channel1 type FILE
12 Aug 2014 21:36:38,035 INFO  [conf-file-poller-0] 
(org.apache.flume.node.AbstractConfigurationProvider.loadChannels:205)  - 
Created channel channel1
12 Aug 2014 21:36:38,205 INFO  [conf-file-poller-0] 
(org.apache.flume.node.AbstractConfigurationProvider.getConfiguration:119)  - 
Channel channel1 connected to [es-sink1, es-sink4, es-sink3, es-sink2]
12 Aug 2014 21:36:38,221 INFO  [conf-file-poller-0] 
(org.apache.flume.node.Application.startAllComponents:139)  - Starting new 
configuration:{ sourceRunners:{} sinkRunners:{google-BQ-perf-sink1=SinkRunner: 
{ policy:org.apache.flume.sink.DefaultSinkProcessor@6eb5845c counterGroup:{ 
name:null counters:{} } }, es-sink1=SinkRunner: { 
policy:org.apache.flume.sink.DefaultSinkProcessor@4f04eccc counterGroup:{ 
name:null counters:{} } }, es-sink4=SinkRunner: { 
policy:org.apache.flume.sink.DefaultSinkProcessor@4c566d9b counterGroup:{ 
name:null counters:{} } }, es-sink3=SinkRunner: { 
policy:org.apache.flume.sink.DefaultSinkProcessor@3e360244 counterGroup:{ 
name:null counters:{} } }, es-sink2=SinkRunner: { 
policy:org.apache.flume.sink.DefaultSinkProcessor@4bcede44 counterGroup:{ 
name:null counters:{} } }, google-BQ-sink4=SinkRunner: { 
policy:org.apache.flume.sink.DefaultSinkProcessor@7a62693d counterGroup:{ 
name:null counters:{} } }, google-BQ-sink3=SinkRunner: { 
policy:org.apache.flume.sink.DefaultSinkProcessor@52eb6290 counterGroup:{ 
name:null counters:{} } }, google-BQ-sink2=SinkRunner: { 
policy:org.apache.flume.sink.DefaultSinkProcessor@5b940677 counterGroup:{ 
name:null counters:{} } }, google-BQ-sink1=SinkRunner: { 
policy:org.apache.flume.sink.DefaultSinkProcessor@53349d99 counterGroup:{ 
name:null counters:{} } }} channels:{channel1=FileChannel channel1 { dataDirs: 
[/local/flume-ng/data/channel1] }, channel2=FileChannel channel2 { dataDirs: 
[/local/flume-ng/data/channel2] }, 
channel3=org.apache.flume.channel.MemoryChannel{name: channel3}} }
12 Aug 2014 21:36:38,222 INFO  [conf-file-poller-0] 
(org.apache.flume.node.Application.startAllComponents:146)  - Starting Channel 
channel1
12 Aug 2014 21:36:38,222 INFO  [lifecycleSupervisor-1-0] 
(org.apache.flume.channel.file.FileChannel.start:259)  - Starting FileChannel 
channel1 { dataDirs: [/local/flume-ng/data/channel1] }...
12 Aug 2014 21:36:38,270 INFO  [lifecycleSupervisor-1-0] 
(org.apache.flume.channel.file.Log.replay:385)  - Found NextFileID 714, from 
[/local/flume-ng/data/channel1/log-691, /local/flume-ng/data/channel1/log-707, 
/local/flume-ng/data/channel1/log-696, /local/flume-ng/data/channel1/log-697, 
/local/flume-ng/data/channel1/log-702, /local/flume-ng/data/channel1/log-710, 
/local/flume-ng/data/channel1/log-704, /local/flume-ng/data/channel1/log-711, 
/local/flume-ng/data/channel1/log-698, /local/flume-ng/data/channel1/log-686, 
/local/flume-ng/data/channel1/log-688, /local/flume-ng/data/channel1/log-706, 
/local/flume-ng/data/channel1/log-712, /local/flume-ng/data/channel1/log-705, 
/local/flume-ng/data/channel1/log-714, /local/flume-ng/data/channel1/log-713, 
/local/flume-ng/data/channel1/log-700, /local/flume-ng/data/channel1/log-689, 
/local/flume-ng/data/channel1/log-687, /local/flume-ng/data/channel1/log-690, 
/local/flume-ng/data/channel1/log-708, /local/flume-ng/data/channel1/log-701, 
/local/flume-ng/data/channel1/log-703, /local/flume-ng/data/channel1/log-692, 
/local/flume-ng/data/channel1/log-694, /local/flume-ng/data/channel1/log-709, 
/local/flume-ng/data/channel1/log-695, /local/flume-ng/data/channel1/log-693, 
/local/flume-ng/data/channel1/log-699]
12 Aug 2014 21:36:38,288 INFO  [lifecycleSupervisor-1-0] 
(org.apache.flume.channel.file.EventQueueBackingStoreFileV3.init:53)  - 
Starting up with /local/flume-ng/checkpoints/channel1/checkpoint and 
/local/flume-ng/checkpoints/channel1/checkpoint.meta
12 Aug 2014 21:36:38,289 INFO  [lifecycleSupervisor-1-0] 
(org.apache.flume.channel.file.EventQueueBackingStoreFileV3.init:57)  - 
Reading checkpoint metadata from 
/local/flume-ng/checkpoints/channel1/checkpoint.meta
12 Aug 2014 21:36:38,723 INFO  [lifecycleSupervisor-1-0] 
(org.apache.flume.channel.file.ReplayHandler.replayLog:249)  - Starting replay 
of [/local/flume-ng/data/channel1/log-686, 
/local/flume-ng/data/channel1/log-687, /local/flume-ng/data/channel1/log-688, 
/local/flume-ng/data/channel1/log-689, 

[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-02-26 Thread Arun (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13912985#comment-13912985
 ] 

Arun commented on FLUME-2307:
-

Is it possible to port this patch to the 1.4.0 branch?


[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-02-10 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896990#comment-13896990
 ] 

Brock Noland commented on FLUME-2307:
-

+1

I cannot test/commit as I am on an airplane.


[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-02-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897034#comment-13897034
 ] 

ASF subversion and git services commented on FLUME-2307:


Commit b4ddd5829897f758f869a5fc3b08dcbf4b55156a in branch refs/heads/trunk from 
[~jarcec]
[ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=b4ddd58 ]

FLUME-2307. Remove Log writetimeout

(Hari Shreedharan via Jarek Jarcec Cecho)



[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-02-10 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897035#comment-13897035
 ] 

ASF subversion and git services commented on FLUME-2307:


Commit 0cb6ce69140f137af947de4e0828ff73a623f042 in branch refs/heads/flume-1.5 
from [~jarcec]
[ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=0cb6ce6 ]

FLUME-2307. Remove Log writetimeout

(Hari Shreedharan via Jarek Jarcec Cecho)



[jira] [Commented] (FLUME-2307) Remove Log writetimeout

2014-02-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897130#comment-13897130
 ] 

Hudson commented on FLUME-2307:
---

SUCCESS: Integrated in flume-trunk #549 (See 
[https://builds.apache.org/job/flume-trunk/549/])
FLUME-2307. Remove Log writetimeout (jarcec: 
http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=b4ddd5829897f758f869a5fc3b08dcbf4b55156a)
* 
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java
* 
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java
* 
flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannelConfiguration.java
* flume-ng-doc/sphinx/FlumeUserGuide.rst

