[
https://issues.apache.org/jira/browse/FLUME-798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cameron Gandevia updated FLUME-798:
-----------------------------------
Attachment: 0001-FLUME-798-Modified-RollSink-to-not-cancel-pending-si.patch
We applied https://issues.apache.org/jira/browse/Flume-808 and this patch to
get our collectors working again. This patch is not the best solution because
it reintroduces the original problem of a downstream sink blocking, but we
needed something that worked quickly, so we modified RollSink to not cancel
pending tasks.
The RollSink test will also not pass with this patch.
We are looking at handling the InterruptedExceptions gracefully and will submit
a patch when finished.
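For readers unfamiliar with the trade-off described above, here is a minimal, self-contained sketch of the two behaviours. It is not the RollSink code and not the attached patch; the class and method names (PendingAppendSketch, appendInterrupted) are hypothetical, and a sleeping task on a plain java.util.concurrent executor stands in for an append blocked on a slow downstream sink.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only; not taken from Flume's RollSink.
public class PendingAppendSketch {

    /**
     * Starts a stand-in "blocked append", then simulates a rotation.
     * Returns true if the append observed an InterruptedException.
     */
    public static boolean appendInterrupted(boolean cancelOnRotate) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);
        try {
            Future<?> pending = executor.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(200); // stand-in for a slow downstream write
                } catch (InterruptedException e) {
                    interrupted.set(true);              // the append was cut short
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                }
            });
            started.await(); // make sure the append is in flight before "rotating"
            if (cancelOnRotate) {
                // Cancelling with interruption: the in-flight append fails.
                pending.cancel(true);
            } else {
                // Not cancelling: rotation blocks until the append finishes.
                pending.get(5, TimeUnit.SECONDS);
            }
        } finally {
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
        }
        return interrupted.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("cancel on rotate: interrupted=" + appendInterrupted(true));
        System.out.println("wait on rotate:   interrupted=" + appendInterrupted(false));
    }
}
```

In the second mode the rotation waits for as long as the append takes, which is the downstream-blocking behaviour mentioned above.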
> Blocked append interrupted by rotation event
> ---------------------------------------------
>
> Key: FLUME-798
> URL: https://issues.apache.org/jira/browse/FLUME-798
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v0.9.5
> Reporter: Cameron Gandevia
> Attachments:
> 0001-FLUME-798-Modified-RollSink-to-not-cancel-pending-si.patch
>
>
> Our Flume collector seems to work for a short period of time and then fails
> with the following exception. When this happens the collector does not
> reconnect and the system becomes inactive with the processes still running.
> 2011-10-14 01:49:47,386 [logicalNode collector0_log_dir-115] ERROR
> com.cloudera.flume.core.connector.DirectDriver - Closing down due to
> exception during append calls
> 2011-10-14 01:49:47,387 [logicalNode collector0_log_dir-115] INFO
> com.cloudera.flume.core.connector.DirectDriver - Connector logicalNode
> collector0_log_dir-115 exited with error: Blocked append interrupted by
> rotation event
> java.lang.InterruptedException: Blocked append interrupted by rotation event
> at com.cloudera.flume.handlers.rolling.RollSink.append(RollSink.java:209)
> at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
> at com.cloudera.flume.core.MaskDecorator.append(MaskDecorator.java:43)
> at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
> at com.cloudera.flume.handlers.debug.InsistentOpenDecorator.append(InsistentOpenDecorator.java:169)
> at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
> at com.cloudera.flume.handlers.debug.StubbornAppendSink.append(StubbornAppendSink.java:71)
> at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
> at com.cloudera.flume.handlers.debug.InsistentAppendDecorator.append(InsistentAppendDecorator.java:110)
> at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
> at com.cloudera.flume.handlers.endtoend.AckChecksumChecker.append(AckChecksumChecker.java:113)
> at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
> at com.cloudera.flume.handlers.batch.UnbatchingDecorator.append(UnbatchingDecorator.java:62)
> at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
> at com.cloudera.flume.handlers.batch.GunzipDecorator.append(GunzipDecorator.java:81)
> at com.cloudera.flume.collector.CollectorSink.append(CollectorSink.java:222)
> at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
> at com.cloudera.flume.core.extractors.DateExtractor.append(DateExtractor.java:129)
> at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
> at com.cloudera.flume.core.extractors.RegexExtractor.append(RegexExtractor.java:88)
> at com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:133)
> 2011-10-14 01:49:47,388 [logicalNode collector0_log_dir-115] INFO
> com.cloudera.flume.collector.CollectorSource - closed
> 2011-10-14 01:49:48,391 [logicalNode collector0_log_dir-115] INFO
> com.cloudera.flume.handlers.thrift.ThriftEventSource - Closed server on port
> 36892...
> 2011-10-14 01:49:48,391 [logicalNode collector0_log_dir-115] INFO
> com.cloudera.flume.handlers.thrift.ThriftEventSource - Queue still has 1000
> elements ...
> 2011-10-14 01:49:58,399 [logicalNode collector0_log_dir-115] WARN
> com.cloudera.flume.handlers.thrift.ThriftEventSource - Close timed out due to
> no progress. Closing despite having 1000 values still enqueued
> 2011-10-14 01:49:58,399 [logicalNode collector0_log_dir-115] INFO
> com.cloudera.flume.handlers.rolling.RollSink - closing RollSink
> 'escapedCustomDfs("hdfs://van-mang-perf-hadoop-namenode1.net:8020/rawLogs/%{dateyear}-%{datemonth}-%{dateday}/%{datehr}00","raw-%{rolltag}"
> )'
> 2011-10-14 01:49:58,400 [logicalNode collector0_log_dir-115] INFO
> com.cloudera.flume.handlers.rolling.RollSink - double close
> 'escapedCustomDfs("hdfs://van-mang-perf-hadoop-namenode1.net:8020/rawLogs/%{dateyear}-%{datemonth}-%{dateday}/%{datehr}00","raw-%{rolltag}"
> )'
> 2011-10-14 01:49:58,400 [logicalNode collector0_log_dir-115] ERROR
> com.cloudera.flume.core.connector.DirectDriver - Exiting driver logicalNode
> collector0_log_dir-115 in error state CollectorSource | RegexExtractor
> because Blocked append interrupted by rotation event