[ https://issues.apache.org/jira/browse/FLUME-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615773#comment-15615773 ]
Denes Arvay commented on FLUME-2812:
------------------------------------

I have investigated this issue. The cause is that {{bytesRemaining.release(putByteCounter)}} gets called in {{MemoryTransaction.doRollback}} (https://github.com/apache/flume/blob/trunk/flume-ng-core/src/main/java/org/apache/flume/channel/MemoryChannel.java#L174) while the matching acquire is called only in {{doCommit}}. Rolling back a transaction that contains puts therefore releases permits that were never acquired. This semaphore leak grows until the permit count would exceed {{Integer.MAX_VALUE}}, at which point {{Semaphore.release()}} throws this error.

I think this bug was introduced unintentionally when the {{bytesRemaining}} semaphore handling was refactored and moved from {{doPut}} to {{doCommit}} in FLUME-2233. While the acquire was in {{doPut}}, the release in {{doRollback}} was correct, but once the acquire moved to {{doCommit}} the release in {{doRollback}} should have been deleted.

GitHub Pull Request for this ticket: https://github.com/apache/flume/pull/83

> Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" java.lang.Error: Maximum permit count exceeded
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLUME-2812
>                 URL: https://issues.apache.org/jira/browse/FLUME-2812
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel, Sinks+Sources
>    Affects Versions: v1.6.0
>         Environment: **OS INFO**
> CentOS release 6.6 (Final)
> Kernel \r on an \m
> **JAVA INFO**
> java version "1.8.0_40"
> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
>            Reporter: Rollin Crittendon
>            Assignee: Denes Arvay
>            Priority: Critical
>
> We are finding that after about an hour of heavy processing of Flume data in an agent we get the following exception. The agent processes about 5-7 k lines/second during that time.
> The agent is configured with a Kafka source (the one that comes with 1.6.0), a Memory channel, and a Thrift sink.
> =======
> Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" java.lang.Error: Maximum permit count exceeded
>       at java.util.concurrent.Semaphore$Sync.tryReleaseShared(Semaphore.java:192)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341)
>       at java.util.concurrent.Semaphore.release(Semaphore.java:609)
>       at org.apache.flume.channel.MemoryChannel$MemoryTransaction.doCommit(MemoryChannel.java:147)
>       at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
>       at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:379)
>       at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>       at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>       at java.lang.Thread.run(Thread.java:745)
> =======
> The above error is from standard error when running the Flume agent. The effect is that the "SinkRunner-PollingRunner-DefaultSinkProcessor" thread disappears from the agent; this can be seen on a JMX console.
> For us, this means that the Flume agent needs to be restarted. The error is terminal for that instance of the Java process because the thread is gone.
> It sounds like something in JDK 7+ got stricter?!
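
For illustration, here is a minimal standalone sketch of the leak described in the comment. It does not use any Flume code: the class name, {{CAPACITY}}, and both helper methods are hypothetical stand-ins, and only {{java.util.concurrent.Semaphore}} is the real API. It models put transactions that are rolled back before {{doCommit}} has acquired anything, and separately shows the error that {{Semaphore.release()}} raises once the permit count would overflow.

```java
import java.util.concurrent.Semaphore;

// Hypothetical model of the bytesRemaining accounting; not Flume's actual code.
class SemaphoreLeakSketch {
    // Assumed stand-in for the channel's byte capacity.
    static final int CAPACITY = 1_000;

    /**
     * Simulates `rollbacks` put transactions that fail before commit and
     * returns the permits available afterwards.
     */
    static int permitsAfterRollbacks(boolean releaseOnRollback, int rollbacks, int putBytes) {
        Semaphore bytesRemaining = new Semaphore(CAPACITY);
        for (int i = 0; i < rollbacks; i++) {
            // Put phase: nothing is acquired here, because in this model
            // the acquire happens only at commit time.
            // Rollback phase: commit never ran, so this transaction holds no permits.
            if (releaseOnRollback) {
                // Buggy variant: releases permits that were never acquired.
                bytesRemaining.release(putBytes);
            }
        }
        return bytesRemaining.availablePermits();
    }

    /** Returns the message of the Error thrown when a release overflows the permit count. */
    static String overflowMessage() {
        Semaphore full = new Semaphore(Integer.MAX_VALUE);
        try {
            full.release(1);
            return null;
        } catch (Error e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("with release on rollback:    " + permitsAfterRollbacks(true, 5, 100));
        System.out.println("without release on rollback: " + permitsAfterRollbacks(false, 5, 100));
        System.out.println("overflow error: " + overflowMessage());
    }
}
```

In this model the buggy variant ends with more permits than the configured capacity, while the variant without the release stays at the capacity, which is the behaviour the fix aims for.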