[
https://issues.apache.org/jira/browse/KAFKA-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sönke Liebau resolved KAFKA-1016.
---------------------------------
Resolution: Not A Problem
Closing this as "Not A Problem"; I believe the purgatory redesign should help
with the issue described here to a large extent.
> Broker should limit purgatory size
> ----------------------------------
>
> Key: KAFKA-1016
> URL: https://issues.apache.org/jira/browse/KAFKA-1016
> Project: Kafka
> Issue Type: Bug
> Components: purgatory
> Affects Versions: 0.8.0
> Reporter: Chris Riccomini
> Assignee: Joel Koshy
> Priority: Major
>
> I recently ran into a case where a poorly configured Kafka consumer was able
> to trigger out of memory exceptions in multiple Kafka brokers. The consumer
> was configured to have a fetcher.max.wait of Int.MaxValue.
> For low volume topics, this configuration causes the consumer to block
> frequently, and for long periods of time. [~junrao] informs me that the fetch
> request will time out after the socket timeout is reached. In our case, this
> was set to 30s.
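> As a rough illustration, and assuming the fetcher.max.wait setting above
> corresponds to the 0.8 high-level consumer property fetch.wait.max.ms, the
> misconfiguration boils down to consumer properties like these (connection
> values are placeholders):
> {code}
> val props = new java.util.Properties()
> props.put("zookeeper.connect", "localhost:2181")   // placeholder
> props.put("group.id", "example-group")             // placeholder
> // The consumer asks the broker to hold each fetch "forever" waiting for data...
> props.put("fetch.wait.max.ms", Int.MaxValue.toString)
> // ...so on a low volume topic the request sits in purgatory until the socket
> // timeout (30s here) finally kills it.
> props.put("socket.timeout.ms", "30000")
> {code}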
> With several thousand consumer threads, the fetch request purgatory grew to
> roughly 100,000-400,000 outstanding requests, which we believe triggered the out
> of memory exception. [~nehanarkhede] claims to have seen similar behavior in other high
> volume clusters.
> It seems like a bad thing that a poorly configured consumer can
> trigger out of memory exceptions in the broker. I was thinking maybe it makes
> sense to have the broker try to protect itself from this situation. Here are
> some potential solutions:
> 1. Have a broker-side max wait config for fetch requests.
> 2. Threshold the purgatory size, and either drop the oldest connections in
> purgatory, or reject the newest fetch requests when purgatory is full. (Both
> options are sketched below.)
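> A minimal sketch of both options, using hypothetical broker-side names
> (BoundedFetchPurgatory, maxWaitMs, maxSize are illustrative, not Kafka's
> actual API):
> {code}
> import java.util.concurrent.atomic.AtomicInteger
>
> // Illustrative only: not Kafka's actual broker code.
> class BoundedFetchPurgatory(maxWaitMs: Long, maxSize: Int) {
>   private val size = new AtomicInteger(0)
>
>   // Option 1: clamp the client-requested wait to a broker-side maximum.
>   def effectiveWaitMs(requestedWaitMs: Long): Long =
>     math.min(requestedWaitMs, maxWaitMs)
>
>   // Option 2: refuse to park new requests once the purgatory is full.
>   // Returns true if the request was parked, false if the caller should
>   // answer it immediately (possibly with an empty fetch response).
>   def tryWatch[T](request: T)(park: T => Unit): Boolean = {
>     if (size.get() >= maxSize) {
>       false
>     } else {
>       size.incrementAndGet()
>       park(request) // hold the request until data arrives or the wait expires
>       true
>     }
>   }
>
>   // Called when a parked request is satisfied or expires.
>   def completed(): Unit = size.decrementAndGet()
> }
> {code}
> Dropping the oldest watched request instead of rejecting the newest would also
> need the watchers ordered by arrival time, which this sketch leaves out.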
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)