[ https://issues.apache.org/jira/browse/KAFKA-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sönke Liebau resolved KAFKA-1016.
---------------------------------
    Resolution: Not A Problem

Closing this as "Not A Problem"; I believe the Purgatory redesign should help with the issue described here to a large extent.

> Broker should limit purgatory size
> ----------------------------------
>
>                 Key: KAFKA-1016
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1016
>             Project: Kafka
>          Issue Type: Bug
>          Components: purgatory
>    Affects Versions: 0.8.0
>            Reporter: Chris Riccomini
>            Assignee: Joel Koshy
>            Priority: Major
>
> I recently ran into a case where a poorly configured Kafka consumer was able to trigger out of memory exceptions in multiple Kafka brokers. The consumer was configured with a fetcher.max.wait of Int.MaxValue.
> For low volume topics, this configuration causes the consumer to block frequently, and for long periods of time. [~junrao] informs me that the fetch request will time out once the socket timeout is reached. In our case, this was set to 30s.
> With several thousand consumer threads, the fetch request purgatory grew into the 100,000-400,000 entry range, which we believe triggered the out of memory exception. [~nehanarkhede] reports having seen similar behavior in other high volume clusters.
> It seems like a bad thing that a poorly configured consumer can trigger out of memory exceptions in the broker, so it may make sense to have the broker try to protect itself from this situation. Here are some potential solutions (illustrative sketches follow below):
> 1. Have a broker-side max wait config for fetch requests.
> 2. Threshold the purgatory size, and either drop the oldest connections in purgatory, or reject the newest fetch requests when purgatory is full.
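For concreteness, here is a minimal sketch of the misconfiguration described above, written against the modern Java consumer, where the analogous setting is fetch.max.wait.ms (the 0.8-era consumer config names differ). The bootstrap address and group id are assumptions for illustration.

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MisconfiguredConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "low-volume-readers");      // hypothetical group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        // The problematic setting: ask the broker to hold each fetch until
        // fetch.min.bytes accumulate or ~24 days elapse, whichever comes first.
        props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, Integer.MAX_VALUE);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Each poll against a quiet topic now parks a delayed fetch in the
            // broker's purgatory, held open until data arrives or the client's
            // socket timeout fires.
        }
    }
}
{code}

Multiply that parked fetch by several thousand consumer threads and you get the purgatory growth described in the report.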
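And a hedged sketch of the two proposed mitigations. Everything here is hypothetical: BoundedFetchPurgatory, DelayedFetch, and both policies are illustrative names for this ticket's ideas, not Kafka APIs.

{code:java}
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of options 1 and 2 above; not real Kafka code. */
public final class BoundedFetchPurgatory {

    /** Placeholder for a fetch request parked in purgatory. */
    public interface DelayedFetch {
        void completeWithTimeout(); // respond now, as if maxWait had expired
        void rejectTooBusy();       // respond immediately with an error
    }

    private final long brokerMaxWaitMs; // option 1: broker-side cap on maxWait
    private final int maxPurgatorySize; // option 2: threshold on parked requests
    private final ConcurrentLinkedDeque<DelayedFetch> parked = new ConcurrentLinkedDeque<>();
    private final AtomicInteger size = new AtomicInteger();

    public BoundedFetchPurgatory(long brokerMaxWaitMs, int maxPurgatorySize) {
        this.brokerMaxWaitMs = brokerMaxWaitMs;
        this.maxPurgatorySize = maxPurgatorySize;
    }

    /** Option 1: clamp the client-requested wait to a broker-side limit. */
    public long effectiveMaxWaitMs(long requestedMaxWaitMs) {
        return Math.min(requestedMaxWaitMs, brokerMaxWaitMs);
    }

    /** Option 2: park a fetch, evicting the oldest entry when over threshold. */
    public void park(DelayedFetch fetch) {
        if (size.incrementAndGet() > maxPurgatorySize) {
            DelayedFetch oldest = parked.pollFirst();
            if (oldest != null) {
                size.decrementAndGet();
                // Drop-oldest policy; calling fetch.rejectTooBusy() instead
                // would implement the reject-newest alternative.
                oldest.completeWithTimeout();
            }
        }
        parked.addLast(fetch);
    }
}
{code}

Of the two eviction policies, drop-oldest is relatively benign for fetches, since completing a fetch early just returns whatever data is currently available, while reject-newest pushes the error back onto the client that is generating the load.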