[ https://issues.apache.org/jira/browse/KAFKA-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15852189#comment-15852189 ]
ASF GitHub Bot commented on KAFKA-4725:
---------------------------------------

GitHub user halorgium opened a pull request:

    https://github.com/apache/kafka/pull/2496

    KAFKA-4725: Stop leaking messages in produce request body when requests are delayed

    This change is in response to [KAFKA-4725](https://issues.apache.org/jira/browse/KAFKA-4725).

    When a produce request is received and the user/client is exceeding
    their produce quota, the response is delayed until the quota has
    refilled. Unfortunately, the delayed callback still references the
    request body, which in turn leaks the messages contained in the
    request.

    This change allows the `KafkaApis` method to take ownership of the
    request body from the `RequestChannel.Request` object.

    I am not sure whether this breaks invariants assumed in other parts
    of Kafka.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/heroku/kafka fix-throttled-response-leak

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2496.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2496

----
commit ddb0541b156db546fbf6e065670fb25d6e4baba2
Author: Tim Carey-Smith <t...@spork.in>
Date:   2017-02-01T23:18:43Z

    Stop leaking produce request in throttled requests

    Further isolate the request from the callbacks

    Remove pointless changes

    Move body ownership logic into RequestChannel

----

> Kafka broker fails due to OOM when producer exceeds throttling quota for
> extended periods of time
> -------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4725
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4725
>             Project: Kafka
>          Issue Type: Bug
>          Components: core, producer
>   Affects Versions: 0.10.1.1
>         Environment: Ubuntu Trusty (14.04.5), Oracle JDK 8
>            Reporter: Jeff Chao
>            Priority: Critical
>              Labels: reliability
>             Fix For: 0.10.3.0, 0.10.2.1
>
>         Attachments: oom-references.png
>
>
> Steps to Reproduce:
> 1. Create a non-compacted topic with 1 partition
> 2. Set a produce quota of 512 KB/s
> 3. Send messages at 20 MB/s
> 4. Observe heap memory growth as time progresses
>
> Investigation:
> While running performance tests with a user configured with a produce quota,
> we found that the lead broker serving the requests would exhaust heap memory
> if the producer sustained an inbound request throughput greater than the
> produce quota.
> Upon further investigation, we took a heap dump from that broker process and
> discovered that the ThrottledResponse object has an indirect reference to the
> byte[] holding the messages associated with the ProduceRequest.
> We're happy to contribute a patch, but in the meantime wanted to first raise
> the issue and get feedback from the community.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
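
The retention pattern described in the pull request above can be illustrated with a minimal Java sketch. This is not Kafka's code: `Request`, `takeBody`, `handleLeaky`, `handleOwning`, and the `delayed` queue are hypothetical stand-ins chosen for illustration, and the real fix lives in `KafkaApis` and `RequestChannel.Request`. The sketch only shows why a delayed callback that captures the whole request keeps the message bytes reachable, and how handing ownership of the body to the handler avoids that.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class ThrottledBodySketch {

    /** Hypothetical stand-in for a queued request; not Kafka's RequestChannel.Request. */
    static final class Request {
        final int correlationId;
        private byte[] body;

        Request(int correlationId, byte[] body) {
            this.correlationId = correlationId;
            this.body = body;
        }

        byte[] peekBody() {
            return body;
        }

        /** Transfers ownership: returns the body and clears this object's reference. */
        byte[] takeBody() {
            byte[] taken = body;
            body = null;
            return taken;
        }

        boolean holdsBody() {
            return body != null;
        }
    }

    /** Callbacks parked here stand in for responses delayed until the quota refills. */
    static final Queue<Runnable> delayed = new ArrayDeque<>();

    /** Placeholder for writing the response; intentionally does nothing. */
    static void sendResponse(int correlationId, int bytesAppended) {
    }

    /** Leaky variant: the callback captures the request, which still references the body. */
    static void handleLeaky(Request request) {
        int appended = request.peekBody().length; // pretend the messages were appended
        delayed.add(() -> sendResponse(request.correlationId, appended));
    }

    /** Owning variant: the handler takes the body, and the callback captures only two ints. */
    static void handleOwning(Request request) {
        byte[] messages = request.takeBody();
        int appended = messages.length; // pretend the messages were appended
        int correlationId = request.correlationId;
        delayed.add(() -> sendResponse(correlationId, appended));
    }

    public static void main(String[] args) {
        Request leaky = new Request(1, new byte[1024]);
        Request owning = new Request(2, new byte[1024]);
        handleLeaky(leaky);
        handleOwning(owning);
        System.out.println("leaky request still holds body: " + leaky.holdsBody());
        System.out.println("owning request still holds body: " + owning.holdsBody());
    }
}
```

In the leaky variant every parked callback pins one request body until the delay expires, so a producer that stays above its quota makes the retained bytes grow with the length of the delay queue. In the owning variant the body becomes unreachable as soon as the handler returns, whatever the delay.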