[ https://issues.apache.org/jira/browse/ZOOKEEPER-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825539#comment-16825539 ]
Fangmin Lv commented on ZOOKEEPER-3356:
---------------------------------------

[~eolivelli] The OOM issue happens when there is sustained traffic beyond what ZK can support. It would be nice to have in 3.5, but I don't think it's a blocker.

> Request throttling in Netty is not working as expected and could cause direct
> buffer OOM issue
> -----------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3356
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3356
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.4, 3.6.0
>            Reporter: Fangmin Lv
>            Assignee: Fangmin Lv
>            Priority: Major
>             Fix For: 3.6.0
>
>
> The current Netty enable/disable recv logic may cause a direct buffer OOM:
> a read may be enabled, pull in a large chunk of packets, and only be
> disabled again after consuming a single ZK request. We have seen this
> problem in prod occasionally.
>
> We need more advanced flow control in Netty instead of relying on
> AUTO_READ. We have improved this internally by enabling/disabling recv
> based on the queuedBuffer size, and will upstream it soon.
>
> With this implementation, the max Netty queued buffer size (direct memory
> usage) will be 2 * recv_buffer size. It is not bounded per message,
> because in epoll ET mode Netty will keep reading until the socket is
> empty, and because SslHandler will trigger another read when it has only a
> partial encrypted packet and has not yet emitted any decrypted message.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
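The throttling idea described above can be illustrated with a minimal, self-contained sketch. This is not ZooKeeper's actual patch (the real fix lives in the Netty server connection code and toggles the channel's read state); the class and method names here are hypothetical, and the `2 * recv_buffer` bound is modeled by assuming one extra recv-buffer chunk can arrive before the disable takes effect, as the epoll ET / SslHandler note explains:

```java
// Hypothetical sketch of queuedBuffer-based recv throttling; names are
// illustrative and do not match ZooKeeper's real classes.
public class RecvThrottle {
    private final int recvBufferSize;
    private int queuedBytes = 0;
    private boolean readEnabled = true;

    public RecvThrottle(int recvBufferSize) {
        this.recvBufferSize = recvBufferSize;
    }

    // Called when a chunk arrives from the socket.
    public void onBytesReceived(int n) {
        queuedBytes += n;
        // Disable further reads once the queued buffer holds one
        // recv-buffer worth of data. In epoll ET mode one more
        // recv-buffer chunk may still arrive before the disable takes
        // effect, bounding queued data at ~2 * recvBufferSize.
        if (queuedBytes >= recvBufferSize) {
            readEnabled = false;
        }
    }

    // Called after the server consumes bytes from the queued buffer.
    public void onBytesConsumed(int n) {
        queuedBytes -= n;
        if (queuedBytes < recvBufferSize) {
            readEnabled = true;
        }
    }

    public boolean isReadEnabled() { return readEnabled; }
    public int queuedBytes() { return queuedBytes; }

    public static void main(String[] args) {
        RecvThrottle t = new RecvThrottle(1024);
        t.onBytesReceived(1024);               // first recv-buffer chunk
        System.out.println(t.isReadEnabled()); // false: reads disabled
        t.onBytesReceived(1024);               // chunk already in flight (ET mode)
        System.out.println(t.queuedBytes());   // 2048 == 2 * recvBufferSize
        t.onBytesConsumed(1536);
        System.out.println(t.isReadEnabled()); // true: reads re-enabled
    }
}
```

Contrast this with AUTO_READ, where the decision to keep reading is tied to per-read events rather than to how much decoded data is still queued, so a single enable can pull in far more data than one request's worth.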