Github user HeartSaVioR commented on the pull request: https://github.com/apache/storm/pull/532#issuecomment-158578419 @revans2 Sorry to respond late. I agree that we can set default value of TOPOLOGY_SHELLBOLT_MAX_PENDING to reasonable value (not off) when deadlock bug you stated is fixed. Then ShellBolt could play well with ABP. I'm curious that ShellBolt could play with ABP when TOPOLOGY_SHELLBOLT_MAX_PENDING is off cause ShellBolt always tries to buffer pending writes ASAP regardless of pressure of subprocess. In order to resolve deadlock, I think we can have another queue which only stores pending messages for taskid which is unbounded (not affected to TOPOLOGY_SHELLBOLT_MAX_PENDING). And we introduce priorities for kinds of ShellBolt message, heartbeat > taskid > process, and let BoltWriterRunnable always tries to send message for higher priority. In result, only Thread which runs execute() could be blocked, and it is unblocked as soon as pending taskids are all sent. It doesn't end up starvation because heartbeat is triggered for every 1 sec, and taskids messages are never generated while subprocess doesn't process any requests. Additional points of this approach are, 1. Subprocess can finish emitting new tuple regardless of count of added pending messages before adding taskids. We can reduce waiting time for taskids. 2. It can also greatly reduce size of subprocess's pending_commands queue, which could reduce out of memory issue for subprocess. It is more important than ShellBolt's OOME because subprocess's memory issue could make machine hang or down. 3. 2. also reduce wait time for processing heartbeat message, which has been a headache issue. I'll close this pull request and work on new approach.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---