Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/2241
  
    @roshannaik I am happy to retry it with max spout pending disabled, but in 
my testing I found that disabling it hurt performance. My initial tests 
(prior to modifying TVL to have lower parallelism) showed that it was having 
a lot of trouble with GC slowing it down.  It could not handle 150,000 
sentences per second, and would max out at about 120,000 to 130,000:
    
    ```
    150000 1 -c topology.workers=1 -c topology.acker.executors=2
    ```
    
    But when I added a max spout pending of 500
    
    ```
    150000 1 -c topology.workers=1 -c topology.acker.executors=2 -c topology.max.spout.pending=500
    ```
    
    it was able to easily keep up.
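
    For readers who set these options in code rather than with `-c` 
overrides, here is a minimal sketch of the same two configurations as a plain 
map. The class `SpoutPendingConfig` and its `build` method are hypothetical 
names made up for illustration and are not part of Storm; the keys are the 
ones from the command lines above. Storm's own `Config` class is a map of 
such keys, so the same entries would be put there in a real topology.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm: builds a config map equivalent
// to the -c overrides used in the two runs above.
class SpoutPendingConfig {

    // Passing null for maxSpoutPending leaves topology.max.spout.pending
    // unset, which corresponds to the first run (no cap configured).
    static Map<String, Object> build(int workers, int ackers, Integer maxSpoutPending) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("topology.workers", workers);
        conf.put("topology.acker.executors", ackers);
        if (maxSpoutPending != null) {
            conf.put("topology.max.spout.pending", maxSpoutPending);
        }
        return conf;
    }

    public static void main(String[] args) {
        // First run: one worker, two ackers, no max spout pending.
        System.out.println(build(1, 2, null));
        // Second run: same settings with max spout pending capped at 500.
        System.out.println(build(1, 2, 500));
    }
}
```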
    
    Also later on I was trying to tune it to an optimal value, and I tried 
several different values for it.
    
    ```
    300000 1 -c topology.workers=1 -c topology.acker.executors=1 -c topology.max.spout.pending=1000 3 wc-test 1 1 1
    ```
    which maxed out the throughput at about 230,000 sentences per second, but 
setting it to 2000
    
    ```
    300000 1 -c topology.workers=1 -c topology.acker.executors=1 -c topology.max.spout.pending=2000 3 wc-test 1 1 1
    ```
    
    dropped that maximum to 100,000. At the time I didn't dig in to see what 
the bottleneck was, as I did before, so I cannot say whether it was GC or 
not.
    
    I am also opposed to removing `max.spout.pending` entirely until several 
issues with its removal can be addressed, but I'll address that in a separate 
post as it is kind of long and complicated.

