Github user revans2 commented on the issue:
https://github.com/apache/storm/pull/2241
@roshannaik I am happy to retry it with max spout pending disabled, but in
my testing I found that disabling it hurt performance. My initial tests
(prior to modifying TVL to have lower parallelism) showed that it was having
a lot of trouble with GC slowing it down. It could not handle 150,000
sentences per second, and would max out at about 120,000 to 130,000 with
```
150000 1 -c topology.workers=1 -c topology.acker.executors=2
```
But when I added a max spout pending of 500
```
150000 1 -c topology.workers=1 -c topology.acker.executors=2 -c topology.max.spout.pending=500
```
it was able to easily keep up.
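One way to picture why a cap could help is a toy model of in-flight tuples. This is only a sketch and not how Storm implements `topology.max.spout.pending`. The function `simulate` and the per-tick rates are made up for illustration and are not taken from the runs above. It assumes the spout emits faster than downstream can ack, and counts how many un-acked tuples pile up with and without a cap.

```python
from collections import deque


def simulate(emit_per_tick, ack_per_tick, ticks, max_pending=None):
    """Toy model: return the peak number of un-acked (in-flight) tuples.

    Hypothetical helper for illustration only; not part of Storm.
    """
    in_flight = deque()
    peak = 0
    next_id = 0
    for _ in range(ticks):
        # Spout side: emit up to emit_per_tick tuples, stopping at the cap.
        for _ in range(emit_per_tick):
            if max_pending is not None and len(in_flight) >= max_pending:
                break
            in_flight.append(next_id)
            next_id += 1
        peak = max(peak, len(in_flight))
        # Downstream side: ack up to ack_per_tick of the oldest tuples.
        for _ in range(min(ack_per_tick, len(in_flight))):
            in_flight.popleft()
    return peak


# Illustrative rates only: the spout offers 150 tuples per tick,
# downstream can ack 120 per tick.
unbounded_peak = simulate(150, 120, 1000)
capped_peak = simulate(150, 120, 1000, max_pending=500)
print("peak in flight without a cap:", unbounded_peak)
print("peak in flight with max_pending=500:", capped_peak)
```

In this model the uncapped backlog grows for as long as the spout outpaces the ackers, while the capped run never holds more than 500 tuples. A growing backlog of live objects is one plausible source of GC pressure, though the model says nothing about which value is optimal.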
Also later on I was trying to tune it to an optimal value, and I tried
several different values for it.
```
300000 1 -c topology.workers=1 -c topology.acker.executors=1 -c topology.max.spout.pending=1000 3 wc-test 1 1 1
```
which maxed out the throughput at about 230,000 sentences per second, but
setting it to 2000
```
300000 1 -c topology.workers=1 -c topology.acker.executors=1 -c topology.max.spout.pending=2000 3 wc-test 1 1 1
```
dropped that maximum to 100,000. At the time I didn't dig in to see what the
bottleneck was, as I had before, so I cannot say if it was GC or not.
I am also opposed to removing `max.spout.pending` entirely until several
issues with its removal can be addressed, but I'll address that in a separate
post as it is kind of long and complicated.