Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2241

@harshach Reiterating what @HeartSaVioR said about benchmarking: most benchmarking is done by pushing a system to its limits to see the maximum throughput it can do. That is far from what a real user wants. It looks good for a vendor to brag "I can do X but that other vendor over there can only do Y," but it is close to worthless for what real users want to know.

Real users are trying to balance the cost of the system in $ (CPU time + memory used become this: how many EC-whatever instances do I need?), the amount of data they can push through the system, and how quickly they can get results back. Each of these variables is reflected by this test. In most cases a user has a set load that they know they typically get, and a reasonable guess at the maximum load they expect to see. Most users also have a deadline after which the data is no good any more (if not, they should be using batch), and a budget they have to spend on the project (if not, call me; I want to work for you and my salary requirements are very reasonable).

You need to give users tools to explore all three, and because the three are intertwined, you want to be able to hold one or two of the variables constant while you measure the others. Storm currently has no way to set a target SLA (I hope to add one eventually), but you can control the rate at which messages arrive and the parallelism of the topology (which reflects the cost). So the goal is to scan through various throughput values and various parallelisms to see what the latency is and what resources are actually used. In the real world we would adjust the heap size and parallelism accordingly.

Complaining about a benchmark creating 51 threads relates to the parallelism that we want to explore. If that is what I did wrong in the benchmark, I am happy to adjust and reevaluate. I want to understand how the parallelism impacts this code.
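To make the "hold one or two variables constant, measure the others" idea concrete, here is a minimal sketch of the kind of sweep harness I mean. Everything in it is illustrative: `measure_latency_ms` is a stand-in for actually running a topology at a given input rate and parallelism, and the toy model below it is just a queueing-style placeholder, not Storm's real behavior.

```python
import itertools

def sweep(rates, parallelisms, measure_latency_ms):
    """Scan every (rate, parallelism) pair, recording measured latency.

    Holding one axis fixed while reading off the other two lets you see
    the throughput/latency/cost trade-off instead of a single peak number.
    """
    results = []
    for rate, par in itertools.product(rates, parallelisms):
        results.append({
            "rate_per_sec": rate,
            "parallelism": par,
            "latency_ms": measure_latency_ms(rate, par),
        })
    return results

# Toy stand-in for a real measurement: latency blows up as the offered
# rate approaches what the current parallelism can absorb (hypothetical
# 50k tuples/sec per unit of parallelism).
def toy_measure(rate, par):
    capacity = par * 50_000
    utilization = min(rate / capacity, 0.99)
    return 1.0 / (1.0 - utilization)

rows = sweep([10_000, 50_000, 100_000], [1, 4, 16], toy_measure)
for r in rows:
    print(r)
```

In a real run, `measure_latency_ms` would deploy the benchmark topology at that rate and parallelism and report a latency percentile; the point of the harness is only that the sweep covers the grid rather than a single saturation point.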
The thing that concerns me now is that scaling a topology appears to work very differently now, and I want to understand exactly how. I cannot easily roll out a change to my customers saying things might get a lot better or they might get a lot worse. We need to make it easy for a user with a topology that may not have been ideal (but worked well) to continue to work well.