[
https://issues.apache.org/jira/browse/STORM-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280160#comment-15280160
]
Robert Joseph Evans commented on STORM-1772:
--------------------------------------------
I would love to see more performance tests/topologies in Storm. I wrote one,
https://github.com/apache/storm/blob/master/examples/storm-starter/src/jvm/org/apache/storm/starter/ThroughputVsLatency.java
but it is just a word count topology, and it requires acking to be enabled for
it to work properly. We also did a Storm benchmark as part of
https://github.com/yahoo/streaming-benchmarks. And then there is
https://github.com/yahoo/storm-perf-test, which is outdated; too many people
use it to test the general performance of Storm instead of using it as a stress
test on the messaging layer, which is the only thing it is really good at.
I think ThroughputVsLatency is the right model to follow for these types of
tests: it tries to capture the processing latency at a given throughput. It can
easily be adapted to other topologies, and with a small amount of work it
should be able to give an OK measure of latency without the need for acking,
assuming that the clocks on the different nodes are close to being in sync with
one another. The Intel benchmarks are great in their diversity of workloads,
but I would also like to see something closer to a real-world workload.
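A minimal sketch of that ack-less measurement, assuming the generator spout
stamps every tuple with its emit time in a field called "emitTimeMs" (the field
and class names here are made up for illustration, and the numbers are only as
trustworthy as the clock sync between the spout and bolt hosts):
{code:java}
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;

// Terminal bolt that reads the spout-side emit timestamp out of each tuple
// and keeps running latency stats, printing a summary every 100k tuples.
public class TimestampLatencyBolt extends BaseBasicBolt {
    private long count;
    private long totalMs;
    private long maxMs;

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        long emitTimeMs = input.getLongByField("emitTimeMs"); // stamped by the generator spout
        long latencyMs = Math.max(0L, System.currentTimeMillis() - emitTimeMs);
        count++;
        totalMs += latencyMs;
        maxMs = Math.max(maxMs, latencyMs);
        if (count % 100_000 == 0) {
            System.out.printf("tuples=%d avg=%dms max=%dms%n", count, totalMs / count, maxMs);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // sink bolt: nothing is emitted downstream
    }
}
{code}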
> Create topologies for measuring performance
> -------------------------------------------
>
> Key: STORM-1772
> URL: https://issues.apache.org/jira/browse/STORM-1772
> Project: Apache Storm
> Issue Type: Bug
> Reporter: Roshan Naik
> Assignee: Roshan Naik
>
> Would be very useful to have some simple reference topologies included with
> Storm that can be used to measure performance both by devs during development
> (to start with) and perhaps also on a real storm cluster (subsequently).
> To start with, the goal is to put the focus on the performance
> characteristics of individual building blocks such as specifics bolts,
> spouts, grouping options, queues, etc. So, initially biased towards
> micro-benchmarking but subsequently we could add higher level ones too.
> Although there is a Storm benchmarking tool (originally written by Intel?)
> that can be used, and I have personally used it, it is better for this to be
> integrated into Storm proper and maintained by the devs as Storm evolves.
> On a side note, in some instances I have noticed (to my surprise) that the
> perf numbers change when the topologies written for the Intel benchmark are
> rewritten without the required wrappers so that they run directly under
> Storm.
> Have a few topologies in mind for measuring each of these:
> # *Queuing and Spout Emit Performance:* A topology with a Generator Spout but
> no bolts.
> # *Queuing & Grouping Performance:* Generator Spout -> A grouping method ->
> DevNull Bolt
> # *Hdfs Bolt:* Generator Spout -> Hdfs Bolt
> # *Hdfs Spout:* Hdfs Spout -> DevNull Bolt
> # *Kafka Spout:* Kafka Spout -> DevNull Bolt
> # *Simple Data Movement*: Kafka Spout -> Hdfs Bolt
> I shall add these for Storm core first. Then we can add the same for Trident
> as well.
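As a rough illustration of topology #2 from the list above (Generator Spout ->
grouping -> DevNull Bolt), a wiring sketch; GeneratorSpout and DevNullBolt
stand in for benchmark components that would be written as part of this work,
and the "key" field is assumed to be declared by the spout:
{code:java}
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;

public class QueuingGroupingPerfTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        // GeneratorSpout and DevNullBolt are placeholders for the benchmark components
        builder.setSpout("generator", new GeneratorSpout(), 4);
        builder.setBolt("devnull", new DevNullBolt(), 4)
               .fieldsGrouping("generator", new Fields("key")); // swap for shuffleGrouping etc. to compare groupings

        Config conf = new Config();
        conf.setNumWorkers(4);
        conf.setNumAckers(0); // acking off so the queues and grouping are what gets measured
        StormSubmitter.submitTopology("queuing-grouping-perf", conf, builder.createTopology());
    }
}
{code}
The HDFS and Kafka variants in the list would reuse the same skeleton with the
spout and bolt swapped out.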