Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2241

@roshannaik I have some new performance numbers. These are not final, as I have not done any tuning yet, but let me explain the test. I really wanted to get a handle on how this would impact me; yes, I am very selfish that way. I have been wanting for a long time to be able to write a tool that could simulate this, but it was never high enough on my priority list to make it happen until now. Thanks for the nudge to do this, by the way.

So I wrote a [tool](https://github.com/revans2/incubator-storm/tree/STORM-2306-with-loadgen/examples/storm-starter/src/jvm/org/apache/storm/starter/loadgen) and will do a formal pull request after some more features, cleanup, and moving it into its own stand-alone package. The tool captures a snapshot of topologies running on a cluster. It grabs the bolts, spouts, and all of the streams connecting them, along with metrics for those streams. The metrics include the output rate and the process/execute latency for each incoming stream. I can then simulate the throughput for each stream and the latency for each bolt; I use a Gaussian distribution that matches the measured distribution (a rough sketch of the idea is below).

For this particular test I captured 104 topologies from a single production cluster: 600+ nodes with 19k+ CPU cores and about 70 TB of memory. The nodes are heterogeneous, and the cluster plus all of the topologies are tuned for the Resource Aware Scheduler. (I really want to release these captured metrics, but I need to fight with legal to get approval to do it first.) It is a mix of Trident and regular Storm topologies, some with very high throughput (850k/sec in the highest case) and some with very low throughput.

I then set up another, much smaller cluster of 9 nodes (1 for nimbus + ZK, 8 for supervisors):

```
Processors: 2 x Xeon E5-2430 2.20GHz, 7.2GT QPI (HT enabled, 12 cores, 24 threads)
            - Sandy Bridge-EP C2, 64-bit, 6-core, 32nm, L3: 15MB
Memory: 46.7GB / 48GB 1600MHz DDR3 == 6 x 8GB - 8GB PC3-12800 Samsung DDR3-1600 ECC Registered CL11 4Rx4
RHEL 6.7
java 1.8.0_131-b11
```

This was a non-RAS cluster. The only settings I changed for the cluster were the locations of nimbus and zookeeper. I then replayed each of the topologies one at a time with default settings plus `-c topology.worker.max.heap.size.mb=5500 -c nimbus.thrift.max_buffer_size=10485760`. (This takes almost 9 hours, because each test runs for about 5 minutes to be sure that the cluster has stabilized.) The first changed setting is because nimbus was rejecting a few topologies: they had RAS settings sized for a very large cluster, and the default settings in storm did not allow them to run. The second one is because on a few very large topologies the client was rejecting results from nimbus, as the thrift metrics were too large (1 MB is the default cutoff).

Of these 104 topologies, 95 ran on 2.x-SNAPSHOT without any real issues and were able to keep up with the throughput. On STORM-2306, 94 ran without any issues and were able to keep up with the throughput. I have not yet debugged the one topology that would just stop processing on STORM-2306 but ran fine on 2.x-SNAPSHOT.

From those runs I measured the latency and CPU/memory used (just like with ThroughputVsLatency). My goal was to see how these changed from one version of storm to another and to get an idea of how much pain it would be, out of the box, for me and my customers to start moving to it; again, my own selfishness here in action.
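To make the latency-simulation part concrete, here is a rough sketch of the idea. The class and constructor names are made up for illustration; this is not the actual loadgen code, which also handles output streams and may implement the delay differently (e.g. sleeping instead of spinning).

```java
import java.util.Map;
import java.util.Random;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

// Illustrative only: a bolt that replays a measured execute latency by
// sampling from a Gaussian fitted to the captured mean/stddev for a stream.
public class SimulatedLatencyBolt extends BaseRichBolt {
    private final double meanMs;   // measured mean execute latency in ms
    private final double stddevMs; // measured standard deviation in ms
    private transient OutputCollector collector;
    private transient Random rand;

    public SimulatedLatencyBolt(double meanMs, double stddevMs) {
        this.meanMs = meanMs;
        this.stddevMs = stddevMs;
    }

    @Override
    public void prepare(Map<String, Object> topoConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        this.rand = new Random();
    }

    @Override
    public void execute(Tuple input) {
        // Sample a latency from the fitted distribution (clamped at 0).
        double latencyMs = Math.max(0.0, meanMs + rand.nextGaussian() * stddevMs);
        long deadlineNs = System.nanoTime() + (long) (latencyMs * 1_000_000L);
        // Spin so the simulated work burns CPU the way real processing would.
        while (System.nanoTime() < deadlineNs) { /* busy wait */ }
        collector.ack(input);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // No output streams in this minimal sketch.
    }
}
```

The spout side is analogous: emit tuples at the captured per-stream rate.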
As I said, I have not tuned any of the topologies yet. I plan on doing that next to see if I can improve some of the measurements. Here are my results. As a note, for COST I used 1 core equal to 2 GB of memory, because that is the ratio we bought hardware at originally (but not so much any more); see the quick arithmetic check at the end of this comment.

| | CPU Summary | Memory Summary | Cost Summary | Avg Latency Diff ms (weighted by throughput) |
| -- | -- | -- | -- | -- |
| Measured Diff | about 250 more needed | about 27 GB less needed | 235 more needed | 5.22 ms more per tuple |
| In Cluster Total | about 19k cores | about 70 TB | about 55k | |
| Percent diff of cluster total | 1.28% | -0.04% | 0.43% | |
| In Cluster Assigned | about 13k cores | about 50 TB | about 40k | |
| Percent of Assigned | 1.92% | -0.05% | 0.61% | |
| Amount Per Node | 47 | about 200 GB | | |
| Change (Nodes) | 6 | -0 | | |

As I said, I have not tuned anything yet. I need to, because the assumption here is that users will need to tune their topologies for higher throughput jobs. I don't like that, because it adds to the pain for my users, but I will try to quantify it in some way. As it stands now things don't look good for STORM-2306 (and I know running a test that no one else can reproduce is really crappy, but hopefully others can take my tool and play around with their own topologies).

On average each tuple is 5 ms slower with STORM-2306, although most of that came from the one really big topology at 850k/sec, where STORM-2306 was 30 ms slower. For most use cases, though, 5 ms or even 30 ms is not a big deal, and I would consider the two versions more or less equal on latency.

For resource usage, though, we start to run into something more significant. Memory usage went down by about 27 GB, but CPU usage went up by about 250 cores. This means my currently running topologies would need approximately 6 more servers to handle the same load they have now, or just under a 2% increase in cluster usage. It is small enough that most topologies might just be able to absorb it without any changes, and hopefully with some tuning we can make it smaller.

My next steps are to take a sample of the topologies and tune them on both systems to see how much I can reduce the cost while still maintaining the throughput and not increasing the latency by too much.
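One last note on the cost model, since the table is dense. This is just my arithmetic written out (a hypothetical class, not part of the tool), showing how the cost and node numbers above fall out of the 1 core == 2 GB equivalence:

```java
// Back-of-the-envelope check of the cost table above, assuming the stated
// equivalence of 1 core == 2 GB of memory (so cost is in "core equivalents").
public class CostCheck {
    public static void main(String[] args) {
        double cpuDiffCores = 250.0;   // about 250 more cores needed on STORM-2306
        double memDiffGb = -27.0;      // about 27 GB less memory needed
        double coresPerNode = 47.0;    // "Amount Per Node" from the table

        // cost = cores + GB / 2: 250 - 13.5 = 236.5, i.e. the "235 more needed" row
        double costDiff = cpuDiffCores + memDiffGb / 2.0;
        System.out.printf("cost diff: %.1f core equivalents%n", costDiff);

        // Same model on the whole cluster: 19k cores + 70,000 GB / 2 = 54k, "about 55k"
        double clusterCost = 19_000 + 70_000 / 2.0;
        System.out.printf("cluster cost: %.0f core equivalents%n", clusterCost);

        // The extra CPU translates into nodes: ceil(250 / 47) = 6 more servers
        double extraNodes = Math.ceil(cpuDiffCores / coresPerNode);
        System.out.printf("extra nodes: %.0f%n", extraNodes);
    }
}
```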