GitHub user revans2 opened a pull request:
https://github.com/apache/storm/pull/765
Disruptor batching v2
I wanted to come up with an alternative to #694 that does batching at the
queue level instead of at the spout level. This is the result of that work,
although it still needs some testing/cleanup before I consider it fully ready.
This is based off of #750, using the latest version of the disruptor, and
includes a few bug fixes for automatic back-pressure that I need to split out
into a separate pull request.
The work here is driven by results I got from some micro benchmarks I wrote
looking at what the bottlenecks are in Storm and what its theoretical
throughput limit is. In this case the workload is a simple word count
topology, represented by Q_SWC_X_04. With the same batching applied here,
throughput went from 300,000 sentences/second processed to around 1.5 million,
as run on my MacBook Pro laptop.
https://github.com/revans2/storm-micro-perf
The reason for queue-level batching is that it gives finer-grained control
over latency, which I would like to use for automatic tuning in the future.
It is also a smaller code change. If others disagree I am okay with #694,
but I do want to do a head-to-head comparison between the two approaches.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/revans2/incubator-storm disruptor-batching-v2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/765.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #765
----
commit 14eab88064cde2b267fe2a4979d7a0d26d83b5aa
Author: Robert (Bobby) Evans <[email protected]>
Date: 2015-09-18T21:33:39Z
STORM-350: Upgrade to newer version of disruptor
commit 0b254fd6ab7caa3e965bc5c5156caa4d478ddf61
Author: Robert (Bobby) Evans <[email protected]>
Date: 2015-09-23T13:43:41Z
Fixed null reads from disruptor.
commit 020aae06902706c5316a26c5ead716e175f243a8
Author: Robert (Bobby) Evans <[email protected]>
Date: 2015-09-24T18:45:36Z
Added in an in-order test case.
commit c5d63e5a3ad0379ec2a08f6622a02a22a6121a4c
Author: Robert (Bobby) Evans <[email protected]>
Date: 2015-09-22T20:37:25Z
Tests pass. Things run faster in some cases, and there are performance
issues in others.
commit 8db3113f26780ae91d658940277580182fcf05b8
Author: Robert (Bobby) Evans <[email protected]>
Date: 2015-09-22T21:34:56Z
Fixed issue with test not shutting down cleanly
commit 7bdcfc16be3cee399fd20089383454a27ed5128d
Author: Robert (Bobby) Evans <[email protected]>
Date: 2015-09-23T00:42:07Z
Fixed issues with auto-backpressure.
commit 8629016d4fd4004e90ba277fbf1a6f41561ac443
Author: Robert (Bobby) Evans <[email protected]>
Date: 2015-09-23T17:08:43Z
Fixed tests that were failing
commit bed24caf1011d8a9533f2e02ec5f02007ec28735
Author: Robert (Bobby) Evans <[email protected]>
Date: 2015-09-24T22:39:12Z
Made test sentences match micro benchmark. Fixed reflection issue
commit b9a27b08a215681976bf1ba23939121de6ea18d2
Author: Robert (Bobby) Evans <[email protected]>
Date: 2015-09-25T20:21:39Z
Made the batch size and timeout configurable
commit 2f725d980c1ba5a893fc4c1a6fe2b2ecff07c86b
Author: Robert (Bobby) Evans <[email protected]>
Date: 2015-09-25T21:53:23Z
Improved disruptor memory allocation
commit 5673c54365710a559d8edf69831e97e582d78c45
Author: Robert (Bobby) Evans <[email protected]>
Date: 2015-09-28T20:20:46Z
Fixed some bugs with auto-backpressure
----