Github user revans2 commented on a diff in the pull request:
https://github.com/apache/storm/pull/2241#discussion_r129566289
--- Diff: conf/defaults.yaml ---
@@ -253,11 +247,16 @@ topology.trident.batch.emit.interval.millis: 500
topology.testing.always.try.serialize: false
topology.classpath: null
topology.environment: null
-topology.bolts.outgoing.overflow.buffer.enable: false
-topology.disruptor.wait.timeout.millis: 1000
-topology.disruptor.batch.size: 100
-topology.disruptor.batch.timeout.millis: 1
-topology.disable.loadaware.messaging: false
+topology.bolts.outgoing.overflow.buffer.enable: false # TODO: Roshan : Whats this ?
+topology.disruptor.wait.timeout.millis: 1000 # TODO: Roshan: not used, but we may/not want this behavior
+topology.transfer.buffer.size: 50000
+topology.transfer.batch.size: 10
+topology.executor.receive.buffer.size: 50000
+topology.producer.batch.size: 1000 # TODO: Roshan: rename
+topology.flush.tuple.freq.millis: 5000
+topology.spout.recvq.skips: 3 # Check recvQ once every N invocations of Spout's nextTuple() [when ACKs disabled]
+
+topology.disable.loadaware.messaging: true # load aware messaging reduces throughput by ~20%.
--- End diff --
I agree that turning it off by default may not be what we want. Load-aware
messaging really helps on heterogeneous clusters. And quite honestly, a lot of
the benchmarks that we have been running are nowhere near what I have seen
happen in production. It feels nice to say we can do X million tuples per
second, but those numbers are for bolts that do close to nothing. Most real
bolts I have seen actually take some time to process each tuple.
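
To make the point concrete, here is a toy sketch of why the routing policy matters once bolts do real work. It is not Storm's implementation: the class `LoadAwareSketch`, its methods, and the per-tuple costs are all hypothetical, and the "load-aware" policy is a simple greedy stand-in (send each tuple to the task that would finish it soonest). It compares that policy with an even round-robin shuffle over two downstream tasks, one four times slower than the other.

```java
import java.util.Arrays;

// Toy model only: hypothetical names and costs, not Storm code.
final class LoadAwareSketch {

    /** Even shuffle: tasks take turns, regardless of how fast each one is. */
    static long evenShuffleMakespan(int tuples, long[] costPerTuple) {
        long[] busy = new long[costPerTuple.length]; // queued work per task
        for (int i = 0; i < tuples; i++) {
            int task = i % costPerTuple.length;
            busy[task] += costPerTuple[task];
        }
        return Arrays.stream(busy).max().orElse(0L);
    }

    /** Greedy stand-in for load awareness: pick the task that finishes the tuple soonest. */
    static long loadAwareMakespan(int tuples, long[] costPerTuple) {
        long[] busy = new long[costPerTuple.length];
        for (int i = 0; i < tuples; i++) {
            int best = 0;
            for (int t = 1; t < busy.length; t++) {
                if (busy[t] + costPerTuple[t] < busy[best] + costPerTuple[best]) {
                    best = t;
                }
            }
            busy[best] += costPerTuple[best];
        }
        return Arrays.stream(busy).max().orElse(0L);
    }

    public static void main(String[] args) {
        long[] heterogeneous = {1, 4}; // assumed cost units per tuple: fast task, slow task
        long[] doNothing = {0, 0};     // bolts that do close to nothing

        System.out.println("even shuffle, real work: " + evenShuffleMakespan(1000, heterogeneous));
        System.out.println("load aware,   real work: " + loadAwareMakespan(1000, heterogeneous));
        System.out.println("even shuffle, no-op:     " + evenShuffleMakespan(1000, doNothing));
        System.out.println("load aware,   no-op:     " + loadAwareMakespan(1000, doNothing));
    }
}
```

In this model the even shuffle takes 2000 cost units to drain 1000 tuples because the slow task becomes the bottleneck, while the greedy policy finishes in 800. With zero-cost bolts both policies finish in 0, so a benchmark built on such bolts can only show the overhead of load awareness, never its benefit.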