Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2241#discussion_r130523483
  
    --- Diff: conf/defaults.yaml ---
    @@ -253,11 +244,15 @@ topology.trident.batch.emit.interval.millis: 500
     topology.testing.always.try.serialize: false
     topology.classpath: null
     topology.environment: null
    -topology.bolts.outgoing.overflow.buffer.enable: false
    -topology.disruptor.wait.timeout.millis: 1000
    -topology.disruptor.batch.size: 100
    -topology.disruptor.batch.timeout.millis: 1
    -topology.disable.loadaware.messaging: false
    +topology.disruptor.wait.timeout.millis: 1000  # TODO: Roshan: not used, but we may/not want this behavior
    +topology.transfer.buffer.size: 50000
    +topology.transfer.batch.size: 10
    +topology.executor.receive.buffer.size: 50000
    +topology.producer.batch.size: 1000
    +topology.flush.tuple.freq.millis: 100
    --- End diff --
    
    @roshannaik, why is it 100 ms? Is it based on some benchmarks?
    
    As per the [design doc](https://docs.google.com/document/d/1PpQaWVHg06-OqxTzYxQlzg1yEhzA4Y46_NC7HMO6tsI/edit#heading=h.gjdgxs) posted in the JIRA, the JCTools MPSCArrayQ provides a throughput of 68 million/sec with 20 producers, and the performance doesn't seem to degrade much as the number of producers increases. If so, why do we need to batch and flush the tuples to the consumer queue? If the producers enqueued the events directly into the receiver's queue, it would simplify the design and address the latency concerns.
    
    Also, I assume that if the batch size is set to 1, the events are enqueued directly and the flush threads are not started?
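    For illustration, here is a minimal sketch of the producer-side batching pattern being discussed, including the `batchSize == 1` direct-enqueue case asked about above. The class and method names are hypothetical, not Storm's actual implementation, and a plain `ConcurrentLinkedQueue` stands in for the JCTools MPSC queue:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of producer-side batching; names are illustrative only.
class BatchingProducer {
    private final Queue<Object> receiverQueue; // stands in for the MPSC receive queue
    private final int batchSize;               // cf. topology.producer.batch.size
    private final List<Object> batch = new ArrayList<>();

    BatchingProducer(Queue<Object> receiverQueue, int batchSize) {
        this.receiverQueue = receiverQueue;
        this.batchSize = batchSize;
    }

    void publish(Object tuple) {
        if (batchSize == 1) {
            // Degenerate case raised in the comment: enqueue directly,
            // no local batch and no flush thread needed.
            receiverQueue.add(tuple);
            return;
        }
        batch.add(tuple);
        if (batch.size() >= batchSize) {
            flush();
        }
    }

    // Would also be invoked periodically (cf. topology.flush.tuple.freq.millis)
    // by a flush thread, to bound the latency of partially filled batches.
    void flush() {
        receiverQueue.addAll(batch);
        batch.clear();
    }
}
```

    The sketch shows why a flush interval exists at all under this design: without the periodic `flush()`, a partially filled batch could sit in the producer indefinitely.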
