
We are trying to optimize our Storm topology that uses Kafka-Spout and
Elastic-Search-Bolt (no other spouts/bolts).

Our current configuration is as follows:

storm-workers: 1
elastic-search primaries: 1
elastic-search replicas: 1
1 process in storm with 1 kafka-spout thread and 6 elastic-search bolt threads
kafka-fetch-size: 10 MB
kafka-buffer-size: 11 MB
es-flush-entries-size: 10,000
16 GB heap size with new-ratio = 1 (for Elastic-Search as well as Storm)
average kafka-message-size: 1 KB
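
As a rough sanity check on how these settings relate to each other, the sketch below estimates how many average-sized messages fit in one Kafka fetch and compares that with the Elastic-Search flush size. It assumes that "1 KB" means 1024 bytes and that every message is exactly the average size; whether es-flush-entries-size applies per bolt thread or per process is not something this calculation settles.

```python
# Back-of-the-envelope batch sizing using the settings listed above.
KIB = 1024
MIB = 1024 * KIB

fetch_size_bytes = 10 * MIB   # kafka-fetch-size
avg_message_bytes = 1 * KIB   # average kafka-message-size (assumed 1024 bytes)
es_flush_entries = 10_000     # es-flush-entries-size

# How many average-sized messages one full fetch can hold.
messages_per_fetch = fetch_size_bytes // avg_message_bytes

# How many flush batches that corresponds to.
flushes_per_fetch = messages_per_fetch / es_flush_entries

print(messages_per_fetch)            # 10240
print(round(flushes_per_fetch, 2))   # 1.02
```

Under these assumptions, one full fetch holds roughly the same number of messages as one flush batch.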

The maximum ingestion rate we are able to achieve with the above is 800,000
messages per minute from kafka to elastic-search.
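
To put that figure in more familiar units, here is a small conversion sketch. It assumes an average message of exactly 1024 bytes and an even spread of load across the 6 bolt threads, neither of which we have measured precisely.

```python
# Convert the observed ingestion rate into per-second and per-thread figures.
messages_per_minute = 800_000
avg_message_bytes = 1024      # assumed: 1 KB = 1024 bytes
bolt_threads = 6              # elastic-search bolt threads in the single worker

messages_per_second = messages_per_minute / 60
mib_per_second = messages_per_second * avg_message_bytes / (1024 * 1024)
per_bolt_per_second = messages_per_second / bolt_threads

print(round(messages_per_second))     # 13333
print(round(mib_per_second, 2))       # 13.02
print(round(per_bolt_per_second))     # 2222
```

So the single worker handles roughly 13,000 messages per second, or about 13 MiB/s of payload, which works out to a little over 2,000 messages per second per bolt thread.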

This throughput scales almost linearly with the number of storm worker
nodes/processes (we use LOCAL_OR_SHUFFLE grouping), provided the number of
elastic-search nodes is increased similarly.

Can someone comment on whether this throughput is reasonable for this setup?

Any recommendations on increasing the throughput would be much appreciated.
