[ https://issues.apache.org/jira/browse/KAFKA-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712821#comment-15712821 ]
Guozhang Wang commented on KAFKA-4474:
--------------------------------------

[~enothereska] Could you take a look and see if there are any obvious issues causing the low throughput? It does not match our nightly benchmark results.

> Poor kafka-streams throughput
> -----------------------------
>
> Key: KAFKA-4474
> URL: https://issues.apache.org/jira/browse/KAFKA-4474
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.10.1.0
> Reporter: Juan Chorro
>
> Hi!
> I'm writing because I'm concerned about kafka-streams throughput.
> I have a single kafka-streams application instance that consumes from the
> 'input' topic, prints to the screen, and produces to the 'output' topic. All
> topics have 4 partitions. As you can see, the topology is very simple.
> I produce 120K messages/second to the 'input' topic, but when I measure the
> 'output' topic I receive only ~4K messages/second. I used the following
> configuration (all remaining parameters at their defaults):
> application.id: myApp
> bootstrap.servers: localhost:9092
> zookeeper.connect: localhost:2181
> num.stream.threads: 1
> I ran several tests without success, but when I created a new 'input' topic
> with 1 partition (keeping the 'output' topic at 4 partitions) I got 120K
> messages/second in the 'output' topic.
> I have also run performance tests for the following cases (all topics have
> 4 partitions in all cases):
> Case A - 1 instance:
> - With num.stream.threads set to 1 I got ~3785 messages/second
> - With num.stream.threads set to 2 I got ~3938 messages/second
> - With num.stream.threads set to 4 I got ~120K messages/second
> Case B - 2 instances:
> - With num.stream.threads set to 1 I got ~3930 messages/second for each
> instance (total throughput ~8K messages/second)
> - With num.stream.threads set to 2 I got ~3945 messages/second for each
> instance (roughly the same total throughput as with num.stream.threads set
> to 1)
> Case C - 4 instances:
> - With num.stream.threads set to 1 I got 3946 messages/second for each
> instance (total throughput ~17K messages/second)
> As you can see, I get the best results when num.stream.threads equals the
> number of partitions. So I have the following questions:
> - Why do I always get ~4K messages/second when the topic has more than 1
> partition and num.stream.threads is set to 1?
> - In Case C, 4 instances with num.stream.threads set to 1 should perform
> better than 1 instance with num.stream.threads set to 4. Is this assumption
> correct?
> This is the kafka-streams application that I use:
> https://gist.github.com/Chorro/5522ec4acd1a005eb8c9663da86f5a18

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
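
For anyone trying to reproduce the cases above, here is a minimal sketch of the reported configuration as plain Java properties. It uses only the four keys and values listed in the description; the class name `StreamsConfigSketch` and the method `buildConfig` are illustrative assumptions and are not taken from the reporter's gist. The thread count is a parameter so the same code covers the 1, 2 and 4 thread runs.

```java
import java.util.Properties;

public class StreamsConfigSketch {

    // Builds the settings listed in the report; every other parameter is left
    // at its default. The class and method names are hypothetical and used
    // for illustration only.
    public static Properties buildConfig(int numStreamThreads) {
        Properties props = new Properties();
        props.setProperty("application.id", "myApp");
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("zookeeper.connect", "localhost:2181");
        props.setProperty("num.stream.threads", Integer.toString(numStreamThreads));
        return props;
    }

    public static void main(String[] args) {
        // 4 threads for the 4-partition 'input' topic, as in the fastest run of Case A.
        Properties props = buildConfig(4);
        props.list(System.out);
    }
}
```

In an actual application these properties would presumably be handed to the KafkaStreams constructor together with the topology, and serde settings may also be needed depending on the message types.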