Hi,

I have a performance problem with the Trident Kafka spout.
My Trident topology looks like this:

...
conf.put(Config.TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS, 200);
conf.put(RichSpoutBatchExecutor.MAX_BATCH_SIZE_CONF, 4 * 1024);
...

TridentKafkaConfig tridentKafkaConfig = new TridentKafkaConfig(
    StaticHosts.fromHostString(ImmutableList.of(
        "kafka001.prod.moneymall.ssgbi.com:9092",
        "kafka002.prod.moneymall.ssgbi.com:9092"), 2),
    "every-click-event");
tridentKafkaConfig.fetchSizeBytes = 10 * 1024 * 1024;
tridentKafkaConfig.bufferSizeBytes = 10 * 1024 * 1024;

...

TridentTopology topology = new TridentTopology();
topology.newStream("kafkaSpout", spout)
    .name("EveryClickEventSpout").parallelismHint(60)
    .shuffle()
    .each(new Fields("bytes"), new DecodeAvroBytes(), new Fields("row", "json"))
    .name("DecodeAvroBytesEach").parallelismHint(50)
    .shuffle()
    .each(new Fields("json"), new FilterPageViewType())
    .name("FilterPageViewTypeEach").parallelismHint(40)
    .shuffle()
    .project(new Fields("row", "json"))
    .name("RowJsonProject").parallelismHint(20)
    .shuffle()
    .partitionPersist(new HBaseValueFactory(tridentConfig, confMap),
        new Fields("row", "json"), new HBaseValueUpdater())
    .parallelismHint(30);

I have tried changing settings such as MAX_BATCH_SIZE_CONF and
tridentKafkaConfig.fetchSizeBytes, but I still get only about 4000 TPS
(1 KB messages) on my cluster.

I also wrote another Storm topology with a traditional spout (KafkaSpout)
and bolts, which runs faster (about 20000 TPS) than this Trident
topology.

I expected tridentKafkaConfig.fetchSizeBytes and
tridentKafkaConfig.bufferSizeBytes to control how much data is fetched
from Kafka, and Config.TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS and
RichSpoutBatchExecutor.MAX_BATCH_SIZE_CONF to control how many tuples are
generated in the cluster.
But in my topology, changing these settings gave no performance gain.
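
To put the numbers in this post side by side, here is a rough
back-of-the-envelope sketch in plain Java. It only does arithmetic on the
figures quoted above (1 KB messages, a 200 ms emit interval, a 10 MB fetch
size) and does not touch Storm or Kafka. The class and method names are
made up for illustration, and the per-batch figure assumes exactly one
batch is emitted per interval, which is a simplification that may not
match how Trident actually schedules batches.

```java
public class ThroughputEstimate {
    // Bytes per second moved at a given message rate and message size.
    static long bytesPerSecond(long messagesPerSecond, long messageBytes) {
        return messagesPerSecond * messageBytes;
    }

    // Messages one batch must carry to sustain a rate, ASSUMING exactly
    // one batch is emitted per emit interval (a simplification).
    static long messagesPerBatch(long messagesPerSecond, long emitIntervalMillis) {
        return messagesPerSecond * emitIntervalMillis / 1000;
    }

    // Upper bound on messages in one fetch, ignoring per-message overhead.
    static long maxMessagesPerFetch(long fetchSizeBytes, long messageBytes) {
        return fetchSizeBytes / messageBytes;
    }

    public static void main(String[] args) {
        long messageBytes = 1024;                // 1 KB messages, as in the post
        long emitIntervalMillis = 200;           // TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS
        long fetchSizeBytes = 10L * 1024 * 1024; // tridentKafkaConfig.fetchSizeBytes

        // Observed Trident rate, observed KafkaSpout rate, and the target rate.
        for (long tps : new long[] {4_000, 20_000, 100_000}) {
            System.out.printf("%d TPS: %d bytes/s, %d messages per batch%n",
                tps,
                bytesPerSecond(tps, messageBytes),
                messagesPerBatch(tps, emitIntervalMillis));
        }
        System.out.printf("One fetch holds at most %d messages%n",
            maxMessagesPerFetch(fetchSizeBytes, messageBytes));
    }
}
```

Under these assumptions, the observed 4000 TPS corresponds to about 800
messages per batch, while a single 10 MB fetch could hold up to 10240
messages of 1 KB each.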

My question is: for the Trident Kafka spout, which configuration should I
change to get better performance? And has anybody achieved good
performance, for instance over 100000 TPS, with the Trident Kafka spout?

- Kidong Lee.
