Hi all,

My environment and configuration are as follows:

[Machines] 1 Nimbus and 3 supervisors on AWS (m1.medium)
[Topology] 4 spouts (one per Kafka topic, each with parallelism hint 2) and 10 bolts
[Topology] 6 workers, 34 executors, 34 tasks

My first bolt (parallelism hint = 5) parses data from the spout, and its capacity is often over 1.0. My concerns are as follows:

1. Using the tick-tuple feature to write my results into a MySQL database:

    if (TupleHelpers.isTickTuple(tuple)) {
        // emit the result to the next bolt
        collector.emit(new Values(result));
    } else {
        // store the result in memory
        collector.ack(tuple);
    }

   I set TOPOLOGY_TICK_TUPLE_FREQ_SECS to 30 seconds. Is it correct to emit unanchored like this, so that the tuple will not be tracked? I'm afraid something is wrong here.

2. Is one topic per KafkaSpout a bad approach? I will actually use 12 topics, so I will have 12 spouts in my topology. Is one spout per topic a good design?

3. My topology is slow. One of my bolts is connected directly to a spout and counts the number of tuples it receives. I found it can process only 300~400 tuples/sec. What's wrong with my topology?

[Storm UI] Right after startup, the complete latency is over 30000 ms, and there are lots of failed tuples in "spouts" but no failed tuples in "bolts".

Can anyone give me some advice on how to speed up my topology?

Best regards,
James
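
P.S. To make point 1 concrete, here is a minimal plain-Java sketch of the buffering pattern described there. It has no Storm dependency and is not the actual bolt code: TickBufferSketch, onTuple, onTick and ackedCount are made-up names for illustration, the increment of acked stands in for collector.ack(tuple), and the returned batch stands in for collector.emit(new Values(result)).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the bolt in point 1; all names are illustrative only.
public class TickBufferSketch {
    private final List<String> pending = new ArrayList<>();
    private int acked = 0;

    /** Normal tuple: keep the parsed result in memory and ack right away. */
    public void onTuple(String parsed) {
        pending.add(parsed);
        acked++; // stands in for collector.ack(tuple)
    }

    /** Tick tuple: hand everything buffered so far downstream, then reset. */
    public List<String> onTick() {
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        return batch; // stands in for collector.emit(new Values(result))
    }

    public int ackedCount() {
        return acked;
    }

    public static void main(String[] args) {
        TickBufferSketch bolt = new TickBufferSketch();
        bolt.onTuple("a");
        bolt.onTuple("b");
        System.out.println("acked before tick: " + bolt.ackedCount());
        System.out.println("first flush: " + bolt.onTick());
        System.out.println("second flush: " + bolt.onTick());
    }
}
```

As far as I understand, because every input tuple is already acked before the tick arrives, the batch emitted on the tick has nothing to be anchored to. A failure further downstream would then not trigger a replay from the spout, and anything still held in memory is lost if the worker dies before the next tick.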