Thanks for your suggestion. I actually came from multiple partitions and found out I could not control the severity of out-of-order tuple issues from different partitions (I could only tolerate a small degree of out-of-order, but throughput from different partitions can't be coordinated so it's a no-no for us). That said, multiple partitions do work and throughput was way better.
With Kafka console consumer, I did not do anything fancy, just timing how fast it could consume from the same Kafka partition. I did some calculation, it seems like the console consumer almost saturated the 1G link taking into account compression used. This makes me believe the bottleneck is inside storm. My real topology is also simple, like: kafka topic (1partition) -> spout (1task) --> boltType1(20, fieldgrouping). Field grouping is used to ensure that each tuple only goes to one consumer bolt. I even replaced my real bolt with an empty one as mentioned before and also tried disabling acking, still without significant improvement. Thanks, Fang On Fri, Jun 5, 2015 at 8:03 AM, nitin sharma <kumarsharma.ni...@gmail.com> wrote: > Hi Fang, > > Can you elaborate more on what kind of testing you performed with "Kafka's > consumer console"? > > Moreover, can you try creating more partitions in your kafka topics and > then increase the kafka spout parallelism and see if that works? > > it will be great if you can paste how u are building your topology > > Regards, > Nitin Kumar Sharma. > > > On Thu, Jun 4, 2015 at 3:03 PM, Fang Chen <fc2...@gmail.com> wrote: > >> One correction, my msg size is about 3k each. >> >> I did another round for comparison. I disabled acking all together, still >> the throughput is only slightly better at 12k tuples/s. So I used kafka's >> console consumer from one of the cluster nodes (different from where the >> partition is located) in order to find out if network was the bottleneck. >> This time I easily achieved 70k+ tuples/s. >> >> Any thoughts on this? >> >> Thanks >> >> On Wed, Jun 3, 2015 at 10:55 PM, Fang Chen <fc2...@gmail.com> wrote: >> >>> My use case requires total order in kafka queue, so I tested with a >>> topic with only 1 partition. My spout parallelism was set to 1, and bolt >>> parallelism 20. The message size is less than 1k bytes each. >>> >>> No matter how I tune kafka spout configs, including those queue fetch >>> related params, and max spout pending, I could only get about 10K tuples/s >>> with very low complete latency (<10ms) >>> >>> I even tried with empty bolt that acks tuples immediately without any >>> extra processing. Still the throughput is similar, though the complete >>> latency was even lower. This makes me wonder if I hit some sort of perf. >>> walls. >>> >>> My boxes are quite powerful baremetals (40 cores, lots of disk space, >>> 96G memory, 1G network), also the worker jvm was tuned so negligible pauses >>> there. >>> >>> Any advice on what I can tune or look into? >>> >>> Thanks a lot! >>> >>> Fang >>> >> >> >