Hello, I'm looking for guidance on a lag issue that shows up even with very low incoming traffic and turns into a steadily increasing lag (10K) at rush hour. Curiously, no part of my HW is even close to 60% utilization. I've drawn the architecture so you can inspect the HW and configuration details.
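For readers less familiar with the term: consumer lag is usually taken as the broker's log end offset minus the consumer's committed offset, summed over partitions. Below is a minimal sketch of that arithmetic. The function name `total_lag` and all offsets are hypothetical and chosen only for illustration; they are not measurements from this cluster.

```python
from typing import Dict


def total_lag(log_end: Dict[int, int], committed: Dict[int, int]) -> int:
    """Sum per-partition lag (log end offset minus committed offset).

    A partition with no committed offset is counted from offset 0,
    and negative differences are clamped to 0.
    """
    return sum(max(end - committed.get(p, 0), 0) for p, end in log_end.items())


# Hypothetical offsets for a 4-partition topic (illustrative numbers only).
log_end_offsets = {0: 52_500, 1: 51_000, 2: 53_000, 3: 50_500}
committed_offsets = {0: 50_000, 1: 48_500, 2: 50_500, 3: 48_000}

print(total_lag(log_end_offsets, committed_offsets))
```

Sampling this value periodically shows whether the lag is stable or growing, which is the distinction that matters at rush hour.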
*Some steps I did*

I turned message processing on and off to isolate whether the degradation comes from somewhere in the consumption phase. Its influence, on or off, was totally diluted in the rate numbers. So I tried the Scala script kafka-console-consumer, with exactly the same results.

Originally I had 12 partitions and reduced them to 4, blaming the fetcher's context-switch degradation per partition. I also had num.consumer.fetchers=1 feeding a single-threaded consumer, then raised it to 6; that improved things considerably and is enough for my current traffic.

The only solution I've found to the increasing lag, which also provides HA, is adding another consumer machine; then the lag drops and stays under control.

Maybe I'm hitting some HW or SW limit here? Maybe in the consumer client (high-level consumer)?

We started with a low-profile HW scenario, trying to discover latency aspects and rehearse throughput rates, so that we can get the most out of Kafka before using it widely across our pipelined data staging platform.

I don't see any exceptions in kafka/server.log, only some CancelledKeyExceptions every 10 minutes or less in ZK.

__
*Cristian A. Gonzalez*
Backend development · flowics.com <http://www.flowics.com/?utm_source=Google_Signature&utm_medium=email&utm_campaign=Email_Signature_Campaign>
skype: gcristian.ariel | tw: @gcariel