Yes, there were some improvements in 2.4.0. However, the puzzling aspect of your description is that the CPU usage basically disappeared when you removed the consumers even though your profiler claimed that the replica fetchers were using a large chunk of the CPU.
Ismael

On Mon, Feb 3, 2020 at 9:50 AM Brandon Barron <brandon.bar...@live.com> wrote:

> Haven't tried updating to 2.4.0 yet. Were there any related fixes or
> improvements in that version? I skimmed the changelog but I didn't see
> anything.
>
> I found this issue https://issues.apache.org/jira/browse/KAFKA-9039 which
> I thought could be related to our problems but it seems like it's projected
> to be included in 2.5.0.
>
> Thanks,
> Brandon
>
> ________________________________
> From: Ismael Juma <ism...@juma.me.uk>
> Sent: Monday, February 3, 2020 7:31 AM
> To: Kafka Users <users@kafka.apache.org>
> Subject: Re: High CPU in 2.2.0 kafka cluster
>
> Hi Brandon,
>
> Are you still seeing this behavior with Apache Kafka 2.4.0?
>
> Ismael
>
> On Fri, Jan 31, 2020 at 10:51 AM Brandon Barron <brandon.bar...@live.com>
> wrote:
>
> > We were running client version 2.3.0 for a while, then bumped to 2.3.1 for
> > a particular kafka streams bug fix. We saw this issue while both versions
> > were running.
> >
> > Brandon
> >
> > ________________________________
> > From: Jamie <jamied...@aol.co.uk.INVALID>
> > Sent: Thursday, January 30, 2020 1:03 PM
> > To: users@kafka.apache.org <users@kafka.apache.org>
> > Subject: Re: High CPU in 2.2.0 kafka cluster
> >
> > Hi Brandon,
> > Which version of Kafka are the consumers running? My understanding is that
> > if they're running a version lower than the brokers then they could be
> > using a different format for the messages which means the brokers have to
> > convert each record before sending to the consumer.
> > Thanks,
> > Jamie
> >
> >
> > -----Original Message-----
> > From: Brandon Barron <brandon.bar...@live.com>
> > To: users@kafka.apache.org <users@kafka.apache.org>
> > Sent: Thu, 30 Jan 2020 16:11
> > Subject: High CPU in 2.2.0 kafka cluster
> >
> > Hi,
> >
> > We had a small cluster (4 brokers) dealing with very low throughput - a
> > couple hundred messages per minute at the very most. In that cluster we had
> > a little under 3300 total consumers (all were kafka streams instances). All
> > broker CPUs were maxed out almost consistently for a few weeks.
> >
> > We switched traffic to a new cluster eventually. The old cluster sitting
> > idle for a few days was at ~40% CPU, with consumers still running. When I
> > took down all the consumers, the idle CPU on the brokers went to about 4%.
> >
> > To test, we decided to mirror active traffic in our new cluster to the old
> > cluster (which now has no running consumers). The CPU didn't budge; it's
> > still at ~4% as expected with the low throughput.
> >
> > One more thing to add: I ran a thread profiler on a couple brokers when
> > the old cluster was taking active traffic with running consumers and the
> > CPU was maxed out. Each time, I saw the ReplicaFetcherThread eating up
> > around 40% of CPU time.
> >
> > Can you give any advice on what might be the root cause of this?
> >
> > Thanks,
> > Brandon