Hi folks,

I've been chasing an issue for a bit now without much luck. We're seeing
occasional (1-2 times a day) GC pause times of 10+ seconds on a broker
handling only ~3k messages/s. We only see it on one node at a time in
a three-node cluster, though which node is affected can change
occasionally. We've tried G1 and ParNew/CMS with various heap sizes and
configurations without fixing the issue.

In digging into things, I found a somewhat odd thing: YourKit's allocation
tracking shows that ~98% (by both count and size) of objects allocated are
closures around string formatting created in the calls to trace() in the
ReplicaFetcherThread such as

Can anyone replicate this? I see that trace() guards printing internally
with a call to isTraceEnabled. Would folks be amenable to explicitly
wrapping the calls there in isTraceEnabled, given that it's a decently tight
loop?
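
To make the suggestion concrete, here is a minimal sketch of the pattern,
assuming trace() takes its message as a Scala by-name parameter. All names
below (TraceGuardSketch, traceEnabled, describe) are hypothetical stand-ins,
not the actual broker code. Whether the JIT can eliminate the closure via
escape analysis depends on inlining, so this only shows where the allocation
comes from at the source level.

```scala
// Hypothetical stand-in for a logging helper; not the real broker code.
object TraceGuardSketch {
  @volatile var traceEnabled: Boolean = false

  // Counts how many times a message string was actually built.
  var formatted: Int = 0

  // By-name parameter: the caller wraps `msg` in a closure on every call,
  // even though the body only evaluates it when tracing is enabled.
  def trace(msg: => String): Unit =
    if (traceEnabled) println(msg)

  // Assumed example message; the real format strings are not shown here.
  def describe(partition: Int, offset: Long): String = {
    formatted += 1
    "partition %d at offset %d".format(partition, offset)
  }

  // Unguarded: a closure capturing `partition` and `offset` is created
  // per call, then discarded unevaluated when tracing is off.
  def unguarded(partition: Int, offset: Long): Unit =
    trace(describe(partition, offset))

  // Guarded: the check runs at the call site, so no closure is created
  // when tracing is off.
  def guarded(partition: Int, offset: Long): Unit =
    if (traceEnabled) trace(describe(partition, offset))
}
```

In both variants the string is never formatted while tracing is off; the
difference is only the per-call closure object in the unguarded version.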

Also, if anyone is willing to pitch ideas for GC configs or experiments to
try, I'm all ears.

Cory K
