Joe,
Absolutely, I can provide the configuration of every single processor. Could
you point me to anything I can read through to see how actual content can
be cached in memory? Perhaps a link to GitHub. If there's a condition where
processors can avoid reading from local disk to fetch actual content I
Saltys
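For reference, in Apache NiFi (which this thread appears to concern) the content repository implementation is pluggable via nifi.properties, and an in-memory VolatileContentRepository ships with NiFi. This is only a sketch; the property names below come from the NiFi admin guide and should be verified against the version you are running:

```properties
# nifi.properties - keep flowfile content in JVM memory instead of on disk.
# Content is lost on restart, so size the buffer to fit your working set.
nifi.content.repository.implementation=org.apache.nifi.controller.repository.VolatileContentRepository
# Upper bound on memory the volatile repository may use (verify the name for your version)
nifi.volatile.content.repository.max.size=100 MB
```

Note that with the default FileSystemRepository, the OS page cache already serves hot content from memory on re-reads, which may be enough without switching repositories.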
It is possible because those things can still be cached. The way this
really works at scale is quite awesome, actually.
However, I definitely want to help you understand what is happening, but
the pictures alone don't cut it. We appreciate you have sensitivities/stuff
you have t
We're still on an old version of Kafka, which is why we're still using the
old processors.
File sizes vary. Generally they are all within the ±100 MB range before
they are uncompressed. There can be some small files, but they are not a
majority. From logging I can see that all hosts are processing files of
Not saying this is the issue, but is your Kafka cluster using Kafka 0.11?
Looking at the screenshot, you're using the Kafka processors from the 0.11
bundle; you might want to look at the processors for Kafka 2.x instead.
Are your files more or less evenly distributed in terms of sizes?
I suppose y
My other issue is that load balancing is not rebalancing the queue. Perhaps
I misunderstand how balancing should work, and it only round-robins new
incoming files? I can easily rebalance manually by disabling balancing and
enabling it again, but after a while it gets back to the same situation.
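For what it's worth, that observation matches how round-robin load balancing typically behaves: only flowfiles entering the connection are dispatched to nodes in turn, while items already sitting in a node's queue stay where they are. A toy sketch of that behavior (plain Python, not NiFi code, with hypothetical node names):

```python
from itertools import cycle

def round_robin_dispatch(new_items, node_queues):
    """Distribute only NEW items across node queues in turn.

    Items already present in node_queues are never moved, which is why a
    skewed backlog stays skewed until balancing is toggled off and on,
    forcing everything to re-enter the connection as "new".
    """
    nodes = cycle(sorted(node_queues))
    for item in new_items:
        node_queues[next(nodes)].append(item)
    return node_queues

# node-1 starts with a backlog; round robin does not drain it
queues = {"node-1": ["f1", "f2", "f3"], "node-2": [], "node-3": []}
round_robin_dispatch(["f4", "f5", "f6"], queues)
# node-1 keeps its backlog plus one new file; the backlog itself is never rebalanced
```

This is only an illustration of the dispatch pattern; the actual strategy (round robin, single node, partition by attribute) is chosen per connection in the NiFi UI.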
Hi Joe,
Yes, it is the same issue. We took your advice and reduced the number of
threads on our large processors (fetch/compress/publish) to a minimum, then
increased gradually to 4 until the processing rate became acceptable
(about 2000 files per 5 min). This is a cluster of 25 nodes of 36
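To put that rate in perspective, some rough arithmetic, assuming the ~100 MB average file size mentioned earlier in the thread (the numbers are illustrative, not measured):

```python
# Back-of-the-envelope throughput for the reported rate
files_per_window = 2000      # files processed per window (reported)
window_minutes = 5
nodes = 25
avg_file_mb = 100            # "within the +-100 MB range" per the thread

files_per_min_cluster = files_per_window / window_minutes   # cluster-wide files/min
files_per_min_node = files_per_min_cluster / nodes          # per-node files/min
mb_per_min_node = files_per_min_node * avg_file_mb          # per-node MB/min
```

That works out to roughly 16 files (about 1.6 GB of content) per node per minute, which makes disk and network bandwidth per node, not thread count alone, a plausible ceiling.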