Re: [E] Re: Lagging worker nodes

2021-01-28 Thread Zilvinas Saltys
Joe, Absolutely. I can provide the configuration of every single processor. Could you point me to anything I can read through to see how actual content can be cached in memory? Perhaps a link to GitHub. If there's a condition where processors can avoid reading from local disk to fetch actual content I

Re: [E] Re: Lagging worker nodes

2021-01-28 Thread Joe Witt
Saltys It can be possible because those things can still be cached. The way this thing really works at scale can be quite awesome actually. However, we definitely want to help you understand what is happening, but the pictures alone don't cut it. We appreciate you have sensitivities/stuff you have t

Re: [E] Re: Lagging worker nodes

2021-01-28 Thread Zilvinas Saltys
We're still on an old version of Kafka, which is why we're still using the old processors. File sizes vary. Generally they are all within a ±100 MB range before they are uncompressed. There can be some small files, but they are not the majority. From logging I can see that all hosts are processing files of

Re: [E] Re: Lagging worker nodes

2021-01-28 Thread Pierre Villard
Not saying this is the issue, but is your Kafka cluster using Kafka 0.11? Looking at the screenshot, you're using the Kafka processors from the 0.11 bundle; you might want to look at the processors for Kafka 2.x instead. Are your files more or less evenly distributed in terms of size? I suppose y

Re: [E] Re: Lagging worker nodes

2021-01-28 Thread Zilvinas Saltys
My other issue is that the balancing is not rebalancing the queue. Perhaps I misunderstand how balancing should work, and it only distributes new incoming files round-robin? I can easily rebalance manually by disabling balancing and enabling it again, but after a while it gets back to the same situation

Re: [E] Re: Lagging worker nodes

2021-01-28 Thread Zilvinas Saltys
Hi Joe, Yes, it is the same issue. We followed your advice and reduced the number of threads on our large processors (fetch/compress/publish) to a minimum, then increased them gradually to 4 until the processing rate became acceptable (about 2000 files per 5 min). This is a cluster of 25 nodes of 36