Re: more uniform task assignment across kafka stream nodes

Ara Ebrahimi Sat, 25 Mar 2017 17:57:39 -0700

Matthias,

This apparently happens because we have more than one source topic: there are 3 
source topics in the same application. It seems the task assignment algorithm 
balances tasks not one topic at a time but over the total partitions across all 
source topics consumed by an application instance. Because we have some code 
dependencies between these 3 source topics, we can’t separate them into 3 
applications at this time. That is why I want the task assignment algorithm to 
do a simple, uniform task assignment PER source topic.
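
To make the request concrete, here is a minimal sketch of the difference between balancing the pooled partitions and balancing each topic on its own. It does not reproduce the actual Kafka Streams assignor; the function names, the topic names "a", "b", "c", the 8-partitions-per-topic figure and the contiguous-chunk pooled strategy are all assumptions chosen for illustration.

```python
from collections import Counter


def assign_pooled_contiguous(topics, instances):
    """Hypothetical pooled strategy: flatten every (topic, partition) pair
    into one list and cut it into equal contiguous chunks, one per instance.
    Totals come out balanced, but a single topic can land on one instance."""
    pooled = [(t, p) for t, n in topics.items() for p in range(n)]
    chunk = -(-len(pooled) // len(instances))  # ceiling division
    return {
        inst: pooled[i * chunk:(i + 1) * chunk]
        for i, inst in enumerate(instances)
    }


def assign_per_topic(topics, instances):
    """Round-robin each topic's partitions independently, so every instance
    gets roughly 1/n of the partitions of EACH topic."""
    assignment = {inst: [] for inst in instances}
    for topic, n in topics.items():
        for p in range(n):
            assignment[instances[p % len(instances)]].append((topic, p))
    return assignment


def counts_by_topic(assignment):
    """Summarise an assignment as {instance: {topic: partition_count}}."""
    return {
        inst: dict(Counter(t for t, _ in parts))
        for inst, parts in assignment.items()
    }
```

With 3 topics of 8 partitions and 2 instances, both strategies give each instance 12 partitions in total, but only the per-topic one gives each instance 4 partitions of every topic, which is the behaviour I am after.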


Ara.

On Mar 25, 2017, at 5:21 PM, Matthias J. Sax <[email protected]> wrote:




________________________________

This message is for the designated recipient only and may contain privileged, 
proprietary, or otherwise confidential information. If you have received it in 
error, please notify the sender immediately and delete the original. Any other 
use of the e-mail by you is prohibited. Thank you in advance for your 
cooperation.

________________________________

From: "Matthias J. Sax" <[email protected]>
Subject: Re: more uniform task assignment across kafka stream nodes
Date: March 25, 2017 at 5:21:47 PM PDT
To: [email protected]
Reply-To: <[email protected]>


Hi,

I am wondering why this happens in the first place. Streams
load-balances over all running instances, and each instance should have
the same number of tasks (and thus partitions) assigned.

What is the overall assignment? Do you have StandBy tasks configured?
What version do you use?


-Matthias


On 3/24/17 8:09 PM, Ara Ebrahimi wrote:
Hi,

Is there a way to tell kafka streams to uniformly assign partitions across 
instances? If I have n kafka streams instances running, I want each to handle 
EXACTLY 1/nth number of partitions. No dynamic task assignment logic. Just dumb 
1/n assignment.

Here’s our scenario. Let’s say we have a “source” topic with 8 partitions. We 
also have 2 kafka streams instances. Each instance gets assigned 4 “source” 
topic partitions. BUT then we do a few maps and an aggregate, so data gets 
shuffled around. The map function distributes the records uniformly across all 
partitions (I can verify that by looking at the partition offsets). After the 
map, what I notice by looking at the topology is that one kafka streams 
instance gets assigned, say, 2 aggregate repartition topics and the other one 
gets assigned 6. Even worse, on bigger clusters (say 4 instances) we see, say, 
2 nodes get assigned downstream aggregate repartition topics and 2 other nodes 
get assigned NOTHING to handle.
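
The imbalance in this scenario can be put into numbers with a small sketch. The helper names `spread` and `is_uniform` are hypothetical, and the counts are the ones quoted above (4/4 for the source topic, 2/6 after the repartition); nothing here reads from a real cluster.

```python
def spread(partitions_per_instance):
    """Difference between the most and least loaded instance."""
    counts = list(partitions_per_instance.values())
    return max(counts) - min(counts)


def is_uniform(partitions_per_instance):
    """A 1/n assignment can differ by at most one partition per instance."""
    return spread(partitions_per_instance) <= 1


# Counts taken from the scenario described above.
source = {"instance-1": 4, "instance-2": 4}
repartition = {"instance-1": 2, "instance-2": 6}
```

Here `is_uniform(source)` holds while `is_uniform(repartition)` does not: the fair share is 8 / 2 = 4 per instance, and the observed 2/6 split is off by 2 on each side.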

Ara.



