Please share the rest of your topology code (without any UDFs / business
logic). Otherwise, I cannot give further advice.

-Matthias


On 3/25/17 6:08 PM, Ara Ebrahimi wrote:
> Via:
> 
> builder.stream("topic1");
> builder.stream("topic2");
> builder.stream("topic3");
> 
> These are different kinds of topics consuming different avro objects.
> 
> Ara.
> 
> On Mar 25, 2017, at 6:04 PM, Matthias J. Sax <matth...@confluent.io> wrote:
> 
> 
> 
> 
> ________________________________
> 
> This message is for the designated recipient only and may contain privileged, 
> proprietary, or otherwise confidential information. If you have received it 
> in error, please notify the sender immediately and delete the original. Any 
> other use of the e-mail by you is prohibited. Thank you in advance for your 
> cooperation.
> 
> ________________________________
> 
> From: "Matthias J. Sax" <matth...@confluent.io>
> Subject: Re: more uniform task assignment across kafka stream nodes
> Date: March 25, 2017 at 6:04:30 PM PDT
> To: users@kafka.apache.org
> Reply-To: users@kafka.apache.org
> 
> 
> Ara,
> 
> How do you consume your topics? Via
> 
> builder.stream("topic1", "topic2", "topic3");
> 
> or via
> 
> builder.stream("topic1");
> builder.stream("topic2");
> builder.stream("topic3");
> 
> The two are handled differently with regard to creating tasks (the
> partition-to-task assignment also depends on your downstream code, though).
> 
> If this does not help, can you maybe share the structure of processing?
> To dig deeper, we would need to know the topology DAG.
> 
> 
> -Matthias
> 
> 
> On 3/25/17 5:56 PM, Ara Ebrahimi wrote:
> Matthias,
> 
> This apparently happens because we have more than one source topic. We have 3 
> source topics in the same application. So it seems the task assignment 
> algorithm creates topologies not for one specific topic at a time but for the 
> total partitions across all source topics consumed in an application 
> instance. Because we have some code dependencies between these 3 source 
> topics, we can’t separate them into 3 applications at this time. That is 
> why I want to get the task assignment algorithm to do a uniform and 
> simple task assignment PER source topic.
> 
> Ara.
> 
> On Mar 25, 2017, at 5:21 PM, Matthias J. Sax <matth...@confluent.io> wrote:
> 
> 
> 
> 
> From: "Matthias J. Sax" <matth...@confluent.io>
> Subject: Re: more uniform task assignment across kafka stream nodes
> Date: March 25, 2017 at 5:21:47 PM PDT
> To: users@kafka.apache.org
> Reply-To: users@kafka.apache.org
> 
> 
> Hi,
> 
> I am wondering why this happens in the first place. Streams
> load-balances over all running instances, and each instance should have
> the same number of tasks (and thus partitions) assigned.
> 
> What is the overall assignment? Do you have standby tasks configured?
> What version do you use?
> 
> 
> -Matthias
> 
> 
> On 3/24/17 8:09 PM, Ara Ebrahimi wrote:
> Hi,
> 
> Is there a way to tell kafka streams to uniformly assign partitions across 
> instances? If I have n kafka streams instances running, I want each to handle 
> EXACTLY 1/n of the partitions. No dynamic task assignment logic. Just 
> dumb 1/n assignment.
> 
> Here’s our scenario. Let’s say we have a "source" topic with 8 partitions. We 
> also have 2 kafka streams instances. Each instance gets assigned 4 
> "source" topic partitions. BUT then we do a few maps and an aggregate. So 
> data gets shuffled around. The map function uniformly distributes these 
> across all partitions (I can verify that by looking at the partition 
> offsets). After the map, what I notice by looking at the topology is that one 
> kafka streams instance gets assigned, say, 2 aggregate repartition 
> topics and the other one gets assigned 6. Even worse, on bigger clusters (say 
> 4 instances) we see, say, 2 nodes get assigned the downstream aggregate 
> repartition topics and the 2 other nodes get assigned NOTHING to handle.
> 
> Ara.
> 
> 
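The difference between the two ways of consuming topics discussed in this thread can be sketched in plain Java, without any Kafka dependency. This is only an illustration under stated assumptions, not the actual Streams assignor: it assumes that source topics processed together form one group, that a group gets one task per partition index (so as many tasks as its largest topic has partitions), and that three separate `builder.stream(...)` calls end up in three independent groups, which, as noted in the thread, depends on the downstream code. The class name, the method names, and the partition count of 8 for every topic are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class TaskCountSketch {

    // Assumed rule: a group of source topics that are processed together gets
    // one task per partition index, i.e. as many tasks as its largest topic
    // has partitions.
    public static int tasksForGroup(List<Integer> partitionCounts) {
        int max = 0;
        for (int count : partitionCounts) {
            max = Math.max(max, count);
        }
        return max;
    }

    // Total number of tasks across independent groups of source topics.
    public static int totalTasks(List<List<Integer>> groups) {
        int total = 0;
        for (List<Integer> group : groups) {
            total += tasksForGroup(group);
        }
        return total;
    }

    // "Dumb" 1/n assignment as asked for in the thread: task i goes to
    // instance i % instances. Returns the number of tasks per instance.
    // This is NOT how Kafka Streams assigns tasks; it is only the baseline
    // the original question describes.
    public static int[] roundRobin(int tasks, int instances) {
        int[] perInstance = new int[instances];
        for (int task = 0; task < tasks; task++) {
            perInstance[task % instances]++;
        }
        return perInstance;
    }

    public static void main(String[] args) {
        int partitions = 8; // hypothetical partition count for every topic

        // builder.stream("topic1", "topic2", "topic3"): one group of 3 topics.
        List<List<Integer>> merged =
                Arrays.asList(Arrays.asList(partitions, partitions, partitions));

        // Three separate builder.stream(...) calls, assumed here to stay
        // independent downstream: three groups of 1 topic each.
        List<List<Integer>> separate = Arrays.asList(
                Arrays.asList(partitions),
                Arrays.asList(partitions),
                Arrays.asList(partitions));

        System.out.println("merged tasks:   " + totalTasks(merged));
        System.out.println("separate tasks: " + totalTasks(separate));
        System.out.println("round-robin over 2 instances: "
                + Arrays.toString(roundRobin(totalTasks(separate), 2)));
    }
}
```

Under these assumptions the task count, and therefore what can be balanced across instances, differs between the two variants even though the topics and partitions are the same. An even count of tasks per instance also says nothing about which kind of task (source or repartition) each instance receives.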

Attachment: signature.asc
Description: OpenPGP digital signature
