Please share the rest of your topology code (without any UDFs / business logic). Otherwise, I cannot give further advice.
-Matthias
On 3/25/17 6:08 PM, Ara Ebrahimi wrote:
> Via:
>
> builder.stream("topic1");
> builder.stream("topic2");
> builder.stream("topic3”);
>
> These are different kinds of topics consuming different avro objects.
>
> Ara.
>
> On Mar 25, 2017, at 6:04 PM, Matthias J. Sax
> <[email protected]<mailto:[email protected]>> wrote:
>
>
>
>
> ________________________________
>
> This message is for the designated recipient only and may contain privileged,
> proprietary, or otherwise confidential information. If you have received it
> in error, please notify the sender immediately and delete the original. Any
> other use of the e-mail by you is prohibited. Thank you in advance for your
> cooperation.
>
> ________________________________
>
> From: "Matthias J. Sax" <[email protected]<mailto:[email protected]>>
> Subject: Re: more uniform task assignment across kafka stream nodes
> Date: March 25, 2017 at 6:04:30 PM PDT
> To: [email protected]<mailto:[email protected]>
> Reply-To: <[email protected]<mailto:[email protected]>>
>
>
> Ara,
>
> How do you consume your topics? Via
>
> builder.stream("topic1", "topic2", "topic3);
>
> or via
>
> builder.stream("topic1");
> builder.stream("topic2");
> builder.stream("topic3");
>
> Both and handled differently with regard to creating tasks (partition to
> task assignment also depends on you downstream code though).
>
> If this does not help, can you maybe share the structure of processing?
> To dig deeper, we would need to know the topology DAG.
>
>
> -Matthias
>
>
> On 3/25/17 5:56 PM, Ara Ebrahimi wrote:
> Mathias,
>
> This apparently happens because we have more than 1 source topic. We have 3
> source topics in the same application. So it seems like the task assignment
> algorithm creates topologies not for one specific topic at a time but the
> total partitions across all source topics consumed in an application
> instance. Because we have some code dependencies between these 3 source
> topics we can’t separate them into 3 applications at this time. Hence the
> reason I want to get the task assignment algorithm basically do a uniform and
> simple task assignment PER source topic.
>
> Ara.
>
> On Mar 25, 2017, at 5:21 PM, Matthias J. Sax
> <[email protected]<mailto:[email protected]><mailto:[email protected]>>
> wrote:
>
>
>
>
> ________________________________
>
> This message is for the designated recipient only and may contain privileged,
> proprietary, or otherwise confidential information. If you have received it
> in error, please notify the sender immediately and delete the original. Any
> other use of the e-mail by you is prohibited. Thank you in advance for your
> cooperation.
>
> ________________________________
>
> From: "Matthias J. Sax"
> <[email protected]<mailto:[email protected]><mailto:[email protected]>>
> Subject: Re: more uniform task assignment across kafka stream nodes
> Date: March 25, 2017 at 5:21:47 PM PDT
> To:
> [email protected]<mailto:[email protected]><mailto:[email protected]>
> Reply-To:
> <[email protected]<mailto:[email protected]><mailto:[email protected]>>
>
>
> Hi,
>
> I am wondering why this happens in the first place. Streams,
> load-balanced over all running instances, and each instance should be
> the same number of tasks (and thus partitions) assigned.
>
> What is the overall assignment? Do you have StandyBy tasks configured?
> What version do you use?
>
>
> -Matthias
>
>
> On 3/24/17 8:09 PM, Ara Ebrahimi wrote:
> Hi,
>
> Is there a way to tell kafka streams to uniformly assign partitions across
> instances? If I have n kafka streams instances running, I want each to handle
> EXACTLY 1/nth number of partitions. No dynamic task assignment logic. Just
> dumb 1/n assignment.
>
> Here’s our scenario. Lets say we have an “source" topic with 8 partitions. We
> also have 2 kafka streams instances. Each instances get assigned to handle 4
> “source" topic partitions. BUT then we do a few maps and an aggregate. So
> data gets shuffled around. The map function uniformly distributes these
> across all partitions (I can verify that by looking at the partition
> offsets). After the map what I notice by looking at the topology is that one
> kafka streams instance get assigned to handle say 2 aggregate repartition
> topics and the other one gets assigned 6. Even worse, on bigger clusters (say
> 4 instances) we see say 2 nodes gets assigned downstream aggregate
> repartition topics and 2 other nodes assigned NOTHING to handle.
>
> Ara.
>
>
>
> ________________________________
>
> This message is for the designated recipient only and may contain privileged,
> proprietary, or otherwise confidential information. If you have received it
> in error, please notify the sender immediately and delete the original. Any
> other use of the e-mail by you is prohibited. Thank you in advance for your
> cooperation.
>
> ________________________________
>
>
>
>
>
>
>
>
> ________________________________
>
> This message is for the designated recipient only and may contain privileged,
> proprietary, or otherwise confidential information. If you have received it
> in error, please notify the sender immediately and delete the original. Any
> other use of the e-mail by you is prohibited. Thank you in advance for your
> cooperation.
>
> ________________________________
>
>
>
>
> ________________________________
>
> This message is for the designated recipient only and may contain privileged,
> proprietary, or otherwise confidential information. If you have received it
> in error, please notify the sender immediately and delete the original. Any
> other use of the e-mail by you is prohibited. Thank you in advance for your
> cooperation.
>
> ________________________________
>
signature.asc
Description: OpenPGP digital signature
