Re: Kafka streams parallelism - why not separate stream task per partition per input topic

2020-10-08 Thread Pushkar Deole
I looked at the task assignment, and it looked random for some threads. For
example, I have 3 topics with 24 partitions each and 3 instances of the
application, so each instance is assigned 8 partitions per topic, i.e. 24
partitions in total across the 3 topics.

When I set 8 stream threads, I expected each thread to be assigned 1
partition from each topic; however, some of the threads were assigned
partitions from only 2 of the topics.
Since topic C is not carrying traffic, the threads that were not assigned a
partition from topic C were more heavily loaded than the others.

Topic A
Topic
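
The "round robin" hand-out described in the quoted reply below can be modelled
in a few lines of plain Java. This is only a simplified sketch, not the actual
Kafka Streams assignor: the class name, the "A-0"-style task labels, and the
uneven 9/8/7 split are all assumptions made for illustration. It shows that
dealing tasks out one by one gives every thread the same number of tasks, but
does not by itself guarantee one task per topic on every thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class RoundRobinSketch {

    // Deal tasks out to threads one by one, in list order (simplified model).
    public static List<List<String>> assign(List<String> tasks, int numThreads) {
        List<List<String>> threads = new ArrayList<>();
        for (int i = 0; i < numThreads; i++) {
            threads.add(new ArrayList<>());
        }
        for (int i = 0; i < tasks.size(); i++) {
            threads.get(i % numThreads).add(tasks.get(i));
        }
        return threads;
    }

    // Task labels are hypothetical "topic-partition" strings such as "A-3".
    public static Set<String> topicsOf(List<String> threadTasks) {
        Set<String> topics = new TreeSet<>();
        for (String task : threadTasks) {
            topics.add(task.substring(0, task.indexOf('-')));
        }
        return topics;
    }

    public static void main(String[] args) {
        // Assumed share of one instance: 24 tasks, but split 9 / 8 / 7 per topic.
        List<String> tasks = new ArrayList<>();
        for (int p = 0; p < 9; p++) tasks.add("A-" + p);
        for (int p = 0; p < 8; p++) tasks.add("B-" + p);
        for (int p = 0; p < 7; p++) tasks.add("C-" + p);

        List<List<String>> threads = assign(tasks, 8);
        for (int t = 0; t < threads.size(); t++) {
            System.out.println("thread-" + t + " " + threads.get(t)
                    + " topics=" + topicsOf(threads.get(t)));
        }
    }
}
```

In this model every thread ends up with 3 tasks, yet thread-0 receives A-0,
A-8 and B-7 and therefore nothing from topic C. With an exact 8/8/8 split in
the same order, each thread would get one task per topic, so the result
depends on which tasks the instance holds and in what order they are dealt.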

On Wed, Oct 7, 2020 at 11:45 PM Matthias J. Sax  wrote:

> Well, there are many what-ifs and I am not sure if there is general advice.
>
> Maybe a more generic response: do you actually observe a concrete issue
> with the task assignment that measurably impacts your app? Or might this
> be a case of premature optimization?
>
> -Matthias
>
> On 10/6/20 10:13 AM, Pushkar Deole wrote:
> > So, what do you suggest to address topic C, which has less traffic? Should
> > we create a separate StreamsBuilder and build a separate topology for
> > topic C so we can configure the number of threads as per our requirement
> > for that topic?
> >
> > On Tue, Oct 6, 2020 at 10:27 PM Matthias J. Sax wrote:
> >
> >> The current assignment would be "round robin". I.e., after all tasks are
> >> created, we just take task-by-task and assign one to the threads
> >> one-by-one.
> >>
> >> Note though, that the assignment algorithm might change at any point, so
> >> you should not rely on it.
> >>
> >> We are also not able to know if one topic has less traffic than others
> >> and thus must blindly assume (which is of course a simplification) that
> >> all topics have the same traffic. We only consider the difference
> >> between stateless and stateful tasks atm.
> >>
> >> -Matthias
> >>
> >> On 10/6/20 3:57 AM, Pushkar Deole wrote:
> >>> Matthias,
> >>>
> >>> I am just wondering how the tasks will be spread across threads in case I
> >>> have fewer threads than partitions. Specifically, in my use case I have
> >>> 3 input topics with 8 partitions each and I can configure 12 threads, so
> >>> how will the partitions of the topics below be distributed among the 12
> >>> threads?
> >>> Note that topic C is generally idle and carries traffic only sometimes, so
> >>> I would want the partitions from topic C to be evenly distributed, so that
> >>> they don't all get assigned to only some of the threads.
> >>>
> >>> Topic A - 8 partitions
> >>> Topic B - 8 partitions
> >>> Topic C - 8 partitions
> >>>
> >>> On Fri, Sep 4, 2020 at 9:56 PM Matthias J. Sax wrote:
> >>>
> >>>> That is correct.
> >>>>
> >>>> If topicA has 5 partitions and topicB has 6 partitions, you get 5 tasks
> >>>> for the first sub-topology and 6 tasks for the second sub-topology and
> >>>> you can run up to 11 threads, each executing one task.
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>> On 9/4/20 1:30 AM, Pushkar Deole wrote:
> > Matthias,
> >
> > Let's say we have independent sub-topologies like the ones below. In this
> > case, will Streams create tasks equal to the total number of partitions
> > from topicA and topicB, and can we set a stream thread count equal to the
> > sum of the partitions of the two topics?
> >
> > builder.stream("topicA").filter().to();
> > builder.stream("topicB").filter().to();
> >
> > On Thu, Sep 3, 2020 at 8:28 PM Matthias J. Sax wrote:
> >
> >> Well, it depends on your program.
> >>
> >> The reason for the current task creation strategy is joins: if you have
> >> two input topics that you want to join, the join happens on a
> >> per-partition basis, i.e., topicA-p0 is joined to topicB-p0 etc., and thus
> >> both partitions must be assigned to the same task (to get co-partitioned
> >> data processed together).
> >>
> >> Note that the following program would create independent tasks, as it
> >> consists of two independent sub-topologies:
> >>
> >> builder.stream("topicA").filter().to();
> >> builder.stream("topicB").filter().to();
> >>
> >> However, the next program would be one sub-topology, and thus we apply
> >> the "join" rule (as we don't really know whether you actually execute a
> >> join or not when we create tasks):
> >>
> >> KStream s1 = builder.stream("topicA");
> >> builder.stream("topicB").merge(s1).filter().to();
> >>
> >>
> >> Having said that, I agree that it would be a nice improvement to be more
> >> clever about it. However, it is not easy to do. There is actually a
> >> related ticket: https://issues.apache.org/jira/browse/KAFKA-9282
> >>
> >>
> >> Hope this helps.
> >>   -Matthias
> >>
> >> On 9/2/20 11:09 PM, Pushkar Deole wrote:
> >>> Hi,
> >>>
> >>> I came across articles where it is explained how parallelism is
> >> handled
> >> in
> >
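
The task-count arithmetic discussed in this thread (5 + 6 = 11 tasks for two
independent sub-topologies, versus the "join" rule when both topics feed one
sub-topology) can be sketched as below. This is a plain-Java illustration, not
a Kafka Streams API: the class and method names are hypothetical, and it
assumes each sub-topology gets as many tasks as its largest source topic has
partitions.

```java
import java.util.List;
import java.util.Map;

public class TaskCountSketch {

    // Each inner list holds the source topics of one sub-topology.
    // Assumption: tasks per sub-topology = max partition count of its sources.
    public static int totalTasks(List<List<String>> subTopologies,
                                 Map<String, Integer> partitions) {
        int total = 0;
        for (List<String> sourceTopics : subTopologies) {
            int max = 0;
            for (String topic : sourceTopics) {
                max = Math.max(max, partitions.get(topic));
            }
            total += max;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> partitions = Map.of("topicA", 5, "topicB", 6);

        // Two independent sub-topologies: 5 + 6 tasks, prints 11.
        System.out.println(totalTasks(
                List.of(List.of("topicA"), List.of("topicB")), partitions));

        // Both topics in one sub-topology (e.g. via merge): prints 6.
        System.out.println(totalTasks(
                List.of(List.of("topicA", "topicB")), partitions));
    }
}
```

Under this model the total task count is also the largest number of stream
threads that can each be given a task; threads beyond that would have nothing
to run.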

Re: Reverse-Proxy and Kafka

2020-10-08 Thread Naveen D
Hi guys

We are using the Debezium SQL connector with Kafka, and we found that Debezium
is eating up disk space as part of logging. How can we make sure the topic does
not grow too large? For a 500 MB table, the log consumes 12 GB of space.

Is there any alternative to reduce the space used?

Regards,
Navin

On Thu, Oct 8, 2020, 4:48 PM Mario Lafleur wrote:

> Hi, we want to route Kafka producer connections through a reverse proxy.
> Is this possible?


Reverse-Proxy and Kafka

2020-10-08 Thread Mario Lafleur
Hi, we want to route Kafka producer connections through a reverse proxy.
Is this possible?

Mario Lafleur
Analyste-programmeur Solutions administratives
Programmer Analyst Administrative Solutions
T (438) 315-5301 | T +1 800 361-9659
mario.lafl...@logibec.com
www.logibec.com

CONFIDENTIALITY NOTICE This email and the information it contains are 
confidential and remain the exclusive property of Logibec. They may not be 
disseminated to anyone except staff members who must be aware of it, provided 
that they have been informed that the documents are the property of Logibec, 
are confidential and cannot be shared with a third party without prior written 
authorization from Logibec. If you are not the intended recipient, please 
notify the sender immediately and destroy the contents of the email without 
disclosing or reproducing them.


kafka schema registry - some queries and questions

2020-10-08 Thread Manoj.Agrawal2
Hi All,

I wanted to understand a bit more about the Schema Registry:
1. Can we use the Schema Registry with Apache Kafka?
2. Can we use the Schema Registry with Amazon MSK?

Thanks
Manoj A

This e-mail and any files transmitted with it are for the sole use of the 
intended recipient(s) and may contain confidential and privileged information. 
If you are not the intended recipient(s), please reply to the sender and 
destroy all copies of the original message. Any unauthorized review, use, 
disclosure, dissemination, forwarding, printing or copying of this email, 
and/or any action taken in reliance on the contents of this e-mail is strictly 
prohibited and may be unlawful. Where permitted by applicable law, this e-mail 
and other e-mail communications sent to and from Cognizant e-mail addresses may 
be monitored.