ok..Thanks Matthius for all the help.

-Sameer.

On Tue, Dec 19, 2017 at 11:14 PM, Matthias J. Sax <matth...@confluent.io>
wrote:

> Processor API does not do any automatic repartitioning. If you need to
> repartition data, you always need to do it "manually" by writeing to and
> reading back from a topic.
>
>
> -Matthias
>
>
> On 12/19/17 3:49 AM, Sameer Kumar wrote:
> > Could you please clarify, if i just choose to use low level processor
> api,
> > what directs it to do re partitioning. I am not using them in conjunction
> > with DSLs, I plan to use only them.
> > Apart from sink processors, are there any conditions for re partitioning
> to
> > occur.
> >
> > -Sameer.
> >
> > On Tue, Dec 19, 2017 at 11:06 AM, Sameer Kumar <sam.kum.w...@gmail.com>
> > wrote:
> >
> >> I understand it now, even if we are able to attach custom partitioning.
> >> The data shall still travel from stream nodes to broker on join topic,
> so
> >> travel to network will still be there.
> >>
> >> -Sameer.
> >>
> >> On Tue, Dec 19, 2017 at 1:17 AM, Matthias J. Sax <matth...@confluent.io
> >
> >> wrote:
> >>
> >>>> need to map the keys, modify them
> >>>>> and then do a join.
> >>>
> >>> This will always trigger a rebalance. There is no API atm to tell KS
> >>> that partitioning is preserved.
> >>>
> >>> Custom partitioner won't help for your case as far as I understand it.
> >>>
> >>>
> >>> -Matthias
> >>>
> >>> On 12/17/17 9:48 PM, Sameer Kumar wrote:
> >>>> Actually, I am doing joining after map. I need to map the keys, modify
> >>> them
> >>>> and then do a join.
> >>>>
> >>>> I was thinking of using always passing a partition key based on which
> >>>> partition happens.
> >>>> Step by step flow is:-
> >>>> 1. Data is already partitoned by do userid.
> >>>> 2. I do a map to joins impressions tied to a user with view
> >>> notifications.
> >>>> 3. I count valid impressions across different aggregations(i.e. across
> >>> diff
> >>>> dimension groups).
> >>>>
> >>>> Thanks,
> >>>> -Sameer.
> >>>>
> >>>> On Mon, Dec 18, 2017 at 1:37 AM, Matthias J. Sax <
> matth...@confluent.io
> >>>>
> >>>> wrote:
> >>>>
> >>>>> Two comments:
> >>>>>
> >>>>> 1) As long, as you don't do an aggregation/join after a map(), there
> >>>>> will be not repartitioning. Streams does repartitioning "lazy", ie,
> >>> only
> >>>>> if it's required. As long as you only chain filter/map etc, no
> >>>>> repartitioning will be done.
> >>>>>
> >>>>> 2) Can't you use mapValue() instead of map()? If you use map() to
> only
> >>>>> read the key but only modify the value (-> "data is still local") a
> >>>>> custom partitioner won't help. Also, we are improving this in
> upcoming
> >>>>> version 1.1 and allows read access to a key in mapValue() (cf.
> KIP-149
> >>>>> for details).
> >>>>>
> >>>>> Hope this helps.
> >>>>>
> >>>>>
> >>>>> -Matthias
> >>>>>
> >>>>> On 12/17/17 8:20 AM, Sameer Kumar wrote:
> >>>>>> I have multiple map and filter phases in my application dag and
> >>> though I
> >>>>> am
> >>>>>> generating different keys at different points, the data is still
> >>> local.
> >>>>>> Re-partitioning for me here is adding unnecessary network
> shuffling, I
> >>>>> want
> >>>>>> to minimize it.
> >>>>>>
> >>>>>> -Sameer.
> >>>>>>
> >>>>>> On Friday, December 15, 2017, Matthias J. Sax <
> matth...@confluent.io>
> >>>>> wrote:
> >>>>>>
> >>>>>>> It's not recommended to write a custom partitioner because it's
> >>> pretty
> >>>>>>> difficult to write a correct one. There are many dependencies and
> you
> >>>>>>> need deep knowledge of Kafka Streams internals to get it write.
> >>>>>>> Otherwise, your custom partitioner breaks Kafka Streams.
> >>>>>>>
> >>>>>>> That is the reason why it's not documented...
> >>>>>>>
> >>>>>>> Not sure so, what you try to achieve in the first place. What do
> you
> >>>>>>> mean by
> >>>>>>>
> >>>>>>>> I want to make sure that during map phase, the keys
> >>>>>>>>> produced adhere to the customized partitioner.
> >>>>>>>
> >>>>>>> Maybe you achieve what you want differently.
> >>>>>>>
> >>>>>>>
> >>>>>>> -Matthias
> >>>>>>>
> >>>>>>> On 12/15/17 1:19 AM, Sameer Kumar wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I want to use the custom partitioner in streams, I couldnt find
> the
> >>>>> same
> >>>>>>> in
> >>>>>>>> the documentation. I want to make sure that during map phase, the
> >>> keys
> >>>>>>>> produced adhere to the customized partitioner.
> >>>>>>>>
> >>>>>>>> -Sameer.
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
>
>

Reply via email to