Hi Zahari,

I think we can retire the KIP, since the KAFKA-7548 patch should solve the 
issue without any changes that require a KIP.  This is actually the best thing 
we could do for our users, since things will "just work" more efficiently 
without a lot of configuration knobs.
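
For anyone following along, the pause/resume pattern under discussion looks
roughly like this (a minimal sketch against the public KafkaConsumer API; the
broker address, topic name, and partition numbers are placeholders, not
anything from the ticket):

    import java.time.Duration;
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class PauseResumeSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                TopicPartition tp0 = new TopicPartition("events", 0);
                TopicPartition tp1 = new TopicPartition("events", 1);
                consumer.assign(Arrays.asList(tp0, tp1));

                while (true) {
                    ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(100));
                    // ... hand records to the application ...

                    // Backpressure: stop fetching tp0 while downstream is busy.
                    // Before KAFKA-7548, data already prefetched for tp0 was
                    // discarded by the next poll() and re-fetched after
                    // resume(); with the patch it stays buffered.
                    consumer.pause(Collections.singleton(tp0));

                    // ... later, once downstream catches up ...
                    consumer.resume(Collections.singleton(tp0));
                }
            }
        }
    }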

I think you did an excellent job raising this issue and discussing it.  It's a 
very good contribution to the project even if you don't end up writing the 
patch yourself.  I'm going to take a look at the patch today.  If you want to 
take a look, that would also be good.

best,
Colin


On Thu, Oct 25, 2018, at 12:25, Zahari Dichev wrote:
> Hi there Mayuresh,
> 
> Great to hear that this has actually been working well in production for some
> time now. I have updated the KIP to reflect the fact that, as already
> discussed, we do not really need any kind of configuration, since this data
> should not be thrown away at all. Submitting a PR sounds great, although I
> feel a bit jealous you (LinkedIn) beat me to my first Kafka commit ;) Any
> idea how things stand with the voting process?
> 
> Zahari
> 
> 
> 
> On Thu, Oct 25, 2018 at 7:39 PM Mayuresh Gharat <gharatmayures...@gmail.com>
> wrote:
> 
> > Hi Colin/Zahari,
> >
> > I have created a ticket for the same (or a very similar) feature:
> > https://issues.apache.org/jira/browse/KAFKA-7548
> > We (LinkedIn) had a use case in Samza when it moved from SimpleConsumer to
> > KafkaConsumer and wanted to use this pause-and-resume pattern.
> > They saw a performance degradation when they started using
> > KafkaConsumer.assign() and pausing and resuming partitions. We realized
> > that not throwing away the prefetched data for paused partitions might
> > improve performance, so we wrote a benchmark (I can share it if needed) to
> > prove this. I have attached the findings in the ticket.
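> > (Not our actual benchmark, but a toy sketch of the idea, assuming a
> > KafkaConsumer<String, String> named "consumer" that is already assigned to
> > its partitions, plus java.time.Duration and java.util.concurrent.TimeUnit,
> > looks something like this:
> >
> >     long received = 0;
> >     long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(30);
> >     while (System.nanoTime() < deadline) {
> >         received += consumer.poll(Duration.ofMillis(10)).count();
> >         consumer.pause(consumer.assignment());
> >         // An app applying backpressure keeps polling while paused; before
> >         // the fix, this poll() discarded anything already prefetched for
> >         // the paused partitions, forcing a re-fetch after resume().
> >         consumer.poll(Duration.ofMillis(10));
> >         consumer.resume(consumer.assignment());
> >     }
> >     System.out.println((received / 30) + " records/sec");
> >
> > The records/sec difference with and without the fix comes from whether the
> > prefetched data survives the pause.)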
> > We have been running the hotfix internally for quite a while now. When
> > Samza ran this fix in production, they saw a 30% improvement in their
> > app performance.
> > I have the patch ready on our internal branch and would like to submit a PR
> > for it on the above ticket ASAP.
> > I am not sure if we need a separate config for this, as we haven't seen
> > much memory overhead from it in our systems. We have had this running
> > in production for a considerable amount of time without any issues.
> > It would be great if you guys can review the PR once it's up and see if it
> > satisfies your requirements. If it doesn't, we can think more about the
> > config-driven approach.
> > Thoughts?
> >
> > Thanks,
> >
> > Mayuresh
> >
> >
> > On Thu, Oct 25, 2018 at 8:21 AM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Hi Zahari,
> > >
> > > One question we didn't figure out earlier was who would actually want
> > > this cached data to be thrown away. If there's nobody who actually
> > > wants this, then perhaps we can simplify the proposal by just
> > > unconditionally retaining the cache until the partition is resumed, or
> > > we unsubscribe from the partition. This would avoid adding a new
> > > configuration.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Sun, Oct 21, 2018, at 11:54, Zahari Dichev wrote:
> > > > Hi there. Although it has been discussed briefly already in this thread
> > > > <https://lists.apache.org/thread.html/fbb7e9ccc41084fc2ff8612e6edf307fb400f806126b644d383b4a64@%3Cdev.kafka.apache.org%3E>,
> > > > I decided to follow the process and initiate a DISCUSS thread. Comments
> > > > and suggestions are more than welcome.
> > > >
> > > >
> > > > Zahari Dichev
> > >
> >
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
> >
