Thanks for the background context, Jay.

Do we have any context on what size is small (but still effective for small
deployments) for the compaction buffer, and what is large? What factors
help you choose the correct (or a safe) size?

Currently the default "log.cleaner.dedupe.buffer.size" is 500 MiB. If we
are enabling the log cleaner by default, should we adjust that default size
to be smaller?
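
For a rough sense of scale, here is a minimal sketch of how buffer size might
translate into the number of unique keys a single cleaner pass can track. The
numbers in it are my assumptions, not something confirmed in this thread: 24
bytes per entry (a 16-byte hash of the key plus an 8-byte offset), a 0.9 load
factor, and the buffer split evenly across cleaner threads. The function name
is made up for illustration.

```python
# Rough estimate of how many unique keys one cleaner pass can deduplicate.
# Assumptions (not confirmed in this thread): 24 bytes per entry
# (16-byte key hash + 8-byte offset), a 0.9 load factor, and the buffer
# divided evenly across cleaner threads.

BYTES_PER_ENTRY = 16 + 8  # assumed: key hash + offset


def max_keys_per_pass(buffer_bytes, cleaner_threads=1, load_factor=0.9):
    """Approximate unique keys a single cleaner thread can track per pass."""
    if cleaner_threads < 1:
        raise ValueError("cleaner_threads must be >= 1")
    per_thread_bytes = buffer_bytes // cleaner_threads
    slots = per_thread_bytes // BYTES_PER_ENTRY
    return int(slots * load_factor)


if __name__ == "__main__":
    # Compare a few candidate defaults, including the current 500 MiB.
    for mib in (16, 128, 500):
        keys = max_keys_per_pass(mib * 1024 * 1024)
        print(f"{mib:>4} MiB -> ~{keys:,} keys per pass")
```

Under those assumptions, 500 MiB would cover roughly 19 to 20 million keys per
pass, which is probably far more than a small deployment needs.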

On a similar note, log.cleaner.delete.retention.ms currently defaults
to 1 day. I am not sure of the background here either, but would it make sense
to default this setting to 7 days to match the default log retention and
ensure no delete messages are missed by consumers?
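
To make the concern concrete, here is a simplified sketch of the comparison I
have in mind. It assumes a consumer can miss a delete marker once it is
further behind than the delete retention window. The real cleaner timing is
more involved, and the helper name is hypothetical.

```python
# Simplified model: a consumer that lags by more than the delete retention
# window may never see a delete marker before the cleaner drops it.
# The helper name is hypothetical; the defaults are the ones discussed above.

DAY_MS = 24 * 60 * 60 * 1000
DELETE_RETENTION_MS = 1 * DAY_MS  # current log.cleaner.delete.retention.ms
LOG_RETENTION_MS = 7 * DAY_MS     # default log retention (7 days)


def may_miss_deletes(consumer_lag_ms, delete_retention_ms=DELETE_RETENTION_MS):
    """Return True if a consumer this far behind could miss a delete marker."""
    return consumer_lag_ms > delete_retention_ms


if __name__ == "__main__":
    lag = 3 * DAY_MS  # a consumer that was offline for three days
    print("1 day retention :", may_miss_deletes(lag))
    print("7 day retention :", may_miss_deletes(lag, LOG_RETENTION_MS))
```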

Thanks,
Grant

On Mon, Dec 14, 2015 at 2:19 PM, Jay Kreps <j...@confluent.io> wrote:

> The reason for disabling it by default was (1) general paranoia about
> log compaction when we released it, (2) avoid allocating the
> compaction buffer. The first concern is now definitely obsolete, but
> the second concern is maybe valid. Basically that compaction buffer is
> a preallocated chunk of memory used in compaction and is closely tied
> to the efficiency of the compaction process (so you want it to be
> big). But if you're not using compaction then it is just wasting
> memory. I guess since the new consumer requires native offsets
> (right?) and native offsets require log compaction, maybe we should
> just default it to on...
>
> -Jay
>
> On Mon, Dec 14, 2015 at 11:51 AM, Jason Gustafson <ja...@confluent.io>
> wrote:
> > That's a good point. It doesn't look like there's any special handling for
> > the offsets topic, so enabling the cleaner by default makes sense to me. If
> > compaction is not enabled, it would grow without bound, so I wonder if we
> > should even deprecate that setting. Are there any use cases where it needs
> > to be disabled?
> >
> > -Jason
> >
> > On Mon, Dec 14, 2015 at 11:31 AM, Gwen Shapira <g...@confluent.io> wrote:
> >
> >> This makes sense to me. Copycat also works better if topics are compacted.
> >>
> >> Just to clarify:
> >> log.cleaner.enable = true just makes the compaction thread run, but doesn't
> >> force compaction on any specific topic. You still need to set
> >> cleanup.policy=compact, and we should not change defaults here.
> >>
> >> On Mon, Dec 14, 2015 at 10:32 AM, Grant Henke <ghe...@cloudera.com> wrote:
> >>
> >> > Since 0.9.0 the internal "__consumer_offsets" topic is being used more
> >> > heavily. Because this is a compacted topic, does "log.cleaner.enable" need
> >> > to be "true" in order for it to be compacted? Or is there special handling
> >> > for internal topics?
> >> >
> >> > If log.cleaner.enable=true is required, should we make it true by default?
> >> > Or document that it is required for normal operation?
> >> >
> >> > Thanks,
> >> > Grant
> >> > --
> >> > Grant Henke
> >> > Software Engineer | Cloudera
> >> > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> >> >
> >>
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
