Hi Guozhang,

Thinking out loud... delete-then-recreate works if it is acceptable to have
a topic-specific downtime during which Kafka can't accept requests for that
topic. This downtime would last as long as it takes to delete the topic and
then recreate it. I am assuming here that a producer sending data for a
topic while it is being deleted, and before it is recreated, will receive
an error. That error will be pushed to clients if the process lasts longer
than the retry count and retry time configured on the producer can cover.
In our case, the producer is a web service.
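
To put rough numbers on the retry point, here is a minimal sketch. The
function names are made up for illustration, and it assumes the retry
window is simply the retry count times the backoff, ignoring the time each
failed attempt itself takes. The 3 retries / 100 ms values are what I
believe the 0.8 producer defaults to for message.send.max.retries and
retry.backoff.ms; please check them against your version.

```python
import math


def retry_window_ms(max_retries: int, backoff_ms: int) -> int:
    """Rough lower bound on how long a producer keeps retrying.

    Assumption: only the backoff between attempts is counted, not the
    time spent waiting for each failed request.
    """
    return max_retries * backoff_ms


def retries_needed(downtime_ms: int, backoff_ms: int) -> int:
    """Smallest retry count whose window covers the given downtime."""
    return math.ceil(downtime_ms / backoff_ms)


if __name__ == "__main__":
    # Assumed defaults: 3 retries, 100 ms backoff.
    print("default window (ms):", retry_window_ms(3, 100))
    # Retries needed to ride out a 30 s delete/recreate with 1 s backoff.
    print("retries for 30 s:", retries_needed(30_000, 1_000))
```

Under those assumptions the default window is well under a second, so a
delete/recreate taking around 30s would surface errors to our clients
unless the retry settings are raised considerably.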

It may be acceptable if we do this maintenance during low-use periods and
the process is rapid enough (my guess is within 30s). Our clients have the
means to resend messages when an error occurs, but we may still lose
messages if the outage lasts too long, e.g. if a client is shut down with
messages still pending. I would like to avoid buffering in the web service
as much as possible.

What if you don't merge partitions and simply keep the retired partitions
around until their log segments are rolled and deleted? The only thing you
would have to worry about is preventing producers from sending data to
those partitions, by giving producers a metadata view that doesn't contain
the partitions to be deleted. The drawback is that topic metadata would
differ depending on whether you are a producer or a consumer, which isn't
so nice.
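
As a toy illustration of that idea (this is not Kafka code; the class and
function names are hypothetical and only model the two metadata views):

```python
import zlib
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class TopicMetadata:
    """Hypothetical per-topic metadata with partitions marked for removal."""
    partitions: List[int]
    retiring: Set[int] = field(default_factory=set)

    def consumer_view(self) -> List[int]:
        # Consumers still see every partition so they can drain old data.
        return sorted(self.partitions)

    def producer_view(self) -> List[int]:
        # Producers only see partitions that are not being retired.
        return [p for p in sorted(self.partitions) if p not in self.retiring]


def pick_partition(key: bytes, writable: List[int]) -> int:
    """Key-hash partitioning over the writable partitions only."""
    return writable[zlib.crc32(key) % len(writable)]


if __name__ == "__main__":
    meta = TopicMetadata(partitions=[0, 1, 2, 3], retiring={2, 3})
    print("consumer sees:", meta.consumer_view())
    print("producer sees:", meta.producer_view())
    print("key goes to:", pick_partition(b"product-42", meta.producer_view()))
```

Note that a key which used to hash to a retiring partition now lands on a
different one, so per-key ordering across the change is not preserved by
this sketch. That is probably part of the complexity you mentioned.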

I admit this is probably a much more simplistic view than the problem
deserves...

marc



On Mon, Jan 27, 2014 at 7:24 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Siyuan, Marc:
>
> We are currently working on topic-deletion support (KAFKA-330,
> https://issues.apache.org/jira/browse/KAFKA-330). Would
> first-delete-then-recreate-with-fewer-partitions work for your cases?
> The reason we are trying to avoid shrinking partitions is that it would
> make the logic very complicated. For example, we would need to think
> about the within-partition ordering guarantee when partitions are being
> merged while producing is still in progress.
>
> Guozhang
>
>
> On Mon, Jan 27, 2014 at 12:35 PM, Marc Labbe <mrla...@gmail.com> wrote:
>
> > I have the same need, and I've just created a Jira:
> > https://issues.apache.org/jira/browse/KAFKA-1231
> >
> > The reason is that our topics are created on a per-product basis, and
> > each of them usually starts big during the initial weeks and gradually
> > shrinks over time (1-2 years).
> >
> > thanks
> > marc
> >
> >
> > On Thu, Dec 5, 2013 at 7:45 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> >
> > > Hi Siyuan,
> > >
> > > We do not have a tool to shrink the number of partitions (if that is
> > > what you want) for a topic at runtime yet. Could you file a JIRA for
> > > this?
> > >
> > > Guozhang
> > >
> > >
> > > On Thu, Dec 5, 2013 at 2:16 PM, hsy...@gmail.com <hsy...@gmail.com> wrote:
> > >
> > > > Hi guys,
> > > >
> > > > I found there is a tool to add partitions on the fly. My question
> > > > is, is there a way to delete a partition at runtime? Thanks!
> > > >
> > > > Best,
> > > > Siyuan
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>
