Re: [DISCUSS] KIP-1335: Bounded concurrency for partition reassignment via kafka-reassign-partitions.sh

Manan Gupta Mon, 25 May 2026 07:06:39 -0700

Hey TaiJuWu

Thank you for reviewing the KIP; my responses are inline.


> TJ00: If we have multiple batch requests, how do you handle single batch
> failure?
- If a submit step fails, the tool returns immediately with errors and does
not enqueue the rest; partitions already submitted stay under the
controller’s reassignment as they do today.
- The process exits with a TerseException listing the failed partitions and
the error message from the broker/controller (the same pattern as a
single-shot execute when some alters fail).
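
To make the fail-fast behaviour above concrete, here is a rough sketch in
Python. It is not the tool's code: submit_in_batches, the submit callback and
ReassignmentSubmitError (a stand-in for TerseException) are hypothetical names
used only for illustration, and the wait between batches is left out.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Partition = Tuple[str, int]  # (topic, partition index)


class ReassignmentSubmitError(Exception):
    """Hypothetical stand-in for the tool's TerseException."""

    def __init__(self, failures: Dict[Partition, str],
                 submitted: Sequence[Partition]) -> None:
        self.failures = dict(failures)
        self.submitted = list(submitted)
        details = "; ".join(
            f"{topic}-{index}: {message}"
            for (topic, index), message in sorted(self.failures.items())
        )
        super().__init__(f"Error reassigning partition(s): {details}")


def submit_in_batches(
    partitions: Sequence[Partition],
    batch_size: int,
    submit: Callable[[Sequence[Partition]], Dict[Partition, str]],
) -> List[Partition]:
    """Submit partitions batch by batch and stop at the first failing batch.

    `submit` is an assumed callback that sends one batch and returns a map of
    partition -> error message for the alters that failed (empty on success).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    submitted: List[Partition] = []
    for start in range(0, len(partitions), batch_size):
        batch = partitions[start:start + batch_size]
        errors = submit(batch)
        # Partitions accepted in this batch stay with the controller.
        submitted.extend(p for p in batch if p not in errors)
        if errors:
            # Fail fast: later batches are never sent.
            raise ReassignmentSubmitError(errors, submitted)
    return submitted
```

The exception carries both the failed partitions and the ones already
accepted, so a caller can see what is still in flight on the controller.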

> TJ01: If there is a long time operation, how can the users know it still
> running instead of hang?
- Controller / cluster side: ongoing reassignments and replication are
visible through metrics, kafka-reassign-partitions --list, and Admin / JMX.
- Running --verify in another terminal shows progress toward the target.
The batch wait loop is mostly quiet and incremental mode is a bit chattier,
so true progress is best observed from cluster state or --verify rather than
only from stdout during the wait loop.
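
For illustration only, a wait loop that polls cluster state and reports
progress could look roughly like the Python sketch below. The names
wait_for_reassignments, list_in_flight and report are hypothetical, and
list_in_flight merely stands in for whatever lists ongoing reassignments; the
0.5 s default mirrors the 500 ms poll interval mentioned elsewhere in this
thread.

```python
import time
from typing import Callable, Iterable, Set, Tuple

Partition = Tuple[str, int]  # (topic, partition index)


def wait_for_reassignments(
    watched: Iterable[Partition],
    list_in_flight: Callable[[], Iterable[Partition]],
    poll_interval_s: float = 0.5,
    report: Callable[[str], None] = print,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until none of the watched partitions is still being reassigned.

    Returns the number of polls performed. `list_in_flight` is an assumed
    callback that returns the partitions currently under reassignment.
    """
    watched_set: Set[Partition] = set(watched)
    total = len(watched_set)

    remaining = watched_set & set(list_in_flight())
    polls = 1
    while remaining:
        # One progress line per poll so a long wait does not look like a hang.
        report(f"{total - len(remaining)}/{total} reassignments complete")
        sleep(poll_interval_s)
        remaining = watched_set & set(list_in_flight())
        polls += 1

    report(f"{total}/{total} reassignments complete")
    return polls
```

Passing report and sleep as parameters keeps the loop easy to exercise
without a cluster, and lets a caller silence or redirect the progress lines.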

Thanks,
Manan Gupta

On Mon, May 25, 2026 at 6:06 PM TaiJu Wu <[email protected]> wrote:

> Hi Manan,
>
> Thanks for the KIP, just for some question.
>
> TJ00: If we have multiple batch requests, how do you handle single batch
> failure?
>
> TJ01: If there is a long time operation, how can the users know it still
> running instead of hang?
>
> Thanks,
> TaiJuWu
>
>
>
> On Mon, May 18, 2026 at 6:09 PM Manan Gupta <[email protected]> wrote:
>
> > Hey Kamal
> >
> > Thank you for your comments.
> >
> > > Should we have a configurable list poll interval?
> > The current fixed interval of 500 ms should not degrade the controller, but
> > I agree that operators should have an option to change this value. I
> > updated the KIP to take another parameter, reassignment-poll-interval-ms,
> > which overrides the default of 500 ms.
> >
> > > Shall we extend the batching logic to also kafka-leader-election script?
> > Good point, I will pick this up as a separate KIP as a followup to this
> > KIP.
> >
> > Thanks,
> > Manan
> >
> > On Mon, May 18, 2026 at 2:52 PM Kamal Chandraprakash <
> > [email protected]> wrote:
> >
> > > Hi Manan,
> > >
> > > Thanks for improving the user-facing tools! Overall LGTM. Few questions:
> > >
> > > 1. Should we have a configurable list poll interval? With 500ms, does it
> > > poll the controller often to list the currently running reassignments for
> > > large partitions?
> > > 2. Shall we extend the batching logic to also kafka-leader-election script?
> > > It will be useful when running with --all-topic-partitions.
> > >
> > > Thanks,
> > > Kamal
> > >
> > >
> > > On Mon, May 11, 2026 at 8:55 AM Manan Gupta <[email protected]> wrote:
> > >
> > > > Hello
> > > >
> > > > Gentle reminder to review the KIP.
> > > >
> > > > Thanks,
> > > > Manan
> > > >
> > > > On Wed, May 6, 2026 at 7:52 PM Manan Gupta <[email protected]> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > This email starts the discussion thread for *KIP-1335: Bounded
> > > > > concurrency for partition reassignment via kafka-reassign-partitions.sh*.
> > > > > The proposal adds optional reassignment-batch-size and incremental
> > > > > parameters to kafka-reassign-partitions.sh so operators can cap how many
> > > > > partition reassignments are submitted or kept in flight at once using
> > > > > the existing Admin API.
> > > > >
> > > > > I would appreciate your initial thoughts and feedback on the proposal.
> > > > >
> > > > > https://cwiki.apache.org/confluence/x/8ZAmGQ
> > > > >
> > > > > Thanks,
> > > > > Manan
> > > > >
> > > >
> > >
> >
>
