Mickael, have you had some time to review this by any chance?

On Tue, Jun 20, 2023 at 5:23 PM Viktor Somogyi-Vass <
viktor.somo...@cloudera.com> wrote:

> Hey all,
>
> I'd like to revive this discussion. I've created
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-879%3A+Multi-level+Rack+Awareness
> last November and it seems there is a nice overlap between the two, so
> it would be good to merge them. Should we revive KIP-660 and merge the
> two KIPs?
> If you don't have time for this currently, Mickael, I'm happy to take it
> over from you and merge the two interfaces; they seem quite similar to
> each other (and to the current internal interface).
>
> Best,
> Viktor
>
> On Tue, May 31, 2022 at 3:57 PM Mickael Maison <mickael.mai...@gmail.com>
> wrote:
>
>> Hi Vikas,
>>
>> You make some very good points and most importantly I agree that being
>> able to prevent putting new partitions on a broker should be part of
>> Kafka itself and not require a plugin.
>>
>> This feature would address 2 out of the 3 scenarios mentioned in the
>> motivation section. The last one "When adding brokers to a cluster,
>> Kafka currently does not necessarily place new partitions on new
>> brokers" is clearly less important.
>>
>> So I think I'll retire this KIP and I'll follow up with a new KIP to
>> focus on that feature.
>>
>> Thanks,
>> Mickael
>>
>>
>> On Mon, May 9, 2022 at 8:11 PM Vikas Singh <vi...@confluent.io.invalid>
>> wrote:
>> >
>> > Hi Mickael,
>> >
>> > It's a nice proposal. It's appealing to have a pluggable way to
>> > override default Kafka placement decisions, and the motivation
>> > section lists some of them. Here are a few comments:
>> >
>> > * The motivation section has "When adding brokers to a cluster, Kafka
>> > currently does not necessarily place new partitions on new brokers".
>> > I am not sure how valuable doing this will be. A newly created Kafka
>> > topic takes time to reach the same usage level as existing topics,
>> > say because the topic is created by a new workload that is being
>> > onboarded, or the expansion was done to relieve disk pressure on
>> > existing nodes, etc. While new topics catch up to the existing
>> > workload, the new brokers are not sharing equal load in the cluster,
>> > which probably defeats the purpose of adding new brokers. In
>> > addition, clustering new topics on new brokers like this has
>> > implications from a fault-domain perspective. A reasonable way to
>> > approach it is indeed to use CruiseControl to move things around so
>> > that the newly added nodes become immediately involved and share
>> > cluster load.
>> > * Regarding "When administrators want to remove brokers from a cluster,
>> > there is no way to prevent Kafka from placing partitions on them", this
>> is
>> > indeed an issue. I would argue that this is needed by everyone and
>> should
>> > be part of Kafka, instead of being implemented as part of a plugin
>> > interface by multiple teams.
>> > * For "When some brokers are near their storage/throughput limit, Kafka
>> > could avoid putting new partitions on them", while this can help relieve
>> > short term overload I think again the correct solution here is something
>> > like CruiseControl where the system is monitored and things moved
>> around to
>> > maintain a balanced cluster. A new topic will not take any disk space,
>> so
>> > placing them anywhere normally isn't going to add to the storage
>> overload.
>> > Similar to the previous case, maybe a mechanism in Kafka to put nodes
>> in a
>> > quarantine state is a better way to approach this.
>> >
>> > In terms of the proposed api, I have a couple of comments:
>> >
>> > * It is not clear if the proposal applies to partitions of new topics
>> > or to the addition of partitions to an existing topic. Explicitly
>> > stating that will be helpful.
>> > * Regarding the part "To address the use cases identified in the
>> > motivation section, some knowledge about the current state of the
>> > cluster is necessary. Details whether a new broker has just been
>> > added or is being decommissioned are not part of the cluster
>> > metadata. Therefore such knowledge has to be provided via an external
>> > means to the ReplicaPlacer, for example via the configuration", it's
>> > not clear how this will be done. If I have to implement this
>> > interface, it will be helpful to have clear guidance/examples here
>> > that hopefully tie to the use cases in the motivation section. It
>> > also allows us to figure out if the proposed interface is complete
>> > and helps future implementers of the interface.
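>> >
>> > For instance, I could imagine an implementation along these lines.
>> > This is purely a sketch: the class name, the
>> > "replica.placer.excluded.brokers" key, and the helper method are all
>> > hypothetical and not part of the KIP, and only the
>> > configuration-handling side is shown.
>> >
>> > import java.util.Arrays;
>> > import java.util.List;
>> > import java.util.Map;
>> > import java.util.Set;
>> > import java.util.stream.Collectors;
>> > import org.apache.kafka.common.Configurable;
>> >
>> > // Sketch: a placer that keeps operator-listed brokers out of new
>> > // placement decisions, e.g. during a decommission.
>> > public class ExcludingReplicaPlacer implements Configurable {
>> >     private Set<Integer> excluded = Set.of();
>> >
>> >     @Override
>> >     public void configure(Map<String, ?> configs) {
>> >         // Hypothetical key carrying the external input: the broker
>> >         // ids the operator wants to exclude from placement.
>> >         Object value = configs.get("replica.placer.excluded.brokers");
>> >         if (value != null) {
>> >             excluded = Arrays.stream(value.toString().split(","))
>> >                     .map(String::trim)
>> >                     .map(Integer::valueOf)
>> >                     .collect(Collectors.toSet());
>> >         }
>> >     }
>> >
>> >     // The actual place() method (see the KIP for its signature) would
>> >     // then pick replicas only from this filtered candidate list.
>> >     List<Integer> eligibleBrokers(List<Integer> registeredBrokers) {
>> >         return registeredBrokers.stream()
>> >                 .filter(id -> !excluded.contains(id))
>> >                 .collect(Collectors.toList());
>> >     }
>> > }
>> >
>> > Spelling out something like this in the KIP would also show whether
>> > the configuration mechanism is sufficient for the decommissioning use
>> > case.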
>> >
>> > A couple of minor comments:
>> > * The KIP is not listed in the main KIP page (
>> > https://cwiki-test.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>> > ). Can you please add it there?
>> > * The page has "This is especially true for the 4 scenarios listed in
>> > the Motivation section", but there are only 3 scenarios listed.
>> >
>> > Regards,
>> > Vikas
>> >
>> >
>> > On Tue, May 3, 2022 at 5:51 PM Colin McCabe <cmcc...@apache.org> wrote:
>> >
>> > > Hi Mickael,
>> > >
>> > > We did discuss this earlier, and I remember not being too enthusiastic
>> > > about a pluggable policy here :)
>> > >
>> > > There have been several changes to the placement code in the last
>> > > few weeks. (These are examples of the kind of changes that are
>> > > impossible to do once an API is established, by the way.) Can you
>> > > please revise the KIP to take these into account?
>> > >
>> > > I'd also like to understand a little bit better why we need this
>> > > API when we have the explicit placement API for createTopics and
>> > > createPartitions. Can you give me a few scenarios where the manual
>> > > placement API would be insufficient?
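>> > >
>> > > (To be concrete, this is what I mean by explicit placement: passing
>> > > the assignment directly at creation time. A quick sketch; the topic
>> > > name, broker ids, and bootstrap address are made up.)
>> > >
>> > > import java.util.List;
>> > > import java.util.Map;
>> > > import org.apache.kafka.clients.admin.Admin;
>> > > import org.apache.kafka.clients.admin.NewTopic;
>> > >
>> > > public class ManualPlacementExample {
>> > >     public static void main(String[] args) throws Exception {
>> > >         // Explicit placement: partition id -> replica broker ids.
>> > >         Map<Integer, List<Integer>> assignment = Map.of(
>> > >                 0, List.of(1, 2, 3),
>> > >                 1, List.of(2, 3, 4));
>> > >         try (Admin admin = Admin.create(
>> > >                 Map.<String, Object>of("bootstrap.servers", "localhost:9092"))) {
>> > >             admin.createTopics(
>> > >                 List.of(new NewTopic("my-topic", assignment))).all().get();
>> > >         }
>> > >     }
>> > > }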
>> > >
>> > > best,
>> > > Colin
>> > >
>> > >
>> > > On Mon, May 2, 2022, at 09:28, Mickael Maison wrote:
>> > > > Hi,
>> > > >
>> > > > If there are no further comments, I'll start a vote in the next
>> > > > few days.
>> > > >
>> > > > Thanks,
>> > > > Mickael
>> > > >
>> > > > On Wed, Mar 30, 2022 at 3:51 AM Luke Chen <show...@gmail.com> wrote:
>> > > >>
>> > > >> Hi Mickael,
>> > > >>
>> > > >> Thanks for the update.
>> > > >> It answered my questions!
>> > > >>
>> > > >> Thank you.
>> > > >> Luke
>> > > >>
>> > > >> On Wed, Mar 30, 2022 at 12:09 AM Mickael Maison <mickael.mai...@gmail.com> wrote:
>> > > >>
>> > > >> > Hi Luke,
>> > > >> >
>> > > >> > Thanks for the feedback.
>> > > >> >
>> > > >> > 1. Thanks, fixed!
>> > > >> > 2. Yes, that's right. It's the same behavior as for topic
>> > > >> > policies.
>> > > >> > 3. I've added details about how the mentioned scenarios could
>> > > >> > be addressed. The information required to make such decisions
>> > > >> > is not part of the Kafka cluster metadata, so an external input
>> > > >> > is necessary. This KIP does not propose a specific mechanism
>> > > >> > for doing it.
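>> > > >> >
>> > > >> > For example, an implementation could take an operator-maintained
>> > > >> > exclusion list through its configuration, along these lines
>> > > >> > (the second key is illustrative only, not defined by this KIP):
>> > > >> >
>> > > >> >     # controller configuration (sketch)
>> > > >> >     replica.placer.class.name=com.example.ExcludingReplicaPlacer
>> > > >> >     # hypothetical plugin-specific setting
>> > > >> >     replica.placer.excluded.brokers=4,5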
>> > > >> >
>> > > >> > I hope this answers your questions.
>> > > >> >
>> > > >> > Thanks,
>> > > >> > Mickael
>> > > >> >
>> > > >> >
>> > > >> > On Tue, Mar 29, 2022 at 5:42 PM Mickael Maison <mickael.mai...@gmail.com> wrote:
>> > > >> > >
>> > > >> > > Hi Ryanne,
>> > > >> > >
>> > > >> > > That's a good point!
>> > > >> > >
>> > > >> > > There's no value in having all implementations perform the
>> > > >> > > same sanity checks. If the replication factor is < 1 or
>> > > >> > > larger than the current number of registered brokers, the
>> > > >> > > controller should directly throw
>> > > >> > > InvalidReplicationFactorException and not call the
>> > > >> > > ReplicaPlacer. I've updated the KIP so the place() method now
>> > > >> > > only throws ReplicaPlacementException.
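>> > > >> > >
>> > > >> > > In other words, the contract now looks roughly like this (a
>> > > >> > > sketch with simplified parameter and return types; the KIP
>> > > >> > > has the authoritative definition):
>> > > >> > >
>> > > >> > >     import java.io.Closeable;
>> > > >> > >     import java.util.List;
>> > > >> > >     import org.apache.kafka.common.Configurable;
>> > > >> > >
>> > > >> > >     public interface ReplicaPlacer extends Configurable, Closeable {
>> > > >> > >         // Returns the broker ids chosen for each new partition.
>> > > >> > >         // By the time this is called, the controller has already
>> > > >> > >         // validated the replication factor, so implementations
>> > > >> > >         // only signal placement failures.
>> > > >> > >         List<List<Integer>> place(int numPartitions, short replicationFactor)
>> > > >> > >             throws ReplicaPlacementException;
>> > > >> > >     }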
>> > > >> > >
>> > > >> > > Thanks,
>> > > >> > > Mickael
>> > > >> > >
>> > > >> > >
>> > > >> > >
>> > > >> > > On Mon, Mar 28, 2022 at 6:20 PM Ryanne Dolan <ryannedo...@gmail.com> wrote:
>> > > >> > > >
>> > > >> > > > Wondering about InvalidReplicationFactorException. Why
>> > > >> > > > would an implementation throw this? Given the information
>> > > >> > > > passed to the method, seems like this could only be thrown
>> > > >> > > > if there were obviously invalid arguments, like a negative
>> > > >> > > > number or zero. Can we just guarantee such invalid
>> > > >> > > > arguments aren't passed in?
>> > > >> > > >
>> > > >> > > > Ryanne
>> > > >> > > >
>> > > >> > > > On Sat, Mar 26, 2022, 8:51 AM Luke Chen <show...@gmail.com> wrote:
>> > > >> > > >
>> > > >> > > > > Hi Mickael,
>> > > >> > > > >
>> > > >> > > > > Thanks for the KIP!
>> > > >> > > > > It's indeed a pain point for the Kafka admins.
>> > > >> > > > >
>> > > >> > > > > I have some comments:
>> > > >> > > > > 1. Typo in the motivation section: "When administrators
>> > > >> > > > > [when to] remove brokers from a cluster, ..." (presumably
>> > > >> > > > > this should read "want to").
>> > > >> > > > > 2. If different `replica.placer.class.name` configs are
>> > > >> > > > > set in all controllers, I think only the config for the
>> > > >> > > > > "active controller" will take effect, right?
>> > > >> > > > > 3. Could you explain more about how the proposal fixes
>> > > >> > > > > some of the scenarios you listed, e.g. the newly added
>> > > >> > > > > broker case? How could we know the broker is newly added?
>> > > >> > > > > I guess it's by checking the broker load via some metrics
>> > > >> > > > > dynamically, right?
>> > > >> > > > >
>> > > >> > > > >
>> > > >> > > > > Thank you.
>> > > >> > > > > Luke
>> > > >> > > > >
>> > > >> > > > > On Fri, Mar 18, 2022 at 10:30 AM Ryanne Dolan <ryannedo...@gmail.com> wrote:
>> > > >> > > > >
>> > > >> > > > > > Thanks Mickael, this makes sense to me! I've been
>> > > >> > > > > > wanting something like this in order to decommission a
>> > > >> > > > > > broker without new partitions getting accidentally
>> > > >> > > > > > assigned to it.
>> > > >> > > > > >
>> > > >> > > > > > Ryanne
>> > > >> > > > > >
>> > > >> > > > > > On Thu, Mar 17, 2022, 5:56 AM Mickael Maison <mickael.mai...@gmail.com> wrote:
>> > > >> > > > > >
>> > > >> > > > > > > Hi,
>> > > >> > > > > > >
>> > > >> > > > > > > I'd like to start a new discussion on KIP-660. I
>> > > >> > > > > > > originally wrote this KIP in 2020 and the initial
>> > > >> > > > > > > discussion (
>> > > >> > > > > > > https://lists.apache.org/thread/xn7xyb74nyt281brto4x28r9rzxm4lp9
>> > > >> > > > > > > ) raised some concerns, especially around KRaft
>> > > >> > > > > > > (which did not exist at that time) and scalability.
>> > > >> > > > > > >
>> > > >> > > > > > > Since then, we got a new KRaft controller, so I've
>> > > >> > > > > > > been able to revisit this KIP. I kept the KIP number
>> > > >> > > > > > > as it's essentially the same idea, but the proposal
>> > > >> > > > > > > is significantly different:
>> > > >> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-660%3A+Pluggable+ReplicaPlacer
>> > > >> > > > > > >
>> > > >> > > > > > > Please take a look and let me know if you have any
>> > > >> > > > > > > feedback.
>> > > >> > > > > > >
>> > > >> > > > > > > Thanks,
>> > > >> > > > > > > Mickael
>> > > >> > > > > > >
>> > > >> > > > > >
>> > > >> > > > >
>> > > >> >
>> > >
>>
>
