Hi Andrew,
Thanks for your comments.
1) Yes, that makes sense, and that's what I would expect to see as well. I
just wanted to highlight that we might still need a way to let client-side
partitioning logic be present as well. Anyway, I am good on this point.
2) The example provided does seem achievable by simply setting the
partition number in the ProducerRecord. I guess if we can't find any
further examples which strengthen the case for this partitioner, it might be
harder to justify adding it.
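To make that concrete, here is a rough sketch of the alternative. The
priority labels and the partition layout are hypothetical and not taken from
the KIP; the application simply picks the partition itself and passes it to
the ProducerRecord constructor that takes an explicit partition. The Kafka
call is only shown in a comment so that the snippet stands alone.

```java
/**
 * Hypothetical helper: maps an application-defined priority label to the
 * partition number that the producer sets explicitly on each record.
 */
final class PriorityPartitionChooser {

    // Assumed contract between producers and consumers (illustrative only):
    // partition 0 carries "high" priority records, partition 1 everything else.
    private static final int HIGH_PRIORITY_PARTITION = 0;
    private static final int DEFAULT_PARTITION = 1;

    static int partitionFor(String priority) {
        // A missing or unrecognised label falls back to the default partition.
        if ("high".equals(priority)) {
            return HIGH_PRIORITY_PARTITION;
        }
        return DEFAULT_PARTITION;
    }

    public static void main(String[] args) {
        // With the Kafka client on the classpath, the result would be used as:
        //   new ProducerRecord<>(topic, partitionFor(priority), key, value, headers)
        // so no custom Partitioner is involved in the decision.
        System.out.println("high -> partition " + partitionFor("high"));
        System.out.println("low  -> partition " + partitionFor("low"));
    }
}
```

Since the partition is set on the record, the configured partitioner is not
consulted for it, and the header can still be attached for consumers to read.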
Thanks!
Sagar.
On Fri, Jul 28, 2023 at 2:05 PM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:
> Hi Sagar,
> Thanks for your comments.
>
> 1) Server-side partitioning doesn’t necessarily mean that there’s only one
> way to do it. It just means that the partitioning logic runs on the broker
> and any configuration of partitioning applies to the broker’s partitioner.
> If we ever see a KIP for this, that’s the kind of thing I would expect to see.
>
> 2) In the priority example in the KIP, there is a kind of contract between
> the producers and consumers so that some records can be processed before
> others regardless of the order in which they were sent. The producer
> wants to apply special significance to a particular header to control which
> partition is used. I would simply achieve this by setting the partition
> number in the ProducerRecord at the time of sending.
>
> I don’t think the KIP proposes adjusting the built-in partitioner or adding
> to AK a new one that uses headers in the partitioning decision. So, any
> configuration for a partitioner that does support headers would be up to the
> implementation of that specific partitioner. The Partitioner interface
> already extends Configurable, so such a partitioner can read its own settings.
>
> I’m just providing an alternative view and I’m not particularly opposed to
> the KIP. I just don’t think it quite merits the work involved to get it
> voted on and merged. As an aside, a long time ago, I created a small KIP that
> was never adopted and I didn’t push it because I eventually didn’t need it.
>
> Thanks,
> Andrew
>
> > On 28 Jul 2023, at 05:15, Sagar wrote:
> >
> > Hey Andrew,
> >
> > Thanks for the review. Since I had reviewed the KIP, I thought I would
> > also respond. Of course, Jack has the final say on this since he wrote
> > the KIP.
> >
> > 1) This is an interesting point and I hadn't considered it. The
> > comparison with KIP-848 is a valid one, but even that KIP allows
> > client-side assignors for power users like Streams. So while we would
> > want to move away from the client-side partitioner as much as possible, we
> > still shouldn't do away with client-side partitioning completely and end
> > up in a state of inflexibility for different kinds of use cases. This is
> > my opinion though, and you have more context on Clients, so I would like
> > to know your thoughts on this.
> >
> > 2) Regarding this, I assumed that since the headers are already part of
> > the consumer records, consumers should have access to the headers, and if
> > there is a contract between the applications producing and the
> > applications consuming, that decision-making should be transparent. Was my
> > assumption incorrect? But as you rightly pointed out, header-based
> > partitioning with keys is going to lead to surprising results. Assuming
> > there is merit in this proposal, do you think we should ignore the keys in
> > this case (similar to the effect of setting the partitioner.ignore.keys
> > config to true) and document it appropriately?
> >
> > Let me know what you think.
> >
> > Thanks!
> > Sagar.
> >
> >
> > On Thu, Jul 27, 2023 at 9:41 PM Andrew Schofield <
> > andrew_schofield_j...@outlook.com> wrote:
> >
> >> Hi Jack,
> >> Thanks for the KIP. I have a few concerns about the idea.
> >>
> >> 1) I think that while a client-side partitioner seems like a neat idea
> >> and it’s an established part of Kafka, it’s one of the things which makes
> >> Kafka clients quite complicated. Just as KIP-848 is moving from
> >> client-side assignors to server-side assignors, I wonder whether we
> >> should really be looking to make partitioning a server-side capability
> >> too over time. So, I’m not convinced that making the Partitioner
> >> interface richer is moving in the right direction.
> >>
> >> 2) For records with a key, the partitioner usually calculates the
> >> partition from the key. This means that records with the same key end up
> >> on the same partition. Many applications expect this to give ordering.
> >> Log compaction expects this. There are situations in which records have
> >> to be repartitioned, such as sometimes happens with Kafka Streams. I
> >> think that a header-based partitioner for records which have keys is
> >> going to be surprising and only going to have limited applicability as a
> >> result.
> >>
> >> The tricky part about clever partitioning is that downstream systems
> >> have no idea how the partition number was arrived at,