Hi Ashwin,

Regarding point 4--I think this is still a bit unwise. When workers pick up
a new connector config from the config topic, they participate in a
rebalance. It may be safe to write offsets during that rebalance, but in
the name of simplicity, do you think we can write the offsets for the
connector before its config? The sequence of steps would be something like
this:

1. Validate offsets and connector config (can be done in any order)
2. Write offsets
3. Write connector config (with whatever initial state is specified in the
request, or the default if none is specified)

Cheers,

Chris

On Wed, Dec 6, 2023 at 9:13 AM Ashwin <apan...@confluent.io.invalid> wrote:

> Hello Chris,
>
> Thanks for the quick and detailed review. Please see my responses below
>
> High-level thoughts:
>
> 1. I had not thought of this till now, thanks for pointing it out. I
> would lean towards the second option of cleaning previous offsets as
> it will result in the fewer surprises for the user.
>
> 2 and 3. I agree and have updated the wiki to that effect. I just
> wanted to use the connector name as a mutex - we can handle the race
> in other ways.
>
> 4. Yes, I meant submitting config to the config topic. Have updated
> the wiki to make it clearer.
>
>
> Nits:
>
> Thanks for pointing these - I have made the necessary changes.
>
>
> Thanks again,
>
> Ashwin
>
>
> On Mon, Dec 4, 2023 at 9:05 PM Chris Egerton <chr...@aiven.io.invalid>
> wrote:
>
> > Hi Ashwin,
> >
> > Thanks for the KIP! This would be a nice simplification to the process
> for
> > migrating connectors enabled by KIP-980, and would also add global
> support
> > for a feature I've seen implemented by hand in at least a couple
> connectors
> > over the years.
> >
> >
> > High-level thoughts:
> >
> > 1. If there are leftover offsets from a previous connector, what will the
> > impact on those offsets (if any) be if a new connector is created with
> the
> > same name with initial offsets specified? I can think of at least two
> > options: we leave those offsets as-are but allow any of the initial
> offsets
> > in the new connector request to overwrite them, or we automatically wipe
> > all existing offsets for the connector first before writing its initial
> > offsets and then creating it. I have a slight preference for the first
> > because it's simpler to implement and aligns with existing precedent for
> > offset handling where we never wipe them unless explicitly requested by
> the
> > user or connector, but it could be argued that the second is less likely
> to
> > generate footguns for users. Interested in your thoughts!
> >
> > 2. IMO preflight validation (i.e., the "Validate initial_offset before
> > creating the connector in STOPPED state rejected alternative) is a
> > must-have for this kind of feature. I acknowledge the possible race
> > condition (and I don't think it's worth it to track in-flight connector
> > creation requests in the herder in order to prevent this race, since
> > ever-blocking Connector operations would be cumbersome to deal with), and
> > the extra implementation effort. But I don't think either of those tip
> the
> > scales far enough to override the benefit of ensuring the submitted
> offsets
> > are valid before creating the connector.
> >
> > 3. On the topic of preflight validation--I also think it's important that
> > we validate both the connector config and the initial offsets before
> either
> > creating the connector or storing the initial offsets. I don't think this
> > point requires any changes to the KIP that aren't already proposed with
> > point 2. above, but wanted to see if we could adopt this as a primary
> goal
> > for the design of the feature and keep it in mind with future changes.
> >
> > 4. In the proposed changes section, the first step is "Create a connector
> > in STOPPED state", and the second step is "Validate the offset...". What
> > exactly is entailed by "Create a connector"? Submit the config to the
> > config topic (presumably after a preflight validation of the config)?
> > Participate in the ensuing rebalance? I'm a little hesitant to start
> > performing rebalances inside REST requests, wondering if we can find a
> > lighter-weight way to implement this.
> >
> >
> > Nits:
> >
> > 5. You can remove the italicized "This page is meant as a template for
> > writing a KIP" section after the table of contents.
> >
> > 6. Can you file a JIRA ticket for this change and add a link to it in the
> > KIP under the status section?
> >
> > 7. What do you think about changing the name for the new field from
> > "initial_offset" (singular) to "initial_offsets" (plural)? This is
> > especially relevant for connectors that read from multiple source
> > partitions, like MM2 and the various JDBC source connectors out there.
> >
> > 8. IMO it's not necessary to include this section: "Please note that sink
> > and source connectors have different schemas for offset." While it's
> > technically true that the fields of the objects inside the "partition"
> and
> > "offset" fields will likely differ between sink and source connectors,
> > they'll also likely differ in the exact same way across different sink
> > connectors. I think it's enough to link to the relevant section in
> KIP-875
> > on the details for the format.
> >
> > 9. Instead of "Connector-defined source partition" and "Connector-defined
> > source offset" in the comments for the sample connector creation body
> > (which aren't strictly accurate for sink connectors), can we say
> something
> > like "Source partition" and "Desired initial offset"?
> >
> > 10. In the compatibility section the KIP states that "This new feature
> will
> > use the current OffsetStorageWriter." IMO we should refrain from
> referring
> > to internal API class names in KIPs when possible, since those class
> names
> > may change and users may also mistakenly assume that they're part of the
> > public API. Can we say something like "This new feature will reuse
> existing
> > logic for storing offsets" instead?
> >
> >
> > Thanks again for the KIP!
> >
> > Cheers,
> >
> > Chris
> >
> > On Thu, Nov 30, 2023 at 2:26 AM Ashwin <apan...@confluent.io.invalid>
> > wrote:
> >
> > > Hi all,
> > >
> > > I'd like to begin discussion on KIP-995 which proposes to allow users
> > > to specify initial offset as part of the request to create a connector
> > >
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-995%3A+Allow+users+to+specify+initial+offsets+while+creating+connectors
> > >
> > > During the discussion for KIP-980
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-980%3A+Allow+creating+connectors+in+a+stopped+state
> > > >,
> > > which proposed the creation of connectors in STOPPED state, there was
> > > a suggestion to also allow setting the initial offset for a connector
> > > in the connector creation API. The proposal was deemed valid
> > > <https://lists.apache.org/thread/tq79l3975hy69wsycvwoh9n8gbp5j26f>
> > > (point no.4) but was deferred to a future KIP. This KIP proposes to
> > > implement that change and builds upon the changes introduced in
> > > KIP-875 <
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-875%3A+First-class+offsets+support+in+Kafka+Connect#KIP875:FirstclassoffsetssupportinKafkaConnect-Alteringoffsets(request)
> > > >
> > > and KIP-980 <
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-980%3A+Allow+creating+connectors+in+a+stopped+state
> > > >.
> > >
> > > Thanks,
> > > Ashwin
> > >
> >
>

Reply via email to