Re: [DISCUSS] KIP-883: Add delete callback method to Connector API

Hector Geraldino (BLOOMBERG/ 919 3RD A) Wed, 16 Nov 2022 12:45:56 -0800

Hi Sagar,

Thanks for your feedback! I actually renamed the method from "deleted()" to 
"destroyed()", which I think conveys the intention more clearly. I can 
certainly rename it to be 'onDeleted()', although I feel any method named 
onXXX() belongs to a listener class :)


Regarding failure scenarios, an option I'm considering is to just provide an 
overloaded Connector#stop(boolean deleted) method that is called during 
WorkerConnector#doShutdown(). This has the advantage of providing the same 
semantics that the current Connector#stop() has, with the caveat that the API 
won't be as expressive. Also, the extra 'cleanup' bits that were supposed to 
happen when a connector is deleted might not to happen at all if the connector 
doesn't stop before the configured timeout (and is therefore cancelled).

At this point I think the simplest option would be to provide an overloaded 
method (with a default implementation) that connectors can override. Wdyt?

From: dev@kafka.apache.org At: 11/15/22 11:40:26 UTC-5:00To:  
dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-883: Add delete callback method to Connector API

Hey Hector,

Thanks for the KIP. I have a minor suggestion in terms of naming. Since
this is a callback method, would it make sense to call it onDelete()?

Also, the failure scenarios discussed by Greg would need handling. Among
other things, I like the idea of having a timeout for graceful shutdown or
else try a force shutdown. What do you think about that approach?

Thanks!
Sagar.

On Sat, Nov 12, 2022 at 1:53 AM Hector Geraldino (BLOOMBERG/ 919 3RD A) <
hgerald...@bloomberg.net> wrote:

> Thanks Greg for taking your time to review not just the KIP but also the
> PR.
>
> 1. You made very valid points regarding the behavior of the destroy()
> callback for connectors that don't follow the happy path. After thinking
> about it, I decided to tweak the implementation a bit and have the
> destroy() method be called during the worker shutdown: this means it will
> share the same guarantees the connector#stop() method has. An alternative
> implementation can be to have an overloaded connector#stop(boolean deleted)
> method that signals a connector that it is being stopped due to deletion,
> but I think that having a separate destroy() method provides clearer
> semantics.
>
> I'll make sure to ammend the KIP with these details.
>
> 3. Without going too deep on the types of operations that can be performed
> by a connector when it's being deleted, I can imagine the
> org.apache.kafka.connect.source.SourceConnector base class having a default
> implementation that deletes the connector's offsets automatically
> (controlled by a property); this is in the context of KIP-875 (first-class
> offsets support in Kafka Connect). Similar behaviors can be introduced for
> the SinkConnector, however I'm not sure if this KIP is the right place to
> discuss all the possibilities, or if we shoold keeping it more
> narrow-focused on  providing a callback mechanism for when connectors are
> deleted, and what the expectations are around this newly introduced method.
> What do you think?
>
>
> From: dev@kafka.apache.org At: 11/09/22 16:55:04 UTC-5:00To:
> dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-883: Add delete callback method to Connector API
>
> Hi Hector,
>
> Thanks for the KIP!
>
> This is certainly missing functionality from the native Connect framework,
> and we should try to make it possible to inform connectors about this part
> of their lifecycle.
> However, as with most functionality that was left out of the initial
> implementation of the framework, the details are more challenging to work
> out.
>
> 1. What happens when the destroy call throws an error, how does the
> framework respond?
>
> This is unspecified in the KIP, and it appears that your proposed changes
> could cause the herder to fail.
> From the perspective of operators & connector developers, what is a
> reasonable expectation to have for failure of a destroy?
> I could see operators wanting both a graceful-delete to make use of this
> new feature, and a force-delete for when the graceful-delete fails.
> A connector developer could choose to swallow all errors encountered, or
> fail-fast to indicate to the operator that there is an issue with the
> graceful-delete flow.
> If the alternative is crashing the herder, connector developers may choose
> to hide serious errors, which is undesirable.
>
> 2. What happens when the destroy() call takes a long time to complete, or
> is interrupted?
>
> It appears that your implementation serially destroy()s each appropriate
> connector, and may prevent the herder thread from making progress while the
> operation is ongoing.
> We have previously had to patch Connect to perform all connector and task
> operations on a background thread, because some connector method
> implementations can stall indefinitely.
> Connect also has the notion of "cancelling" a connector/task if a graceful
> shutdown timeout operation takes too long. Perhaps some of that design or
> machinery may be useful to protect this method call as well.
>
> More specific to the destroy() call itself, what happens when a connector
> completes part of a destroy operation and then cannot complete the
> remainder, either due to timing out or a worker crashing?
> What is the contract with the connector developer about this method? Is the
> destroy() only started exactly once during the lifetime of the connector,
> or may it be retried?
>
> 3. What should be considered a reasonable custom implementation of the
> destroy() call? What resources should it clean up by default?
>
> I think we can broadly categorize the state a connector mutates among the
> following
> * Framework-managed state (e.g. source offsets, consumer offsets)
> * Implementation detail state (e.g. debezium db history topic, audit
> tables, temporary accounts)
> * Third party system data (e.g. the actual data being written by a sink
> connector)
> * Third party system metadata (e.g. tables in a database, delivery
> receipts, permissions)
>
> I think it's apparent that the framework-managed state cannot/should not be
> interacted with by the destroy() call. However, the framework could be
> changed to clean up these resources at the same time that destroy() is
> called. Is that out-of-scope of this proposal, and better handled by manual
> intervention?
> From the text of the KIP, I think it explicitly includes the Implementation
> detail state, which should not be depended on externally and should be safe
> to clean up during a destroy(). I think this is completely reasonable.
> Are the third-party data and metadata out-of-scope for this proposal? Can
> we officially recommend against it, or should we accommodate users and
> connector developers that wish to clean up data/metadata during destroy()?
>
> 4. How should connector implementations of destroy handle backwards
> compatibility?
>
> In terms of backward-compatibility for the framework vs connector versions,
> I think the default-noop method is very reasonable.
> However, what happens when someone upgrades from a version of a connector
> without a destroy() implementation to one with an implementation, and
> maintain backwards compatibility?
> To replicate the same behavior, the connector might include something like
> an `enable.cleanup` config which allows users to opt-in to the new
> behavior. This could mean the proliferation of many different
> configurations to handle this behavior.
> Maybe we can provide some recommendations to developers, or some mechanism
> to standardize this opt-in behavior.
>
> I'm interested to hear if you have any experience with the above, if you've
> experimented with this feature in your fork.
>
> Thanks,
> Greg
>
>
> On Thu, Nov 3, 2022 at 11:55 AM Hector Geraldino (BLOOMBERG/ 919 3RD A) <
> hgerald...@bloomberg.net> wrote:
>
> > Hi everyone,
> >
> > I've submitted KIP-883, which introduces a callback to the public
> > Connector API called when deleting a connector:
> >
> >
> >
>
> 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-883%3A+Add+delete+callback
> +method+to+Connector+API
> >
> > It adds a new `deleted()` method (open to better naming suggestions) to
> > the org.apache.kafka.connect.connector.Connector abstract class, which
> will
> > be invoked by connect Workers when a connector is being deleted.
> >
> > Feedback and comments are welcome.
> >
> > Thank you!
> > Hector
> >
> >
>
>
>

Re: [DISCUSS] KIP-883: Add delete callback method to Connector API

Reply via email to