Hi Chris and Randall,

I can see that for connectors where exactly once is configuration-dependent
it makes sense to use a default method. The problem with having an explicit
UNKNOWN case is we really want connector developers to _not_ use it. That
could mean it's deprecated from the start. Alternatively we could omit it
from the enum and use null to mean unknown (we'd have to check for a null
result anyway), with the contract for the method being that it should
return non-null. Of course, this doesn't remove the ambiguous case, but it
avoids the need to remove UNKNOWN in the future.
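
To make the null-as-unknown idea concrete, here is a rough sketch. All of
the names (ExactlyOnceSupport, SourceConnectorLike, declaredSupport) are
placeholders I made up for illustration, not the API proposed in the KIP:

```java
import java.util.Optional;

public class ExactlyOnceSupportSketch {

    /** Hypothetical enum: deliberately has no UNKNOWN constant. */
    public enum ExactlyOnceSupport { SUPPORTED, UNSUPPORTED }

    /** Hypothetical stand-in for the connector API under discussion. */
    public interface SourceConnectorLike {
        // Contract: implementations should return non-null. The default
        // returns null only so that existing connectors keep compiling.
        default ExactlyOnceSupport exactlyOnceSupport() {
            return null;
        }
    }

    /** What the runtime would do: treat a null result as "unknown". */
    public static Optional<ExactlyOnceSupport> declaredSupport(SourceConnectorLike connector) {
        return Optional.ofNullable(connector.exactlyOnceSupport());
    }
}
```

The runtime has to do the null check either way, so the only difference
from an explicit UNKNOWN is that there is no constant for developers to
reach for.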

I think there's another way for a worker to use the value too: Imagine
you're deploying a connector that you need to be exactly once. It's awkward
to have to query the REST API to determine whether exactly once is working,
especially if you need to do this after config changes too. What you
actually want is to make an EOS assertion, via a connector config (e.g.
require.exactly.once=true, or perhaps exactly.once=required/not_required),
which would fail the connector/task if exactly once could not be provided.
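
Roughly, the check I have in mind would look something like the sketch
below. The config name exactly.once, its values, the not_required default
and the checkExactlyOnce helper are all assumptions for the sake of
illustration; a null "declared" value stands for a connector that gave no
answer:

```java
import java.util.Map;

public class ExactlyOnceAssertionSketch {

    /** Hypothetical enum; null is used to mean "unknown". */
    public enum ExactlyOnceSupport { SUPPORTED, UNSUPPORTED }

    /** Hypothetical connector config name. */
    public static final String EXACTLY_ONCE_CONFIG = "exactly.once";

    /**
     * Fails if the connector config asserts EOS but the connector has not
     * declared that it supports it (UNSUPPORTED or unknown/null).
     */
    public static void checkExactlyOnce(Map<String, String> connectorConfig,
                                        ExactlyOnceSupport declared) {
        // Assumed default: no assertion is made unless the user asks for one.
        String requirement = connectorConfig.getOrDefault(EXACTLY_ONCE_CONFIG, "not_required");
        switch (requirement) {
            case "not_required":
                // No guarantee requested; the runtime may still be transactional.
                return;
            case "required":
                if (declared != ExactlyOnceSupport.SUPPORTED) {
                    throw new IllegalStateException(
                        "Connector requires exactly once but declared support is " + declared);
                }
                return;
            default:
                throw new IllegalArgumentException(
                    "Invalid value for " + EXACTLY_ONCE_CONFIG + ": " + requirement);
        }
    }
}
```

The worker would run this when the connector is created and again on each
config change, so the user never has to poll the REST API to find out.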

The not_required case wouldn't disable the transactional runtime
environment; it would simply not guarantee that EOS was being provided. It
would, though, leave the door open to supporting mixed
EOS/non-transactional deployments in the cluster in the future, if that
became possible (i.e. we could retrospectively make not_required mean no
transactions).

On the subject of why it's not possible to enable exactly once on a
per-connector basis: Is the problem here simply that the zombie fencing
provided by the producer is only available when using transactions, and
therefore having a non-transactional producer in the cluster poses a risk
of a zombie not being fenced? This makes me wonder whether there's a case
for a producer with zombie fencing that is not transactional (intermediate
between idempotent and transactional producer). IIUC this would need to
make an InitProducerId request and use the PID in produce requests, but
could dispense with the other transactional RPCs. If such a thing existed
would the zombie fencing it provided be sufficient to provide safe
semantics for running a non-EOS connector in an EOS-capable cluster?

The endpoint for zombie fencing: It's not described how this works when
exactly.once.source.enabled=false.

Cheers,

Tom
