Hi Greg,
Thanks for replying.

> From your description, it sounds like you want the success/failure of
> a callback to influence whether that record (and later records) are
> present in the topic. Is this correct?
Yes

> The solution that you posted does actually write a record that has an
> erroneous callback, is that desirable, or would you want that record
> to also be rejected?
The source code demonstrated the original problem, not the solution.
What I want is: once an exception is present in producer callback, I would
assume the record is not delivered to the broker and I would like all
records after the exception to be suspended from sending to the broker so
that all records already delivered to the broker is strictly ordered after
the producer was closed.
In the source code demo, the former batch met an exception, but the latter
batch was sent successfully even though I initiated the producer closing
operation immediately when I detected the exception. That is not what I
want.
As for "does actually write a record that has an erroneous callback" you
said, I noticed this problem, but this is not the key point of my problem
so I did not mention it.

As for transactional producers, transactional producers are not suited to
my user case. My data is database CDC(change data capture) data which
should maintain order strictly(similar to debezium project). There is no
need for me to use a transactional producer.

> I think you should carefully consider throwing delivery-critical
> errors from the callback, as that is not a common workflow. Could
> those errors be moved to a different part of the pipeline, such as the
> consumer application?
The problem is not related to the consumer side. I do want to achieve
Exactly Once delivery. I previously thought idempotent producer could be a
solution, but I later found that idempotent producer could only guarantee
ordering when kafka is retrying producer batch internally.

Thanks and regards,
William

Greg Harris <greg.har...@aiven.io.invalid> 于2024年3月12日周二 00:50写道:

> Hi William,
>
> From your description, it sounds like you want the success/failure of
> a callback to influence whether that record (and later records) are
> present in the topic. Is this correct?
> The solution that you posted does actually write a record that has an
> erroneous callback, is that desirable, or would you want that record
> to also be rejected?
>
> This sounds like a use-case for transactional producers [1] utilizing
> Exactly Once delivery. You can start a transaction, emit records, have
> them persisted in Kafka, perform some computation afterwards, and then
> decide whether to commit or abort the transaction based on the result
> of that computation.
>
> There is also a performance penalty to transactional producers, but it
> is different from the max.in.flight.requests.per.connection bottleneck
> and not directly comparable.
> I think you should carefully consider throwing delivery-critical
> errors from the callback, as that is not a common workflow. Could
> those errors be moved to a different part of the pipeline, such as the
> consumer application?
>
> And since you're performance sensitive, please be aware that
> performance (availability) nearly always comes at the cost of delivery
> guarantees (consistency) [2]. If you're not willing to pay the
> performance cost of max.in.flight.requests.per.connection=1, then you
> may need to make a compromise on the consistency and find a way to
> manage the extra data.
>
> Thanks,
> Greg Harris
>
> [1]
> https://kafka.apache.org/37/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html
> [2] https://en.wikipedia.org/wiki/CAP_theorem
>
> On Mon, Mar 11, 2024 at 7:32 AM William Lee <ligaopeng...@gmail.com>
> wrote:
> >
> > Hi Haruki,
> > Thanks for your answer.
> > > I still don't get why you need this behavior though
> > The reason is I have to ensure message ordering per partition strictly.
> > Once there is an exception in the producer callback, it indicates that
> the
> > exception is not a retryable exception(from kafka producer's
> perspective).
> > There must be something wrong, so I have to stop sending records and
> > resolve the underlying issue.
> >
> > As for the performance problem, I found a kafka wiki which investigated
> the
> > impact of max.in.flight.requests.per.connection: An analysis of the
> impact
> > of max.in.flight.requests.per.connection and acks on Producer
> performance -
> > Apache Kafka - Apache Software Foundation
> > <
> https://cwiki.apache.org/confluence/display/KAFKA/An+analysis+of+the+impact+of+max.in.flight.requests.per.connection+and+acks+on+Producer+performance
> >
> > From the wiki, max.in.flight.requests.per.connection is better set to 2
> or
> > more.
> >
> > By setting max.in.flight.requests.per.connection to 1, I'm concerned that
> > this could become a performance bottleneck
>

Reply via email to