Re: Question about KRaft

2023-03-10 Thread David Arthur
Hi Zhenyu,

> Currently I am using 3.3.2 (upgraded from 3.2) with only one node, which is
> both controller & broker; even ZK is installed on this node too (sorry, I
> know it is not distributed and I will try to improve it with more knowledge
> learned in future)

Controllers are always colocated with brokers in ZK mode. Only in
KRaft mode do we separate the two concepts by introducing the
"process.roles" configuration.

As Luke mentioned, you can try a regular migration by following the
docs and it should work. Essentially, you would be bringing up a new
KRaft controller (on the same or different server) and letting it do a
migration of your single-node Kafka cluster. Once you've gone through
all the steps, you should have a single KRaft broker and a single
KRaft controller. At that point you can decommission ZooKeeper.
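
For reference, the new KRaft controller you bring up for the migration needs
roughly the following settings (all values below are placeholders; see the
migration docs for the full procedure and the matching broker-side changes):

```properties
# New KRaft controller (placeholder values -- adapt to your environment)
process.roles=controller
node.id=3000                                  # must not collide with any existing broker ID
controller.quorum.voters=3000@localhost:9093
controller.listener.names=CONTROLLER
listeners=CONTROLLER://:9093

# Enable the ZK-to-KRaft migration
zookeeper.metadata.migration.enable=true
zookeeper.connect=localhost:2181
```

Once the migration completes and the broker is restarted in KRaft mode, the
ZooKeeper settings can be removed.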

If you run into any trouble, feel free to reach out here on the users
list or file a JIRA (if you think you found a bug 😉)
https://issues.apache.org/jira/browse/KAFKA

Cheers,
David

On Fri, Mar 10, 2023 at 12:17 AM Luke Chen  wrote:
>
> For questions related to Confluent, I think you'd better ask in their
> channel.
>
> Luke
>
> On Fri, Mar 10, 2023 at 12:54 PM sunil chaudhari <
> sunilmchaudhar...@gmail.com> wrote:
>
> > Hi Luke,
> > This doc is good.
> > Does it apply to Confluent as well?
> >
> >
> >
> > On Fri, 10 Mar 2023 at 8:47 AM, Luke Chen  wrote:
> >
> > > Hi Zhenyu,
> > >
> > > Answering your question:
> > >
> > > > Should I simply
> > > > 1. download 3.4 binary
> > > > 2. stop ZK & Kafka service
> > > > 3. upgrade Kafka to 3.4
> > > > 4. start only Kafka service with KRaft server.properties
> > >
> > > That is not actually migrating; that is just creating another Kafka
> > > cluster in KRaft mode.
> > > The point of the migration is to move the metadata in ZK into the KRaft
> > > controllers.
> > > You can follow the guide here to do migration:
> > > https://kafka.apache.org/documentation/#kraft_zk_migration
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Tue, Mar 7, 2023 at 11:07 PM Zhenyu Wang 
> > > wrote:
> > >
> > > > Hi Sunil,
> > > >
> > > > As mentioned earlier in my question, I have only one "combined" node as
> > > > both controller and broker, and I totally accept downtime (stopping the
> > > > service).
> > > >
> > > > So, just for my case (a single node): if I want to upgrade to 3.4 and
> > > > then start the service under KRaft (getting rid of ZK), what would the
> > > > steps be?
> > > >
> > > > Thanks~
> > > >
> > > > On Mon, Mar 6, 2023 at 11:49 PM sunil chaudhari <
> > > > sunilmchaudhar...@gmail.com>
> > > > wrote:
> > > >
> > > > > How will you achieve zero downtime if you stop ZooKeeper and Kafka?
> > > > > There must be some standard steps, such as stopping the ZooKeeper
> > > > > nodes one by one and starting KRaft at the same time, so that the
> > > > > migration happens gradually.
> > > > >
> > > > >
> > > > >
> > > > > On Tue, 7 Mar 2023 at 9:26 AM, Zhenyu Wang 
> > > > wrote:
> > > > >
> > > > > > Hi team,
> > > > > >
> > > > > > Here is a question about KRaft from a normal user who started using
> > > > > > and learning Kafka with 3.2.
> > > > > >
> > > > > > Last month Kafka 3.4, the first bridge release, became available,
> > > > > > and I am considering a plan to move to KRaft (get rid of ZK) with
> > > > > > this version.
> > > > > >
> > > > > > Currently I am using 3.3.2 (upgraded from 3.2) with only one node,
> > > > > > which is both controller & broker; even ZK is installed on this
> > > > > > node too (sorry, I know it is not distributed and I will try to
> > > > > > improve it with more knowledge learned in future).
> > > > > >
> > > > > > When I read KIP-866, ZK to KRaft migration, the Migration Overview
> > > > > > section seems to be written for multiple nodes with no or almost no
> > > > > > downtime, enabling KRaft node by node; however, my case accepts
> > > > > > downtime (one node -_-!!). I just want to upgrade Kafka to 3.4,
> > > > > > then start the service in KRaft mode, and make sure everything
> > > > > > works well with no log loss.
> > > > > >
> > > > > > Should I simply
> > > > > > 1. download the 3.4 binary
> > > > > > 2. stop the ZK & Kafka services
> > > > > > 3. upgrade Kafka to 3.4
> > > > > > 4. start only the Kafka service with a KRaft server.properties
> > > > > >
> > > > > > Or is there anything else I need to pay attention to?
> > > > > >
> > > > > > If there is documentation to use as a guide, that would be quite
> > > > > > helpful.
> > > > > >
> > > > > > Really appreciate it.
> > > > >
> > > >
> > >
> >



-- 
David Arthur


Re: Exactly once kafka connect query

2023-03-10 Thread Chris Egerton
Hi Nitty,

> I called commitTransaction when I reached the first error record, but the
> commit is not happening for me. Kafka Connect tries to abort the
> transaction automatically

This is really interesting--are you certain that your task never invoked
TransactionContext::abortTransaction in this case? I'm looking over the
code base and it seems fairly clear that the only thing that could trigger
a call to KafkaProducer::abortTransaction is a request by the task to abort
a transaction (either for a next batch, or for a specific record). It may
help to run the connector in a debugger and/or look for "Aborting
transaction for batch as requested by connector" or "Aborting transaction
for record on topic  as requested by connector" log
messages (which will be emitted at INFO level by
the org.apache.kafka.connect.runtime.ExactlyOnceWorkerSourceTask class if
the task is requesting an abort).
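
For completeness: connector-defined transaction boundaries like this only
take effect when exactly-once support is enabled on the worker and the
connector opts in. Roughly (a minimal sketch; adapt to your deployment):

```properties
# Worker config
exactly.once.source.support=enabled

# Source connector config: the connector defines transaction boundaries
# itself via the TransactionContext API
transaction.boundary=connector
```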

Regardless, I'll work on a fix for the bug with aborting empty
transactions. Thanks for helping uncover that one!

Cheers,

Chris

On Thu, Mar 9, 2023 at 6:36 PM NITTY BENNY  wrote:

> Hi Chris,
>
> We have a use case to commit the previously successful records, stop
> processing the current file, and move on to the next file. To achieve
> that I called commitTransaction when I reached the first error record, but
> the commit is not happening for me. Kafka Connect tries to abort the
> transaction automatically; I checked the __transaction_state topic and the
> states were marked as PrepareAbort and CompleteAbort. Do you know why Kafka
> Connect automatically invokes abort instead of the commit I explicitly
> called?
> Then, as a result, when I try to parse the next file - say ABC - I see the
> log "Aborting incomplete transaction" and the ERROR "Failed to send record
> to topic", and we lose the first batch of records from the current
> transaction in the file ABC.
>
> Is it possible that there's a case where an abort is being requested while
> the current transaction is empty (i.e., the task hasn't returned any
> records from SourceTask::poll since the last transaction was
> committed/aborted)? --- Yes, that case is possible for us. There is a case
> where the first record itself is an error record.
>
> Thanks,
> Nitty
>
> On Thu, Mar 9, 2023 at 3:48 PM Chris Egerton 
> wrote:
>
> > Hi Nitty,
> >
> > Thanks for the code examples and the detailed explanations, this is
> really
> > helpful!
> >
> > > Say if I have a file with 5 records and batch size is 2, and in my 3rd
> > batch I have one error record, then in that batch I don't have a valid
> > record to call commit or abort. But I want to commit all the previous
> > batches that were successfully parsed. How do I do that?
> >
> > An important thing to keep in mind with the TransactionContext API is
> that
> > all records that a task returns from SourceTask::poll are implicitly
> > included in a transaction. Invoking SourceTaskContext::transactionContext
> > doesn't alter this or cause transactions to start being used; everything
> is
> > already in a transaction, and the Connect runtime automatically begins
> > transactions for any records it sees from the task if it hasn't already
> > begun one. It's also valid to return a null or empty list of records from
> > SourceTask::poll. So in this case, you can invoke
> > transactionContext.commitTransaction() (the no-args variant) and return
> an
> > empty batch from SourceTask::poll, which will cause the transaction
> > containing the 4 valid records that were returned in the last 2 batches
> to
> > be committed.
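> >
> > To make that concrete, here is a rough, self-contained sketch of that
> > poll() shape. The TransactionContext stand-in below only mirrors the two
> > no-arg methods of the real org.apache.kafka.connect.source interface, and
> > FileSourceTask is a hypothetical task (returning plain strings instead of
> > SourceRecords), not your connector:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for org.apache.kafka.connect.source.TransactionContext; the real
// interface also has per-record commitTransaction/abortTransaction variants.
interface TransactionContext {
    void commitTransaction(); // commit the open transaction after the next poll() batch
    void abortTransaction();  // abort the open transaction after the next poll() batch
}

// Hypothetical task showing "commit what we already returned, skip the bad file".
class FileSourceTask {
    private final TransactionContext transactionContext;
    private final List<String> pendingLines;
    private final int batchSize;

    FileSourceTask(TransactionContext ctx, List<String> lines, int batchSize) {
        this.transactionContext = ctx;
        this.pendingLines = new ArrayList<>(lines);
        this.batchSize = batchSize;
    }

    // Analogous to SourceTask::poll -- returning an empty list is valid.
    List<String> poll() {
        List<String> batch = new ArrayList<>();
        while (!pendingLines.isEmpty() && batch.size() < batchSize) {
            if (pendingLines.get(0).startsWith("ERROR")) {
                // Bad record: ask the runtime to commit the transaction that
                // already contains the previously returned valid records. The
                // batch returned now (possibly empty) is included in that
                // commit; the bad record is never returned.
                transactionContext.commitTransaction();
                pendingLines.clear(); // skip the rest of this file
                return batch;
            }
            batch.add(pendingLines.remove(0));
        }
        return batch;
    }
}
```

> > The key point the sketch illustrates is that no explicit "begin" exists:
> > every record returned from poll() is already inside a transaction, and the
> > no-arg commitTransaction() just marks where that transaction should close.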
> >
> > FWIW, I would be a little cautious about this approach. Many times it's
> > better to fail fast on invalid data; it might be worth it to at least
> allow
> > users to configure whether the connector fails on invalid data, or
> silently
> > skips over it (which is what happens when transactions are aborted).
> >
> > > Why is abort not working without adding the last record to the list?
> >
> > Is it possible that there's a case where an abort is being requested
> while
> > the current transaction is empty (i.e., the task hasn't returned any
> > records from SourceTask::poll since the last transaction was
> > committed/aborted)? I think this may be a bug in the Connect framework
> > where we don't check to see if a transaction is already open when a task
> > requests that a transaction be aborted, which can cause tasks to fail
> (see
> > https://issues.apache.org/jira/browse/KAFKA-14799 for more details).
> >
> > Cheers,
> >
> > Chris
> >
> >
> > On Wed, Mar 8, 2023 at 6:44 PM NITTY BENNY  wrote:
> >
> > > Hi Chris,
> > >
> > > I am not sure if you are able to see the images I shared with you.
> > > Copying the code snippet below:
> > >
> > > if (expectedRecordCount >= 0) {
> > >   int missingCount = expectedRecordCount - (int) this.recordOffset() - 1;
> > >   if (missingCount > 0) {
> > >     if (transactionContext != null) {
> > >       isMissedRecords = true;
> > >     } else