Re: [DISCUSS] KIP-939: Support Participation in 2PC

2023-10-06 Thread Artem Livshits
Hi Raman,

Thank you for the questions.  Given that the primary effect of setting the
enable2pc flag is disabling the timeout, it makes sense for enable2pc to have
similar behavior w.r.t. when it can be set.

One clarification about the Ongoing case -- the current (pre-KIP-939)
behavior is to abort the ongoing transaction and let the client retry
(eventually getting into the CompleteAbort state), so even though the
transaction timeout is not changed at the moment the ongoing transaction is
hit, the new timeout value takes effect before the call completes to the
caller.  So from the caller's perspective, the transaction timeout is set
whenever the InitProducerId functionality is used.
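
A minimal caller-side sketch of that behavior, using only the standard producer
configs (nothing KIP-939-specific): the timeout sent along with InitProducerId
is in effect by the time initTransactions() returns, even if the coordinator
first had to abort an ongoing transaction.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.StringSerializer;

    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "my-txn-id");
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    // Sent in InitProducerId; per the explanation above it is effective once
    // initTransactions() returns to the caller.
    props.put(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG, 60_000);

    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
        // If a previous incarnation left a transaction ongoing, the coordinator
        // aborts it (the client retries internally) before this call completes.
        producer.initTransactions();
    }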

-Artem

On Wed, Oct 4, 2023 at 8:58 PM Raman Verma  wrote:

> Hello Artem,
>
> Now that `InitProducerIdRequest` will have an extra parameter (enable2PC),
> can the client change the value of this parameter during an ongoing
> transaction?
>
> Here is how the transaction coordinator responds to InitProducerId requests
> according
> to the current transaction's state.
>
> - Empty | CompleteAbort | CompleteCommit
> Bump epoch and move to Empty state. Accept any changes from the incoming
> InitProducerId request, such as transactionTimeoutMs.
>
> - Ongoing
> Bump epoch and move to PrepareEpochFence state. Transaction timeout is not
> changed.
>
> - PrepareAbort | PrepareCommit
> No changes internally. Return Concurrent transactions error to the client.
>
> I guess we should allow the same behavior for mutating the enable2PC flag
> under these conditions as for the transaction timeout value.
>


Re: [VOTE] KIP-858: Handle JBOD broker disk failure in KRaft

2023-10-06 Thread Igor Soarez
Hi Colin,

> I would call #2 LOST. It was assigned in the past, but we don't know where.
> I see that you called this OFFLINE. This is not really normal...
> it should happen only when we're migrating from ZK mode to KRaft mode,
> or going from an older KRaft release with multiple directories to a
> post-JBOD release.

What you refer to as #2 LOST is actually what I named SELECTED,
as in: a directory has already been _selected_ sometime before,
we just don't know which one yet.

In the meantime this change has already been merged, but let me know
if you feel strongly about the naming here as I'm happy to rename
SELECTED_DIR to LOST_DIR in a new PR.
https://github.com/apache/kafka/pull/14291

> As for the third state -- I'm not sure why SELECTED_DIR needs to exist.

The third state (actually it is ordered second) - OFFLINE_DIR - conveys
that a replica is assigned to an unspecified offline directory.

This can be used by the broker in the following way:

  * When catching up with metadata, if the broker sees that one of its
  partitions is mapped to SELECTED_DIR, cannot find that partition in
  any of the online log directories, and at least one log dir is offline,
  then it sends AssignReplicasToDirs to converge the assignment
  to OFFLINE_DIR.

  * If a log directory failure happens during an intra-broker (across-dir)
  replica movement, after sending AssignReplicasToDirs with the new UUID,
  and before the future replica catches up again. (There's a section
  in the KIP about this.)

We could just use a random UUID, since a replica assigned to a dir that is
not in the broker's registered set of online dirs is considered offline by
the controllers and metadata cache anyway, but using a
reserved UUID feels cleaner.
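
For illustration, a sketch in Java of the convergence rule from the first
bullet above; the class, constant values, and method here are hypothetical
placeholders, not the actual KIP-858 code:

    import org.apache.kafka.common.Uuid;

    public class DirConvergenceSketch {
        // Placeholder values; KIP-858 defines the actual reserved UUIDs.
        static final Uuid SELECTED_DIR = new Uuid(0L, 1L);
        static final Uuid OFFLINE_DIR = new Uuid(0L, 2L);

        // The assignment the broker should report for a replica after catching
        // up with metadata.
        static Uuid converge(Uuid assignedDir, boolean foundInOnlineDir, boolean anyDirOffline) {
            if (assignedDir.equals(SELECTED_DIR) && !foundInOnlineDir && anyDirOffline) {
                // A dir was selected in the past, the replica is not in any online
                // dir, and at least one dir is offline: converge to OFFLINE_DIR.
                return OFFLINE_DIR;
            }
            return assignedDir;
        }
    }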

> I think we need a general mechanism for checking that replicas are
> in the directories we expect and sending an RPC to the controller
> if they are not. A mechanism like this will automatically get rid
> of the LOST replicas just as part of normal operation -- nothing
> special required.

Thanks for pointing this out, I forgot to put in the notes in my
previous email that we discussed this too.

The KIP proposes this is done when catching up with metadata,
but you also suggested we extend the stray replica detection
mechanism to also check for these inconsistencies. I think
this is a good idea, and we'll look into that as well.

Best,

--
Igor


Re: [VOTE] KIP-858: Handle JBOD broker disk failure in KRaft

2023-10-06 Thread Igor Soarez
Hi David,

Thanks for shedding light on migration goals, makes sense.
Your preference for option a) makes it even more attractive.
We'll keep that as the preferred approach, thanks for the advice.

> One question with this approach is how the KRaft controller learns about
> the multiple log directories after the broker is restarted in KRaft mode.
> If I understand the design correctly, this would be similar to a single
> directory kraft broker being reconfigured as a multiple directory broker.
> That is, the broker sees that the PartitionRecords are missing the
> directory assignments and then sends AssignReplicasToDirs to the controller.

It is not similar to single-dir KRaft transitioning to multi-dir mode.
In single-dir mode AssignReplicasToDirs is not sent, but the dir
assignment is still written in the partition records, since the
Controller knows the UUID of the only log dir that can be selected.
But you're right about the rest.

The multiple log dirs are indicated in the broker registration request.
The first registration in a migrating ZK broker will include dir UUIDs.
Directory assignments are communicated via the new RPC AssignReplicasToDirs.
We have debated whether the broker should start sending these while still
in mixed (migration) mode, or if it should wait until it is restarted
into full KRaft mode, and stay fenced until all dir assignments are
sent to the controller. Any advice for us on this one?

Best,

--
Igor


Re: [DISCUSS] KIP-714: Client metrics and observability

2023-10-06 Thread Matthias J. Sax

Thanks Andrew. SGTM.

One point you did not address is the idea to add a method to 
`KafkaStreams` similar to the proposed `clientInstanceId()` that will be 
added to consumer/producer/admin clients.


Without addressing this, Kafka Streams users won't have a way to get the 
assigned `instanceId` of the internally created clients, and thus it 
would be very difficult for them to know which of the metrics the broker 
receives belong to a Kafka Streams app. It seems they would only find 
the `instanceIds` in the log4j output if they enable client logging?


Of course, because there are multiple clients inside Kafka Streams, the 
return type cannot be a single "String", but must be some more complex 
data structure -- we could either add a new class, or return a 
Map keyed by client that maps to the `instanceId`.


For example we could use the following key:

   [Global]StreamThread[-][-restore][consumer|producer]

(Of course, only the valid combinations.)

Or maybe even better, we might want to return a `Future`, because 
collecting all the `instanceId`s might be a blocking call on each client? 
I already have a few ideas for how it could be implemented, but I don't think 
it must be discussed in the KIP, as it's an implementation detail.
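
To make the idea concrete, one possible shape, purely illustrative (the type
and method names below are assumptions, not a final API proposal):

    import java.util.Map;
    import org.apache.kafka.common.Uuid;

    // Hypothetical container, keyed by the client name pattern sketched above,
    // e.g. "StreamThread-1-consumer" or "StreamThread-1-restore-consumer".
    public interface StreamsClientInstanceIds {
        Map<String, Uuid> consumerInstanceIds();
        Map<String, Uuid> producerInstanceIds();
        Uuid adminInstanceId();
    }

    // Possible addition to KafkaStreams, returning a Future since collecting
    // the ids may block on each client:
    // Future<StreamsClientInstanceIds> clientInstanceIds(Duration timeout);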


Thoughts?


-Matthias

On 10/6/23 4:21 AM, Andrew Schofield wrote:

Hi Matthias,
Thanks for your comments. I agree that a follow-up KIP for Kafka Streams makes 
sense. This KIP currently has made a bit
of an effort to embrace KS, but it’s not enough by a long way.

I have removed `application.id`. This should be done 
properly in the follow-up KIP. I don’t believe there’s a downside to
removing it from this KIP.

I have reworded the statement about temporality. In practice, the 
implementation of this KIP that’s going on while the voting
progresses happens to use delta temporality, but that’s an implementation 
detail. Supporting clients must support both
temporalities.

I thought about exposing the client instance ID as a metric, but non-numeric 
metrics are not usual practice and tools
do not universally support them. I don’t think the KIP is improved by adding 
one now.

I have also added constants for the various Config classes for 
ENABLE_METRICS_PUSH_CONFIG, including to
StreamsConfig. It’s best to be explicit about this.

Thanks,
Andrew


On 2 Oct 2023, at 23:47, Matthias J. Sax  wrote:

Hi,

I did not pay attention to this KIP in the past; seems it was on-hold for a 
while.

Overall it sounds very useful, and I think we should extend this with a follow 
up KIP for Kafka Streams. What is unclear to me at this point is the statement:


Kafka Streams applications have an application.id configured and this 
identifier should be included as the application_id metrics label.


The `application.id` is currently only used as the (main) consumer's `group.id` 
(and is part of an auto-generated `client.id` if the user does not set one).

This comment related to:


The following labels should be added by the client as appropriate before 
metrics are pushed.


Given that Kafka Streams uses the consumer/producer/admin client as "black 
boxes", a client at this point does not know that it's part of a Kafka Streams 
application, and thus it won't be able to attach any such label to the metrics it sends. 
(Also, the producer and admin client don't even know the value of `application.id` -- only the (main) 
consumer does, indirectly via `group.id`; the restore and global consumers don't know it, 
because they don't have `group.id` set.)

While I am totally in favor of the proposal, I am wondering how we intend to implement it 
in a clean way? Or would it be ok to have some internal client APIs that KS can use to 
"register" itself with the client?




While clients must support both temporalities, the broker will initially only 
send GetTelemetrySubscriptionsResponse.DeltaTemporality=True


Not sure if I can follow. How is the decision made about DELTA or CUMULATIVE metrics? Should 
the broker-side plugin not decide which metrics it wants to receive and in which form? So what 
does "initially" mean -- the broker won't ship with a default plugin 
implementation?




The following method is added to the Producer, Consumer, and Admin client 
interfaces:


Should we add anything to Kafka Streams to expose the underlying clients' 
assigned client-instance-ids programmatically? I am also wondering if clients 
should report their assigned client-instance-ids as metrics themselves (for this 
case, Kafka Streams won't need to do anything, because we already expose all 
client metrics).

If we add anything programmatic, we need to make it simple, given that Kafka 
Streams has many clients per `StreamThread` and may have multiple threads.




enable.metrics.push

It might be worth adding this to `StreamsConfig`, too? If set via 
StreamsConfig, we would forward it to all clients automatically.




-Matthias


On 9/29/23 5:45 PM, David Jacot wrote:

Hi Andrew,
Thanks for driving this one. I haven't read 

Re: [DISCUSS] KIP-939: Support Participation in 2PC

2023-10-06 Thread Artem Livshits
Hi Justine,

Thank you for the questions.  Currently (pre-KIP-939) we always bump the
epoch on InitProducerId and abort an ongoing transaction (if any).  I
expect this behavior will continue with KIP-890 as well.

With KIP-939 we need to support the case when the ongoing transaction needs
to be preserved when keepPreparedTxn=true.  Bumping epoch without aborting
or committing a transaction is tricky because epoch is a short value and
it's easy to overflow.  Currently, the overflow case is handled by aborting
the ongoing transaction, which would send out transaction markers with
epoch=Short.MAX_VALUE to the partition leaders, which would fence off any
messages with the producer id that started the transaction (they would have
epoch that is less than Short.MAX_VALUE).  Then it is safe to allocate a
new producer id and use it in new transactions.

We could say that maybe when keepPreparedTxn=true we bump epoch unless it
leads to overflow, and don't bump epoch in the overflow case.  I don't
think it's a good solution because if it's not safe to keep the same epoch
when keepPreparedTxn=true, then we must handle the epoch overflow case as
well.  So either we should convince ourselves that it's safe to keep the
epoch and do it in the general case, or we always bump the epoch and handle
the overflow.

With KIP-890, we bump the epoch on every transaction commit / abort.  This
guarantees that even if InitProducerId(keepPreparedTxn=true) doesn't
increment epoch on the ongoing transaction, the client will have to call
commit or abort to finish the transaction and will increment the epoch (and
handle epoch overflow, if needed).  If the ongoing transaction was in a bad
state and had some zombies waiting to arrive, the abort operation would
fence them because with KIP-890 every abort would bump the epoch.

We could also look at this from the following perspective.  With KIP-890,
zombies won't be able to cross transaction boundaries; each transaction
completion creates a boundary and any activity in the past gets confined in
the boundary.  Then data in any partition would look like this:

1. message1, epoch=42
2. message2, epoch=42
3. message3, epoch=42
4. marker (commit or abort), epoch=43

Now if we inject steps 3a and 3b like this:

1. message1, epoch=42
2. message2, epoch=42
3. message3, epoch=42
3a. crash
3b. InitProducerId(keepPreparedTxn=true)
4. marker (commit or abort), epoch=43

The invariant still holds even with steps 3a and 3b -- whatever activity
was in the past gets confined to the past by the mandatory abort / commit
that must follow InitProducerId(keepPreparedTxn=true).

KIP-890 thus provides the proper isolation between transactions, so injecting a
crash + InitProducerId(keepPreparedTxn=true) into the transaction sequence
is safe from the zombie-protection perspective.

That said, I'm still thinking about it and looking for cases that might
break because we don't bump the epoch on
InitProducerId(keepPreparedTxn=true); if such cases exist, we'll need to
develop the logic to handle epoch overflow for ongoing transactions.
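
As a rough client-side sketch of the 3a/3b sequence, assuming the
initTransactions(keepPreparedTxn) overload and the
transaction.two.phase.commit.enable config proposed in KIP-939 (names and
signatures may still change before the KIP is finalized):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.StringSerializer;

    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "2pc-txn-id");
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    props.put("transaction.two.phase.commit.enable", "true");

    KafkaProducer<String, String> producer = new KafkaProducer<>(props);
    // Step 3b: recover without aborting the prepared transaction. The epoch is
    // not bumped here, but the commit/abort that must follow bumps it (KIP-890),
    // fencing any zombie writes left over from before the crash.
    producer.initTransactions(true /* keepPreparedTxn */);
    // ... consult the external transaction coordinator for the outcome, then:
    producer.commitTransaction();   // or producer.abortTransaction()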

-Artem



On Tue, Oct 3, 2023 at 10:15 AM Justine Olshan 
wrote:

> Hey Artem,
>
> Thanks for the KIP. I had a question about epoch bumping.
>
> Previously when we send an InitProducerId request on Producer startup, we
> bump the epoch and abort the transaction. Is it correct to assume that we
> will still bump the epoch, but just not abort the transaction?
> If we still bump the epoch in this case, how does this interact with
> KIP-890 where we also bump the epoch on every transaction. (I think this
> means that we may skip epochs and the data itself will all have the same
> epoch)
>
> I may have follow ups depending on the answer to this. :)
>
> Thanks,
> Justine
>
> On Thu, Sep 7, 2023 at 9:51 PM Artem Livshits
>  wrote:
>
> > Hi Alex,
> >
> > Thank you for your questions.
> >
> > > the purpose of having broker-level transaction.two.phase.commit.enable
> >
> > The thinking is that 2PC is a bit of an advanced construct so enabling
> 2PC
> > in a Kafka cluster should be an explicit decision.  If it is set to
> 'false'
> > InitProducerId (and initTransactions) would
> > return TRANSACTIONAL_ID_AUTHORIZATION_FAILED.
> >
> > > WDYT about adding an AdminClient method that returns the state of
> > transaction.two.phase.commit.enable
> >
> > I wonder if the client could just try to use 2PC and then handle the
> error
> > (e.g. if it needs to fall back to ordinary transactions).  This way it
> > could uniformly handle cases when Kafka cluster doesn't support 2PC
> > completely and cases when 2PC is restricted to certain users.  We could
> > also expose this config in describeConfigs, if the fallback approach
> > doesn't work for some scenarios.
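
For illustration, a hedged sketch of that try-and-fall-back idea, assuming a
producer configured with transaction.two.phase.commit.enable=true and the
KIP-939 initTransactions(keepPreparedTxn) overload;
TransactionalIdAuthorizationException is the client-side exception for
TRANSACTIONAL_ID_AUTHORIZATION_FAILED:

    import org.apache.kafka.common.errors.TransactionalIdAuthorizationException;

    boolean twoPhaseCommitAvailable;
    try {
        producer.initTransactions(true /* keepPreparedTxn */);
        twoPhaseCommitAvailable = true;
    } catch (TransactionalIdAuthorizationException e) {
        // The cluster has 2PC disabled, or this principal may not use it:
        // fall back to ordinary transactions with a regular producer.
        twoPhaseCommitAvailable = false;
    }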
> >
> > -Artem
> >
> >
> > On Tue, Sep 5, 2023 at 12:45 PM Alexander Sorokoumov
> >  wrote:
> >
> > > Hi Artem,
> > >
> > > Thanks for publishing this KIP!
> > >
> > > Can you please clarify the purpose of having broker-level
> > > 

Re: [VOTE] KIP-858: Handle JBOD broker disk failure in KRaft

2023-10-06 Thread Colin McCabe
Thanks for posting these notes, Igor.

I think we should definitely distinguish between these two cases:

1. this replica hasn't been assigned to a storage directory
2. we don't know which storage directory this replica was assigned to in the 
past

I would call #1 UNASSIGNED. I see that you called it UNKNOWN in your message. 
This is normal. It just means that the controller created this replica but the 
broker hasn't placed it anywhere yet.

I would call #2 LOST. It was assigned in the past, but we don't know where. I 
see that you called this OFFLINE. This is not really normal... it should 
happen only when we're migrating from ZK mode to KRaft mode, or going from an 
older KRaft release with multiple directories to a post-JBOD release.

The reason for distinguishing between these two cases is that we don't want to 
recreate LOST replicas. They probably are already on the broker in some 
directory, but we just don't know where. This is different than the case with 
UNASSIGNED, which should trigger the broker placing the replica somewhere and 
basically treating it like a new replica.
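
Purely as an illustration of that distinction (hypothetical naming, not KIP-858
code), the two cases could be modeled as:

    enum ReplicaDirState {
        UNASSIGNED, // controller created the replica; the broker hasn't placed it yet
        LOST        // assigned to some directory in the past, but we no longer know which
    }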

As for the third state -- I'm not sure why SELECTED_DIR needs to exist.

I think we need a general mechanism for checking that replicas are in the 
directories we expect and sending an RPC to the controller if they are not. A 
mechanism like this will automatically get rid of the LOST replicas just as 
part of normal operation -- nothing special required.

I think the part that you were trying to address with the "can't have failures 
during migration" stipulation is that if we do things the obvious way, we will 
have LOST replicas for a while immediately after migration. But we won't know 
if those replicas are actually there or not. There are a bunch of ways to 
handle this... we'll have to think about what is the easiest. Writing the data 
to ZK while still in hybrid mode might even be on the table as an option.

best,
Colin


On Thu, Oct 5, 2023, at 07:03, David Arthur wrote:
> Hey, just chiming in regarding the ZK migration piece.
>
> Generally speaking, one of the design goals of the migration was to have
> minimal changes on the ZK brokers and especially the ZK controller. Since
> ZK mode is our safe/well-known fallback mode, we wanted to reduce the
> chances of introducing bugs there. Following that logic, I'd prefer option
> (a) since it does not involve changing any migration code or (much) ZK
> broker code. Disk failures should be pretty rare, so this seems like a
> reasonable option.
>
> a) If a migrating ZK mode broker encounters a directory failure,
>>   it will shut down. While this degrades failure handling during
>>   the temporary migration window, it is a useful simplification.
>>   This is an attractive option, and it isn't ruled out, but it
>>   is also not clear that it is necessary at this point.
>
>
> If a ZK broker experiences a disk failure before the metadata is migrated,
> it will prevent the migration from happening. If the metadata is already
> migrated, then you simply have an offline broker.
>
> If an operator wants to minimize the time window of the migration, they can
> simply do the requisite rolling restarts one after the other.
>
> 1) Provision KRaft controllers
> 2) Configure ZK brokers for migration and do rolling restart (migration
> happens automatically here)
> 3) Configure ZK brokers as KRaft and do rolling restart
>
> This reduces the time window to essentially the time it takes to do two
> rolling restarts of the cluster. Once the brokers are in KRaft mode, they
> won't have the "shutdown if log dir fails" behavior.
>
>
>
> One question with this approach is how the KRaft controller learns about
> the multiple log directories after the broker is restarted in KRaft mode.
> If I understand the design correctly, this would be similar to a single
> directory kraft broker being reconfigured as a multiple directory broker.
> That is, the broker sees that the PartitionRecords are missing the
> directory assignments and then sends AssignReplicasToDirs to the controller.
>
> Thanks!
> David


Re: [DISCUSS] KIP-979: Allow independently stop KRaft controllers or brokers

2023-10-06 Thread Colin McCabe
Not everything is a broker, though. So --node-id seems better.

best,
Colin

On Sun, Oct 1, 2023, at 23:08, Kamal Chandraprakash wrote:
> Hi Hailey,
>
> Thanks for working on this! This is one of the long-standing open issues.
> Now, users have to find the PID of the respective Kafka process to stop if
> more than one node is being run locally for testing purposes.
> The updated KIP is addressing that. LGTM.
>
> Are `node.id` and `broker.id` the same? If yes, can we rename it to `
> broker.id` instead?
>
> --
> Kamal
>
>
>
> On Sun, Oct 1, 2023 at 3:00 AM Ron Dagostino  wrote:
>
>> Thanks, Ismael.  I think you are proposing a pair of mutually exclusive
>> args --process.roles and --node.id, right?  I agree that is more
>> user-friendly than the --required-config arg, and it comes at the possible
>> expense of generality.  So that’s the tradeoff between the two, I think.
>> No other config comes to mind now that we’ve identified these two.  I think
>> the two specific and mutually exclusive parameters would be the way to go
>> unless someone else identifies still more options that people might want.
>>
>> Did I get that right, or were you proposing something different?
>>
>> Ron
>>
>> > On Sep 30, 2023, at 10:42 AM, Ismael Juma  wrote:
>> >
>> > Hi,
>> >
>> > Thanks for the KIP. I think this approach based on configs is a bit too
>> > open ended and not very user friendly. Why don't we simply provide flags
>> > for the things a user may care about? So far, it seems like we have two
>> > good candidates (node id and process role). Are there any others?
>> >
>> > Ismael
>> >
>> >> On Fri, Sep 29, 2023 at 6:19 PM Hailey Ni 
>> wrote:
>> >>
>> >> Hi Ron,
>> >>
>> >> I think you made a great point, making the "name" arbitrary instead of
>> >> hard-coding it will make the functionality much more flexible. I've
>> updated
>> >> the KIP and the code accordingly. Thanks for the great idea!
>> >>
>> >> Thanks,
>> >> Hailey
>> >>
>> >>
>> >>> On Fri, Sep 29, 2023 at 2:34 PM Ron Dagostino 
>> wrote:
>> >>>
>> >>> Thanks, Hailey.  Is there a reason to restrict it to just
>> >>> process.roles and node.id?  Someone might want to do
>> >>> "--required-config any.name=whatever.value", for example, and at first
>> >>> glance I don't see a reason why the implementation should be any
>> >>> different -- it seems it would probably be easier to not have to worry
>> >>> about restricting to specific cases, actually.  WDYT?
>> >>>
>> >>> Ron
>> >>>
>> >>> On Fri, Sep 29, 2023 at 5:12 PM Hailey Ni 
>> >>> wrote:
>> 
>>  Updated. Please let me know if you have any additional comments. Thank
>> >>> you!
>> 
>>  On Thu, Sep 21, 2023 at 3:02 PM Hailey Ni  wrote:
>> 
>> > Hi Ron. Thanks for the response. I agree with your point. I'll make
>> >> the
>> > corresponding changes in the KIP and KAFKA-15471
>> > .
>> >
>> > On Thu, Sep 21, 2023 at 1:40 PM Ron Dagostino 
>> >>> wrote:
>> >
>> >> Hi Hailey.  No, I just looked, and zookeeper-server-stop does not
>> >> have
>> >> any facility to be specific about which ZK nodes to signal.  So
>> >> providing the ability in kafka-server-stop to be more specific than
>> >> just "signal all controllers" or "signal all brokers" would be a
>> >> bonus
>> >> and therefore not necessarily required.  But if it is easy to
>> >> achieve
>> >> and doesn't add any additional cognitive load -- and at first glance
>> >> it does seem so -- we should probably just support it.
>> >>
>> >> Ron
>> >>
>> >> On Wed, Sep 20, 2023 at 6:15 PM Hailey Ni > >>>
>> >> wrote:
>> >>>
>> >>> Hi Ron,
>> >>>
>> >>> Thank you very much for the comment. I think it makes sense to me
>> >>> that
>> >> we
>> >>> provide an even more specific way to kill individual
>> >> controllers/brokers.
>> >>> I have one question: does the command line for ZooKeeper cluster
>> >>> provide
>> >>> such a way to kill individual controllers/brokers?
>> >>>
>> >>> Thanks,
>> >>> Hailey
>> >>>
>> >>> On Tue, Sep 19, 2023 at 11:01 AM Ron Dagostino > >>>
>> >> wrote:
>> >>>
>>  Thanks for the KIP, Hailey.  It will be nice to provide some
>>  fine-grained control for when people running the broker and
>> >>> controller
>>  this way want to stop just one of them.
>> 
>>  One thing that occurs to me is that in a development environment
>>  someone might want to run multiple controllers and multiple
>> >>> brokers
>>  all on the same box, and in that case they might want to
>> >> actually
>> >>> stop
>>  just one controller or just one broker instead of all of them.
>> >>> So I'm
>>  wondering if maybe instead of supporting kafka-server-stop
>>  [--process.roles ] we might want to instead support
>>  kafka-server-stop [--required-config ].  If someone
>> >>> wanted
>> 

Re: [VOTE]KIP-966: Eligible Leader Replicas

2023-10-06 Thread Colin McCabe
Hi Calvin,

Thanks for the KIP. I think the config discussion was good and I have no more 
comments there.

I have one last thing I think we should fix up:

I think we should improve DescribeTopicRequest. The current mechanism of "you 
can only list 20 topics" doesn't do a very good job of limiting the results. 
After all, if those topics only have 1 partition each, this means a pretty 
small RPC. If they have 10,000 partitions each, then it's a very large RPC.

I think a better mechanism would be:
1. Have the request be a list of (topic_name, partition_id) pairs plus a 
(first_topic_name, first_partition_id) pair.
(for the initial request, first_topic_name="" and first_partition_id=-1, of 
course)
(if partition_id = -1 then we should list all partitions for the topic)

2. When returning results, sort everything alphabetically and return the first 
1000, plus a (next_topic, next_partition_id) pair. (if there is nothing more to 
return, next_topic = null.)
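
For illustration, a sketch of what that cursor-based shape could look like;
the class and field names here are hypothetical, not the final
DescribeTopicRequest schema:

    import java.util.List;
    import org.apache.kafka.common.TopicPartition;

    class DescribeTopicRequestSketch {
        List<TopicPartition> topics; // partition() == -1 means "all partitions of the topic"
        String firstTopicName = "";  // cursor: "" and -1 on the initial request
        int firstPartitionId = -1;
    }

    class DescribeTopicResponseSketch {
        List<TopicPartition> partitions; // at most 1000 entries, sorted by (topic, partition)
        String nextTopic;                // null when there is nothing more to return
        int nextPartitionId;
    }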

With those changes I would be +1

best,
Colin


If the response didn't return everything, the caller can set the (first_topic_name, 
first_partition_id) pair to the returned (next_topic, next_partition_id) and issue a 
follow-up request.
On Wed, Oct 4, 2023, at 17:44, Jun Rao wrote:
> Hi, Calvin,
>
> Thanks for the KIP. +1 from me too.
>
> Jun
>
> On Wed, Sep 20, 2023 at 5:28 PM Justine Olshan 
> wrote:
>
>> Thanks Calvin.
>> I think this will be very helpful going forward to minimize data loss.
>>
>> +1 from me (binding)
>>
>> Justine
>>
>> On Wed, Sep 20, 2023 at 3:42 PM Calvin Liu 
>> wrote:
>>
>> > Hi all,
>> > I'd like to call for a vote on KIP-966 which includes a series of
>> > enhancements to the current ISR model.
>> >
>> >- Introduce the new HWM advancement requirement which enables the
>> system
>> >to have more potentially data-safe replicas.
>> >- Introduce Eligible Leader Replicas(ELR) to represent the above
>> >data-safe replicas.
>> >- Introduce Unclean Recovery process which will deterministically
>> choose
>> >the best replica during an unclean leader election.
>> >
>> >
>> > KIP:
>> >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas
>> >
>> > Discussion thread:
>> > https://lists.apache.org/thread/gpbpx9kpd7c62dm962h6kww0ghgznb38
>> >
>>


[jira] [Created] (KAFKA-15563) Provide informative error messages when Connect REST requests time out

2023-10-06 Thread Chris Egerton (Jira)
Chris Egerton created KAFKA-15563:
-

 Summary: Provide informative error messages when Connect REST 
requests time out
 Key: KAFKA-15563
 URL: https://issues.apache.org/jira/browse/KAFKA-15563
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Reporter: Chris Egerton
Assignee: Chris Egerton


The Kafka Connect REST API has a hardcoded timeout of 90 seconds. If any 
operations take longer than that, a 500 error response is returned with the 
message "Request timed out" (see 
[here|https://github.com/apache/kafka/blob/7e1c453af9533aba8c19da2d08ce6595c1441fc0/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/HerderRequestHandler.java#L70]).

This can be a source of frustration for users, who want to understand what is 
causing the request to time out. This can be specific to the request (for 
example, a connector's [custom multi-property validation 
logic|https://kafka.apache.org/35/javadoc/org/apache/kafka/connect/connector/Connector.html#validate(java.util.Map)]
 is taking too long), or applicable to any request that goes through the 
herder's tick thread (for which there are a variety of possible causes).

We can give users better, immediate insight into what is causing requests to 
time out by including information about the last possibly-blocking operation 
the worker performed while servicing the request (or attempting to enter a 
state where all preconditions necessary to service the request have been 
satisfied), and when the worker began that operation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2263

2023-10-06 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-15562) CommitRequestManager needs to test different error handling

2023-10-06 Thread Philip Nee (Jira)
Philip Nee created KAFKA-15562:
--

 Summary: CommitRequestManager needs to test different error 
handling
 Key: KAFKA-15562
 URL: https://issues.apache.org/jira/browse/KAFKA-15562
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Reporter: Philip Nee
Assignee: Philip Nee


Review the code in ConsumerCoordinator#OffsetCommitResponseHandler.

Implement the error handling and add tests for these errors.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-985 Add reverseRange and reverseAll query over kv-store in IQv2

2023-10-06 Thread Hanyu (Peter) Zheng
Thank you, Matthias, for the detailed implementation and explanation. As of
now, our capability is limited to executing interactive queries on
individual partitions. To illustrate:

Consider the IQv2StoreIntegrationTest:

We have two partitions:
Partition0 contains key-value pairs: <0,0> and <2,2>.
Partition1 contains key-value pairs: <1,1> and <3,3>.
When executing RangeQuery.withRange(1,3), the results are:

Partition0: [2]
Partition1: [1, 3]
To support functionalities like reverseRange and reverseAll, we can
introduce the withDescendingKeys() method. For instance, using
RangeQuery.withRange(1,3).withDescendingKeys(), the anticipated results are:

Partition0: [2]
Partition1: [3, 1]
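
A minimal sketch of issuing such a query through IQv2, assuming the proposed
withDescendingKeys() addition (the rest of the API shown here already exists),
a running `kafkaStreams` instance, and a store named "kv-store":

    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.query.RangeQuery;
    import org.apache.kafka.streams.query.StateQueryRequest;
    import org.apache.kafka.streams.query.StateQueryResult;
    import org.apache.kafka.streams.state.KeyValueIterator;

    RangeQuery<Integer, Integer> query =
        RangeQuery.<Integer, Integer>withRange(1, 3).withDescendingKeys(); // proposed here

    StateQueryRequest<KeyValueIterator<Integer, Integer>> request =
        StateQueryRequest.inStore("kv-store").withQuery(query);

    StateQueryResult<KeyValueIterator<Integer, Integer>> result = kafkaStreams.query(request);
    // Per-partition results; each partition returns its keys in descending order.
    result.getPartitionResults().forEach((partition, queryResult) -> {
        try (KeyValueIterator<Integer, Integer> it = queryResult.getResult()) {
            while (it.hasNext()) {
                KeyValue<Integer, Integer> kv = it.next();
                System.out.println(partition + ": " + kv.key + "=" + kv.value);
            }
        }
    });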

In response to Hao's inquiry about the boundary issue, please refer to the
StoreQueryUtils class. The code snippet:

    iterator = kvStore.range(lowerRange.orElse(null), upperRange.orElse(null));

indicates that when implementing range in each store, it's structured like:

    @Override
    public KeyValueIterator<Bytes, byte[]> range(final Bytes from, final Bytes to) {
        if (from != null && to != null && from.compareTo(to) > 0) {
            ...

This section performs the necessary checks.

Sincerely,
Hanyu

On Thu, Oct 5, 2023 at 9:52 AM Hanyu (Peter) Zheng 
wrote:

> Hi, Hao,
>
> In this case, it will return an empty set or list in the end.
>
> Sincerely,
> Hanyu
>
> On Wed, Oct 4, 2023 at 10:29 PM Matthias J. Sax  wrote:
>
>> Great discussion!
>>
>> It seems the only open question might be about ordering guarantees?
>> IIRC, we had a discussion about this in the past.
>>
>>
>> Technically (at least from my POV), existing `RangeQuery` does not have
>> a guarantee that data is returned in any specific order (not even on a
>> per-partition basis). It just happens that RocksDB (and as pointed out by
>> Hanyu already, also the built-in in-memory store that is based on a
>> tree-map) allows us to return data ordered by key; as mentioned already,
>> this guarantee is limited to a per-partition basis.
>>
>> If there were a custom store based on a hashed key-value store, this
>> store could implement RangeQuery and return data (even for a single
>> partition) with no ordering, without violating the contract.
>>
>>
>>
>> Thus, it could actually make sense to extend `RangeQuery` and allow
>> three options: no-order, ascending, descending. For our existing
>> Rocks/InMemory implementations, no-order could be equal to ascending and
>> nothing changes effectively, but it might be a better API contract? --
>> If we assume that there might be a custom hash-based store, such a store
>> could reject a query if "ascending" is required, or might need to do
>> more work to implement it (up to the store maintainer). This is actually
>> the beauty of IQv2 that different stores can pick what queries they want
>> to support.
>>
>>  From an API contract point of view, it seems confusing to say:
>> specifying nothing means no guarantee (or ascending if the store can
>> offer it), but descending we can explicitly request. Thus, a hash-based
>> store might be able to accept an "order not specified" query, but would
>> reject "descending". This seems to be somewhat unbalanced?
>>
>> Thus, I am wondering if we should actually add `withAscendingKeys()`,
>> too, even if it won't impact our current RocksDB/In-Memory
>> implementations?
>>
>>
>> The second question is about per-partition or across-partition ordering:
>> it's not possible right now to actually offer across-partition ordering
>> the way IQv2 is set up. The reason is that the store that implements a
>> query type is always a single shard. Thus, the implementation does not
>> have access to other shards. It's hard-coded inside Kafka Streams to
>> query each shard, to "accumulate" partial results, and to return them
>> back to the user. Note that the API is:
>>
>>
>> > StateQueryResult<R> result = KafkaStreams.query(...);
>> > Map<Integer, QueryResult<R>> resultPerPartitions =
>> result.getPartitionResults();
>>
>>
>> Thus, if we would want to offer across-partition ordering, we cannot do
>> it right now, because Kafka Streams does not know anything about the
>> semantics of the query it distributes... -- the result is an unknown
>> type `R`. We would need to extend IQv2 with an additional mechanism
>> that allows users to plug in more custom code to "merge" multiple
>> partition results into a "global result". This is clearly out-of-scope
>> for this KIP and would require a new KIP by itself.
>>
>> It seems that this contract, which is independent of the query type, is
>> not well understood, and thus a big +1 to fix the documentation. I don't
>> think that this KIP must "define" anything, but it might of course be
>> worth adding an explanation of why the KIP cannot even offer
>> global ordering, as it's defined/limited by the IQv2 "framework" itself,
>> not the individual queries.
>>
>>
>>
>> -Matthias
>>
>>
>>
>>
>> On 10/4/23 4:38 PM, Hao Li wrote:
>> > Hi Hanyu,
>> >
>> > Thanks for the KIP! Seems there are already a lot of good discussions. I
>> > only have 

[PR] MINOR Access SslPrincipalMapper from Custom KafkaPrincipalBuilder(KIP-982) #14491

2023-10-06 Thread Raghu B
Hi Kafka Dev Community,

Hope you are doing well. I wanted to bring your attention to the following
pull request:
KAFKA-15452: Access SslPrincipalMapper and kerberosShortNamer in Custom
KafkaPrincipalBuilder(KIP-982) #14491.

The changes made will allow custom KafkaPrincipalBuilder implementations to
access SslPrincipalMapper and kerberosShortNamer, which will enable them to
parse Regex Rules from BrokerSecurityConfigs
SSL_PRINCIPAL_MAPPING_RULES_CONFIG.

Please take a look at it and let me know if it's possible to merge it.

KIP-982 has complete details about the change.

Thank you for your time and consideration.

Best regards,
Raghu Baddam


Re: [VOTE] KIP-976: Cluster-wide dynamic log adjustment for Kafka Connect

2023-10-06 Thread Greg Harris
Hey Chris,

Thanks for the KIP!
I think that preserving the ephemeral nature of the logging change is
the right choice here, and using the config topic for intra-cluster
broadcast is better than REST forwarding.

+1 (binding)

Thanks,
Greg

On Fri, Oct 6, 2023 at 9:05 AM Chris Egerton  wrote:
>
> Hi all,
>
> I'd like to call for a vote on KIP-976, which augments the existing dynamic
> logger adjustment REST API for Kafka Connect to apply changes cluster-wide
> instead of on a per-worker basis.
>
> The KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-976:+Cluster-wide+dynamic+log+adjustment+for+Kafka+Connect
>
> The discussion thread:
> https://lists.apache.org/thread/w3x3f3jmyd1vfjxho06y8xgt6mhhzpl5
>
> Cheers,
>
> Chris


[VOTE] KIP-976: Cluster-wide dynamic log adjustment for Kafka Connect

2023-10-06 Thread Chris Egerton
Hi all,

I'd like to call for a vote on KIP-976, which augments the existing dynamic
logger adjustment REST API for Kafka Connect to apply changes cluster-wide
instead of on a per-worker basis.

The KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-976:+Cluster-wide+dynamic+log+adjustment+for+Kafka+Connect

The discussion thread:
https://lists.apache.org/thread/w3x3f3jmyd1vfjxho06y8xgt6mhhzpl5

Cheers,

Chris


[jira] [Created] (KAFKA-15561) Client support for new SubscriptionPattern based subscription

2023-10-06 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-15561:
--

 Summary: Client support for new SubscriptionPattern based 
subscription 
 Key: KAFKA-15561
 URL: https://issues.apache.org/jira/browse/KAFKA-15561
 Project: Kafka
  Issue Type: Sub-task
  Components: clients, consumer
Reporter: Lianet Magrans


The new consumer should support subscribing with the new SubscriptionPattern 
introduced in the new consumer group protocol. When subscribing with this 
regex, the client should provide the regex in the heartbeat request's 
SubscribedTopicRegex field, delegating the resolution to the server.
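
A hedged sketch of what the client-facing call could look like, assuming a
SubscriptionPattern type and a matching subscribe overload as described above
(package, constructor, and the group.protocol setting shown here are
assumptions; names may differ in the final API):

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "my-group");
    props.put("group.protocol", "consumer"); // opt in to the new consumer group protocol
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    // The regex is sent as-is in the heartbeat's SubscribedTopicRegex field and
    // resolved on the server side.
    consumer.subscribe(new SubscriptionPattern("metrics-.*"));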



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #2262

2023-10-06 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-714: Client metrics and observability

2023-10-06 Thread Andrew Schofield
Hi Matthias,
Thanks for your comments. I agree that a follow-up KIP for Kafka Streams makes 
sense. This KIP currently has made a bit
of an effort to embrace KS, but it’s not enough by a long way.

I have removed `application.id`. This should be done 
properly in the follow-up KIP. I don’t believe there’s a downside to
removing it from this KIP.

I have reworded the statement about temporality. In practice, the 
implementation of this KIP that’s going on while the voting
progresses happens to use delta temporality, but that’s an implementation 
detail. Supporting clients must support both
temporalities.

I thought about exposing the client instance ID as a metric, but non-numeric 
metrics are not usual practice and tools
do not universally support them. I don’t think the KIP is improved by adding 
one now.

I have also added constants for the various Config classes for 
ENABLE_METRICS_PUSH_CONFIG, including to
StreamsConfig. It’s best to be explicit about this.
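
For example, a minimal sketch of turning metrics push off for a Streams
application via the new config, using the literal key here (the matching
constants mentioned above are being added to the Config classes) and assuming
an existing `topology`:

    import java.util.Properties;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    // Intended to be forwarded to the embedded consumer/producer/admin clients
    // (see the discussion quoted below).
    props.put("enable.metrics.push", "false");
    KafkaStreams streams = new KafkaStreams(topology, props);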

Thanks,
Andrew

> On 2 Oct 2023, at 23:47, Matthias J. Sax  wrote:
>
> Hi,
>
> I did not pay attention to this KIP in the past; seems it was on-hold for a 
> while.
>
> Overall it sounds very useful, and I think we should extend this with a 
> follow up KIP for Kafka Streams. What is unclear to me at this point is the 
> statement:
>
>> Kafka Streams applications have an application.id configured and this 
>> identifier should be included as the application_id metrics label.
>
> The `application.id` is currently only used as the (main) consumer's 
> `group.id` (and is part of an auto-generated `client.id` if the user does not 
> set one).
>
> This comment related to:
>
>> The following labels should be added by the client as appropriate before 
>> metrics are pushed.
>
> Given that Kafka Streams uses the consumer/producer/admin client as "black 
> boxes", a client at this point does not know that it's part of a Kafka 
> Streams application, and thus it won't be able to attach any such label to 
> the metrics it sends. (Also, the producer and admin client don't even know the value of 
> `application.id` -- only the (main) consumer does, indirectly via `group.id`; 
> the restore and global consumers don't know it, because they don't have 
> `group.id` set.)
>
> While I am totally in favor of the proposal, I am wondering how we intend to 
> implement it in a clean way? Or would it be ok to have some internal client 
> APIs that KS can use to "register" itself with the client?
>
>
>
>> While clients must support both temporalities, the broker will initially 
>> only send GetTelemetrySubscriptionsResponse.DeltaTemporality=True
>
> Not sure if I can follow. How is the decision made about DELTA or CUMULATIVE 
> metrics? Should the broker-side plugin not decide which metrics it wants to 
> receive and in which form? So what does "initially" mean -- the broker won't ship 
> with a default plugin implementation?
>
>
>
>> The following method is added to the Producer, Consumer, and Admin client 
>> interfaces:
>
> Should we add anything to Kafka Streams to expose the underlying clients' 
> assigned client-instance-ids programmatically? I am also wondering if clients 
> should report their assigned client-instance-ids as metrics themselves (for this 
> case, Kafka Streams won't need to do anything, because we already expose all 
> client metrics).
>
> If we add anything programmatic, we need to make it simple, given that Kafka 
> Streams has many clients per `StreamThread` and may have multiple threads.
>
>
>
>> enable.metrics.push
> It might be worth adding this to `StreamsConfig`, too? If set via 
> StreamsConfig, we would forward it to all clients automatically.
>
>
>
>
> -Matthias
>
>
> On 9/29/23 5:45 PM, David Jacot wrote:
>> Hi Andrew,
>> Thanks for driving this one. I haven't read all the KIP yet but I already
>> have an initial question. In the Threading section, it is written
>> "KafkaConsumer: the "background" thread (based on the consumer threading
>> refactor which is underway)". If I understand this correctly, it means
>> that KIP-714 won't work if the "old consumer" is used. Am I correct?
>> Cheers,
>> David
>> On Fri, Sep 22, 2023 at 12:18 PM Andrew Schofield <
>> andrew_schofield_j...@outlook.com> wrote:
>>> Hi Philip,
>>> No, I do not think it should actively search for a broker that supports
>>> the new
>>> RPCs. In general, either all of the brokers or none of the brokers will
>>> support it.
>>> In the window where the cluster is being upgraded or client telemetry is
>>> being
>>> enabled, there might be a mixed situation. I wouldn’t put too much effort
>>> into
>>> this mixed scenario. As the client finds brokers which support the new
>>> RPCs,
>>> it can begin to follow the KIP-714 mechanism.
>>>
>>> Thanks,
>>> Andrew
>>>
 On 22 Sep 2023, at 20:01, Philip Nee  wrote:

 Hi Andrew -

 Question on top of your answers: Do you think the client should actively
 search for a broker that supports this RPC? As previously 

[VOTE] KIP-960: Support single-key_single-timestamp interactive queries (IQv2) for versioned state stores

2023-10-06 Thread Alieh Saeedi
Hi everyone,

Since KIP-960 is reduced to the simplest IQ type and all further comments
are related to the follow-up KIPs, I decided to finalize it at this
point.


A huge thank you to everyone who has reviewed this KIP (and also the
follow-up ones), and
participated in the discussion thread!

I'd also like to thank you in advance for taking the time to vote.

Best,
Alieh


Re: [PR] MINOR Update default docs point to 3.6.0 release docs [kafka-site]

2023-10-06 Thread via GitHub


divijvaidya commented on code in PR #556:
URL: https://github.com/apache/kafka-site/pull/556#discussion_r1348427549


##
downloads.html:
##
@@ -6,12 +6,41 @@

 Download
 
-3.5.1 is the latest release. The current stable version is 3.5.1
+3.6.0 is the latest release. The current stable version is 3.6.0
 
 
 You can verify your download by following these <a href="https://www.apache.org/info/verification.html">procedures</a> and using 
these <a href="https://downloads.apache.org/kafka/KEYS">KEYS</a>.
 
 
+
+3.6.0
+
+
+Released Oct 7, 2023
+
+
+<a href="https://archive.apache.org/dist/kafka/3.6.0/RELEASE_NOTES.html">Release Notes</a>

Review Comment:
   the latest version is usually downloaded from `downloads.apache.org` and all 
prior versions are moved to `archive.apache.org` as per 
https://www.apache.org/legal/release-policy.html#archived 
   
   (you can view prior commits in this file for reference examples)






Re: [PR] KAFKA-15530: Add 3.6 metrics documentation for new transactions metrics [kafka-site]

2023-10-06 Thread via GitHub


divijvaidya merged PR #555:
URL: https://github.com/apache/kafka-site/pull/555





[jira] [Created] (KAFKA-15560) Upgrade argparse4j to 0.9.0

2023-10-06 Thread Siddharth R (Jira)
Siddharth R created KAFKA-15560:
---

 Summary: Upgrade argparse4j to 0.9.0
 Key: KAFKA-15560
 URL: https://issues.apache.org/jira/browse/KAFKA-15560
 Project: Kafka
  Issue Type: Improvement
Reporter: Siddharth R


Remediate vulnerability in the dependencies:


Vulnerabilities from dependencies:
[CVE-2020-15250|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-15250]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2261

2023-10-06 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 316223 lines...]

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldAddAndRemovePendingTaskToUpdateInputPartitions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldAddAndRemovePendingTaskToUpdateInputPartitions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
onlyRemovePendingTaskToCloseDirtyShouldRemoveTaskFromPendingUpdateActions() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
onlyRemovePendingTaskToCloseDirtyShouldRemoveTaskFromPendingUpdateActions() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldAddAndRemovePendingTaskToSuspend() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldAddAndRemovePendingTaskToSuspend() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldVerifyIfPendingTaskToInitExist() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldVerifyIfPendingTaskToInitExist() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
onlyRemovePendingTaskToCloseCleanShouldRemoveTaskFromPendingUpdateActions() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
onlyRemovePendingTaskToCloseCleanShouldRemoveTaskFromPendingUpdateActions() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldDrainPendingTasksToCreate() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldDrainPendingTasksToCreate() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldKeepAddedTasks() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > TasksTest > 
shouldKeepAddedTasks() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > StateQueryResultTest 
> More than one query result throws IllegalArgumentException STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > StateQueryResultTest 
> More than one query result throws IllegalArgumentException PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > StateQueryResultTest 
> Zero query results shouldn't error STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > StateQueryResultTest 
> Zero query results shouldn't error PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > StateQueryResultTest 
> Valid query results still works STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > StateQueryResultTest 
> Valid query results still works PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > 
RocksDBBlockCacheMetricsTest > shouldRecordCorrectBlockCacheUsage(RocksDBStore, 
StateStoreContext) > [1] 
org.apache.kafka.streams.state.internals.RocksDBStore@71eb334a, 
org.apache.kafka.test.MockInternalProcessorContext@31de0bc5 STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > 
RocksDBBlockCacheMetricsTest > shouldRecordCorrectBlockCacheUsage(RocksDBStore, 
StateStoreContext) > [1] 
org.apache.kafka.streams.state.internals.RocksDBStore@71eb334a, 
org.apache.kafka.test.MockInternalProcessorContext@31de0bc5 PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > 
RocksDBBlockCacheMetricsTest > shouldRecordCorrectBlockCacheUsage(RocksDBStore, 
StateStoreContext) > [2] 
org.apache.kafka.streams.state.internals.RocksDBTimestampedStore@25cbd895, 
org.apache.kafka.test.MockInternalProcessorContext@33c2c3f STARTED

Gradle Test Run :streams:test > Gradle Test Executor 88 > 
RocksDBBlockCacheMetricsTest > shouldRecordCorrectBlockCacheUsage(RocksDBStore, 
StateStoreContext) > [2] 
org.apache.kafka.streams.state.internals.RocksDBTimestampedStore@25cbd895, 
org.apache.kafka.test.MockInternalProcessorContext@33c2c3f PASSED

Gradle Test Run :streams:test > Gradle Test Executor 88 > 
RocksDBBlockCacheMetricsTest > 
shouldRecordCorrectBlockCachePinnedUsage(RocksDBStore, StateStoreContext) > [1] 

Re: [DISCUSS] KIP-980: Allow creating connectors in a stopped state

2023-10-06 Thread Yash Mayya
Hi Chris,

I've updated the KIP to call out the parsing logic and the user
expectations explicitly. Thanks again for all your feedback on this KIP!
I'll wait for a few more days to see if anyone else has comments before
kicking off a vote thread.
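
For reference, a rough sketch of the JSON-first, properties-second parsing
approach discussed in the quoted thread below (illustrative only, using Jackson
and java.util.Properties; not the KIP's actual implementation):

    import com.fasterxml.jackson.core.JsonProcessingException;
    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import java.io.IOException;
    import java.io.StringReader;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.Properties;

    static Map<String, String> parseConnectorConfig(String content) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        try {
            JsonNode node = mapper.readTree(content);
            if (node.isObject()) {
                Map<String, String> config = new LinkedHashMap<>();
                node.fields().forEachRemaining(e -> config.put(e.getKey(), e.getValue().asText()));
                return config;
            }
        } catch (JsonProcessingException e) {
            // Not valid JSON: fall through and try the Java properties format,
            // which is permissive enough to accept almost anything.
        }
        Properties props = new Properties();
        props.load(new StringReader(content));
        Map<String, String> config = new LinkedHashMap<>();
        for (String name : props.stringPropertyNames()) {
            config.put(name, props.getProperty(name));
        }
        return config;
    }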

Thanks,
Yash

On Thu, Oct 5, 2023 at 10:49 PM Chris Egerton 
wrote:

> Hi Yash,
>
> Yeah, I think just hardcoding with JSON-first, properties-second is fine.
> IMO it's worth calling this out explicitly in the KIP.
>
> Apart from that, no further comments. LGTM, thanks for the KIP!
>
> Cheers,
>
> Chris
>
> On Thu, Oct 5, 2023 at 6:23 AM Yash Mayya  wrote:
>
> > Hi Chris,
> >
> > Thanks for all your feedback so far!
> >
> > 3. That's a good question. I was thinking we'd do some "intelligent"
> > parsing internally during the implementation (i.e. essentially your last
> > option - attempting to parse first as one format, then the other) which
> is
> > why I didn't include any more details in the KIP itself (where I've only
> > outlined the contract changes). This would allow for the smoothest user
> > experience IMO and all the heavy lifting will be done in the parsing
> logic.
> > All the other options seemed either clunky or brittle from the user
> > experience point of view. In terms of the actual implementation itself,
> > we'd probably want to first try parsing it into the supported JSON
> > structures before trying to parse it into Java properties since the Java
> > properties format is very permissive (i.e. we won't really see any errors
> > on attempting to parse a JSON file into Java properties).
> >
> > Thanks,
> > Yash
> >
> > On Thu, Oct 5, 2023 at 1:39 AM Chris Egerton 
> > wrote:
> >
> > > Hi Yash,
> > >
> > > Looking great! Few more thoughts:
> > >
> > >
> > > 1. (Downgraded to nit) I still prefer dot-delimitation but it's not a
> > > blocker; thanks for addressing my concerns about the name of the field
> > and
> > > how it may be perceived by users.
> > >
> > > 2. (Addressed) Thanks for looking into this, and sorry it turned out to
> > be
> > > a bit of a dead end! I'm convinced that the current proposal is good
> > > enough.
> > >
> > > 3. Can you shed a little more light on how we'll determine whether a
> > > connector config should be parsed as JSON or as a properties file? Will
> > > this be based on file extension, a command-line flag (which might apply
> > to
> > > all configs, or individual configs), attempting to parse first as one
> > > format then the other, something else?
> > >
> > > 4. (Addressed) Thanks! Looks great.
> > >
> > > 6. (Addressed) Awesome, great to hear. The point about laggy connector
> > > startup is very convincing; my paranoia is satiated.
> > >
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Wed, Oct 4, 2023 at 5:35 AM Yash Mayya 
> wrote:
> > >
> > > > Hi Chris,
> > > >
> > > > Thanks for the quick follow up and the continued insightful
> discourse!
> > > >
> > > > 1. Fair point on the need to differentiate it from the actual state
> > > > displayed in the status API, I like the prefix of "initial" to make
> > that
> > > > differentiation (from your suggested alternatives previously).
> > Regarding
> > > > the dots vs underscores as delimiters - the new state field will be a
> > top
> > > > level field in the connector creation request body alongside the
> > "config"
> > > > map (i.e. it won't be a connector configuration itself), so I think
> we
> > > > should be using the underscore delimiter for consistency. For now,
> I've
> > > > updated the KIP to use "initial_state" as the new field's name - let
> me
> > > > know if you disagree, and I'd be happy to reconsider.
> > > >
> > > > 2. Hm, I actually hadn't considered the downgrade implications with
> > your
> > > > proposed single record approach. I agree that it's a bigger downside
> > than
> > > > writing two records to the config topic. I do understand your
> concerns
> > > with
> > > > the potential for config topic inconsistencies which is why I
> proposed
> > > > writing the target state first (since the presence of a target state
> > for
> > > a
> > > > connector with no configuration is a benign condition). Also, even in
> > the
> > > > non-transactional config topic producer case - if there is a failure
> > > > between the two writes, the user will be notified of the error
> > > > synchronously via the API response (ref -
> > > > https://github.com/apache/kafka/pull/12984) and will be able to
> safely
> > > > retry the operation. I don't see how we'd be able to do a single
> record
> > > > write approach along with supporting clean downgrades since we'd
> either
> > > > need to introduce a new record type or add a new field to an existing
> > > > record type - neither of which would be recognized as such by an
> older
> > > > Connect worker.
> > > >
> > > > > Standalone mode has always supported the REST API,
> > > > and so far FWICT we've maintained feature parity between
> > > > > the two modes
> > > >
> > > > > add support for JSON files with 

Re: [PR] Added a blog entry for 3.6.0 release [kafka-site]

2023-10-06 Thread via GitHub


satishd commented on PR #547:
URL: https://github.com/apache/kafka-site/pull/547#issuecomment-1750070114

   @mimaison I did not update the PR completely earlier based on the review 
comments as we were planning to update the KIP items section with more details 
as received from others' feedback. I have updated the PR with each KIP 
description and with minor format changes.





Re: [PR] Added a blog entry for 3.6.0 release [kafka-site]

2023-10-06 Thread via GitHub


satishd commented on PR #547:
URL: https://github.com/apache/kafka-site/pull/547#issuecomment-1750064512

   @mimaison Earlier I did not update the PR completely based on the 
review comments as we were planning to update the KIP items section with more 
details as received from others' feedback. I have updated the PR with each KIP 
description and with minor format changes.

