[jira] [Created] (KAFKA-16314) Add the new ABORTABLE_ERROR

2024-02-28 Thread Sanskar Jhajharia (Jira)
Sanskar Jhajharia created KAFKA-16314:
-

 Summary: Add the new ABORTABLE_ERROR
 Key: KAFKA-16314
 URL: https://issues.apache.org/jira/browse/KAFKA-16314
 Project: Kafka
  Issue Type: Sub-task
Reporter: Sanskar Jhajharia
Assignee: Sanskar Jhajharia


As mentioned in the KIP, we would bump the ProduceRequest and ProduceResponse 
versions to indicate that the server now returns a new ABORTABLE_ERROR. This 
error essentially requires the client to abort the current transaction and 
continue, without needing to restart the client.
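
For illustration, the abort-and-continue pattern this error enables might look 
like the following on the client (a sketch only; the exact client-side exception 
type is not pinned down here, so the generic non-fatal {{KafkaException}} stands 
in for it):

{code:java}
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.ProducerFencedException;

void produceTransactionally(KafkaProducer<String, String> producer) {
    while (true) {
        producer.beginTransaction();
        try {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
            producer.commitTransaction();
            return;                       // success
        } catch (ProducerFencedException fatal) {
            producer.close();             // fatal: the producer cannot continue
            throw fatal;
        } catch (KafkaException abortable) {
            producer.abortTransaction();  // abort and retry without restarting
        }
    }
}
{code}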



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-1021: Allow to get last stable offset (LSO) in kafka-get-offsets.sh

2024-02-28 Thread Luke Chen
Hi Ahmed,

Thanks for the KIP!

Comments:
1. If we all agree with the suggestion from Andrew, you could update the
KIP.

Otherwise, LGTM!


Thanks.
Luke

On Thu, Feb 29, 2024 at 1:32 AM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi Ahmed,
> Could do. Personally, I find the existing “--time -1” totally horrid
> anyway, which was why
> I suggested an alternative. I think your suggestion of a flag for
> isolation level is much
> better than -6.
>
> Maybe I should put in a KIP which adds:
> --latest (as a synonym for --time -1)
> --earliest (as a synonym for --time -2)
> --max-timestamp (as a synonym for --time -3)
>
> That’s really what I would prefer. If the user has a timestamp, use
> `--time`. If they want a
> specific special offset, use a separate flag.
>
> Thanks,
> Andrew
>
> > On 28 Feb 2024, at 09:22, Ahmed Sobeh 
> wrote:
> >
> > Hi Andrew,
> >
> > Thanks for the hint! That sounds reasonable. Do you think adding a
> > conditional argument, something like "--time -1 --isolation -committed"
> and
> > "--time -1 --isolation -uncommitted" would make sense to keep the
> > consistency of getting the offset by time? Or do you think having a
> special
> > argument for this case is better?
> >
> > On Tue, Feb 27, 2024 at 2:19 PM Andrew Schofield <
> > andrew_schofield_j...@outlook.com> wrote:
> >
> >> Hi Ahmed,
> >> Thanks for the KIP.  It looks like a useful addition.
> >>
> >> The use of negative timestamps, and in particular letting the user use
> >> `--time -1` or the equivalent `--time latest`
> >> is a bit peculiar in this tool already. The negative timestamps come
> from
> >> org.apache.kafka.common.requests.ListOffsetsRequest,
> >> but you’re not actually adding another value to that. As a result, I
> >> really wouldn’t recommend using -6 for the new
> >> flag. LSO is really a variant of -1 with read_committed isolation level.
> >>
> >> I think that perhaps it would be better to add `--last-stable` as an
> >> alternative to `--time`. Then you’ll get the LSO with
> >> cleaner syntax.
> >>
> >> Thanks,
> >> Andrew
> >>
> >>
> >>> On 27 Feb 2024, at 10:12, Ahmed Sobeh 
> >> wrote:
> >>>
> >>> Hi all,
> >>> I would like to start a discussion on KIP-1021, which would enable
> >> getting
> >>> LSO in kafka-get-offsets.sh:
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1021%3A+Allow+to+get+last+stable+offset+%28LSO%29+in+kafka-get-offsets.sh
> >>>
> >>> Best,
> >>> Ahmed
> >>
> >>
> >
> > --
> > *Ahmed Sobeh*
> > Engineering Manager OSPO, *Aiven*
> > ahmed.so...@aiven.io
> > aiven.io
> > *Aiven Deutschland GmbH*
> > Immanuelkirchstraße 26, 10405 Berlin
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > Amtsgericht Charlottenburg, HRB 209739 B
>
>
>
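
For reference, the syntax variants discussed above look roughly like this (the
--isolation and --latest flags are proposals from this thread, not existing
options of the tool):

bin/kafka-get-offsets.sh --bootstrap-server localhost:9092 --topic my-topic --time -1
bin/kafka-get-offsets.sh --bootstrap-server localhost:9092 --topic my-topic --time -1 --isolation committed
bin/kafka-get-offsets.sh --bootstrap-server localhost:9092 --topic my-topic --latest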


Re: [DISCUSS] KIP-1022 Formatting and Updating Features

2024-02-28 Thread Jun Rao
Hi, Justine,

Thanks for the KIP. A few comments below.

10. Currently, MV controls records, inter-broker RPCs and code logic. With
more features, would each of those be controlled by a separate feature or by
multiple features? For example, is the new transaction record format
controlled only by MV (with TV having a dependency on MV), or is it controlled
by both MV and TV?

11. "If not all features are covered with this flag, the command will just
use the latest production version of the feature like it does for metadata
version."
Should we apply that logic to --release-version too? Basically, if
--release-version is not used, the command will just use the latest
production version of every feature. Should we apply that logic to both
tools?

12. Should we remove --metadata METADATA from kafka-features? It does the
same thing as --release-version.

13. KIP-853 also extends the tools to support a new feature kraft.version.
It would be useful to have alignment between that KIP and this one.

Jun
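
For context, the commands under discussion look roughly like this (--feature
and the features-tool --release-version are per the KIP draft and may change):

bin/kafka-storage.sh format --cluster-id <id> --config server.properties --release-version 3.8-IV0 --feature transaction.version=2
bin/kafka-features.sh --bootstrap-server localhost:9092 upgrade --release-version 3.8-IV0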


On Wed, Feb 28, 2024 at 10:49 AM Artem Livshits
 wrote:

> Hi Justine,
>
> Thank you for the KIP.  I think the KIP is pretty clear and makes sense to
> me.  Maybe it would be good to give a little more detail on the
> implementation of feature mapping and how the tool would validate the
> feature combinations.  For example, I'd expect that
>
> bin/kafka-storage.sh format --release-version 3.6-IV1 --feature
> transaction.version=2
>
> would give an error because the new transaction protocol is not supported
> in 3.6.  Also, we may decide that
>
> bin/kafka-storage.sh format --release-version 5.0-IV0 --feature
> transaction.version=0
>
> would be an unsupported combination as it'll have been a while since the
> new transaction protocol has been the default and it would be too risky to
> enable this combination as it may not be tested any more.
>
> As for the new names, I'm thinking of the "transaction feature version"
> more like a "transaction protocol version" -- from the user perspective we
> don't really add new functionality in KIP-890, we're changing the protocol
> to be more robust (and potentially faster).
>
> -Artem
>
>
>
> On Wed, Feb 28, 2024 at 10:08 AM Justine Olshan
>  wrote:
>
> > Hey Andrew,
> >
> > Thanks for taking a look.
> >
> > I previously didn't include 1. We do plan to use these features
> immediately
> > for KIP-890 and KIP-848. If we think it is easier to put the discussion
> in
> > those KIP discussions we can, but I fear that it will easily get lost
> given
> > the size of the KIPs.
> >
> > I named the features similar to how we named metadata version.
> Transaction
> > version would control transaction features like enabling a new
> transaction
> > record format and APIs to enable KIP-890 part 2. Likewise, the group
> > coordinator version would also enable the new record formats there and
> the
> > new group coordinator. I am open to new names or further discussion.
> >
> > For 2 and 3, I can provide example scripts that show the usage. I am open
> > to adding --latest-stable as well.
> >
> > Justine
> >
> > On Tue, Feb 27, 2024 at 4:59 AM Andrew Schofield <
> > andrew_schofield_j...@outlook.com> wrote:
> >
> > > Hi Justine,
> > > Thanks for the KIP. This area of Kafka is complicated and making it
> > easier
> > > is good.
> > >
> > > When I use the `kafka-features.sh` tool to describe the features on my
> > > cluster, I find that there’s a
> > > single feature called “metadata.version”. I think this KIP does a
> handful
> > > of things:
> > >
> > > 1) It introduces the idea of two new features, TV and GCV, without
> giving
> > > them concrete names or
> > > describing their behaviour.
> > > 2) It introduces a new flag on the storage tool to enable advanced
> users
> > > to control individual features
> > > when they format storage for a new broker.
> > > 3) It introduces a new flag on the features tool to enable a set of
> > latest
> > > stable features for a given
> > > version to be enabled all together.
> > >
> > > I think that (1) probably shouldn’t be in this KIP unless there are
> > > concrete details. Maybe this KIP is enabling
> > > the operator experience when we introduce TV and GCV in other KIPs. I
> > > don’t believe the plan is to enable
> > > the new group coordinator with a feature, and it seems unnecessary to
> me.
> > > I think it’s more compelling for TV
> > > given the changes in transactions.
> > >
> > > For (2) and (3), it would be helpful to be explicit about the syntax for
> the
> > > enhancements to the tool. I think
> > > that for the features tool, `--release-version` is an optional
> parameter
> > > which requires a RELEASE_VERSION
> > > argument. I wonder whether it would be helpful to have
> `--latest-stable`
> > > as an option too.
> > >
> > > Thanks,
> > > Andrew
> > >
> > > > On 26 Feb 2024, at 21:26, Justine Olshan
>  > >
> > > wrote:
> > > >
> > > > Hello folks,
> > > >
> > > > I'm proposing a KIP that allows for setting and upgrading new
> features
> > > > 

Re: Shortened URLs for KIPs?

2024-02-28 Thread Chris Egerton
Hi Kirk,

Interesting proposal! I gave it a shot with one of my own prior KIPs and
was able to generate https://s.apache.org/kip-618 for it.

It looks like uppercase characters aren't permitted for URL IDs (even
though the regex listed in that text box does claim to allow them).

I can't commit to doing this for every KIP in perpetuity, but I wouldn't
mind giving it a shot on at least a trial basis unless any of my colleagues
have objections.

Best,

Chris

On Wed, Feb 28, 2024 at 5:45 PM Kirk True  wrote:

> I just found https://s.apache.org/, which is an Apache shortened URL
> service. That might provide the needed infrastructure, but it requires a
> login, so someone (a committer?) would need to create one for each KIP :(
>
> > On Feb 28, 2024, at 2:40 PM, Kirk True  wrote:
> >
> > Hi all,
> >
> > Is it possible to set up shortened URLs for KIPs? So instead of, say:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas
> >
> > We could refer to it as:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-966
> >
> > Or maybe even go so far as to have something like:
> >
> > https://kafka.apache.org/kips/KIP-966
> >
> > I know the wiki has a way to generate a short URL (e.g.
> https://cwiki.apache.org/confluence/x/mpOzDw), but, IMO, it’s so opaque
> as to be nearly worthless.
> >
> > Pros:
> >
> > 1. Succinct: great for written documentation
> > 2. Discoverability: it’s predictable and easy to find
> > 3. Robust: the URL doesn’t break when the KIP title changes
> >
> > Cons:
> >
> > 1. Time
> > 2. Money
> > 3. Perpetual maintenance: requires 100% commitment indefinitely
> >
> > I know the list of cons is probably much more than I realize. At this
> point I’m just wondering if it’s even a desired mechanism.
> >
> > Thoughts?
> >
> > Thanks,
> > Kirk
>
>


Re: Newbie need some help

2024-02-28 Thread Kirk True
Hi Chia,

Welcome!

One suggestion is to look at the list of open Jiras that are marked with the 
“newbie” label:

https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20labels%20IN%20(newbie%2C%20%22newbie%2B%2B%22)%20AND%20status%20in%20(Open%2C%20Reopened)%20ORDER%20BY%20created%20DESC


Thanks,
Kirk

> On Feb 27, 2024, at 5:00 PM, Chia-Chuan Yu  wrote:
> 
> Hi, team
> I’m a new member who just joined the community. I work at KLA as a software
> engineer. We use Kafka as part of our data pipeline. I’m currently looking
> for some issues to work on as my first step of contributing. Are there any
> recommended tasks I can look into? Please let me know.
> 
> JIRA username:chiacyu
> 
> Best,
> Chia Chuan Yu



Re: Shortened URLs for KIPs?

2024-02-28 Thread Kirk True
I just found https://s.apache.org/, which is an Apache shortened URL service. 
That might provide the needed infrastructure, but it requires a login, so 
someone (a committer?) would need to create one for each KIP :(

> On Feb 28, 2024, at 2:40 PM, Kirk True  wrote:
> 
> Hi all,
> 
> Is it possible to set up shortened URLs for KIPs? So instead of, say:
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas
> 
> We could refer to it as:
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966
> 
> Or maybe even go so far as to have something like:
> 
> https://kafka.apache.org/kips/KIP-966
> 
> I know the wiki has a way to generate a short URL (e.g. 
> https://cwiki.apache.org/confluence/x/mpOzDw), but, IMO, it’s so opaque as to 
> be nearly worthless. 
> 
> Pros:
> 
> 1. Succinct: great for written documentation
> 2. Discoverability: it’s predictable and easy to find
> 3. Robust: the URL doesn’t break when the KIP title changes
> 
> Cons:
> 
> 1. Time
> 2. Money
> 3. Perpetual maintenance: requires 100% commitment indefinitely
> 
> I know the list of cons is probably much more than I realize. At this point 
> I’m just wondering if it’s even a desired mechanism.
> 
> Thoughts?
> 
> Thanks,
> Kirk



[jira] [Created] (KAFKA-16313) Offline group protocol upgrade

2024-02-28 Thread Dongnuo Lyu (Jira)
Dongnuo Lyu created KAFKA-16313:
---

 Summary: Offline group protocol upgrade
 Key: KAFKA-16313
 URL: https://issues.apache.org/jira/browse/KAFKA-16313
 Project: Kafka
  Issue Type: Sub-task
Reporter: Dongnuo Lyu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Shortened URLs for KIPs?

2024-02-28 Thread Kirk True
Hi all,

Is it possible to set up shortened URLs for KIPs? So instead of, say:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas

We could refer to it as:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-966

Or maybe even go so far as to have something like:

https://kafka.apache.org/kips/KIP-966

I know the wiki has a way to generate a short URL (e.g. 
https://cwiki.apache.org/confluence/x/mpOzDw), but, IMO, it’s so opaque as to 
be nearly worthless. 

Pros:

1. Succinct: great for written documentation
2. Discoverability: it’s predictable and easy to find
3. Robust: the URL doesn’t break when the KIP title changes

Cons:

1. Time
2. Money
3. Perpetual maintenance: requires 100% commitment indefinitely

I know the list of cons is probably much more than I realize. At this point I’m 
just wondering if it’s even a desired mechanism.

Thoughts?

Thanks,
Kirk

Newbie need some help

2024-02-28 Thread Chia-Chuan Yu
Hi, team
I’m a new member who just joined the community. I work at KLA as a software
engineer. We use Kafka as part of our data pipeline. I’m currently looking
for some issues to work on as my first step of contributing. Are there any
recommended tasks I can look into? Please let me know.

JIRA username:chiacyu

Best,
Chia Chuan Yu


Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2024-02-28 Thread David Arthur
Andrew/Jose, I like the suggested Flow API. It's also similar to the stream
observers in gRPC. I'm not sure we should expose something as complex as
the Flow API directly in KafkaAdminClient, but certainly we can provide a
similar interface.

---
Cancellations:

Another thing not yet discussed is how to cancel in-flight requests. For
other calls in KafkaAdminClient, we use KafkaFuture which has a "cancel"
method. With the callback approach, we need to be able to cancel the
request from within the callback as well as externally. Looking to the Flow
API again for inspiration, we could have the admin client pass an object to
the callback which can be used for cancellation. In the simple case, users
can ignore this object. In the advanced case, they can create a concrete
class for the callback and cache the cancellation object so it can be
accessed externally. This would be similar to the Subscription in the Flow
API.
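
As a concrete sketch of that idea (names are illustrative, mirroring
Flow.Subscription):

interface AdminCancellation {
  // Stop issuing further RPCs for this streaming call; results already
  // received may still be delivered to the callback.
  void cancel();
}

interface DescribeTopicsCallback {
  // The admin client supplies the cancellation handle; simple callers ignore
  // it, advanced callers cache it to cancel the stream externally.
  void onNext(TopicDescription description, AdminCancellation cancellation);
}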

---
Topics / Partitions:

For the case of topic descriptions, we actually have two data types
interleaved in one stream (topics and partitions). This means if we go with
TopicDescription in the "onNext" method, we will have a partial set of
topics in some cases. Also, we will end up calling "onNext" more than once
for each RPC in the case that a single RPC response spans multiple topics.

One alternative to a single "onNext" would be an interface more tailored to
the RPC like:

interface DescribeTopicsStreamObserver {
  // Called for each topic in the result stream.
  void onTopic(TopicInfo topic);

  // Called for each partition of the topic last handled by onTopic
  void onPartition(TopicPartitionInfo partition);

  // Called once the broker has finished streaming results to the admin
client. This marks the end of the stream.
  void onComplete();

  // Called if an error occurs on the underlying stream. This marks the end
of the stream.
  void onError(Throwable t);
}

---
Consumer API:

Offline, there was some discussion about using a simple SAM consumer-like
interface:

interface AdminResultsConsumer<T> {
  void onNext(T next, Throwable t);
}

This has the benefit of being quite simple and letting callers supply a
lambda instead of a full anonymous class definition. This would use
nullable arguments like CompletableFuture#whenComplete. We could also use
an Optional pattern here instead of nullables.
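
For illustration, that shape lets callers pass a lambda (the admin method name
and the handler calls here are hypothetical; the callback shape is the point):

admin.describeTopicsStreaming(topicNames, (description, error) -> {
  if (error != null) {
    handleError(error);        // terminal failure; end of stream
  } else {
    handleTopic(description);  // one result at a time
  }
});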

---
Summary:

So far, it seems like we are looking at these different options. The main
difference in terms of API design is if the user will need to implement
more than one method, or if a lambda can suffice.

1. Generic, Flow-like interface: AdminResultsSubscriber
2. DescribeTopicsStreamObserver (in this message above)
3. AdminResultsConsumer
4. AdminResultsConsumer with an Optional-like type instead of nullable
arguments



-David




On Fri, Feb 23, 2024 at 4:00 PM José Armando García Sancio
 wrote:

> Hi Calvin
>
> On Fri, Feb 23, 2024 at 9:23 AM Calvin Liu 
> wrote:
> > As we agreed to implement the pagination for the new API
> > DescribeTopicPartitions, the client side must also add a proper interface
> > to handle the pagination.
> > The current KafkaAdminClient.describeTopics returns
> > the DescribeTopicsResult which is the future for querying all the topics.
> > It is awkward to fit the pagination into it because
>
> I suggest taking a look at Java's Flow API:
>
> https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/Flow.html
> It was designed for this specific use case and many libraries integrate with
> it.
>
> If the Kafka client cannot be upgraded to Java 9, which
> introduced that API, you can copy the same interface and semantics.
> This would allow users to easily integrate with reactive libraries
> since they all integrate with Java Flow.
>
> Thanks,
> --
> -José
>


-- 
-David


Re: [DISCUSS] KIP-939: Support Participation in 2PC

2024-02-28 Thread Jun Rao
Hi, Artem,

Thanks for the reply.

I understand your concern on having a timeout breaking the 2PC guarantees.
However, the fallback plan to disable 2PC with an independent
keepPreparedTxn is subject to the timeout too. So, it doesn't provide the
same guarantees as 2PC either.

To me, if we provide new functionality, we should make it easy such that
the application developer only needs to implement it in one way, which is
always correct. Then, we can consider what additional things are needed to
make the operator comfortable enabling it.

Jun

On Tue, Feb 27, 2024 at 4:45 PM Artem Livshits
 wrote:

> Hi Jun,
>
> Thank you for the discussion.
>
> > For 3b, it would be useful to understand the reason why an admin doesn't
> authorize 2PC for self-hosted Flink
>
> I think the nuance here is that for cloud, there is a cloud admin
> (operator) and there is a cluster admin (who, for example, could manage ACLs
> on topics, etc.).  The 2PC functionality can affect cloud operations,
> because a long running transaction can block the last stable offset and
> prevent compaction or data tiering.  In a multi-tenant environment, a long
> running transaction that involves consumer offset may affect data that is
> shared by multiple tenants (Flink transactions don't use consumer offsets,
> so this is not an issue for Flink, but we'd need a separate ACL or some
> other way to express this permission if we wanted to go in that direction).
>
> For that reason, I expect 2PC to be controlled by the cloud operator and it
> just may not be scalable for the cloud operator to manage all potential
> interactions required to resolve in-doubt transactions (communicate to the
> end users, etc.).  In general, we make no assumptions about Kafka
> applications -- they may come and go, they may abandon transactional ids
> and generate new ones.  For 2PC we need to make sure that the application
> is highly available and wouldn't easily abandon an open transaction in
> Kafka.
>
> > If so, another way to address that is to allow the admin to set a timeout
> even for the 2PC case.
>
> This effectively abandons the 2PC guarantee because it creates a case for
> Kafka to unilaterally make an automatic decision on a prepared
> transaction.  I think it's fundamental for 2PC to abandon this ability and
> wait for the external coordinator for the decision, after all the
> coordinator may legitimately be unavailable for an arbitrary amount of
> time.  Also, we already have a timeout on regular Kafka transactions,
> having another "special" timeout could be confusing, and a large enough
> timeout could still produce the undesirable effects for the cloud
> operations (so we kind of get worst of both options -- we don't provide
> guarantees and still have impact on operations).
>
> -Artem
>
> On Fri, Feb 23, 2024 at 8:55 AM Jun Rao  wrote:
>
> > Hi, Artem,
> >
> > Thanks for the reply.
> >
> > For 3b, it would be useful to understand the reason why an admin doesn't
> > authorize 2PC for self-hosted Flink. Is the main reason that 2PC has
> > unbounded timeout that could lead to unbounded outstanding transactions?
> If
> > so, another way to address that is to allow the admin to set a timeout
> even
> > for the 2PC case. The timeout would be long enough for behavioring
> > applications to complete 2PC operations, but not too long for
> non-behaving
> > applications' transactions to hang.
> >
> > Jun
> >
> > On Wed, Feb 21, 2024 at 4:34 PM Artem Livshits
> >  wrote:
> >
> > > Hi Jun,
> > >
> > > > 20A. One option is to make the API initTransactions(boolean
> enable2PC).
> > >
> > > We could do that.  I think there is a little bit of symmetry between
> the
> > > client and server that would get lost with this approach (server has
> > > enable2PC as config), but I don't really see a strong reason for
> > enable2PC
> > > to be a config vs. an argument for initTransactions.  But let's see if
> we
> > > find 20B to be a strong consideration for keeping a separate flag for
> > > keepPreparedTxn.
> > >
> > > > 20B. But realistically, we want Flink (and other apps) to have a
> single
> > > implementation
> > >
> > > That's correct and here's what I think can happen if we don't allow
> > > independent keepPreparedTxn:
> > >
> > > 1. Pre-KIP-939 self-hosted Flink vs. any Kafka cluster -- reflection is
> > > used, which effectively implements keepPreparedTxn=true without our
> > > explicit support.
> > > 2. KIP-939 self-hosted Flink vs. pre-KIP-939 Kafka cluster -- we can
> > > either fall back to reflection or we just say we don't support this,
> have
> > > to upgrade Kafka cluster first.
> > > 3. KIP-939 self-hosted Flink vs. KIP-939 Kafka cluster, this becomes
> > > interesting depending on whether the Kafka cluster authorizes 2PC or
> not:
> > >  3a. Kafka cluster autorizes 2PC for self-hosted Flink -- everything
> uses
> > > KIP-939 and there is no problem
> > >  3b. Kafka cluster doesn't authorize 2PC for self-hosted Flink -- we
> can
> > > either fallback to 
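
For reference, the producer API shape from option 20A above would look roughly
like this (a KIP-939 proposal from the thread; today's
KafkaProducer.initTransactions() takes no arguments):

// Proposed shape only -- not an existing Kafka API:
producer.initTransactions(true);  // enable2PC = true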

Re: [ANNOUNCE] Apache Kafka 3.7.0

2024-02-28 Thread Boudjelda Mohamed Said
Thanks Stanislav for running the release!

On Wed, Feb 28, 2024 at 10:36 PM Kirk True  wrote:

> Thanks Stanislav
>
> > On Feb 27, 2024, at 10:01 AM, Stanislav Kozlovski <
> stanislavkozlov...@apache.org> wrote:
> >
> > The Apache Kafka community is pleased to announce the release of
> > Apache Kafka 3.7.0
> >
> > This is a minor release that includes new features, fixes, and
> > improvements from 296 JIRAs
> >
> > An overview of the release and its notable changes can be found in the
> > release blog post:
> > https://kafka.apache.org/blog#apache_kafka_370_release_announcement
> >
> > All of the changes in this release can be found in the release notes:
> > https://www.apache.org/dist/kafka/3.7.0/RELEASE_NOTES.html
> >
> > You can download the source and binary release (Scala 2.12, 2.13) from:
> > https://kafka.apache.org/downloads#3.7.0
> >
> >
> ---
> >
> >
> > Apache Kafka is a distributed streaming platform with four core APIs:
> >
> >
> > ** The Producer API allows an application to publish a stream of records
> to
> > one or more Kafka topics.
> >
> > ** The Consumer API allows an application to subscribe to one or more
> > topics and process the stream of records produced to them.
> >
> > ** The Streams API allows an application to act as a stream processor,
> > consuming an input stream from one or more topics and producing an
> > output stream to one or more output topics, effectively transforming the
> > input streams to output streams.
> >
> > ** The Connector API allows building and running reusable producers or
> > consumers that connect Kafka topics to existing applications or data
> > systems. For example, a connector to a relational database might
> > capture every change to a table.
> >
> >
> > With these APIs, Kafka can be used for two broad classes of application:
> >
> > ** Building real-time streaming data pipelines that reliably get data
> > between systems or applications.
> >
> > ** Building real-time streaming applications that transform or react
> > to the streams of data.
> >
> >
> > Apache Kafka is in use at large and small companies worldwide, including
> > Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
> > Target, The New York Times, Uber, Yelp, and Zalando, among others.
> >
> > A big thank you to the following 146 contributors to this release!
> > (Please report an unintended omission)
> >
> > Abhijeet Kumar, Akhilesh Chaganti, Alieh, Alieh Saeedi, Almog Gavra,
> > Alok Thatikunta, Alyssa Huang, Aman Singh, Andras Katona, Andrew
> > Schofield, Anna Sophie Blee-Goldman, Anton Agestam, Apoorv Mittal,
> > Arnout Engelen, Arpit Goyal, Artem Livshits, Ashwin Pankaj,
> > ashwinpankaj, atu-sharm, bachmanity1, Bob Barrett, Bruno Cadonna,
> > Calvin Liu, Cerchie, chern, Chris Egerton, Christo Lolov, Colin
> > Patrick McCabe, Colt McNealy, Crispin Bernier, David Arthur, David
> > Jacot, David Mao, Deqi Hu, Dimitar Dimitrov, Divij Vaidya, Dongnuo
> > Lyu, Eaugene Thomas, Eduwer Camacaro, Eike Thaden, Federico Valeri,
> > Florin Akermann, Gantigmaa Selenge, Gaurav Narula, gongzhongqiang,
> > Greg Harris, Guozhang Wang, Gyeongwon, Do, Hailey Ni, Hanyu Zheng, Hao
> > Li, Hector Geraldino, hudeqi, Ian McDonald, Iblis Lin, Igor Soarez,
> > iit2009060, Ismael Juma, Jakub Scholz, James Cheng, Jason Gustafson,
> > Jay Wang, Jeff Kim, Jim Galasyn, John Roesler, Jorge Esteban Quilcate
> > Otoya, Josep Prat, José Armando García Sancio, Jotaniya Jeel, Jouni
> > Tenhunen, Jun Rao, Justine Olshan, Kamal Chandraprakash, Kirk True,
> > kpatelatwork, kumarpritam863, Laglangyue, Levani Kokhreidze, Lianet
> > Magrans, Liu Zeyu, Lucas Brutschy, Lucia Cerchie, Luke Chen, maniekes,
> > Manikumar Reddy, mannoopj, Maros Orsak, Matthew de Detrich, Matthias
> > J. Sax, Max Riedel, Mayank Shekhar Narula, Mehari Beyene, Michael
> > Westerby, Mickael Maison, Nick Telford, Nikhil Ramakrishnan, Nikolay,
> > Okada Haruki, olalamichelle, Omnia G.H Ibrahim, Owen Leung, Paolo
> > Patierno, Philip Nee, Phuc-Hong-Tran, Proven Provenzano, Purshotam
> > Chauhan, Qichao Chu, Matthias J. Sax, Rajini Sivaram, Renaldo Baur
> > Filho, Ritika Reddy, Robert Wagner, Rohan, Ron Dagostino, Roon, runom,
> > Ruslan Krivoshein, rykovsi, Sagar Rao, Said Boudjelda, Satish Duggana,
> > shuoer86, Stanislav Kozlovski, Taher Ghaleb, Tang Yunzi, TapDang,
> > Taras Ledkov, tkuramoto33, Tyler Bertrand, vamossagar12, Vedarth
> > Sharma, Viktor Somogyi-Vass, Vincent Jiang, Walker Carlson,
> > Wuzhengyu97, Xavier Léauté, Xiaobing Fang, yangy, Ritika Reddy,
> > Yanming Zhou, Yash Mayya, yuyli, zhaohaidao, Zihao Lin, Ziming Deng
> >
> > We welcome your help and feedback. For more information on how to
> > report problems, and to get involved, visit the project website at
> > https://kafka.apache.org/
> >
> > Thank you!
> >
> >
> > Regards,
> >
> > Stanislav Kozlovski
> > Release Manager for Apache Kafka 3.7.0
>
>


Re: [ANNOUNCE] Apache Kafka 3.7.0

2024-02-28 Thread Kirk True
Thanks Stanislav

> On Feb 27, 2024, at 10:01 AM, Stanislav Kozlovski 
>  wrote:
> 
> The Apache Kafka community is pleased to announce the release of
> Apache Kafka 3.7.0
> 
> This is a minor release that includes new features, fixes, and
> improvements from 296 JIRAs
> 
> An overview of the release and its notable changes can be found in the
> release blog post:
> https://kafka.apache.org/blog#apache_kafka_370_release_announcement
> 
> All of the changes in this release can be found in the release notes:
> https://www.apache.org/dist/kafka/3.7.0/RELEASE_NOTES.html
> 
> You can download the source and binary release (Scala 2.12, 2.13) from:
> https://kafka.apache.org/downloads#3.7.0
> 
> ---
> 
> 
> Apache Kafka is a distributed streaming platform with four core APIs:
> 
> 
> ** The Producer API allows an application to publish a stream of records to
> one or more Kafka topics.
> 
> ** The Consumer API allows an application to subscribe to one or more
> topics and process the stream of records produced to them.
> 
> ** The Streams API allows an application to act as a stream processor,
> consuming an input stream from one or more topics and producing an
> output stream to one or more output topics, effectively transforming the
> input streams to output streams.
> 
> ** The Connector API allows building and running reusable producers or
> consumers that connect Kafka topics to existing applications or data
> systems. For example, a connector to a relational database might
> capture every change to a table.
> 
> 
> With these APIs, Kafka can be used for two broad classes of application:
> 
> ** Building real-time streaming data pipelines that reliably get data
> between systems or applications.
> 
> ** Building real-time streaming applications that transform or react
> to the streams of data.
> 
> 
> Apache Kafka is in use at large and small companies worldwide, including
> Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
> Target, The New York Times, Uber, Yelp, and Zalando, among others.
> 
> A big thank you to the following 146 contributors to this release!
> (Please report an unintended omission)
> 
> Abhijeet Kumar, Akhilesh Chaganti, Alieh, Alieh Saeedi, Almog Gavra,
> Alok Thatikunta, Alyssa Huang, Aman Singh, Andras Katona, Andrew
> Schofield, Anna Sophie Blee-Goldman, Anton Agestam, Apoorv Mittal,
> Arnout Engelen, Arpit Goyal, Artem Livshits, Ashwin Pankaj,
> ashwinpankaj, atu-sharm, bachmanity1, Bob Barrett, Bruno Cadonna,
> Calvin Liu, Cerchie, chern, Chris Egerton, Christo Lolov, Colin
> Patrick McCabe, Colt McNealy, Crispin Bernier, David Arthur, David
> Jacot, David Mao, Deqi Hu, Dimitar Dimitrov, Divij Vaidya, Dongnuo
> Lyu, Eaugene Thomas, Eduwer Camacaro, Eike Thaden, Federico Valeri,
> Florin Akermann, Gantigmaa Selenge, Gaurav Narula, gongzhongqiang,
> Greg Harris, Guozhang Wang, Gyeongwon, Do, Hailey Ni, Hanyu Zheng, Hao
> Li, Hector Geraldino, hudeqi, Ian McDonald, Iblis Lin, Igor Soarez,
> iit2009060, Ismael Juma, Jakub Scholz, James Cheng, Jason Gustafson,
> Jay Wang, Jeff Kim, Jim Galasyn, John Roesler, Jorge Esteban Quilcate
> Otoya, Josep Prat, José Armando García Sancio, Jotaniya Jeel, Jouni
> Tenhunen, Jun Rao, Justine Olshan, Kamal Chandraprakash, Kirk True,
> kpatelatwork, kumarpritam863, Laglangyue, Levani Kokhreidze, Lianet
> Magrans, Liu Zeyu, Lucas Brutschy, Lucia Cerchie, Luke Chen, maniekes,
> Manikumar Reddy, mannoopj, Maros Orsak, Matthew de Detrich, Matthias
> J. Sax, Max Riedel, Mayank Shekhar Narula, Mehari Beyene, Michael
> Westerby, Mickael Maison, Nick Telford, Nikhil Ramakrishnan, Nikolay,
> Okada Haruki, olalamichelle, Omnia G.H Ibrahim, Owen Leung, Paolo
> Patierno, Philip Nee, Phuc-Hong-Tran, Proven Provenzano, Purshotam
> Chauhan, Qichao Chu, Matthias J. Sax, Rajini Sivaram, Renaldo Baur
> Filho, Ritika Reddy, Robert Wagner, Rohan, Ron Dagostino, Roon, runom,
> Ruslan Krivoshein, rykovsi, Sagar Rao, Said Boudjelda, Satish Duggana,
> shuoer86, Stanislav Kozlovski, Taher Ghaleb, Tang Yunzi, TapDang,
> Taras Ledkov, tkuramoto33, Tyler Bertrand, vamossagar12, Vedarth
> Sharma, Viktor Somogyi-Vass, Vincent Jiang, Walker Carlson,
> Wuzhengyu97, Xavier Léauté, Xiaobing Fang, yangy, Ritika Reddy,
> Yanming Zhou, Yash Mayya, yuyli, zhaohaidao, Zihao Lin, Ziming Deng
> 
> We welcome your help and feedback. For more information on how to
> report problems, and to get involved, visit the project website at
> https://kafka.apache.org/
> 
> Thank you!
> 
> 
> Regards,
> 
> Stanislav Kozlovski
> Release Manager for Apache Kafka 3.7.0



Re: [DISCUSS] KIP-956: Tiered Storage Quotas

2024-02-28 Thread Jun Rao
Hi, Abhijeet,

Thanks for the reply.

The issue with recording the throttle time as a gauge is that it's
transient. If the metric is not read immediately, the recorded value could
be reset to 0. The admin won't realize that throttling has happened.

For client quotas, the throttle time is tracked as the average
throttle-time per user/client-id. This makes the metric less transient.
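
For illustration, a sketch of that approach using Kafka's metrics library
(metric and group names here are illustrative, not the KIP's):

import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Avg;
import org.apache.kafka.common.metrics.stats.Max;

Metrics metrics = new Metrics();
Sensor throttleSensor = metrics.sensor("remote-fetch-throttle-time");
throttleSensor.add(metrics.metricName("remote-fetch-throttle-time-avg",
    "RemoteLogManager", "Average remote fetch throttle time in ms"), new Avg());
throttleSensor.add(metrics.metricName("remote-fetch-throttle-time-max",
    "RemoteLogManager", "Maximum remote fetch throttle time in ms"), new Max());

// Record on every throttled read; the windowed Avg/Max survive past the
// moment of throttling, unlike a point-in-time gauge.
long throttleTimeMs = 150L;  // example value
throttleSensor.record(throttleTimeMs);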

Also, the configs use read/write whereas the metrics use fetch/copy. Could
we make them consistent?

Jun

On Wed, Feb 28, 2024 at 6:49 AM Abhijeet Kumar 
wrote:

> Hi Jun,
>
> Clarified the meaning of the two metrics. Also updated the KIP.
>
> kafka.log.remote:type=RemoteLogManager, name=RemoteFetchThrottleTime -> The
> duration of time required at a given moment to bring the observed fetch
> rate within the allowed limit, by preventing further reads.
> kafka.log.remote:type=RemoteLogManager, name=RemoteCopyThrottleTime -> The
> duration of time required at a given moment to bring the observed remote
> copy rate within the allowed limit, by preventing further copies.
>
> Regards.
>
> On Wed, Feb 28, 2024 at 12:28 AM Jun Rao  wrote:
>
> > Hi, Abhijeet,
> >
> > Thanks for the explanation. Makes sense to me now.
> >
> > Just a minor comment. Could you document the exact meaning of the
> following
> > two metrics? For example, is the time accumulated? If so, is it from the
> > start of the broker or over some window?
> >
> > kafka.log.remote:type=RemoteLogManager, name=RemoteFetchThrottleTime
> > kafka.log.remote:type=RemoteLogManager, name=RemoteCopyThrottleTime
> >
> > Jun
> >
> > On Tue, Feb 27, 2024 at 1:39 AM Abhijeet Kumar <
> abhijeet.cse@gmail.com
> > >
> > wrote:
> >
> > > Hi Jun,
> > >
> > > The existing quota system for consumers is designed to throttle the
> > > consumer if it exceeds the allowed fetch rate.
> > > The additional quota we want to add works on the broker level. If the
> > > broker-level remote read quota is being
> > > exceeded, we prevent additional reads from the remote storage but do
> not
> > > prevent local reads for the consumer.
> > > If the consumer has specified other partitions to read, which can be
> > served
> > > from local, it can continue to read those
> > > partitions. To elaborate more, we make a check for quota exceeded if we
> > > know a segment needs to be read from
> > > remote. If the quota is exceeded, we simply skip the partition and move
> > to
> > > other segments in the fetch request.
> > > This way consumers can continue to read the local data as long as they
> > have
> > > not exceeded the client-level quota.
> > >
> > > Also, when we choose the appropriate consumer-level quota, we would
> > > typically look at what kind of local fetch
> > > throughput is supported. If we were to reuse the same consumer quota,
> we
> > > should also consider the throughput
> > > the remote storage supports. The throughput supported by remote may be
> > > less/more than the throughput supported
> > > by local (when using a cloud provider, it depends on the plan opted by
> > the
> > > user). The consumer quota has to be carefully
> > > set considering both local and remote throughput. Instead, if we have a
> > > separate quota, it makes things much simpler
> > > for the user, since they already know what throughput their remote
> > storage
> > > supports.
> > >
> > > (Also, thanks for pointing out. I will update the KIP based on the
> > > discussion)
> > >
> > > Regards,
> > > Abhijeet.
> > >
> > > On Tue, Feb 27, 2024 at 2:49 AM Jun Rao 
> > wrote:
> > >
> > > > Hi, Abhijeet,
> > > >
> > > > Sorry for the late reply. It seems that you haven't updated the KIP
> > based
> > > > on the discussion? One more comment.
> > > >
> > > > 11. Currently, we already have a quota system for both the producers
> > and
> > > > consumers. I can understand why we need an additional
> > > > remote.log.manager.write.quota.default quota. For example, when tier
> > > > storage is enabled for the first time, there could be a lot of
> segments
> > > > that need to be written to the remote storage, even though there is
> no
> > > > increase in the produced data. However, I am not sure about an
> > > > additional remote.log.manager.read.quota.default. The KIP says that
> the
> > > > reason is "This may happen when the majority of the consumers start
> > > reading
> > > > from the earliest offset of their respective Kafka topics.". However,
> > > this
> > > > can happen with or without tier storage and the current quota system
> > for
> > > > consumers is designed for solving this exact problem. Could you
> explain
> > > the
> > > > usage of this additional quota?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Mon, Feb 12, 2024 at 11:08 AM Abhijeet Kumar <
> > > > abhijeet.cse@gmail.com>
> > > > wrote:
> > > >
> > > > > Comments inline
> > > > >
> > > > > On Wed, Dec 6, 2023 at 1:12 AM Jun Rao 
> > > wrote:
> > > > >
> > > > > > Hi, Abhijeet,
> > > > > >
> > > > > > Thanks for the KIP. A few 

Re: [DISCUSS] KIP-1022 Formatting and Updating Features

2024-02-28 Thread Artem Livshits
Hi Justine,

Thank you for the KIP.  I think the KIP is pretty clear and makes sense to
me.  Maybe it would be good to give a little more detail on the
implementation of feature mapping and how the tool would validate the
feature combinations.  For example, I'd expect that

bin/kafka-storage.sh format --release-version 3.6-IV1 --feature
transaction.version=2

would give an error because the new transaction protocol is not supported
in 3.6.  Also, we may decide that

bin/kafka-storage.sh format --release-version 5.0-IV0 --feature
transaction.version=0

would be an unsupported combination as it'll have been a while since the
new transaction protocol has been the default and it would be too risky to
enable this combination as it may not be tested any more.

As for the new names, I'm thinking of the "transaction feature version"
more like a "transaction protocol version" -- from the user perspective we
don't really add new functionality in KIP-890, we're changing the protocol
to be more robust (and potentially faster).

-Artem



On Wed, Feb 28, 2024 at 10:08 AM Justine Olshan
 wrote:

> Hey Andrew,
>
> Thanks for taking a look.
>
> I previously didn't include 1. We do plan to use these features immediately
> for KIP-890 and KIP-848. If we think it is easier to put the discussion in
> those KIP discussions we can, but I fear that it will easily get lost given
> the size of the KIPs.
>
> I named the features similar to how we named metadata version. Transaction
> version would control transaction features like enabling a new transaction
> record format and APIs to enable KIP-890 part 2. Likewise, the group
> coordinator version would also enable the new record formats there and the
> new group coordinator. I am open to new names or further discussion.
>
> For 2 and 3, I can provide example scripts that show the usage. I am open
> to adding --latest-stable as well.
>
> Justine
>
> On Tue, Feb 27, 2024 at 4:59 AM Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
> > Hi Justine,
> > Thanks for the KIP. This area of Kafka is complicated and making it
> easier
> > is good.
> >
> > When I use the `kafka-features.sh` tool to describe the features on my
> > cluster, I find that there’s a
> > single feature called “metadata.version”. I think this KIP does a handful
> > of things:
> >
> > 1) It introduces the idea of two new features, TV and GCV, without giving
> > them concrete names or
> > describing their behaviour.
> > 2) It introduces a new flag on the storage tool to enable advanced users
> > to control individual features
> > when they format storage for a new broker.
> > 3) It introduces a new flag on the features tool to enable a set of
> latest
> > stable features for a given
> > version to be enabled all together.
> >
> > I think that (1) probably shouldn’t be in this KIP unless there are
> > concrete details. Maybe this KIP is enabling
> > the operator experience when we introduce TV and GCV in other KIPs. I
> > don’t believe the plan is to enable
> > the new group coordinator with a feature, and it seems unnecessary to me.
> > I think it’s more compelling for TV
> > given the changes in transactions.
> >
> > For (2) and (3), it would be helpful to be explicit about the syntax for the
> > enhancements to the tool. I think
> > that for the features tool, `--release-version` is an optional parameter
> > which requires a RELEASE_VERSION
> > argument. I wonder whether it would be helpful to have `--latest-stable`
> > as an option too.
> >
> > Thanks,
> > Andrew
> >
> > > On 26 Feb 2024, at 21:26, Justine Olshan  >
> > wrote:
> > >
> > > Hello folks,
> > >
> > > I'm proposing a KIP that allows for setting and upgrading new features
> > > (other than metadata version) via the kafka storage format and feature
> > > tools. This KIP extends on the feature versioning changes introduced by
> > > KIP-584 by allowing for the features to be set and upgraded.
> > >
> > > Please take a look:
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1023%3A+Formatting+and+Updating+Features
> > >
> > > Thanks,
> > >
> > > Justine
> >
> >
>


Re: [DISCUSS] KIP-1022 Formatting and Updating Features

2024-02-28 Thread Justine Olshan
Hey Andrew,

Thanks for taking a look.

I previously didn't include 1. We do plan to use these features immediately
for KIP-890 and KIP-848. If we think it is easier to put the discussion in
those KIP discussions we can, but I fear that it will easily get lost given
the size of the KIPs.

I named the features similar to how we named metadata version. Transaction
version would control transaction features like enabling a new transaction
record format and APIs to enable KIP-890 part 2. Likewise, the group
coordinator version would also enable the new record formats there and the
new group coordinator. I am open to new names or further discussion.

For 2 and 3, I can provide example scripts that show the usage. I am open
to adding --latest-stable as well.

Justine

On Tue, Feb 27, 2024 at 4:59 AM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi Justine,
> Thanks for the KIP. This area of Kafka is complicated and making it easier
> is good.
>
> When I use the `kafka-features.sh` tool to describe the features on my
> cluster, I find that there’s a
> single feature called “metadata.version”. I think this KIP does a handful
> of things:
>
> 1) It introduces the idea of two new features, TV and GCV, without giving
> them concrete names or
> describing their behaviour.
> 2) It introduces a new flag on the storage tool to enable advanced users
> to control individual features
> when they format storage for a new broker.
> 3) It introduces a new flag on the features tool to enable a set of latest
> stable features for a given
> version to be enabled all together.
>
> I think that (1) probably shouldn’t be in this KIP unless there are
> concrete details. Maybe this KIP is enabling
> the operator experience when we introduce TV and GCV in other KIPs. I
> don’t believe the plan is to enable
> the new group coordinator with a feature, and it seems unnecessary to me.
> I think it’s more compelling for TV
> given the changes in transactions.
>
> For (2) and (3), it would be helpful to be explicit about the syntax for the
> enhancements to the tool. I think
> that for the features tool, `--release-version` is an optional parameter
> which requires a RELEASE_VERSION
> argument. I wonder whether it would be helpful to have `--latest-stable`
> as an option too.
>
> Thanks,
> Andrew
>
> > On 26 Feb 2024, at 21:26, Justine Olshan 
> wrote:
> >
> > Hello folks,
> >
> > I'm proposing a KIP that allows for setting and upgrading new features
> > (other than metadata version) via the kafka storage format and feature
> > tools. This KIP extends on the feature versioning changes introduced by
> > KIP-584 by allowing for the features to be set and upgraded.
> >
> > Please take a look:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1023%3A+Formatting+and+Updating+Features
> >
> > Thanks,
> >
> > Justine
>
>


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2683

2024-02-28 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-16312) ConsumerRebalanceListener.onPartitionsAssigned should be called after joining, even if empty

2024-02-28 Thread Kirk True (Jira)
Kirk True created KAFKA-16312:
-

 Summary: ConsumerRebalanceListener.onPartitionsAssigned should be 
called after joining, even if empty
 Key: KAFKA-16312
 URL: https://issues.apache.org/jira/browse/KAFKA-16312
 Project: Kafka
  Issue Type: Bug
  Components: clients, consumer
Affects Versions: 3.7.0
Reporter: Kirk True
Assignee: Lianet Magrans
 Fix For: 3.8.0


There is a difference between the {{LegacyKafkaConsumer}} and 
{{AsyncKafkaConsumer}} regarding when the 
{{ConsumerRebalanceListener.onPartitionsAssigned()}} method is invoked.

 
For example, with {{onPartitionsAssigned()}}:

* {{LegacyKafkaConsumer}}: the listener method is invoked when the consumer 
joins the group, even if that consumer was not assigned any partitions. In this 
case it's passed an empty list.
* {{AsyncKafkaConsumer}}: the listener method is invoked after the consumer 
joins the group only if the consumer has been assigned partitions

 
This difference is affecting the system tests. The system tests use a Java 
class named {{VerifiableConsumer}} which uses a {{ConsumerRebalanceListener}} 
that logs when the callbacks are invoked. The system tests then read from that 
log to determine when the callbacks are invoked. This coordination is used by 
the system tests to determine the lifecycle and status of the consumers.
 
The system tests rely heavily on the listener behavior of the 
{{LegacyKafkaConsumer}}. It invokes the {{onPartitionsAssigned()}} method when 
the consumer joins the group, and the system tests use that to determine when 
the consumer is actively a member of the group. This validation of membership 
is used as an assertion throughout the consumer-related tests.
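 
For reference, a minimal listener of the kind {{VerifiableConsumer}} relies on 
looks like this (an illustrative sketch, not the actual test code):

{code:java}
import java.util.Collection;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

public class LoggingRebalanceListener implements ConsumerRebalanceListener {
    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // LegacyKafkaConsumer invokes this on join even when partitions is
        // empty; AsyncKafkaConsumer currently does not.
        System.out.println("Adding newly assigned partitions: " + partitions);
    }

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        System.out.println("Revoking previously assigned partitions: " + partitions);
    }
}
{code}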
 
In the system test I'm executing from {{consumer_test.py}}, there's a test that 
creates three consumers to read from a single topic with a single partition. 
It's a bit of an oddball test, but it demonstrates the issue.
 
Here are the logs pulled from the test run when executed using the 
{{LegacyKafkaConsumer}}:
 
Node 1:
 
{code:java}
[2024-02-15 00:43:52,400] INFO Adding newly assigned partitions:  
(org.apache.kafka.clients.consumer.internals.ConsumerRebalanceListenerInvoker){code}
 
Node 2:
 
{code:java}
[2024-02-15 00:43:52,401] INFO Adding newly assigned partitions: test_topic-0 
(org.apache.kafka.clients.consumer.internals.ConsumerRebalanceListenerInvoker){code}
 
Node 3:
 
{code:java}
[2024-02-15 00:43:52,399] INFO Adding newly assigned partitions:  
(org.apache.kafka.clients.consumer.internals.ConsumerRebalanceListenerInvoker){code}

Here are the logs when executing the same test using the {{AsyncKafkaConsumer}}:

Node 1:

{code:java}
[2024-02-15 01:15:46,576] INFO Adding newly assigned partitions: test_topic-0 
(org.apache.kafka.clients.consumer.internals.ConsumerRebalanceListenerInvoker){code}

Node 2:

{code:java}n/a{code}

Node 3:

{code:java}n/a{code}

As a result of this change, the existing system tests do not work with the new 
consumer. However, even more importantly, this change in behavior may adversely 
affect existing users.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-16116) AsyncKafkaConsumer: Add missing rebalance metrics

2024-02-28 Thread Philip Nee (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Nee resolved KAFKA-16116.

Resolution: Fixed

> AsyncKafkaConsumer: Add missing rebalance metrics
> -
>
> Key: KAFKA-16116
> URL: https://issues.apache.org/jira/browse/KAFKA-16116
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer, metrics
>Reporter: Philip Nee
>Assignee: Philip Nee
>Priority: Critical
>  Labels: consumer-threading-refactor, metrics
> Fix For: 3.8.0
>
>
> The following metrics are missing:
> |[rebalance-latency-avg|https://docs.confluent.io/platform/current/kafka/monitoring.html#rebalance-latency-avg]|
> |[rebalance-latency-max|https://docs.confluent.io/platform/current/kafka/monitoring.html#rebalance-latency-max]|
> |[rebalance-latency-total|https://docs.confluent.io/platform/current/kafka/monitoring.html#rebalance-latency-total]|
> |[rebalance-rate-per-hour|https://docs.confluent.io/platform/current/kafka/monitoring.html#rebalance-rate-per-hour]|
> |[rebalance-total|https://docs.confluent.io/platform/current/kafka/monitoring.html#rebalance-total]|
> |[failed-rebalance-rate-per-hour|https://docs.confluent.io/platform/current/kafka/monitoring.html#failed-rebalance-rate-per-hour]|
> |[failed-rebalance-total|https://docs.confluent.io/platform/current/kafka/monitoring.html#failed-rebalance-total]|



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] MINOR: Use archive repository for older releases [kafka-site]

2024-02-28 Thread via GitHub


mimaison commented on PR #589:
URL: https://github.com/apache/kafka-site/pull/589#issuecomment-1969205238

   Good catch! I pushed an update.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [VOTE] 3.7.0 RC4

2024-02-28 Thread Ismael Juma
Hi,

mvnrepository doesn't matter, we should update the guidelines to remove it.
All that matters is that it has reached the maven central repository.

Ismael

On Tue, Feb 27, 2024 at 9:49 AM Stanislav Kozlovski
 wrote:

> The thinking is that it is available for use - and it is in maven central -
> https://central.sonatype.com/artifact/org.apache.kafka/kafka_2.13 - but
> the
> mvnrepository seems to be an unofficial website that offers a nice UI for
> accessing maven central. The release is in the latter -
> https://central.sonatype.com/artifact/org.apache.kafka/kafka_2.13
>
> On Tue, Feb 27, 2024 at 6:35 PM Divij Vaidya 
> wrote:
>
> > We wait before making the announcement. The rationale is that there is
> not
> > much point announcing a release if folks cannot start using that version
> > artifacts immediately.
> >
> > See "Wait for about a day for the artifacts to show up in apache mirror
> > (releases, public group) and maven central (mvnrepository.com or
> maven.org
> > )."
> > in the release process wiki.
> >
> > --
> > Divij Vaidya
> >
> >
> >
> > On Tue, Feb 27, 2024 at 4:43 PM Stanislav Kozlovski
> >  wrote:
> >
> > > Hey all,
> > >
> > > Everything site-related is merged.
> > >
> > > I have been following the final steps of the release process.
> > > - Docker contains the release -
> > https://hub.docker.com/r/apache/kafka/tags
> > > - Maven central contains the release -
> > > https://central.sonatype.com/artifact/org.apache.kafka/kafka_2.13/3.7.0/versions.
> > > Note it says Feb 9 publish date, but it was just published. The RC4
> files
> > > were created on Feb 9 though, so I assume that's why it says that
> > > - mvnrepository is NOT yet up to date -
> > > https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients and
> > > https://mvnrepository.com/artifact/org.apache.kafka/kafka
> > >
> > > Am I free to announce the release, or should I wait more for
> > MVNRepository
> > > to get up to date? For what it's worth, I "Released" the files 24 hours
> > ago
> > >
> > > On Mon, Feb 26, 2024 at 10:42 AM Stanislav Kozlovski <
> > > stanis...@confluent.io>
> > > wrote:
> > >
> > > >
> > > > This vote passes with *10 +1 votes* (3 bindings) and no 0 or -1
> votes.
> > > >
> > > > +1 votes
> > > >
> > > > PMC Members (binding):
> > > > * Mickael Maison
> > > > * Justine Olshan
> > > > * Divij Vaidya
> > > >
> > > > Community (non-binding):
> > > > * Proven Provenzano
> > > > * Federico Valeri
> > > > * Vedarth Sharma
> > > > * Andrew Schofield
> > > > * Paolo Patierno
> > > > * Jakub Scholz
> > > > * Josep Prat
> > > >
> > > > 
> > > >
> > > > 0 votes
> > > >
> > > > * No votes
> > > >
> > > > 
> > > >
> > > > -1 votes
> > > >
> > > > * No votes
> > > >
> > > > 
> > > >
> > > > Vote thread:
> > > > https://lists.apache.org/thread/71djwz292y2lzgwzm7n6n8o7x56zbgh9
> > > >
> > > > I'll continue with the release process and the release announcement
> > will
> > > > follow ASAP.
> > > >
> > > > Best,
> > > > Stanislav
> > > >
> > > > On Sun, Feb 25, 2024 at 7:08 PM Mickael Maison <
> > mickael.mai...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> Thanks for sorting out the docs issues.
> > > >> +1 (binding)
> > > >>
> > > >> Mickael
> > > >>
> > > >> On Fri, Feb 23, 2024 at 11:50 AM Stanislav Kozlovski
> > > >>  wrote:
> > > >> >
> > > >> > Some quick updates:
> > > >> >
> > > >> > There were some inconsistencies between the documentation in the
> > > >> > apache/kafka repo and the one in kafka-site. The process is such
> > that
> > > >> the
> > > >> > apache/kafka docs are the source of truth, but we had a few
> > > divergences
> > > >> in
> > > >> > the other repo. I have worked on correcting those with:
> > > >> > - MINOR: Reconcile upgrade.html with kafka-site/36's version,
> > > >> > and cherry-picked it into the 3.6 and 3.7 branches too
> > > >> >
> > > >> > Additionally, the 3.7 upgrade notes have been merged in apache/kafka -
> > > >> > MINOR: Add 3.7 upgrade notes <https://github.com/apache/kafka/pull/15407/files>.
> > > >> >
> > > >> > With that, I have opened a PR to move them to the kafka-site repository -
> > > >> > https://github.com/apache/kafka-site/pull/587. That is awaiting review.
> > > >> >
> > > >> > Similarly, the 3.7 blog post is ready for review again, and awaiting a
> > > >> > review on 37: Update default docs to point to the 3.7.0 release docs
> > > >> >
> > > >> > I also have a WIP for fixing the 3.6 docs in the kafka-site repo.
> > > >> > This isn't really related to the release, but it's good to do.
> > > >> >
> > > >> > On Wed, Feb 21, 2024 at 4:55 AM Luke Chen 
> > wrote:
> > > >> >
> > > >> > > 

Re: [PR] MINOR: Use archive repository for older releases [kafka-site]

2024-02-28 Thread via GitHub


divijvaidya commented on PR #589:
URL: https://github.com/apache/kafka-site/pull/589#issuecomment-1969186600

   We need a similar change in the blog as well, please (e.g. 
https://github.com/apache/kafka-site/blob/7ae222e3a5a80252063f7bd068cada382b39812b/blog.html#L317);
 otherwise we will have inconsistent broken links in the blog section.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] KIP-956: Tiered Storage Quotas

2024-02-28 Thread Abhijeet Kumar
Hi Jun,

Clarified the meaning of the two metrics. Also updated the KIP.

kafka.log.remote:type=RemoteLogManager, name=RemoteFetchThrottleTime -> The
duration of time required at a given moment to bring the observed fetch
rate within the allowed limit, by preventing further reads.
kafka.log.remote:type=RemoteLogManager, name=RemoteCopyThrottleTime -> The
duration of time required at a given moment to bring the observed remote
copy rate within the allowed limit, by preventing further copies.
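
To make the throttle-time semantics concrete, here is a minimal sketch using
Kafka's usual windowed-rate quota arithmetic; it is illustrative, not the
actual RemoteLogManager code:

  // Sketch: deriving a throttle time such as RemoteFetchThrottleTime from
  // an observed rate and a quota. All names here are hypothetical.
  public final class ThrottleTimeSketch {
      // observedBytesPerSec: byte rate measured over the metrics window
      // quotaBytesPerSec: configured limit, e.g. the proposed
      //                   remote.log.manager.read.max.bytes.per.second
      // windowMs: length of the measurement window in milliseconds
      // returns: milliseconds to hold back further reads; 0 if under quota
      static long throttleTimeMs(double observedBytesPerSec,
                                 double quotaBytesPerSec,
                                 long windowMs) {
          if (observedBytesPerSec <= quotaBytesPerSec) return 0L;
          // Scale the window by how far the observed rate overshoots the quota.
          return (long) (((observedBytesPerSec - quotaBytesPerSec)
                  / quotaBytesPerSec) * windowMs);
      }

      public static void main(String[] args) {
          // 150 MB/s observed against a 100 MB/s quota over a 10 s window
          // -> hold back further reads for roughly 5000 ms.
          System.out.println(throttleTimeMs(150e6, 100e6, 10_000));
      }
  }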

Regards.

On Wed, Feb 28, 2024 at 12:28 AM Jun Rao  wrote:

> Hi, Abhijeet,
>
> Thanks for the explanation. Makes sense to me now.
>
> Just a minor comment. Could you document the exact meaning of the following
> two metrics? For example, is the time accumulated? If so, is it from the
> start of the broker or over some window?
>
> kafka.log.remote:type=RemoteLogManager, name=RemoteFetchThrottleTime
> kafka.log.remote:type=RemoteLogManager, name=RemoteCopyThrottleTime
>
> Jun
>
> On Tue, Feb 27, 2024 at 1:39 AM Abhijeet Kumar  >
> wrote:
>
> > Hi Jun,
> >
> > The existing quota system for consumers is designed to throttle the
> > consumer if it exceeds the allowed fetch rate.
> > The additional quota we want to add works on the broker level. If the
> > broker-level remote read quota is being
> > exceeded, we prevent additional reads from the remote storage but do not
> > prevent local reads for the consumer.
> > If the consumer has specified other partitions to read, which can be
> > served from local storage, it can continue to read those partitions.
> > To elaborate, we check whether the quota is exceeded when we know a
> > segment needs to be read from remote storage. If the quota is exceeded,
> > we simply skip the partition and move on to other segments in the fetch
> > request. This way consumers can continue to read the local data as long
> > as they have not exceeded the client-level quota.
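
A schematic of the partition-skipping behavior described above (hypothetical
names; the real logic lives in the broker's fetch path):

  import java.util.ArrayList;
  import java.util.List;

  // Partitions that need remote reads are skipped while the broker-level
  // remote read quota is exhausted; local reads proceed under the normal
  // client fetch quota.
  class RemoteReadSkipSketch {
      static boolean requiresRemoteRead(String tp) { return tp.endsWith("-tiered"); }
      static boolean remoteReadQuotaExceeded() { return true; } // pretend it is spent

      public static void main(String[] args) {
          List<String> served = new ArrayList<>();
          for (String tp : List.of("orders-0", "orders-1-tiered")) {
              if (requiresRemoteRead(tp) && remoteReadQuotaExceeded()) {
                  continue; // skip the remote partition; the client can retry later
              }
              served.add(tp); // local (or within-quota remote) reads proceed
          }
          System.out.println(served); // prints [orders-0]
      }
  }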
> >
> > Also, when we choose the appropriate consumer-level quota, we would
> > typically look at what kind of local fetch throughput is supported. If
> > we were to reuse the same consumer quota, we should also consider the
> > throughput the remote storage supports. The throughput supported by
> > remote storage may be less or more than that supported by local storage
> > (when using a cloud provider, it depends on the plan chosen by the
> > user). The consumer quota has to be set carefully, considering both
> > local and remote throughput. Instead, if we have a separate quota, it
> > makes things much simpler for the user, since they already know what
> > throughput their remote storage supports.
> >
> > (Also, thanks for pointing out. I will update the KIP based on the
> > discussion)
> >
> > Regards,
> > Abhijeet.
> >
> > On Tue, Feb 27, 2024 at 2:49 AM Jun Rao 
> wrote:
> >
> > > Hi, Abhijeet,
> > >
> > > Sorry for the late reply. It seems that you haven't updated the KIP
> based
> > > on the discussion? One more comment.
> > >
> > > 11. Currently, we already have a quota system for both the producers
> and
> > > consumers. I can understand why we need an additional
> > > remote.log.manager.write.quota.default quota. For example, when tier
> > > storage is enabled for the first time, there could be a lot of segments
> > > that need to be written to the remote storage, even though there is no
> > > increase in the produced data. However, I am not sure about an
> > > additional remote.log.manager.read.quota.default. The KIP says that the
> > > reason is "This may happen when the majority of the consumers start
> > reading
> > > from the earliest offset of their respective Kafka topics.". However,
> > this
> > > can happen with or without tier storage and the current quota system
> for
> > > consumers is designed for solving this exact problem. Could you explain
> > the
> > > usage of this additional quota?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Mon, Feb 12, 2024 at 11:08 AM Abhijeet Kumar <
> > > abhijeet.cse@gmail.com>
> > > wrote:
> > >
> > > > Comments inline
> > > >
> > > > On Wed, Dec 6, 2023 at 1:12 AM Jun Rao 
> > wrote:
> > > >
> > > > > Hi, Abhijeet,
> > > > >
> > > > > Thanks for the KIP. A few comments.
> > > > >
> > > > > 10. remote.log.manager.write.quota.default:
> > > > > 10.1 For other configs, we
> > > > > use replica.alter.log.dirs.io.max.bytes.per.second. To be
> consistent,
> > > > > perhaps this can be sth like
> > > > remote.log.manager.write.max.bytes.per.second.
> > > > >
> > > >
> > > > This makes sense, we can rename the following configs to be
> consistent.
> > > >
> > > > Remote.log.manager.write.quota.default ->
> > > > remote.log.manager.write.max.bytes.per.second
> > > >
> > > > Remote.log.manager.read.quota.default ->
> > > > remote.log.manager.read.max.bytes.per.second.
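
For illustration, the proposed names would appear in broker configuration
roughly like this (values are made up, and the names are still subject to
this discussion):

  # server.properties (sketch, using the proposed names above)
  remote.log.manager.write.max.bytes.per.second=104857600
  remote.log.manager.read.max.bytes.per.second=104857600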
> > > >
> > > >
> > > >
> > > > > 10.2 Could we list the new metrics associated with the new quota.
> > > > >
> > > >
> > > > We will add the following metrics as mentioned in the other response.
> > > > 

Re: [DISCUSS] KIP-939: Support Participation in 2PC

2024-02-28 Thread Andrew Schofield
Hi Artem,
I totally agree that a timeout for the 2PC case is a bad idea. It does abandon
the 2PC guarantee.

Thanks,
Andrew
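
For readers following the API shape discussed in the quoted thread, a minimal
hypothetical sketch of option 20A, assuming KIP-939's proposed
initTransactions(boolean enable2PC) overload lands as described (the overload
does not exist in released clients, and names and semantics may still change):

  import java.util.Properties;
  import org.apache.kafka.clients.producer.KafkaProducer;
  import org.apache.kafka.clients.producer.ProducerRecord;

  public class TwoPcSketch {
      public static void main(String[] args) {
          Properties props = new Properties();
          props.put("bootstrap.servers", "localhost:9092");
          props.put("transactional.id", "external-coordinator-txn-1");
          props.put("key.serializer",
              "org.apache.kafka.common.serialization.StringSerializer");
          props.put("value.serializer",
              "org.apache.kafka.common.serialization.StringSerializer");

          try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
              // enable2PC = true (proposed): the broker keeps a prepared
              // transaction open, with no transaction timeout, until the
              // external coordinator decides commit or abort.
              producer.initTransactions(true);
              producer.beginTransaction();
              producer.send(new ProducerRecord<>("payments", "k", "v"));
              // In a real 2PC flow the decision below comes from the
              // external transaction coordinator, not from Kafka.
              producer.commitTransaction();
          }
      }
  }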

> On 28 Feb 2024, at 00:44, Artem Livshits  
> wrote:
>
> Hi Jun,
>
> Thank you for the discussion.
>
>> For 3b, it would be useful to understand the reason why an admin doesn't
> authorize 2PC for self-hosted Flink
>
> I think the nuance here is that for cloud, there is a cloud admin
> (operator) and there is cluster admin (who, for example could manage acls
> on topics or etc.).  The 2PC functionality can affect cloud operations,
> because a long running transaction can block the last stable offset and
> prevent compaction or data tiering.  In a multi-tenant environment, a long
> running transaction that involves consumer offset may affect data that is
> shared by multiple tenants (Flink transactions don't use consumer offsets,
> so this is not an issue for Flink, but we'd need a separate ACL or some
> other way to express this permission if we wanted to go in that direction).
>
> For that reason, I expect 2PC to be controlled by the cloud operator and it
> just may not be scalable for the cloud operator to manage all potential
> interactions required to resolve in-doubt transactions (communicate to the
> end users, etc.).  In general, we make no assumptions about Kafka
> applications -- they may come and go, they may abandon transactional ids
> and generate new ones.  For 2PC we need to make sure that the application
> is highly available and wouldn't easily abandon an open transaction in
> Kafka.
>
>> If so, another way to address that is to allow the admin to set a timeout
> even for the 2PC case.
>
> This effectively abandons the 2PC guarantee because it creates a case for
> Kafka to unilaterally make an automatic decision on a prepared
> transaction.  I think it's fundamental for 2PC to abandon this ability and
> wait for the external coordinator for the decision, after all the
> coordinator may legitimately be unavailable for an arbitrary amount of
> time.  Also, we already have a timeout on regular Kafka transactions,
> having another "special" timeout could be confusing, and a large enough
> timeout could still produce the undesirable effects on cloud
> operations (so we kind of get the worst of both options -- we don't
> provide guarantees and still have an impact on operations).
>
> -Artem
>
> On Fri, Feb 23, 2024 at 8:55 AM Jun Rao  wrote:
>
>> Hi, Artem,
>>
>> Thanks for the reply.
>>
>> For 3b, it would be useful to understand the reason why an admin doesn't
>> authorize 2PC for self-hosted Flink. Is the main reason that 2PC has
>> unbounded timeout that could lead to unbounded outstanding transactions? If
>> so, another way to address that is to allow the admin to set a timeout even
>> for the 2PC case. The timeout would be long enough for well-behaved
>> applications to complete 2PC operations, but short enough that
>> non-behaving applications' transactions do not hang indefinitely.
>>
>> Jun
>>
>> On Wed, Feb 21, 2024 at 4:34 PM Artem Livshits
>>  wrote:
>>
>>> Hi Jun,
>>>
>>>> 20A. One option is to make the API initTransactions(boolean enable2PC).
>>>
>>> We could do that.  I think there is a little bit of symmetry between the
>>> client and server that would get lost with this approach (server has
>>> enable2PC as config), but I don't really see a strong reason for
>> enable2PC
>>> to be a config vs. an argument for initTransactions.  But let's see if we
>>> find 20B to be a strong consideration for keeping a separate flag for
>>> keepPreparedTxn.
>>>
>>>> 20B. But realistically, we want Flink (and other apps) to have a single
>>> implementation
>>>
>>> That's correct and here's what I think can happen if we don't allow
>>> independent keepPreparedTxn:
>>>
>>> 1. Pre-KIP-939 self-hosted Flink vs. any Kafka cluster -- reflection is
>>> used, which effectively implements keepPreparedTxn=true without our
>>> explicit support.
>>> 2. KIP-939 self-hosted Flink vs. pre-KIP-939 Kafka cluster -- we can
>>> either fall back to reflection or we just say we don't support this, have
>>> to upgrade Kafka cluster first.
>>> 3. KIP-939 self-hosted Flink vs. KIP-939 Kafka cluster, this becomes
>>> interesting depending on whether the Kafka cluster authorizes 2PC or not:
>>> 3a. Kafka cluster autorizes 2PC for self-hosted Flink -- everything uses
>>> KIP-939 and there is no problem
>>> 3b. Kafka cluster doesn't authorize 2PC for self-hosted Flink -- we can
>>> either fallback to reflection or use keepPreparedTxn=true even if 2PC is
>>> not enabled.
>>>
>>> It seems to be ok to not support case 2 (i.e. require Kafka upgrade
>> first),
>>> it shouldn't be an issue for cloud offerings as cloud providers are
>> likely
>>> to upgrade their Kafka to the latest versions.
>>>
>>> The case 3b seems to be important to support, though -- the latest
>> version
>>> of everything should work at least as well (and preferably better) than
>>> previous ones.  It's possible to downgrade to case 1, but it's probably

Re: [DISCUSS] KIP-1021: Allow to get last stable offset (LSO) in kafka-get-offsets.sh

2024-02-28 Thread Andrew Schofield
Hi Ahmed,
Could do. Personally, I find the existing “--time -1” totally horrid anyway, 
which was why
I suggested an alternative. I think your suggestion of a flag for isolation 
level is much
better than -6.

Maybe I should put in a KIP which adds:
--latest (as a synonym for --time -1)
--earliest (as a synonym for --time -2)
--max-timestamp (as a synonym for --time -3)

That’s really what I would prefer. If the user has a timestamp, use `--time`. 
If they want a
specific special offset, use a separate flag.
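
For illustration, the syntaxes under discussion would look roughly like this
(only --time exists today; the other flags are hypothetical until a KIP is
agreed):

  # today: special offsets are overloaded onto --time
  bin/kafka-get-offsets.sh --bootstrap-server localhost:9092 \
      --topic my-topic --time -1

  # proposed synonym style
  bin/kafka-get-offsets.sh --bootstrap-server localhost:9092 \
      --topic my-topic --latest

  # LSO via a dedicated flag, as suggested in the earlier message below
  bin/kafka-get-offsets.sh --bootstrap-server localhost:9092 \
      --topic my-topic --last-stable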

Thanks,
Andrew

> On 28 Feb 2024, at 09:22, Ahmed Sobeh  wrote:
>
> Hi Andrew,
>
> Thanks for the hint! That sounds reasonable, do you think adding a
> conditional argument, something like "--time -1 --isolation -committed" and
> "--time -1 --isolation -uncommitted" would make sense to keep the
> consistency of getting the offset by time? or do you think having a special
> argument for this case is better?
>
> On Tue, Feb 27, 2024 at 2:19 PM Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
>> Hi Ahmed,
>> Thanks for the KIP.  It looks like a useful addition.
>>
>> The use of negative timestamps, and in particular letting the user use
>> `--time -1` or the equivalent `--time latest`
>> is a bit peculiar in this tool already. The negative timestamps come from
>> org.apache.kafka.common.requests.ListOffsetsRequest,
>> but you’re not actually adding another value to that. As a result, I
>> really wouldn’t recommend using -6 for the new
>> flag. LSO is really a variant of -1 with read_committed isolation level.
>>
>> I think that perhaps it would be better to add `--last-stable` as an
>> alternative to `--time`. Then you’ll get the LSO with
>> cleaner syntax.
>>
>> Thanks,
>> Andrew
>>
>>
>>> On 27 Feb 2024, at 10:12, Ahmed Sobeh 
>> wrote:
>>>
>>> Hi all,
>>> I would like to start a discussion on KIP-1021, which would enable
>> getting
>>> LSO in kafka-get-offsets.sh:
>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1021%3A+Allow+to+get+last+stable+offset+%28LSO%29+in+kafka-get-offsets.sh
>>>
>>> Best,
>>> Ahmed
>>
>>
>
> --
> *Ahmed Sobeh*
> Engineering Manager OSPO, *Aiven*
> ahmed.so...@aiven.io 
> aiven.io
> *Aiven Deutschland GmbH*
> Immanuelkirchstraße 26, 10405 Berlin
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> Amtsgericht Charlottenburg, HRB 209739 B




Re: [PR] MINOR: Use archive repository for older releases [kafka-site]

2024-02-28 Thread via GitHub


mimaison commented on code in PR #589:
URL: https://github.com/apache/kafka-site/pull/589#discussion_r1506070292


##
downloads.html:
##
@@ -109,16 +109,16 @@ 3.5.2
 
-<a href="https://downloads.apache.org/kafka/3.5.2/RELEASE_NOTES.html">Release Notes</a>
+<a href="https://archive.apache.org/dist/kafka/3.5.2/RELEASE_NOTES.html">Release Notes</a>

Review Comment:
   I've undone the changes related to 3.5.2.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: Assistance Needed with MM2 Heartbeat Topic Configuration

2024-02-28 Thread aaron ai
Hello,

Thank you very much for your suggestions and assistance. I have tried your
second method and successfully used `producer.override.bootstrap.servers`
as the key, achieving the desired effect. I appreciate your warm response
and valuable time. Thanks again!

Best regards,
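
For anyone hitting the same issue, a minimal sketch of the configuration that
worked (cluster aliases and endpoints are illustrative, and the Connect worker
must permit per-connector client overrides, e.g.
connector.client.config.override.policy=All):

  {
    "name": "mm2-heartbeats-to-source",
    "config": {
      "connector.class": "org.apache.kafka.connect.mirror.MirrorHeartbeatConnector",
      "source.cluster.alias": "A",
      "target.cluster.alias": "B",
      "producer.override.bootstrap.servers": "cluster-a-broker:9092"
    }
  }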

On Tue, Feb 27, 2024 at 11:55 PM Ryanne Dolan  wrote:

> Hello, the heartbeat connector is a sink connector, so it normally would
> write to the target cluster. I can think of two ways to achieve what you
> want:
>
> 1) set up a second connect cluster that sinks to the source cluster, and
> run just the heartbeat connector there.
>
> 2) override the heartbeat connector's producer configuration to point to
> the source cluster. I forget the exact spelling, but something like
> `producer.bootstrap.servers` etc might work.
>
> Ryanne
>
> On Tue, Feb 27, 2024, 8:51 AM aaron ai  wrote:
>
> > Dear Kafka Community,
> >
> > I am reaching out for some guidance regarding an issue I've encountered
> > while setting up Kafka MirrorMaker2 (MM2) for data synchronization from a
> > source cluster (A) to a target cluster (B).
> >
> > During my setup with MM2 on dedicated mode, I observed that the heartbeat
> > topic is established on the source cluster, and the messages within the
> > heartbeat are being synchronized to the target cluster as expected.
> > However, when configuring MM2 on Kafka Connect using the
> > connect-distributed.properties file, it appears that I can only set the
> > bootstrap.servers to point to the endpoints of the target cluster B.
> > Consequently, when I manually establish the MirrorHeartbeatConnector, the
> > heartbeat topic also gets created on the target cluster instead of the
> > source.
> >
> > This leads me to wonder if this behavior is expected or if I might be
> > missing a step in the configuration process. My goal is to have the
> > heartbeat topic created on the source cluster. Could you please advise on
> > whether this is possible and, if so, how it can be achieved? Are there
> > specific configurations or steps that I should follow to ensure the
> > heartbeat topic is correctly established on the source cluster?
> >
> > I appreciate any insights or recommendations you might have on this
> matter.
> >
> > Best regards,
> >
>


[jira] [Resolved] (KAFKA-16311) [Debezium Informix Connector Unable to Commit Processed Log Position]

2024-02-28 Thread Hector Geraldino (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hector Geraldino resolved KAFKA-16311.
--
Resolution: Invalid

Please try reporting this to 
[Debezium|https://github.com/debezium/debezium-connector-informix]

https://issues.redhat.com/browse/DBZ

> [Debezium Informix Connector Unable to Commit Processed Log Position]
> -
>
> Key: KAFKA-16311
> URL: https://issues.apache.org/jira/browse/KAFKA-16311
> Project: Kafka
>  Issue Type: Bug
>Reporter: Maaheen Yasin
>Priority: Blocker
> Attachments: connect logs.out
>
>
> I am using Debezium Informix Source connector and JDBC Sink connector and the 
> below versions of Informix database and KAFKA Connect.
> Informix Dynamic Server
> 14.10.FC10W1X2
> Informix JDBC Driver for Informix Dynamic Server
> 4.50.JC10
> KAFKA Version: 7.4.1-ce
>  
> *Expected Behavior:*
> All tasks of the Informix source connector are running, and all messages are 
> being published in the topic. During the DDL Execution, the informix database 
> is put under single user mode and the DDL on the table on which CDC was 
> previously enabled was executed. After the database exits from the single 
> user mode, then the connector should be able to reconnect with the source 
> database and be able to publish messages in the topic for each new event. 
> *Actual Behavior:*
> All tasks of the Informix source connector are running, and all messages are 
> being published in the topic. During the DDL Execution, the database is put 
> under single user mode and the DDL on the table on which CDC was previously 
> enabled was executed. After the database exits from the single user mode, the 
> source connector is able to reconnect with the database, however, no messages 
> are being published in the topic and the below error is being printed in the 
> KAFKA Connect Logs. 
>  
> *[2024-02-22 15:54:34,913] WARN [kafka_devs|task-0|offsets] Couldn't commit 
> processed log positions with the source database due to a concurrent 
> connector shutdown or restart 
> (io.debezium.connector.common.BaseSourceTask:349)*
>  
> The complete Kafka Connect logs have been attached. Kindly comment on why 
> this issue is occurring and what steps should be followed to avoid or 
> resolve it. 
>  
> Thanks. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] MINOR: Use archive repository for older releases [kafka-site]

2024-02-28 Thread via GitHub


mimaison commented on code in PR #589:
URL: https://github.com/apache/kafka-site/pull/589#discussion_r1506019540


##
downloads.html:
##
@@ -109,16 +109,16 @@ 3.5.2
 
-<a href="https://downloads.apache.org/kafka/3.5.2/RELEASE_NOTES.html">Release Notes</a>
+<a href="https://archive.apache.org/dist/kafka/3.5.2/RELEASE_NOTES.html">Release Notes</a>

Review Comment:
   My understanding is that we use "best judgment" to select what to delete so 
it may vary each time. Maybe deciding to keep all not EOL'ed releases in the 
mirroring system would be a better process. 
   I'll undo the 3.5.2 changes and will keep this release in the mirroring 
system for now.
   
   > Is there something we can do in the release process to enforce this?
   Some steps have to be performed by PMC members and once a release is 
announced it's easy to move onto something else and not complete the post 
release steps. If possible, yes we should try to automate as much as possible 
but it's often tricky.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #2682

2024-02-28 Thread Apache Jenkins Server
See 




Re: [PR] MINOR: Use archive repository for older releases [kafka-site]

2024-02-28 Thread via GitHub


divijvaidya commented on code in PR #589:
URL: https://github.com/apache/kafka-site/pull/589#discussion_r1505961253


##
downloads.html:
##
@@ -109,16 +109,16 @@ 3.5.2
 
-<a href="https://downloads.apache.org/kafka/3.5.2/RELEASE_NOTES.html">Release Notes</a>
+<a href="https://archive.apache.org/dist/kafka/3.5.2/RELEASE_NOTES.html">Release Notes</a>

Review Comment:
   I don't have an objection, but I want to ensure that we have some fixed 
criteria on what to move to the archive and what not to. I agree it's unlikely 
that we will have a bug-fix version of 3.5.x before it is EOL in June, but 
technically, it's still not EOL'ed.
   
   I am ok with the criterion being the best judgement of the PMC member at 
the time of release of a minor version. My motivation is to ensure that we 
have a criterion which all PMC members can follow in the future.
   
   > We have a tendency to forget to clean it up
   
   Is there something we can do in the release process to enforce this? I was 
thinking of having a multi-stage script which is run by the PMC member and 
prompts whether this is done or not (or does it automatically). We can take 
the help of the community to improve this as part of 
https://issues.apache.org/jira/browse/KAFKA-15198 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] KIP-853: KRaft Controller Membership Changes

2024-02-28 Thread Luke Chen
> 2. After "RemoveVoter", what is the role of the node?
> It looks like after the voter got removed from the voter set, it is not a
> voter anymore. But I think it can still fetch with the leader. So it
should
> be an observer, with a "process.role=controller"? And if the node was
> originally "process.role=controller,broker", it'll become a broker-only
> node?

> Kafka nodes need to allow for controllers that are not voters. I don't
expect too many issues from an implementation point of view. Most of
it may just be aggressive validation in KafkaConfig. I think the
easier way to explain this state is that there will be controllers
that will never become active controllers. If we want, we can have a
monitor that turns on (1) if a node is in this state. What do you
think?

I agree we need a way for users to monitor the node state, like when the
controller has completed the voter removal (so that it is safe to shut it
down), or when the controller has completed the voter addition (so that
users can start to add another controller), etc.

10. controller.quorum.voters:
This is an existing configuration. This configuration describes the state
of the quorum and will only be used if the kraft.version feature is 0.
> From the discussion, it looks like even if the kraft.version is 1, we
still first check the `controller.quorum.voters` if
`controller.quorum.bootstrap.servers` is not set. Is that correct? If so,
maybe we need to update the description?

11. When a controller starts up before joining as a voter, it'll be an
observer. In this case, will it be shown in the observer field of
`kafka-metadata-quorum describe --status`? Same question to a controller
after getting removed.

12. What will happen if there's only 1 voter and the user still tries to
remove the voter? Is any error returned?

Thanks.
Luke
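
For reference while reasoning about 11 and 12, the existing inspection
commands look like this (how removed voters and not-yet-added controllers
show up in this output is exactly what question 11 asks):

  bin/kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status
  bin/kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --replication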



On Thu, Jan 25, 2024 at 7:50 AM José Armando García Sancio
 wrote:

> Thanks for the feedback Luke. See my comments below:
>
> On Wed, Jan 24, 2024 at 4:20 AM Luke Chen  wrote:
> > 1. About "VotersRecord":
> >
> > > When a KRaft voter becomes leader it will write a KRaftVersionRecord
> and
> > VotersRecord to the log if the log or the latest snapshot doesn't contain
> > any VotersRecord. This is done to make sure that the voter set in the
> > bootstrap snapshot gets replicated to all of the voters and to not rely
> on
> > all of the voters being configured with the same bootstrapped voter set.
> >
> > > This record will also be written to the log if it has never been
> written
> > to the log in the past. This semantic is nice to have to consistently
> > replicate the bootstrapping snapshot, at
> > -00.checkpoint, of the leader to all of the
> > voters.
> >
> > If the `VotersRecord` has written into
> > -00.checkpoint,
> > later, a new voter added. Will we write a new checkpoint to the file?
> > If so, does that mean the `metadata.log.max.snapshot.interval.ms` will
> be
> > ignored?
>
> KRaft (KafkaRaftClient) won't initiate the snapshot generation. The
> snapshot generation will be initiated by the state machine (controller
> or broker) using the RaftClient::createSnapshot method. When the state
> machine calls into RaftClient::createSnapshot the KafkaRaftClient will
> compute the set of voters at the provided offset and epoch, and write
> the VotersRecord after the SnapshotHeaderRecord. This does mean that
> the KafkaRaftClient needs to store in memory all of the voter set
> configurations between the RaftClient::latestSnapshotId and the LEO
> for the KRaft partition.
>
> > If not, then how could we make sure the voter set in the bootstrap
> snapshot
> > gets replicated to all of the voters and to not rely on all of the voters
> > being configured with the same bootstrapped voter set?
>
> I think my answer above should answer your question. VoterRecord-s
> will be in the log (log segments) and the snapshots so they will be
> replicated by Fetch and FetchSnapshot. When the voter set is changed
> or bootstrapped, the leader will write the VotersRecord to the log
> (active log segment). When the state machine (controller or broker)
> asks to create a snapshot, KRaft will write the VotersRecord at the
> start to the snapshot after the SnapshotHeaderRecord.
>
> > 2. After "RemoveVoter", what is the role of the node?
> > It looks like after the voter got removed from the voter set, it is not a
> > voter anymore. But I think it can still fetch with the leader. So it
> should
> > be an observer, with a "process.role=controller"? And if the node was
> > originally "process.role=controller,broker", it'll become a broker-only
> > node?
>
> Kafka nodes need to allow for controllers that are not voters. I don't
> expect too many issues from an implementation point of view. Most of
> it may just be aggressive validation in KafkaConfig. I think the
> easier way to explain this state is that there will be controllers
> that will never become active controllers. If we want, 

Re: [PR] MINOR: Use archive repository for older releases [kafka-site]

2024-02-28 Thread via GitHub


mimaison commented on code in PR #589:
URL: https://github.com/apache/kafka-site/pull/589#discussion_r1505877761


##
downloads.html:
##
@@ -109,16 +109,16 @@ 3.5.2
 
-<a href="https://downloads.apache.org/kafka/3.5.2/RELEASE_NOTES.html">Release Notes</a>
+<a href="https://archive.apache.org/dist/kafka/3.5.2/RELEASE_NOTES.html">Release Notes</a>

Review Comment:
   Apache asks us to remove older releases from the mirroring system to reduce 
the load. Currently we have the following releases in the mirroring system: 
3.4.1, 3.5.0, 3.5.1, 3.5.2, 3.6.0, 3.6.1 and 3.7.0.
   
   So I'm doing a bit of cleanup as it seems it was missed in previous 
releases. As per the link you shared above:
   
   > Each project's distribution directory should contain the latest release in 
each branch that is currently under development. When development ceases on a 
version branch, the PMC should remove links to releases of that branch from 
their download directory.
   
   While we still support 3.5.2, I wouldn't say it's currently under 
development and no new releases for that branch are planned. So I was planning 
on deleting 3.4.1, 3.5.0, 3.5.1, 3.5.2 and 3.6.0 from the mirroring system. The 
releases are still downloadable via the archive repos and via Maven. Also, as 
3.5.2 sits below 3.6.0 (which should definitely be removed from the mirroring 
system) on the downloads page, I thought it was just simpler to switch to the 
archive links for 3.5.2.
   
   If you have a strong objection I'm happy to keep 3.5.2 in the mirroring 
system for now. We have a tendency to forget to clean it up, as you can see in 
https://issues.apache.org/jira/browse/KAFKA-6223, so each time we delete quite 
a few versions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINOR: Use archive repository for older releases [kafka-site]

2024-02-28 Thread via GitHub


divijvaidya commented on code in PR #589:
URL: https://github.com/apache/kafka-site/pull/589#discussion_r1505808343


##
downloads.html:
##
@@ -109,16 +109,16 @@ 3.5.2
 
-<a href="https://downloads.apache.org/kafka/3.5.2/RELEASE_NOTES.html">Release Notes</a>
+<a href="https://archive.apache.org/dist/kafka/3.5.2/RELEASE_NOTES.html">Release Notes</a>

Review Comment:
   The 3.5 branch is not EOL yet (its first release will complete 12 months in 
June 2024). Why are we moving it to the archive?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINOR: Use archive repository for older releases [kafka-site]

2024-02-28 Thread via GitHub


divijvaidya commented on code in PR #589:
URL: https://github.com/apache/kafka-site/pull/589#discussion_r1505808343


##
downloads.html:
##
@@ -109,16 +109,16 @@ 3.5.2
 
-<a href="https://downloads.apache.org/kafka/3.5.2/RELEASE_NOTES.html">Release Notes</a>
+<a href="https://archive.apache.org/dist/kafka/3.5.2/RELEASE_NOTES.html">Release Notes</a>

Review Comment:
   The 3.5 branch is not EOL yet. Why are we moving it to the archive?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [kafka-clients] Re: [ANNOUNCE] Apache Kafka 3.7.0

2024-02-28 Thread Apoorv Mittal
Thanks Stanislav for running the release. That must have been exhausting, but
you executed it well.

Regards,
Apoorv Mittal
+44 7721681581


On Wed, Feb 28, 2024 at 11:22 AM Divij Vaidya 
wrote:

> Thank you Stanislav for running the release, especially fixing the whole
> mess with out of sync site docs in different branches. Really appreciate
> your hard work on this one.
>
> Thank you to all contributors! Your contributions are what make the Apache
> Kafka community awesome <3
>
> There are many impactful changes in this release but the one closest to my
> heart is https://issues.apache.org/jira/browse/KAFKA-15046. I am very glad
> this is fixed. The P999 latency spikes were driving me crazy for a long
> time now.
>
> --
> Divij Vaidya
>
>
>
> On Wed, Feb 28, 2024 at 10:06 AM Satish Duggana 
> wrote:
>
> > Thanks Stanislav for all your hard work on running the release. Thanks
> > to all the contributors to this release.
> >
> >
> > On Wed, 28 Feb 2024 at 13:43, Bruno Cadonna  wrote:
> > >
> > > Thanks Stan and all contributors for the release!
> > >
> > > Best,
> > > Bruno
> > >
> > > On 2/27/24 7:01 PM, Stanislav Kozlovski wrote:
> > > > The Apache Kafka community is pleased to announce the release of
> > > > Apache Kafka 3.7.0
> > > >
> > > > This is a minor release that includes new features, fixes, and
> > > > improvements from 296 JIRAs
> > > >
> > > > An overview of the release and its notable changes can be found in
> the
> > > > release blog post:
> > > > https://kafka.apache.org/blog#apache_kafka_370_release_announcement
> > > >
> > > > All of the changes in this release can be found in the release notes:
> > > > https://www.apache.org/dist/kafka/3.7.0/RELEASE_NOTES.html
> > > >
> > > > You can download the source and binary release (Scala 2.12, 2.13)
> from:
> > > > https://kafka.apache.org/downloads#3.7.0
> > > >
> > > >
> >
> ---
> > > >
> > > >
> > > > Apache Kafka is a distributed streaming platform with four core APIs:
> > > >
> > > >
> > > > ** The Producer API allows an application to publish a stream of
> > records to
> > > > one or more Kafka topics.
> > > >
> > > > ** The Consumer API allows an application to subscribe to one or more
> > > > topics and process the stream of records produced to them.
> > > >
> > > > ** The Streams API allows an application to act as a stream
> processor,
> > > > consuming an input stream from one or more topics and producing an
> > > > output stream to one or more output topics, effectively transforming
> > the
> > > > input streams to output streams.
> > > >
> > > > ** The Connector API allows building and running reusable producers
> or
> > > > consumers that connect Kafka topics to existing applications or data
> > > > systems. For example, a connector to a relational database might
> > > > capture every change to a table.
> > > >
> > > >
> > > > With these APIs, Kafka can be used for two broad classes of
> > application:
> > > >
> > > > ** Building real-time streaming data pipelines that reliably get data
> > > > between systems or applications.
> > > >
> > > > ** Building real-time streaming applications that transform or react
> > > > to the streams of data.
> > > >
> > > >
> > > > Apache Kafka is in use at large and small companies worldwide,
> > including
> > > > Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest,
> > Rabobank,
> > > > Target, The New York Times, Uber, Yelp, and Zalando, among others.
> > > >
> > > > A big thank you to the following 146 contributors to this release!
> > > > (Please report an unintended omission)
> > > >
> > > > Abhijeet Kumar, Akhilesh Chaganti, Alieh, Alieh Saeedi, Almog Gavra,
> > > > Alok Thatikunta, Alyssa Huang, Aman Singh, Andras Katona, Andrew
> > > > Schofield, Anna Sophie Blee-Goldman, Anton Agestam, Apoorv Mittal,
> > > > Arnout Engelen, Arpit Goyal, Artem Livshits, Ashwin Pankaj,
> > > > ashwinpankaj, atu-sharm, bachmanity1, Bob Barrett, Bruno Cadonna,
> > > > Calvin Liu, Cerchie, chern, Chris Egerton, Christo Lolov, Colin
> > > > Patrick McCabe, Colt McNealy, Crispin Bernier, David Arthur, David
> > > > Jacot, David Mao, Deqi Hu, Dimitar Dimitrov, Divij Vaidya, Dongnuo
> > > > Lyu, Eaugene Thomas, Eduwer Camacaro, Eike Thaden, Federico Valeri,
> > > > Florin Akermann, Gantigmaa Selenge, Gaurav Narula, gongzhongqiang,
> > > > Greg Harris, Guozhang Wang, Gyeongwon, Do, Hailey Ni, Hanyu Zheng,
> Hao
> > > > Li, Hector Geraldino, hudeqi, Ian McDonald, Iblis Lin, Igor Soarez,
> > > > iit2009060, Ismael Juma, Jakub Scholz, James Cheng, Jason Gustafson,
> > > > Jay Wang, Jeff Kim, Jim Galasyn, John Roesler, Jorge Esteban Quilcate
> > > > Otoya, Josep Prat, José Armando García Sancio, Jotaniya Jeel, Jouni
> > > > Tenhunen, Jun Rao, Justine Olshan, Kamal Chandraprakash, Kirk True,
> > > > kpatelatwork, kumarpritam863, Laglangyue, Levani Kokhreidze, Lianet
> > > > Magrans, Liu Zeyu, Lucas Brutschy, Lucia Cerchie, Luke 

Re: [kafka-clients] Re: [ANNOUNCE] Apache Kafka 3.7.0

2024-02-28 Thread Divij Vaidya
Thank you Stanislav for running the release, especially fixing the whole
mess with out of sync site docs in different branches. Really appreciate
your hard work on this one.

Thank you to all contributors! Your contributions are what make the Apache
Kafka community awesome <3

There are many impactful changes in this release but the one closest to my
heart is https://issues.apache.org/jira/browse/KAFKA-15046. I am very glad
this is fixed. The P999 latency spikes were driving me crazy for a long
time now.

--
Divij Vaidya



On Wed, Feb 28, 2024 at 10:06 AM Satish Duggana 
wrote:

> Thanks Stanislav for all your hard work on running the release. Thanks
> to all the contributors to this release.
>
>
> On Wed, 28 Feb 2024 at 13:43, Bruno Cadonna  wrote:
> >
> > Thanks Stan and all contributors for the release!
> >
> > Best,
> > Bruno
> >
> > On 2/27/24 7:01 PM, Stanislav Kozlovski wrote:
> > > The Apache Kafka community is pleased to announce the release of
> > > Apache Kafka 3.7.0
> > >
> > > This is a minor release that includes new features, fixes, and
> > > improvements from 296 JIRAs
> > >
> > > An overview of the release and its notable changes can be found in the
> > > release blog post:
> > > https://kafka.apache.org/blog#apache_kafka_370_release_announcement
> > >
> > > All of the changes in this release can be found in the release notes:
> > > https://www.apache.org/dist/kafka/3.7.0/RELEASE_NOTES.html
> > >
> > > You can download the source and binary release (Scala 2.12, 2.13) from:
> > > https://kafka.apache.org/downloads#3.7.0
> > >
> > >
> ---
> > >
> > >
> > > Apache Kafka is a distributed streaming platform with four core APIs:
> > >
> > >
> > > ** The Producer API allows an application to publish a stream of
> records to
> > > one or more Kafka topics.
> > >
> > > ** The Consumer API allows an application to subscribe to one or more
> > > topics and process the stream of records produced to them.
> > >
> > > ** The Streams API allows an application to act as a stream processor,
> > > consuming an input stream from one or more topics and producing an
> > > output stream to one or more output topics, effectively transforming
> the
> > > input streams to output streams.
> > >
> > > ** The Connector API allows building and running reusable producers or
> > > consumers that connect Kafka topics to existing applications or data
> > > systems. For example, a connector to a relational database might
> > > capture every change to a table.
> > >
> > >
> > > With these APIs, Kafka can be used for two broad classes of
> application:
> > >
> > > ** Building real-time streaming data pipelines that reliably get data
> > > between systems or applications.
> > >
> > > ** Building real-time streaming applications that transform or react
> > > to the streams of data.
> > >
> > >
> > > Apache Kafka is in use at large and small companies worldwide,
> including
> > > Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest,
> Rabobank,
> > > Target, The New York Times, Uber, Yelp, and Zalando, among others.
> > >
> > > A big thank you to the following 146 contributors to this release!
> > > (Please report an unintended omission)
> > >
> > > Abhijeet Kumar, Akhilesh Chaganti, Alieh, Alieh Saeedi, Almog Gavra,
> > > Alok Thatikunta, Alyssa Huang, Aman Singh, Andras Katona, Andrew
> > > Schofield, Anna Sophie Blee-Goldman, Anton Agestam, Apoorv Mittal,
> > > Arnout Engelen, Arpit Goyal, Artem Livshits, Ashwin Pankaj,
> > > ashwinpankaj, atu-sharm, bachmanity1, Bob Barrett, Bruno Cadonna,
> > > Calvin Liu, Cerchie, chern, Chris Egerton, Christo Lolov, Colin
> > > Patrick McCabe, Colt McNealy, Crispin Bernier, David Arthur, David
> > > Jacot, David Mao, Deqi Hu, Dimitar Dimitrov, Divij Vaidya, Dongnuo
> > > Lyu, Eaugene Thomas, Eduwer Camacaro, Eike Thaden, Federico Valeri,
> > > Florin Akermann, Gantigmaa Selenge, Gaurav Narula, gongzhongqiang,
> > > Greg Harris, Guozhang Wang, Gyeongwon, Do, Hailey Ni, Hanyu Zheng, Hao
> > > Li, Hector Geraldino, hudeqi, Ian McDonald, Iblis Lin, Igor Soarez,
> > > iit2009060, Ismael Juma, Jakub Scholz, James Cheng, Jason Gustafson,
> > > Jay Wang, Jeff Kim, Jim Galasyn, John Roesler, Jorge Esteban Quilcate
> > > Otoya, Josep Prat, José Armando García Sancio, Jotaniya Jeel, Jouni
> > > Tenhunen, Jun Rao, Justine Olshan, Kamal Chandraprakash, Kirk True,
> > > kpatelatwork, kumarpritam863, Laglangyue, Levani Kokhreidze, Lianet
> > > Magrans, Liu Zeyu, Lucas Brutschy, Lucia Cerchie, Luke Chen, maniekes,
> > > Manikumar Reddy, mannoopj, Maros Orsak, Matthew de Detrich, Matthias
> > > J. Sax, Max Riedel, Mayank Shekhar Narula, Mehari Beyene, Michael
> > > Westerby, Mickael Maison, Nick Telford, Nikhil Ramakrishnan, Nikolay,
> > > Okada Haruki, olalamichelle, Omnia G.H Ibrahim, Owen Leung, Paolo
> > > Patierno, Philip Nee, Phuc-Hong-Tran, Proven Provenzano, Purshotam
> > > Chauhan, Qichao Chu, 

Re: [kafka-clients] Re: [ANNOUNCE] Apache Kafka 3.7.0

2024-02-28 Thread Manikumar
Thanks Stanislav for running the release!



On Wed, Feb 28, 2024 at 2:36 PM Satish Duggana 
wrote:

> Thanks Stanislav for all your hard work on running the release. Thanks
> to all the contributors to this release.
>
>
> On Wed, 28 Feb 2024 at 13:43, Bruno Cadonna  wrote:
> >
> > Thanks Stan and all contributors for the release!
> >
> > Best,
> > Bruno
> >
> > On 2/27/24 7:01 PM, Stanislav Kozlovski wrote:
> > > The Apache Kafka community is pleased to announce the release of
> > > Apache Kafka 3.7.0
> > >
> > > This is a minor release that includes new features, fixes, and
> > > improvements from 296 JIRAs
> > >
> > > An overview of the release and its notable changes can be found in the
> > > release blog post:
> > > https://kafka.apache.org/blog#apache_kafka_370_release_announcement
> > >
> > > All of the changes in this release can be found in the release notes:
> > > https://www.apache.org/dist/kafka/3.7.0/RELEASE_NOTES.html
> > >
> > > You can download the source and binary release (Scala 2.12, 2.13) from:
> > > https://kafka.apache.org/downloads#3.7.0
> > >
> > >
> ---
> > >
> > >
> > > Apache Kafka is a distributed streaming platform with four core APIs:
> > >
> > >
> > > ** The Producer API allows an application to publish a stream of
> records to
> > > one or more Kafka topics.
> > >
> > > ** The Consumer API allows an application to subscribe to one or more
> > > topics and process the stream of records produced to them.
> > >
> > > ** The Streams API allows an application to act as a stream processor,
> > > consuming an input stream from one or more topics and producing an
> > > output stream to one or more output topics, effectively transforming
> the
> > > input streams to output streams.
> > >
> > > ** The Connector API allows building and running reusable producers or
> > > consumers that connect Kafka topics to existing applications or data
> > > systems. For example, a connector to a relational database might
> > > capture every change to a table.
> > >
> > >
> > > With these APIs, Kafka can be used for two broad classes of
> application:
> > >
> > > ** Building real-time streaming data pipelines that reliably get data
> > > between systems or applications.
> > >
> > > ** Building real-time streaming applications that transform or react
> > > to the streams of data.
> > >
> > >
> > > Apache Kafka is in use at large and small companies worldwide,
> including
> > > Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest,
> Rabobank,
> > > Target, The New York Times, Uber, Yelp, and Zalando, among others.
> > >
> > > A big thank you to the following 146 contributors to this release!
> > > (Please report an unintended omission)
> > >
> > > Abhijeet Kumar, Akhilesh Chaganti, Alieh, Alieh Saeedi, Almog Gavra,
> > > Alok Thatikunta, Alyssa Huang, Aman Singh, Andras Katona, Andrew
> > > Schofield, Anna Sophie Blee-Goldman, Anton Agestam, Apoorv Mittal,
> > > Arnout Engelen, Arpit Goyal, Artem Livshits, Ashwin Pankaj,
> > > ashwinpankaj, atu-sharm, bachmanity1, Bob Barrett, Bruno Cadonna,
> > > Calvin Liu, Cerchie, chern, Chris Egerton, Christo Lolov, Colin
> > > Patrick McCabe, Colt McNealy, Crispin Bernier, David Arthur, David
> > > Jacot, David Mao, Deqi Hu, Dimitar Dimitrov, Divij Vaidya, Dongnuo
> > > Lyu, Eaugene Thomas, Eduwer Camacaro, Eike Thaden, Federico Valeri,
> > > Florin Akermann, Gantigmaa Selenge, Gaurav Narula, gongzhongqiang,
> > > Greg Harris, Guozhang Wang, Gyeongwon, Do, Hailey Ni, Hanyu Zheng, Hao
> > > Li, Hector Geraldino, hudeqi, Ian McDonald, Iblis Lin, Igor Soarez,
> > > iit2009060, Ismael Juma, Jakub Scholz, James Cheng, Jason Gustafson,
> > > Jay Wang, Jeff Kim, Jim Galasyn, John Roesler, Jorge Esteban Quilcate
> > > Otoya, Josep Prat, José Armando García Sancio, Jotaniya Jeel, Jouni
> > > Tenhunen, Jun Rao, Justine Olshan, Kamal Chandraprakash, Kirk True,
> > > kpatelatwork, kumarpritam863, Laglangyue, Levani Kokhreidze, Lianet
> > > Magrans, Liu Zeyu, Lucas Brutschy, Lucia Cerchie, Luke Chen, maniekes,
> > > Manikumar Reddy, mannoopj, Maros Orsak, Matthew de Detrich, Matthias
> > > J. Sax, Max Riedel, Mayank Shekhar Narula, Mehari Beyene, Michael
> > > Westerby, Mickael Maison, Nick Telford, Nikhil Ramakrishnan, Nikolay,
> > > Okada Haruki, olalamichelle, Omnia G.H Ibrahim, Owen Leung, Paolo
> > > Patierno, Philip Nee, Phuc-Hong-Tran, Proven Provenzano, Purshotam
> > > Chauhan, Qichao Chu, Matthias J. Sax, Rajini Sivaram, Renaldo Baur
> > > Filho, Ritika Reddy, Robert Wagner, Rohan, Ron Dagostino, Roon, runom,
> > > Ruslan Krivoshein, rykovsi, Sagar Rao, Said Boudjelda, Satish Duggana,
> > > shuoer86, Stanislav Kozlovski, Taher Ghaleb, Tang Yunzi, TapDang,
> > > Taras Ledkov, tkuramoto33, Tyler Bertrand, vamossagar12, Vedarth
> > > Sharma, Viktor Somogyi-Vass, Vincent Jiang, Walker Carlson,
> > > Wuzhengyu97, Xavier Léauté, Xiaobing Fang, yangy, Ritika Reddy,

[jira] [Created] (KAFKA-16311) [Debezium Informix Connector Unable to Commit Processed Log Position]

2024-02-28 Thread Maaheen Yasin (Jira)
Maaheen Yasin created KAFKA-16311:
-

 Summary: [Debezium Informix Connector Unable to Commit Processed 
Log Position]
 Key: KAFKA-16311
 URL: https://issues.apache.org/jira/browse/KAFKA-16311
 Project: Kafka
  Issue Type: Bug
Reporter: Maaheen Yasin
 Attachments: connect logs.out

I am using Debezium Informix Source connector and JDBC Sink connector and the 
below versions of Informix database and KAFKA Connect.

Informix Dynamic Server
14.10.FC10W1X2
Informix JDBC Driver for Informix Dynamic Server
4.50.JC10

KAFKA Version: 7.4.1-ce

 

*Expected Behavior:*

All tasks of the Informix source connector are running, and all messages are 
being published in the topic. During the DDL Execution, the informix database 
is put under single user mode and the DDL on the table on which CDC was 
previously enabled was executed. After the database exits from the single user 
mode, then the connector should be able to reconnect with the source database 
and be able to publish messages in the topic for each new event. 

*Actual Behavior:*

All tasks of the Informix source connector are running, and all messages are 
being published in the topic. During the DDL Execution, the database is put 
under single user mode and the DDL on the table on which CDC was previously 
enabled was executed. After the database exits from the single user mode, the 
source connector is able to reconnect with the database, however, no messages 
are being published in the topic and the below error is being printed in the 
KAFKA Connect Logs. 

 

*[2024-02-22 15:54:34,913] WARN [kafka_devs|task-0|offsets] Couldn't commit 
processed log positions with the source database due to a concurrent connector 
shutdown or restart (io.debezium.connector.common.BaseSourceTask:349)*

 

The complete Kafka Connect logs have been attached. Kindly comment on why this 
issue is occurring and what steps should be followed to avoid or resolve it. 

 

Thanks. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16310) ListOffsets doesn't report the offset with maxTimestamp anymore

2024-02-28 Thread Emanuele Sabellico (Jira)
Emanuele Sabellico created KAFKA-16310:
--

 Summary: ListOffsets doesn't report the offset with maxTimestamp 
anymore
 Key: KAFKA-16310
 URL: https://issues.apache.org/jira/browse/KAFKA-16310
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.7.0
Reporter: Emanuele Sabellico


The last one is reported instead.
A test in librdkafka (0081/do_test_ListOffsets) is failing; it checks that the 
offset with the max timestamp is the middle one and not the last one. The test 
passes with 3.6.0 and previous versions.
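
A minimal way to observe this from Java is the Admin API's
OffsetSpec.maxTimestamp() (sketch; the topic name and bootstrap address are
illustrative):

  import java.util.Collections;
  import java.util.Map;
  import java.util.Properties;
  import org.apache.kafka.clients.admin.Admin;
  import org.apache.kafka.clients.admin.AdminClientConfig;
  import org.apache.kafka.clients.admin.ListOffsetsResult;
  import org.apache.kafka.clients.admin.OffsetSpec;
  import org.apache.kafka.common.TopicPartition;

  public class MaxTimestampCheck {
      public static void main(String[] args) throws Exception {
          Properties props = new Properties();
          props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
          TopicPartition tp = new TopicPartition("test", 0);
          try (Admin admin = Admin.create(props)) {
              Map<TopicPartition, OffsetSpec> request =
                      Collections.singletonMap(tp, OffsetSpec.maxTimestamp());
              ListOffsetsResult result = admin.listOffsets(request);
              // Expected: the offset of the record with the highest timestamp
              // (the middle record in the librdkafka test); with 3.7.0 the
              // last offset is reported instead.
              System.out.println(result.partitionResult(tp).get().offset());
          }
      }
  }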



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] MINOR: Use archive repository for older releases [kafka-site]

2024-02-28 Thread via GitHub


mimaison opened a new pull request, #589:
URL: https://github.com/apache/kafka-site/pull/589

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] KIP-1021: Allow to get last stable offset (LSO) in kafka-get-offsets.sh

2024-02-28 Thread Ahmed Sobeh
Hi Andrew,

Thanks for the hint! That sounds reasonable, do you think adding a
conditional argument, something like "--time -1 --isolation -committed" and
"--time -1 --isolation -uncommitted" would make sense to keep the
consistency of getting the offset by time? or do you think having a special
argument for this case is better?

On Tue, Feb 27, 2024 at 2:19 PM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi Ahmed,
> Thanks for the KIP.  It looks like a useful addition.
>
> The use of negative timestamps, and in particular letting the user use
> `--time -1` or the equivalent `--time latest`
> is a bit peculiar in this tool already. The negative timestamps come from
> org.apache.kafka.common.requests.ListOffsetsRequest,
> but you’re not actually adding another value to that. As a result, I
> really wouldn’t recommend using -6 for the new
> flag. LSO is really a variant of -1 with read_committed isolation level.
>
> I think that perhaps it would be better to add `--last-stable` as an
> alternative to `--time`. Then you’ll get the LSO with
> cleaner syntax.
>
> Thanks,
> Andrew
>
>
> > On 27 Feb 2024, at 10:12, Ahmed Sobeh 
> wrote:
> >
> > Hi all,
> > I would like to start a discussion on KIP-1021, which would enable
> getting
> > LSO in kafka-get-offsets.sh:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1021%3A+Allow+to+get+last+stable+offset+%28LSO%29+in+kafka-get-offsets.sh
> >
> > Best,
> > Ahmed
>
>

-- 
*Ahmed Sobeh*
Engineering Manager OSPO, *Aiven*
ahmed.so...@aiven.io 
aiven.io
*Aiven Deutschland GmbH*
Immanuelkirchstraße 26, 10405 Berlin
Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
Amtsgericht Charlottenburg, HRB 209739 B


Re: [DISCUSS] Apache Kafka 3.8.0 release

2024-02-28 Thread Satish Duggana
Thanks Josep, +1.

On Tue, 27 Feb 2024 at 17:29, Divij Vaidya  wrote:
>
> Thank you for volunteering Josep. +1 from me.
>
> --
> Divij Vaidya
>
>
>
> On Tue, Feb 27, 2024 at 9:35 AM Bruno Cadonna  wrote:
>
> > Thanks Josep!
> >
> > +1
> >
> > Best,
> > Bruno
> >
> > On 2/26/24 9:53 PM, Chris Egerton wrote:
> > > Thanks Josep, I'm +1 as well.
> > >
> > > On Mon, Feb 26, 2024 at 12:32 PM Justine Olshan
> > >  wrote:
> > >
> > >> Thanks Joesp. +1 from me.
> > >>
> > >> On Mon, Feb 26, 2024 at 3:37 AM Josep Prat  > >
> > >> wrote:
> > >>
> > >>> Hi all,
> > >>>
> > >>> I'd like to volunteer as release manager for the Apache Kafka 3.8.0
> > >>> release.
> > >>> If there are no objections, I'll start building a release plan (or
> > >> adapting
> > >>> the one Colin made some weeks ago) in the wiki in the next days.
> > >>>
> > >>> Thank you.
> > >>>
> > >>> --
> > >>>
> > >>> *Josep Prat*
> > >>> Open Source Engineering Director, *Aiven*
> > >>> josep.p...@aiven.io   |   +491715557497
> > >>> aiven.io
> > >>> *Aiven Deutschland GmbH*
> > >>> Alexanderufer 3-7, 10117 Berlin
> > >>> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > >>> Amtsgericht Charlottenburg, HRB 209739 B
> > >>>
> > >>
> > >
> >


Re: [kafka-clients] Re: [ANNOUNCE] Apache Kafka 3.7.0

2024-02-28 Thread Satish Duggana
Thanks Stanislav for all your hard work on running the release. Thanks
to all the contributors to this release.


On Wed, 28 Feb 2024 at 13:43, Bruno Cadonna  wrote:
>
> Thanks Stan and all contributors for the release!
>
> Best,
> Bruno
>
> On 2/27/24 7:01 PM, Stanislav Kozlovski wrote:
> > The Apache Kafka community is pleased to announce the release of
> > Apache Kafka 3.7.0
> >
> > This is a minor release that includes new features, fixes, and
> > improvements from 296 JIRAs
> >
> > An overview of the release and its notable changes can be found in the
> > release blog post:
> > https://kafka.apache.org/blog#apache_kafka_370_release_announcement
> >
> > All of the changes in this release can be found in the release notes:
> > https://www.apache.org/dist/kafka/3.7.0/RELEASE_NOTES.html
> >
> > You can download the source and binary release (Scala 2.12, 2.13) from:
> > https://kafka.apache.org/downloads#3.7.0
> >
> > ---
> >
> >
> > Apache Kafka is a distributed streaming platform with four core APIs:
> >
> >
> > ** The Producer API allows an application to publish a stream of records to
> > one or more Kafka topics.
> >
> > ** The Consumer API allows an application to subscribe to one or more
> > topics and process the stream of records produced to them.
> >
> > ** The Streams API allows an application to act as a stream processor,
> > consuming an input stream from one or more topics and producing an
> > output stream to one or more output topics, effectively transforming the
> > input streams to output streams.
> >
> > ** The Connector API allows building and running reusable producers or
> > consumers that connect Kafka topics to existing applications or data
> > systems. For example, a connector to a relational database might
> > capture every change to a table.
> >
> >
> > With these APIs, Kafka can be used for two broad classes of application:
> >
> > ** Building real-time streaming data pipelines that reliably get data
> > between systems or applications.
> >
> > ** Building real-time streaming applications that transform or react
> > to the streams of data.
> >
> >
> > Apache Kafka is in use at large and small companies worldwide, including
> > Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
> > Target, The New York Times, Uber, Yelp, and Zalando, among others.
> >
> > A big thank you to the following 146 contributors to this release!
> > (Please report an unintended omission)
> >
> > Abhijeet Kumar, Akhilesh Chaganti, Alieh, Alieh Saeedi, Almog Gavra,
> > Alok Thatikunta, Alyssa Huang, Aman Singh, Andras Katona, Andrew
> > Schofield, Anna Sophie Blee-Goldman, Anton Agestam, Apoorv Mittal,
> > Arnout Engelen, Arpit Goyal, Artem Livshits, Ashwin Pankaj,
> > ashwinpankaj, atu-sharm, bachmanity1, Bob Barrett, Bruno Cadonna,
> > Calvin Liu, Cerchie, chern, Chris Egerton, Christo Lolov, Colin
> > Patrick McCabe, Colt McNealy, Crispin Bernier, David Arthur, David
> > Jacot, David Mao, Deqi Hu, Dimitar Dimitrov, Divij Vaidya, Dongnuo
> > Lyu, Eaugene Thomas, Eduwer Camacaro, Eike Thaden, Federico Valeri,
> > Florin Akermann, Gantigmaa Selenge, Gaurav Narula, gongzhongqiang,
> > Greg Harris, Guozhang Wang, Gyeongwon, Do, Hailey Ni, Hanyu Zheng, Hao
> > Li, Hector Geraldino, hudeqi, Ian McDonald, Iblis Lin, Igor Soarez,
> > iit2009060, Ismael Juma, Jakub Scholz, James Cheng, Jason Gustafson,
> > Jay Wang, Jeff Kim, Jim Galasyn, John Roesler, Jorge Esteban Quilcate
> > Otoya, Josep Prat, José Armando García Sancio, Jotaniya Jeel, Jouni
> > Tenhunen, Jun Rao, Justine Olshan, Kamal Chandraprakash, Kirk True,
> > kpatelatwork, kumarpritam863, Laglangyue, Levani Kokhreidze, Lianet
> > Magrans, Liu Zeyu, Lucas Brutschy, Lucia Cerchie, Luke Chen, maniekes,
> > Manikumar Reddy, mannoopj, Maros Orsak, Matthew de Detrich, Matthias
> > J. Sax, Max Riedel, Mayank Shekhar Narula, Mehari Beyene, Michael
> > Westerby, Mickael Maison, Nick Telford, Nikhil Ramakrishnan, Nikolay,
> > Okada Haruki, olalamichelle, Omnia G.H Ibrahim, Owen Leung, Paolo
> > Patierno, Philip Nee, Phuc-Hong-Tran, Proven Provenzano, Purshotam
> > Chauhan, Qichao Chu, Matthias J. Sax, Rajini Sivaram, Renaldo Baur
> > Filho, Ritika Reddy, Robert Wagner, Rohan, Ron Dagostino, Roon, runom,
> > Ruslan Krivoshein, rykovsi, Sagar Rao, Said Boudjelda, Satish Duggana,
> > shuoer86, Stanislav Kozlovski, Taher Ghaleb, Tang Yunzi, TapDang,
> > Taras Ledkov, tkuramoto33, Tyler Bertrand, vamossagar12, Vedarth
> > Sharma, Viktor Somogyi-Vass, Vincent Jiang, Walker Carlson,
> > Wuzhengyu97, Xavier Léauté, Xiaobing Fang, yangy, Ritika Reddy,
> > Yanming Zhou, Yash Mayya, yuyli, zhaohaidao, Zihao Lin, Ziming Deng
> >
> > We welcome your help and feedback. For more information on how to
> > report problems, and to get involved, visit the project website at
> > https://kafka.apache.org/
> >
> > Thank you!
> >
> >
> > Regards,
> >
> > Stanislav 

Re: [ANNOUNCE] Apache Kafka 3.7.0

2024-02-28 Thread Bruno Cadonna

Thanks Stan and all contributors for the release!

Best,
Bruno

On 2/27/24 7:01 PM, Stanislav Kozlovski wrote:

The Apache Kafka community is pleased to announce the release of
Apache Kafka 3.7.0

This is a minor release that includes new features, fixes, and
improvements from 296 JIRAs

An overview of the release and its notable changes can be found in the
release blog post:
https://kafka.apache.org/blog#apache_kafka_370_release_announcement

All of the changes in this release can be found in the release notes:
https://www.apache.org/dist/kafka/3.7.0/RELEASE_NOTES.html

You can download the source and binary release (Scala 2.12, 2.13) from:
https://kafka.apache.org/downloads#3.7.0

---


Apache Kafka is a distributed streaming platform with four core APIs:


** The Producer API allows an application to publish a stream of records to
one or more Kafka topics.

** The Consumer API allows an application to subscribe to one or more
topics and process the stream of records produced to them.

** The Streams API allows an application to act as a stream processor,
consuming an input stream from one or more topics and producing an
output stream to one or more output topics, effectively transforming the
input streams to output streams.

** The Connector API allows building and running reusable producers or
consumers that connect Kafka topics to existing applications or data
systems. For example, a connector to a relational database might
capture every change to a table.
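
As a quick orientation for readers new to these APIs, here is a minimal
sketch of the Producer and Consumer APIs using the Java clients. It is
illustrative only and not part of the announcement: the broker address
"localhost:9092", the topic "demo-topic", and the group id "demo-group"
are assumed placeholders.

// Minimal sketch of the Producer and Consumer APIs (Java clients).
// Assumes a broker at localhost:9092; topic and group id are placeholders.
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class QuickstartSketch {
    public static void main(String[] args) {
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        // Producer API: publish a record to a topic.
        try (KafkaProducer<String, String> producer =
                new KafkaProducer<>(producerProps)) {
            producer.send(new ProducerRecord<>("demo-topic", "key", "hello"));
        }

        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "demo-group");
        consumerProps.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("auto.offset.reset", "earliest");

        // Consumer API: subscribe to the topic and poll for records.
        try (KafkaConsumer<String, String> consumer =
                new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(List.of("demo-topic"));
            ConsumerRecords<String, String> records =
                    consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("%s -> %s%n", record.key(), record.value());
            }
        }
    }
}

Run against a local broker, this should print the record the producer
just wrote; in a real application the consumer would poll in a loop.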


With these APIs, Kafka can be used for two broad classes of application:

** Building real-time streaming data pipelines that reliably get data
between systems or applications.

** Building real-time streaming applications that transform or react
to the streams of data.
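
To illustrate the second class ("transform or react"), here is a minimal
Kafka Streams sketch that upper-cases every record value from one topic
into another. It is an illustrative assumption, not taken from the
release: "input-topic", "output-topic", and the application id are
placeholders.

// Minimal Kafka Streams sketch: read input-topic, transform, write output-topic.
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG,
                Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG,
                Serdes.String().getClass());

        // Build a topology: consume, transform each value, produce.
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");
        input.mapValues(value -> value.toUpperCase()).to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        // Close the topology cleanly on JVM shutdown.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}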


Apache Kafka is in use at large and small companies worldwide, including
Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
Target, The New York Times, Uber, Yelp, and Zalando, among others.

A big thank you to the following 146 contributors to this release!
(Please report any unintended omissions.)

Abhijeet Kumar, Akhilesh Chaganti, Alieh, Alieh Saeedi, Almog Gavra,
Alok Thatikunta, Alyssa Huang, Aman Singh, Andras Katona, Andrew
Schofield, Anna Sophie Blee-Goldman, Anton Agestam, Apoorv Mittal,
Arnout Engelen, Arpit Goyal, Artem Livshits, Ashwin Pankaj,
ashwinpankaj, atu-sharm, bachmanity1, Bob Barrett, Bruno Cadonna,
Calvin Liu, Cerchie, chern, Chris Egerton, Christo Lolov, Colin
Patrick McCabe, Colt McNealy, Crispin Bernier, David Arthur, David
Jacot, David Mao, Deqi Hu, Dimitar Dimitrov, Divij Vaidya, Dongnuo
Lyu, Eaugene Thomas, Eduwer Camacaro, Eike Thaden, Federico Valeri,
Florin Akermann, Gantigmaa Selenge, Gaurav Narula, gongzhongqiang,
Greg Harris, Guozhang Wang, Gyeongwon, Do, Hailey Ni, Hanyu Zheng, Hao
Li, Hector Geraldino, hudeqi, Ian McDonald, Iblis Lin, Igor Soarez,
iit2009060, Ismael Juma, Jakub Scholz, James Cheng, Jason Gustafson,
Jay Wang, Jeff Kim, Jim Galasyn, John Roesler, Jorge Esteban Quilcate
Otoya, Josep Prat, José Armando García Sancio, Jotaniya Jeel, Jouni
Tenhunen, Jun Rao, Justine Olshan, Kamal Chandraprakash, Kirk True,
kpatelatwork, kumarpritam863, Laglangyue, Levani Kokhreidze, Lianet
Magrans, Liu Zeyu, Lucas Brutschy, Lucia Cerchie, Luke Chen, maniekes,
Manikumar Reddy, mannoopj, Maros Orsak, Matthew de Detrich, Matthias
J. Sax, Max Riedel, Mayank Shekhar Narula, Mehari Beyene, Michael
Westerby, Mickael Maison, Nick Telford, Nikhil Ramakrishnan, Nikolay,
Okada Haruki, olalamichelle, Omnia G.H Ibrahim, Owen Leung, Paolo
Patierno, Philip Nee, Phuc-Hong-Tran, Proven Provenzano, Purshotam
Chauhan, Qichao Chu, Rajini Sivaram, Renaldo Baur
Filho, Ritika Reddy, Robert Wagner, Rohan, Ron Dagostino, Roon, runom,
Ruslan Krivoshein, rykovsi, Sagar Rao, Said Boudjelda, Satish Duggana,
shuoer86, Stanislav Kozlovski, Taher Ghaleb, Tang Yunzi, TapDang,
Taras Ledkov, tkuramoto33, Tyler Bertrand, vamossagar12, Vedarth
Sharma, Viktor Somogyi-Vass, Vincent Jiang, Walker Carlson,
Wuzhengyu97, Xavier Léauté, Xiaobing Fang, yangy,
Yanming Zhou, Yash Mayya, yuyli, zhaohaidao, Zihao Lin, Ziming Deng

We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
https://kafka.apache.org/

Thank you!


Regards,

Stanislav Kozlovski
Release Manager for Apache Kafka 3.7.0