Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-19 Thread Raman Verma
Hello Mikael,

Thanks for the KIP.

I see that the API response contains some information about each partition.
```
{ "name": "PartitionSize", "type": "int64", "versions": "0+",
  "about": "The size of the log segments in this partition in bytes." }
```
Can these be summed to provide the used space in a `log.dir`?
This would also be specific to a `log.dir` (for the case where multiple
log.dirs are hosted on the same underlying device).
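Raman's suggestion — deriving per-log.dir used space by summing the per-partition `PartitionSize` fields — can be sketched as follows. This is an illustrative sketch over a plain-dict rendering of a DescribeLogDirs-style response, not the actual Kafka admin client API; the field names mirror the KIP's JSON schema, but the surrounding structure is assumed.

```python
def used_space_per_log_dir(describe_log_dirs_response):
    """Sum per-partition log sizes to get used bytes per log dir.

    `describe_log_dirs_response` is assumed to be a plain-dict rendering of
    a DescribeLogDirs response: a list of log dirs, each containing topics,
    each containing partitions with a "PartitionSize" in bytes (per the
    KIP's schema snippet above).
    """
    used = {}
    for log_dir in describe_log_dirs_response["Results"]:
        total = 0
        for topic in log_dir["Topics"]:
            for partition in topic["Partitions"]:
                total += partition["PartitionSize"]
        used[log_dir["LogDir"]] = total
    return used

# Hypothetical response with two log dirs on the same device.
response = {
    "Results": [
        {"LogDir": "/data/kafka-1",
         "Topics": [{"Name": "orders",
                     "Partitions": [{"PartitionIndex": 0, "PartitionSize": 1024},
                                    {"PartitionIndex": 1, "PartitionSize": 2048}]}]},
        {"LogDir": "/data/kafka-2", "Topics": []},
    ]
}
print(used_space_per_log_dir(response))
# → {'/data/kafka-1': 3072, '/data/kafka-2': 0}
```

This keeps the per-log.dir breakdown intact, which matters precisely in the multiple-log.dirs-per-device case Raman mentions.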

On Thu, May 19, 2022 at 10:21 AM Cong Ding  wrote:
>
> Hey Mickael,
>
> Great KIP!
>
> I have one question:
>
> You mentioned "DescribeLogDirs is usually a low volume API. This change
> should not
> significantly affect the latency of this API." and "That would allow to
> easily validate whether disk operations (like a resize), or topic deletion
> (log deletion only happen after a short delay) have completed." I wonder if
> there is an existing metric/API that can allow administrators to determine
> whether we need to resize? If administrators use this API to determine
> whether we need a resize, would this API become a high-volume API? I
> understand we don't want this API to be a high-volume one because the API
> is already costly by returning `"name": "Topics"`.
>
> Cong
>
> On Thu, Apr 7, 2022 at 2:17 AM Mickael Maison 
> wrote:
>
> > Hi,
> >
> > I wrote a small KIP to expose the total and usable space of logdirs
> > via the DescribeLogDirs API:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-827%3A+Expose+logdirs+total+and+usable+space+via+Kafka+API
> >
> > Please take a look and let me know if you have any feedback.
> >
> > Thanks,
> > Mickael
> >



-- 
Best Regards,
Raman Verma


Re: [VOTE] KIP-831: Add metric for log recovery progress

2022-05-19 Thread Raman Verma
^^  (non binding)

On Thu, May 19, 2022 at 5:40 PM Raman Verma  wrote:
>
> Hello Luke,
>
> The proposal looks good to me. Thanks. +1
>
>
> On Tue, May 17, 2022 at 9:26 PM James Cheng  wrote:
> >
> > +1 (non-binding)
> >
> > -James
> >
> > Sent from my iPhone
> >
> > > On May 16, 2022, at 12:12 AM, Luke Chen  wrote:
> > >
> > > Hi all,
> > >
> > > I'd like to start a vote on KIP to expose metrics for log recovery
> > > progress. These metrics would let the admins have a way to monitor the log
> > > recovery progress.
> > >
> > > Details can be found here:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-831%3A+Add+metric+for+log+recovery+progress
> > >
> > > Any feedback is appreciated.
> > >
> > > Thank you.
> > > Luke
>
>
>
> --
> Best Regards,
> Raman Verma



-- 
Best Regards,
Raman Verma


Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-19 Thread Raman Verma
Hi Luke,

The change is useful and simple. Thanks.
Please update the links to JIRA and the discussion thread.

Best Regards,
Raman Verma

On Thu, May 19, 2022 at 8:57 AM Tom Bentley  wrote:
>
> Hi Luke,
>
> Thanks for the KIP. I think the idea makes sense and would provide useful
> observability of log recovery. I have a few comments.
>
> 1. There's not a JIRA for this KIP (or the JIRA link needs updating).
> 2. Similarly the link to this discussion thread needs updating.
> 3. I wonder whether we need to keep these metrics (with value 0) once the
> broker enters the running state. Do you see it as valuable? A benefit of
> removing the metrics would be a reduction in the storage required for metric
> stores which are recording these metrics.
> 4. I think the KIP's public interfaces section could be a bit clearer.
> Previous KIPs which added metrics usually used a table, with the MBean
> name, metric type and description. See KIP-551 for example (or KIP-748,
> KIP-608). Similarly you could use a table in the proposed changes section
> rather than describing the tree you'd see in an MBean console.
>
> Kind regards,
>
> Tom
>
> On Wed, 11 May 2022 at 09:08, Luke Chen  wrote:
>
> > > And if people start using RemainingLogs and RemainingSegments and then
> > REALLY FEEL like they need RemainingBytes, then we can always add it in the
> > future.
> >
> > +1
> >
> > Thanks James!
> > Luke
> >
> > On Wed, May 11, 2022 at 3:57 PM James Cheng  wrote:
> >
> > > Hi Luke,
> > >
> > > Thanks for the detailed explanation. I agree that the current proposal of
> > > RemainingLogs and RemainingSegments will greatly improve the situation,
> > > and that we can go ahead with the KIP as is.
> > >
> > > If RemainingBytes were straight-forward to implement, then I’d like to
> > > have it. But we can live without it for now. And if people start using
> > > RemainingLogs and RemainingSegments and then REALLY FEEL like they need
> > > RemainingBytes, then we can always add it in the future.
> > >
> > > Thanks Luke, for the detailed explanation, and for responding to my
> > > feedback!
> > >
> > > -James
> > >
> > > Sent from my iPhone
> > >
> > > > On May 10, 2022, at 6:48 AM, Luke Chen  wrote:
> > > >
> > > > Hi James and all,
> > > >
> > > > I checked again and I can see that when creating UnifiedLog, we expect
> > > > the logs/indexes/snapshots to be in a good state.
> > > > So, I don't think we should break the current design to expose the
> > > > `RemainingBytesToRecovery`
> > > > metric.
> > > >
> > > > If there are no other comments, I'll start a vote within this week.
> > > >
> > > > Thank you.
> > > > Luke
> > > >
> > > >> On Fri, May 6, 2022 at 6:00 PM Luke Chen  wrote:
> > > >>
> > > >> Hi James,
> > > >>
> > > >> Thanks for your input.
> > > >>
> > > >> For the `RemainingBytesToRecovery` metric proposal, I think there's one
> > > >> thing I didn't make clear.
> > > >> Currently, when the log manager starts up, we'll try to load all logs
> > > >> (segments), and during the log loading, we'll try to recover logs if
> > > >> necessary.
> > > >> And the log loading uses a thread pool, as you thought.
> > > >>
> > > >> So, here's the problem:
> > > >> All segments in each log folder (partition) will be loaded by a log
> > > >> recovery thread, and only after a log is loaded can we know how many
> > > >> segments (or how many bytes) need to be recovered.
> > > >> That means, if we have 10 partition logs in one broker and we have 2
> > > >> log recovery threads (num.recovery.threads.per.data.dir=2), before the
> > > >> threads load the segments in each log, we only know how many logs
> > > >> (partitions) we have in the broker (i.e. the RemainingLogsToRecover
> > > >> metric). We cannot know how many segments/bytes need to be recovered
> > > >> until each thread starts to load the segments under one log (partition).
> > > >>
> > > >> So, in the example in the KIP:
> > > >> Currently, there are still 5 logs (partitions) needing recovery under
> > > >> the /tmp/log1 dir. And there are 2 threads doing the job, where one
> > > >> thread has 1 segment to recover and the other has 3 segments to
> > > >> recover.
> > > >>
> > > >> - kafka.log
> > > >>   - LogManager
> > > >>     - RemainingLogsToRecover
> > > >>       - /tmp/log1 => 5   ← 5 logs under /tmp/log1 still to be recovered
> > > >>       - /tmp/log2 => 0
> > > >>     - RemainingSegmentsToRecover
> > > >>       - /tmp/log1        ← 2 threads are doing log recovery for /tmp/log1
> > > >>         - 0 => 1         ← 1 segment still to be recovered by thread 0
> > > >>         - 1 => 3
> > > >>       - /tmp/log2
> > > >>         - 0 => 0
> > > >>         - 1 => 0
> > > >>
> > > >>
> > > >> So, after a while, the metrics might look like this:
> > > >> It 
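The per-thread accounting Luke describes above can be modeled with a short sketch (plain Python, a toy illustration rather than Kafka's LogManager code): each recovery thread only learns a log's segment count once it picks that log up, and its per-thread gauge counts down as segments are recovered.

```python
from queue import Queue, Empty
from threading import Thread, Lock

def recover_logs(logs_by_dir, num_threads=2):
    """Toy model of KIP-831's gauges: RemainingLogsToRecover is known up
    front per dir, but a thread's RemainingSegmentsToRecover gauge only
    becomes non-zero once that thread loads a log and sees its segments.

    `logs_by_dir` maps a log dir to a list of per-log segment counts.
    Returns the final gauge values (all zero once recovery completes).
    """
    remaining_logs = {d: len(logs) for d, logs in logs_by_dir.items()}
    remaining_segments = {t: 0 for t in range(num_threads)}
    lock = Lock()
    work = Queue()
    for d, logs in logs_by_dir.items():
        for segments in logs:
            work.put((d, segments))

    def worker(tid):
        while True:
            try:
                d, segments = work.get_nowait()
            except Empty:
                return
            with lock:
                remaining_segments[tid] = segments  # known only after loading
            for _ in range(segments):
                with lock:
                    remaining_segments[tid] -= 1    # one segment recovered
            with lock:
                remaining_logs[d] -= 1              # whole log recovered

    threads = [Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return remaining_logs, remaining_segments

logs, segs = recover_logs({"/tmp/log1": [1, 3], "/tmp/log2": [0]})
print(logs, segs)  # all gauges drop to zero once recovery completes
```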

Re: [VOTE] KIP-831: Add metric for log recovery progress

2022-05-19 Thread Raman Verma
Hello Luke,

The proposal looks good to me. Thanks. +1


On Tue, May 17, 2022 at 9:26 PM James Cheng  wrote:
>
> +1 (non-binding)
>
> -James
>
> Sent from my iPhone
>
> > On May 16, 2022, at 12:12 AM, Luke Chen  wrote:
> >
> > Hi all,
> >
> > I'd like to start a vote on KIP to expose metrics for log recovery
> > progress. These metrics would let the admins have a way to monitor the log
> > recovery progress.
> >
> > Details can be found here:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-831%3A+Add+metric+for+log+recovery+progress
> >
> > Any feedback is appreciated.
> >
> > Thank you.
> > Luke



-- 
Best Regards,
Raman Verma


Re: [DISCUSS] KIP-714: Client metrics and observability

2022-05-19 Thread Jun Rao
Hi, Magnus,

Thanks for the reply.

50. Sounds good.

51. I misunderstood the proposal in the KIP then. The proposal is to
define a set of common metric names that every client should implement. The
problem is that every client already has its own set of metrics with its
own names. I am not sure that we could easily agree upon a common set of
metrics that work with all clients. There are likely to be some metrics
that are client specific. Translating between the common name and client
specific name is probably going to add more confusion. As mentioned in the
KIP, similar metrics from different clients could have subtle
semantic differences. Could we just let each client use its own set of
metric names?

Thanks,

Jun

On Thu, May 19, 2022 at 10:39 AM Magnus Edenhill  wrote:

> Den ons 18 maj 2022 kl 19:57 skrev Jun Rao :
>
> > Hi, Magnus,
> >
>
> Hi Jun
>
>
> >
> > Thanks for the updated KIP. Just a couple of more comments.
> >
> > 50. To troubleshoot a particular client issue, I imagine that the client
> > needs to identify its client_instance_id. How does the client find this
> > out? Do we plan to include client_instance_id in the client log, expose
> it
> > as a metric or something else?
> >
>
> The KIP suggests that client implementations emit an informative log
> message
> with the assigned client-instance-id once it is retrieved (once per client
> instance lifetime).
> There's also a clientInstanceId() method that an application can use to
> retrieve
> the client instance id and emit through whatever side channels makes sense.
>
>
>
> > 51. The KIP lists a bunch of metrics that need to be collected at the
> > client side. However, it seems quite a few useful Java client metrics like
> > the following are missing.
> > buffer-total-bytes
> > buffer-available-bytes
> >
>
> These are covered by client.producer.record.queue.bytes and
> client.producer.record.queue.max.bytes.
>
>
> > bufferpool-wait-time
> >
>
> Missing, but somewhat implementation specific.
> If it was up to me we would add this later if there's a need.
>
>
>
> > batch-size-avg
> > batch-size-max
> >
>
> These are missing and would be suitably represented as a histogram. I'll
> add them.
>
>
>
> > io-wait-ratio
> > io-ratio
> >
>
> There's client.io.wait.time which should cover io-wait-ratio.
> We could add a client.io.time as well, now or in a later KIP.
>
> Thanks,
> Magnus
>
>
>
>
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Apr 4, 2022 at 10:01 AM Jun Rao  wrote:
> >
> > > Hi, Xavier,
> > >
> > > Thanks for the reply.
> > >
> > > 28. It does seem that we have started using KafkaMetrics on the broker
> > > side. Then, my only concern is on the usage of Histogram in KafkaMetrics.
> > > Histogram in KafkaMetrics statically divides the value space into a fixed
> > > number of buckets and only returns values on the bucket boundary. So, the
> > > returned histogram value may never show up in a recorded value. Yammer
> > > Histogram, on the other hand, uses reservoir sampling. The reported value
> > > is always one of the recorded values. So, I am not sure that Histogram in
> > > KafkaMetrics is as good as Yammer Histogram.
> > > ClientMetricsPluginExportTime uses Histogram.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, Mar 31, 2022 at 5:21 PM Xavier Léauté wrote:
> > >
> > >> >
> > >> > 28. On the broker, we typically use Yammer metrics. Only for metrics
> > >> that
> > >> > depend on Kafka metric features (e.g., quota), we use the Kafka
> > metric.
> > >> > Yammer metrics have 4 types: gauge, meter, histogram and timer.
> meter
> > >> > calculates a rate, but also exposes an accumulated value.
> > >> >
> > >>
> > >> I don't see a good reason we should limit ourselves to Yammer metrics on
> > >> the broker. KafkaMetrics was written to replace Yammer metrics and is
> > >> used for all new components (clients, streams, connect, etc.)
> > >> My understanding is that the original goal was to retire Yammer metrics
> > >> in the broker in favor of KafkaMetrics.
> > >> We just haven't done so out of backwards compatibility concerns.
> > >> There are other broker metrics such as group coordinator, transaction
> > >> state manager, and various socket server metrics already using
> > >> KafkaMetrics that don't need specific Kafka metric features, so I don't
> > >> see why we should refrain from using Kafka metrics on the broker unless
> > >> there are real compatibility concerns or where implementation specifics
> > >> could lead to confusion when comparing metrics using different
> > >> implementations.
> > >>
> > >> In my opinion we should encourage people to use KafkaMetrics going
> > >> forward on the broker as well, for three reasons:
> > >> a) yammer metrics is long deprecated and no longer maintained
> > >> b) yammer metrics are much less expressive
> > >> c) we don't have a proper API to expose yammer metrics outside of JMX
> > >> (MetricsReporter only exposes KafkaMetrics)

Re: [DISCUSS] KIP-835: Monitor KRaft Controller Quorum Health

2022-05-19 Thread Jun Rao
Hi, Jose,

Thanks for the reply.

20. I see the differences now. The metrics in KafkaController use Yammer
metric and follow the camel case naming. The metrics in Raft use the client
side Metrics package and follow the dash notation. So the naming in the KIP
sounds good to me.

21. Sounds good.

Jun



On Wed, May 18, 2022 at 2:11 PM José Armando García Sancio
 wrote:

> Hi Jun,
>
> Jun wrote:
> > 20. For the metric type and name, we use the camel names in some cases and
> > dashed lower names in some other cases. Should we make them consistent?
>
> For the metrics group `type=KafkaController`, I am using camel names
> like `MetadataLastAppliedRecordOffset` because it matches the naming
> strategy for the metrics already in that group.
>
> For the metrics group `type=broker-metadata-metrics`, I model the
> naming scheme after the metrics in KIP-595 or the `raft-metrics`
> group. I made the assumption that we wanted to start naming metrics
> and groups using that scheme but maybe that is not correct.
>
> What do you think?
>
> > 21. Could you document the meaning of load-processing-time?
>
> I updated KIP-835 to include this information. Here is a summary:
>
> 1.
> kafka.server:type=broker-metadata-metrics,name=load-processing-time-us-avg
> Reports the average amount of time it took for the broker to process
> all pending records when there are pending records in the cluster
> metadata partition. The time unit for this metric is microseconds.
> 2.
> kafka.server:type=broker-metadata-metrics,name=load-processing-time-us-max
> Reports the maximum amount of time it took for the broker to process
> all pending records when there are pending records in the cluster
> metadata partition. The time unit for this metric is microseconds.
> 3.
> kafka.server:type=broker-metadata-metrics,name=record-batch-size-byte-avg
> Reports the average byte size of the record batches in the cluster
> metadata partition.
> 4.
> kafka.server:type=broker-metadata-metrics,name=record-batch-size-byte-max
> Reports the maximum byte size of the record batches in the cluster
> metadata partition.
>
> -José
>


[jira] [Created] (KAFKA-13918) Schedule or cancel nooprecord write on metadata version change

2022-05-19 Thread Jose Armando Garcia Sancio (Jira)
Jose Armando Garcia Sancio created KAFKA-13918:
--

 Summary: Schedule or cancel nooprecord write on metadata version 
change
 Key: KAFKA-13918
 URL: https://issues.apache.org/jira/browse/KAFKA-13918
 Project: Kafka
  Issue Type: Sub-task
  Components: controller
Reporter: Jose Armando Garcia Sancio
Assignee: Jose Armando Garcia Sancio






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: Are timestamps available for records stored in Kafka Streams state stores?

2022-05-19 Thread James Cheng
Thanks Guozhang! Based on your comment, I searched through the repo and found 
the associated pull requests and JIRAs.

It looks like most of the support was added in 
https://issues.apache.org/jira/browse/KAFKA-6521 


Can you add that to the KIP page for KIP-258? It would make it easier for other 
people to find when/where the timestamp support was added.

Thanks!
-James

> On May 19, 2022, at 1:24 PM, Guozhang Wang  wrote:
> 
> Hi James,
> 
> For kv / time-window stores, they have been geared with timestamps and you
> can access them via TimestampedKeyValueStore/TimestampedWindowStore.
> 
> What's not implemented yet are timestamped session stores.
> 
> Guozhang
> 
> On Thu, May 19, 2022 at 12:49 PM James Cheng  wrote:
> 
>> Hi,
>> 
>> I'm trying to see if timestamps are available for records that are stored
>> in Kafka Streams state stores.
>> 
>> I saw "KIP-258: Allow to Store Record Timestamps in RocksDB"
>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-258%3A+Allow+to+Store+Record+Timestamps+in+RocksDB
>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-258:+Allow+to+Store+Record+Timestamps+in+RocksDB
>>> 
>> 
>> But I am not sure if it has been fully implemented. The vote thread went
>> through, but it looks like the implementation is still in progress.
>> https://issues.apache.org/jira/browse/KAFKA-8382 <
>> https://issues.apache.org/jira/browse/KAFKA-8382>
>> 
>> The KIP page says "2.3.0 (partially implemented, inactive)"
>> 
>> Can someone share what the current thoughts are, around this KIP?
>> 
>> Thanks!
>> 
>> -James
> 
> 
> 
> -- 
> -- Guozhang



Re: [DISCUSS] KIP-839: Provide builders for KafkaProducer/KafkaConsumer and KafkaStreams

2022-05-19 Thread François Rosière
@Kirk,

1. No defaults are expected. Serializers/deserializers would need to be
provided either via the config or via the builder.
2. As some properties would need to be closed, maybe the build method
should only be called one time. We still need to see whether a check for
that is really required.

Kr,

François.
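François's second point — a builder whose build() may only be called once — can be illustrated with a generic single-use builder sketch. This is plain Python showing the pattern under discussion, not the proposed Kafka builder API; the class and method names are hypothetical.

```python
class ClientBuilder:
    """Generic single-use builder sketch: configuration accumulates via
    chained setters, and build() refuses to run a second time so that
    resources handed to the builder cannot be reused after construction."""

    def __init__(self):
        self._config = {}
        self._built = False

    def with_config(self, key, value):
        self._config[key] = value
        return self  # chainable

    def build(self):
        if self._built:
            raise RuntimeError("build() may only be called once")
        self._built = True
        return dict(self._config)  # stand-in for constructing the real client

builder = ClientBuilder().with_config("bootstrap.servers", "localhost:9092")
client = builder.build()
print(client)
# → {'bootstrap.servers': 'localhost:9092'}
```

A guard like this turns accidental double-construction into a fast, explicit failure rather than two clients silently sharing closed resources.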

Le jeu. 19 mai 2022 à 00:02, Kirk True  a écrit :

> Hi François,
>
> Thanks for the KIP!
>
> A couple of questions:
>
>  1. Do the builders have defaults for the serializers/deserializers?
>  2. Can the build method be called more than once on a given builder?
>
> Thanks,
> Kirk
>
> On Wed, May 18, 2022, at 10:11 AM, François Rosière wrote:
> > Hi all,
> >
> > KIP to create builders for
> >
> >- KafkaProducer
> >- KafkaConsumer
> >- KafkaStreams
> >
> >
> > This KIP can be seen as the continuity of the KIP-832.
> >
> > KIP details:  
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211884640
> > Jira issue: https://issues.apache.org/jira/browse/KAFKA-13913
> >
> > Kr,
> >
> > F.
> >
>


Re: Are timestamps available for records stored in Kafka Streams state stores?

2022-05-19 Thread Guozhang Wang
Hi James,

For kv / time-window stores, they have been geared with timestamps and you
can access via the TimestampedKeyValueStore/TimstampedWindowStore.

What's not implemented yet are timestamped session stores.

Guozhang
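The value-plus-timestamp pairing that the Timestamped* stores expose can be mimicked with a minimal in-memory analogue — an illustration of the semantics only, not the Kafka Streams API (the class names below deliberately echo the real ones but are toy stand-ins):

```python
from collections import namedtuple

# Analogue of Kafka Streams' ValueAndTimestamp: each stored value carries
# the timestamp (epoch millis) of the record that produced it.
ValueAndTimestamp = namedtuple("ValueAndTimestamp", ["value", "timestamp"])

class ToyTimestampedKeyValueStore:
    """Toy in-memory stand-in for a timestamped key-value store."""

    def __init__(self):
        self._store = {}

    def put(self, key, value, timestamp):
        self._store[key] = ValueAndTimestamp(value, timestamp)

    def get(self, key):
        # Returns a ValueAndTimestamp, or None if the key is absent.
        return self._store.get(key)

store = ToyTimestampedKeyValueStore()
store.put("user-1", {"clicks": 7}, timestamp=1652995200000)
vt = store.get("user-1")
print(vt.value, vt.timestamp)
# → {'clicks': 7} 1652995200000
```

The point of the pairing is that a lookup yields both the latest value and when it was written, which plain key-value stores lose.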

On Thu, May 19, 2022 at 12:49 PM James Cheng  wrote:

> Hi,
>
> I'm trying to see if timestamps are available for records that are stored
> in Kafka Streams state stores.
>
> I saw "KIP-258: Allow to Store Record Timestamps in RocksDB"
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-258%3A+Allow+to+Store+Record+Timestamps+in+RocksDB
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-258:+Allow+to+Store+Record+Timestamps+in+RocksDB
> >
>
> But I am not sure if it has been fully implemented. The vote thread went
> through, but it looks like the implementation is still in progress.
> https://issues.apache.org/jira/browse/KAFKA-8382 <
> https://issues.apache.org/jira/browse/KAFKA-8382>
>
> The KIP page says "2.3.0 (partially implemented, inactive)"
>
> Can someone share what the current thoughts are, around this KIP?
>
> Thanks!
>
> -James



-- 
-- Guozhang


Are timestamps available for records stored in Kafka Streams state stores?

2022-05-19 Thread James Cheng
Hi,

I'm trying to see if timestamps are available for records that are stored in 
Kafka Streams state stores. 

I saw "KIP-258: Allow to Store Record Timestamps in RocksDB"
https://cwiki.apache.org/confluence/display/KAFKA/KIP-258%3A+Allow+to+Store+Record+Timestamps+in+RocksDB
 


But I am not sure if it has been fully implemented. The vote thread went 
through, but it looks like the implementation is still in progress.
https://issues.apache.org/jira/browse/KAFKA-8382 


The KIP page says "2.3.0 (partially implemented, inactive)"

Can someone share what the current thoughts are, around this KIP?

Thanks!

-James

Re: [ANNOUNCE] Apache Kafka 3.2.0

2022-05-19 Thread James Cheng
Bruno,

Congrats on the release!

There is a small typo on the page.
> KIP-791 adds method recordMetada() to the StateStoreContext,

Should be
> KIP-791 adds method recordMetadata() to the StateStoreContext,

I know that the page has already been published, but should we fix that typo?

Thanks!
-James


> On May 17, 2022, at 9:01 AM, Bruno Cadonna  wrote:
> 
> The Apache Kafka community is pleased to announce the release for Apache 
> Kafka 3.2.0
> 
> * log4j 1.x is replaced with reload4j (KAFKA-9366)
> * StandardAuthorizer for KRaft (KIP-801)
> * Send a hint to the partition leader to recover the partition (KIP-704)
> * Top-level error code field in DescribeLogDirsResponse (KIP-784)
> * kafka-console-producer writes headers and null values (KIP-798 and KIP-810)
> * JoinGroupRequest and LeaveGroupRequest have a reason attached (KIP-800)
> * Static membership protocol lets the leader skip assignment (KIP-814)
> * Rack-aware standby task assignment in Kafka Streams (KIP-708)
> * Interactive Query v2 (KIP-796, KIP-805, and KIP-806)
> * Connect APIs list all connector plugins and retrieve their configuration 
> (KIP-769)
> * TimestampConverter SMT supports different unix time precisions (KIP-808)
> * Connect source tasks handle producer exceptions (KIP-779)
> 
> All of the changes in this release can be found in the release notes:
> https://www.apache.org/dist/kafka/3.2.0/RELEASE_NOTES.html
> 
> 
> You can download the source and binary release (Scala 2.12 and 2.13) from:
> https://kafka.apache.org/downloads#3.2.0
> 
> ---
> 
> 
> Apache Kafka is a distributed streaming platform with four core APIs:
> 
> 
> ** The Producer API allows an application to publish a stream of records to
> one or more Kafka topics.
> 
> ** The Consumer API allows an application to subscribe to one or more
> topics and process the stream of records produced to them.
> 
> ** The Streams API allows an application to act as a stream processor,
> consuming an input stream from one or more topics and producing an
> output stream to one or more output topics, effectively transforming the
> input streams to output streams.
> 
> ** The Connector API allows building and running reusable producers or
> consumers that connect Kafka topics to existing applications or data
> systems. For example, a connector to a relational database might
> capture every change to a table.
> 
> 
> With these APIs, Kafka can be used for two broad classes of application:
> 
> ** Building real-time streaming data pipelines that reliably get data
> between systems or applications.
> 
> ** Building real-time streaming applications that transform or react
> to the streams of data.
> 
> 
> Apache Kafka is in use at large and small companies worldwide, including
> Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
> Target, The New York Times, Uber, Yelp, and Zalando, among others.
> 
> A big thank you for the following 113 contributors to this release!
> 
> A. Sophie Blee-Goldman, Adam Kotwasinski, Aleksandr Sorokoumov, Alexandre 
> Garnier, Alok Nikhil, aSemy, Bounkong Khamphousone, bozhao12, Bruno Cadonna, 
> Chang, Chia-Ping Tsai, Chris Egerton, Colin P. Mccabe, Colin Patrick McCabe, 
> Cong Ding, David Arthur, David Jacot, David Mao, defhacks, dengziming, Ed B, 
> Edwin, florin-akermann, GauthamM-official, GuoPhilipse, Guozhang Wang, Hao 
> Li, Haoze Wu, Idan Kamara, Ismael Juma, Jason Gustafson, Jason Koch, Jeff 
> Kim, jiangyuan, Joel Hamill, John Roesler, Jonathan Albrecht, Jorge Esteban 
> Quilcate Otoya, Josep Prat, Joseph (Ting-Chou) Lin, José Armando García 
> Sancio, Jules Ivanic, Julien Chanaud, Justin Lee, Justine Olshan, Kamal 
> Chandraprakash, Kate Stanley, keashem, Kirk True, Knowles Atchison, Jr, 
> Konstantine Karantasis, Kowshik Prakasam, kurtostfeld, Kvicii, Lee Dongjin, 
> Levani Kokhreidze, lhunyady, Liam Clarke-Hutchinson, liym, loboya~, Lucas 
> Bradstreet, Ludovic DEHON, Luizfrf3, Luke Chen, Marc Löhe, Matthew Wong, 
> Matthias J. Sax, Michal T, Mickael Maison, Mike Lothian, mkandaswamy, Márton 
> Sigmond, Nick Telford, Niket, Okada Haruki, Paolo Patierno, Patrick Stuedi, 
> Philip Nee, Prateek Agarwal, prince-mahajan, Rajini Sivaram, Randall Hauch, 
> Richard, RivenSun, Rob Leland, Ron Dagostino, Sayantanu Dey, Stanislav 
> Vodetskyi, sunshujie1990, Tamara Skokova, Tim Patterson, Tolga H. Dur, Tom 
> Bentley, Tomonari Yamashita, vamossagar12, Vicky Papavasileiou, Victoria Xia, 
> Vijay Krishna, Vincent Jiang, Walker Carlson, wangyap, Wenhao Ji, Wenjun 
> Ruan, Xiaobing Fang, Xiaoyue Xue, xuexiaoyue, Yang Yu, yasar03, Yu, Zhang 
> Hongyi, zzccctv, 工业废水, 彭小漪
> 
> We welcome your help and feedback. For more 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #945

2022-05-19 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-836: Addition of Information in DescribeQuorumResponse about Voter Lag

2022-05-19 Thread José Armando García Sancio
Hey Niket,

I took a look at the latest KIP. It looks like QuorumInfo.ReplicaState
is missing the RPC fields added by this PR. Is the plan to return them
to the Admin Client? E.g. it is missing LastFetchTimestamp and
LastCaughtUpTimestamp.

For those fields what will the admin client return when the RPC
version doesn't support those features?

 --
-José


Re: [DISCUSS] KIP-714: Client metrics and observability

2022-05-19 Thread Magnus Edenhill
Den ons 18 maj 2022 kl 19:57 skrev Jun Rao :

> Hi, Magnus,
>

Hi Jun


>
> Thanks for the updated KIP. Just a couple of more comments.
>
> 50. To troubleshoot a particular client issue, I imagine that the client
> needs to identify its client_instance_id. How does the client find this
> out? Do we plan to include client_instance_id in the client log, expose it
> as a metric or something else?
>

The KIP suggests that client implementations emit an informative log message
with the assigned client-instance-id once it is retrieved (once per client
instance lifetime).
There's also a clientInstanceId() method that an application can use to
retrieve
the client instance id and emit through whatever side channels makes sense.



> 51. The KIP lists a bunch of metrics that need to be collected at the
> client side. However, it seems quite a few useful java client metrics like
> the following are missing.
> buffer-total-bytes
> buffer-available-bytes
>

These are covered by client.producer.record.queue.bytes and
client.producer.record.queue.max.bytes.


> bufferpool-wait-time
>

Missing, but somewhat implementation specific.
If it was up to me we would add this later if there's a need.



> batch-size-avg
> batch-size-max
>

These are missing and would be suitably represented as a histogram. I'll
add them.



> io-wait-ratio
> io-ratio
>

There's client.io.wait.time which should cover io-wait-ratio.
We could add a client.io.time as well, now or in a later KIP.

Thanks,
Magnus




>
> Thanks,
>
> Jun
>
> On Mon, Apr 4, 2022 at 10:01 AM Jun Rao  wrote:
>
> > Hi, Xavier,
> >
> > Thanks for the reply.
> >
> > 28. It does seem that we have started using KafkaMetrics on the broker
> > side. Then, my only concern is on the usage of Histogram in KafkaMetrics.
> > Histogram in KafkaMetrics statically divides the value space into a fixed
> > number of buckets and only returns values on the bucket boundary. So, the
> > returned histogram value may never show up in a recorded value. Yammer
> > Histogram, on the other hand, uses reservoir sampling. The reported value
> > is always one of the recorded values. So, I am not sure that Histogram in
> > KafkaMetrics is as good as Yammer Histogram.
> > ClientMetricsPluginExportTime uses Histogram.
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Mar 31, 2022 at 5:21 PM Xavier Léauté wrote:
> >
> >> >
> >> > 28. On the broker, we typically use Yammer metrics. Only for metrics
> >> that
> >> > depend on Kafka metric features (e.g., quota), we use the Kafka
> metric.
> >> > Yammer metrics have 4 types: gauge, meter, histogram and timer. meter
> >> > calculates a rate, but also exposes an accumulated value.
> >> >
> >>
> >> I don't see a good reason we should limit ourselves to Yammer metrics on
> >> the broker. KafkaMetrics was written
> >> to replace Yammer metrics and is used for all new components (clients,
> >> streams, connect, etc.)
> > >> My understanding is that the original goal was to retire Yammer metrics
> > >> in the broker in favor of KafkaMetrics.
> >> We just haven't done so out of backwards compatibility concerns.
> >> There are other broker metrics such as group coordinator, transaction
> >> state
> >> manager, and various socket server metrics
> > >> already using KafkaMetrics that don't need specific Kafka metric
> > >> features, so I don't see why we should refrain from using Kafka metrics
> > >> on the broker unless there are real compatibility concerns or
> >> where implementation specifics could lead to confusion when comparing
> >> metrics using different implementations.
> >>
> > >> In my opinion we should encourage people to use KafkaMetrics going
> > >> forward on the broker as well, for three reasons:
> >> a) yammer metrics is long deprecated and no longer maintained
> >> b) yammer metrics are much less expressive
> >> c) we don't have a proper API to expose yammer metrics outside of JMX
> >> (MetricsReporter only exposes KafkaMetrics)
> >>
> >
>


Re: [VOTE] KIP-835: Monitor KRaft Controller Quorum Health

2022-05-19 Thread Guozhang Wang
That makes sense. Thanks!

+1 (binding).

On Thu, May 19, 2022 at 8:46 AM José Armando García Sancio
 wrote:

> Guozhang Wang wrote:
> >
> > Thanks José! For 1/2 above, just checking if we would record the
> > corresponding sensors only during broker bootstrap time, or whenever there
> > are new metadata records being committed by the controller quorum (since
> > there are always a short period of time, between when the records are
> > committed, to when the records get fetched by that broker)?
>
> It measures the time spent by the broker keeping up with the log or
> processing the log. In practicality, when the broker starts that would
> be the time spent loading the snapshot and processing the committed
> records in the log after the snapshot. After startup that would be the
> time spent reading the log to keep up with the local high-watermark. I
> changed the name of the metric to "pending-record-processing-time-us".
> I think the word "load" in the previous metric name was awkward and
> misleading.
>
> --
> -José
>


-- 
-- Guozhang


Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-19 Thread Cong Ding
Hey Mickael,

Great KIP!

I have one question:

You mentioned "DescribeLogDirs is usually a low volume API. This change
should not
significantly affect the latency of this API." and "That would allow to
easily validate whether disk operations (like a resize), or topic deletion
(log deletion only happen after a short delay) have completed." I wonder if
there is an existing metric/API that can allow administrators to determine
whether we need to resize? If administrators use this API to determine
whether we need a resize, would this API become a high-volume API? I
understand we don't want this API to be a high-volume one because the API
is already costly by returning `"name": "Topics"`.

Cong

On Thu, Apr 7, 2022 at 2:17 AM Mickael Maison 
wrote:

> Hi,
>
> I wrote a small KIP to expose the total and usable space of logdirs
> via the DescribeLogDirs API:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-827%3A+Expose+logdirs+total+and+usable+space+via+Kafka+API
>
> Please take a look and let me know if you have any feedback.
>
> Thanks,
> Mickael
>


[jira] [Resolved] (KAFKA-13863) Prevent null config value when create topic in KRaft mode

2022-05-19 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-13863.
-
Fix Version/s: 3.3.0
   Resolution: Fixed

> Prevent null config value when create topic in KRaft mode
> -
>
> Key: KAFKA-13863
> URL: https://issues.apache.org/jira/browse/KAFKA-13863
> Project: Kafka
>  Issue Type: Bug
>Reporter: dengziming
>Priority: Major
> Fix For: 3.3.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] KIP-831: Add metric for log recovery progress

2022-05-19 Thread Tom Bentley
Hi Luke,

Thanks for the KIP. I think the idea makes sense and would provide useful
observability of log recovery. I have a few comments.

1. There's not a JIRA for this KIP (or the JIRA link needs updating).
2. Similarly the link to this discussion thread needs updating.
3. I wonder whether we need to keep these metrics (with value 0) once the
broker enters the running state. Do you see it as valuable? A benefit of
removing the metrics would be a reduction on storage required for metric
stores which are recording these metrics.
4. I think the KIP's public interfaces section could be a bit clearer.
Previous KIPs which added metrics usually used a table, with the MBean
name, metric type and description. See KIP-551 for example (or KIP-748,
KIP-608). Similarly you could use a table in the proposed changes section
rather than describing the tree you'd see in an MBean console.

Kind regards,

Tom

On Wed, 11 May 2022 at 09:08, Luke Chen  wrote:

> > And if people start using RemainingLogs and RemainingSegments and then
> REALLY FEEL like they need RemainingBytes, then we can always add it in the
> future.
>
> +1
>
> Thanks James!
> Luke
>
> On Wed, May 11, 2022 at 3:57 PM James Cheng  wrote:
>
> > Hi Luke,
> >
> > Thanks for the detailed explanation. I agree that the current proposal of
> > RemainingLogs and RemainingSegments will greatly improve the situation,
> and
> > that we can go ahead with the KIP as is.
> >
> > If RemainingBytes were straight-forward to implement, then I’d like to
> > have it. But we can live without it for now. And if people start using
> > RemainingLogs and RemainingSegments and then REALLY FEEL like they need
> > RemainingBytes, then we can always add it in the future.
> >
> > Thanks Luke, for the detailed explanation, and for responding to my
> > feedback!
> >
> > -James
> >
> > Sent from my iPhone
> >
> > > On May 10, 2022, at 6:48 AM, Luke Chen  wrote:
> > >
> > > Hi James and all,
> > >
> > > I checked again and I can see when creating UnifiedLog, we expected the
> > > logs/indexes/snapshots are in good state.
> > > So, I don't think we should break the current design to expose the
> > > `RemainingBytesToRecovery`
> > > metric.
> > >
> > > If there is no other comments, I'll start a vote within this week.
> > >
> > > Thank you.
> > > Luke
> > >
> > >> On Fri, May 6, 2022 at 6:00 PM Luke Chen  wrote:
> > >>
> > >> Hi James,
> > >>
> > >> Thanks for your input.
> > >>
> > >> For the `RemainingBytesToRecovery` metric proposal, I think there's
> one
> > >> thing I didn't make it clear.
> > >> Currently, when log manager start up, we'll try to load all logs
> > >> (segments), and during the log loading, we'll try to recover logs if
> > >> necessary.
> > >> And the logs loading is using "thread pool" as you thought.
> > >>
> > >> So, here's the problem:
> > >> All segments in each log folder (partition) will be loaded in each log
> > >> recovery thread, and until it's loaded, we can know how many segments
> > (or
> > >> how many Bytes) needed to recover.
> > >> That means, if we have 10 partition logs in one broker, and we have 2
> > log
> > >> recovery threads (num.recovery.threads.per.data.dir=2), before the
> > >> threads load the segments in each log, we only know how many logs
> > >> (partitions) we have in the broker (i.e. RemainingLogsToRecover
> metric).
> > >> We cannot know how many segments/Bytes needed to recover until each
> > thread
> > >> starts to load the segments under one log (partition).
> > >>
> > >> So, the example in the KIP, it shows:
> > >> Currently, there are still 5 logs (partitions) needed to recover under
> > >> /tmp/log1 dir. And there are 2 threads doing the jobs, where one
> thread
> > has
> > >> 1 segments needed to recover, and the other one has 3 segments
> > needed
> > >> to recover.
> > >>
> > >>   - kafka.log
> > >>  - LogManager
> > >> - RemainingLogsToRecover
> > >>- /tmp/log1 => 5← there are 5 logs under
> > >>/tmp/log1 needed to be recovered
> > >>- /tmp/log2 => 0
> > >> - RemainingSegmentsToRecover
> > >>- /tmp/log1 ← 2 threads are doing log
> > >>recovery for /tmp/log1
> > >>- 0 => 1 ← there are 1 segments needed to
> be
> > >>   recovered for thread 0
> > >>   - 1 => 3
> > >>   - /tmp/log2
> > >>   - 0 => 0
> > >>   - 1 => 0
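The per-directory, per-thread layout above could be held in nested maps behind gauges; the following is a simplified standalone sketch (a hypothetical class, not Kafka's actual Yammer gauge registration), showing why the per-thread segment counts only become known once each thread loads its log:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class RecoveryMetrics {
    // dir -> number of logs (partitions) still to recover, known at startup
    public final Map<String, AtomicLong> remainingLogs = new ConcurrentHashMap<>();
    // dir -> (recovery thread id -> segments still to recover); only populated
    // once a thread has loaded a log and counted its segments
    public final Map<String, Map<Integer, AtomicLong>> remainingSegments = new ConcurrentHashMap<>();

    public void setRemainingLogs(String dir, long n) {
        remainingLogs.computeIfAbsent(dir, d -> new AtomicLong()).set(n);
    }

    public void setRemainingSegments(String dir, int threadId, long n) {
        remainingSegments.computeIfAbsent(dir, d -> new ConcurrentHashMap<>())
                         .computeIfAbsent(threadId, t -> new AtomicLong()).set(n);
    }
}
```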
> > >>
> > >>
> > >> So, after a while, the metrics might look like this:
> > >> It says that now there are only 3 logs needed to recover in /tmp/log1, and
> > >> the thread 0 has 9000 segments left, and thread 1 has 5 segments left
> > >> (which should imply the thread already completed 2 logs recovery in
> the
> > >> period)
> > >>
> > >>   - kafka.log
> > >>  - LogManager
> > >> - RemainingLogsToRecover
> > >>- /tmp/log1 => 3← there are 3 logs under
> > >>/tmp/log1 

Re: [VOTE] KIP-835: Monitor KRaft Controller Quorum Health

2022-05-19 Thread José Armando García Sancio
Guozhang Wang wrote:
>
> Thanks José! For 1/2 above, just checking if we would record the
> corresponding sensors only during broker bootstrap time, or whenever there
> are new metadata records being committed by the controller quorum (since
> there are always a short period of time, between when the records are
> committed, to when the records get fetched by that broker)?

It measures the time spent by the broker keeping up with the log or
processing the log. In practicality, when the broker starts that would
be the time spent loading the snapshot and processing the committed
records in the log after the snapshot. After startup that would be the
time spent reading the log to keep up with the local high-watermark. I
changed the name of the metric to "pending-record-processing-time-us".
I think the word "load" in the previous metric name was awkward and
misleading.

-- 
-José


[jira] [Resolved] (KAFKA-13807) Ensure that we can set log.flush.interval.ms with IncrementalAlterConfigs

2022-05-19 Thread Colin McCabe (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin McCabe resolved KAFKA-13807.
--
Resolution: Fixed

> Ensure that we can set log.flush.interval.ms with IncrementalAlterConfigs
> -
>
> Key: KAFKA-13807
> URL: https://issues.apache.org/jira/browse/KAFKA-13807
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin McCabe
>Priority: Major
>






[jira] [Created] (KAFKA-13917) Avoid calling lookupCoordinator() in tight loop

2022-05-19 Thread Viktor Somogyi-Vass (Jira)
Viktor Somogyi-Vass created KAFKA-13917:
---

 Summary: Avoid calling lookupCoordinator() in tight loop
 Key: KAFKA-13917
 URL: https://issues.apache.org/jira/browse/KAFKA-13917
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Affects Versions: 3.1.1, 3.1.0, 3.1.2
Reporter: Viktor Somogyi-Vass
Assignee: Viktor Somogyi-Vass


Currently the heartbeat thread's lookupCoordinator() is called in a tight loop 
if brokers crash and the consumer is left running. Besides flooding the logs 
at debug level, it also increases CPU usage.

The fix is easy: just put a backoff call after the coordinator lookup.
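A minimal sketch of such a backoff (illustrative only; the class below is hypothetical, and the real consumer would draw its timing from its configured backoff settings rather than hard-coded values):

```java
// Hypothetical sketch: exponential backoff between coordinator lookups,
// capped at a maximum, instead of retrying in a tight loop.
public class CoordinatorLookupBackoff {
    private final long initialMs;
    private final long maxMs;
    private int failures = 0;

    public CoordinatorLookupBackoff(long initialMs, long maxMs) {
        this.initialMs = initialMs;
        this.maxMs = maxMs;
    }

    // Called after a failed lookup; returns how long to sleep before retrying.
    public long nextBackoffMs() {
        long backoff = Math.min(maxMs, initialMs << Math.min(failures, 20));
        failures++;
        return backoff;
    }

    // Call once the coordinator is found, so the next failure starts small again.
    public void reset() {
        failures = 0;
    }
}
```

Resetting on success keeps lookups prompt in the common case while the cap bounds CPU and log volume during a prolonged broker outage.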





Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-19 Thread Mickael Maison
Hi Ismael,

1. I'm fine dropping "Space" from the field name, I think the names
are clear enough, I've updated the KIP.
2. These values are properties of the volume each log directory resides
on. If you have multiple log directories on the same volume, they
will all return the usable and total size of that volume. I'm not
sure if many people do this, but I've clarified in the KIP that the
sizes are those of the underlying volumes.
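For reference, these volume-level numbers are what Java NIO exposes directly; a minimal sketch (not the KIP's actual implementation) of how a broker could read them for a log directory:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;

public class LogDirSpace {
    // Returns {totalBytes, usableBytes} for the volume holding the given log dir.
    // Two log dirs on the same volume report the same numbers, as noted above.
    public static long[] volumeSpace(Path logDir) {
        try {
            FileStore store = Files.getFileStore(logDir);
            return new long[] { store.getTotalSpace(), store.getUsableSpace() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Files.getFileStore resolves the enclosing volume, which is also why a per-log-dir "used space" cannot be derived from these two numbers alone when several log directories share one volume.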

Thanks,
Mickael




On Tue, May 17, 2022 at 5:46 PM Ismael Juma  wrote:
>
> Hi Mickael,
>
> Thanks for the KIP. Two questions:
>
> 1. Is `space` redundant?  is `totalBytes` and `usableBytes` a more concise
> description of the same thing?
> 2. Is usable space a property of the log directory? What if you have
> multiple log directories in the same underlying OS partition?
>
> Ismael
>
> On Thu, Apr 7, 2022 at 2:17 AM Mickael Maison 
> wrote:
>
> > Hi,
> >
> > I wrote a small KIP to expose the total and usable space of logdirs
> > via the DescribeLogDirs API:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-827%3A+Expose+logdirs+total+and+usable+space+via+Kafka+API
> >
> > Please take a look and let me know if you have any feedback.
> >
> > Thanks,
> > Mickael
> >


Re: Kafka 2.7.0 vulnerabilities

2022-05-19 Thread Luke Chen
Hi Udit,

If v3.0.0 is vulnerable to this CVE, then I believe v2.7.0 is also
vulnerable, since the affected component in v2.7.0 must be an even older version.
Please upgrade to v3.1.1 or later.

Thank you.
Luke


On Thu, May 19, 2022 at 6:41 PM Seth, Udit 
wrote:

> Greetings Concerned,
> Currently, our product is using Kafka 2.7.0 and as per the vulnerabilities
> reported by our security team we wish to confirm if CVE-2020-36518 impacts
> Kafka 2.7.0 or not?
> Because as per https://issues.apache.org/jira/browse/KAFKA-13775 , the
> affected versions are 3.1.0, 3.0.0, 3.0.1.
>
> Please confirm.
>
> Thanks and Regards
> Udit Seth
>


[jira] [Created] (KAFKA-13916) Fenced replicas should not be allowed to join the ISR in KRaft (KIP-841)

2022-05-19 Thread David Jacot (Jira)
David Jacot created KAFKA-13916:
---

 Summary: Fenced replicas should not be allowed to join the ISR in 
KRaft (KIP-841)
 Key: KAFKA-13916
 URL: https://issues.apache.org/jira/browse/KAFKA-13916
 Project: Kafka
  Issue Type: Improvement
Reporter: David Jacot
Assignee: David Jacot








Re: [VOTE] KIP-830: Allow disabling JMX Reporter

2022-05-19 Thread Mickael Maison
Hi Ismael,

That's a good idea, I've updated the KIP.

Thanks,
Mickael

On Tue, May 17, 2022 at 4:17 PM Federico Valeri  wrote:
>
> +1 (non binding)
>
> Thanks.
>
> On Tue, May 17, 2022 at 3:47 PM Mickael Maison  
> wrote:
> >
> > Hi,
> >
> > I'd like to start a vote on KIP-830 which proposes a method for
> > disabling JMXReporter and making JMXReporter behave like
> > other reporters in the next major release when it will need to be
> > explicitly listed in metric.reporters to be enabled.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-830%3A+Allow+disabling+JMX+Reporter
> >
> > Let me know if you have feedback,
> >
> > Thanks,
> > Mickael


Kafka 2.7.0 vulnerabilities

2022-05-19 Thread Seth, Udit
Greetings Concerned,
Currently, our product is using Kafka 2.7.0 and as per the vulnerabilities 
reported by our security team we wish to confirm if CVE-2020-36518 impacts 
Kafka 2.7.0 or not?
Because as per https://issues.apache.org/jira/browse/KAFKA-13775 , the affected 
versions are 3.1.0, 3.0.0, 3.0.1.

Please confirm.

Thanks and Regards
Udit Seth


Re: [VOTE] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-19 Thread Federico Valeri
Thanks Mickael.

+1 (non binding)

On Wed, May 18, 2022 at 11:08 AM Divij Vaidya  wrote:
>
> +1 non binding.
>
> Divij Vaidya
>
>
>
> On Tue, May 17, 2022 at 6:16 PM Igor Soarez  wrote:
>
> > Thanks for this KIP Mickael.
> >
> > +1 non binding
> >
> > --
> > Igor
> >
> > On Tue, May 17, 2022, at 2:48 PM, Luke Chen wrote:
> > > Hi Mickael,
> > >
> > > +1 (binding) from me.
> > > Thanks for the KIP!
> > >
> > > Luke
> > >
> > > On Tue, May 17, 2022 at 9:30 PM Mickael Maison  > >
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> I'd like to start a vote on KIP-827. It proposes exposing the total
> > >> and usable space of logdirs
> > >> via the DescribeLogDirs API:
> > >>
> > >>
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-827%3A+Expose+logdirs+total+and+usable+space+via+Kafka+API
> > >>
> > >> Thanks,
> > >> Mickael
> > >>
> >


[GitHub] [kafka-site] mimaison commented on a diff in pull request #410: KAFKA-13882 Docker to preview docs locally

2022-05-19 Thread GitBox


mimaison commented on code in PR #410:
URL: https://github.com/apache/kafka-site/pull/410#discussion_r876875276


##
.htaccess:
##
@@ -9,3 +9,5 @@ RewriteRule ^/?(\d+)/javadoc - [S=2]
 RewriteRule ^/?(\d+)/images/ - [S=1]
 RewriteCond $2 !=protocol
 RewriteRule ^/?(\d+)/([a-z]+)(\.html)? /$1/documentation#$2 [R=302,L,NE]
+RewriteCond %{REQUEST_FILENAME}.html -f
+RewriteRule ^(.*)$ %{REQUEST_FILENAME}.html

Review Comment:
   I find it strange we have to edit this file and I wonder how it works on the 
website. I'll try to find out



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Final reminder: ApacheCon North America call for presentations closing soon

2022-05-19 Thread Rich Bowen
[Note: You're receiving this because you are subscribed to one or more
Apache Software Foundation project mailing lists.]

This is your final reminder that the Call for Presentations for
ApacheCon North America 2022 will close at 00:01 GMT on Monday, May
23rd, 2022. Please don't wait! Get your talk proposals in now!

Details here: https://apachecon.com/acna2022/cfp.html

--Rich, for the ApacheCon Planners




[jira] [Created] (KAFKA-13915) Kafka streams should validate that the repartition topics are not created with cleanup.policy compact

2022-05-19 Thread Peter James Pringle (Jira)
Peter James Pringle created KAFKA-13915:
---

 Summary: Kafka streams should validate that the repartition topics 
are not created with cleanup.policy compact
 Key: KAFKA-13915
 URL: https://issues.apache.org/jira/browse/KAFKA-13915
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 2.8.1
Reporter: Peter James Pringle


Add sanity validation on streams startup that *repartition* topics are not 
set up with a *cleanup.policy* of {*}compact{*}.

In enterprise environments, automated creation of Kafka Streams intermediate 
topics is not always possible due to policy restrictions, so it is done 
manually, which is prone to user misconfiguration.

In several cases we have found that repartition topics were incorrectly 
set up following the changelog topic setup, with compaction enabled. The result 
is that a compacted repartition topic causes data loss if more than 
one value is mapped to the new key.

 

Example:

Original data: (coffee, drink), (tea, drink), (beer, drink)

Repartition by type, i.e. drink:

Expected: (drink, coffee), (drink, tea), (drink, beer)

With compaction, the following is possible:

Actual: (drink, beer); coffee and tea are lost.
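The loss can be reproduced with a toy model of compaction (purely illustrative; real log compaction operates on segments, but the retained state is the same: the latest value per key):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CompactionDemo {
    // Toy model of log compaction: for each key, only the latest value survives.
    // records is a list of {key, value} pairs in offset order.
    public static Map<String, String> compact(List<String[]> records) {
        Map<String, String> retained = new LinkedHashMap<>();
        for (String[] kv : records) {
            retained.put(kv[0], kv[1]); // later records overwrite earlier ones
        }
        return retained;
    }
}
```

Feeding the repartitioned records (drink, coffee), (drink, tea), (drink, beer) through this model leaves only (drink, beer), matching the data loss described above.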





[DISCUSS] SASL reauthentication, session expiry

2022-05-19 Thread András Csaki
Hi Kafka Devs,

I'd like to discuss expected behavior and a potential bug with the SASL
reauthentication process.

I've opened KAFKA-13848 a while back and have a small project to reproduce
the problem here: https://github.com/acsaki/kafka-sasl-reauth

Briefly explained: OAuth clients remain able to produce/consume after they
have failed to reauthenticate, demonstrated with a short token expiry and a
killed OAuth server. :)

The problem seems to be in
SaslServerAuthenticator.ReauthInfo#calcCompletionTimesAndReturnSessionLifetimeMs
where sessionExpirationTimeNanos is only set when the session's calculated
lifetime is non-negative (token has not expired yet).

Because of this ReauthInfo#sessionExpirationTimeNanos remains null, in turn
making KafkaChannel#serverAuthenticationSessionExpired to always return
false, so SocketServer won't close the channel, leaving my producers and
consumers connected and happily producing and consuming.
You can see there's not much OAUTHBEARER specific in this behavior.

Looking at the if conditions in
calcCompletionTimesAndReturnSessionLifetimeMs it all seems rather
deliberate.
I've opened a very much work in progress and simplistic PR here:
https://github.com/apache/kafka/pull/12179
It only makes sure ReauthInfo#sessionExpirationTimeNanos gets set when
either credentials can expire or there's a max reauth time set. It actually
makes my producers with expired tokens die, but it seems to break a lot of
assumptions in tests (I've started fixing some of them, but many are still
broken).
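A much-simplified standalone model of the calculation in question (the names below are illustrative, not the actual SaslServerAuthenticator code): the idea of the fix is that a session expiration should be recorded whenever credentials can expire or a max reauth time is configured, even if the computed remaining lifetime is already non-positive.

```java
public class ReauthSessionCalc {
    // Returns the session expiration in nanos, or null when nothing bounds the
    // session. The reported bug: expiration was only recorded when the remaining
    // lifetime was positive, so an already-expired token left the channel open
    // forever (serverAuthenticationSessionExpired always returned false).
    public static Long sessionExpirationNanos(long nowNanos,
                                              Long credentialExpiryNanos,
                                              long maxReauthNanos) {
        if (credentialExpiryNanos == null && maxReauthNanos <= 0)
            return null; // neither credentials nor config bound the session
        long expiry = credentialExpiryNanos != null ? credentialExpiryNanos : Long.MAX_VALUE;
        if (maxReauthNanos > 0)
            expiry = Math.min(expiry, nowNanos + maxReauthNanos);
        return expiry; // may already be <= nowNanos for an expired token
    }
}
```

With this shape, an expired token yields an expiration in the past rather than none at all, so the server can close the channel on the next expiry check.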

I'd like to first discuss if this is indeed a problem worth investigating
more. Or maybe leaving clients with expired tokens connected is what we
want here so they may be able to reauthenticate eventually.


Best,
Andras


Re: [DISCUSS] KIP-836: Addition of Information in DescribeQuorumResponse about Voter Lag

2022-05-19 Thread David Jacot
Hi Niket,

Thanks for the KIP. I have a few minor comments:

1. We should keep DescribeQuorumResult's constructor package-private
for now in my opinion. We have been debating about this in KIP-692 and
KIP-777 but as we haven't reached a consensus yet, we should be on the
conservative side here.

2. Regarding getters in QuorumInfo and ReplicaState, we usually don't
prefix getters with `get`.

3. Regarding the new fields in DescribeQuorumResponse, should they
have default values (e.g. -1) and be ignorable? The default values are
useful when talking to a controller which does not support those fields.
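For context, fields like these are typically declared in Kafka's RPC JSON schemas with a default and an "ignorable" flag; a hypothetical fragment for the two new fields (field names follow the Timestamp-suffix suggestion from this thread, but the versions and exact names are assumptions, not taken from the final KIP):

```json
{ "name": "LastFetchTimestamp", "type": "int64", "versions": "1+",
  "ignorable": true, "default": -1,
  "about": "The last time the follower fetched from the leader." },
{ "name": "LastCaughtUpTimestamp", "type": "int64", "versions": "1+",
  "ignorable": true, "default": -1,
  "about": "The last time the follower was caught up with the leader." }
```

The default of -1 lets clients distinguish "not supported by this controller" from a real timestamp.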

Best,
David

On Thu, May 19, 2022 at 12:59 AM Niket Goel  wrote:
>
> I did miss updating the KIP on Jose's comment. Have done that now, thanks
> for the reminder.
>
> For the `kafka-metadata-quorum.sh` tool, it seems that the tool's
> dependence on the DescribeQuorum API is implicit given the original KIP
> [1]. I will add a section in this KIP demonstrating that the tool's output
> should contain the newly added fields as well.
> The tool itself is tracked under this JIRA [2]
>
> Thanks
> Niket
>
> [1]
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-595%3A+A+Raft+Protocol+for+the+Metadata+Quorum#KIP595%3AARaftProtocolfortheMetadataQuorum-ToolingSupport=
> [2] https://issues.apache.org/jira/browse/KAFKA-13914
>
> On Tue, May 17, 2022 at 7:31 PM deng ziming 
> wrote:
>
> > Hello Niket,
> >
> >
> > 1. I find the DescribeQuorumResult still contains an
> > DescribeQuorumResponseData, which is not allowed as Jose commented, have
> > you forgot to change it?
> >
> > 2. You only add an Handle in AdminClient, can you also add an
> > `kafka-metadata-quorum.sh` tool to help this?
> >
> >
> > > On May 17, 2022, at 9:50 AM, Niket Goel 
> > wrote:
> > >
> > > Thanks for the call out David. We will populate these fields for the
> > > Observers as well. I will clarify this in the KIP.
> > >
> > > On Mon, May 16, 2022 at 1:50 PM David Arthur 
> > wrote:
> > >
> > >> Niket, thanks for the KIP!
> > >>
> > >> Sorry for the late feedback on this, but I just had a quick question.
> > The
> > >> KIP indicates the two new fields will be set for voters only, however
> > this
> > >> ReplicaState struct is also used by the Observers in
> > >> DescribeQuorumResponse. Will we simply fill in -1 for these values, or
> > do
> > >> we intend to report the last fetch and caught-up time of the observers
> > as
> > >> well?
> > >>
> > >> Thanks!
> > >> David
> > >>
> > >>
> > >> On Mon, May 16, 2022 at 1:46 PM Niket Goel 
> > >> wrote:
> > >>
> > >>> Hi all,
> > >>>
> > >>> Thank you for the feedback on this. I have started a voting thread for
> > >>> this KIP here:
> > >>> https://lists.apache.org/thread/bkb7gsbxpljh5qh014ztffq7bldjrb2x
> > >>>
> > >>> Thanks
> > >>> Niket Goel
> > >>>
> > >>>
> > >>> From: Niket Goel 
> > >>> Date: Thursday, May 12, 2022 at 5:25 PM
> > >>> To: dev@kafka.apache.org 
> > >>> Subject: Re: [DISCUSS] KIP-836: Addition of Information in
> > >>> DescribeQuorumResponse about Voter Lag
> > >>> Appreciate the careful review Jose.!
> > >>>
> > >>> Ack on 1 and 2. Will fix.
> > >>>
> > >>> For number 3 (and I am using [1] as a reference for this discussion), I
> > >>> think the correct language to use would be:
> > >>>
> > >>> "Whenever a new fetch request
> > >>> comes in the replica's last caught up time is updated to the time of
> > >>> this fetch request if it requests an offset greater than or equal to
> > the
> > >>> leader's
> > >>> current end offset"
> > >>> Does that sound right now?
> > >>>
> > >>> Although I think I will go ahead and rewrite the explanation in a way
> > >> that
> > >>> is more understandable. Thanks for pointing this out.
> > >>>
> > >>> Thanks
> > >>>
> > >>> [1]
> > >>>
> > >>
> > https://github.com/apache/kafka/blob/fa59be4e770627cd34cef85986b58ad7f606928d/core/src/main/scala/kafka/cluster/Replica.scala#L97
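The corrected rule above can be sketched as follows (a hypothetical, much-simplified version of the Replica bookkeeping in [1], ignoring the subtleties around the leader end offset at the time of the previous fetch):

```java
public class ReplicaState {
    public long lastFetchTimestamp = -1L;
    public long lastCaughtUpTimestamp = -1L;

    // Every fetch updates the fetch timestamp; the caught-up timestamp is only
    // updated when the requested offset has reached the leader's current end
    // offset at the time of the fetch.
    public void updateOnFetch(long fetchOffset, long leaderEndOffset, long fetchTimeMs) {
        lastFetchTimestamp = fetchTimeMs;
        if (fetchOffset >= leaderEndOffset)
            lastCaughtUpTimestamp = fetchTimeMs;
    }
}
```

A lagging voter then shows a recent lastFetchTimestamp but a stale lastCaughtUpTimestamp, which is exactly the signal the new response fields are meant to expose.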
> > >>>
> > >>>
> > >>>
> > >>> On Thu, May 12, 2022 at 3:20 PM José Armando García Sancio
> > >>>  wrote:
> > >>> Thanks for the Kafka improvement Niket.
> > >>>
> > >>> 1. For the fields `LastFetchTime` and `LastCaughtUpTime`, Kafka tends
> > >>> to use the suffix "Timestamp" when the value is an absolute wall clock
> > >>> value.
> > >>>
> > >>> 2. The method `result()` for the type `DescribeQuorumResult` returns
> > >>> the type `DescribeQuorumResponseData`. The types generated from the
> > >>> RPC JSON schema are internal to Kafka and not exposed to clients. For
> > >>> the admin client we should use a different type that is explicitly
> > >>> public. See `org.apache.kafka.client.admin.DescribeTopicsResult` for
> > >>> an example.
> > >>>
> > >>> 3. The proposed section has his sentence "Whenever a new fetch request
> > >>> comes in the replica's last caught up time is updated to the time of
> > >>> the fetch request if it requests an offset greater than the leader's
> > >>> current end offset." Did you mean "previous fetch time" instead of
> > >>> "last caught up time"? 

Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #944

2022-05-19 Thread Apache Jenkins Server
See