Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #829

2022-03-31 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-13791) Fix FetchResponse#`fetchData` and `forgottenTopics`: Assignment of lazy-initialized members should be the last step with double-checked locking

2022-03-31 Thread YunKui Lu (Jira)
YunKui Lu created KAFKA-13791:
-

 Summary: Fix FetchResponse#`fetchData` and `forgottenTopics`: 
Assignment of lazy-initialized members should be the last step with 
double-checked locking
 Key: KAFKA-13791
 URL: https://issues.apache.org/jira/browse/KAFKA-13791
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 3.0.1
Reporter: YunKui Lu


Double-checked locking can be used for lazy initialization of volatile fields, 
but only if field assignment is the last step in the synchronized block. 
Otherwise, you run the risk of threads accessing a half-initialized object.

The problem is consistent with 
[KAFKA-13777|https://issues.apache.org/jira/projects/KAFKA/issues/KAFKA-13777]
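For illustration, a minimal sketch of the correct pattern, where the volatile assignment happens only after the object is fully built (the `LazyHolder` class and its map contents are hypothetical, not the actual `FetchResponse` code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LazyHolder {
    // Lazily initialized; must be volatile for double-checked locking.
    private volatile Map<String, Integer> fetchData;

    public Map<String, Integer> fetchData() {
        if (fetchData == null) {                     // first check, no lock
            synchronized (this) {
                if (fetchData == null) {             // second check, locked
                    Map<String, Integer> tmp = new LinkedHashMap<>();
                    tmp.put("topicA", 0);            // fully populate first...
                    fetchData = tmp;                 // ...assignment is the last step
                }
            }
        }
        return fetchData;
    }
}
```

Assigning the field before populating the map would let another thread observe a half-initialized object through the unsynchronized first check.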



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] KIP-813 Shared State Stores

2022-03-31 Thread John Roesler
Hi Daan,

Thanks for the KIP!

I just got caught up on the discussion. I have a few small questions, and 
then I will be ready to vote. 

1. Am I right in thinking that there’s no way to enforce that the stores are 
actually read-only? It seems like the StoreBuilder interface is too 
generic for that. If that’s true, I think it’s fine, but we should be sure the 
JavaDoc clearly states that other processors must not write into these stores 
(except for the one that feeds it).

 2. Are you planning for these stores to get standbys as well? I would think 
so, otherwise the desired purpose of standbys (eliminating restoration latency 
during failover) would not be served. 

Thanks,
John

On Mon, Mar 7, 2022, at 13:13, Matthias J. Sax wrote:
> Thanks for updating the KIP. LGTM.
>
> I think we can start a vote.
>
>
>>  I think this might provide issues if your processor is doing a projection 
>> of the data.
>
> This is correct. It's a known issue: 
> https://issues.apache.org/jira/browse/KAFKA-7663
>
> Global-stores/KTables are designed to put the data into the store 
> _unmodified_.
>
>
> -Matthias
>
> On 2/28/22 5:05 AM, Daan Gertis wrote:
>> Updated the KIP to be more aligned with global state store function names.
>> 
>> If I remember correctly during restore the processor will not be used right? 
>> I think this might provide issues if your processor is doing a projection of 
>> the data. Either way, I would not add that into this KIP since it is a 
>> specific use-case pattern.
>> 
>> Unless there is anything more to add or change, I would propose moving to a 
>> vote?
>> 
>> Cheers!
>> D.
>> 
>> From: Matthias J. Sax 
>> Date: Friday, 18 February 2022 at 03:29
>> To: dev@kafka.apache.org 
>> Subject: Re: [DISCUSS] KIP-813 Shared State Stores
>> Thanks for updating the KIP!
>> 
>> I am wondering if we would need two overloads of `addReadOnlyStateStore`
>> one w/ and one w/o `TimestampExtractor` argument to effectively make it
>> an "optional" parameter?
>> 
>> Also wondering if we need to pass in a `String sourceName` and `String
>> processorName` parameters (similar to `addGlobalStore()`?) instead if
>> re-using the store name as currently proposed? -- In general I don't
>> have a strong opinion either way, but it seems to introduce some API
>> inconsistency if we don't follow the `addGlobalStore()` pattern?
>> 
>> 
>>> Another thing we were confronted with was the restoring of state when the 
>>> actual local storage is gone. For example, we host on K8s with ephemeral 
>>> pods, so there is no persisted storage between pod restarts. However, the 
>>> consumer group will already be at the latest offset, preventing previous 
>>> data from being restored into the new pod’s state store.
>> 
>> We have already code in-place in the runtime to do the right thing for
>> this case (ie, via DSL source-table changelog optimization). We can
>> re-use this part. It's nothing we need to discuss on the KIP, but we can
>> discuss on the PR later.
>> 
>> 
>> -Matthias
>> 
>> 
>> On 2/17/22 10:09 AM, Guozhang Wang wrote:
>>> Hi Daan,
>>>
>>> I think for the read-only state stores you'd need to slightly augment the
>>> checkpointing logic so that it would still write the checkpointed offsets
>>> while restoring from the changelogs.
>>>
>>>
>>> Guozhang
>>>
>>> On Thu, Feb 17, 2022 at 7:02 AM Daan Gertis 
>>> wrote:
>>>
> Could you add more details about the signature of
> `addReadOnlyStateStore()` -- What parameters does it take? Are there any
> overloads taking different parameters? The KIP only contains some verbal
> description on the "Implementation Plan" section, that is hard to find
> and hard to read.
>
> The KIP mentions a `ProcessorProvider` -- do you mean
> `ProcessorSupplier`?
>
> About timestamp synchronization: why do you propose to disable timestamp
> synchronization (similar to global state stores)? It seems to be an
> unnecessary limitation? -- Given that we could re-use the new method for
> source `KTables` (ie, `StreamsBuilder#table()` implementation), having
> timestamp synchronization enabled seems to be important?

 Yup, will do these updates. I’ll overload addReadOnlyStateStore to
 allow for timestamp synchronization.

 Another thing we were confronted with was the restoring of state when the
 actual local storage is gone. For example, we host on K8s with ephemeral
 pods, so there is no persisted storage between pod restarts. However, the
 consumer group will already be at the latest offset, preventing previous
 data from being restored into the new pod’s state store.

 If I remember correctly, there was some checkpoint logic available when
 restoring, but we are bypassing that since logging is disabled on the
 statestore, no?

 As always, thanks for your insights.

 Cheers,
 D.


 From: Matthias J. Sax 
 Date: Wednesday, 16 

[jira] [Created] (KAFKA-13790) ReplicaManager should be robust to all partition updates from kraft metadata log

2022-03-31 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-13790:
---

 Summary: ReplicaManager should be robust to all partition updates 
from kraft metadata log
 Key: KAFKA-13790
 URL: https://issues.apache.org/jira/browse/KAFKA-13790
 Project: Kafka
  Issue Type: Bug
Reporter: Jason Gustafson
Assignee: Jason Gustafson


There are two ways that partition state can be updated in the zk world: one is 
through `LeaderAndIsr` requests and one is through `AlterPartition` responses. 
All changes made to partition state result in new LeaderAndIsr requests, but 
replicas will ignore them if the leader epoch is less than or equal to the 
current known leader epoch. Basically it works like this:
 * Changes made by the leader are done through AlterPartition requests. These 
changes bump the partition epoch (or zk version), but leave the leader epoch 
unchanged. LeaderAndIsr requests are sent by the controller, but replicas 
ignore them. Partition state is instead only updated when the AlterIsr response 
is received.
 * Changes made by the controller are made directly by the controller and 
always result in a leader epoch bump. These changes are sent to replicas 
through LeaderAndIsr requests and are applied by replicas.

The code in `kafka.server.ReplicaManager` and `kafka.cluster.Partition` are 
built on top of these assumptions. The logic in `makeLeader`, for example, 
assumes that the leader epoch has indeed been bumped. Specifically, follower 
state gets reset and a new entry is written to the leader epoch cache.

In KRaft, we also have two paths to update partition state. One is 
AlterPartition, just like in the zk world. The second is updates received from 
the metadata log. These follow the same path as LeaderAndIsr requests for the 
most part, but a big difference is that all changes are sent down to 
`kafka.cluster.Partition`, even those which do not have a bumped leader epoch. 
This breaks the assumptions mentioned above in `makeLeader`, which could result 
in leader epoch cache inconsistency. Another side effect of this on the 
follower side is that replica fetchers for updated partitions get unnecessarily 
restarted. There may be others as well.

We need to either replicate the same logic as in the zk world or make the 
logic robust to all updates, including those without a leader epoch bump.
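The guard described above could be sketched roughly as follows. This is a hypothetical simplification for illustration only; the real logic lives in `kafka.cluster.Partition` and `kafka.server.ReplicaManager` and is considerably more involved:

```java
// Hypothetical sketch: only take the full makeLeader-style path (resetting
// follower state, appending to the leader epoch cache) when the leader
// epoch was actually bumped; treat partition-epoch-only changes separately.
public class PartitionState {
    private int leaderEpoch = 0;
    private int partitionEpoch = 0;

    // Returns true if this update represents a real leader change.
    public boolean apply(int newLeaderEpoch, int newPartitionEpoch) {
        if (newLeaderEpoch > leaderEpoch) {
            leaderEpoch = newLeaderEpoch;
            partitionEpoch = newPartitionEpoch;
            // ...reset follower state, update leader epoch cache, restart fetchers
            return true;
        } else if (newPartitionEpoch > partitionEpoch) {
            // ISR-only change: record the new partition epoch without
            // disturbing follower state or the epoch cache.
            partitionEpoch = newPartitionEpoch;
        }
        return false;
    }
}
```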





Re: [DISCUSS] KIP-714: Client metrics and observability

2022-03-31 Thread Xavier Léauté
>
> Are there cases where the metrics plugin developers would want to forward
> the compressed payload without decompressing?


The only interoperable use-case I can think of would be to forward the
payloads directly to an OpenTelemetry collector backend.
Today OTLP only mandates gzip/none compression support for gRPC and HTTP
protocols, so this might only work for a limited set
of compression formats (or no compression) out of the box.

see
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/otlp.md#protocol-details

Maybe we could consider exposing the raw uncompressed bytes regardless of
client side compression, if someone wanted
to avoid the cost of de-serializing the payload, since there would always
be an option to forward that as-is, and let the opentelemetry collector add
tags relevant to the broker originating those client metrics.


Re: [DISCUSS] KIP-714: Client metrics and observability

2022-03-31 Thread Xavier Léauté
>
> 28. On the broker, we typically use Yammer metrics. Only for metrics that
> depend on Kafka metric features (e.g., quota), we use the Kafka metric.
> Yammer metrics have 4 types: gauge, meter, histogram and timer. meter
> calculates a rate, but also exposes an accumulated value.
>

I don't see a good reason we should limit ourselves to Yammer metrics on
the broker. KafkaMetrics was written
to replace Yammer metrics and is used for all new components (clients,
streams, connect, etc.)
My understanding is that the original goal was to retire Yammer metrics in
the broker in favor of KafkaMetrics.
We just haven't done so out of backwards compatibility concerns.
There are other broker metrics such as group coordinator, transaction state
manager, and various socket server metrics
already using KafkaMetrics that don't need specific Kafka metric features,
so I don't see why we should refrain from using
Kafka metrics on the broker unless there are real compatibility concerns or
where implementation specifics could lead to confusion when comparing
metrics using different implementations.

In my opinion we should encourage people to use KafkaMetrics going forward
on the broker as well, for a few reasons:
a) yammer metrics is long deprecated and no longer maintained
b) yammer metrics are much less expressive
c) we don't have a proper API to expose yammer metrics outside of JMX
(MetricsReporter only exposes KafkaMetrics)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #828

2022-03-31 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-13789) Fragile tag ordering in metric mbean names

2022-03-31 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-13789:
---

 Summary: Fragile tag ordering in metric mbean names
 Key: KAFKA-13789
 URL: https://issues.apache.org/jira/browse/KAFKA-13789
 Project: Kafka
  Issue Type: Improvement
Reporter: Jason Gustafson


We noticed that mbean name creation logic is a bit fragile in 
`KafkaMetricsGroup`, which many server components rely on. We rely on the 
ordering of tags in the underlying map collection. Any change to the map 
implementation could result in a different tag ordering, which would result in 
a different mbean name.

In [https://github.com/apache/kafka/pull/11970], we reimplemented the metric 
naming function to rely on LinkedHashMap so that the ordering is explicit. We 
should try to upgrade current metrics to rely on the new method, which probably 
will involve ensuring that we have good test coverage for registered metrics. 
At a minimum, we should ensure new metrics use the explicit ordering. Perhaps 
we can consider deprecating the old methods or creating a new 
`SafeKafkaMetricsGroup` implementation.
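A minimal sketch of the idea (a hypothetical helper, not the actual `KafkaMetricsGroup` code): with a plain `HashMap` the tag order in the resulting mbean name depends on the map implementation, while a `LinkedHashMap` pins it to insertion order:

```java
import java.util.LinkedHashMap;
import java.util.stream.Collectors;

public class MBeanName {
    // Build an mbean name from a prefix plus tags; requiring a
    // LinkedHashMap makes the tag ordering explicit and stable.
    public static String build(String prefix, LinkedHashMap<String, String> tags) {
        String tagPart = tags.entrySet().stream()
            .map(e -> e.getKey() + "=" + e.getValue())
            .collect(Collectors.joining(","));
        return tagPart.isEmpty() ? prefix : prefix + "," + tagPart;
    }
}
```

Since JMX clients typically match on the full `ObjectName` string, any silent reordering of tags would effectively rename the metric, which is why pinning the order matters.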





Re: [DISCUSS] KIP-821: Connect Transforms support for nested structures

2022-03-31 Thread Jorge Esteban Quilcate Otoya
Thanks, Chris! Much appreciated all the feedback here.

1. You nailed it setting the design goal here: "it shouldn't be impossible
to use this new feature for any field name, no matter how convoluted. It's
fine if edge cases introduce difficulty (such as less-readable
configurations), but it's not fine if they can't be addressed at all."
Back to the previous proposals (using only dots as separators) we have 2
alternatives:
1. escaping with backslashes
2. escaping with dots itself

I lean toward alternative 2, as you proposed before. It feels to me that
backslashes have more potential to lead to confusion in JSON configs, and
CSV seems like a good precedent for using the same character to escape itself.
The KIP is updated to reflect this.
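A rough sketch of the doubled-dot escaping (a hypothetical helper for illustration, not the KIP's actual implementation): a single dot separates path segments, while ".." yields a literal dot inside a field name:

```java
import java.util.ArrayList;
import java.util.List;

public class FieldPath {
    // Split "parent.child" into segments; ".." escapes a literal dot,
    // so "a..b.c" means a field named "a.b" followed by "c".
    public static List<String> split(String path) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == '.') {
                if (i + 1 < path.length() && path.charAt(i + 1) == '.') {
                    current.append('.');            // ".." -> literal dot
                    i++;                            // consume the second dot
                } else {
                    parts.add(current.toString());  // single dot -> separator
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }
}
```

Under this scheme any field name remains expressible, which is the design goal quoted above: edge cases get uglier configs, but nothing becomes impossible.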

2. Thanks! I'll add an example, and stick with the current approach
defining the style per individual transform configuration.

3. Yes, thanks! KIP updated.

4. Of course. KIP updated.

On Mon, 28 Mar 2022 at 21:59, Chris Egerton  wrote:

> Hi Jorge,
>
> Thanks for addressing my comments; the KIP looks up-to-date and pretty
> readable now, and the rejected alternatives section does a great job of
> outlining the discussion so far and providing context for anyone else who
> might want to join in.
>
> 1. Thoughts on choice of delimiter:
> - I like the optimization for simple cases, but I think the new proposal is
> a little too restrictive. What if there's a field whose name contains all
> of the permitted options (currently just ".", ",", and "/")?
> - If we expand the set of permitted delimiters to allow for any
> single-character string, configuration complexity will increase and
> readability may decrease
> - Also worth pointing out that there is some convention for doubling a
> delimiter character as an escape mechanism with formats like CSV [1]
> - Overall I think we may be approaching the saturation point for productive
> discussion on delimiter syntax so I don't want to spend too much more of
> your time on it. I think the one point I'd like to leave for now is that it
> shouldn't be impossible to use this new feature for any field name, no
> matter how convoluted. It's fine if edge cases introduce difficulty (such
> as less-readable configurations), but it's not fine if they can't be
> addressed at all.
>
> 2.
> The configuration style where you define "transforms.field.style" in the
> connector config, and then this applies to all SMTs for the connector, is
> very interesting. However, it doesn't follow convention for existing SMTs.
> Right now, if you want to configure an SMT, you define its name in the
> connector config (for example, "transforms": "smt1"), and then define all
> of the properties for that SMT in the connector config using a namespacing
> mechanism specific to that SMT (for example, "transforms.smt1.prop1":
> "val1"). That SMT then sees only the properties defined in that namespace,
> with the prefix stripped (for example, "prop1": "val1") in its configure
> [2] [3] method.
> If we want to continue to follow this convention, then instead of
> specifying "transforms.field.style" in a connector config, we would expect
> users to configure "transforms.<name>.field.style" for each SMT that they
> want to configure a field style for. This would require more work on the
> part of the user, but would be simpler to reason about and easier to
> implement.
> If we want to explore an alternative where users can specify global
> properties that apply to all transforms in a connector config, then the
> semantics for this need to be defined in the KIP. This would have to
> include whether this will apply only for the special case of the
> "field.style" and possibly "field.separator" properties or if it would be
> available more generally for other properties, whether it will apply only
> for the SMTs outlined in the KIP or if the "field.style" and possibly
> "field.separator" properties would also be passed into custom SMTs so that
> they could choose to act on them if applicable, how edge cases like having
> an SMT named "field" in your connector config would be handled, etc.
> Either way, it might help to have an example in the KIP outlining how one
> of the to-be-augmented SMTs can be configured with this new feature and a
> before/after of how a record value would be transformed with that
> configuration.
>
> 3. The docstring for the "transforms.field.style" property mentions that
> the permitted values are "plain" and "nested", but then describes behavior
> with a value of "root". Should that be "plain" instead?
>
> 4. The docstring for the "transforms.field.separator" property exclusively
> mentions structs, but the feature is intended to work with maps as well.
> Can we update it to reflect this?
>
> References:
>
> [1] - https://stackoverflow.com/a/17808731
> [2] -
>
> https://github.com/apache/kafka/blob/7243facb8d69a7252e6b9556b5eaee13e41bab7f/connect/api/src/main/java/org/apache/kafka/connect/transforms/Transformation.java#L30
> [3] -
>
> 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #827

2022-03-31 Thread Apache Jenkins Server
See 




Re: newbie: PR 11955 review

2022-03-31 Thread Rajani Karuturi
Tests failed again. :(

Is everyone facing this problem?

I tried to look at the cause for the test failures but couldn’t find any.

Is it possible to review and merge this?
It’s a pretty small fix.

Thanks,


On Thu, 31 Mar 2022 at 8:24 PM, Adam Bellemare 
wrote:

> Thanks Bruno :)
>
> On Thu, Mar 31, 2022 at 10:09 AM Bruno Cadonna  wrote:
>
> > Hi Rajani and Adam,
> >
> > I do not think "retest this" or any other comment works for
> > non-committers anymore.
> >
> > Rajani, the failures seem to be flaky tests since they are unrelated to
> > your change. Could you please look into JIRA for the flaky tests? You
> > should find tickets that start with "Flaky test" or similar. If you find
> > tickets, comment on the tickets that the test failed again. I think you
> > will find other comments that you can use as an example.
> >
> > I restarted the builds.
> >
> > Best,
> > Bruno
> >
> > On 31.03.22 15:51, Adam Bellemare wrote:
> > > It could be an intermittent error. I think there is a way to issue a
> > retest
> > > via a comment in github ("retest this") but I am not sure if that's the
> > > precise command, nor if you will have the ability to kick it off.
> > >
> > >
> > >
> > > On Thu, Mar 31, 2022 at 9:12 AM Rajani Karuturi 
> > wrote:
> > >
> > >> Hi All,
> > >> I am new here. To get started, I took one of the "newbie" issues and
> > made
> > >> changes for the same.
> > >>
> > >> Issue: https://issues.apache.org/jira/browse/KAFKA-12380
> > >> PR: https://github.com/apache/kafka/pull/11955
> > >>
> > >> PR build shows some test failures but the same pass on my local.
> > >> The change is very isolated and should not cause any failures ideally.
> > >> Can someone please help me on the next steps?
> > >>
> > >> Thanks,
> > >> ~ Rajani
> > >>
> > >
> >
>
-- 
~ Rajani


[jira] [Resolved] (KAFKA-12875) Change Log layer segment map mutations to avoid absence of active segment

2022-03-31 Thread Jun Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-12875.
-
Fix Version/s: 3.3.0
 Assignee: Yu Yang
   Resolution: Fixed

Merged to trunk.

> Change Log layer segment map mutations to avoid absence of active segment
> -
>
> Key: KAFKA-12875
> URL: https://issues.apache.org/jira/browse/KAFKA-12875
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Kowshik Prakasam
>Assignee: Yu Yang
>Priority: Major
> Fix For: 3.3.0
>
>
> [https://github.com/apache/kafka/pull/10650] showed a case where active 
> segment was absent when Log layer segments were iterated. We should 
> investigate Log layer code to see if we can change Log layer segment map 
> mutations to avoid absence of active segment at any given point. For example, 
> if we are clearing all segments and creating a new one, maybe we can reverse 
> the order to create a new segment first and then clear the old ones later.





Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.2 #19

2022-03-31 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #826

2022-03-31 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-826: Define platforms supported

2022-03-31 Thread Mickael Maison
Hi Ismael,

Thanks for the feedback!

1. I agree we need to run system tests. As far as I can tell we don't
currently run system tests for any platform in the Apache CI, not
even for x86_64. System tests require a lot of resources so I'm not
sure we can run them daily on all platforms in the Apache CI. However,
I think we should be able to run them in the Apache CI (probably only
triggered by committers) whenever required, for example when doing a
release, or when debugging broken tests. We've not enabled integration
tests on aarch64 and pp64le yet so it's hard to gauge if a single
machine is enough to handle all builds. But if we also want to run
system tests regularly, ensuring we have at least 2 machines for each
platform seems like a requirement.

2. Machines are managed by the Apache Infra team. I'd expect the
process to be similar for all platforms. If Infra can't manage some
machines effectively and help us debug issues on them, then they
should not be part of the CI.

Thanks,
Mickael

On Thu, Mar 31, 2022 at 5:50 PM Mickael Maison  wrote:
>
> Hi Divij,
>
> Thanks for taking a look at the KIP!
>
> 1. I'm not sure we should include the kernel as part of the support
> statement. Kafka does not interact with the kernel directly and
> instead relies on the JVM to make the right system calls. So I think
> it's best to keep that out.
>
> 2. That's a good point. The system tests that Ismael mentioned include
> benchmarks, so if we run them we would get some performance metrics.
>
> Thanks,
> Mickael
>
>
> On Wed, Mar 30, 2022 at 3:10 PM Ismael Juma  wrote:
> >
> > Thanks for the KIP. I would say there are 2 requirements to fully support
> > platforms that the KIP does not mention:
> >
> > 1. Running system tests on that platform nightly.
> > 2. Committers need a mechanism to be able to debug and fix issues affecting
> > the platforms.
> >
> > arm64 seems like the kind of platform where we can achieve both, but the
> > others are more challenging.
> >
> > Ismael
> >
> > On Wed, Mar 30, 2022 at 4:01 AM Mickael Maison 
> > wrote:
> >
> > > Hi,
> > >
> > > I recently started a thread
> > > (https://lists.apache.org/thread/4ffpspc59cnzqbgjf3l50ykcvyp7vwtv)
> > > about whether we should consider aarch64 (ARM64) and ppc64le (PowerPC)
> > > as supported platforms because we've recently started using these
> > > platforms in the Kafka CI.
> > >
> > > Unfortunately nobody replied so I guess a better way to start a
> > > discussion is to make a proposal. I opened KIP-826 to propose clearly
> > > defining the platforms we support in our docs and also introducing a
> > > process for adding additional platforms in the future:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-826%3A+Define+platforms+supported
> > >
> > > Let me know if you have any feedback.
> > >
> > > Thanks,
> > > Mickael
> > >


Re: [DISCUSS] KIP-826: Define platforms supported

2022-03-31 Thread Mickael Maison
Hi Divij,

Thanks for taking a look at the KIP!

1. I'm not sure we should include the kernel as part of the support
statement. Kafka does not interact with the kernel directly and
instead relies on the JVM to make the right system calls. So I think
it's best to keep that out.

2. That's a good point. The system tests that Ismael mentioned include
benchmarks, so if we run them we would get some performance metrics.

Thanks,
Mickael


On Wed, Mar 30, 2022 at 3:10 PM Ismael Juma  wrote:
>
> Thanks for the KIP. I would say there are 2 requirements to fully support
> platforms that the KIP does not mention:
>
> 1. Running system tests on that platform nightly.
> 2. Committers need a mechanism to be able to debug and fix issues affecting
> the platforms.
>
> arm64 seems like the kind of platform where we can achieve both, but the
> others are more challenging.
>
> Ismael
>
> On Wed, Mar 30, 2022 at 4:01 AM Mickael Maison 
> wrote:
>
> > Hi,
> >
> > I recently started a thread
> > (https://lists.apache.org/thread/4ffpspc59cnzqbgjf3l50ykcvyp7vwtv)
> > about whether we should consider aarch64 (ARM64) and ppc64le (PowerPC)
> > as supported platforms because we've recently started using these
> > platforms in the Kafka CI.
> >
> > Unfortunately nobody replied so I guess a better way to start a
> > discussion is to make a proposal. I opened KIP-826 to propose clearly
> > defining the platforms we support in our docs and also introducing a
> > process for adding additional platforms in the future:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-826%3A+Define+platforms+supported
> >
> > Let me know if you have any feedback.
> >
> > Thanks,
> > Mickael
> >


Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.2 #18

2022-03-31 Thread Apache Jenkins Server
See 




Re: newbie: PR 11955 review

2022-03-31 Thread Adam Bellemare
Thanks Bruno :)

On Thu, Mar 31, 2022 at 10:09 AM Bruno Cadonna  wrote:

> Hi Rajani and Adam,
>
> I do not think "retest this" or any other comment works for
> non-committers anymore.
>
> Rajani, the failures seem to be flaky tests since they are unrelated to
> your change. Could you please look into JIRA for the flaky tests? You
> should find tickets that start with "Flaky test" or similar. If you find
> tickets, comment on the tickets that the test failed again. I think you
> will find other comments that you can use as an example.
>
> I restarted the builds.
>
> Best,
> Bruno
>
> On 31.03.22 15:51, Adam Bellemare wrote:
> > It could be an intermittent error. I think there is a way to issue a
> retest
> > via a comment in github ("retest this") but I am not sure if that's the
> > precise command, nor if you will have the ability to kick it off.
> >
> >
> >
> > On Thu, Mar 31, 2022 at 9:12 AM Rajani Karuturi 
> wrote:
> >
> >> Hi All,
> >> I am new here. To get started, I took one of the "newbie" issues and
> made
> >> changes for the same.
> >>
> >> Issue: https://issues.apache.org/jira/browse/KAFKA-12380
> >> PR: https://github.com/apache/kafka/pull/11955
> >>
> >> PR build shows some test failures but the same pass on my local.
> >> The change is very isolated and should not cause any failures ideally.
> >> Can someone please help me on the next steps?
> >>
> >> Thanks,
> >> ~ Rajani
> >>
> >
>


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #825

2022-03-31 Thread Apache Jenkins Server
See 




Re: newbie: PR 11955 review

2022-03-31 Thread Bruno Cadonna

Hi Rajani and Adam,

I do not think "retest this" or any other comment works for 
non-committers anymore.


Rajani, the failures seem to be flaky tests since they are unrelated to 
your change. Could you please look into JIRA for the flaky tests? You 
should find tickets that start with "Flaky test" or similar. If you find 
tickets, comment on the tickets that the test failed again. I think you 
will find other comments that you can use as an example.


I restarted the builds.

Best,
Bruno

On 31.03.22 15:51, Adam Bellemare wrote:

It could be an intermittent error. I think there is a way to issue a retest
via a comment in github ("retest this") but I am not sure if that's the
precise command, nor if you will have the ability to kick it off.



On Thu, Mar 31, 2022 at 9:12 AM Rajani Karuturi  wrote:


Hi All,
I am new here. To get started, I took one of the "newbie" issues and made
changes for the same.

Issue: https://issues.apache.org/jira/browse/KAFKA-12380
PR: https://github.com/apache/kafka/pull/11955

PR build shows some test failures but the same pass on my local.
The change is very isolated and should not cause any failures ideally.
Can someone please help me on the next steps?

Thanks,
~ Rajani





Re: newbie: PR 11955 review

2022-03-31 Thread Adam Bellemare
It could be an intermittent error. I think there is a way to issue a retest
via a comment in github ("retest this") but I am not sure if that's the
precise command, nor if you will have the ability to kick it off.



On Thu, Mar 31, 2022 at 9:12 AM Rajani Karuturi  wrote:

> Hi All,
> I am new here. To get started, I took one of the "newbie" issues and made
> changes for the same.
>
> Issue: https://issues.apache.org/jira/browse/KAFKA-12380
> PR: https://github.com/apache/kafka/pull/11955
>
> PR build shows some test failures but the same pass on my local.
> The change is very isolated and should not cause any failures ideally.
> Can someone please help me on the next steps?
>
> Thanks,
> ~ Rajani
>


newbie: PR 11955 review

2022-03-31 Thread Rajani Karuturi
Hi All,
I am new here. To get started, I took one of the "newbie" issues and made
changes for the same.

Issue: https://issues.apache.org/jira/browse/KAFKA-12380
PR: https://github.com/apache/kafka/pull/11955

PR build shows some test failures but the same tests pass on my local machine.
The change is very isolated and ideally should not cause any failures.
Can someone please help me on the next steps?

Thanks,
~ Rajani


Re: [DISCUSS] KIP-714: Client metrics and observability

2022-03-31 Thread Magnus Edenhill
Hey Ismael,


> > The PushTelemetryRequest handler decompresses the payload before passing
> it
> > to the metrics plugin.
> > This was done to avoid having to expose a public decompression interface
> to
> > metrics plugin developers.
> >
>
> Are there cases where the metrics plugin developers would want to forward
> the compressed payload without decompressing?
>

Maybe, but most plugins probably want to either add some extra information
(e.g., from the auth context), or convert to another format, so the original
compressed blob is most likely not that interesting.
In any case the plugin will want to inspect the uncompressed metrics data to
verify it is not garbage before forwarding it upstream.

We could always add an option later to allow passing the metrics payload
verbatim if the need arises.

Thanks,
Magnus


[jira] [Created] (KAFKA-13788) Creation of invalid dynamic config prevents further creation of valid configs

2022-03-31 Thread Prateek Agarwal (Jira)
Prateek Agarwal created KAFKA-13788:
---

 Summary: Creation of invalid dynamic config prevents further 
creation of valid configs
 Key: KAFKA-13788
 URL: https://issues.apache.org/jira/browse/KAFKA-13788
 Project: Kafka
  Issue Type: Bug
  Components: config
Affects Versions: 2.8.0
Reporter: Prateek Agarwal


Kafka currently allows creating an unknown dynamic config without any errors, 
but it errors out when the next valid dynamic config gets created.

This can be seen locally in a cluster by creating a wrong config 
{{log.cleaner.threadzz}}, which then prevents creation of the valid config 
{{log.cleaner.threads}}.

{code}
# Invalid config 'log.cleaner.threadzz' gets created without issues
$ ./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --add-config 
log.cleaner.threadzz=2 --entity-type brokers --entity-default 2>&1
Completed updating default config for brokers in the cluster.
{code}

Now when a valid config is added, {{kafka-configs.sh}} errors out:
{code}
$ ./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --add-config 
log.cleaner.threads=2 --entity-type brokers --entity-default
All sensitive broker config entries must be specified for --alter, missing 
entries: Set(log.cleaner.threadzz)
{code}

To fix this, one needs to first delete the incorrect config:
{code:java}
$ ./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter 
--delete-config log.cleaner.threadzz --entity-type brokers --entity-default
 {code}

But ideally, the invalid config should error out so that creation of the valid 
config doesn't get prevented.
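The desired behavior amounts to validating proposed config names against the set of dynamically reconfigurable configs at alter time, before anything is persisted. A minimal sketch of that idea (not Kafka's actual validation code; the config set below is an illustrative subset):

```java
import java.util.Map;
import java.util.Set;

// Sketch of alter-time validation: reject unknown dynamic config names up
// front instead of persisting them and failing on the *next* valid alter.
public class DynamicConfigCheck {

    // Illustrative subset of dynamically reconfigurable broker configs.
    static final Set<String> RECONFIGURABLE =
        Set.of("log.cleaner.threads", "log.cleaner.io.max.bytes.per.second");

    static void validate(Map<String, String> proposed) {
        for (String key : proposed.keySet()) {
            if (!RECONFIGURABLE.contains(key)) {
                throw new IllegalArgumentException(
                    "Unknown dynamic config: " + key);
            }
        }
    }

    public static void main(String[] args) {
        validate(Map.of("log.cleaner.threads", "2")); // valid, passes
        try {
            validate(Map.of("log.cleaner.threadzz", "2")); // typo, rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With validation like this, the typo in the first `kafka-configs.sh` invocation above would fail immediately instead of poisoning later alterations.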



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (KAFKA-13783) Remove reason prefixing in JoinGroupRequest and LeaveGroupRequest

2022-03-31 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-13783.
-
Resolution: Fixed

> Remove reason prefixing in JoinGroupRequest and LeaveGroupRequest
> -
>
> Key: KAFKA-13783
> URL: https://issues.apache.org/jira/browse/KAFKA-13783
> Project: Kafka
>  Issue Type: Bug
>Reporter: David Jacot
>Assignee: David Jacot
>Priority: Blocker
> Fix For: 3.2.0
>
>
> KIP-800 introduced a mechanism to pass a reason in the join group request and 
> in the leave group request. A default reason is used unless one is provided 
> by the user. In this case, the custom reason is prefixed by the default one.
> When we tried to use this in Kafka Streams, we noted a significant 
> performance degradation; see 
> https://github.com/apache/kafka/pull/11873. It is not clear whether the 
> prefixing is the root cause of the issue or not. To be on the safe side, I 
> think that we should remove the prefixing. It does not bring much anyway, as 
> we are still able to distinguish a custom reason from the default one on the 
> broker side.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13787) Failed to delete state store directory for it is not empty

2022-03-31 Thread Nico Pommerening (Jira)
Nico Pommerening created KAFKA-13787:


 Summary: Failed to delete state store directory for it is not empty
 Key: KAFKA-13787
 URL: https://issues.apache.org/jira/browse/KAFKA-13787
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 3.0.0
Reporter: Nico Pommerening
 Attachments: bug-log.txt

On Kafka Streams shutdown, the cleanup of state directories appears not to 
work: the lock and metadata files are not deleted.

Relevant WARN logging:

2022-03-31 10:34:41,689 WARN  [SpringApplicationShutdownHook] 
org.apache.kafka.streams.processor.internals.StateDirectory: stream-thread 
[SpringApplicationShutdownHook] Failed to delete state store directory of 
/kafka-streams-statestore/555b9965-95e3-4c92-b467-1d283428da5d/test-test-run-kpi
 for it is not empty

 

Left over files in directory:
 * .lock
 * kafka-streams-process-metadata

 

I'm not sure what the consequences of an unclean state cleanup are, but I 
would like to get rid of the warning.

I attached a larger log extract, and I've already patched the StateDirectory 
implementation, which I'll try to contribute.
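The underlying failure mode is generic: deleting a directory that still contains files such as `.lock` or `kafka-streams-process-metadata` fails, so a cleanup has to remove the children first, deepest paths before their parents. A minimal sketch of that ordering (this is not the actual StateDirectory patch):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch only, not the Kafka Streams patch: Files.delete on a non-empty
// directory throws DirectoryNotEmptyException, so we walk the tree and
// delete in reverse order (children before their parent directories).
public class StateDirCleanup {

    static void deleteRecursively(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            for (Path p : paths.sorted(Comparator.reverseOrder()).toList()) {
                Files.delete(p);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a state store directory with the leftover files
        // mentioned in this report.
        Path dir = Files.createTempDirectory("statestore");
        Files.createFile(dir.resolve(".lock"));
        Files.createFile(dir.resolve("kafka-streams-process-metadata"));
        deleteRecursively(dir);
        System.out.println(Files.exists(dir)); // false
    }
}
```

Note that Kafka Streams keeps `.lock` held while the instance runs, so the real fix also has to release the lock before (or as part of) deletion, which a plain recursive delete does not address.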



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13786) Optimized documentation for control.plane.listener.name parameter

2022-03-31 Thread RivenSun (Jira)
RivenSun created KAFKA-13786:


 Summary: Optimized documentation for control.plane.listener.name 
parameter
 Key: KAFKA-13786
 URL: https://issues.apache.org/jira/browse/KAFKA-13786
 Project: Kafka
  Issue Type: Improvement
  Components: config
Reporter: RivenSun


documentation:
[https://kafka.apache.org/documentation/#brokerconfigs_control.plane.listener.name]

There are two points in the above document that are not clear.

1. There is a piece of verification code in KafkaConfig
{code:java}
// validate control.plane.listener.name config
if (controlPlaneListenerName.isDefined) {
  require(advertisedListenerNames.contains(controlPlaneListenerName.get),
s"${KafkaConfig.ControlPlaneListenerNameProp} must be a listener name 
defined in ${KafkaConfig.AdvertisedListenersProp}. " +
s"The valid options based on currently configured listeners are 
${advertisedListenerNames.map(_.value).mkString(",")}")
  // controlPlaneListenerName should be different from interBrokerListenerName
  
require(!controlPlaneListenerName.get.value().equals(interBrokerListenerName.value()),
s"${KafkaConfig.ControlPlaneListenerNameProp}, when defined, should have a 
different value from the inter broker listener name. " +
s"Currently they both have the value ${controlPlaneListenerName.get}")
} {code}
That is, if `control.plane.listener.name` is set, it cannot have the same 
value as `inter.broker.listener.name`.

 

2. In the ControllerChannelManager.addNewBroker() method, there is a line of 
code
{code:java}
val controllerToBrokerListenerName = 
config.controlPlaneListenerName.getOrElse(config.interBrokerListenerName)
 {code}
So the last sentence of the parameter documentation is wrong.
{code:java}
If not explicitly configured, the default value will be null and there will be 
no dedicated endpoints for controller connections. {code}
 

Solution:

Change the incorrect documentation mentioned above to the following:



If not explicitly configured, the default value is null and the value of 
`inter.broker.listener.name` will be used for controller connections.
If set explicitly, the value cannot be the same as the value of 
`inter.broker.listener.name`.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #824

2022-03-31 Thread Apache Jenkins Server
See