Re: Re: KIP 678: New Kafka Connect SMT for plainText => Struct(or Map) with Regex

2022-06-13 Thread Chris Egerton
Hi whsoul,

Would you mind updating the KIP document (
https://cwiki.apache.org/confluence/display/KAFKA/KIP+678%3A+New+Kafka+Connect+SMT+for+plainText+%3D%3E+Struct%28or+Map%29+with+Regex)
with all of these changes? We tend to discuss and vote on what's included
in the Wiki as our source of truth, as opposed to pull requests.

RE 4:
> it seems the chained transform with extractField SMT + plainText value
ParseStructByRegex make the same result with struct value
ParseSturctByRegex, but it will drop collector meta data during
extractField ( I think.. )
This is exactly right--using ExtractField + ParseStructByRegex will replace
the record key/value with the parsed struct, and drop everything else. Do
you think it'd increase the implementation complexity significantly to add
support for users to specify a field to operate on, which would preserve
the rest of the record key/value as-is? I'm not so sure that everyone is
going to want to drop all other metadata from their messages, but if
there's something that makes this particularly difficult to implement (as
compared to the other SMTs that are already included out of the box with
Kafka Connect), then we could probably leave it out for now.

RE 6:
It's a tricky distinction, but what I mean is that, although we might add a
class named GroupRegexValidator and use that class in our SMT library,
unless it's part of the public interface we're trying to change, it's an
implementation detail and we don't have to call it out in the KIP. The
advantage to leaving it out is that it makes the KIP more concise and
therefore easier to review, you can choose to rename, remove, decompose,
etc. the class in your PR without having to worry about sticking to the
plan in the KIP that everyone reviewed and voted on. It's a minor detail
though, I'm noting it here more because it may be useful when writing
future KIPs than because it's necessary to adhere to strictly in this one.

Cheers,

Chris

On Fri, Jun 10, 2022 at 8:13 AM gyejun choi  wrote:

> Hi Chris,
>
> I applied some code fix according your second reviews.
>
> https://github.com/apache/kafka/pull/12219/commits/f673ea2eae0d907502e44c0ecd53b616386627bf
>
>
> 1. [applied] changed name as ParseStructByRegex
>
> 2. [applied] throw DataException, when a line that the SMT sees doesn't
> match the regex...
> originally, it will be skipped if no data match with regex,
> but change code to throw DataException according to your review
>
> 3. [already applied]
> I already delete code about desc ":{TYPE}" with commit below
>
> https://github.com/apache/kafka/pull/12219/commits/534d995b3e6371c37443eb72eee03884cb23c85d
>
> 4. [need discuss]
> In my use case about log data collection,
> I configured the pipeline below
> nginx => filebeat => kafka => kafka connect es connector => es
>
> filebeat ( or most other log collector ) usually send value as struct ( not
> plaintext ) with collector meta data,
> and the key name as "message" ( in case filebeat )
>
> I think there are more use cases log message wrapped struct value than
> plain text value,
> and it seems the chained transform with extractField SMT + plainText value
> ParseStructByRegex make the same result with struct value
> ParseSturctByRegex,
> but it will drop collector meta data during extractField ( I think.. )
> and also in almost case, users will use ParseStructByRegex SMT with
> extractField SMT
>
> 5. [applied] throw DataException, when there are difference between group
> size and mapping names size
>
>
> 6. [ question ]
> you mean, use RegexValidator class already exist?
> without group "(.*)" pattern check, it will not provide early detection
> about regex config mistake,
> If you think it is enough as runtime DataException detection?
>
> always thanks.
>
> whsoul
>
> 2022년 6월 8일 (수) 오전 9:46, Chris Egerton 님이 작성:
>
> > Hi whsoul,
> >
> > Thanks for the updates. I have a few more thoughts but this is looking
> > pretty good:
> >
> > 1. The name "ToStructByRegexTransform" is a little unwieldy. What do you
> > think about something shorter like ParseStruct, ParseRegex, or
> > ParseStructByRegex?
> >
> > 2. What happens if a line that the SMT sees doesn't match the regex
> > supplied by the user? Will it throw an exception, silently skip the
> message
> > and return it as-is, allow the user to configure it to do either,
> something
> > else entirely? (I'd personally lean towards just throwing an exception
> > since users can configure regexes to be lenient via the optional
> > quantifier, i.e. "?")
> >
> > 3. The description for the "regex" property still includes the "( with
> > :{TYPE} )" snippet; should that be removed?
> >
> > 4. Is it worth adding support to this SMT to operate on an individual
> field
> > of a message? I.e., you specify a "field" of "log_line" and the SMT
> expects
> > to see a struct or a map with a field/key of "log_line" and parses that
> > instead of the entire message. If so, it might be worth specifying that
> > this property would follow any 

Re: [DISCUSS] KIP-821: Connect Transforms support for nested structures

2022-06-13 Thread Chris Egerton
Hi Jorge,

I've done some more thinking and I hate to say it, but I think the syntax
does need to be expanded. Right now it's clear what "a.b" refers to and
what "a..b" refers to, but what about "a...b"? Is that referring to
subfield ".b" of field "a", or subfield "b" of field "a."? This gets even
more complicated when thinking about fields whose names are exclusively
made up of dots.

I'm also a little hesitant to mix the cases of multi-value paths and deep
scans. What if you only want to access one subfield deep for an SMT,
instead of recursing through all the children of a given field? It's akin
to the distinction between * and ** with file globbing patterns, and there
could be a substantial performance difference if you have heavily-nested
fields.

Ultimately, I think that if the proposed "field.syntax.version" property
sits well with people, it might be better to reduce the scope of the KIP
back to the original proposal and just focus on adding support for
explicitly-specified nested values, with no multi-value paths whatsoever,
knowing that we have an easy way to introduce new syntax and features in
the future. (We could probably leave the "a...b" case for that next version
too.)

I was a huge fan of this KIP before we started trying to address more
complex use cases, and although I don't want to write those off, I think we
may have bitten off more than we can chew in time for the 3.3.0 release and
would hate to see this KIP get delayed as a result.

I'd be really curious to hear from Joshua and Tom on this front, though. Is
it acceptable to move more incrementally here and settle on the syntax
version property as our means of introducing new features, or is it
preferable to implement things monolithically and try to get everything (or
at least, as much as possible) right the first time?

Thanks again for your continued effort on this KIP!

Cheers,

Chris

On Wed, Jun 8, 2022 at 5:41 PM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Thanks, Chris!
>
> Please, find my comments below:
>
> On Tue, 7 Jun 2022 at 04:39, Chris Egerton 
> wrote:
>
> > Hi Jorge,
> >
> > Thanks! Sorry for the delay; here are my thoughts:
> >
> > 1. Under the "Accessing multiple values by deep-scan" header it's stated
> > that "If deep-scan is used, it must have only one field after the
> asterisk
> > level.". However, in example 3 for the Cast SMT and other examples for
> > other SMTs, the spec contains a field of "*.child.k2", which appears to
> > have two fields after the asterisk level. I may be misunderstanding the
> > proposal, but it seems like the two contradict each other.
> >
>
> Thanks for catching this. I have clarified it by removing this restriction.
> Also, have extended the deep-scan scenarios.
>
>
> >
> > 2. I'm a little unclear on why we need the special handling for arrays
> > where, for an array field "a", the field name "a" can be treated as
> either
> > the array itself, or every element in the array. Is there a reason we
> can't
> > use the field name "a.*" to handle the latter case, and "a" to handle the
> > former?
> >
>
> Agree, this is confusing. I like the `a.*` approach to access array items.
> I have added this to the proposal.
>
>
> >
> > 3. How would a user specify that they'd like to access a field with the
> > literal name "*"?
> >
>
> Good one. I'm proposing an approach similar to how it's proposed to escape
> dots, with a double-asterisk. Curious on your thoughts around this.
>
>
> >
> > 4. For the Cast SMT, do you think it might bite some people if fields
> that
> > can't be cast correctly are silently ignored? I'm imagining the case
> where
> > none of the fields in a multi-path expression can be cast correctly and
> it
> > ends up eating half of someone's day to track down why their SMT isn't
> > doing anything.
> >
>
> If I understand correctly, this challenge could be relevant across SMTs.
> At the moment, most/all? SMTs just silently ignore.
> Was thinking about adding a flag `field.on.path.not.found` to either ignore
> or fail when no paths are found. What do your think?
>
>
> >
> > 5. For the ExtractField and ValueToKey SMTs, what happens if a deep-scan
> > field name is used, but only one field is found? Is the resulting field
> > still an array, or is it just the single field that was found? (FWIW I'm
> > leaning towards keeping it an array just to keep schemas consistent in a
> > pipeline in case the number of fields found fluctuates across records.)
> >
> > Agree. Will clarify that an array is always produced even for 1 or 0
> fields are found.
>
>
> > 6. (Nit) For the HeaderFrom SMT, it's stated that "As this SMT affects
> only
> > existing fields, additional configurations will not be required.". Given
> > the new "field.syntax.version" property, this part should probably be
> > removed.
> >
> Agree.
>
>
> >
> > 7. Is recursive descent intentionally excluded? That was an important
> part
> > of Joshua's KIP and his feedback on this KIP; I think it's worth 

Re: [VOTE] KIP-836: Addition of Information in DescribeQuorumResponse about Voter Lag

2022-06-13 Thread Niket Goel
Thanks for taking a look Gouzhang.
That is an accurate summary of the changes.

- Niket

> On Jun 13, 2022, at 2:19 PM, Guozhang Wang  wrote:
> 
> Hello Niket,
> 
> I took a look at the diffs of the wiki page history and it seems the
> updates are:
> 
> 1) Removing the topic name from QuorumInfo.
> 2) Making ReplicaState public APIs more aligned with existing ones.
> 3) Renaming from DescribeQuorumResult to DescribeMetadataQuorumResult.
> 
> Is that all? If yes, I'm +1 on the changes.
> 
> 
> Guozhang
> 
> On Mon, Jun 13, 2022 at 2:09 PM Niket Goel  >
> wrote:
> 
>> Hi all,
>> 
>> An update on KIP-836; while iterating on the PR some finer details in the
>> naming of the API handler and related classes in the Admin Client has
>> changed.
>> I have updated the KIP with the changes. Please provide any feedback you
>> may have on the changes made.
>> 
>> PR - 
>> https://www.google.com/url?q=https://github.com/apache/kafka/pull/12206=gmail-imap=165576000900=AOvVaw0Av1rIpaRYGZsVAJkgSFwx
>>  
>> 
>> KIP -
>> https://www.google.com/url?q=https://cwiki.apache.org/confluence/display/KAFKA/KIP-836%253A%2BAddition%2Bof%2BInformation%2Bin%2BDescribeQuorumResponse%2Babout%2BVoter%2BLag=gmail-imap=165576000900=AOvVaw3ZhA-Qvc0rZBYddc4u9667
>>  
>> 
>> 
>> Thanks
>> - Niket
>> 
>> 
>>> On May 24, 2022, at 2:30 PM, José Armando García Sancio
>>  wrote:
>>> 
>>> +1 (binding).
>>> 
>>> On Mon, May 23, 2022 at 11:52 AM Niket Goel 
>> wrote:
 
 Thanks everyone for the feedback on this. KIP-836 was approved with
 3 binding votes from Luke, David Arthur and David Jacot.
 - Niket
 
> On May 20, 2022, at 11:44 PM, David Jacot  wrote:
> 
> +1 (binding)
> 
> Thanks for the KIP!
> 
> David
> 
> Le ven. 20 mai 2022 à 21:08, David Arthur  a écrit :
> 
>> Hey Niket, +1 (binding) from me.
>> 
>> -David
>> 
>> On Thu, May 19, 2022 at 2:07 PM José Armando García Sancio
>>  wrote:
>> 
>>> Hey Niket,
>>> 
>>> I took a look at the latest KIP. It looks like
>> QuorumInfo.ReplicaState
>>> is missing the RPC fields added by this PR. Is the plan to return
>> them
>>> to the Admin Client? E.g. it is missing LastFetchTimestamp and
>>> LastCaughtUpTimestamp.
>>> 
>>> For those fields what will the admin client return when the RPC
>>> version doesn't support those features?
>>> 
>>> --
>>> -José
>>> 
>> 
>> 
>> --
>> David Arthur
>> 
 
>>> 
>>> 
>>> --
>>> -José
>> 
>> 
> 
> -- 
> -- Guozhang



[jira] [Created] (KAFKA-13987) BlockingConnectorTest causing flakiness in other tests

2022-06-13 Thread Chris Egerton (Jira)
Chris Egerton created KAFKA-13987:
-

 Summary: BlockingConnectorTest causing flakiness in other tests
 Key: KAFKA-13987
 URL: https://issues.apache.org/jira/browse/KAFKA-13987
 Project: Kafka
  Issue Type: Test
  Components: KafkaConnect
Reporter: Chris Egerton
Assignee: Chris Egerton


We artificially reduce the REST request timeout from 90 seconds to 10 seconds 
while running the BlockingConnectorTest suite, which allows some test cases 
that issue REST requests that are expected to time out to proceed faster 
without eating up 90 seconds spinning their wheels.

However, this reduction is applied by mutating static state in the JVM, and 
since Connect tests from different suites can be run concurrently, it has the 
potential to impact other tests and make them flaky.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Jenkins build is back to stable : Kafka » Kafka Branch Builder » trunk #1001

2022-06-13 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-836: Addition of Information in DescribeQuorumResponse about Voter Lag

2022-06-13 Thread Guozhang Wang
Hello Niket,

I took a look at the diffs of the wiki page history and it seems the
updates are:

1) Removing the topic name from QuorumInfo.
2) Making ReplicaState public APIs more aligned with existing ones.
3) Renaming from DescribeQuorumResult to DescribeMetadataQuorumResult.

Is that all? If yes, I'm +1 on the changes.


Guozhang

On Mon, Jun 13, 2022 at 2:09 PM Niket Goel 
wrote:

> Hi all,
>
> An update on KIP-836; while iterating on the PR some finer details in the
> naming of the API handler and related classes in the Admin Client has
> changed.
> I have updated the KIP with the changes. Please provide any feedback you
> may have on the changes made.
>
> PR - https://github.com/apache/kafka/pull/12206
> KIP -
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-836%3A+Addition+of+Information+in+DescribeQuorumResponse+about+Voter+Lag
>
> Thanks
> - Niket
>
>
> > On May 24, 2022, at 2:30 PM, José Armando García Sancio
>  wrote:
> >
> > +1 (binding).
> >
> > On Mon, May 23, 2022 at 11:52 AM Niket Goel 
> wrote:
> >>
> >> Thanks everyone for the feedback on this. KIP-836 was approved with
> >> 3 binding votes from Luke, David Arthur and David Jacot.
> >> - Niket
> >>
> >>> On May 20, 2022, at 11:44 PM, David Jacot  wrote:
> >>>
> >>> +1 (binding)
> >>>
> >>> Thanks for the KIP!
> >>>
> >>> David
> >>>
> >>> Le ven. 20 mai 2022 à 21:08, David Arthur  a écrit :
> >>>
>  Hey Niket, +1 (binding) from me.
> 
>  -David
> 
>  On Thu, May 19, 2022 at 2:07 PM José Armando García Sancio
>   wrote:
> 
> > Hey Niket,
> >
> > I took a look at the latest KIP. It looks like
> QuorumInfo.ReplicaState
> > is missing the RPC fields added by this PR. Is the plan to return
> them
> > to the Admin Client? E.g. it is missing LastFetchTimestamp and
> > LastCaughtUpTimestamp.
> >
> > For those fields what will the admin client return when the RPC
> > version doesn't support those features?
> >
> > --
> > -José
> >
> 
> 
>  --
>  David Arthur
> 
> >>
> >
> >
> > --
> > -José
>
>

-- 
-- Guozhang


[jira] [Created] (KAFKA-13986) DescribeQuorum does not return the observers (brokers) for the Metadata log

2022-06-13 Thread Niket Goel (Jira)
Niket Goel created KAFKA-13986:
--

 Summary: DescribeQuorum does not return the observers (brokers) 
for the Metadata log
 Key: KAFKA-13986
 URL: https://issues.apache.org/jira/browse/KAFKA-13986
 Project: Kafka
  Issue Type: Improvement
  Components: kraft
Affects Versions: 3.0.1, 3.0.0
Reporter: Niket Goel


h2. Background

While working on the [PR|https://github.com/apache/kafka/pull/12206] for 
KIP-836, we realized that the `DescribeQuorum` API does not return the brokers 
as observers for the metadata log.

As noted by [~dengziming] :
_We set nodeId=-1 if it's a broker so observers.size==0_

The related code is:

[https://github.com/apache/kafka/blob/4c9eeef5b2dff9a4f0977fbc5ac7eaaf930d0d0e/core/src/main/scala/kafka/raft/RaftManager.scala#L185-L189]
{code:java}
val nodeId = if (config.processRoles.contains(ControllerRole))
{ OptionalInt.of(config.nodeId) }
else
{ OptionalInt.empty() }
{code}
h2. ToDo

We should fix this and have the DescribeMetadata API return the brokers as 
observers for the metadata log.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [VOTE] KIP-836: Addition of Information in DescribeQuorumResponse about Voter Lag

2022-06-13 Thread Niket Goel
Hi all,

An update on KIP-836; while iterating on the PR some finer details in the 
naming of the API handler and related classes in the Admin Client has changed.
I have updated the KIP with the changes. Please provide any feedback you may 
have on the changes made.

PR - https://github.com/apache/kafka/pull/12206
KIP - 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-836%3A+Addition+of+Information+in+DescribeQuorumResponse+about+Voter+Lag

Thanks
- Niket


> On May 24, 2022, at 2:30 PM, José Armando García Sancio 
>  wrote:
> 
> +1 (binding).
> 
> On Mon, May 23, 2022 at 11:52 AM Niket Goel  
> wrote:
>> 
>> Thanks everyone for the feedback on this. KIP-836 was approved with
>> 3 binding votes from Luke, David Arthur and David Jacot.
>> - Niket
>> 
>>> On May 20, 2022, at 11:44 PM, David Jacot  wrote:
>>> 
>>> +1 (binding)
>>> 
>>> Thanks for the KIP!
>>> 
>>> David
>>> 
>>> Le ven. 20 mai 2022 à 21:08, David Arthur  a écrit :
>>> 
 Hey Niket, +1 (binding) from me.
 
 -David
 
 On Thu, May 19, 2022 at 2:07 PM José Armando García Sancio
  wrote:
 
> Hey Niket,
> 
> I took a look at the latest KIP. It looks like QuorumInfo.ReplicaState
> is missing the RPC fields added by this PR. Is the plan to return them
> to the Admin Client? E.g. it is missing LastFetchTimestamp and
> LastCaughtUpTimestamp.
> 
> For those fields what will the admin client return when the RPC
> version doesn't support those features?
> 
> --
> -José
> 
 
 
 --
 David Arthur
 
>> 
> 
> 
> -- 
> -José



Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #1000

2022-06-13 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-831: Add metric for log recovery progress

2022-06-13 Thread Jun Rao
Hi, Luke,

Thanks for the KIP. +1 from me.

Jun

On Mon, Jun 13, 2022 at 8:04 AM Mickael Maison 
wrote:

> +1 (binding)
>
> Thanks for the KIP!
>
> Mickael
>
> On Sun, Jun 12, 2022 at 5:26 PM Yu Kvicii  wrote:
> >
> > +1 non binding. Thanks
> >
> >
> > > On May 16, 2022, at 15:11, Luke Chen  wrote:
> > >
> > > Hi all,
> > >
> > > I'd like to start a vote on KIP to expose metrics for log recovery
> > > progress. These metrics would let the admins have a way to monitor the
> log
> > > recovery progress.
> > >
> > > Details can be found here:
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-831%3A+Add+metric+for+log+recovery+progress
> > >
> > > Any feedback is appreciated.
> > >
> > > Thank you.
> > > Luke
> >
>


Re: [VOTE] KIP-831: Add metric for log recovery progress

2022-06-13 Thread Mickael Maison
+1 (binding)

Thanks for the KIP!

Mickael

On Sun, Jun 12, 2022 at 5:26 PM Yu Kvicii  wrote:
>
> +1 non binding. Thanks
>
>
> > On May 16, 2022, at 15:11, Luke Chen  wrote:
> >
> > Hi all,
> >
> > I'd like to start a vote on KIP to expose metrics for log recovery
> > progress. These metrics would let the admins have a way to monitor the log
> > recovery progress.
> >
> > Details can be found here:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-831%3A+Add+metric+for+log+recovery+progress
> >
> > Any feedback is appreciated.
> >
> > Thank you.
> > Luke
>


[jira] [Reopened] (KAFKA-13414) Replace Powermock/EasyMock by Mockito in connect.storage package

2022-06-13 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison reopened KAFKA-13414:


My PR only updated some of the classes under connect.storage. The following 
files still need to be migrated to Mockito:
- KafkaConfigBackingStoreTest
- KafkaOffsetBackingStoreTest

> Replace Powermock/EasyMock by Mockito in connect.storage package
> 
>
> Key: KAFKA-13414
> URL: https://issues.apache.org/jira/browse/KAFKA-13414
> Project: Kafka
>  Issue Type: Sub-task
>  Components: KafkaConnect
>Reporter: Mickael Maison
>Assignee: Mickael Maison
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (KAFKA-13436) Omitted BrokerTopicMetrics metrics in the documentation

2022-06-13 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-13436.

Fix Version/s: 3.3.0
   Resolution: Fixed

> Omitted BrokerTopicMetrics metrics in the documentation
> ---
>
> Key: KAFKA-13436
> URL: https://issues.apache.org/jira/browse/KAFKA-13436
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Reporter: Dongjin Lee
>Assignee: Dongjin Lee
>Priority: Minor
> Fix For: 3.3.0
>
>
> As of present, there are 18 'kafka.server:type=BrokerTopicMetrics' but, only 
> 13 of them are described in the documentation.
> The omitted metrics are:
>  * kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec
>  * kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec
>  * kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec
>  * kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec
>  * kafka.server:type=BrokerTopicMetrics,name=BytesRejectedPerSec



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (KAFKA-13985) MirrorSourceTask commitRecord throws NPE if SMT is filtering out source record

2022-06-13 Thread Jacopo (Jira)
Jacopo created KAFKA-13985:
--

 Summary: MirrorSourceTask commitRecord throws NPE if SMT is 
filtering out source record
 Key: KAFKA-13985
 URL: https://issues.apache.org/jira/browse/KAFKA-13985
 Project: Kafka
  Issue Type: Bug
  Components: mirrormaker
Affects Versions: 3.2.0, 3.1.0
Reporter: Jacopo


Applying a SMT that filter message it can laverage to enter in this if:

 

>From WorkerSourceTask.java

 
{code:java}
final SourceRecord record = transformationChain.apply(preTransformRecord);
final ProducerRecord producerRecord = 
convertTransformedRecord(record);
if (producerRecord == null || retryWithToleranceOperator.failed()) {
    counter.skipRecord();
    commitTaskRecord(preTransformRecord, null);
    continue;
} {code}
 

That bring to:

>From MirrorSourceTask.java

 
{code:java}
    @Override
    public void commitRecord(SourceRecord record, RecordMetadata metadata) {
        try {
            if (stopping) {
                return;
            }
            if (!metadata.hasOffset()) {
                log.error("RecordMetadata has no offset -- can't sync offsets 
for {}.", record.topic());
                return;
            }

...{code}
 

Causing a NPE because metadata is null, this the exception.
{code:java}
java.lang.NullPointerException
    at 
org.apache.kafka.connect.mirror.MirrorSourceTask.commitRecord(MirrorSourceTask.java:177)
    at 
org.apache.kafka.connect.runtime.WorkerSourceTask.commitTaskRecord(WorkerSourceTask.java:463)
    at 
org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:358)
    at 
org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:257)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:188)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:243)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown 
Source)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown 
Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown 
Source)
    at java.base/java.lang.Thread.run(Unknown Source) {code}
In my understanding this is well handled and it does not have negative impacts 
because it's handled by MirrorSourceTask.commitRecord, without leaving the 
exception be forwarded outside of it. 

 

But probably is preferred handle it checking if metadata != null.

So skipping commit but safely and silently

 

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (KAFKA-13984) Fix TopicBasedRemoteLogMetadataManager#initializeResources should return immediately when the result of isPartitionsCountSameAsConfigured method is false

2022-06-13 Thread zhaobo (Jira)
zhaobo created KAFKA-13984:
--

 Summary: Fix 
TopicBasedRemoteLogMetadataManager#initializeResources should return 
immediately when the result of isPartitionsCountSameAsConfigured method is false
 Key: KAFKA-13984
 URL: https://issues.apache.org/jira/browse/KAFKA-13984
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 3.2.0
Reporter: zhaobo


When  executing 
{code:java}
TopicBasedRemoteLogMetadataManager.initializeResources{code}
, if the result of isPartitionsCountSameAsConfigured method is false.

It means that the actual number of partitions in the internal topic 
__remote_log_metadata is inconsistent with the number of partitions configured 
in our configuration file
At this time, it is not very reasonable to continue the subsequent 
initialization process.
We should raise an error to exit.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (KAFKA-13983) Support special character in Resource name in ACLs operation by sanitizing

2022-06-13 Thread Aman Singh (Jira)
Aman Singh created KAFKA-13983:
--

 Summary: Support special character in Resource name in ACLs 
operation by sanitizing 
 Key: KAFKA-13983
 URL: https://issues.apache.org/jira/browse/KAFKA-13983
 Project: Kafka
  Issue Type: Improvement
  Components: core
Reporter: Aman Singh


Currently, resource names in ACLS can contain any special characters, but 
resource names with some special characters are not a valid zookeeper path 
entry.

For example resource name {{test/true}} is not a valid zookeeper path entry.

Zookeeper will create a child node, name as {color:#de350b}{{true}}{color} 
inside the {color:#de350b}{{test}}{color} node.

This will create two problems:-
 # If there is *one*  ACL with a resource name {{test}} it can't be deleted 
because if there is only one, Kafka tries to delete the node as well by 
thinking it will be empty which is not true it has the child node {{true.}}
 # When broker restarts {{{}ACL cache{}}}(which is used for ACL operations like 
describe, authorization etc) update from zookeeper and Kafka only looks for  
ACLs that are direct child nodes of resource type in the ACL tree. 

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


RE: [External] Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #995

2022-06-13 Thread Nozza, Nicola
Hi Bruno,

Thank you a lot for your feedback.

Sorry for the wrong thread mail. I have also written a single mail (in attach).

Regards,

Nicola Nozza

-Original Message-
From: Bruno Cadonna  
Sent: lunedì 13 giugno 2022 11:25
To: dev@kafka.apache.org; Nozza, Nicola 
Subject: Re: [External] Build failed in Jenkins: Kafka » Kafka Branch Builder » 
trunk #995

Hi Nicola,

The ETA for Kafka 4.0 in which log4j will be updated to version 2 is going to 
be released in the second half of 2023.

In the meanwhile we replaced log4j with reload4j in release 3.2.0. 
Reload4j closes the known vulnerabilities of log4j1.

Out of curiosity, why do you reply to a Jenkins build e-mail and ask something 
totally unrelated to the Jenkins build?

Best,
Bruno



On 10.06.22 17:44, Nozza, Nicola wrote:
> Hi,
> 
> Someone knows if there is an ETA for the release in the following link:
> 
> [KAFKA-9366] Upgrade log4j to log4j2 - ASF JIRA (apache.org)
> 
> Thanks,
> 
> Nicola Nozza
> 
> -Original Message-
> From: Apache Jenkins Server 
> Sent: venerdì 10 giugno 2022 17:43
> To: dev@kafka.apache.org
> Subject: [External] Build failed in Jenkins: Kafka » Kafka Branch 
> Builder » trunk #995
> 
> This message is from an EXTERNAL SENDER - be CAUTIOUS, particularly with 
> links and attachments.
> 
> See 
>  he.org_job_Kafka_job_kafka_job_trunk_995_display_redirect=DwIFaQ=e
> IGjsITfXP_y-DLLX0uEHXJvU8nOHrUK8IrwNKOtkVU=_xaOzQifkxaSrndD85AUqeAQW
> tUM1WmWQ8VvQ8mTgzg=1zzopzlgF3euqEpniX4mbchXoSO3tfxy-tBu89oRD0gelMedU
> XN-HjCo9wU1wP8C=nU48ilW9_jnw16qbJ8yAStOBveD5N0UcT4KEX71CfmU= >
> 
> Changes:
> 
> 
> --
> [...truncated 634647 lines...]
> [2022-06-10T15:29:21.047Z] > Task :streams:javadoc [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:890:
>  warning - Tag @link: reference not found: DefaultPartitioner 
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:919:
>  warning - Tag @link: reference not found: DefaultPartitioner 
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:939:
>  warning - Tag @link: reference not found: DefaultPartitioner 
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:854:
>  warning - Tag @link: reference not found: DefaultPartitioner 
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:890:
>  warning - Tag @link: reference not found: DefaultPartitioner 
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:919:
>  warning - Tag @link: reference not found: DefaultPartitioner 
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:939:
>  warning - Tag @link: reference not found: DefaultPartitioner 
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Produced.java:84:
>  warning - Tag @link: reference not found: DefaultPartitioner 
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Produced.java:136:
>  warning - Tag @link: reference not found: DefaultPartitioner 
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Produced.java:147:
>  warning - Tag @link: reference not found: DefaultPartitioner 
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Repartitioned.java:101:
>  warning - Tag @link: reference not found: DefaultPartitioner 
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Repartitioned.java:167:
>  warning - Tag @link: reference not found: DefaultPartitioner 
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java:443:
>  warning - @statestore.cache.max.bytes is an unknown tag.
> [2022-06-10T15:29:21.047Z] 
> /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java:443:
>  warning - @statestore.cache.max.bytes is an unknown tag.
> 

[jira] [Created] (KAFKA-13982) Replace EasyMock and PowerMock with Mockito for WorkerConfigTransformerTest

2022-06-13 Thread Christo Lolov (Jira)
Christo Lolov created KAFKA-13982:
-

 Summary: Replace EasyMock and PowerMock with Mockito for 
WorkerConfigTransformerTest
 Key: KAFKA-13982
 URL: https://issues.apache.org/jira/browse/KAFKA-13982
 Project: Kafka
  Issue Type: Sub-task
Reporter: Christo Lolov






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [External] Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #995

2022-06-13 Thread Bruno Cadonna

Hi Nicola,

The ETA for Kafka 4.0 in which log4j will be updated to version 2 is 
going to be released in the second half of 2023.


In the meanwhile we replaced log4j with reload4j in release 3.2.0. 
Reload4j closes the known vulnerabilities of log4j1.


Out of curiosity, why do you reply to a Jenkins build e-mail and ask 
something totally unrelated to the Jenkins build?


Best,
Bruno



On 10.06.22 17:44, Nozza, Nicola wrote:

Hi,

Someone knows if there is an ETA for the release in the following link:

[KAFKA-9366] Upgrade log4j to log4j2 - ASF JIRA (apache.org)

Thanks,

Nicola Nozza

-Original Message-
From: Apache Jenkins Server 
Sent: venerdì 10 giugno 2022 17:43
To: dev@kafka.apache.org
Subject: [External] Build failed in Jenkins: Kafka » Kafka Branch Builder » 
trunk #995

This message is from an EXTERNAL SENDER - be CAUTIOUS, particularly with links 
and attachments.

See 


Changes:


--
[...truncated 634647 lines...]
[2022-06-10T15:29:21.047Z] > Task :streams:javadoc [2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:890:
 warning - Tag @link: reference not found: DefaultPartitioner 
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:919:
 warning - Tag @link: reference not found: DefaultPartitioner 
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:939:
 warning - Tag @link: reference not found: DefaultPartitioner 
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:854:
 warning - Tag @link: reference not found: DefaultPartitioner 
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:890:
 warning - Tag @link: reference not found: DefaultPartitioner 
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:919:
 warning - Tag @link: reference not found: DefaultPartitioner 
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:939:
 warning - Tag @link: reference not found: DefaultPartitioner 
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Produced.java:84:
 warning - Tag @link: reference not found: DefaultPartitioner 
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Produced.java:136:
 warning - Tag @link: reference not found: DefaultPartitioner 
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Produced.java:147:
 warning - Tag @link: reference not found: DefaultPartitioner 
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Repartitioned.java:101:
 warning - Tag @link: reference not found: DefaultPartitioner 
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Repartitioned.java:167:
 warning - Tag @link: reference not found: DefaultPartitioner 
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java:443:
 warning - @statestore.cache.max.bytes is an unknown tag.
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java:443:
 warning - @statestore.cache.max.bytes is an unknown tag.
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java:443:
 warning - @statestore.cache.max.bytes is an unknown tag.
[2022-06-10T15:29:21.047Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/TopologyConfig.java:61:
 warning - Tag @link: missing '#': "org.apache.kafka.streams.StreamsBuilder()"
[2022-06-10T15:29:21.047Z] 

[jira] [Resolved] (KAFKA-13414) Replace Powermock/EasyMock by Mockito in connect.storage package

2022-06-13 Thread Christo Lolov (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christo Lolov resolved KAFKA-13414.
---
Resolution: Fixed

> Replace Powermock/EasyMock by Mockito in connect.storage package
> 
>
> Key: KAFKA-13414
> URL: https://issues.apache.org/jira/browse/KAFKA-13414
> Project: Kafka
>  Issue Type: Sub-task
>  Components: KafkaConnect
>Reporter: Mickael Maison
>Assignee: Mickael Maison
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)