Re: [DISCUSS] KIP-753: ACL authentication, Host field support IP network segment

2021-11-26 Thread Luke Chen
Hi Lobo,
Thanks for the KIP!

I like the idea to allow "IP subnet" to be passed into `--allow-host`
option to set for a principle. It will be useful in production environment.

Here's some comments:
1. I think "IP subnet" is more specific than "network segment", is that
right?
2. Since you allow the IP subnet in "--allow-host" option, should we also
allow the IP subnet in "--deny-host" option?
3. You should mention that we only accept the "CIDR notation" of the IP
subnet, to avoid other kinds of subnet expression. REF:
https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation
4. IP subnet also supports IPv6, should we also allow subnet of IPv6?

Thank you.
Luke

On Tue, Jun 8, 2021 at 9:19 AM lobo xu  wrote:

> The KIP address is wrong in the last email. This is the correct Kip Wiki
> address
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-753%3A++ACL+authentication%2C+Host+field+support+IP+network+segment
>
>
> On 2021/06/07 16:24:50, lobo xu  wrote:
> > Hi all
> >
> > I'd like to discuss the following kip, any suggestions are welcome.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-753%3A++ACL+authentication%2C+Host+field+support+IP+network+segment
> 。
> >
> > Many thanks,
> >
> > Lobo
> >
>


[DISCUSS] KIP-804: OfflinePartitionsCount Tagged by Topic

2021-11-26 Thread Mason Legere
Hi All,

I would like to start a discussion for KIP-804
,
which
proposes tagging the offline partition counter metric (managed by the
controller) by the topic name of the corresponding offline partition(s).

Open to any thoughts and suggestions,
Mason


[jira] [Resolved] (KAFKA-12791) ConcurrentModificationException in KafkaProducer constructor

2021-11-26 Thread Lucas Bradstreet (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lucas Bradstreet resolved KAFKA-12791.
--
Resolution: Fixed

> ConcurrentModificationException in KafkaProducer constructor
> 
>
> Key: KAFKA-12791
> URL: https://issues.apache.org/jira/browse/KAFKA-12791
> Project: Kafka
>  Issue Type: Bug
>Reporter: Lucas Bradstreet
>Priority: Minor
> Fix For: 3.0.0
>
>
> Recently we have noticed multiple instances where KafkaProducers have failed 
> to constructe due to the following exception:
> {noformat}
> org.apache.kafka.common.KafkaException: Failed to construct kafka producer at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:440)
>  at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:291)
>  at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:318)
>  java.base/java.lang.Thread.run(Thread.java:832) Caused by: 
> java.util.ConcurrentModificationException at 
> java.base/java.util.HashMap$HashIterator.nextNode(HashMap.java:1584) at 
> java.base/java.util.HashMap$KeyIterator.next(HashMap.java:1607) at 
> java.base/java.util.AbstractSet.removeAll(AbstractSet.java:171) at 
> org.apache.kafka.common.config.AbstractConfig.unused(AbstractConfig.java:221) 
> at 
> org.apache.kafka.common.config.AbstractConfig.logUnused(AbstractConfig.java:379)
>  at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:433)
>  ... 9 more exception.class:org.apache.kafka.common.KafkaException 
> exception.message:Failed to construct kafka producer
> {noformat}
> It appears that this is due to the fact that `used` below is a synchronized 
> set:
>  
> {code:java}
> public Set unused() {
>  Set keys = new HashSet<>(originals.keySet());
>  keys.removeAll(used);
>  return keys;
> }{code}
> It appears that `used` is being modified while removeAll is being called. 
> This may be due to the way that keys are added to it when used:
> {code:java}
> protected Object get(String key) {
>  if (!values.containsKey(key))
>  throw new ConfigException(String.format("Unknown configuration '%s'", key));
>  used.add(key);
>  return values.get(key);
> }{code}
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [VOTE] KIP-800: Add reason to LeaveGroupRequest

2021-11-26 Thread Bill Bejeck
Thanks for the KIP, David this seems like it will be very helpful.

+1(binding)

-Bill

On Thu, Nov 25, 2021 at 10:02 AM David Jacot 
wrote:

> Hi Tom,
>
> I do agree with you. For context, this is the current reason/message logged
> by the consumer when enforceRebalance is called so I just kept it as it is.
> I guess that we can revise it during the implementation.
>
> Thanks,
> David
>
> On Thu, Nov 25, 2021 at 3:52 PM Tom Bentley  wrote:
> >
> > Thanks for the KIP David.
> >
> > It's a very trivial point, but I did wonder whether "caused" or
> "requested"
> > might be a better word than "enforced" in the default messages.
> >
> > In any case, +1 (binding).
> >
> > Kind regards,
> >
> > Tom
> >
> >
> >
> > On Thu, Nov 25, 2021 at 2:36 PM David Jacot  >
> > wrote:
> >
> > > Thanks, Mickael. I have removed it.
> > >
> > > On Thu, Nov 25, 2021 at 3:22 PM Mickael Maison <
> mickael.mai...@gmail.com>
> > > wrote:
> > > >
> > > > +1 (binding)
> > > > Thanks for the KIP
> > > >
> > > > Just a small suggestion: in other Options classes we don't seem to
> > > > prefix setters with "set", so I think we could use void reason(final
> > > > String reason).
> > > >
> > > >
> > > > On Thu, Nov 25, 2021 at 1:48 PM Luke Chen  wrote:
> > > > >
> > > > > Hi David,
> > > > > Thanks for the KIP!
> > > > > It is good to have the joinGroup/leaveGroup reason sent to brokers
> for
> > > > > better troubleshooting.
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Thank you.
> > > > > Luke
> > > > >
> > > > > On Thu, Nov 25, 2021 at 8:14 AM Gwen Shapira
>  > > >
> > > > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > Thanks for driving David. Super useful.
> > > > > >
> > > > > > On Wed, Nov 24, 2021 at 8:53 AM David Jacot
> > > 
> > > > > > wrote:
> > > > > >
> > > > > > > Hi folks,
> > > > > > >
> > > > > > > I'd like to start a vote on KIP-800: Add reason to
> > > LeaveGroupRequest.
> > > > > > >
> > > > > > > KIP: https://cwiki.apache.org/confluence/x/eYyqCw
> > > > > > >
> > > > > > > Please let me know what you think.
> > > > > > >
> > > > > > > Best,
> > > > > > > David
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Gwen Shapira
> > > > > > Engineering Manager | Confluent
> > > > > > 650.450.2760 | @gwenshap
> > > > > > Follow us: Twitter | blog
> > > > > >
> > >
> > >
>


Re: KIP-769: Connect API to retrieve connector configuration definitions

2021-11-26 Thread Mickael Maison
Hi Chris,

1. If we want to expose worker plugins, I think we should do it via a
separate endpoint. But to be honest, I'm not even sure I see strong
use cases for exposing them as they are either enabled or not and
can't be changed at runtime. So I'd prefer to stick to "connector
level" plugins in this KIP. Let me now if you have use cases, I'm open
to reconsider this choice.
I'll add that in the rejected alternatives section for now

2. I remembered seeing issues in the past with multiple plugin.path
entries but I tried today and I was able to mix and match plugins from
different paths. So my bad for getting confused.
Then I agree, it makes more sense to group them by plugin type.

3. Yes this should be covered in KIP-802:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-802%3A+Validation+Support+for+Kafka+Connect+SMT+Options

4. No particular reason. We can support both formats like today. I've
updated the KIP

Thanks,
Mickael



On Tue, Nov 23, 2021 at 6:40 PM Chris Egerton
 wrote:
>
> Hi Mickael,
>
> I think the increase in scope here is great and the added value certainly
> justifies the proposed changes. I have some thoughts but overall I like the
> direction this is going in now.
>
> 1. The new /plugins endpoint is described as containing "all plugins that
> are Connectors, Transformations, Converters, HeaderConverters and
> Predicates". So essentially, it looks like we want to expose all plugins
> that are configured on a per-connector basis, but exclude plugins that are
> configured on a per-worker basis (such as config providers and REST
> extensions). Do you think it may be valuable to expose information on
> worker-level plugins as well?
>
> 2. The description for the new /plugins endpoint also states that "Plugins
> will be grouped by plugin.path. This will make it clear to users what's
> available to use as it's not possible to use a Connector from one path with
> Transformations from another.". Is this true? I thought that Connect's
> classloading made it possible to package
> converters/transformations/predicates completely independently from each
> other, and to reference them from also-independently-packaged connectors.
> If it turns out that this is the case, could we consider restructuring the
> response to be grouped by plugin type instead of by classloader? There's
> also the ungrouped format proposed in KIP-494 (
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120740150)
> which we might consider as well.
>
> 3. I think this can be left for a follow-up KIP if necessary, but I'm
> curious about your thoughts on adding new validate methods to all
> connector-level plugins that can be used similarly to how the existing
> Connector::validate method (
> https://github.com/apache/kafka/blob/1e0916580f16b99b911b0ed36e9740dcaeef520e/connect/api/src/main/java/org/apache/kafka/connect/connector/Connector.java#L131-L146)
> is used. This would allow for plugins to perform validation that's more
> sophisticated than what the ConfigDef is capable of, such as validating
> combinations of properties like a hostname and credentials for reaching it.
> I know that at least Confluent's Avro, protobuf, and JSON schema converters
> would benefit from this kind of feature. It's a little tangential to this
> KIP (which at the moment is about discovering plugins and their
> configuration surfaces, as opposed to validating them), but I figured I'd
> ask since we're going to be expanding the Converter interface and it may be
> useful to tackle this while we're in the neighborhood.
>
> 4. The description for the new /plugins///configdef endpoint
> states that "Name must be the fully qualified class name of the plugin".
> Any reason not to also support aliases (e.g., "FileStreamSinkConnector" or
> "FileStreamSink" instead of
> "org.apache.kafka.connect.file.FileStreamSinkConnector")?
>
> Cheers,
>
> Chris
>
> On Tue, Nov 23, 2021 at 12:07 PM Mickael Maison 
> wrote:
>
> > Thanks all for the feedback!
> >
> > Chris,
> > I agree that fixing the current endpoint helps a lot. Thanks for
> > raising these JIRAs and submitting a PR!
> > However thinking about the issue further, I decided to expand the
> > scope of the KIP to cover all user-visible plugins.
> > In practice, users want to know about all available plugins not only
> > connectors. This includes transformations, converters,
> > header_converters and predicates. As we also want to retrieve
> > configdef for these too, I think it makes sense to introduce a new
> > endpoint to do so. Alongside we obviously need a new endpoint for
> > listing all plugins.
> >
> > Gunnar,
> > I took a look at exposing valid values via the API. I think the issue
> > is that Validators don't expose a way to retrieve valid values.
> > Changing validators will have an impact on all components so I'd
> > prefer to address this requirement in a separate KIP. I agree this
> > would be an interesting improvement and I'd happy to write a KIP for
> > it too.
> >
>

[jira] [Created] (KAFKA-13486) Kafka Connect: Failed to start task due to NPE

2021-11-26 Thread Geliba Uilte (Jira)
Geliba Uilte created KAFKA-13486:


 Summary: Kafka Connect: Failed to start task due to NPE
 Key: KAFKA-13486
 URL: https://issues.apache.org/jira/browse/KAFKA-13486
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 2.7.1
 Environment: Kubernetes, custom docker image
Reporter: Geliba Uilte


I have a Kafka Connect cluster with three workers running on Kubernetes. The 
workers communicate with each other using pod's IP (internal IP 192.X.X.X). 
Sometimes, pods are redistributed to different node. I am not sure if it has 
anything to do with the issue, but I think it makes pod's IP to be changed and 
Kafka Connect needs to rebalance.

Occasionally, tasks fail due to NPE.

>From the connectors/:connector/status REST API, I can see this trace:

 
{code:java}
at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:517)
at 
org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:1258)
at 
org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1700(DistributedHerder.java:127)
at 
org.apache.kafka.connect.runtime.distributed.DistributedHerder$10.call(DistributedHerder.java:1273)
at 
org.apache.kafka.connect.runtime.distributed.DistributedHerder$10.call(DistributedHerder.java:1269)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834){code}
 

It looks like the issue is similar to KAFKA-10323 and

It seems NPE is thrown from 
[here|https://github.com/apache/kafka/blob/2.7.1/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L517].

 

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13485) Restart connectors after RetriableException raised from Task::start()

2021-11-26 Thread Gunnar Morling (Jira)
Gunnar Morling created KAFKA-13485:
--

 Summary: Restart connectors after RetriableException raised from 
Task::start()
 Key: KAFKA-13485
 URL: https://issues.apache.org/jira/browse/KAFKA-13485
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Reporter: Gunnar Morling


If a {{RetriableException}} is raised from {{Task::start()}}, this doesn't 
trigger an attempt to start that connector again. I.e. the restart 
functionality currently is only implemented for exceptions raised from 
{{poll()}}/{{put()}}. Triggering restarts also upon failures during {{start()}} 
would be desirable, so to circumvent temporary failure conditions like a 
network hickup which currrently require a manual restart of the affected tasks, 
if a connector for instance establishes a database connection during 
{{start()}}.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: Handling retriable exceptions during Connect source task start

2021-11-26 Thread Gunnar Morling
Hi all,

We encountered a similar situation in Debezium again, where an exception
during Task::start() would be desirable to be retried.

Would anything speak against implementing retriable support for
Task::start() in Kafka Connect? Would it require a KIP?

Thanks,

--Gunnar


Am Mo., 9. Aug. 2021 um 10:47 Uhr schrieb Gunnar Morling <
gunnar.morl...@googlemail.com>:

> Hi,
>
> To ask slightly differently: would there be interest in a pull request for
> implementing retries, in case RetriableException is thrown from the
> Task::start() method?
>
> Thanks,
>
> --Gunnar
>
>
> Am Do., 5. Aug. 2021 um 22:27 Uhr schrieb Sergei Morozov :
>
>> Hi,
>>
>> I'm trying to address an issue in Debezium (DBZ-3823
>> ) where a source connector
>> task
>> cannot recover from a retriable exception.
>>
>> The root cause is that the task interacts with the source database during
>> SourceTask#start but Kafka Connect doesn't handle retriable exceptions
>> thrown at this stage as retriable. KIP-298
>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect
>> >
>> that
>> originally introduced handling of retriable exception doesn't describe
>> handling task start exceptions, so it's unclear to me whether those aren't
>> allowed by design or it was just out of the scope of the KIP.
>>
>> My current working solution
>>  relies
>> on the internal Debezium implementation of the task restart which
>> introduces certain risks (the details are in the PR description).
>>
>> The question is: are retriable exceptions during start disallowed by
>> design, and the task must not throw retriable exceptions during start, or
>> it's just currently not supported by the Connect framework and I just need
>> to implement proper error handling in the connector?
>>
>> Thanks!
>>
>> --
>> Sergei Morozov
>>
>