[jira] [Created] (KAFKA-12840) Removing `compact` cleaning on a topic should abort on-going compactions

2021-05-24 Thread David Jacot (Jira)
David Jacot created KAFKA-12840:
---

 Summary: Removing `compact` cleaning on a topic should abort 
on-going compactions
 Key: KAFKA-12840
 URL: https://issues.apache.org/jira/browse/KAFKA-12840
 Project: Kafka
  Issue Type: Improvement
Reporter: David Jacot
Assignee: David Jacot


When `compact` is removed from the `cleanup.policy` of a topic, any on-going 
compactions of that topic should be aborted.
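
For illustration, a minimal sketch (Java Admin client; the topic name and bootstrap 
address are assumptions) of the kind of config change that should trigger the abort:

{code:java}
// Sketch: switch a topic's cleanup.policy from "compact" to "delete".
// Per this issue, such a change should abort any compaction already in progress.
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class RemoveCompactPolicy {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            AlterConfigOp op = new AlterConfigOp(
                new ConfigEntry("cleanup.policy", "delete"), AlterConfigOp.OpType.SET);
            Map<ConfigResource, Collection<AlterConfigOp>> configs = new HashMap<>();
            configs.put(topic, Collections.singletonList(op));
            admin.incrementalAlterConfigs(configs).all().get();
        }
    }
}
{code}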



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-12841) NPE from the provided metadata in client callback in case of ApiException

2021-05-24 Thread Avi Youkhananov (Jira)
Avi Youkhananov created KAFKA-12841:
---

 Summary: NPE from the provided metadata in client callback in case 
of ApiException
 Key: KAFKA-12841
 URL: https://issues.apache.org/jira/browse/KAFKA-12841
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 2.6.0
 Environment: Prod
Reporter: Avi Youkhananov
 Attachments: NPE.production

1. The org.apache.kafka.clients.producer.Callback interface has the method 
onCompletion(...), whose documentation says:

"The metadata for the record that was sent (i.e. the partition and offset). An 
empty metadata with -1 value for all fields except for topicPartition will be 
returned if an error occurred."


We got an NPE from the doSend(...) method in 
org.apache.kafka.clients.producer.KafkaProducer, which can occur when an 
ApiException is thrown. In that case the producer invokes the regular callback 
instead of the InterceptorCallback, which might otherwise have guarded against 
the NPE.

2. Moreover, RecordMetadata has a method partition() that returns an int but can 
also throw an NPE, because the underlying TopicPartition might be null.

Stack trace attached.
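
For illustration, a minimal null-safe callback sketch (an assumption based on the 
behaviour described above, where the metadata handed to the callback on failure may 
be null or carry placeholder values):

{code:java}
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.RecordMetadata;

// Defensive callback: never dereferences the metadata unconditionally.
public class NullSafeCallback implements Callback {
    @Override
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        if (exception != null) {
            // Send failed: do not rely on metadata fields here.
            System.err.println("Send failed: " + exception);
            return;
        }
        if (metadata != null) {
            System.out.println("Sent to " + metadata.topic()
                + "-" + metadata.partition() + " @ offset " + metadata.offset());
        }
    }
}
{code}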

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-741: Change default serde to be null

2021-05-24 Thread Leah Thomas
If there are no further notes, I'll go ahead and start the voting thread
today.

Thanks,
Leah

On Thu, May 20, 2021 at 2:58 PM Leah Thomas  wrote:

> Thanks for finding that, Guozhang. Consistency seems like the best option
> to me as well, for the time being.  I updated the KIP with that detail
>
> On Thu, May 20, 2021 at 11:53 AM Sophie Blee-Goldman
>  wrote:
>
>> Thanks Guozhang, I forgot about ConfigException. I actually just reviewed
>> two KIPs related to config-based Serdes,
>> both of which use the ConfigException in this way: the ListSerde KIP, and
>> the KIP to clean up the windowed inner class
>> serde configuration.
>>
>> For the sake of simplicity and keeping this KIP well-scoped, I would
>> prefer
>> to stick with the ConfigException for now,
>> since this is consistent with how these very similar cases are handled at
>> the moment. I would still stand by the idea
>> of introducing a dedicated SerdeConfigurationException (or similar) but I
>> think we can treat that as orthogonal, and
>> maybe do a followup KIP at some point to convert all of these relevant
>> cases over to a new Serde-specific exception
>>
>> On Thu, May 20, 2021 at 3:33 AM Bruno Cadonna  wrote:
>>
>> > Hi,
>> >
>> > I think using ConfigException makes sense. But I am also fine with
>> > SerdeConfigurationException. I think both are meaningful in this
>> > situation where the former has the advantage that we do not need to
>> > introduce a new exception.
>> >
>> > Best,
>> > Bruno
>> >
>> >
>> >
>> >
>> > On 20.05.21 07:54, Guozhang Wang wrote:
>> > > Thanks Sophie. I think not piggy-backing on TopologyException makes
>> > sense.
>> > >
>> > > It just occurred to me that today we already have a similar situation
>> > > even with this config defaulting to Bytes, namely the other
>> > > `DEFAULT_WINDOWED_KEY/VALUE_SERDE_INNER_CLASS` config, whose default is
>> > > actually null. Quickly checking the code: I thought we throw
>> > > StreamsException when they are found not defined at runtime, but we
>> > > actually throw the `ConfigException`. So for consistency we could just
>> > > use that exception as well.
>> > >
>> > > Guozhang
>> > >
>> > >
>> > > On Wed, May 19, 2021 at 3:24 PM Sophie Blee-Goldman
>> > >  wrote:
>> > >
>> > >> To be honest I'm not really a fan of reusing the TopologyException,
>> > >> since it feels like a bit of a stretch from a user point of view to
>> > >> classify Serde misconfiguration as a topology issue.
>> > >>
>> > >> I personally think a StreamsException would be acceptable, but I would
>> > >> also propose to introduce a new type of exception, something like
>> > >> SerdeConfigurationException or so. We certainly don't want to end up
>> > >> like the Producer API with its 500 different exceptions. Luckily
>> > >> Streams is nowhere near that yet, in my opinion, and problems with
>> > >> Serde configuration are so common and well-defined that a dedicated
>> > >> exception feels very appropriate.
>> > >>
>> > >> If there are any other instances in the codebase where we throw a
>> > >> StreamsException for a Serde-related issue, this could also be
>> > >> migrated to the new exception type (not necessarily all at once, but
>> > >> gradually after this KIP).
>> > >>
>> > >> Thoughts?
>> > >>
>> > >> On Wed, May 19, 2021 at 10:31 AM Guozhang Wang 
>> > wrote:
>> > >>
>> > >>> Leah, thanks for the KIP.
>> > >>>
>> > >>> It looks good to me overall, just following up on @br...@confluent.io
>> > >>> 's question about exception: what about using the `TopologyException`
>> > >>> class? I know that currently it is only thrown during the topology
>> > >>> parsing phase, not at streams construction, but I feel we can extend
>> > >>> its scope to cover both topology building and streams object (i.e.
>> > >>> taking the topology and the config) construction time as well, since
>> > >>> part of the construction is again to re-write / augment the topology.
>> > >>>
>> > >>> Guozhang
>> > >>>
>> > >>>
>> > >>> On Wed, May 19, 2021 at 8:44 AM Leah Thomas
>> > > > >>>
>> > >>> wrote:
>> > >>>
>> >  Hi Sophie,
>> > 
>> >  Thanks for catching that. These are existing methods inside of
>> >  `StreamsConfig` that will return null (the new default) instead of
>> > byte
>> >  array serde (the old default). Both `StreamsConfig` and
>> >  `defaultKeySerde`/`defaultValueSerde` are public, so I assume these
>> > >> still
>> >  count as part of the public API. I updated the KIP to include this
>> >  information.
>> > 
>> >  Bruno - I was planning on including a specific message with the
>> > streams
>> >  exception to indicate that either a serde needs to be passed in or
>> a
>> >  default needs to be set. I'm open to doing something more specific,
>> >  perhaps something like a serde exception? WDYT? I was hoping that
>> with
>> > >>> the
>> > >>>
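
For context, here is a rough sketch (plain Java; topic names and serde choices are 
made up for illustration) of the two ways an application can satisfy the serde 
requirement once the built-in default becomes null: set explicit defaults in the 
config, or pass serdes per operation.

// Sketch only: without one of these, the ConfigException discussed above is
// what the application should expect at runtime.
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;

public class SerdeConfigSketch {
    public static void main(String[] args) {
        // (a) explicit defaults in StreamsConfig
        Properties props = new Properties();
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // (b) per-operation serdes, independent of any default
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("my-topic", Consumed.with(Serdes.String(), Serdes.String()));
    }
}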

[jira] [Created] (KAFKA-12842) Failing test: org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic

2021-05-24 Thread John Roesler (Jira)
John Roesler created KAFKA-12842:


 Summary: Failing test: 
org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic
 Key: KAFKA-12842
 URL: https://issues.apache.org/jira/browse/KAFKA-12842
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: John Roesler
 Fix For: 3.0.0


This test failed during a PR build, which means that it failed twice in a row, 
due to the test-retry logic in PR builds.

 

[https://github.com/apache/kafka/pull/10744/checks?check_run_id=2643417209]

 
{noformat}
java.lang.NullPointerException
at 
java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
at org.reflections.Store.getAllIncluding(Store.java:82)
at org.reflections.Store.getAll(Store.java:93)
at org.reflections.Reflections.getSubTypesOf(Reflections.java:404)
at 
org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.getPluginDesc(DelegatingClassLoader.java:352)
at 
org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanPluginPath(DelegatingClassLoader.java:337)
at 
org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanUrlsAndAddPlugins(DelegatingClassLoader.java:268)
at 
org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initPluginLoader(DelegatingClassLoader.java:216)
at 
org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:209)
at 
org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:61)
at 
org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:93)
at 
org.apache.kafka.connect.util.clusters.WorkerHandle.start(WorkerHandle.java:50)
at 
org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.addWorker(EmbeddedConnectCluster.java:174)
at 
org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.startConnect(EmbeddedConnectCluster.java:260)
at 
org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster.start(EmbeddedConnectCluster.java:141)
at 
org.apache.kafka.connect.integration.ConnectWorkerIntegrationTest.testSourceTaskNotBlockedOnShutdownWithNonExistentTopic(ConnectWorkerIntegrationTest.java:303)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.Gene

Re: [VOTE] KIP-720 Deprecate MirrorMaker v1

2021-05-24 Thread Colin McCabe
Thanks, Ryanne. Can you add information about this way forward to the KIP? Also 
it would be good to clarify that this work needs to get done before removing 
MM1.

best,
Colin


On Thu, May 20, 2021, at 16:00, Ryanne Dolan wrote:
> Hey y'all, reviving this thread because it seems we have a way forward
> w.r.t. IdentityReplicationPolicy aka LegacyReplicationPolicy, which I
> believe is the only missing feature in MM2 that we need to deprecate MM1.
> 
> If there are no objections over the next couple of days I'll consider this
> adopted. Thanks!
> 
> Ryanne
> 
> On Fri, Apr 2, 2021 at 10:48 AM Colin McCabe  wrote:
> 
> > Hi Ryanne,
> >
> > Thanks for the response.  It would be good to have a PR for KIP-382, I
> > agree.
> >
> > Perhaps one possible compromise for KIP-712 would be to make the changes
> > in MM2 first, and then backport them to MM1.  I think it's important that
> > when we have a deprecated way of doing something and a non-deprecated way,
> > the non-deprecated way is the recommended way.  If we are onboarding more
> > users to the deprecated code path (for example, because there's major
> > features missing in the new code path), we're doing something wrong.
> >
> > best,
> > Colin
> >
> >
> > On Thu, Apr 1, 2021, at 15:48, Ryanne Dolan wrote:
> > > Colin, the only feature gap I'm aware of is that users must provide their
> > > own ReplicationPolicy in order to replicate topics without renaming them.
> > > This is straightforward, and such ReplicationPolicy implementations are
> > > easy to find. We could provide one OOTB, and indeed KIP-382 proposes we
> > do
> > > so, but the PR is missing. I'm happy to pick that up, no problem.
> > >
> > > wrt KIP-712, the changes are immediately applicable to MM2, at least as
> > it
> > > is currently written. I have no dog in the fight wrt whether the changes
> > > also land in MM1, but, assuming both KIPs land concurrently in 3.0, I
> > don't
> > > see why the two KIPs would be in conflict. Obvs, this KIP marks MM1 as
> > > deprecated going forward, but I don't think that precludes a concurrent
> > > improvement.
> > >
> > > If KIP-712 were being proposed after 3.0, I'd agree with you.
> > >
> > > I think the reality is that MM1 has been sort of unofficially deprecated
> > > for a long time, so people are understandably disinterested in landing
> > new
> > > features there. But let's have that debate in the KIP-712 thread. I
> > believe
> > > we'd be having the same discussion there with or without KIP-720 passing.
> > >
> > > Ryanne
> > >
> > > On Thu, Apr 1, 2021, 2:07 PM Colin McCabe  wrote:
> > >
> > > > Thanks for bringing this up, Ismael.  I agree that we need to figure
> > this
> > > > out before we accept this KIP.
> > > >
> > > > If MM1 is deprecated, then that means we are telling users they need to
> > > > migrate away from it as soon as they can.  I think that rules out
> > adding
> > > > big new features to MM1, unless those features relate towards
> > migrating to
> > > > MM2.  So we need to figure out if that's really what we want to do, or
> > if
> > > > we want to keep MM1 around for a while.  This is certainly relevant to
> > the
> > > > discussion in the KIP-712 thread -- right now, these KIPs contradict
> > each
> > > > other.
> > > >
> > > > It's also important that MM2 reaches feature parity with MM1 before
> > > > deprecating MM1.  Or if we can't reach feature parity, we should
> > explain
> > > > why the unsupported features are not needed going forward.  Do we have
> > a
> > > > list of all the gaps?
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Thu, Apr 1, 2021, at 09:44, Ismael Juma wrote:
> > > > > OK. :) Maybe something like:
> > > > >
> > > > > "We believe MirrorMaker 2 is an improvement over the original
> > MirrorMaker
> > > > > when it comes to reliability and functionality for the majority of
> > use
> > > > > cases. We intend to focus on MirrorMaker 2 for future development and
> > > > hence
> > > > > we propose deprecating MirrorMaker 2 for future removal."
> > > > >
> > > > > Is this accurate? How does it sound?
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Thu, Apr 1, 2021 at 9:10 AM Ryanne Dolan 
> > > > wrote:
> > > > >
> > > > > > Ah, do you mind wording it for me, Ismael? Or do you mean I should
> > just
> > > > > > remove the "MM1 is still useful" part?
> > > > > >
> > > > > > Ryanne
> > > > > >
> > > > > > On Thu, Apr 1, 2021, 11:01 AM Ismael Juma 
> > wrote:
> > > > > >
> > > > > > > Can we please add proper motivation? I'm -1 with the current
> > > > motivation
> > > > > > > even though I'm in favor of the change.
> > > > > > >
> > > > > > > On Thu, Apr 1, 2021, 8:46 AM Ryanne Dolan  > >
> > > > wrote:
> > > > > > >
> > > > > > > > Hey y'all, looks like we've got the requisite votes for this to
> > > > pass,
> > > > > > and
> > > > > > > > the various concerns wrt KIP-712 are now being discussed on
> > that
> > > > > > thread.
> > > > > > > So
> > > > > > > > I'm going to go ahead and c
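
For reference, a rough sketch of the kind of "identity" ReplicationPolicy being 
discussed (assuming the ReplicationPolicy interface from KIP-382 with 
formatRemoteTopic / topicSource / upstreamTopic; the policy that eventually ships 
out of the box may well differ):

import org.apache.kafka.connect.mirror.ReplicationPolicy;

// Replicates topics without renaming them, i.e. no "cluster." prefix.
public class IdentityLikeReplicationPolicy implements ReplicationPolicy {
    @Override
    public String formatRemoteTopic(String sourceClusterAlias, String topic) {
        return topic; // keep the original topic name on the target cluster
    }

    @Override
    public String topicSource(String topic) {
        return null; // the source cluster cannot be inferred from the name alone
    }

    @Override
    public String upstreamTopic(String topic) {
        return topic; // names are unchanged, so a topic is its own upstream topic
    }
}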

Re: [DISCUSS] KIP-618: Atomic commit of source connector records and offsets

2021-05-24 Thread Chris Egerton
Hi all,

Wanted to note here that I've updated the KIP document to include the
changes discussed recently. They're mostly located in the "Public
Interfaces" section. I suspect discussion hasn't concluded yet and there
will probably be a few more changes to come, but wanted to take the
opportunity to provide a snapshot of what the current design looks like.

Cheers,

Chris

On Fri, May 21, 2021 at 4:32 PM Chris Egerton  wrote:

> Hi Tom,
>
> Wow, I was way off base! I was thinking that the intent of the fencible
> producer was to employ it by default with 3.0, as opposed to only after the
> worker-level
> "exactly.once.source.enabled" property was flipped on. You are correct
> that with the case you were actually describing, there would be no
> heightened ACL requirements, and that it would leave room in the future for
> exactly-once to be disabled on a per-connector basis (as long as all the
> workers in the cluster already had "exactly.once.source.enabled" set to
> "true") with no worries about breaking changes.
>
> I agree that this is something for another KIP; even if we could squeeze
> it in in time for this release, it might be a bit much for new users to
> take in all at once. But I can add it to the doc as "future work" since
> it's a promising idea that could prove valuable to someone who might need
> per-connector granularity in the future.
>
> Thanks for clearing things up; in retrospect your comments make a lot more
> sense now, and I hope I've sufficiently addressed them by now.
>
> PSA for you and everyone else--I plan on updating the doc next week with
> the new APIs for connector-defined transaction boundaries,
> user-configurable transaction boundaries (i.e., poll vs. interval vs.
> connectors), and preflight checks for exactly-once validation (required vs.
> requested).
>
> Cheers,
>
> Chris
>
> On Fri, May 21, 2021 at 7:14 AM Tom Bentley  wrote:
>
>> Hi Chris,
>>
>> Thanks for continuing to entertain some of these ideas.
>>
>> On Fri, May 14, 2021 at 5:06 PM Chris Egerton > >
>> wrote:
>>
>> > [...]
>> >
>> That's true, but we do go from three static ACLs (write/describe on a
>> fixed
>> > transactional ID, and idempotent write on a fixed cluster) to a dynamic
>> > collection of ACLs.
>> >
>>
>> I'm not quite sure I follow, maybe I've lost track. To be clear, I was
>> suggesting the use of a 'fencing producer' only in clusters with
>> exactly.once.source.enabled=true where I imagined the key difference
>> between the exactly once and fencing cases was how the producer was
>> configured/used (transactional vs this new fencing semantic). I think the
>> ACL requirements for connector producer principals would therefore be the
>> same as currently described in the KIP. The same is true for the worker
>> principals (which is the only breaking change you give in the KIP). So I
>> don't think the fencing idea changes the backwards compatibility story
>> that's already in the KIP, just allows a safe per-connector
>> exactly.once=disabled option to be supported (with required as requested
>> as
>> we already discussed).
>>
>> But I'm wondering whether I've overlooked something.
>>
>> Ultimately I think it may behoove us to err on the side of reducing the
>> > breaking changes here for now and saving them for 4.0 (or some later
>> major
>> > release), but would be interested in thoughts from you and others.
>> >
>>
>> Difficult to answer (given I think I might be missing something).
>> If there are breaking changes then I don't disagree. It's difficult to
>> reason about big changes like this without some practical experience.
>> If there are not, then I think we could also implement the whole
>> exactly.once=disabled thing in a later KIP without additional breaking
>> changes (i.e. some time in 3.x), right?
>>
>>
>> > > Guozhang also has a (possible) use case for a fencing-only producer (
>> > https://issues.apache.org/jira/browse/KAFKA-12693), and as he points
>> out
>> > there, you should be able to get these semantics today by calling
>> > initTransactions() and then just using the producer as normal (no
>> > beginTransaction()/abortTransaction()/endTransaction()).
>> >
>> > I tested this locally and was not met with success; transactional
>> producers
>> > do a check right now to ensure that any calls to "KafkaProducer::send"
>> > occur within a transaction (see
>> >
>> >
>> https://github.com/apache/kafka/blob/29c55fdbbc331bbede17908ccc878953a1b15d87/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L957-L959
>> > and
>> >
>> >
>> https://github.com/apache/kafka/blob/29c55fdbbc331bbede17908ccc878953a1b15d87/clients/src/main/java/org/apache/kafka/clients/producer/internals/TransactionManager.java#L450-L451
>> > ).
>> > Not a blocker, just noting that we'd have to do some legwork to make
>> this
>> > workable with the producer API.
>> >
>>
>> Ah, sorry, I should have actually tried it rather than just taking a quick
>> look at the code.
>>
>> Rather than remove those s
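
For reference, a minimal sketch of the transactional call sequence that the check 
linked above enforces (topic name, transactional.id, and configs are assumptions for 
illustration): with a transactional.id configured, every send() must fall between 
beginTransaction() and commitTransaction()/abortTransaction().

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TransactionalSendSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("transactional.id", "my-transactional-id"); // assumption

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            // A send() issued at this point, before beginTransaction(), is rejected
            // by the check referenced in the links above.
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
            producer.commitTransaction();
        }
    }
}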

Re: [DISCUSS] KIP-739: Block Less on KafkaProducer#send

2021-05-24 Thread Colin McCabe
Hi all,

I agree that we should give users the option of having a fully async API, but I 
don't think external thread pools or queues are the right direction to go here. 
They add performance overheads and don't address the root causes of the problem.

There are basically two scenarios where we block, currently. One is when we are 
doing a metadata fetch. I think this is clearly a bug, or at least an 
implementation limitation. From the user's point of view, the fact that we are 
doing a metadata fetch is an implementation detail that really shouldn't be 
exposed like this. We have talked about fixing this in the past. I think we 
just should spend the time to do it.

The second scenario is where the client has produced too much data in too 
little time. This could happen if there is a network glitch, or the server is 
slower than expected. In this case, the behavior is intentional and not a bug. 
To understand this, think about what would happen if we didn't block. We would 
start buffering more and more data in memory, until finally the application 
died with an out of memory error. That would be frustrating for users and 
wouldn't add to the usability of Kafka.

We could potentially have an option to handle the out-of-memory scenario 
differently by returning an error code immediately rather than blocking. 
Applications would have to be rewritten to handle this properly, but it is a 
possibility. I suspect that most of them wouldn't use this, but we could offer 
it as a possibility for async purists (which might include certain frameworks). 
The big problem the users would have to solve is what to do with the record 
that they were unable to produce due to the buffer full issue.

best,
Colin
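
For concreteness, a small sketch of how the second scenario is bounded today 
(values are illustrative assumptions, not recommendations): buffer.memory caps how 
much unsent data the producer holds in memory, and max.block.ms caps how long 
send() may block waiting for metadata or for buffer space.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class BoundedBlockingProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("buffer.memory", "33554432"); // 32 MB of buffered, unsent records
        props.put("max.block.ms", "5000");      // give up after 5s instead of blocking forever

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // If the buffer stays full (or metadata is unavailable) for more than
            // max.block.ms, send() throws a TimeoutException rather than blocking.
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}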


On Thu, May 20, 2021, at 10:35, Nakamura wrote:
> >
> > My suggestion was just do this in multiple steps/phases, firstly let's fix
> > the issue of send being misleadingly asynchronous (i.e. internally its
> > blocking) and then later one we can make the various
> > threadpools configurable with a sane default.
> 
> I like that approach. I updated the "Which thread should be responsible for
> waiting" part of KIP-739 to add your suggestion as my recommended approach,
> thank you!  If no one else has major concerns about that approach, I'll
> move the alternatives to "rejected alternatives".
> 
> On Thu, May 20, 2021 at 7:26 AM Matthew de Detrich
>  wrote:
> 
> > @
> >
> > Nakamura
> > On Wed, May 19, 2021 at 7:35 PM Nakamura  wrote:
> >
> > > @Ryanne:
> > > In my mind's eye I slightly prefer throwing the "cannot enqueue"
> > > exception to satisfying the future immediately with the "cannot enqueue"
> > > exception?  But I agree, it would be worth doing more research.
> > >
> > > @Matthew:
> > >
> > > > 3. Using multiple thread pools is definitely recommended for different
> > > > types of tasks, for serialization which is CPU bound you definitely
> > would
> > > > want to use a bounded thread pool that is fixed by the number of CPU's
> > > (or
> > > > something along those lines).
> > > > https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c is
> > a
> > > > very good guide on this topic
> > > I think this guide is good in general, but I would be hesitant to follow
> > > its guidance re: offloading serialization without benchmarking it.  My
> > > understanding is that context-switches have gotten much cheaper, and that
> > > gains from cache locality are small, but they're not nothing.  Especially
> > > if the workload has a very small serialization cost, I wouldn't be
> > shocked
> > > if it made it slower.  I feel pretty strongly that we should do more
> > > research here before unconditionally encouraging serialization in a
> > > threadpool.  If people think it's important to do it here (eg if we think
> > > it would mean another big API change) then we should start thinking about
> > > what benchmarking we can do to gain higher confidence in this kind of
> > > change.  However, I don't think it would change semantics as
> > substantially
> > > as we're proposing here, so I would vote for pushing this to a subsequent
> > > KIP.
> > >
> > Of course, its all down to benchmarking, benchmarking and benchmarking.
> > Ideally speaking you want to use all of the resources available to you, so
> > if you have a bottleneck in serialization and you have many cores free then
> > using multiple cores may be more appropriate than a single core. Typically
> > I would expect that using a single thread to do serialization is likely to
> > be the most situation, I was just responding to an earlier point that was
> > made in regards to using ThreadPools for serialization (note that you can
> > also just use a ThreadPool that is pinned to a single thread)
> >
> >
> >
> > >
> > > > 4. Regarding providing the ability for users to supply their own custom
> > > > ThreadPool this is more of an ergonomics question for the API.
> > Especially
> > > > when it gets to monitoring/tracing, giving the ability for users to
> 

Re: [DISCUSS] KIP-655: Windowed "Distinct" Operation for KStream

2021-05-24 Thread John Roesler
Hey there, Ivan!

In typical fashion, I'm going to make a somewhat outlandish
proposal. I'm hoping that we can side-step some of the
complications that have arisen. Please bear with me.

It seems like `distinct()` is not fundamentally unlike other windowed
"aggregation" operations. Your concern about unnecessary
repartitioning seems to apply just as well to `count()` as to `distinct()`.
This has come up before, but I don't remember when: what if we
introduce a parameter to `selectKey()` that specifies that the caller
asserts that the new key does _not_ change the data partitioning?
The docs on that parameter would of course spell out all the "rights
and responsibilities" of setting it.

In that case, we could indeed get back to
`selectKey(A).windowBy(B).distinct(...)`, where we get to compose the
key mapper and the windowing function without having to carve out
a separate domain just for `distinct()`. All the rest of the KStream
operations would also benefit.

What do you think?

Thanks,
John

On Sun, May 23, 2021, at 08:09, Ivan Ponomarev wrote:
> Hello everyone,
> 
> let me revive the discussion for KIP-655. Now I have some time again and 
> I'm eager to finalize this.
> 
> Based on what was already discussed, I think that we can split the 
> discussion into three topics for our convenience.
> 
> The three topics are:
> 
> - idExtractor  (how should we extract the deduplication key for the record)
> 
> - timeWindows (what time windows should we use)
> 
> - miscellaneous (naming etc.)
> 
>  idExtractor 
> 
> Original proposal: use (k, v) -> f(k, v) mapper, defaulting to (k, v) -> 
> k.  The drawback here is that we must warn the user to choose such a 
> function that sets different IDs for records from different partitions, 
> otherwise same IDs might be not co-partitioned (and not deduplicated as 
> a result). Additional concern: what should we do when this function 
> returns null?
> 
> Matthias proposed key-only deduplication: that is, no idExtractor at 
> all, and if we want to use `distinct` for a particular identifier, we 
> must `selectKey()` before. The drawback of this approach is that we will 
> always have repartitioning after the key selection, while in practice 
> repartitioning will not always be necessary (for example, when the data 
> stream is such that different values infer different keys).
> 
> So here we have a 'safety vs. performance' trade-off. But 'safe' variant 
> is also not very convenient for developers, since we're forcing them to 
> change the structure of their records.
> 
> A 'golden mean' here might be using composite ID with its first 
> component equals to k and its second component equals to some f(v) (f 
> defaults to v -> null, and null value returned by f(v) means 
> 'deduplicate by the key only'). The nuance here is that we will have 
> serializers only for types of k and f(v), and we must correctly 
> serialize a tuple (k, f(v)), but of course this is doable.
> 
> What do you think?
> 
>  timeWindows 
> 
> Originally I proposed TimeWindows only just because they solved my 
> particular case :-) but agree with Matthias' and Sophie's objections.
> 
> I like the Sophie's point: we need both epoch-aligned and data-aligned 
> windows. IMO this is absolutely correct: "data-aligned is useful for 
> example when you know that a large number of updates to a single key 
> will occur in short bursts, and epoch-aligned when you specifically want 
> to get just a single update per discrete time interval."
> 
> I just cannot agree right away with Sophie's 
> .groupByKey().windowedBy(...).distinct() proposal, as it implies  the 
> key-only deduplication -- see the previous topic.
> 
> Epoch-aligned windows are very simple: they should forward only one 
> record per enumerated time window. TimeWindows are exactly what we want 
> here. I mentioned in the KIP both tumbling and hopping windows just 
> because both are possible for TimeWindows, but indeed I don't see any 
> real use case for hopping windows, only tumbling windows make sense IMO.
> 
> For data-aligned windows SlidingWindow interface seems to be a nearly 
> valid choice. Nearly. It should forward a record once when it's first 
> seen, and then not again for any identical records that fall into the 
> next N timeUnits.  However, we cannot reuse SlidingWindow as is, because 
> just as Matthias noted, SlidingWindows go backward in time, while we 
> need a windows that go forward in time, and are not opened while records 
> fall into an already existing window. We definitely should make our own 
> implementation, maybe we should call it ExpirationWindow? WDYT?
> 
> 
>  miscellaneous 
> 
> Persistent/in-memory stores. Matthias proposed to pass Materialized 
> parameter next to DistinctParameters (and this is necessary, because we 
> will need to provide a serializer for extracted id). This is absolutely 
> valid point, I agree and I will fix it in the KIP.
> 
> Naming. Sophie noted that the Streams DS
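
For background, a rough sketch of how key-based, time-bounded deduplication can be 
written today with the existing Transformer API and a window store (roughly the 
boilerplate the proposed distinct() operator would encapsulate); store/topic names 
and the 10-minute window below are illustrative assumptions:

import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.Stores;
import org.apache.kafka.streams.state.WindowStore;
import org.apache.kafka.streams.state.WindowStoreIterator;

public class DistinctSketch {
    static final Duration WINDOW = Duration.ofMinutes(10);

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        builder.addStateStore(Stores.windowStoreBuilder(
            Stores.persistentWindowStore("dedup", WINDOW, WINDOW, false),
            Serdes.String(), Serdes.Long()));

        KStream<String, String> input =
            builder.stream("input", Consumed.with(Serdes.String(), Serdes.String()));

        input.transform(() -> new Transformer<String, String, KeyValue<String, String>>() {
            private ProcessorContext context;
            private WindowStore<String, Long> store;

            @Override
            @SuppressWarnings("unchecked")
            public void init(ProcessorContext context) {
                this.context = context;
                this.store = (WindowStore<String, Long>) context.getStateStore("dedup");
            }

            @Override
            public KeyValue<String, String> transform(String key, String value) {
                long now = context.timestamp();
                try (WindowStoreIterator<Long> seen =
                         store.fetch(key, now - WINDOW.toMillis(), now)) {
                    if (seen.hasNext()) {
                        return null; // duplicate key within the window: drop it
                    }
                }
                store.put(key, now, now); // remember the first occurrence
                return KeyValue.pair(key, value);
            }

            @Override
            public void close() { }
        }, "dedup").to("deduped", Produced.with(Serdes.String(), Serdes.String()));
    }
}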

[VOTE] KIP-741: Change default serde to be null

2021-05-24 Thread Leah Thomas
Hi,

I'd like to kick-off voting for KIP-741: Change default serde to be null.

The
discussion is linked on the KIP for context.

Cheers,
Leah


Re: [VOTE] KIP-741: Change default serde to be null

2021-05-24 Thread Guozhang Wang
+1, thanks!

On Mon, May 24, 2021 at 1:51 PM Leah Thomas 
wrote:

> Hi,
>
> I'd like to kick-off voting for KIP-741: Change default serde to be null.
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-741%3A+Change+default+serde+to+be+null
> >
> The
> discussion is linked on the KIP for context.
>
> Cheers,
> Leah
>


-- 
-- Guozhang


Re: [VOTE] KIP-741: Change default serde to be null

2021-05-24 Thread Walker Carlson
+1 (non-binding) from me, Leah

Walker

On Mon, May 24, 2021 at 1:51 PM Leah Thomas 
wrote:

> Hi,
>
> I'd like to kick-off voting for KIP-741: Change default serde to be null.
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-741%3A+Change+default+serde+to+be+null
> >
> The
> discussion is linked on the KIP for context.
>
> Cheers,
> Leah
>


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #159

2021-05-24 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 341524 lines...]
[2021-05-24T19:36:26.090Z] > Task :connect:api:compileJava UP-TO-DATE
[2021-05-24T19:36:26.090Z] > Task :connect:api:classes UP-TO-DATE
[2021-05-24T19:36:26.090Z] > Task :streams:compileJava UP-TO-DATE
[2021-05-24T19:36:26.090Z] > Task :streams:classes UP-TO-DATE
[2021-05-24T19:36:26.090Z] > Task :raft:compileJava UP-TO-DATE
[2021-05-24T19:36:26.090Z] > Task :storage:compileJava UP-TO-DATE
[2021-05-24T19:36:26.090Z] > Task :connect:json:compileJava UP-TO-DATE
[2021-05-24T19:36:26.090Z] > Task :connect:json:classes UP-TO-DATE
[2021-05-24T19:36:26.090Z] > Task :connect:json:javadoc SKIPPED
[2021-05-24T19:36:26.090Z] > Task 
:clients:generateMetadataFileForMavenJavaPublication
[2021-05-24T19:36:26.090Z] > Task :connect:json:javadocJar
[2021-05-24T19:36:27.042Z] > Task :streams:test-utils:compileJava UP-TO-DATE
[2021-05-24T19:36:27.042Z] > Task :metadata:compileJava UP-TO-DATE
[2021-05-24T19:36:27.042Z] > Task :core:compileJava NO-SOURCE
[2021-05-24T19:36:27.042Z] > Task :streams:copyDependantLibs
[2021-05-24T19:36:27.042Z] > Task :streams:jar UP-TO-DATE
[2021-05-24T19:36:27.042Z] > Task 
:streams:generateMetadataFileForMavenJavaPublication
[2021-05-24T19:36:31.694Z] > Task :connect:api:javadoc
[2021-05-24T19:36:31.694Z] > Task :connect:api:javadocJar
[2021-05-24T19:36:33.880Z] > Task :streams:javadoc
[2021-05-24T19:36:33.880Z] > Task :streams:javadocJar
[2021-05-24T19:36:35.931Z] > Task :clients:javadoc
[2021-05-24T19:36:35.931Z] > Task :clients:javadocJar
[2021-05-24T19:36:35.931Z] > Task :clients:processTestMessages UP-TO-DATE
[2021-05-24T19:36:35.931Z] > Task :clients:compileTestJava UP-TO-DATE
[2021-05-24T19:36:35.931Z] > Task :clients:testClasses UP-TO-DATE
[2021-05-24T19:36:36.883Z] > Task :connect:api:copyDependantLibs UP-TO-DATE
[2021-05-24T19:36:36.883Z] > Task :connect:api:jar UP-TO-DATE
[2021-05-24T19:36:36.883Z] > Task 
:connect:api:generateMetadataFileForMavenJavaPublication
[2021-05-24T19:36:36.883Z] > Task :connect:json:compileTestJava UP-TO-DATE
[2021-05-24T19:36:36.883Z] > Task :connect:api:compileTestJava UP-TO-DATE
[2021-05-24T19:36:36.883Z] > Task :connect:api:testClasses UP-TO-DATE
[2021-05-24T19:36:36.883Z] > Task :connect:json:copyDependantLibs UP-TO-DATE
[2021-05-24T19:36:36.883Z] > Task :connect:json:jar UP-TO-DATE
[2021-05-24T19:36:36.883Z] > Task :connect:api:testJar
[2021-05-24T19:36:36.883Z] > Task 
:connect:json:generateMetadataFileForMavenJavaPublication
[2021-05-24T19:36:36.883Z] > Task :connect:json:testClasses UP-TO-DATE
[2021-05-24T19:36:36.883Z] > Task :connect:json:testJar
[2021-05-24T19:36:36.883Z] > Task :connect:api:testSrcJar
[2021-05-24T19:36:36.883Z] > Task :connect:json:testSrcJar
[2021-05-24T19:36:36.883Z] > Task 
:connect:json:publishMavenJavaPublicationToMavenLocal
[2021-05-24T19:36:36.883Z] > Task 
:connect:api:publishMavenJavaPublicationToMavenLocal
[2021-05-24T19:36:36.883Z] > Task :connect:json:publishToMavenLocal
[2021-05-24T19:36:36.883Z] > Task :connect:api:publishToMavenLocal
[2021-05-24T19:36:36.883Z] > Task :clients:testJar
[2021-05-24T19:36:38.108Z] > Task :clients:testSrcJar
[2021-05-24T19:36:38.108Z] > Task 
:clients:publishMavenJavaPublicationToMavenLocal
[2021-05-24T19:36:38.108Z] > Task :clients:publishToMavenLocal
[2021-05-24T19:36:57.846Z] > Task :core:compileScala
[2021-05-24T19:38:11.130Z] > Task :core:classes
[2021-05-24T19:38:11.130Z] > Task :core:compileTestJava NO-SOURCE
[2021-05-24T19:38:34.135Z] > Task :core:compileTestScala
[2021-05-24T19:39:28.839Z] > Task :core:testClasses
[2021-05-24T19:39:57.890Z] > Task :streams:compileTestJava
[2021-05-24T19:41:01.466Z] > Task :streams:testClasses
[2021-05-24T19:41:01.466Z] > Task :streams:testJar
[2021-05-24T19:41:01.466Z] > Task :streams:testSrcJar
[2021-05-24T19:41:01.466Z] > Task 
:streams:publishMavenJavaPublicationToMavenLocal
[2021-05-24T19:41:01.466Z] > Task :streams:publishToMavenLocal
[2021-05-24T19:41:01.466Z] 
[2021-05-24T19:41:01.466Z] Deprecated Gradle features were used in this build, 
making it incompatible with Gradle 8.0.
[2021-05-24T19:41:01.466Z] Use '--warning-mode all' to show the individual 
deprecation warnings.
[2021-05-24T19:41:01.466Z] See 
https://docs.gradle.org/7.0.2/userguide/command_line_interface.html#sec:command_line_warnings
[2021-05-24T19:41:01.466Z] 
[2021-05-24T19:41:01.466Z] Execution optimizations have been disabled for 2 
invalid unit(s) of work during this build to ensure correctness.
[2021-05-24T19:41:01.466Z] Please consult deprecation warnings for more details.
[2021-05-24T19:41:01.466Z] 
[2021-05-24T19:41:01.466Z] BUILD SUCCESSFUL in 4m 51s
[2021-05-24T19:41:01.466Z] 71 actionable tasks: 37 executed, 34 up-to-date
[Pipeline] sh
[2021-05-24T19:41:04.800Z] + grep ^version= gradle.properties
[2021-05-24T19:41:04.800Z] + cut -d= -f 2
[Pipeline] dir
[2021-05-24T19:41:05.700Z] Running in 

[jira] [Created] (KAFKA-12843) KIP-740 follow up: clean up TaskMetadata

2021-05-24 Thread A. Sophie Blee-Goldman (Jira)
A. Sophie Blee-Goldman created KAFKA-12843:
--

 Summary: KIP-740 follow up: clean up TaskMetadata
 Key: KAFKA-12843
 URL: https://issues.apache.org/jira/browse/KAFKA-12843
 Project: Kafka
  Issue Type: Task
  Components: streams
Reporter: A. Sophie Blee-Goldman
 Fix For: 4.0.0


See 
[KIP-740|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181306557]
 – for the TaskMetadata class, we need to:
 # Deprecate the TaskMetadata#getTaskId method
 # "Remove" the deprecated TaskMetadata#taskId method, then re-add a taskId() 
API that returns a TaskId instead of a String
 # Remove the deprecated constructor



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-12844) KIP-740 follow up: clean up TaskId

2021-05-24 Thread A. Sophie Blee-Goldman (Jira)
A. Sophie Blee-Goldman created KAFKA-12844:
--

 Summary: KIP-740 follow up: clean up TaskId
 Key: KAFKA-12844
 URL: https://issues.apache.org/jira/browse/KAFKA-12844
 Project: Kafka
  Issue Type: Task
  Components: streams
Reporter: A. Sophie Blee-Goldman
 Fix For: 4.0.0


See 
[KIP-740|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181306557]
 – for the TaskId class, we need to remove the following deprecated APIs:
 # The public partition and topicGroupId fields should be "removed", ie made 
private (can also now rename topicGroupId to subtopology to match the getter)
 # The two #readFrom and two #writeTo methods can be removed (they have already 
been converted to internal utility methods we now use instead, so just remove 
them)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] KIP-741: Change default serde to be null

2021-05-24 Thread Sophie Blee-Goldman
+1 binding

thanks for the KIP
-Sophie

On Mon, May 24, 2021 at 2:02 PM Walker Carlson
 wrote:

> +1 (non-binding) from me, Leah
>
> Walker
>
> On Mon, May 24, 2021 at 1:51 PM Leah Thomas 
> wrote:
>
> > Hi,
> >
> > I'd like to kick-off voting for KIP-741: Change default serde to be null.
> > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-741%3A+Change+default+serde+to+be+null
> > >
> > The
> > discussion is linked on the KIP for context.
> >
> > Cheers,
> > Leah
> >
>


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #160

2021-05-24 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-741: Change default serde to be null

2021-05-24 Thread John Roesler
+1 (binding) from me. Thanks for the KIP!
-John

On Mon, May 24, 2021, at 18:10, Sophie Blee-Goldman wrote:
> +1 binding
> 
> thanks for the KIP
> -Sophie
> 
> On Mon, May 24, 2021 at 2:02 PM Walker Carlson
>  wrote:
> 
> > +1 (non-binding) from me, Leah
> >
> > Walker
> >
> > On Mon, May 24, 2021 at 1:51 PM Leah Thomas 
> > wrote:
> >
> > > Hi,
> > >
> > > I'd like to kick-off voting for KIP-741: Change default serde to be null.
> > > <
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-741%3A+Change+default+serde+to+be+null
> > > >
> > > The
> > > discussion is linked on the KIP for context.
> > >
> > > Cheers,
> > > Leah
> > >
> >
> 


Re: [VOTE] KIP-743: Remove config value 0.10.0-2.4 of Streams built-in metrics version config

2021-05-24 Thread John Roesler
+1 (binding)

Thanks, Bruno!
-John

On Fri, May 21, 2021, at 10:58, Sophie Blee-Goldman wrote:
> +1 (binding)
> 
> Thanks Bruno!
> 
> On Fri, May 21, 2021 at 8:06 AM Guozhang Wang  wrote:
> 
> > +1, thanks!
> >
> > On Fri, May 21, 2021 at 12:45 AM Bruno Cadonna  wrote:
> >
> > > Hi,
> > >
> > > I'd like to start a vote on KIP-743 that proposes to remove config value
> > > 0.10.0-2.4 from Streams config built.in.metrics.version.
> > >
> > > https://cwiki.apache.org/confluence/x/uIfOCg
> > >
> > > Best,
> > > Bruno
> > >
> >
> >
> > --
> > -- Guozhang
> >
> 


[jira] [Created] (KAFKA-12845) Rollback change which requires join key to be non null on KStream->GlobalKTable

2021-05-24 Thread Pedro Gontijo (Jira)
Pedro Gontijo created KAFKA-12845:
-

 Summary: Rollback change which requires join key to be non null on 
KStream->GlobalKTable
 Key: KAFKA-12845
 URL: https://issues.apache.org/jira/browse/KAFKA-12845
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 2.7.0
Reporter: Pedro Gontijo


As part of [KAFKA-10277|https://issues.apache.org/jira/browse/KAFKA-10277] the 
behavior of KStream->GlobalKTable joins was changed to require non-null join 
keys.

But it seems reasonable that not every record will have an existing 
relationship (and hence a join key) with the joined GlobalKTable. Think about 
User>Car for instance, or PageView>Product. An empty/zero key could be returned 
by the KeyMapper, but that forces a totally unnecessary lookup in the store.

Requiring a non-null key does not seem to make sense for any GlobalKTable join 
(inner or left), but for a left join it seems even stranger.
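
For illustration, a sketch of the use case described above (topic names and the key 
extraction are assumptions): a stream-to-GlobalKTable left join whose key mapper may 
legitimately produce no join key for some records.

{code:java}
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;

public class PageViewProductJoin {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> pageViews =
            builder.stream("page-views", Consumed.with(Serdes.String(), Serdes.String()));
        GlobalKTable<String, String> products =
            builder.globalTable("products", Consumed.with(Serdes.String(), Serdes.String()));

        pageViews.leftJoin(
            products,
            (viewKey, viewValue) -> extractProductId(viewValue), // may yield no product reference
            (viewValue, product) -> viewValue + " -> " + (product == null ? "none" : product));
    }

    // Hypothetical helper: returns the referenced product id, or null when the view
    // has none (the case this ticket argues should be allowed, at least for left joins).
    private static String extractProductId(String viewValue) {
        int idx = viewValue == null ? -1 : viewValue.indexOf(':');
        return idx >= 0 ? viewValue.substring(idx + 1) : null;
    }
}
{code}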

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)