Build failed in Jenkins: kafka-trunk-jdk11 #1478

2020-05-19 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10010: Should make state store registration idempotent (#8681)


--
[...truncated 1.08 MB...]

org.apache.kafka.common.network.SslTransportLayerTest > 
testClientAuthenticationRequiredNotProvided[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testClientAuthenticationRequiredNotProvided[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testGracefulRemoteCloseDuringHandshakeWrite[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testGracefulRemoteCloseDuringHandshakeWrite[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testClientAuthenticationRequestedNotProvided[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testClientAuthenticationRequestedNotProvided[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testIOExceptionsDuringHandshakeWrite[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testIOExceptionsDuringHandshakeWrite[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testInvalidKeystorePassword[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testInvalidKeystorePassword[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testClientAuthenticationDisabledNotProvided[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testClientAuthenticationDisabledNotProvided[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testCustomClientSslEngineFactory[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testCustomClientSslEngineFactory[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testValidEndpointIdentificationSanDns[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testValidEndpointIdentificationSanDns[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testEndpointIdentificationNoReverseLookup[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testEndpointIdentificationNoReverseLookup[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testUngracefulRemoteCloseDuringHandshakeWrite[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testUngracefulRemoteCloseDuringHandshakeWrite[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testGracefulRemoteCloseDuringHandshakeRead[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testGracefulRemoteCloseDuringHandshakeRead[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testInvalidSecureRandomImplementation[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testInvalidSecureRandomImplementation[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testInvalidEndpointIdentification[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testInvalidEndpointIdentification[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testValidEndpointIdentificationSanIp[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testValidEndpointIdentificationSanIp[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testEndpointIdentificationDisabled[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testEndpointIdentificationDisabled[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testInterBrokerSslConfigValidation[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testInterBrokerSslConfigValidation[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testServerTruststoreDynamicUpdate[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testServerTruststoreDynamicUpdate[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testInvalidSslEngineFactory[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testInvalidSslEngineFactory[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTransportLayerTest > 
testNullTruststorePassword[tlsProtocol=TLSv1.3] STARTED

org.apache.kafka.common.network.SslTransportLayerTest > 
testNullTruststorePassword[tlsProtocol=TLSv1.3] PASSED

org.apache.kafka.common.network.SslTr

Build failed in Jenkins: kafka-trunk-jdk11 #1477

2020-05-19 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10011: Remove task id from lockedTaskDirectories during


--
[...truncated 4.99 MB...]

kafka.controller.ReplicaStateMachineTest > 
testReplicaDeletionIneligibleToOnlineReplicaTransition PASSED

kafka.controller.ReplicaStateMachineTest > 
testInvalidReplicaDeletionStartedToNewReplicaTransition STARTED

kafka.controller.ReplicaStateMachineTest > 
testInvalidReplicaDeletionStartedToNewReplicaTransition PASSED

kafka.controller.ReplicaStateMachineTest > 
testInvalidNonexistentReplicaToReplicaDeletionSuccessfulTransition STARTED

kafka.controller.ReplicaStateMachineTest > 
testInvalidNonexistentReplicaToReplicaDeletionSuccessfulTransition PASSED

kafka.controller.ReplicaStateMachineTest > 
testNewReplicaToOnlineReplicaTransition STARTED

kafka.controller.ReplicaStateMachineTest > 
testNewReplicaToOnlineReplicaTransition PASSED

kafka.controller.ControllerIntegrationTest > 
testControllerDetectsBouncedBrokers STARTED

kafka.controller.ControllerIntegrationTest > 
testControllerDetectsBouncedBrokers PASSED

kafka.controller.ControllerIntegrationTest > testControlledShutdown STARTED

kafka.controller.ControllerIntegrationTest > testControlledShutdown PASSED

kafka.controller.ControllerIntegrationTest > 
testPartitionReassignmentWithOfflineReplicaHaltingProgress STARTED

kafka.controller.ControllerIntegrationTest > 
testPartitionReassignmentWithOfflineReplicaHaltingProgress PASSED

kafka.controller.ControllerIntegrationTest > 
testControllerEpochPersistsWhenAllBrokersDown STARTED

kafka.controller.ControllerIntegrationTest > 
testControllerEpochPersistsWhenAllBrokersDown PASSED

kafka.controller.ControllerIntegrationTest > 
testTopicCreationWithOfflineReplica STARTED

kafka.controller.ControllerIntegrationTest > 
testTopicCreationWithOfflineReplica PASSED

kafka.controller.ControllerIntegrationTest > 
testPartitionReassignmentResumesAfterReplicaComesOnline STARTED

kafka.controller.ControllerIntegrationTest > 
testPartitionReassignmentResumesAfterReplicaComesOnline PASSED

kafka.controller.ControllerIntegrationTest > 
testLeaderAndIsrWhenEntireIsrOfflineAndUncleanLeaderElectionDisabled STARTED

kafka.controller.ControllerIntegrationTest > 
testLeaderAndIsrWhenEntireIsrOfflineAndUncleanLeaderElectionDisabled PASSED

kafka.controller.ControllerIntegrationTest > 
testTopicPartitionExpansionWithOfflineReplica STARTED

kafka.controller.ControllerIntegrationTest > 
testTopicPartitionExpansionWithOfflineReplica PASSED

kafka.controller.ControllerIntegrationTest > 
testPreferredReplicaLeaderElectionWithOfflinePreferredReplica STARTED

kafka.controller.ControllerIntegrationTest > 
testPreferredReplicaLeaderElectionWithOfflinePreferredReplica PASSED

kafka.controller.ControllerIntegrationTest > 
testMetadataPropagationOnControlPlane STARTED

kafka.controller.ControllerIntegrationTest > 
testMetadataPropagationOnControlPlane PASSED

kafka.controller.ControllerIntegrationTest > 
testAutoPreferredReplicaLeaderElection STARTED

kafka.controller.ControllerIntegrationTest > 
testAutoPreferredReplicaLeaderElection PASSED

kafka.controller.ControllerIntegrationTest > testTopicCreation STARTED

kafka.controller.ControllerIntegrationTest > testTopicCreation PASSED

kafka.controller.ControllerIntegrationTest > testControllerMoveOnTopicDeletion 
STARTED

kafka.controller.ControllerIntegrationTest > testControllerMoveOnTopicDeletion 
PASSED

kafka.controller.ControllerIntegrationTest > testPartitionReassignment STARTED

kafka.controller.ControllerIntegrationTest > testPartitionReassignment PASSED

kafka.controller.ControllerIntegrationTest > testTopicPartitionExpansion STARTED

kafka.controller.ControllerIntegrationTest > testTopicPartitionExpansion PASSED

kafka.controller.ControllerIntegrationTest > 
testControllerMoveIncrementsControllerEpoch STARTED

kafka.controller.ControllerIntegrationTest > 
testControllerMoveIncrementsControllerEpoch PASSED

kafka.controller.ControllerIntegrationTest > 
testLeaderAndIsrWhenEntireIsrOfflineAndUncleanLeaderElectionEnabled STARTED

kafka.controller.ControllerIntegrationTest > 
testLeaderAndIsrWhenEntireIsrOfflineAndUncleanLeaderElectionEnabled PASSED

kafka.controller.ControllerIntegrationTest > 
testControllerMoveOnPartitionReassignment STARTED

kafka.controller.ControllerIntegrationTest > 
testControllerMoveOnPartitionReassignment PASSED

kafka.controller.ControllerIntegrationTest > testControllerMoveOnTopicCreation 
STARTED

kafka.controller.ControllerIntegrationTest > testControllerMoveOnTopicCreation 
PASSED

kafka.controller.ControllerIntegrationTest > 
testControllerRejectControlledShutdownRequestWithStaleBrokerEpoch STARTED

kafka.controller.ControllerIntegrationTest > 
testControllerRejectControlledShutdownRequestWithStaleBrokerEpoch PASSED

kafka.controller.ControllerIntegrationTest > 
testBackToBackPreferredReplicaLeaderElections STA

Re: [VOTE] KIP-613: Add end-to-end latency metrics to Streams

2020-05-19 Thread Sophie Blee-Goldman
We can conclude the vote on this: the KIP passes with three
 +1 (binding) votes from Guozhang, Bill, and John.

Thanks all for the great discussion and feedback!

On Tue, May 19, 2020 at 3:34 PM Bill Bejeck  wrote:

> Thanks for the KIP, I'm a +1 (binding) on the proposal.
>
> -Bill
>
> On Tue, May 19, 2020 at 3:02 PM Guozhang Wang  wrote:
>
> > +1 on the updated proposal. Thanks Sophie.
> >
> > On Tue, May 19, 2020 at 11:51 AM John Roesler 
> wrote:
> >
> > > Thanks for the KIP, Sophie.
> > >
> > > I'm +1 (binding) on the current proposal.
> > >
> > > Thanks,
> > > -John
> > >
> > > On Fri, May 15, 2020, at 19:45, Guozhang Wang wrote:
> > > > Hi Sophie,
> > > >
> > > > Sorry I was a bit late on following up the DISCUSS thread. I still
> have
> > > > some comment about the processor-node level metrics and left a reply
> on
> > > > that thread.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > > On Fri, May 15, 2020 at 2:34 PM Sophie Blee-Goldman <
> > sop...@confluent.io
> > > >
> > > > wrote:
> > > >
> > > > > Hey all,
> > > > >
> > > > > I think we've reached a general consensus on all the
> > non-implementation
> > > > > details of KIP-613 so I'd like to call for a vote.
> > > > >
> > > > >
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-613%3A+Add+end-to-end+latency+metrics+to+Streams
> > > > >
> > > > > Thanks!
> > > > > Sophie
> > > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
>


Re: [VOTE] KIP-601: Configurable socket connection timeout in NetworkClient

2020-05-19 Thread Cheng Tan
Dear Colin,


Thanks for the reply. Your reasoning makes sense. I’ve modified KIP-601 with 
two changes:

1. Now KIP-601 is proposing an exponential connection setup timeout, which is 
controlled by socket.connections.setup.timeout.ms (initial value) and 
socket.connections.setup.timeout.max.ms (max value). A rough sketch follows below 
the second point.

2. The logic optimization in leastLoadedNode(), which I would like to discuss 
further. In the scenario where no connected or connecting node exists, instead of 
providing the node with the fewest failed attempts, the NetworkClient can provide 
the least recently used node that respects the reconnect backoff. The existing 
property ClusterConnectionStates.NodeConnectionState.lastConnectAttemptMs can 
help us pick the LRU node conveniently. Does this make sense to you?
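
For illustration, here is a rough sketch of the exponential setup timeout in point 
1, assuming the timeout simply doubles after each failed attempt, is capped at the 
max value, and resets once a connection succeeds. The config names come from the 
KIP; the doubling factor and reset behavior are only assumptions of this sketch, 
not necessarily the final implementation:

    // Sketch only: exponential connection setup timeout, capped at a maximum.
    public class ConnectionSetupTimeoutSketch {
        private final long initialMs; // socket.connections.setup.timeout.ms
        private final long maxMs;     // socket.connections.setup.timeout.max.ms
        private int consecutiveFailures = 0;

        public ConnectionSetupTimeoutSketch(long initialMs, long maxMs) {
            this.initialMs = initialMs;
            this.maxMs = maxMs;
        }

        // Timeout to apply to the next connection attempt.
        public long nextTimeoutMs() {
            double candidate = initialMs * Math.pow(2, consecutiveFailures);
            return (long) Math.min(candidate, maxMs);
        }

        public void onConnectionFailure() {
            consecutiveFailures++;
        }

        public void onConnectionSuccess() {
            consecutiveFailures = 0;
        }
    }

For example, with initialMs = 10_000 and maxMs = 127_000 (arbitrary example values) 
the successive timeouts would be 10s, 20s, 40s, 80s, and then 127s for every 
further attempt.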

Please let me know what you think. Thanks.


Best, - Cheng Tan



> On May 19, 2020, at 1:44 PM, Colin McCabe  wrote:
> 
> It seems like this analysis is assuming that the only reason to wait longer 
> is so that we can send another SYN packet.  This may not be the case-- 
> perhaps waiting longer would allow us to receive an ACK from the remote end 
> that has been delayed for some reason while going through the network.
> 
> We also probably don't want our expiration time period to line up exactly 
> with Linux's retries.  If it did, we would cut off the connection attempt 
> just as we were re-sending another SYN.
> 
> Also, there are other OSes besides Linux, and other configurations besides 
> the default one.
> 
> So, on the whole, I don't think we need to make the default a power of 2.
> 
> best,
> Colin
> 



Build failed in Jenkins: kafka-trunk-jdk8 #4545

2020-05-19 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10011: Remove task id from lockedTaskDirectories during

[github] KAFKA-10010: Should make state store registration idempotent (#8681)


--
[...truncated 6.15 MB...]

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTimeDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTimeDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessRecordForTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessRecordForTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldForwardRecordsFromSubtopologyToSubtopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldForwardRecordsFromSubtopologyToSubtopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForSmallerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForSmallerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCreateStateDirectoryForStatefulTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCreateStateDirectoryForStatefulTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis STARTED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis PASSED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime STARTED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime PASSED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep PASSED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldRet

Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-05-19 Thread John Roesler
Thanks for the response, Sophie,

I wholeheartedly agree we should take as much into account as possible
up front, rather than regretting our decisions later. I actually do share
your vague sense of worry, which was what led me to say initially that I
thought my counterproposal might be "too fancy". Sometimes, it's better
to be explicit instead of "elegant", if we think more people will be confused
than not.

I really don't think that there's any danger of "relying on a bug" here, 
although
people certainly could be relying on current behavior. One thing to be clear
about (which I just left a more detailed comment in KAFKA-8159 about) is that
when we say something like key1 > key2, this ordering is defined by the
serde's output and nothing else. 

Currently, thanks to your fix in https://github.com/apache/kafka/pull/6521,
the store contract is that for range scans, if from > to, then the store must
return an empty iterator. There's no possibility that someone could be relying
on iterating over that range in increasing order, because that's not what 
happens. However, they could indeed be relying on getting an empty iterator.

My counterproposal was to actually change this contract to say that the store
must return an iterator over the keys in that range, but in the reverse order.
So, in addition to considering whether this idea is "too fancy" (aka confusing),
we should also consider the likelihood of breaking an existing program with
this behavior/contract change.
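
To make the two contracts concrete, here's a toy, self-contained illustration over 
a plain sorted map (not the actual store code; all names are made up for the 
sketch):

    import java.util.*;

    public class RangeContractSketch {
        // Current contract: if from > to, return an empty result.
        static List<String> currentRange(NavigableMap<String, String> store, String from, String to) {
            if (from.compareTo(to) > 0) {
                return Collections.emptyList();
            }
            return new ArrayList<>(store.subMap(from, true, to, true).keySet());
        }

        // Counterproposal: if from > to, return the same keys in reverse (descending) order.
        static List<String> proposedRange(NavigableMap<String, String> store, String from, String to) {
            if (from.compareTo(to) <= 0) {
                return new ArrayList<>(store.subMap(from, true, to, true).keySet());
            }
            return new ArrayList<>(store.subMap(to, true, from, true).descendingKeySet());
        }

        public static void main(String[] args) {
            NavigableMap<String, String> store = new TreeMap<>();
            for (String k : Arrays.asList("a", "b", "c", "d")) {
                store.put(k, k.toUpperCase(Locale.ROOT));
            }
            System.out.println(currentRange(store, "d", "b"));  // []
            System.out.println(proposedRange(store, "d", "b")); // [d, c, b]
        }
    }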

To echo your clarification, I'm also not advocating strongly in favor of my
proposal. I just wanted to present it for consideration alongside Jorge's
original one.

Thanks for raising these very good points,
-John

On Tue, May 19, 2020, at 20:49, Sophie Blee-Goldman wrote:
> > Rather than working around it, I think we should just fix it
> 
> Now *that's* a "fancy" idea :P
> 
> That was my primary concern, although I do have a vague sense of worry
> that we might be allowing users to get into trouble without realizing it.
> For example if their custom serdes suffer a similar bug as the above,
> and/or
> they rely on getting results in increasing order (of the keys) even when
> to < from. Maybe they're relying on the fact that the range query returns
> nothing in that case.
> 
> Not sure if that qualifies as relying on a bug or not, but in the past
> we've
> taken the stance that we should not break compatibility even if the user
> was relying on bugs or unintentional behavior.
> 
> Just to clarify I'm not advocating strongly against this proposal, just
> laying
> out some considerations we should take into account. At the end of the day
> we should do what's right rather than maintain compatibility with existing
> bugs, but sometimes there's a reasonable middle ground.
> 
> On Tue, May 19, 2020 at 6:15 PM John Roesler  wrote:
> 
> > Thanks Sophie,
> >
> > Woah, that’s a nasty bug. Rather than working around it, I think we should
> > just fix it. I’ll leave some comments on the Jira.
> >
> > It doesn’t seem like it should be this KIP’s concern that some serdes
> > might be incorrectly written.
> >
> > Were there other practical concerns that you had in mind?
> >
> > Thanks,
> > John
> >
> > On Tue, May 19, 2020, at 19:10, Sophie Blee-Goldman wrote:
> > > I like this "fancy idea" to just flip the to/from bytes but I think there
> > > are some practical limitations to implementing this. In particular
> > > I'm thinking about this issue
> > >  with the built-in
> > signed
> > > number serdes.
> > >
> > > This trick would actually fix the problem for negative-negative queries
> > > (ie where to & from are negative) but would cause undetectable
> > > incorrect results for negative-positive queries. For example, say you
> > > call #range with from = -1 and to = 1, using the Short serdes. The
> > > serialized bytes for that are
> > >
> > > from = 1111111111111111
> > > to = 0000000000000001
> > >
> > > so we would end up flipping those and iterating over all keys from
> > > 0000000000000001 to 1111111111111111. Iterating in lexicographical
> > > order means we would iterate over every key in the space *except* for
> > > 0, but 0 is actually the *only* other key we meant to be included in the
> > > range query.
> > >
> > > Currently we just log a warning and return an empty iterator when
> > > to < from, which is obviously also incorrect but feels slightly more
> > > palatable. If we start automatically converting to reverse queries we
> > > can't even log a warning in this case unless we wanted to log a warning
> > > every time, which would be weird to do for a valid usage of a new
> > > feature.
> > >
> > > All that said, I still like the idea overall. Off the top of my head I
> > guess
> > > we could add a store config to enable/disable automatic reverse
> > iteration,
> > > which is off by default?
> > >
> > > Thanks for the KIP! This will be a nice addition
> > >
> > > Sophie
> > >
> > >
> > > On Tue

Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-05-19 Thread Sophie Blee-Goldman
> Rather than working around it, I think we should just fix it

Now *that's* a "fancy" idea :P

That was my primary concern, although I do have a vague sense of worry
that we might be allowing users to get into trouble without realizing it.
For example if their custom serdes suffer a similar bug as the above,
and/or
they rely on getting results in increasing order (of the keys) even when
to < from. Maybe they're relying on the fact that the range query returns
nothing in that case.

Not sure if that qualifies as relying on a bug or not, but in the past
we've
taken the stance that we should not break compatibility even if the user
was relying on bugs or unintentional behavior.

Just to clarify I'm not advocating strongly against this proposal, just
laying
out some considerations we should take into account. At the end of the day
we should do what's right rather than maintain compatibility with existing
bugs, but sometimes there's a reasonable middle ground.

On Tue, May 19, 2020 at 6:15 PM John Roesler  wrote:

> Thanks Sophie,
>
> Woah, that’s a nasty bug. Rather than working around it, I think we should
> just fix it. I’ll leave some comments on the Jira.
>
> It doesn’t seem like it should be this KIP’s concern that some serdes
> might be incorrectly written.
>
> Were there other practical concerns that you had in mind?
>
> Thanks,
> John
>
> On Tue, May 19, 2020, at 19:10, Sophie Blee-Goldman wrote:
> > I like this "fancy idea" to just flip the to/from bytes but I think there
> > are some practical limitations to implementing this. In particular
> > I'm thinking about this issue
> >  with the built-in
> signed
> > number serdes.
> >
> > This trick would actually fix the problem for negative-negative queries
> > (ie where to & from are negative) but would cause undetectable
> > incorrect results for negative-positive queries. For example, say you
> > call #range with from = -1 and to = 1, using the Short serdes. The
> > serialized bytes for that are
> >
> > from = 1111111111111111
> > to = 0000000000000001
> >
> > so we would end up flipping those and iterating over all keys from
> > 0000000000000001 to 1111111111111111. Iterating in lexicographical
> > order means we would iterate over every key in the space *except* for
> > 0, but 0 is actually the *only* other key we meant to be included in the
> > range query.
> >
> > Currently we just log a warning and return an empty iterator when
> > to < from, which is obviously also incorrect but feels slightly more
> > palatable. If we start automatically converting to reverse queries we
> > can't even log a warning in this case unless we wanted to log a warning
> > every time, which would be weird to do for a valid usage of a new
> > feature.
> >
> > All that said, I still like the idea overall. Off the top of my head I
> guess
> > we could add a store config to enable/disable automatic reverse
> iteration,
> > which is off by default?
> >
> > Thanks for the KIP! This will be a nice addition
> >
> > Sophie
> >
> >
> > On Tue, May 19, 2020 at 3:21 PM John Roesler 
> wrote:
> >
> > > Hi there Jorge,
> > >
> > > Thanks for the KIP!
> > >
> > > I think this feature sounds very reasonable.
> > >
> > > I'm not 100% sure if this is "too fancy", but what do you think
> > > about avoiding the enum by instead allowing people to flip
> > > the "from" and "to" endpoints? I.e., reading from "A" to "Z" would
> > > be a forward scan, and from "Z" to "A" would be a backward one?
> > >
> > > Thanks,
> > > -John
> > >
> > > On Tue, May 19, 2020, at 16:20, Jorge Quilcate wrote:
> > > > Hi everyone,
> > > >
> > > > I would like to start the discussion for KIP-617:
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards
> > > >
> > > > Looking forward to your feedback.
> > > >
> > > > Thanks!
> > > > Jorge.
> > > >
> > > >
> > > > Attachments:
> > > > * 0x5F2C6E22064982DF.asc
> > >
> >
>


Build failed in Jenkins: kafka-trunk-jdk14 #106

2020-05-19 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10010: Should make state store registration idempotent (#8681)


--
[...truncated 3.09 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

Jenkins build is back to normal : kafka-2.5-jdk8 #126

2020-05-19 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-05-19 Thread John Roesler
Thanks Sophie,

Woah, that’s a nasty bug. Rather than working around it, I think we should just 
fix it. I’ll leave some comments on the Jira.

It doesn’t seem like it should be this KIP’s concern that some serdes might be 
incorrectly written.

Were there other practical concerns that you had in mind?

Thanks,
John

On Tue, May 19, 2020, at 19:10, Sophie Blee-Goldman wrote:
> I like this "fancy idea" to just flip the to/from bytes but I think there
> are some practical limitations to implementing this. In particular
> I'm thinking about this issue
>  with the built-in signed
> number serdes.
> 
> This trick would actually fix the problem for negative-negative queries
> (ie where to & from are negative) but would cause undetectable
> incorrect results for negative-positive queries. For example, say you
> call #range with from = -1 and to = 1, using the Short serdes. The
> serialized bytes for that are
> 
> from = 1111111111111111
> to = 0000000000000001
> 
> so we would end up flipping those and iterating over all keys from
> 0000000000000001 to 1111111111111111. Iterating in lexicographical
> order means we would iterate over every key in the space *except* for
> 0, but 0 is actually the *only* other key we meant to be included in the
> range query.
> 
> Currently we just log a warning and return an empty iterator when
> to < from, which is obviously also incorrect but feels slightly more
> palatable. If we start automatically converting to reverse queries we
> can't even log a warning in this case unless we wanted to log a warning
> every time, which would be weird to do for a valid usage of a new
> feature.
> 
> All that said, I still like the idea overall. Off the top of my head I guess
> we could add a store config to enable/disable automatic reverse iteration,
> which is off by default?
> 
> Thanks for the KIP! This will be a nice addition
> 
> Sophie
> 
> 
> On Tue, May 19, 2020 at 3:21 PM John Roesler  wrote:
> 
> > Hi there Jorge,
> >
> > Thanks for the KIP!
> >
> > I think this feature sounds very reasonable.
> >
> > I'm not 100% sure if this is "too fancy", but what do you think
> > about avoiding the enum by instead allowing people to flip
> > the "from" and "to" endpoints? I.e., reading from "A" to "Z" would
> > be a forward scan, and from "Z" to "A" would be a backward one?
> >
> > Thanks,
> > -John
> >
> > On Tue, May 19, 2020, at 16:20, Jorge Quilcate wrote:
> > > Hi everyone,
> > >
> > > I would like to start the discussion for KIP-617:
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards
> > >
> > > Looking forward to your feedback.
> > >
> > > Thanks!
> > > Jorge.
> > >
> > >
> > > Attachments:
> > > * 0x5F2C6E22064982DF.asc
> >
>


Re: [VOTE]: KIP-604: Remove ZooKeeper Flags from the Administrative Tools

2020-05-19 Thread Colin McCabe
On Tue, May 19, 2020, at 09:31, Jason Gustafson wrote:
> Hi Colin,
> 
> Looks good. I just had one question. It sounds like your intent is to
> change kafka-configs.sh so that the --zookeeper flag is only supported for
> bootstrapping. I assume in the case of SCRAM that we will only make this
> change after the broker API is available?
> 
> Thanks,
> Jason

Hi Jason,

Yes, that's correct.  We will have the SCRAM API ready by the Kafka 3.0 release.

best,
Colin


> 
> On Tue, May 19, 2020 at 5:22 AM Mickael Maison 
> wrote:
> 
> > +1 (binding)
> > Thanks Colin
> >
> > On Tue, May 19, 2020 at 10:57 AM Manikumar 
> > wrote:
> > >
> > > +1 (binding)
> > >
> > > Thanks for the KIP
> > >
> > > On Tue, May 19, 2020 at 12:29 PM David Jacot 
> > wrote:
> > >
> > > > +1 (non-binding).
> > > >
> > > > Thanks for the KIP.
> > > >
> > > > On Fri, May 15, 2020 at 12:41 AM Guozhang Wang 
> > wrote:
> > > >
> > > > > +1.
> > > > >
> > > > > Thanks Colin!
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Tue, May 12, 2020 at 3:45 PM Colin McCabe 
> > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I'd like to start a vote on KIP-604: Remove ZooKeeper Flags from
> > the
> > > > > > Administrative Tools.
> > > > > >
> > > > > > As a reminder, this KIP is for the next major release of Kafka,
> > the 3.0
> > > > > > release.   So it won't go into the upcoming 2.6 release.  It's a
> > pretty
> > > > > > small change that just removes the --zookeeper flags from some
> > tools
> > > > and
> > > > > > removes a deprecated tool.  We haven't decided exactly when we'll
> > do
> > > > 3.0
> > > > > > but I believe we will certainly want this change in that release.
> > > > > >
> > > > > > The KIP does contain one small change relevant to Kafka 2.6: adding
> > > > > > support for --if-exists and --if-not-exists in combination with the
> > > > > > --bootstrap-server flag.
> > > > > >
> > > > > > best,
> > > > > > Colin
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> >
>


Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-05-19 Thread Sophie Blee-Goldman
I like this "fancy idea" to just flip the to/from bytes but I think there
are some practical limitations to implementing this. In particular
I'm thinking about this issue
 with the built-in signed
number serdes.

This trick would actually fix the problem for negative-negative queries
(ie where to & from are negative) but would cause undetectable
incorrect results for negative-positive queries. For example, say you
call #range with from = -1 and to = 1, using the Short serdes. The
serialized bytes for that are

from = 1111111111111111
to = 0000000000000001

so we would end up flipping those and iterating over all keys from
0000000000000001 to 1111111111111111. Iterating in lexicographical
order means we would iterate over every key in the space *except* for
0, but 0 is actually the *only* other key we meant to be included in the
range query.
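
To spell out the byte-level problem, here is a minimal sketch, assuming the 
built-in Short serde's two-byte big-endian encoding and an unsigned lexicographic 
ordering of the raw key bytes (the ordering the serialized keys end up with in 
the store):

    import java.nio.ByteBuffer;

    public class ShortKeyOrderingSketch {
        // Mirrors the built-in Short serde: two-byte big-endian encoding.
        static byte[] serialize(short value) {
            return ByteBuffer.allocate(2).putShort(value).array();
        }

        // Unsigned lexicographic comparison of the raw key bytes.
        static int compareUnsigned(byte[] a, byte[] b) {
            for (int i = 0; i < Math.min(a.length, b.length); i++) {
                int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
                if (cmp != 0) {
                    return cmp;
                }
            }
            return Integer.compare(a.length, b.length);
        }

        public static void main(String[] args) {
            byte[] from = serialize((short) -1); // 11111111 11111111
            byte[] to = serialize((short) 1);    // 00000000 00000001
            // Byte-wise, "from" sorts after "to", so the intended range {-1, 0, 1}
            // looks inverted; flipping the endpoints would scan every key except 0.
            System.out.println(compareUnsigned(from, to) > 0); // prints true
        }
    }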

Currently we just log a warning and return an empty iterator when
to < from, which is obviously also incorrect but feels slightly more
palatable. If we start automatically converting to reverse queries we
can't even log a warning in this case unless we wanted to log a warning
every time, which would be weird to do for a valid usage of a new
feature.

All that said, I still like the idea overall. Off the top of my head I guess
we could add a store config to enable/disable automatic reverse iteration,
which is off by default?

Thanks for the KIP! This will be a nice addition

Sophie


On Tue, May 19, 2020 at 3:21 PM John Roesler  wrote:

> Hi there Jorge,
>
> Thanks for the KIP!
>
> I think this feature sounds very reasonable.
>
> I'm not 100% sure if this is "too fancy", but what do you think
> about avoiding the enum by instead allowing people to flip
> the "from" and "to" endpoints? I.e., reading from "A" to "Z" would
> be a forward scan, and from "Z" to "A" would be a backward one?
>
> Thanks,
> -John
>
> On Tue, May 19, 2020, at 16:20, Jorge Quilcate wrote:
> > Hi everyone,
> >
> > I would like to start the discussion for KIP-617:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards
> >
> > Looking forward to your feedback.
> >
> > Thanks!
> > Jorge.
> >
> >
> > Attachments:
> > * 0x5F2C6E22064982DF.asc
>


Build failed in Jenkins: kafka-trunk-jdk14 #105

2020-05-19 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10011: Remove task id from lockedTaskDirectories during


--
[...truncated 3.09 MB...]

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTes

Build failed in Jenkins: kafka-trunk-jdk8 #4544

2020-05-19 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-6145: Add unit tests to verify fix of bug KAFKA-9173 (#8689)

[github] KAFKA-9992: Eliminate JavaConverters in EmbeddedKafkaCluster (#8673)


--
[...truncated 6.15 MB...]

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTimeDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTimeDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessRecordForTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessRecordForTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldForwardRecordsFromSubtopologyToSubtopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldForwardRecordsFromSubtopologyToSubtopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForSmallerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForSmallerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCreateStateDirectoryForStatefulTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCreateStateDirectoryForStatefulTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfWallClockTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis STARTED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis PASSED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime STARTED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime PASSED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep PASSED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWindowStartTimestampWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWindowStartTimestampWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldReturnIsPersistent STARTED

org.apache.kafka.streams.internals.WindowS

Re: [VOTE] KIP-613: Add end-to-end latency metrics to Streams

2020-05-19 Thread Bill Bejeck
Thanks for the KIP, I'm a +1 (binding) on the proposal.

-Bill

On Tue, May 19, 2020 at 3:02 PM Guozhang Wang  wrote:

> +1 on the updated proposal. Thanks Sophie.
>
> On Tue, May 19, 2020 at 11:51 AM John Roesler  wrote:
>
> > Thanks for the KIP, Sophie.
> >
> > I'm +1 (binding) on the current proposal.
> >
> > Thanks,
> > -John
> >
> > On Fri, May 15, 2020, at 19:45, Guozhang Wang wrote:
> > > Hi Sophie,
> > >
> > > Sorry I was a bit late on following up the DISCUSS thread. I still have
> > > some comment about the processor-node level metrics and left a reply on
> > > that thread.
> > >
> > >
> > > Guozhang
> > >
> > > On Fri, May 15, 2020 at 2:34 PM Sophie Blee-Goldman <
> sop...@confluent.io
> > >
> > > wrote:
> > >
> > > > Hey all,
> > > >
> > > > I think we've reached a general consensus on all the
> non-implementation
> > > > details of KIP-613 so I'd like to call for a vote.
> > > >
> > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-613%3A+Add+end-to-end+latency+metrics+to+Streams
> > > >
> > > > Thanks!
> > > > Sophie
> > > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
> --
> -- Guozhang
>


Re: [DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-05-19 Thread John Roesler
Hi there Jorge,

Thanks for the KIP!

I think this feature sounds very reasonable.

I'm not 100% sure if this is "too fancy", but what do you think
about avoiding the enum by instead allowing people to flip
the "from" and "to" endpoints? I.e., reading from "A" to "Z" would
be a forward scan, and from "Z" to "A" would be a backward one?

Thanks,
-John

On Tue, May 19, 2020, at 16:20, Jorge Quilcate wrote:
> Hi everyone,
> 
> I would like to start the discussion for KIP-617:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards
> 
> Looking forward to your feedback.
> 
> Thanks!
> Jorge.
> 
> 
> Attachments:
> * 0x5F2C6E22064982DF.asc


Build failed in Jenkins: kafka-trunk-jdk14 #104

2020-05-19 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-6145: Add unit tests to verify fix of bug KAFKA-9173 (#8689)

[github] KAFKA-9992: Eliminate JavaConverters in EmbeddedKafkaCluster (#8673)


--
[...truncated 3.09 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTim

[jira] [Resolved] (KAFKA-9982) [kafka-connect] Source connector does not guarantee at least once delivery

2020-05-19 Thread Chris Egerton (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Egerton resolved KAFKA-9982.
--
Resolution: Not A Bug

> [kafka-connect] Source connector does not guarantee at least once delivery
> --
>
> Key: KAFKA-9982
> URL: https://issues.apache.org/jira/browse/KAFKA-9982
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.5.0
>Reporter: Qinghui Xu
>Priority: Major
>
> In kafka-connect runtime, the WorkerSourceTask is responsible for sending 
> records to the destination topics and managing the source offset commit. 
> Committed offsets are then used later for recovery of tasks during rebalance 
> or restart.
> But there are two concerns when looking into the WorkerSourceTask 
> implementation:
>  * When the producer fails to send records, there is no retry; the offset 
> commit is simply skipped and the next loop (polling for new records) is 
> executed.
>  * The offset commit and the actual sending of records over the network are 
> asynchronous, which means the offset commit could happen before records are 
> received by brokers, and a rebalance/restart in this gap could lead to 
> message loss.
> The conclusion is thus that the source connector does not support at least 
> once semantics by default (without the plugin implementation making extra 
> effort itself). I consider this as a bug.
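For illustration, the ordering concern described above can be sketched with the
plain producer API (this is not the actual WorkerSourceTask code; names such as
commitSourceOffsets() are hypothetical stand-ins):

    // The send is asynchronous: the callback fires only once the broker has
    // acknowledged (or definitively failed) the record.
    producer.send(new ProducerRecord<>("destination-topic", key, value),
        (metadata, exception) -> {
            // Only here is the record durably accepted by the broker.
        });

    // Committing the source offsets here, without waiting for the callback above,
    // is the gap being described: a rebalance or restart between the commit and
    // the acknowledgement can lose the record even though its offset was committed.
    commitSourceOffsets();   // hypothetical stand-in for the offset commit step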



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Aakash Shah
Hello all,

I'd actually like to retract the earlier additions to the KIP and point out
that since I've started the voting process and gotten some responses, I
will not be making any major changes to the KIP as it would require a
re-voting process.

Thanks,
Aakash

On Tue, May 19, 2020 at 2:31 PM Aakash Shah  wrote:

> Hi Chris and others,
>
> Yes, you are correct; I looked through KIP-298 to understand it better. I
> agree with your idea to handle "errors.tolerance=none."
>
> I see, you are basically saying you are in favor of standardizing handling
> what to set the reporter to if it is not configured. I am on board with
> this proposal, especially if this is in line with previous behaviors as you
> mentioned.
>
> I will add both of these suggestions to the KIP.
>
> Lastly, unless anyone has any issues with Chris's suggestions, I believe
> the last part we have to come to a consensus is using a Future as the
> return type. I am for giving extra guarantees to the user if they wish;
> however, I am not very familiar with the potential issues with the consumer
> heartbeat as Arjun pointed out. Does anyone have any thoughts on this?
>
> Thanks,
> Aakash
>
> On Tue, May 19, 2020 at 2:10 PM Chris Egerton  wrote:
>
>> Hi Aakash,
>>
>> > If "errors.tolerance=none", should it not be the case that the error
>> reporter does not even report any error; rather, the task just fails after
>> throwing the error? I do understand the point you are saying about
>> duplicates, though.
>>
>> I believe the "errors.tolerance" property dictates whether a task should
>> fail after a record that causes problems during conversion or
>> transformation is encountered and reported (for example, by writing to a
>> DLQ). If it is set to "none", then the task will fail immediately; if it
>> is
>> set to "all", then the task will continue running normally. So if we want
>> to preserve that behavior, we might want to immediately throw an exception
>> when an errant record is reported by a "SinkTask" instance and the user
>> has
>> configured "errors.tolerance = none", which unless caught will cause the
>> task to cease writing records to the sink. In addition to throwing that
>> exception, we should also still fail the task; the exception is just a way
>> to (hopefully) interrupt the task's processing of records in order to
>> prevent duplicates if/when the task is restarted later on.
>>
>> > Lastly, why do you say we should always provide an errant record
>> reporter?
>> Doesn't that change the contract of what functionality it is providing?
>>
>> I'm just thinking that instead of returning "null" when no errant record
>> reporter is configured, we could return one that always fails the task and
>> throws an exception. This seems in line with the default behavior of the
>> framework when no error handling configuration properties are specified
>> and
>> a record causes problems during conversion or transformation. We could
>> leave the choice in the hands of developers but this might make things
>> confusing for users who get different behavior from different connectors
>> under the same circumstances.
>>
>> Hope this helps!
>>
>> Cheers,
>>
>> Chris
>>
>> On Tue, May 19, 2020 at 1:50 PM Aakash Shah  wrote:
>>
>> > Hi Arjun,
>> >
>> > I am not very familiar with how the potential heartbeat failure would
>> cause
>> > more failures when consuming subsequent records. Can you elaborate on
>> this?
>> >
>> > Thanks,
>> > Aakash
>> >
>> > On Tue, May 19, 2020 at 10:03 AM Arjun Satish 
>> > wrote:
>> >
>> > > One more concern with the connector blocking on the Future's get() is
>> > that
>> > > it may cause the task's consumer to fail to heartbeat (since there is
>> no
>> > > independent thread to do this). That would then cause failures when we
>> > > eventually try to consume more records after returning from put(). The
>> > > developer would need to be cognizant of these bits before waiting on
>> the
>> > > future, which adds a reasonable amount of complexity.
>> > >
>> > > Even with preCommit() returning incomplete offsets, I suppose the
>> concern
>> > > would be that the put() method keeps giving the task more records,
>> and to
>> > > truly pause the "firehose", the task needs to pause all partitions?
>> > >
>> > >
>> > > On Tue, May 19, 2020 at 9:26 AM Arjun Satish 
>> > > wrote:
>> > >
>> > > > Can we get a couple of examples that shows utility of waiting on the
>> > > > Future<>? Also, in preCommit() we would report back on the
>> incomplete
>> > > > offsets. So that feedback mechanism will already exists for
>> developers
>> > > who
>> > > > want to manually manage this.
>> > > >
>> > > > On Tue, May 19, 2020 at 8:03 AM Randall Hauch 
>> > wrote:
>> > > >
>> > > >> Thanks, Aakash, for updating the KIP.
>> > > >>
>> > > >> On Tue, May 19, 2020 at 2:18 AM Arjun Satish <
>> arjun.sat...@gmail.com>
>> > > >> wrote:
>> > > >>
>> > > >> > Hi Randall,
>> > > >> >
>> > > >> > Thanks for the explanation! Excellent point about guaranteeing
>>

Re: [VOTE] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Konstantine Karantasis
+1 (binding)

I like how the KIP looks now too. Quite active discussions within the past
few days, which I found very useful.

There is room in the future to let connector developers decide whether they
want greater control over error reporting or whether they want the framework
to keep providing the reasonable guarantees that this KIP now
describes. The API is expressive enough to accommodate such improvements if
they are warranted, but its current form seems quite adequate to support
efficient end-to-end error reporting for sink connectors.

Thanks for introducing this KIP Aakash!

One last minor comment around naming:
Currently both the names ErrantRecordReporter and failedRecordReporter are
used. Using the same name everywhere seems preferable, so feel free to
choose the one that you prefer.

Regards,
Konstantine

On Tue, May 19, 2020 at 2:30 PM Ewen Cheslack-Postava 
wrote:

> +1 (binding)
>
> This will be a nice improvement. From the discussion thread it's clear this
> is tricky to get right, nice work!
>
> On Tue, May 19, 2020 at 8:16 AM Andrew Schofield <
> andrew_schofi...@live.com>
> wrote:
>
> > +1 (non-binding)
> >
> > This is now looking very nice.
> >
> > Andrew Schofield
> >
> > On 19/05/2020, 16:11, "Randall Hauch"  wrote:
> >
> > Thank you, Aakash, for putting together this KIP and shepherding the
> > discussion. Also, many thanks to all those that participated in the
> > very
> > active discussion. I'm actually very happy with the current proposal,
> > am
> > confident that it is a valuable improvement to the Connect framework,
> > and
> > know that it will be instrumental in making sink tasks easily able to
> > report problematic records and keep running.
> >
> > +1 (binding)
> >
> > Best regards,
> >
> > Randall
> >
> > On Sun, May 17, 2020 at 6:59 PM Aakash Shah 
> > wrote:
> >
> > > Hello all,
> > >
> > > I'd like to open a vote for KIP-610:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-610%3A+Error+Reporting+in+Sink+Connectors
> > >
> > > Thanks,
> > > Aakash
> > >
> >
> >
>


Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Aakash Shah
Hi Chris and others,

Yes, you are correct; I looked through KIP-298 to understand it better. I
agree with your idea to handle "errors.tolerance=none."

I see, you are basically saying you are in favor of standardizing handling
what to set the reporter to if it is not configured. I am on board with
this proposal, especially if this is in line with previous behaviors as you
mentioned.

I will add both of these suggestions to the KIP.

Lastly, unless anyone has any issues with Chris's suggestions, I believe
the last part we have to come to a consensus is using a Future as the
return type. I am for giving extra guarantees to the user if they wish;
however, I am not very familiar with the potential issues with the consumer
heartbeat as Arjun pointed out. Does anyone have any thoughts on this?

Thanks,
Aakash

On Tue, May 19, 2020 at 2:10 PM Chris Egerton  wrote:

> Hi Aakash,
>
> > If "errors.tolerance=none", should it not be the case that the error
> reporter does not even report any error; rather, the task just fails after
> throwing the error? I do understand the point you are saying about
> duplicates, though.
>
> I believe the "errors.tolerance" property dictates whether a task should
> fail after a record that causes problems during conversion or
> transformation is encountered and reported (for example, by writing to a
> DLQ). If it is set to "none", then the task will fail immediately; if it is
> set to "all", then the task will continue running normally. So if we want
> to preserve that behavior, we might want to immediately throw an exception
> when an errant record is reported by a "SinkTask" instance and the user has
> configured "errors.tolerance = none", which unless caught will cause the
> task to cease writing records to the sink. In addition to throwing that
> exception, we should also still fail the task; the exception is just a way
> to (hopefully) interrupt the task's processing of records in order to
> prevent duplicates if/when the task is restarted later on.
>
> > Lastly, why do you say we should always provide an errant record
> reporter?
> Doesn't that change the contract of what functionality it is providing?
>
> I'm just thinking that instead of returning "null" when no errant record
> reporter is configured, we could return one that always fails the task and
> throws an exception. This seems in line with the default behavior of the
> framework when no error handling configuration properties are specified and
> a record causes problems during conversion or transformation. We could
> leave the choice in the hands of developers but this might make things
> confusing for users who get different behavior from different connectors
> under the same circumstances.
>
> Hope this helps!
>
> Cheers,
>
> Chris
>
> On Tue, May 19, 2020 at 1:50 PM Aakash Shah  wrote:
>
> > Hi Arjun,
> >
> > I am not very familiar with how the potential heartbeat failure would
> cause
> > more failures when consuming subsequent records. Can you elaborate on
> this?
> >
> > Thanks,
> > Aakash
> >
> > On Tue, May 19, 2020 at 10:03 AM Arjun Satish 
> > wrote:
> >
> > > One more concern with the connector blocking on the Future's get() is
> > that
> > > it may cause the task's consumer to fail to heartbeat (since there is
> no
> > > independent thread to do this). That would then cause failures when we
> > > eventually try to consume more records after returning from put(). The
> > > developer would need to be cognizant of these bits before waiting on
> the
> > > future, which adds a reasonable amount of complexity.
> > >
> > > Even with preCommit() returning incomplete offsets, I suppose the
> concern
> > > would be that the put() method keeps giving the task more records, and
> to
> > > truly pause the "firehose", the task needs to pause all partitions?
> > >
> > >
> > > On Tue, May 19, 2020 at 9:26 AM Arjun Satish 
> > > wrote:
> > >
> > > > Can we get a couple of examples that shows utility of waiting on the
> > > > Future<>? Also, in preCommit() we would report back on the incomplete
> > > > offsets. So that feedback mechanism will already exists for
> developers
> > > who
> > > > want to manually manage this.
> > > >
> > > > On Tue, May 19, 2020 at 8:03 AM Randall Hauch 
> > wrote:
> > > >
> > > >> Thanks, Aakash, for updating the KIP.
> > > >>
> > > >> On Tue, May 19, 2020 at 2:18 AM Arjun Satish <
> arjun.sat...@gmail.com>
> > > >> wrote:
> > > >>
> > > >> > Hi Randall,
> > > >> >
> > > >> > Thanks for the explanation! Excellent point about guaranteeing
> > offsets
> > > >> in
> > > >> > the async case.
> > > >> >
> > > >> > If we can guarantee that the offsets will be advanced only after
> the
> > > bad
> > > >> > records are reported, then is there any value is the Future<>
> return
> > > >> type?
> > > >> > I feel we can declare the function with a void return type:
> > > >> >
> > > >> > void report(SinkRecord failedRecord, Throwable error)
> > > >> >
> > > >> > that works asynchronously, and advances offse

Re: [VOTE] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Ewen Cheslack-Postava
+1 (binding)

This will be a nice improvement. From the discussion thread it's clear this
is tricky to get right, nice work!

On Tue, May 19, 2020 at 8:16 AM Andrew Schofield 
wrote:

> +1 (non-binding)
>
> This is now looking very nice.
>
> Andrew Schofield
>
> On 19/05/2020, 16:11, "Randall Hauch"  wrote:
>
> Thank you, Aakash, for putting together this KIP and shepherding the
> discussion. Also, many thanks to all those that participated in the
> very
> active discussion. I'm actually very happy with the current proposal,
> am
> confident that it is a valuable improvement to the Connect framework,
> and
> know that it will be instrumental in making sink tasks easily able to
> report problematic records and keep running.
>
> +1 (binding)
>
> Best regards,
>
> Randall
>
> On Sun, May 17, 2020 at 6:59 PM Aakash Shah 
> wrote:
>
> > Hello all,
> >
> > I'd like to open a vote for KIP-610:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-610%3A+Error+Reporting+in+Sink+Connectors
> >
> > Thanks,
> > Aakash
> >
>
>


Build failed in Jenkins: kafka-trunk-jdk11 #1476

2020-05-19 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-6145: Add unit tests to verify fix of bug KAFKA-9173 (#8689)

[github] KAFKA-9992: Eliminate JavaConverters in EmbeddedKafkaCluster (#8673)


--
[...truncated 3.09 MB...]

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED

org.apache.kafka.st

[DISCUSS] KIP-617: Allow Kafka Streams State Stores to be iterated backwards

2020-05-19 Thread Jorge Quilcate
Hi everyone,

I would like to start the discussion for KIP-617:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards

Looking forward to your feedback.

Thanks!
Jorge.





Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Chris Egerton
Hi Aakash,

> If "errors.tolerance=none", should it not be the case that the error
reporter does not even report any error; rather, the task just fails after
throwing the error? I do understand the point you are saying about
duplicates, though.

I believe the "errors.tolerance" property dictates whether a task should
fail after a record that causes problems during conversion or
transformation is encountered and reported (for example, by writing to a
DLQ). If it is set to "none", then the task will fail immediately; if it is
set to "all", then the task will continue running normally. So if we want
to preserve that behavior, we might want to immediately throw an exception
when an errant record is reported by a "SinkTask" instance and the user has
configured "errors.tolerance = none", which unless caught will cause the
task to cease writing records to the sink. In addition to throwing that
exception, we should also still fail the task; the exception is just a way
to (hopefully) interrupt the task's processing of records in order to
prevent duplicates if/when the task is restarted later on.

> Lastly, why do you say we should always provide an errant record reporter?
Doesn't that change the contract of what functionality it is providing?

I'm just thinking that instead of returning "null" when no errant record
reporter is configured, we could return one that always fails the task and
throws an exception. This seems in line with the default behavior of the
framework when no error handling configuration properties are specified and
a record causes problems during conversion or transformation. We could
leave the choice in the hands of developers but this might make things
confusing for users who get different behavior from different connectors
under the same circumstances.

Hope this helps!

Cheers,

Chris

On Tue, May 19, 2020 at 1:50 PM Aakash Shah  wrote:

> Hi Arjun,
>
> I am not very familiar with how the potential heartbeat failure would cause
> more failures when consuming subsequent records. Can you elaborate on this?
>
> Thanks,
> Aakash
>
> On Tue, May 19, 2020 at 10:03 AM Arjun Satish 
> wrote:
>
> > One more concern with the connector blocking on the Future's get() is
> that
> > it may cause the task's consumer to fail to heartbeat (since there is no
> > independent thread to do this). That would then cause failures when we
> > eventually try to consume more records after returning from put(). The
> > developer would need to be cognizant of these bits before waiting on the
> > future, which adds a reasonable amount of complexity.
> >
> > Even with preCommit() returning incomplete offsets, I suppose the concern
> > would be that the put() method keeps giving the task more records, and to
> > truly pause the "firehose", the task needs to pause all partitions?
> >
> >
> > On Tue, May 19, 2020 at 9:26 AM Arjun Satish 
> > wrote:
> >
> > > Can we get a couple of examples that shows utility of waiting on the
> > > Future<>? Also, in preCommit() we would report back on the incomplete
> > > offsets. So that feedback mechanism will already exists for developers
> > who
> > > want to manually manage this.
> > >
> > > On Tue, May 19, 2020 at 8:03 AM Randall Hauch 
> wrote:
> > >
> > >> Thanks, Aakash, for updating the KIP.
> > >>
> > >> On Tue, May 19, 2020 at 2:18 AM Arjun Satish 
> > >> wrote:
> > >>
> > >> > Hi Randall,
> > >> >
> > >> > Thanks for the explanation! Excellent point about guaranteeing
> offsets
> > >> in
> > >> > the async case.
> > >> >
> > >> > If we can guarantee that the offsets will be advanced only after the
> > bad
> > >> > records are reported, then is there any value is the Future<> return
> > >> type?
> > >> > I feel we can declare the function with a void return type:
> > >> >
> > >> > void report(SinkRecord failedRecord, Throwable error)
> > >> >
> > >> > that works asynchronously, and advances offsets only after the DLQ
> > >> producer
> > >> > (and other reporters) complete successfully (as you explained).
> > >> >
> > >> > This actually alleviates my concern of what this Future<> actually
> > >> means.
> > >> > Since a failure to report should kill the tasks, there is no reason
> > for
> > >> the
> > >> > connector to ever wait on the get().
> > >>
> > >>
> > >> We should not say "there is no reason", because we don't know all of
> the
> > >> requirements that might exist for sending records to external systems.
> > The
> > >> additional guarantee regarding error records being fully recorded
> before
> > >> `preCommit(...)` is called is a minimal guarantee that Connect
> provides
> > >> the
> > >> sink task, and returning a Future allow a sink task to have *stronger*
> > >> guarantees than what Connect provides by default.
> > >>
> > >> Once again:
> > >> 1. we need an async API to allow the sink task to report problem
> records
> > >> and then immediately continue doing more work.
> > >> 2. Connect should guarantee to the sink task that all reported records
> > >> will
> > >> ac
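To make the API under discussion concrete, here is a minimal sketch of how a
sink task could use the proposed errant record reporter. The names follow the
KIP draft (e.g. SinkTaskContext#errantRecordReporter()), and writeToSink() is a
hypothetical connector-specific call, so treat this as illustrative rather than
final:

    private ErrantRecordReporter reporter;

    @Override
    public void start(Map<String, String> props) {
        // May be null on older workers or when no error handling is configured,
        // unless the "always provide a reporter" suggestion above is adopted.
        reporter = context.errantRecordReporter();
    }

    @Override
    public void put(Collection<SinkRecord> records) {
        for (SinkRecord record : records) {
            try {
                writeToSink(record);
            } catch (Exception e) {
                if (reporter != null) {
                    // Fire-and-forget: with errors.tolerance=all the framework logs
                    // or routes the record (e.g. to a DLQ) and the task keeps going;
                    // with errors.tolerance=none the report would surface a failure.
                    reporter.report(record, e);
                } else {
                    throw new ConnectException("Cannot report errant record", e);
                }
            }
        }
    }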

Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Aakash Shah
Hi Arjun,

I am not very familiar with how the potential heartbeat failure would cause
more failures when consuming subsequent records. Can you elaborate on this?

Thanks,
Aakash

On Tue, May 19, 2020 at 10:03 AM Arjun Satish 
wrote:

> One more concern with the connector blocking on the Future's get() is that
> it may cause the task's consumer to fail to heartbeat (since there is no
> independent thread to do this). That would then cause failures when we
> eventually try to consume more records after returning from put(). The
> developer would need to be cognizant of these bits before waiting on the
> future, which adds a reasonable amount of complexity.
>
> Even with preCommit() returning incomplete offsets, I suppose the concern
> would be that the put() method keeps giving the task more records, and to
> truly pause the "firehose", the task needs to pause all partitions?
>
>
> On Tue, May 19, 2020 at 9:26 AM Arjun Satish 
> wrote:
>
> > Can we get a couple of examples that shows utility of waiting on the
> > Future<>? Also, in preCommit() we would report back on the incomplete
> > offsets. So that feedback mechanism will already exists for developers
> who
> > want to manually manage this.
> >
> > On Tue, May 19, 2020 at 8:03 AM Randall Hauch  wrote:
> >
> >> Thanks, Aakash, for updating the KIP.
> >>
> >> On Tue, May 19, 2020 at 2:18 AM Arjun Satish 
> >> wrote:
> >>
> >> > Hi Randall,
> >> >
> >> > Thanks for the explanation! Excellent point about guaranteeing offsets
> >> in
> >> > the async case.
> >> >
> >> > If we can guarantee that the offsets will be advanced only after the
> bad
> >> > records are reported, then is there any value is the Future<> return
> >> type?
> >> > I feel we can declare the function with a void return type:
> >> >
> >> > void report(SinkRecord failedRecord, Throwable error)
> >> >
> >> > that works asynchronously, and advances offsets only after the DLQ
> >> producer
> >> > (and other reporters) complete successfully (as you explained).
> >> >
> >> > This actually alleviates my concern of what this Future<> actually
> >> means.
> >> > Since a failure to report should kill the tasks, there is no reason
> for
> >> the
> >> > connector to ever wait on the get().
> >>
> >>
> >> We should not say "there is no reason", because we don't know all of the
> >> requirements that might exist for sending records to external systems.
> The
> >> additional guarantee regarding error records being fully recorded before
> >> `preCommit(...)` is called is a minimal guarantee that Connect provides
> >> the
> >> sink task, and returning a Future allow a sink task to have *stronger*
> >> guarantees than what Connect provides by default.
> >>
> >> Once again:
> >> 1. we need an async API to allow the sink task to report problem records
> >> and then immediately continue doing more work.
> >> 2. Connect should guarantee to the sink task that all reported records
> >> will
> >> actually be recorded before `preCommit(...)` is called
> >> 3. a sink task *might* need stronger guarantees, and may need to block
> on
> >> the reported records some time before `preCommit(...)`, and we should
> >> allow
> >> them to do this.
> >> 4. Future and callbacks are common techniques, but there are significant
> >> runtime risks of using callbacks, whereas Future is a common/standard
> >> pattern that is straightforward to use.
> >>
> >> This *exactly* matches the current KIP, which is why I plan to vote for
> >> this valuable and well-thought out KIP.
> >>
> >>
> >> > And if we are guaranteeing that the
> >> > offsets are only advanced when the errors are reported, then this
> >> becomes a
> >> > double win:
> >> >
> >> > 1. connector developers can literally fire and forget failed records.
> >> > 2. offsets are correctly advanced on errors being reported. Failure to
> >> > report error will kill the task, and the last committed offset will be
> >> the
> >> > correct one.
> >>
> >>
> >> > The main contract would simply be to call report() before preCommit()
> or
> >> > before put() returns in the task, so the framework knows that that
> there
> >> > are error records reported, and those need to finish before the
> offsets
> >> can
> >> > be advanced.
> >> >
> >> > I think I'd be pretty excited about this API. and if we all agree,
> then
> >> > let's go ahead with this?
> >>
> >>
> >> > Best,
> >> >
> >> >
> >> >
> >>
> >
>


Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-19 Thread Colin McCabe
On Tue, May 19, 2020, at 03:27, Rajini Sivaram wrote:
> Hi Colin,
> 
> I do agree about the `leastLoadedNode` case. My question was about the
> other cases where we are connecting to a specific node: fetch requests to
> leaders, produce requests to leaders, requests to group coordinators,
> requests to controller etc. It will be good to either quantify that these
> connections are less common and hence less critical in terms of performance
> in typical deployments or describe the impact on these connections from the
> proposed change in default behaviour. It is perfectly fine if connections
> to specific nodes don't benefit from the new timeout, I was looking for
> analysis which says they aren't made any worse either, especially in the
> context of other connection rate limiting/quota work we are proposing like
> KIP-612.
> 

Hi Rajini,

This is a fair point.  In the VOTE thread, I proposed using an exponential 
connection retry backoff to mitigate this problem.  So the first few retries 
would happen quickly, but later retries would take increasingly longer, keeping 
the number of reconnect attempts down.

(This is assuming we're trying to connect to a single fixed node, like the 
controller node)

best,
Colin


>
> Regards,
> 
> Rajini
> 
> 
> On Mon, May 18, 2020 at 8:48 PM Colin McCabe  wrote:
> 
> > Hi Rajini,
> >
> > I think the idea behind the 10 second default is that if you have three
> > Kafka nodes A, B, C (or whatever), and you can't talk to A within 10
> > seconds, you'll try again with B or C, and still have plenty of time left
> > over.  Whereas currently, if your connection hangs while trying to connect
> > to A, you're out of luck-- you'll just hang until the whole request timeout
> > is gone.  So while you could have tried a different node and succeeded, you
> > never got a chance to.
> >
> > So in the common case where you have other nodes that you can connect to,
> > we won't end up trying to reconnect to the same node over and over.  I'll
> > add some more comments in the vote thread.
> >
> > best,
> > Colin
> >
> >
> > On Fri, May 15, 2020, at 14:13, Rajini Sivaram wrote:
> > > Hi Cheng,
> > >
> > > I am fine with the rest of the KIP apart from the 10s default. If no one
> > > else has any concerns about this new default, let's go with it. Please go
> > > ahead and start vote.
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > >
> > > On Fri, May 15, 2020 at 8:21 PM Cheng Tan  wrote:
> > >
> > > > Dear Rajini,
> > > >
> > > >
> > > > Thanks for the reply.
> > > >
> > > > > We have a lot of these and I want to
> > > > > understand the benefits of the proposed timeout in this case alone.
> > We
> > > > > currently have a request timeout of 30s. Would you consider adding a
> > 10s
> > > > > connection timeout?
> > > >
> > > > A shorter timeout (10s) at the transportation level will help clients
> > > > detect dead nodes faster. “request.timeout.ms” is too general and
> > applies
> > > > to all the requests whose complexity at the application level varies.
> > It’s
> > > > risky to set “request.timeout.ms” to a lower value for detecting dead
> > > > nodes quicker because of the involvement of the application layer.
> > > >
> > > > After “socket.connection.setup.timeout.ms” hits, NetworkClient will fail
> > > > the request in the same way as it handles “request.timeout.ms”. That is
> > > > to say, the response will be constructed from a RetriableException.
> > > > Producer, Consumer, and KafkaAdminClient can then perform their retry
> > > > logic as they would when a request timeout happens.
> > > >
> > > > > We have KIP-612 that is proposing to throttle connection set up on
> > the
> > > > one
> > > > > hand and this KIP that is dramatically reducing default connection
> > > > timeout
> > > > > on the other. Not sure if that is a good idea.
> > > >
> > > > The default of the broker connection creation rate limit is
> > Int.MaxValue.
> > > > The KIP also proposes per-IP throttle configuration. Thus, I don’t
> > expect
> > > > the combination of the broker connection throttle and a shorter client
> > > > transportation timeout will have a side effect.
> > > >
> > > > Does the reasons above make sense to you?
> > > >
> > > > Best, - Cheng
> > > >
> > > >
> > > >
> > > >
> > > > > On May 15, 2020, at 4:49 AM, Rajini Sivaram  > >
> > > > wrote:
> > > > >
> > > > > Hi Cheng,
> > > > >
> > > > > Let me rephrase my question. Let's say we didn't have the case of
> > > > > leastLoadedNode. We are only talking about connections to a specific
> > node
> > > > > (i.e. leader or controller). We have a lot of these and I want to
> > > > > understand the benefits of the proposed timeout in this case alone.
> > We
> > > > > currently have a request timeout of 30s. Would you consider adding a
> > 10s
> > > > > connection timeout? And if you did, what would you expect the 10s
> > timeout
> > > > > to do?
> > > > >
> > > > > a) We could fail a request if connection didn't complete within 10s.
> > If
> > > > we
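To ground the discussion, a client configuration sketch of what is being
proposed (the config names follow KIP-601; the numbers and the backoff cap are
illustrative, since the defaults are exactly what is being debated here):

    java.util.Properties props = new java.util.Properties();
    props.put("bootstrap.servers", "broker-a:9092,broker-b:9092,broker-c:9092");

    // Today a hanging TCP connection attempt can consume the whole request timeout.
    props.put("request.timeout.ms", "30000");

    // Proposed: give up on an unresponsive connection attempt after 10s so the
    // client can try another node (e.g. the leastLoadedNode case) much sooner.
    props.put("socket.connection.setup.timeout.ms", "10000");

    // With the exponential backoff suggested above, repeated attempts against the
    // same node (e.g. the controller) back off up to a cap (value illustrative).
    props.put("socket.connection.setup.timeout.max.ms", "30000");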

Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Aakash Shah
Hi Chris,

Thanks for the suggestions.

If "errors.tolerance=none", should it not be the case that the error
reporter does not even report any error; rather, the task just fails after
throwing the error? I do understand the point you are saying about
duplicates, though.

You raise a good point about "offset.flush.interval.ms" and I think we
should respect that. I will add this constraint to the KIP. Please let me
know if this extra constraint adds any other issues I am not aware of.

Lastly, why do you say we should always provide an errant record reporter?
Doesn't that change the contract of what functionality it is providing?

Thanks,
Aakash


On Tue, May 19, 2020 at 11:15 AM Chris Egerton  wrote:

> Hi Randall,
>
> First off, thank you for the incredibly detailed example. I don't mind
> walls of text. I found it very helpful. I especially liked the idea about
> modifying how the framework invokes "SinkTask::preCommit" to take most of
> the work out of developers' hands in the common case of a "fire-and-forget"
> but still provide flexibility to accommodate connectors with, for example,
> exactly-once delivery guarantees that involve committing offsets to the
> sink atomically with the actual records that they've received from Kafka.
>
> I have one point I'd like to raise about the stated advantage of an
> asynchronous API: that tasks can continue processing records and sending
> them to the sink destination without having to block on the completion of
> the error report.
>
> Wouldn't this actually be a disadvantage in the case that the user has
> configured the connector with "errors.tolerance = none"? In that case, the
> expectation is that the task should fail as soon as it hits a bad record;
> allowing it to possibly continue to produce records in that case (which
> would likely end up as duplicates in the sink if/when the task is
> restarted) doesn't seem optimal.
>
> I don't think that this makes an asynchronous API completely unusable; I
> just think that we'd want to synchronously throw some kind of exception
> when the error reporter is invoked and the connector is configured with
> "errors.tolerance = none", instead of causing one to be thrown wrapped in
> an ExecutionException if/when "Future::get" is called on the returned
> future.
>
> I'd also like to suggest a slight change to the logic for invoking
> "SinkTask::preCommit". The interval at which offsets are committed for sink
> tasks is configurable via the worker-level "offset.flush.interval.ms"
> property; I think it'd be nice to respect that property if we could. What
> would you think about calling "SinkTask::preCommit" at the normally
> scheduled times, but altering the offsets that are passed in to that call
> to not go beyond any offsets for errant records that have been reported but
> not fully processed yet?
>
> For example, imagine a task has been given records with offsets 0-10 on a
> single topic partition and reports records with offsets 2 and 7 to the
> framework. Then, the framework is able to process the record with offset 2
> but not the record with offset 7. When it comes time for an offset commit,
> the framework will call "SinkTask::preCommit" with an offset of 6 for that
> topic partition, since the record for offset 7 has not been completely
> taken care of yet.
>
> One more small suggestion: we may want to always provide an errant record
> reporter to connectors, even if one has not been configured. This reporter
> would simply fail the task and throw an exception as soon as it's invoked.
> This would provide a more uniform experience for users across different
> connectors and would establish expectations that, if a connector uses the
> features added by KIP-610 at all, it will fail by default on any invalid
> records (instead of doing something implementation-dependent).
>
> Cheers,
>
> Chris
>
> On Tue, May 19, 2020 at 10:03 AM Arjun Satish 
> wrote:
>
> > One more concern with the connector blocking on the Future's get() is
> that
> > it may cause the task's consumer to fail to heartbeat (since there is no
> > independent thread to do this). That would then cause failures when we
> > eventually try to consume more records after returning from put(). The
> > developer would need to be cognizant of these bits before waiting on the
> > future, which adds a reasonable amount of complexity.
> >
> > Even with preCommit() returning incomplete offsets, I suppose the concern
> > would be that the put() method keeps giving the task more records, and to
> > truly pause the "firehose", the task needs to pause all partitions?
> >
> >
> > On Tue, May 19, 2020 at 9:26 AM Arjun Satish 
> > wrote:
> >
> > > Can we get a couple of examples that shows utility of waiting on the
> > > Future<>? Also, in preCommit() we would report back on the incomplete
> > > offsets. So that feedback mechanism will already exists for developers
> > who
> > > want to manually manage this.
> > >
> > > On Tue, May 19, 2020 at 8:03 AM Randall Hauch 
> w
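For the stronger-guarantee case debated in this thread, a task that wants to be
sure reported records are fully recorded before offsets advance could keep the
returned futures and wait on them at flush/pre-commit time rather than inside
put(). This is a sketch against the draft API (reporter obtained from the
SinkTaskContext, writeToSink() hypothetical); whether and where to block is
exactly the trade-off discussed above:

    private final Queue<Future<Void>> pendingReports = new ConcurrentLinkedQueue<>();

    @Override
    public void put(Collection<SinkRecord> records) {
        for (SinkRecord record : records) {
            try {
                writeToSink(record);                         // hypothetical write
            } catch (Exception e) {
                pendingReports.add(reporter.report(record, e));
            }
        }
    }

    @Override
    public void flush(Map<TopicPartition, OffsetAndMetadata> currentOffsets) {
        // Drain all outstanding error reports before offsets are committed.
        for (Future<Void> f = pendingReports.poll(); f != null; f = pendingReports.poll()) {
            try {
                f.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new ConnectException("Failed to report errant record", e);
            }
        }
    }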

Re: [VOTE] KIP 585: Filter and conditional SMTs

2020-05-19 Thread Bill Bejeck
Thanks for the KIP Tom, this will be a useful addition.

+1(binding)

-Bill

On Tue, May 19, 2020 at 1:14 PM Tom Bentley  wrote:

> It would be nice to get this into Kafka 2.6. There are 2 binding and 3
> non-binding votes so far. If you've not looked at it already, now would be a
> great time!
>
> Many thanks,
>
> Tom
>
> On Tue, May 19, 2020 at 1:27 PM Mickael Maison 
> wrote:
>
> > +1 (binding)
> > Thanks Tom for leading this KIP and steering the syntax discussion
> > towards a consensus
> >
> > On Tue, May 19, 2020 at 11:29 AM Edoardo Comar 
> wrote:
> > >
> > > +1 (non-binding)
> > > Thanks Tom
> > > --
> > >
> > > Edoardo Comar
> > >
> > > Event Streams for IBM Cloud
> > > IBM UK Ltd, Hursley Park, SO21 2JN
> > >
> > >
> > >
> > >
> > > From:   Gunnar Morling 
> > > To: dev@kafka.apache.org
> > > Date:   19/05/2020 10:35
> > > Subject:[EXTERNAL] Re: [VOTE] KIP 585: Filter and conditional
> > SMTs
> > >
> > >
> > >
> > > +1 (non-binding)
> > >
> > > Thanks for working on this, Tom! This KIP will be very useful for
> > > connectors like Debezium.
> > >
> > > --Gunnar
> > >
> > > Am Fr., 15. Mai 2020 um 20:02 Uhr schrieb Konstantine Karantasis
> > > :
> > > >
> > > > +1 (binding)
> > > >
> > > > Thanks Tom.
> > > >
> > > > Konstantine
> > > >
> > > > On Fri, May 15, 2020 at 5:03 AM Andrew Schofield
> > > 
> > > > wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Thanks for the KIP. This will be very useful.
> > > > >
> > > > > Andrew Schofield
> > > > >
> > > > > On 13/05/2020, 10:14, "Tom Bentley"  wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I'd like to start a vote on KIP-585: Filter and conditional
> SMTs
> > > > >
> > > > >
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-585%3A+Filter+and+Conditional+SMTs
> > >
> > > > >
> > > > > Those involved in the discussion seem to be positively disposed
> > to
> > > the
> > > > > idea, but in the absence of any committer participation it's
> been
> > > > > difficult
> > > > > to find a consensus on how these things should be configured.
> > > What's
> > > > > presented here seemed to be the option which people preferred
> > > overall.
> > > > >
> > > > > Kind regards,
> > > > >
> > > > > Tom
> > > > >
> > > > >
> > >
> > >
> > >
> > >
> > > Unless stated otherwise above:
> > > IBM United Kingdom Limited - Registered in England and Wales with
> number
> > > 741598.
> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
> > 3AU
> > >
> >
> >
>


Re: [VOTE] KIP-221: Enhance KStream with Connecting Topic Creation and Repartition Hint

2020-05-19 Thread Matthias J. Sax
While working on the follow-up PR to deprecate `through()`, I realized
that we forgot to add `repartition()` to the Scala API.

I updated the KIP accordingly, and my PR also includes the update to the
Scala API (for adding `repartition()` and deprecating `through()`):
https://github.com/apache/kafka/pull/8679


Please let us know if there are any concerns.


-Matthias
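For anyone following along, a minimal sketch of the difference being discussed
(repartition() and Repartitioned are the additions from this KIP; the topology
itself is illustrative):

    StreamsBuilder builder = new StreamsBuilder();
    KStream<String, String> stream = builder.stream("input");

    // through(): writes to a user-managed topic and reads it back; effectively
    // syntactic sugar for to() followed by builder.stream().
    KStream<String, String> viaThrough = stream.through("user-created-topic");

    // The equivalent spelled out:
    stream.to("user-created-topic");
    KStream<String, String> rebuilt = builder.stream("user-created-topic");

    // repartition(): Streams creates and manages the repartition topic itself.
    KStream<String, String> repartitioned =
        stream.repartition(Repartitioned.as("my-repartition"));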

On 5/15/20 10:27 AM, Matthias J. Sax wrote:
> Thanks Levani!
> 
> I took the liberty to update the KIP accordingly, because so far no
> concerns were raised about the change.
> 
> Also assigned the ticket to myself (let me know if you want to work on
> it instead). We should get the change into 2.6 release.
> 
> 
> -Matthias
> 
> 
> On 5/15/20 1:34 AM, Levani Kokhreidze wrote:
>> Hi Matthias,
>>
>> Thanks for your thoughts and sorry for taking so long to respond.
>> For me that makes total sense, so +1 from my side. 
>> I took the liberty of creating a ticket for it: 
>> https://issues.apache.org/jira/browse/KAFKA-10003 
>> 
>>
>> Regards,
>> Levani
>>
>>
>>> On May 12, 2020, at 8:30 PM, Guozhang Wang  wrote:
>>>
>>> Sounds fair to me; I think as a syntax sugar it is a good to have, but
>>> sometimes it was "enforced" to be used for repartitioning purposes.
>>>
>>> On Mon, May 11, 2020 at 7:08 PM Matthias J. Sax  wrote:
>>>
 As an afterthought to KIP-221, I am wondering if we should deprecate
 `KStream#through()`?

 The reasoning is that I assume that most people don't want to manage
 topics manually anyway and thus it might be good to guide users to use
 repartition(). Furthermore, through() is really just syntactic sugar for
 to() followed by builder.stream() (thus people don't really lose
 functionality). So far, through() was very nice to have, especially with
 PAPI integration in the DSL (users might need to do a manual
 repartitioning before transform()) however this pattern should be
 subsumed by repartition().

 Reducing the surface area of our API (instead of just enlarging it)
 might be good.

 Thoughts?


 -Matthias

 On 4/5/20 9:36 PM, John Roesler wrote:
> Thanks for the update, Levani!
> -John
>
> On Sat, Apr 4, 2020, at 04:36, Levani Kokhreidze wrote:
>> Hello,
>>
>> Small update regarding this KIP. As per John’s suggestion during the
>> code review
>> (https://github.com/apache/kafka/pull/7170#discussion_r392608571
>> )
>> we’ve decided to remove KeyValueMapper overloads for the new
>> `repartition` operation for the first release of this feature.
>> Wiki page has been updated accordingly
>> (
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-221%3A+Enhance+DSL+with+Connecting+Topic+Creation+and+Repartition+Hint
 <
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-221:+Enhance+DSL+with+Connecting+Topic+Creation+and+Repartition+Hint
> )
>>
>> Regards,
>> Levani
>>
>>> On Aug 1, 2019, at 9:55 AM, Levani Kokhreidze 
 wrote:
>>>
>>> Thank you all!
>>>
>>> The vote has been open for ~8 days. KIP has three binding votes (Bill,
 Guozhang, Matthias) and one non-binding (Sophie) so the KIP vote passes!
>>> I’ll mark KIP as accepted and start working on it as soon as possible!
>>>
>>> Regards,
>>> Levani
>>>
 On Aug 1, 2019, at 2:37 AM, Matthias J. Sax 
 wrote:

 +1 (binding)

 On 7/31/19 8:36 AM, Guozhang Wang wrote:
> Thanks for the update! +1 (binding).
>
> On Tue, Jul 30, 2019 at 11:42 PM Levani Kokhreidze <
 levani.co...@gmail.com>
> wrote:
>
>> Hello Guozhang,
>>
>> Thanks for the feedback. That’s an interesting point. To be honest,
 I
>> totally missed it. I wasn’t aware that there’s `groupBy`
 possibility on
>> KTable.
>> I don’t see any reasons why not to add same functionality to KTable
>> interface.
>>
>> I’ve updated the KIP:
>>
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-221%3A+Enhance+DSL+with+Connecting+Topic+Creation+and+Repartition+Hint
>> <
>>
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-221:+Enhance+DSL+with+Connecting+Topic+Creation+and+Repartition+Hint
>>>
>> Please let me know if you have any other questions and/or concerns.
>>
>> Regards,
>> Levani
>>
>>> On Jul 31, 2019, at 1:24 AM, Guozhang Wang 
 wrote:
>>>
>>> Hello Levani,
>>>
>>> Thanks for the KIP! Just got a quick question here about the
 scope: why
>> do
>>> we only want this for `KStream`, not `KTable#groupBy` for example?
>>>
>>>
>>> Guozhang
>>>
>

Re: [DISCUSS] KIP-598: Augment TopologyDescription with store and source / sink serde information

2020-05-19 Thread Guozhang Wang
We already have a `Serdes` class actually, which is a factory class. What we
really need is to add new functions to the `Serde`, `Serializer` and
`Deserializer` interfaces, but since we have already dropped Java 7, backward
compatibility may not be a big issue anyway. Let me think about it a bit more.

On Tue, May 19, 2020 at 12:01 PM Matthias J. Sax  wrote:

> Thanks Guozhang.
>
> This makes sense. I am still wondering about wrapped serdes:
>
> > and if it is a wrapper serde, also print its inner
> >>> serde name
>
> How can our default implementation of `TopologyDescriber` know if it's a
> wrapped serde or not? Furthermore, how do wrapped serdes expose their
> inner serdes?
>
> I am also not sure what the purpose of TopologyDescriber is. Would it
> maybe be better to add a new interface that `Serdes` can implement instead?
>
>
> -Matthias
>
>
>
> On 5/18/20 9:24 PM, Guozhang Wang wrote:
> > Bruno, Matthias:
> >
> > Thanks for your inputs. After some thoughts I've decide to update my
> > proposal in the following way:
> >
> > 1. Store#serdes() would return a "Map"
> >
> > 2. Topology's description would be independent of whether it is generated
> > from `StreamsBuilder#build(props)` or `StreamsBuilder#build()`, and if
> the
> > serde is not known we would use "" as the default value.
> >
> > 3. Add `List TopologyDescription#sourceTopics() / sinkTopics() /
> > repartitionTopics() / changelogTopics()` and for pattern /
> topic-extractor
> > we would use fixed format of "" and
> > "".
> >
> >
> > I will try to implement this in my existing PR and after I've confirmed
> it
> > works, I will update the final wiki for voting.
> >
> >
> > Guozhang
> >
> >
> > On Mon, May 18, 2020 at 9:13 PM Guozhang Wang 
> wrote:
> >
> >> Hello Andy,
> >>
> >> Thanks a lot for your comments! I do not mind at all :)
> >>
> >> I think that's a valid point, what I have in mind is to expose an
> >> interface which can be optionally overridden in the overridden
> describe()
> >> call:
> >>
> >> Topology#describe(final TopologyDescriber)
> >>
> >> Interface TopologyDescriber {
> >>
> >> default describeSerde(final Serde);
> >>
> >> default describeSerializer(final Serializer);
> >>
> >> default describeDeserializer(final Deserializer);
> >> }
> >>
> >> And we would expose a DefaultTopologyDescriber class that just print the
> >> serde class names -- and if it is a wrapper serde, also print its inner
> >> serde name.
> >>
> >> Guozhang
> >>
> >>
> >> On Mon, May 11, 2020 at 12:13 PM Andy Coates  wrote:
> >>
> >>> Hi Guozhang,
> >>>
> >>> Thanks for writing this up. I’m very interested to see this, so I hope
> >>> you don’t mind me commenting.
> >>>
> >>> I’ve only really one comment to make, and that’s on the text printed
> for
> >>> the serde classes:
> >>>
> >>> As I understand it, the name will either come from the passed in
> config,
> >>> or may default to “unknown”, or may be obtained from the instances
> passed
> >>> while building the topology. It’s this latter case that interests me.
> >>> Where you have an actual serde instance could we not output more
> >>> information?
> >>>
> >>> The examples use simple (de)serialization classes such as
> >>> `LongDeserializer` where the name alone imparts all the information
> the
> >>> user is likely to need. However, users may provide their own custom
> >>> serialisers and such serialisers may contain state that is important,
> e.g.
> >>> the serialiser may know the schema of the data being serialized.  May
> there
> >>> be benefit from taking the `toString()` representation of the
> serialiser?
> >>>
> >>> Of course, this would require adding suitable `toString` impls to our
> own
> >>> stock serialisers, but may ultimately prove more versatile in the
> future.
> >>> The downside is potential to corrupt the topology description, e.g. a
> >>> toString that includes new lines etc.
> >>>
> >>> Thanks,
> >>>
> >>> Andy
> >>>
> >>>
> >>>
>  On 4 May 2020, at 19:27, Bruno Cadonna  wrote:
> 
>  Hi Guozhang,
> 
>  Thank you for the KIP!
> 
>  Exposing also the inner types of the wrapper serdes would be
>  important. For debugging as Matthias has already mentioned and to see
>  more easily changes that are applied to a topology.
> 
>  I am also +1 on the `toJson()` method to easily access the topology
>  description programmatically and to make the description backward
>  compatible.
> 
>  Regarding `List serdeNames();`, I would be in favour of a more
>  expressive return type, like a Map that assigns labels to Serde names.
>  For example, for key and value serdes the label could be "key" and
>  "value". Or something similar.
> 
>  Best,
>  Bruno
> 
> 
> 
>  On Mon, May 4, 2020 at 2:25 AM Guozhang Wang 
> >>> wrote:
> >
> > Hello Matthias John, thanks for your comments!! Replied them inline.
> >
> > I think there are a couple open questions that I'd like to hear your
> > opinions on with th
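As a rough sketch of where the proposal above lands (the archive stripped the
generic types from the message, so the exact signatures below are assumptions
rather than the agreed API):

    // Today the description carries no serde information:
    Topology topology = builder.build();
    System.out.println(topology.describe());

    // Proposed additions, per points 1 and 3 above (illustrative signatures):
    //   Map<String, String> serdes = storeDescription.serdes();
    //       e.g. "key" -> "org.apache.kafka.common.serialization.Serdes$LongSerde",
    //       falling back to a fixed "unknown" marker when not resolvable.
    //   List<String> sourceTopics      = topologyDescription.sourceTopics();
    //   List<String> sinkTopics        = topologyDescription.sinkTopics();
    //   List<String> repartitionTopics = topologyDescription.repartitionTopics();
    //   List<String> changelogTopics   = topologyDescription.changelogTopics();
    //       with fixed placeholder strings for pattern sources / dynamic sinks.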

Build failed in Jenkins: kafka-trunk-jdk14 #103

2020-05-19 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Update stream documentation (#8622)


--
[...truncated 3.09 MB...]

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordsFromKeyValuePairs STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateConsumerRecordsFromKeyValuePairs PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKeyAndDefaultTimestamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKeyAndDefaultTimestamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithDefaultTimestamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldRequireCustomTopicNameIfNotDefaultFactoryTopicNameWithDefaultTimestamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKey STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldNotAllowToCreateTopicWithNullTopicNameWithNullKey PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecordWithOtherTopicNameAndTimestampWithTimetamp 
STARTED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateNullKeyConsumerRecordWithOtherTopicNameAndTimestampWithTimetamp 
PASSED

org.apache.kafka.streams.test.ConsumerRecordFactoryTest > 
shouldCreateCon

Re: [DISCUSS] KIP-589 Add API to Update Replica State in Controller

2020-05-19 Thread David Arthur
Thanks, Jason. Good feedback

1. I was mostly referring to the fact that the ReplicaManager uses a
background thread to send the ZK notification and really has no
visibility into whether the ZK operation succeeded or not. We'll most
likely want to continue using a background thread for batching purposes
with the new RPC, and retries make sense as well (see the sketch below,
after point 3).

2. Yes, I'll change that

3. Thanks, I neglected to mention this. Indeed I was considering
ControlledShutdown when originally thinking about this KIP. A Future Work
section is a good idea, I'll add one.
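
As a rough sketch of the batching-and-retry loop mentioned in point 1 --
hypothetical names and placeholder types only, not code from the KIP or any
PR:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Hypothetical sketch: a broker-side background thread that batches
    // offline replicas and retries the proposed AlterReplicaState RPC until
    // the controller acknowledges it. The String "topic-partition" values and
    // the sendAlterReplicaState stub are placeholders for the real types.
    public class ReplicaStateReporter implements Runnable {

        private final BlockingQueue<String> offlineReplicas = new LinkedBlockingQueue<>();
        private volatile boolean running = true;

        // Called from the log-dir failure handler; never blocks the caller.
        public void reportOffline(String topicPartition) {
            offlineReplicas.add(topicPartition);
        }

        @Override
        public void run() {
            while (running) {
                try {
                    List<String> batch = new ArrayList<>();
                    batch.add(offlineReplicas.take());   // wait for at least one failure
                    offlineReplicas.drainTo(batch);      // then batch everything queued so far
                    while (running && !sendAlterReplicaState(batch)) {
                        Thread.sleep(1_000L);            // retry until the controller accepts it
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    running = false;
                }
            }
        }

        public void shutdown() {
            running = false;
        }

        // Placeholder for the actual RPC call; returns true on success.
        private boolean sendAlterReplicaState(List<String> batch) {
            System.out.println("AlterReplicaState(" + batch + ")");
            return true;
        }
    }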

On Tue, May 19, 2020 at 2:58 PM Jason Gustafson  wrote:

> Hi David,
>
> This looks good. I just have a few comments:
>
> 1. I'm not sure it's totally fair to describe the current notification
> mechanism as "best-effort." At least it guarantees that the controller will
> eventually see the event. In any case, I think we might want a stronger
> contract going forward. As long as the broker remains the leader for
> partitions in offline log directories, it seems like we should retry the
> AlterReplicaState requests.
> 2. Should we consider a new name for `UNKNOWN_REPLICA_EVENT_TYPE`? Maybe
> `UNKNOWN_REPLICA_STATE`?
> 3. Mostly an observation, but there is some overlap with this API and
> ControlledShutdown. From the controller's perspective, the intent is mostly
> the same. I guess we could treat a null array in the request as an intent
> to shutdown all replicas if we wanted to try and converge these APIs. One
> of the differences is that ControlledShutdown is a synchronous API, but I
> think it would have actually been better as an asynchronous API since
> historically we have run into problems with timeouts. Anyway, this is
> outside the scope of this KIP, but might be worth mentioning as "Future
> work" somewhere.
>
> Thanks,
> Jason
>
>
> On Mon, May 18, 2020 at 10:09 AM David Arthur  wrote:
>
> > I've updated the KIP with the feedback from this discussion
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller
> > .
> > I'll send out the vote thread shortly.
> >
> > Thanks again,
> > David
> >
> > On Tue, May 5, 2020 at 10:34 AM Tom Bentley  wrote:
> >
> > > Hi Colin,
> > >
> > > Yeah, that makes sense, thanks. I was thinking, longer term, that there
> > are
> > > other benefits to having the log dir information available to the
> > > controller. For example it would allow the possibility for CREATE_TOPIC
> > > requests to include the intended log dir for each replica. But that's
> > > obviously completely out of scope for this KIP.
> > >
> > > Kind regards,
> > >
> > > Tom
> > >
> > > On Mon, May 4, 2020 at 10:11 PM Colin McCabe 
> wrote:
> > >
> > > > Hi Tom,
> > > >
> > > > As you said, the controller doesn't know about log directories,
> > although
> > > > individual brokers do.  So the brokers do currently have to enumerate
> > all
> > > > the partitions that need to be removed to the controllers explicitly.
> > So
> > > > this KIP isn't changing anything in that regard.
> > > >
> > > > The current flow is:
> > > > 1. ping ZK back-channel
> > > > 2. controller sends a full LeaderAndIsrRequest to the broker
> > > > 3. the broker sends a full response containing error codes for all
> > > > partitions.  Partitions on the failed storage have a nonzero error
> > code;
> > > > the others have 0.
> > > >
> > > > The new flow is:
> > > > 1. the broker sends an RPC with all the failed partitions
> > > >
> > > > So the new flow actually substantially reduces the amount of network
> > > > traffic, since previously we sent a full LeaderAndIsrRequest
> containing
> > > all
> > > > of the partitions.  Now we just send all the partitions in the failed
> > > > storage directory.  That could still be a lot, but certainly only be
> a
> > > > fraction of what a full LeaderAndIsrRequest would have.
> > > >
> > > > Sorry if I'm repeating stuff you already figured out, but I just
> wanted
> > > to
> > > > be more clear about this (I found it confusing too until David
> > explained
> > > it
> > > > to me originally...)
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Sat, May 2, 2020, at 10:30, Tom Bentley wrote:
> > > > > Hi David,
> > > > >
> > > > > > In the rejecting the alternative of having an RPC for log dir
> > > failures
> > > > > > you say
> > > > > >
> > > > > > I guess what I really mean here is that I wanted to avoid
> exposing
> > > the
> > > > > > notion of a log dir to the controller. I can update the
> description
> > > to
> > > > > > reflect this.
> > > > > >
> > > > >
> > > > > Ah, I think I see now. While each broker knows about its log dirs
> > this
> > > > > isn't something that's stored in zookeeper or known to the
> > controller.
> > > > >
> > > > >
> > > > > > > It's also not completely clear that the cost of having to
> > enumerate
> > > > all
> > > > > > the partitions on a log dir was weighed against the perceived
> > benefit
> > > > of a
> > > > > > more flexible RPC.
> 

Re: [DISCUSS] KIP-573: Enable TLSv1.3 by default

2020-05-19 Thread Nikolay Izhikov
PR - https://github.com/apache/kafka/pull/8695
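
For context, a minimal sketch of the kind of runtime check being discussed
(a hypothetical helper class, not taken from the PR): only advertise
TLSv1.3 when the running JVM actually supports it.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import javax.net.ssl.SSLContext;

    // Hypothetical helper: start from TLSv1.2 and add TLSv1.3 only if the
    // JVM's default SSLContext supports it (at the time of this thread,
    // Java 8 did not, while Java 11 does).
    public class TlsDefaults {

        public static List<String> defaultEnabledProtocols() throws Exception {
            List<String> enabled = new ArrayList<>();
            enabled.add("TLSv1.2");
            String[] supported = SSLContext.getDefault()
                    .getSupportedSSLParameters()
                    .getProtocols();
            if (Arrays.asList(supported).contains("TLSv1.3")) {
                enabled.add("TLSv1.3");
            }
            return enabled;
        }

        public static void main(String[] args) throws Exception {
            System.out.println("ssl.enabled.protocols = "
                    + String.join(",", defaultEnabledProtocols()));
        }
    }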

> On 18 May 2020, at 23:30, Nikolay Izhikov wrote:
> 
> Hello, Colin
> 
> We need the hack only because TLSv1.3 is not supported in Java 8.
> 
>> Java 8 will receive TLS 1.3 support later this year 
>> (https://java.com/en/jre-jdk-cryptoroadmap.html)
> 
> We can 
> 
> 1. Enable TLSv1.3 for Java 11 now, and remove the workaround once Java 8 
> gets TLSv1.3 support.
> 2. Or we can wait and enable it only after the Java 8 update.
> 
> What do you think?
> 
>> On 18 May 2020, at 22:51, Ismael Juma wrote:
>> 
>> Yeah, agreed. One option is to actually only change this in Apache Kafka
>> 3.0 and avoid the hack altogether. We could make TLS 1.3 the default and
>> have 1.2 as one of the enabled protocols.
>> 
>> Ismael
>> 
>> On Mon, May 18, 2020 at 12:24 PM Colin McCabe  wrote:
>> 
>>> Hmm.  It would be good to figure out if we are going to remove this
>>> compatibility hack in the next major release of Kafka?  In other words, in
>>> Kafka 3.0, will we enable TLS 1.3 by default even if the cipher suite is
>>> specified?
>>> 
>>> best,
>>> Colin
>>> 
>>> 
>>> On Mon, May 18, 2020, at 09:26, Ismael Juma wrote:
 Sounds good.
 
 Ismael
 
 
 On Mon, May 18, 2020, 9:03 AM Nikolay Izhikov 
>>> wrote:
 
>> A safer approach may be to only add TLS 1.3 to the list if the cipher
> suite config has not been specified.
>> So, if TLS 1.3 is added to the list by Kafka, it would seem that it
> would not work if the user specified a list of cipher suites for
>>> previous
> TLS versions
> 
> Let’s just add test for this case?
> I can prepare the preliminary PR for this KIP and add this kind of
>>> test to
> it.
> 
> What do you think?
> 
> 
>> On 18 May 2020, at 18:59, Nikolay Izhikov wrote:
>> 
>>> 1. I meant that `ssl.protocol` is TLSv1.2 while
>>> `ssl.enabled.protocols`
> is `TLSv1.2, TLSv1.3`. How do these two configs interact
>> 
>> `ssl.protocol` is what will be used by default; in this KIP it stays
> unchanged (TLSv1.2). Please see [1]
>> `ssl.enabled.protocols` is list of protocols that  *can* be used.
>>> This
> value is just passed to the `SSLEngine` implementation.
>> Please, see DefaultSslEngineFactory#createSslEngine [2]
>> 
>>> 2. My question is not about obsolete protocols, it is about people
> using TLS 1.2 with specified cipher suites. How will that behave when
>>> TLS
> 1.3 is enabled by default?
>> 
>> They don’t need to change anything and it all just works as expected on Java 11.
>> 
>>> 3. An additional question is how does this impact Java 8 users?
>> 
>> Yes.
>> If SSLEngine doesn’t support TLSv1.3 then java8 users should
>>> explicitly
> modify `ssl.enabled.protocols` and set it to `TLSv1.2`.
>> 
>> [1]
> 
>>> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/security/ssl/DefaultSslEngineFactory.java#L218
>> [2]
> 
>>> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/security/ssl/DefaultSslEngineFactory.java#L164
>> 
>>> On 18 May 2020, at 17:34, Ismael Juma wrote:
>>> 
>>> Nikolay,
>>> 
>>> Thanks for the comments. More below:
>>> 
>>> 1. I meant that `ssl.protocol` is TLSv1.2 while
>>> `ssl.enabled.protocols`
> is `TLSv1.2, TLSv1.3`. How do these two configs interact?
>>> 2. My question is not about obsolete protocols, it is about people
> using TLS 1.2 with specified cipher suites. How will that behave when
>>> TLS
> 1.3 is enabled by default?
>>> 3. An additional question is how does this impact Java 8 users?
>>> Java 8
> will receive TLS 1.3 support later this year (
> https://java.com/en/jre-jdk-cryptoroadmap.html), but it currently does
> not support it. One way to handle this would be to check if the
>>> underlying
> JVM supports TLS 1.3 before enabling it.
>>> 
>>> I hope this clarifies my questions.
>>> 
>>> Ismael
>>> 
>>> On Mon, May 18, 2020 at 6:44 AM Nikolay Izhikov <
>>> nizhi...@apache.org>
> wrote:
>>> Hello, Ismael.
>>> 
>>> Here is answers to your questions:
>>> 
 Quick question, the following is meant to include TLSv1.3 as well,
> right?
 Change the value of the SslConfigs.DEFAULT_SSL_ENABLED_PROTOCOLS to
> «TLSv1.2»
>>> 
>>> I propose to have the following value
> SslConfigs.DEFAULT_SSL_ENABLED_PROTOCOLS = «TLSv1.2,TLSv1.3»
>>> 
 1. `ssl.protocol` would remain TLSv1.2 with this change. It would
>>> be
> good to explain why that's OK.
>>> 
>>> I think it covered by the following statements in KIP.
>>> If you know more trustworthy sources of this kind of information,
> please, let me know.
>>> 
>>> ```
>>> For now, only TLS1.2 and TLS1.3 are recommended for use; other
> versions of TLS co

Jenkins build is back to normal : kafka-trunk-jdk8 #4543

2020-05-19 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-613: Add end-to-end latency metrics to Streams

2020-05-19 Thread Guozhang Wang
+1 on the updated proposal. Thanks Sophie.

On Tue, May 19, 2020 at 11:51 AM John Roesler  wrote:

> Thanks for the KIP, Sophie.
>
> I'm +1 (binding) on the current proposal.
>
> Thanks,
> -John
>
> On Fri, May 15, 2020, at 19:45, Guozhang Wang wrote:
> > Hi Sophie,
> >
> > Sorry I was a bit late on following up the DISCUSS thread. I still have
> > some comment about the processor-node level metrics and left a reply on
> > that thread.
> >
> >
> > Guozhang
> >
> > On Fri, May 15, 2020 at 2:34 PM Sophie Blee-Goldman  >
> > wrote:
> >
> > > Hey all,
> > >
> > > I think we've reached a general consensus on all the non-implementation
> > > details of KIP-613 so I'd like to call for a vote.
> > >
> > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-613%3A+Add+end-to-end+latency+metrics+to+Streams
> > >
> > > Thanks!
> > > Sophie
> > >
> >
> >
> > --
> > -- Guozhang
> >
>


-- 
-- Guozhang


Re: [DISCUSS] KIP-598: Augment TopologyDescription with store and source / sink serde information

2020-05-19 Thread Matthias J. Sax
Thanks Guozhang.

This makes sense. I am still wondering about wrapped serdes:

> and if it is a wrapper serde, also print its inner
>>> serde name

How can our default implementation of `TopologyDescriber` know if it's a
wrapped serde or not? Furthermore, how do wrapped serdes expose their
inner serdes?
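
For illustration only -- none of the names below are in the KIP -- one
possible shape would be a marker interface that wrapper serdes could
implement, so that a describer can recurse into their inner serdes:

    import java.util.List;
    import org.apache.kafka.common.serialization.Serde;

    // Hypothetical sketch only: a marker interface that wrapper serdes (e.g.
    // a windowed or list serde) could implement so that a describer can
    // recurse into the inner serde(s) when printing names.
    public interface WrappingSerde<T> extends Serde<T> {

        List<Serde<?>> innerSerdes();

        static String describe(Serde<?> serde) {
            StringBuilder sb = new StringBuilder(serde.getClass().getSimpleName());
            if (serde instanceof WrappingSerde) {
                sb.append('[');
                List<Serde<?>> inners = ((WrappingSerde<?>) serde).innerSerdes();
                for (int i = 0; i < inners.size(); i++) {
                    if (i > 0) sb.append(", ");
                    sb.append(describe(inners.get(i)));
                }
                sb.append(']');
            }
            return sb.toString();
        }
    }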

I am also not sure what the purpose of TopologyDescriber is. Would it
maybe be better to add a new interface that `Serdes` can implement instead?


-Matthias



On 5/18/20 9:24 PM, Guozhang Wang wrote:
> Bruno, Matthias:
> 
> Thanks for your inputs. After some thoughts I've decide to update my
> proposal in the following way:
> 
> 1. Store#serdes() would return a "Map"
> 
> 2. Topology's description would be independent of whether it is generated
> from `StreamsBuilder#build(props)` or `StreamsBuilder#build()`, and if the
> serde is not known we would use "" as the default value.
> 
> 3. Add `List TopologyDescription#sourceTopics() / sinkTopics() /
> repartitionTopics() / changelogTopics()` and for pattern / topic-extractor
> we would use fixed format of "" and
> "".
> 
> 
> I will try to implement this in my existing PR and after I've confirmed it
> works, I will update the final wiki for voting.
> 
> 
> Guozhang
> 
> 
> On Mon, May 18, 2020 at 9:13 PM Guozhang Wang  wrote:
> 
>> Hello Andy,
>>
>> Thanks a lot for your comments! I do not mind at all :)
>>
>> I think that's a valid point, what I have in mind is to expose an
>> interface which can be optionally overridden in the overridden describe()
>> call:
>>
>> Topology#describe(final TopologyDescriber)
>>
>> Interface TopologyDescriber {
>>
>> default describeSerde(final Serde);
>>
>> default describeSerializer(final Serializer);
>>
>> default describeDeserializer(final Deserializer);
>> }
>>
>> And we would expose a DefaultTopologyDescriber class that just print the
>> serde class names -- and if it is a wrapper serde, also print its inner
>> serde name.
>>
>> Guozhang
>>
>>
>> On Mon, May 11, 2020 at 12:13 PM Andy Coates  wrote:
>>
>>> Hi Guozhang,
>>>
>>> Thanks for writing this up. I’m very interested to see this, so I hope
>>> you don’t mind me commenting.
>>>
>>> I’ve only really one comment to make, and that’s on the text printed for
>>> the serde classes:
>>>
>>> As I understand it, the name will either come from the passed in config,
>>> or may default to “unknown”, or may be obtained from the instances passed
>>> while building the topology. It’s this latter case that interests me.
>>> Where you have an actual serde instance could we not output more
>>> information?
>>>
>>> The examples use simple (de)serialization classes such as
>>> `LongDeserializer`, where the name alone imparts all the information the
>>> user is likely to need. However, users may provide their own custom
>>> serialisers, and such serialisers may contain state that is important, e.g.
>>> the serialiser may know the schema of the data being serialized. Might there
>>> be a benefit from taking the `toString()` representation of the serialiser?
>>>
>>> Of course, this would require adding suitable `toString` impls to our own
>>> stock serialisers, but may ultimately prove more versatile in the future.
>>> The downside is the potential to corrupt the topology description, e.g. a
>>> toString that includes new lines etc.
>>>
>>> Thanks,
>>>
>>> Andy
>>>
>>>
>>>
 On 4 May 2020, at 19:27, Bruno Cadonna  wrote:

 Hi Guozhang,

 Thank you for the KIP!

 Exposing also the inner types of the wrapper serdes would be
 important. For debugging as Matthias has already mentioned and to see
 more easily changes that are applied to a topology.

 I am also +1 on the `toJson()` method to easily access the topology
 description programmatically and to make the description backward
 compatible.

 Regarding `List serdeNames();`, I would be in favour of a more
 expressive return type, like a Map that assigns labels to Serde names.
 For example, for key and value serdes the label could be "key" and
 "value". Or something similar.

 Best,
 Bruno



 On Mon, May 4, 2020 at 2:25 AM Guozhang Wang 
>>> wrote:
>
> Hello Matthias John, thanks for your comments!! Replied them inline.
>
> I think there are a couple open questions that I'd like to hear your
> opinions on with the context:
>
> a. For stores's serdes, the reason I proposed to expose a set of serde
> names instead of a pair of key / value serdes is for future possible
>>> store
> types which may not be key-values. I admit it could just be
>>> overkill
> here so if you have a strong preference on the latter, I could be
>>> convinced
> to change that part but I'd want to make the original motivation clear.
>
> b. I think I'm convinced that I'd just augment the `toString` result
> regardless of which func generated the Topology (and hence its
> TopologyDescription), note this would mean 

Re: [DISCUSS] KIP-589 Add API to Update Replica State in Controller

2020-05-19 Thread Jason Gustafson
Hi David,

This looks good. I just have a few comments:

1. I'm not sure it's totally fair to describe the current notification
mechanism as "best-effort." At least it guarantees that the controller will
eventually see the event. In any case, I think we might want a stronger
contract going forward. As long as the broker remains the leader for
partitions in offline log directories, it seems like we should retry the
AlterReplicaState requests.
2. Should we consider a new name for `UNKNOWN_REPLICA_EVENT_TYPE`? Maybe
`UNKNOWN_REPLICA_STATE`?
3. Mostly an observation, but there is some overlap with this API and
ControlledShutdown. From the controller's perspective, the intent is mostly
the same. I guess we could treat a null array in the request as an intent
to shutdown all replicas if we wanted to try and converge these APIs. One
of the differences is that ControlledShutdown is a synchronous API, but I
think it would have actually been better as an asynchronous API since
historically we have run into problems with timeouts. Anyway, this is
outside the scope of this KIP, but might be worth mentioning as "Future
work" somewhere.

Thanks,
Jason


On Mon, May 18, 2020 at 10:09 AM David Arthur  wrote:

> I've updated the KIP with the feedback from this discussion
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller
> .
> I'll send out the vote thread shortly.
>
> Thanks again,
> David
>
> On Tue, May 5, 2020 at 10:34 AM Tom Bentley  wrote:
>
> > Hi Colin,
> >
> > Yeah, that makes sense, thanks. I was thinking, longer term, that there
> are
> > other benefits to having the log dir information available to the
> > controller. For example it would allow the possibility for CREATE_TOPIC
> > requests to include the intended log dir for each replica. But that's
> > obviously completely out of scope for this KIP.
> >
> > Kind regards,
> >
> > Tom
> >
> > On Mon, May 4, 2020 at 10:11 PM Colin McCabe  wrote:
> >
> > > Hi Tom,
> > >
> > > As you said, the controller doesn't know about log directories,
> although
> > > individual brokers do.  So the brokers do currently have to enumerate
> all
> > > the partitions that need to be removed to the controllers explicitly.
> So
> > > this KIP isn't changing anything in that regard.
> > >
> > > The current flow is:
> > > 1. ping ZK back-channel
> > > 2. controller sends a full LeaderAndIsrRequest to the broker
> > > 3. the broker sends a full response containing error codes for all
> > > partitions.  Partitions on the failed storage have a nonzero error
> code;
> > > the others have 0.
> > >
> > > The new flow is:
> > > 1. the broker sends an RPC with all the failed partitions
> > >
> > > So the new flow actually substantially reduces the amount of network
> > > traffic, since previously we sent a full LeaderAndIsrRequest containing
> > all
> > > of the partitions.  Now we just send all the partitions in the failed
> > > storage directory.  That could still be a lot, but certainly only be a
> > > fraction of what a full LeaderAndIsrRequest would have.
> > >
> > > Sorry if I'm repeating stuff you already figured out, but I just wanted
> > to
> > > be more clear about this (I found it confusing too until David
> explained
> > it
> > > to me originally...)
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Sat, May 2, 2020, at 10:30, Tom Bentley wrote:
> > > > Hi David,
> > > >
> > > > > In the rejecting the alternative of having an RPC for log dir
> > failures
> > > > > you say
> > > > >
> > > > > I guess what I really mean here is that I wanted to avoid exposing
> > the
> > > > > notion of a log dir to the controller. I can update the description
> > to
> > > > > reflect this.
> > > > >
> > > >
> > > > Ah, I think I see now. While each broker knows about its log dirs
> this
> > > > isn't something that's stored in zookeeper or known to the
> controller.
> > > >
> > > >
> > > > > > It's also not completely clear that the cost of having to
> enumerate
> > > all
> > > > > the partitions on a log dir was weighed against the perceived
> benefit
> > > of a
> > > > > more flexible RPC.
> > > > >
> > > > > The enumeration isn't strictly required. In the "RPC semantics"
> > > section, I
> > > > > mention that if no topics are present in the RPC request, then all
> > > topics
> > > > > on the broker are implied. And if a topic is given with no
> > partitions,
> > > all
> > > > > partitions for that topic (on the broker) are implied. Does this
> make
> > > > > sense?
> > > > >
> > > >
> > > > So the no-topics-present optimisation wouldn't be available to a
> broker
> > > > with >1 log dirs where only some of the log dirs failed. I don't
> > suppose
> > > > that's a problem though.
> > > >
> > > > Thanks again,
> > > >
> > > > Tom
> > > >
> > > >
> > > > On Fri, May 1, 2020 at 5:48 PM David Arthur 
> wrote:
> > > >
> > > > > Jose/Colin/Tom, thanks for the feedback!
> > > > >
> > > > > > Partition level errors
> > > > >
> > > > > This was an 

Re: [VOTE] KIP-613: Add end-to-end latency metrics to Streams

2020-05-19 Thread John Roesler
Thanks for the KIP, Sophie.

I'm +1 (binding) on the current proposal.

Thanks,
-John

On Fri, May 15, 2020, at 19:45, Guozhang Wang wrote:
> Hi Sophie,
> 
> Sorry I was a bit late on following up the DISCUSS thread. I still have
> some comment about the processor-node level metrics and left a reply on
> that thread.
> 
> 
> Guozhang
> 
> On Fri, May 15, 2020 at 2:34 PM Sophie Blee-Goldman 
> wrote:
> 
> > Hey all,
> >
> > I think we've reached a general consensus on all the non-implementation
> > details of KIP-613 so I'd like to call for a vote.
> >
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-613%3A+Add+end-to-end+latency+metrics+to+Streams
> >
> > Thanks!
> > Sophie
> >
> 
> 
> -- 
> -- Guozhang
>


Re: Granting permission for Create KIP

2020-05-19 Thread Matthias J. Sax
Done.

On 5/19/20 7:10 AM, 阮良 wrote:
> Please grant permission for Create KIP to wiki ID: ruanliang_hualun
> 



signature.asc
Description: OpenPGP digital signature


Build failed in Jenkins: kafka-trunk-jdk14 #102

2020-05-19 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Small fixes in the documentation (#8623)


--
[...truncated 6.18 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCleanUpPersistentStateStoresOnClose[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpda

Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Chris Egerton
Hi Randall,

First off, thank you for the incredibly detailed example. I don't mind
walls of text. I found it very helpful. I especially liked the idea about
modifying how the framework invokes "SinkTask::preCommit" to take most of
the work out of developers' hands in the common case of a "fire-and-forget"
but still provide flexibility to accommodate connectors with, for example,
exactly-once delivery guarantees that involve committing offsets to the
sink atomically with the actual records that they've received from Kafka.

I have one point I'd like to raise about the stated advantage of an
asynchronous API: that tasks can continue processing records and sending
them to the sink destination without having to block on the completion of
the error report.

Wouldn't this actually be a disadvantage in the case that the user has
configured the connector with "errors.tolerance = none"? In that case, the
expectation is that the task should fail as soon as it hits a bad record;
allowing it to possibly continue to produce records in that case (which
would likely end up as duplicates in the sink if/when the task is
restarted) doesn't seem optimal.

I don't think that this makes an asynchronous API completely unusable; I
just think that we'd want to synchronously throw some kind of exception
when the error reporter is invoked and the connector is configured with
"errors.tolerance = none", instead of causing one to be thrown wrapped in
an ExecutionException if/when "Future::get" is called on the returned
future.
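
For what it's worth, here is a rough usage sketch; the nested
ErrantRecordReporter interface and its report() signature are assumptions
standing in for whatever the KIP finalizes. It shows a task that
fires-and-forgets reports and only blocks when it wants stronger guarantees:

    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.List;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.Future;
    import org.apache.kafka.connect.sink.SinkRecord;

    // Hypothetical sketch of a sink task using an asynchronous errant-record
    // reporter. The nested interface mirrors the shape discussed in this
    // thread; treat the exact names and signature as assumptions.
    public class ExampleSinkTaskFragment {

        interface ErrantRecordReporter {
            Future<Void> report(SinkRecord record, Throwable error);
        }

        private final ErrantRecordReporter reporter;
        private final List<Future<Void>> pendingReports = new ArrayList<>();

        public ExampleSinkTaskFragment(ErrantRecordReporter reporter) {
            this.reporter = reporter;
        }

        public void put(Collection<SinkRecord> records) {
            for (SinkRecord record : records) {
                try {
                    writeToSink(record);
                } catch (Exception e) {
                    // Fire-and-forget for most tasks; keep the Future only if
                    // the task wants stronger guarantees than Connect's default.
                    pendingReports.add(reporter.report(record, e));
                }
            }
        }

        // A task that needs every report durably recorded before it commits
        // can block here; others can simply ignore pendingReports.
        public void flushReports() throws ExecutionException, InterruptedException {
            for (Future<Void> f : pendingReports) {
                f.get();
            }
            pendingReports.clear();
        }

        private void writeToSink(SinkRecord record) {
            // placeholder for the actual write to the external system
        }
    }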

I'd also like to suggest a slight change to the logic for invoking
"SinkTask::preCommit". The interval at which offsets are committed for sink
tasks is configurable via the worker-level "offset.flush.interval.ms"
property; I think it'd be nice to respect that property if we could. What
would you think about calling "SinkTask::preCommit" at the normally
scheduled times, but altering the offsets that are passed in to that call
to not go beyond any offsets for errant records that have been reported but
not fully processed yet?

For example, imagine a task has been given records with offsets 0-10 on a
single topic partition and reports records with offsets 2 and 7 to the
framework. Then, the framework is able to process the record with offset 2
but not the record with offset 7. When it comes time for an offset commit,
the framework will call "SinkTask::preCommit" with an offset of 6 for that
topic partition, since the record for offset 7 has not been completely
taken care of yet.
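
A tiny sketch of that bookkeeping (a hypothetical helper, not framework
code), using plain strings for topic partitions: the committable offset per
partition is capped just below the earliest reported-but-unresolved record,
which yields 6 in the example above.

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch: cap the offset handed to SinkTask::preCommit so it
    // never passes a reported record whose error handling has not finished.
    public class SafeCommitOffsets {

        // consumedUpTo:    highest offset consumed per partition (e.g. 10)
        // earliestPending: earliest reported-but-unresolved offset per
        //                  partition (e.g. 7), absent if nothing is pending
        // returns:         offset to hand to preCommit (e.g. 6 in the example)
        public static Map<String, Long> safeOffsets(Map<String, Long> consumedUpTo,
                                                    Map<String, Long> earliestPending) {
            Map<String, Long> result = new HashMap<>();
            for (Map.Entry<String, Long> entry : consumedUpTo.entrySet()) {
                Long pending = earliestPending.get(entry.getKey());
                long safe = pending == null ? entry.getValue() : Math.min(entry.getValue(), pending - 1);
                result.put(entry.getKey(), safe);
            }
            return result;
        }
    }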

One more small suggestion: we may want to always provide an errant record
reporter to connectors, even if one has not been configured. This reporter
would simply fail the task and throw an exception as soon as it's invoked.
This would provide a more uniform experience for users across different
connectors and would establish expectations that, if a connector uses the
features added by KIP-610 at all, it will fail by default on any invalid
records (instead of doing something implementation-dependent).

Cheers,

Chris

On Tue, May 19, 2020 at 10:03 AM Arjun Satish 
wrote:

> One more concern with the connector blocking on the Future's get() is that
> it may cause the task's consumer to fail to heartbeat (since there is no
> independent thread to do this). That would then cause failures when we
> eventually try to consume more records after returning from put(). The
> developer would need to be cognizant of these bits before waiting on the
> future, which adds a reasonable amount of complexity.
>
> Even with preCommit() returning incomplete offsets, I suppose the concern
> would be that the put() method keeps giving the task more records, and to
> truly pause the "firehose", the task needs to pause all partitions?
>
>
> On Tue, May 19, 2020 at 9:26 AM Arjun Satish 
> wrote:
>
> > Can we get a couple of examples that shows utility of waiting on the
> > Future<>? Also, in preCommit() we would report back on the incomplete
> > offsets. So that feedback mechanism will already exists for developers
> who
> > want to manually manage this.
> >
> > On Tue, May 19, 2020 at 8:03 AM Randall Hauch  wrote:
> >
> >> Thanks, Aakash, for updating the KIP.
> >>
> >> On Tue, May 19, 2020 at 2:18 AM Arjun Satish 
> >> wrote:
> >>
> >> > Hi Randall,
> >> >
> >> > Thanks for the explanation! Excellent point about guaranteeing offsets
> >> in
> >> > the async case.
> >> >
> >> > If we can guarantee that the offsets will be advanced only after the
> bad
> >> > records are reported, then is there any value is the Future<> return
> >> type?
> >> > I feel we can declare the function with a void return type:
> >> >
> >> > void report(SinkRecord failedRecord, Throwable error)
> >> >
> >> > that works asynchronously, and advances offsets only after the DLQ
> >> producer
> >> > (and other reporters) complete successfully (as you explained).
> >> >
> >> > This actually alleviates my concern of what this Future<> actually
> >> means.
> >> > Since a failure to report shoul

Re: [DISCUSS] KIP-609: Use Pre-registration and Blocking Calls for Better Transaction Efficiency

2020-05-19 Thread Boyang Chen
Hey John,

thanks for the insights! Replied inline.

On Tue, May 19, 2020 at 7:48 AM John Roesler  wrote:

> Thanks for the KIP, Boyang!
>
> This looks good and reasonable to me overall.
>
> J1: One clarification: you mention that the blocking behavior depends on
> a new broker version, which sounds good to me, but I didn't see why
> we would need to throw any UnsupportedVersionExceptions. It sounds
> a little like you just want to implement a kind of long polling on the
> AddPartitionToTxn API, such that the broker would optimistically block
> for a while when there is a pending prior transaction.
>
> Can this just be a behavior change on the broker side, such that both
> old and new clients would be asked to retry when the broker is older,
> and both old and new clients would instead see the API call block for
> longer (but be successful more often) when the broker is newer?
>
> Related: is it still possible to get back the "please retry" error from the
> broker, or is it guaranteed to block until the call completes?
>
> This is a good observation. I agree the blocking behavior could benefit
all the producer
versions older than 0.11, which could be retried. Added to the KIP.


> J2: Please forgive my ignorance, but is there any ill effect if a producer
> adds a partition to a transaction and then commits without having added
> any data to the transaction?
>
> I can see this happening, e.g., if I know that my application generally
> sends to 5 TopicPartitions, I would use the new beginTransaction call
> and just always give it the same list of partitions, and _then_ do the
> processing, which may or may not send data to all five potential
> partitions.
>

Yes, that's possible, which is the reason why we discussed bumping the
EndTxn request
to only include the partitions actually being written to, so that the
transaction coordinator will only send markers
to the actually-written partitions. The worst case for a misused
pre-registration API is to write out more transaction markers than
necessary. That said, I do see the benefit of doing that: it is a
life-saver for a "lazy user" who doesn't want to infer the output
partitions it would write to and always registers the full set of output
partitions. With this observation in mind, bumping EndTxn makes sense.
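
To make that pattern concrete, here is a purely hypothetical sketch; the
beginTransaction overload taking a partition set is the KIP's proposal and
does not exist in released clients, and the tiny interface below exists
only to illustrate the call sequence:

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;
    import org.apache.kafka.common.TopicPartition;

    // Purely hypothetical sketch of the pre-registration call pattern. The
    // beginTransaction(Set<TopicPartition>) overload is the KIP's proposal;
    // the interface below only exists to make the sequence concrete.
    public class PreRegistrationSketch {

        interface TransactionalWriter {
            void beginTransaction(Set<TopicPartition> likelyPartitions); // proposed overload
            void send(TopicPartition partition, String key, String value);
            void commitTransaction(); // EndTxn would carry only partitions actually written
        }

        static void process(TransactionalWriter writer) {
            // Register every partition the application *might* write to ...
            Set<TopicPartition> likelyOutputs = new HashSet<>(Arrays.asList(
                    new TopicPartition("output", 0),
                    new TopicPartition("output", 1)));
            writer.beginTransaction(likelyOutputs);
            // ... even though only one of them actually receives data; on
            // commit the coordinator would write markers just for that one.
            writer.send(new TopicPartition("output", 0), "key", "value");
            writer.commitTransaction();
        }

        public static void main(String[] args) {
            process(new TransactionalWriter() {
                @Override public void beginTransaction(Set<TopicPartition> p) { System.out.println("begin " + p); }
                @Override public void send(TopicPartition tp, String k, String v) { System.out.println("send " + tp); }
                @Override public void commitTransaction() { System.out.println("commit"); }
            });
        }
    }
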

>
> Thanks again!
> -John
>
> On Mon, May 18, 2020, at 10:25, Boyang Chen wrote:
> > Oh, I see your point! Will add that context to the KIP.
> >
> > Boyang
> >
> > On Sun, May 17, 2020 at 11:39 AM Guozhang Wang 
> wrote:
> >
> > > My point here is only for the first AddPartitionToTxn request of the
> > > transaction, since only that request would potentially be blocked on
> the
> > > previous txn to complete. By deferring it we reduce the blocking time.
> > >
> > > I think StreamsConfigs override the linger.ms to 100ms not 10ms, so
> in the
> > > best case we can defer the first AddPartitionToTxn of the transaction
> by
> > > 100ms.
> > >
> > > Guozhang
> > >
> > >
> > > On Sat, May 16, 2020 at 12:20 PM Boyang Chen <
> reluctanthero...@gmail.com>
> > > wrote:
> > >
> > > > Thanks Guozhang for the context. The producer batch is either
> bounded by
> > > > the size or the linger time. For the default 10ms linger and 100ms
> > > > transaction commit time, the producer will be capped by
> > > > AddPartitionToTxn 10 times in the worst case. I think the improvement
> > > here
> > > > aims for the worst-case scenario for users who didn't realize how the
> > > > internals work, and use the API calls in a very inefficient way, as in the
> > > > scenario where record processing and send() happen concurrently.
> > > > scenario where record processing and send() happen concurrently.
> > > >
> > > > Boyang
> > > >
> > > > On Sat, May 16, 2020 at 10:19 AM Guozhang Wang 
> > > wrote:
> > > >
> > > > > Hello Boyang,
> > > > >
> > > > > Thanks for the proposed KIP, overall it makes sense to me.
> > > > >
> > > > > One non-public API related point that I'd like to make though, is
> that
> > > in
> > > > > KafkaProducer.send call we can potentially defer sending
> > > > > AddPartitionsToTxn request until the sender is about to send the
> first
> > > > > batch -- this is what I observed from some soak testing
> investigation
> > > > such
> > > > > that the batching effects actually allows the first record to be
> sent
> > > > much
> > > > > later than the send() call and that can be leveraged to further
> reduce
> > > > the
> > > > > time that we would be blocked on the AddPartitionsToTxn request.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > >
> > > > > On Thu, May 14, 2020 at 10:26 PM Boyang Chen <
> > > reluctanthero...@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hey all,
> > > > > >
> > > > > > I would like to start the discussion for KIP-609:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-609%3A+Use+Pre-registration+and+Blocking+Calls+for+Better+Transaction+Efficiency
> > > > > >
> > > > > > This KIP aims to improve the current EOS semantic which makes the
>

Build failed in Jenkins: kafka-trunk-jdk11 #1475

2020-05-19 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Small fixes in the documentation (#8623)

[github] MINOR: Update stream documentation (#8622)


--
[...truncated 3.09 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverT

Re: [VOTE] KIP 585: Filter and conditional SMTs

2020-05-19 Thread Tom Bentley
It would be nice to get this into Kafka 2.6. There are 2 binding and 3
non-binding votes so far. If you've not looked at it already now would be a
great time!

Many thanks,

Tom

On Tue, May 19, 2020 at 1:27 PM Mickael Maison 
wrote:

> +1 (binding)
> Thanks Tom for leading this KIP and steering the syntax discussion
> towards a consensus
>
> On Tue, May 19, 2020 at 11:29 AM Edoardo Comar  wrote:
> >
> > +1 (non-binding)
> > Thanks Tom
> > --
> >
> > Edoardo Comar
> >
> > Event Streams for IBM Cloud
> > IBM UK Ltd, Hursley Park, SO21 2JN
> >
> >
> >
> >
> > From:   Gunnar Morling 
> > To: dev@kafka.apache.org
> > Date:   19/05/2020 10:35
> > Subject:[EXTERNAL] Re: [VOTE] KIP 585: Filter and conditional
> SMTs
> >
> >
> >
> > +1 (non-binding)
> >
> > Thanks for working on this, Tom! This KIP will be very useful for
> > connectors like Debezium.
> >
> > --Gunnar
> >
> > Am Fr., 15. Mai 2020 um 20:02 Uhr schrieb Konstantine Karantasis
> > :
> > >
> > > +1 (binding)
> > >
> > > Thanks Tom.
> > >
> > > Konstantine
> > >
> > > On Fri, May 15, 2020 at 5:03 AM Andrew Schofield
> > 
> > > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > Thanks for the KIP. This will be very useful.
> > > >
> > > > Andrew Schofield
> > > >
> > > > On 13/05/2020, 10:14, "Tom Bentley"  wrote:
> > > >
> > > > Hi,
> > > >
> > > > I'd like to start a vote on KIP-585: Filter and conditional SMTs
> > > >
> > > >
> > > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_KAFKA_KIP-2D585-253A-2BFilter-2Band-2BConditional-2BSMTs&d=DwIFaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=AnNSwofDk0eZPfkUhSGAHsEyMB_tKe1luK9nox7bE1w&s=_AHSlXsBMSSSOnVL3bBa-Pzu9Zg1f8lgOSTI_VMTP8s&e=
> >
> > > >
> > > > Those involved in the discussion seem to be positively disposed
> to
> > the
> > > > idea, but in the absence of any committer participation it's been
> > > > difficult
> > > > to find a consensus on how these things should be configured.
> > What's
> > > > presented here seemed to be the option which people preferred
> > overall.
> > > >
> > > > Kind regards,
> > > >
> > > > Tom
> > > >
> > > >
> >
> >
> >
> >
> > Unless stated otherwise above:
> > IBM United Kingdom Limited - Registered in England and Wales with number
> > 741598.
> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
> 3AU
> >
>
>


[jira] [Created] (KAFKA-10025) Segfault in RocksDB Statistics

2020-05-19 Thread Sophie Blee-Goldman (Jira)
Sophie Blee-Goldman created KAFKA-10025:
---

 Summary: Segfault in RocksDB Statistics
 Key: KAFKA-10025
 URL: https://issues.apache.org/jira/browse/KAFKA-10025
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.5.0
Reporter: Sophie Blee-Goldman


Hit the following after running a 2.5 app for several weeks:

 
{code:java}
# A fatal error has been detected by the Java Runtime Environment: 
# 
#  SIGSEGV (0xb) at pc=0x7fa40f9e31ab, pid=23022, tid=0x7fa4341b1700 
# 
# JRE version: OpenJDK Runtime Environment (8.0_242-b08) (build 1.8.0_242-b08) 
# Java VM: OpenJDK 64-Bit Server VM (25.242-b08 mixed mode linux-amd64 
compressed oops) 
# Problematic frame: 
# C  [librocksdbjni8322403889346508613.so+0x2e21ab]  
Java_org_rocksdb_Statistics_getAndResetTickerCount+0x1b 
# 
# Failed to write core dump. Core dumps have been disabled. To enable core 
dumping, try "ulimit -c unlimited" before starting Java again 
# 
# An error report file with more information is saved as: 
# /mnt/run/hs_err_pid23022.log
Compiled method (nm) 1697462436 5718     n 0       
org.rocksdb.Statistics::getAndResetTickerCount (native) total in heap  
[0x7fa45d73c5d0,0x7fa45d73c928] = 856 relocation     
[0x7fa45d73c6f8,0x7fa45d73c740] = 72 main code      
[0x7fa45d73c740,0x7fa45d73c920] = 480 oops           
[0x7fa45d73c920,0x7fa45d73c928] = 8
{code}
 

Since this seems to be caused by the Statistics which is used to expose RocksDB 
metrics, I suspect this impacts 2.4.0 onward. But so far it's only been 
confirmed in 2.5

Opened an issue with RocksDB: [https://github.com/facebook/rocksdb/issues/6856]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Arjun Satish
One more concern with the connector blocking on the Future's get() is that
it may cause the task's consumer to fail to heartbeat (since there is no
independent thread to do this). That would then cause failures when we
eventually try to consume more records after returning from put(). The
developer would need to be cognizant of these bits before waiting on the
future, which adds a reasonable amount of complexity.

Even with preCommit() returning incomplete offsets, I suppose the concern
would be that the put() method keeps giving the task more records, and to
truly pause the "firehose", the task needs to pause all partitions?


On Tue, May 19, 2020 at 9:26 AM Arjun Satish  wrote:

> Can we get a couple of examples that shows utility of waiting on the
> Future<>? Also, in preCommit() we would report back on the incomplete
> offsets. So that feedback mechanism will already exists for developers who
> want to manually manage this.
>
> On Tue, May 19, 2020 at 8:03 AM Randall Hauch  wrote:
>
>> Thanks, Aakash, for updating the KIP.
>>
>> On Tue, May 19, 2020 at 2:18 AM Arjun Satish 
>> wrote:
>>
>> > Hi Randall,
>> >
>> > Thanks for the explanation! Excellent point about guaranteeing offsets
>> in
>> > the async case.
>> >
>> > If we can guarantee that the offsets will be advanced only after the bad
>> > records are reported, then is there any value in the Future<> return
>> type?
>> > I feel we can declare the function with a void return type:
>> >
>> > void report(SinkRecord failedRecord, Throwable error)
>> >
>> > that works asynchronously, and advances offsets only after the DLQ
>> producer
>> > (and other reporters) complete successfully (as you explained).
>> >
>> > This actually alleviates my concern of what this Future<> actually
>> means.
>> > Since a failure to report should kill the tasks, there is no reason for
>> the
>> > connector to ever wait on the get().
>>
>>
>> We should not say "there is no reason", because we don't know all of the
>> requirements that might exist for sending records to external systems. The
>> additional guarantee regarding error records being fully recorded before
>> `preCommit(...)` is called is a minimal guarantee that Connect provides
>> the
>> sink task, and returning a Future allow a sink task to have *stronger*
>> guarantees than what Connect provides by default.
>>
>> Once again:
>> 1. we need an async API to allow the sink task to report problem records
>> and then immediately continue doing more work.
>> 2. Connect should guarantee to the sink task that all reported records
>> will
>> actually be recorded before `preCommit(...)` is called
>> 3. a sink task *might* need stronger guarantees, and may need to block on
>> the reported records some time before `preCommit(...)`, and we should
>> allow
>> them to do this.
>> 4. Future and callbacks are common techniques, but there are significant
>> runtime risks of using callbacks, whereas Future is a common/standard
>> pattern that is straightforward to use.
>>
>> This *exactly* matches the current KIP, which is why I plan to vote for
>> this valuable and well-thought out KIP.
>>
>>
>> > And if we are guaranteeing that the
>> > offsets are only advanced when the errors are reported, then this
>> becomes a
>> > double win:
>> >
>> > 1. connector developers can literally fire and forget failed records.
>> > 2. offsets are correctly advanced on errors being reported. Failure to
>> > report error will kill the task, and the last committed offset will be
>> the
>> > correct one.
>>
>>
>> > The main contract would simply be to call report() before preCommit() or
>> > before put() returns in the task, so the framework knows that there
>> > are error records reported, and those need to finish before the offsets
>> can
>> > be advanced.
>> >
>> > I think I'd be pretty excited about this API. and if we all agree, then
>> > let's go ahead with this?
>>
>>
>> > Best,
>> >
>> >
>> >
>>
>


[jira] [Created] (KAFKA-10024) Add dynamic configuration and enforce quota for per-IP connection rate limits

2020-05-19 Thread Anna Povzner (Jira)
Anna Povzner created KAFKA-10024:


 Summary: Add dynamic configuration and enforce quota for per-IP 
connection rate limits
 Key: KAFKA-10024
 URL: https://issues.apache.org/jira/browse/KAFKA-10024
 Project: Kafka
  Issue Type: Improvement
  Components: core
Reporter: Anna Povzner
Assignee: Anna Povzner


This JIRA is for the second part of KIP-612 – Add per-IP connection creation 
rate limits.

As described here: 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-612%3A+Ability+to+Limit+Connection+Creation+Rate+on+Brokers]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE]: KIP-604: Remove ZooKeeper Flags from the Administrative Tools

2020-05-19 Thread Jason Gustafson
Hi Colin,

Looks good. I just had one question. It sounds like your intent is to
change kafka-configs.sh so that the --zookeeper flag is only supported for
bootstrapping. I assume in the case of SCRAM that we will only make this
change after the broker API is available?

Thanks,
Jason

On Tue, May 19, 2020 at 5:22 AM Mickael Maison 
wrote:

> +1 (binding)
> Thanks Colin
>
> On Tue, May 19, 2020 at 10:57 AM Manikumar 
> wrote:
> >
> > +1 (binding)
> >
> > Thanks for the KIP
> >
> > On Tue, May 19, 2020 at 12:29 PM David Jacot 
> wrote:
> >
> > > +1 (non-binding).
> > >
> > > Thanks for the KIP.
> > >
> > > On Fri, May 15, 2020 at 12:41 AM Guozhang Wang 
> wrote:
> > >
> > > > +1.
> > > >
> > > > Thanks Colin!
> > > >
> > > > Guozhang
> > > >
> > > > On Tue, May 12, 2020 at 3:45 PM Colin McCabe 
> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I'd like to start a vote on KIP-604: Remove ZooKeeper Flags from
> the
> > > > > Administrative Tools.
> > > > >
> > > > > As a reminder, this KIP is for the next major release of Kafka,
> the 3.0
> > > > > release.   So it won't go into the upcoming 2.6 release.  It's a
> pretty
> > > > > small change that just removes the --zookeeper flags from some
> tools
> > > and
> > > > > removes a deprecated tool.  We haven't decided exactly when we'll
> do
> > > 3.0
> > > > > but I believe we will certainly want this change in that release.
> > > > >
> > > > > The KIP does contain one small change relevant to Kafka 2.6: adding
> > > > > support for --if-exists and --if-not-exists in combination with the
> > > > > --bootstrap-server flag.
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
>


Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Arjun Satish
Can we get a couple of examples that shows utility of waiting on the
Future<>? Also, in preCommit() we would report back on the incomplete
offsets. So that feedback mechanism will already exists for developers who
want to manually manage this.

On Tue, May 19, 2020 at 8:03 AM Randall Hauch  wrote:

> Thanks, Aakash, for updating the KIP.
>
> On Tue, May 19, 2020 at 2:18 AM Arjun Satish 
> wrote:
>
> > Hi Randall,
> >
> > Thanks for the explanation! Excellent point about guaranteeing offsets in
> > the async case.
> >
> > If we can guarantee that the offsets will be advanced only after the bad
> > records are reported, then is there any value in the Future<> return
> type?
> > I feel we can declare the function with a void return type:
> >
> > void report(SinkRecord failedRecord, Throwable error)
> >
> > that works asynchronously, and advances offsets only after the DLQ
> producer
> > (and other reporters) complete successfully (as you explained).
> >
> > This actually alleviates my concern of what this Future<> actually means.
> > Since a failure to report should kill the tasks, there is no reason for
> the
> > connector to ever wait on the get().
>
>
> We should not say "there is no reason", because we don't know all of the
> requirements that might exist for sending records to external systems. The
> additional guarantee regarding error records being fully recorded before
> `preCommit(...)` is called is a minimal guarantee that Connect provides the
> sink task, and returning a Future allows a sink task to have *stronger*
> guarantees than what Connect provides by default.
>
> Once again:
> 1. we need an async API to allow the sink task to report problem records
> and then immediately continue doing more work.
> 2. Connect should guarantee to the sink task that all reported records will
> actually be recorded before `preCommit(...)` is called
> 3. a sink task *might* need stronger guarantees, and may need to block on
> the reported records some time before `preCommit(...)`, and we should allow
> them to do this.
> 4. Future and callbacks are common techniques, but there are significant
> runtime risks of using callbacks, whereas Future is a common/standard
> pattern that is straightforward to use.
>
> This *exactly* matches the current KIP, which is why I plan to vote for
> this valuable and well-thought out KIP.
>
>
> > And if we are guaranteeing that the
> > offsets are only advanced when the errors are reported, then this
> becomes a
> > double win:
> >
> > 1. connector developers can literally fire and forget failed records.
> > 2. offsets are correctly advanced on errors being reported. Failure to
> > report error will kill the task, and the last committed offset will be
> the
> > correct one.
>
>
> > The main contract would simply be to call report() before preCommit() or
> > before put() returns in the task, so the framework knows that there
> > are error records reported, and those need to finish before the offsets
> can
> > be advanced.
> >
> > I think I'd be pretty excited about this API. and if we all agree, then
> > let's go ahead with this?
>
>
> > Best,
> >
> >
> >
>


[jira] [Created] (KAFKA-10023) Enforce broker-wide and per-listener connection creation rate (KIP-612, part 1)

2020-05-19 Thread Anna Povzner (Jira)
Anna Povzner created KAFKA-10023:


 Summary: Enforce broker-wide and per-listener connection creation 
rate (KIP-612, part 1)
 Key: KAFKA-10023
 URL: https://issues.apache.org/jira/browse/KAFKA-10023
 Project: Kafka
  Issue Type: Improvement
  Components: core
Reporter: Anna Povzner
Assignee: Anna Povzner


This JIRA is for the first part of KIP-612 – Add an ability to configure and 
enforce broker-wide and per-listener connection creation rate. 

As described here: 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-612%3A+Ability+to+Limit+Connection+Creation+Rate+on+Brokers]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka-site] bbejeck commented on pull request #265: MINOR: Update stream documentation for version 2.5

2020-05-19 Thread GitBox


bbejeck commented on pull request #265:
URL: https://github.com/apache/kafka-site/pull/265#issuecomment-630893259







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka-site] bbejeck merged pull request #265: MINOR: Update stream documentation for version 2.5

2020-05-19 Thread GitBox


bbejeck merged pull request #265:
URL: https://github.com/apache/kafka-site/pull/265


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




Re: [VOTE] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Andrew Schofield
+1 (non-binding)

This is now looking very nice.

Andrew Schofield

On 19/05/2020, 16:11, "Randall Hauch"  wrote:

Thank you, Aakash, for putting together this KIP and shepherding the
discussion. Also, many thanks to all those that participated in the very
active discussion. I'm actually very happy with the current proposal, am
confident that it is a valuable improvement to the Connect framework, and
know that it will be instrumental in making sink tasks easily able to
report problematic records and keep running.

+1 (binding)

Best regards,

Randall

On Sun, May 17, 2020 at 6:59 PM Aakash Shah  wrote:

> Hello all,
>
> I'd like to open a vote for KIP-610:
>
> 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-610%3A+Error+Reporting+in+Sink+Connectors
>
> Thanks,
> Aakash
>



Re: [VOTE] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Randall Hauch
Thank you, Aakash, for putting together this KIP and shepherding the
discussion. Also, many thanks to all those that participated in the very
active discussion. I'm actually very happy with the current proposal, am
confident that it is a valuable improvement to the Connect framework, and
know that it will be instrumental in making sink tasks easily able to
report problematic records and keep running.

+1 (binding)

Best regards,

Randall

On Sun, May 17, 2020 at 6:59 PM Aakash Shah  wrote:

> Hello all,
>
> I'd like to open a vote for KIP-610:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-610%3A+Error+Reporting+in+Sink+Connectors
>
> Thanks,
> Aakash
>


Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Randall Hauch
Thanks, Aakash, for updating the KIP.

On Tue, May 19, 2020 at 2:18 AM Arjun Satish  wrote:

> Hi Randall,
>
> Thanks for the explanation! Excellent point about guaranteeing offsets in
> the async case.
>
> If we can guarantee that the offsets will be advanced only after the bad
> records are reported, then is there any value in the Future<> return type?
> I feel we can declare the function with a void return type:
>
> void report(SinkRecord failedRecord, Throwable error)
>
> that works asynchronously, and advances offsets only after the DLQ producer
> (and other reporters) complete successfully (as you explained).
>
> This actually alleviates my concern of what this Future<> actually means.
> Since a failure to report should kill the tasks, there is no reason for the
> connector to ever wait on the get().


We should not say "there is no reason", because we don't know all of the
requirements that might exist for sending records to external systems. The
additional guarantee regarding error records being fully recorded before
`preCommit(...)` is called is a minimal guarantee that Connect provides the
sink task, and returning a Future allows a sink task to have *stronger*
guarantees than what Connect provides by default.

Once again:
1. we need an async API to allow the sink task to report problem records
and then immediately continue doing more work.
2. Connect should guarantee to the sink task that all reported records will
actually be recorded before `preCommit(...)` is called
3. a sink task *might* need stronger guarantees, and may need to block on
the reported records some time before `preCommit(...)`, and we should allow
them to do this.
4. Future and callbacks are common techniques, but there are significant
runtime risks of using callbacks, whereas Future is a common/standard
pattern that is straightforward to use.

This *exactly* matches the current KIP, which is why I plan to vote for
this valuable and well-thought-out KIP.


> And if we are guaranteeing that the
> offsets are only advanced when the errors are reported, then this becomes a
> double win:
>
> 1. connector developers can literally fire and forget failed records.
> 2. offsets are correctly advanced on errors being reported. Failure to
> report error will kill the task, and the last committed offset will be the
> correct one.


> The main contract would simply be to call report() before preCommit() or
> before put() returns in the task, so the framework knows that there
> are error records reported, and those need to finish before the offsets can
> be advanced.
>
> I think I'd be pretty excited about this API, and if we all agree, then
> let's go ahead with this?


> Best,
>
>
>


Re: [DISCUSS] KIP-609: Use Pre-registration and Blocking Calls for Better Transaction Efficiency

2020-05-19 Thread John Roesler
Thanks for the KIP, Boyang!

This looks good and reasonable to me overall.

J1: One clarification: you mention that the blocking behavior depends on
a new broker version, which sounds good to me, but I didn't see why
we would need to throw any UnsupportedVersionExceptions. It sounds
a little like you just want to implement a kind of long polling on the
AddPartitionToTxn API, such that the broker would optimistically block
for a while when there is a pending prior transaction.

Can this just be a behavior change on the broker side, such that both
old and new clients would be asked to retry when the broker is older,
and both old and new clients would instead see the API call block for
longer (but be successful more often) when the broker is newer?

Related: is it still possible to get back the "please retry" error from the
broker, or is it guaranteed to block until the call completes?

J2: Please forgive my ignorance, but is there any ill effect if a producer
adds a partition to a transaction and then commits without having added
any data to the transaction?

I can see this happening, e.g., if I know that my application generally
sends to 5 TopicPartitions, I would use the new beginTransaction call
and just always give it the same list of partitions, and _then_ do the
processing, which may or may not send data to all five potential partitions.

Thanks again!
-John
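
A minimal sketch of the usage pattern described in J2 above. The partition-list
beginTransaction overload is the KIP's proposed addition; its exact signature
here is an assumption, so the call is shown commented out next to today's API:

import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;

public class PreRegistrationSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("transactional.id", "my-app-txn");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();

            // The application usually writes to the same five partitions, so it would
            // pre-register them up front under the proposed API.
            List<TopicPartition> expected = Arrays.asList(
                    new TopicPartition("output", 0), new TopicPartition("output", 1),
                    new TopicPartition("output", 2), new TopicPartition("output", 3),
                    new TopicPartition("output", 4));

            // producer.beginTransaction(expected); // proposed overload, signature assumed
            producer.beginTransaction();            // today's API, shown for contrast

            // This particular transaction may end up touching only a subset of the
            // pre-registered partitions, which is the situation question J2 asks about.
            producer.send(new ProducerRecord<>("output", 0, "key", "value"));

            producer.commitTransaction();
        }
    }
}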

On Mon, May 18, 2020, at 10:25, Boyang Chen wrote:
> Oh, I see your point! Will add that context to the KIP.
> 
> Boyang
> 
> On Sun, May 17, 2020 at 11:39 AM Guozhang Wang  wrote:
> 
> > My point here is only for the first AddPartitionToTxn request of the
> > transaction, since only that request would potentially be blocked on the
> > previous txn to complete. By deferring it we reduce the blocking time.
> >
> > I think StreamsConfigs override the linger.ms to 100ms not 10ms, so in the
> > best case we can defer the first AddPartitionToTxn of the transaction by
> > 100ms.
> >
> > Guozhang
> >
> >
> > On Sat, May 16, 2020 at 12:20 PM Boyang Chen 
> > wrote:
> >
> > > Thanks Guozhang for the context. The producer batch is either bounded by
> > > the size or the linger time. For the default 10ms linger and 100ms
> > > transaction commit time, the producer will be capped by
> > > AddPartitionToTxn 10 times in the worst case. I think the improvement
> > here
> > > aims for the worst case scenario for users who didn't realize how the
> > > internal works, and uses the API calls in a very inefficient way as the
> > > scenario where record processing and send() happen concurrently.
> > >
> > > Boyang
> > >
> > > On Sat, May 16, 2020 at 10:19 AM Guozhang Wang 
> > wrote:
> > >
> > > > Hello Boyang,
> > > >
> > > > Thanks for the proposed KIP, overall it makes sense to me.
> > > >
> > > > One non-public API related point that I'd like to make though, is that
> > in
> > > > KafkaProducer.send call we can potentially defer sending
> > > > AddPartitionsToTxn request until the sender is about to send the first
> > > > batch -- this is what I observed from some soak testing investigation
> > > such
> > > > that the batching effects actually allows the first record to be sent
> > > much
> > > > later than the send() call and that can be leveraged to further reduce
> > > the
> > > > time that we would be blocked on the AddPartitionsToTxn request.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Thu, May 14, 2020 at 10:26 PM Boyang Chen <
> > reluctanthero...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hey all,
> > > > >
> > > > > I would like to start the discussion for KIP-609:
> > > > >
> > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-609%3A+Use+Pre-registration+and+Blocking+Calls+for+Better+Transaction+Efficiency
> > > > >
> > > > > This KIP aims to improve the current EOS semantic which makes the
> > > > > processing more efficient and consolidated.
> > > > >
> > > > > Thanks!
> > > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
>


Granting permission for Create KIP

2020-05-19 Thread 阮良
Please grant permission for Create KIP to wiki ID: ruanliang_hualun

[jira] [Created] (KAFKA-10022) console-producer: support setting client.id

2020-05-19 Thread Young Chou (Jira)
Young Chou created KAFKA-10022:
--

 Summary: console-producer: support setting client.id
 Key: KAFKA-10022
 URL: https://issues.apache.org/jira/browse/KAFKA-10022
 Project: Kafka
  Issue Type: Improvement
  Components: tools
 Environment: Trunk branch 
Reporter: Young Chou


"console-producer" supports the setting of "client.id", which is a reasonable 
requirement, and the way "console consumer" and "console producer" handle 
"client.id" can be unified. "client.id" defaults to "console-producer"

 

PR:  https://github.com/apache/kafka/pull/8694



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE]: KIP-604: Remove ZooKeeper Flags from the Administrative Tools

2020-05-19 Thread Mickael Maison
+1 (binding)
Thanks Colin

On Tue, May 19, 2020 at 10:57 AM Manikumar  wrote:
>
> +1 (binding)
>
> Thanks for the KIP
>
> On Tue, May 19, 2020 at 12:29 PM David Jacot  wrote:
>
> > +1 (non-binding).
> >
> > Thanks for the KIP.
> >
> > On Fri, May 15, 2020 at 12:41 AM Guozhang Wang  wrote:
> >
> > > +1.
> > >
> > > Thanks Colin!
> > >
> > > Guozhang
> > >
> > > On Tue, May 12, 2020 at 3:45 PM Colin McCabe  wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'd like to start a vote on KIP-604: Remove ZooKeeper Flags from the
> > > > Administrative Tools.
> > > >
> > > > As a reminder, this KIP is for the next major release of Kafka, the 3.0
> > > > release.   So it won't go into the upcoming 2.6 release.  It's a pretty
> > > > small change that just removes the --zookeeper flags from some tools
> > and
> > > > removes a deprecated tool.  We haven't decided exactly when we'll do
> > 3.0
> > > > but I believe we will certainly want this change in that release.
> > > >
> > > > The KIP does contain one small change relevant to Kafka 2.6: adding
> > > > support for --if-exists and --if-not-exists in combination with the
> > > > --bootstrap-server flag.
> > > >
> > > > best,
> > > > Colin
> > > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >


Re: [VOTE] KIP 585: Filter and conditional SMTs

2020-05-19 Thread Mickael Maison
+1 (binding)
Thanks Tom for leading this KIP and steering the syntax discussion
towards a consensus

On Tue, May 19, 2020 at 11:29 AM Edoardo Comar  wrote:
>
> +1 (non-binding)
> Thanks Tom
> --
>
> Edoardo Comar
>
> Event Streams for IBM Cloud
> IBM UK Ltd, Hursley Park, SO21 2JN
>
>
>
>
> From:   Gunnar Morling 
> To: dev@kafka.apache.org
> Date:   19/05/2020 10:35
> Subject:[EXTERNAL] Re: [VOTE] KIP 585: Filter and conditional SMTs
>
>
>
> +1 (non-binding)
>
> Thanks for working on this, Tom! This KIP will be very useful for
> connectors like Debezium.
>
> --Gunnar
>
> Am Fr., 15. Mai 2020 um 20:02 Uhr schrieb Konstantine Karantasis
> :
> >
> > +1 (binding)
> >
> > Thanks Tom.
> >
> > Konstantine
> >
> > On Fri, May 15, 2020 at 5:03 AM Andrew Schofield
> 
> > wrote:
> >
> > > +1 (non-binding)
> > >
> > > Thanks for the KIP. This will be very useful.
> > >
> > > Andrew Schofield
> > >
> > > On 13/05/2020, 10:14, "Tom Bentley"  wrote:
> > >
> > > Hi,
> > >
> > > I'd like to start a vote on KIP-585: Filter and conditional SMTs
> > >
> > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-585%3A+Filter+and+Conditional+SMTs
>
> > >
> > > Those involved in the discussion seem to be positively disposed to
> the
> > > idea, but in the absence of any committer participation it's been
> > > difficult
> > > to find a consensus on how these things should be configured.
> What's
> > > presented here seemed to be the option which people preferred
> overall.
> > >
> > > Kind regards,
> > >
> > > Tom
> > >
> > >
>
>
>
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>


[jira] [Resolved] (KAFKA-6690) Priorities for Source Topics

2020-05-19 Thread Nick Afshartous (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Afshartous resolved KAFKA-6690.

Resolution: Won't Fix

> Priorities for Source Topics
> 
>
> Key: KAFKA-6690
> URL: https://issues.apache.org/jira/browse/KAFKA-6690
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Bala Prassanna I
>Assignee: Nick Afshartous
>Priority: Major
>
> We often encounter use cases where we need to prioritise source topics. If a 
> consumer is listening to more than one topic, say, HighPriorityTopic and 
> LowPriorityTopic, it should consume events from LowPriorityTopic only when 
> all the events from HighPriorityTopic are consumed. This is needed in Kafka 
> Streams processor topologies as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


RE: [VOTE] KIP 585: Filter and conditional SMTs

2020-05-19 Thread Edoardo Comar
+1 (non-binding) 
Thanks Tom
--

Edoardo Comar

Event Streams for IBM Cloud
IBM UK Ltd, Hursley Park, SO21 2JN




From:   Gunnar Morling 
To: dev@kafka.apache.org
Date:   19/05/2020 10:35
Subject:[EXTERNAL] Re: [VOTE] KIP 585: Filter and conditional SMTs



+1 (non-binding)

Thanks for working on this, Tom! This KIP will be very useful for
connectors like Debezium.

--Gunnar

Am Fr., 15. Mai 2020 um 20:02 Uhr schrieb Konstantine Karantasis
:
>
> +1 (binding)
>
> Thanks Tom.
>
> Konstantine
>
> On Fri, May 15, 2020 at 5:03 AM Andrew Schofield 

> wrote:
>
> > +1 (non-binding)
> >
> > Thanks for the KIP. This will be very useful.
> >
> > Andrew Schofield
> >
> > On 13/05/2020, 10:14, "Tom Bentley"  wrote:
> >
> > Hi,
> >
> > I'd like to start a vote on KIP-585: Filter and conditional SMTs
> >
> >
> > 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-585%3A+Filter+and+Conditional+SMTs

> >
> > Those involved in the discussion seem to be positively disposed to 
the
> > idea, but in the absence of any committer participation it's been
> > difficult
> > to find a consensus on how these things should be configured. 
What's
> > presented here seemed to be the option which people preferred 
overall.
> >
> > Kind regards,
> >
> > Tom
> >
> >




Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU



Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-19 Thread Rajini Sivaram
Hi Colin,

I do agree about the `leastLoadedNode` case. My question was about the
other cases where we are connecting to a specific node: fetch requests to
leaders, produce requests to leaders, requests to group coordinators,
requests to controller etc. It will be good to either quantify that these
connections are less common and hence less critical in terms of performance
in typical deployments or describe the impact on these connections from the
proposed change in default behaviour. It is perfectly fine if connections
to specific nodes don't benefit from the new timeout, I was looking for
analysis which says they aren't made any worse either, especially in the
context of other connection rate limiting/quota work we are proposing like
KIP-612.

Regards,

Rajini
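
(For concreteness, the two timeouts discussed in this thread would sit side by
side in a client configuration. A minimal sketch follows; the new property name
is the one proposed in the KIP and the 10s value is the default under
discussion, i.e. it is not a released setting:)

import java.util.Properties;

public class Kip601TimeoutsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker-a:9092,broker-b:9092,broker-c:9092");
        // Existing request timeout discussed in this thread.
        props.put("request.timeout.ms", "30000");
        // Proposed in KIP-601: transport-level connection setup timeout,
        // with the 10s default this thread is debating.
        props.put("socket.connection.setup.timeout.ms", "10000");
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}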


On Mon, May 18, 2020 at 8:48 PM Colin McCabe  wrote:

> Hi Rajini,
>
> I think the idea behind the 10 second default is that if you have three
> Kafka nodes A, B, C (or whatever), and you can't talk to A within 10
> seconds, you'll try again with B or C, and still have plenty of time left
> over.  Whereas currently, if your connection hangs while trying to connect
> to A, you're out of luck-- you'll just hang until the whole request timeout
> is gone.  So while you could have tried a different node and succeeded, you
> never got a chance to.
>
> So in the common case where you have other nodes that you can connect to,
> we won't end up trying to reconnect to the same node over and over.  I'll
> add some more comments in the vote thread.
>
> best,
> Colin
>
>
> On Fri, May 15, 2020, at 14:13, Rajini Sivaram wrote:
> > Hi Cheng,
> >
> > I am fine with the rest of the KIP apart from the 10s default. If no one
> > else has any concerns about this new default, let's go with it. Please go
> > ahead and start vote.
> >
> > Regards,
> >
> > Rajini
> >
> >
> > On Fri, May 15, 2020 at 8:21 PM Cheng Tan  wrote:
> >
> > > Dear Rajini,
> > >
> > >
> > > Thanks for the reply.
> > >
> > > > We have a lot of these and I want to
> > > > understand the benefits of the proposed timeout in this case alone.
> We
> > > > currently have a request timeout of 30s. Would you consider adding a
> 10s
> > > > connection timeout?
> > >
> > > A shorter timeout (10s) at the transport level will help clients
> > > detect dead nodes faster. “request.timeout.ms” is too general and
> applies
> > > to all the requests whose complexity at the application level varies.
> It’s
> > > risky to set “request.timeout.ms” to a lower value for detecting dead
> > > nodes quicker because of the involvement of the application layer.
> > >
> > > After “socket.connection.setup.timeout.ms” hits, NetworkClient will
> fail
> > > the request in the exact approach as it handles “request.timeout.ms”.
> > > That is to say, the response will be constructed upon a
> RetriableException.
> > > Producer, Consumer, and KafkaAdminClient can then perform their retry
> logic
> > > as a request timeout happens.
> > >
> > > > We have KIP-612 that is proposing to throttle connection set up on
> the
> > > one
> > > > hand and this KIP that is dramatically reducing default connection
> > > timeout
> > > > on the other. Not sure if that is a good idea.
> > >
> > > The default of the broker connection creation rate limit is
> Int.MaxValue.
> > > The KIP also proposes per-IP throttle configuration. Thus, I don’t
> expect
> > > the combination of the broker connection throttle and a shorter client
> > > transport timeout will have a side effect.
> > >
> > > Does the reasons above make sense to you?
> > >
> > > Best, - Cheng
> > >
> > >
> > >
> > >
> > > > On May 15, 2020, at 4:49 AM, Rajini Sivaram  >
> > > wrote:
> > > >
> > > > Hi Cheng,
> > > >
> > > > Let me rephrase my question. Let's say we didn't have the case of
> > > > leastLoadedNode. We are only talking about connections to a specific
> node
> > > > (i.e. leader or controller). We have a lot of these and I want to
> > > > understand the benefits of the proposed timeout in this case alone.
> We
> > > > currently have a request timeout of 30s. Would you consider adding a
> 10s
> > > > connection timeout? And if you did, what would you expect the 10s
> timeout
> > > > to do?
> > > >
> > > > a) We could fail a request if connection didn't complete within 10s.
> If
> > > we
> > > > always expect connections to succeed within 10s, this would be
> considered
> > > > reasonable behaviour. But this would be changing the current default,
> > > which
> > > > allows you up to 30 seconds to connect and process a request.
> > > > b) We retry the connection. What would be the point? We were waiting
> in a
> > > > queue for connecting, but we decide to stop and join the back of the
> > > queue.
> > > >
> > > > We have KIP-612 that is proposing to throttle connection set up on
> the
> > > one
> > > > hand and this KIP that is dramatically reducing default connection
> > > timeout
> > > > on the other. Not sure if that is a good idea.
> > > >
> > > >
> > > > On Fri, May 15

Re: [DISCUSS] KIP-405: Kafka Tiered Storage

2020-05-19 Thread Satish Duggana
Hi Jun,
Please go through the wiki which has the latest updates. Google doc is
updated frequently to be in sync with wiki.

Thanks,
Satish.

On Tue, May 19, 2020 at 12:30 AM Jun Rao  wrote:

> Hi, Satish,
>
> Thanks for the update. Just to clarify. Which doc has the latest updates,
> the wiki or the google doc?
>
> Jun
>
> On Thu, May 14, 2020 at 10:38 AM Satish Duggana 
> wrote:
>
>> Hi Jun,
>> Thanks for your comments.  We updated the KIP with more details.
>>
>> >100. For each of the operations related to tiering, it would be useful
>> to provide a description on how it works with the new API. These include
>> things like consumer fetch, replica fetch, offsetForTimestamp, retention
>> (remote and local) by size, time and logStartOffset, topic deletion, etc.
>> This will tell us if the proposed APIs are sufficient.
>>
>> We addressed most of these APIs in the KIP. We can add more details if
>> needed.
>>
>> >101. For the default implementation based on internal topic, is it meant
>> as a proof of concept or for production usage? I assume that it's the
>> former. However, if it's the latter, then the KIP needs to describe the
>> design in more detail.
>>
>> It is meant for production usage, as mentioned in an earlier mail. We plan to
>> update this section in the next few days.
>>
>> >102. When tiering a segment, the segment is first written to the object
>> store and then its metadata is written to RLMM using the api "void 
>> putRemoteLogSegmentData()".
>> One potential issue with this approach is that if the system fails after
>> the first operation, it leaves garbage in the object store that's never
>> reclaimed. One way to improve this is to have two separate APIs, sth like
>> preparePutRemoteLogSegmentData() and commitPutRemoteLogSegmentData().
>>
>> That is a good point. We currently have a different way using markers in
>> the segment but your suggestion is much better.
>>
>> >103. It seems that the transactional support and the ability to read
>> from follower are missing.
>>
>> KIP is updated with transactional support, follower fetch semantics, and
>> reading from a follower.
>>
>> >104. It would be useful to provide a testing plan for this KIP.
>>
>> We added a few tests by introducing test util for tiered storage in the
>> PR. We will provide the testing plan in the next few days.
>>
>> Thanks,
>> Satish.
>>
>>
>> On Wed, Feb 26, 2020 at 9:43 PM Harsha Chintalapani 
>> wrote:
>>
>>>
>>>
>>>
>>>
>>> On Tue, Feb 25, 2020 at 12:46 PM, Jun Rao  wrote:
>>>
 Hi, Satish,

 Thanks for the updated doc. The new API seems to be an improvement
 overall. A few more comments below.

 100. For each of the operations related to tiering, it would be useful
 to provide a description on how it works with the new API. These include
 things like consumer fetch, replica fetch, offsetForTimestamp, retention
 (remote and local) by size, time and logStartOffset, topic deletion,
 etc. This will tell us if the proposed APIs are sufficient.

>>>
>>> Thanks for the feedback Jun. We will add more details around this.
>>>
>>>
 101. For the default implementation based on internal topic, is it
 meant as a proof of concept or for production usage? I assume that it's the
 former. However, if it's the latter, then the KIP needs to describe the
 design in more detail.

>>>
>>> Yes, it is meant to be for production use.  Ideally it would be good to
>>> merge this in as the default implementation for metadata service. We can
>>> add more details around design and testing.
>>>
>>> 102. When tiering a segment, the segment is first written to the object
 store and then its metadata is written to RLMM using the api "void
 putRemoteLogSegmentData()".
 One potential issue with this approach is that if the system fails
 after the first operation, it leaves garbage in the object store that's
 never reclaimed. One way to improve this is to have two separate APIs, sth
 like preparePutRemoteLogSegmentData() and commitPutRemoteLogSegmentData().

 103. It seems that the transactional support and the ability to read
 from follower are missing.

 104. It would be useful to provide a testing plan for this KIP.

>>>
>>> We are working on adding more details around transactional support and
>>> coming up with test plan.
>>> Add system tests and integration tests.
>>>
>>> Thanks,

 Jun

 On Mon, Feb 24, 2020 at 8:10 AM Satish Duggana <
 satish.dugg...@gmail.com> wrote:

 Hi Jun,
 Please look at the earlier reply and let us know your comments.

 Thanks,
 Satish.

 On Wed, Feb 12, 2020 at 4:06 PM Satish Duggana <
 satish.dugg...@gmail.com> wrote:

 Hi Jun,
 Thanks for your comments on the separation of remote log metadata
 storage and remote log storage.
 We had a few discussions since early Jan on how to support eventually
 consistent stores like S3 by uncoupling remo

Re: [VOTE]: KIP-604: Remove ZooKeeper Flags from the Administrative Tools

2020-05-19 Thread Manikumar
+1 (binding)

Thanks for the KIP

On Tue, May 19, 2020 at 12:29 PM David Jacot  wrote:

> +1 (non-binding).
>
> Thanks for the KIP.
>
> On Fri, May 15, 2020 at 12:41 AM Guozhang Wang  wrote:
>
> > +1.
> >
> > Thanks Colin!
> >
> > Guozhang
> >
> > On Tue, May 12, 2020 at 3:45 PM Colin McCabe  wrote:
> >
> > > Hi all,
> > >
> > > I'd like to start a vote on KIP-604: Remove ZooKeeper Flags from the
> > > Administrative Tools.
> > >
> > > As a reminder, this KIP is for the next major release of Kafka, the 3.0
> > > release.   So it won't go into the upcoming 2.6 release.  It's a pretty
> > > small change that just removes the --zookeeper flags from some tools
> and
> > > removes a deprecated tool.  We haven't decided exactly when we'll do
> 3.0
> > > but I believe we will certainly want this change in that release.
> > >
> > > The KIP does contain one small change relevant to Kafka 2.6: adding
> > > support for --if-exists and --if-not-exists in combination with the
> > > --bootstrap-server flag.
> > >
> > > best,
> > > Colin
> > >
> >
> >
> > --
> > -- Guozhang
> >
>


Re: [VOTE] KIP-597: MirrorMaker2 internal topics Formatters

2020-05-19 Thread Manikumar
+1 (binding)

Thanks for the KIP.

On Tue, May 19, 2020 at 2:57 PM Mickael Maison 
wrote:

> Thanks Konstantine for the feedback (and vote)!
>
> 1) I've added example commands using the new formatters
>
> 2) I updated the Compatibility section to be more explicit about the
> need for recompilation
>
> 3) Good point, updated
>
> On Tue, May 19, 2020 at 3:18 AM Konstantine Karantasis
>  wrote:
> >
> > Thanks Mickael.
> > I think it's useful to enable specialized message formatters by adding
> this
> > interface to the public API.
> >
> > You have my vote: +1 (binding)
> >
> > Just a few optional comments below:
> >
> > 1. Would you mind adding the equivalent command line example in the
> places
> > where you have an example output?
> >
> > Something equivalent to
> > ./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic
> > __consumer_offsets --from-beginning --formatter
> >
> "kafka.coordinator.group.GroupMetadataManager\$GroupMetadataMessageFormatter"
> >
> > but with the equivalent formatter classes and expected topic names.
> >
> > 2. I have to note that breaking old formatters by requiring recompilation
> > could be avoided if we didn't change kafka.common.MessageFormatter to
> > extend the new org.apache.kafka.common.MessageFormatter. We could
> maintain
> > both, while the old one would remain deprecated and we could attempt to
> > instantiate one or the other type when reading the config and use either
> of
> > the two different types in the few places in ConsoleConsumer that a
> > formatter is used. But I admit that for this use case, it's not worth
> > maintaining both types. The interface wasn't public before anyways.
> >
> > Given that, my small request would be to rephrase in the compatibility
> > section to say something like:
> > 'Existing MessageFormatters implementations will require no changes
> beyond
> > recompilation.' or similar. Because, to be precise, existing formatters
> > _won't_ work if they are given as a parameter to a 2.6 console consumer,
> > without recompilation as you mention.
> >
> > 3. Finally, a minor comment on skipping the use of the `public` specifier
> > in the interface because it's redundant.
> >
> > Best regards,
> > Konstantine
> >
> > On Mon, May 18, 2020 at 3:26 PM Maulin Vasavada <
> maulin.vasav...@gmail.com>
> > wrote:
> >
> > > +1 (non-binding)
> > >
> > > On Mon, May 18, 2020 at 8:49 AM Mickael Maison <
> mickael.mai...@gmail.com>
> > > wrote:
> > >
> > > > Bumping this thread as KIP freeze is approaching.
> > > >
> > > > It's a pretty small change and I have a PR ready:
> > > > https://github.com/apache/kafka/pull/8604
> > > >
> > > > Thanks
> > > >
> > > > On Mon, May 4, 2020 at 5:26 PM Ryanne Dolan 
> > > wrote:
> > > > >
> > > > > +1, non-binding
> > > > >
> > > > > On Mon, May 4, 2020, 9:24 AM Christopher Egerton <
> chr...@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > On Mon, May 4, 2020 at 5:02 AM Edoardo Comar 
> > > > wrote:
> > > > > >
> > > > > > > +1 (non-binding)
> > > > > > > Thanks Mickael
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Edoardo Comar
> > > > > > >
> > > > > > > Event Streams for IBM Cloud
> > > > > > > IBM UK Ltd, Hursley Park, SO21 2JN
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > From:   Mickael Maison 
> > > > > > > To: dev 
> > > > > > > Date:   04/05/2020 11:45
> > > > > > > Subject:[EXTERNAL] [VOTE] KIP-597: MirrorMaker2
> internal
> > > > topics
> > > > > > > Formatters
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I'd like to start a vote on KIP-597:
> > > > > > >
> > > > > > >
> > > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-597%3A+MirrorMaker2+internal+topics+Formatters
> > > > > > >
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Unless stated otherwise above:
> > > > > > > IBM United Kingdom Limited - Registered in England and Wales
> with
> > > > number
> > > > > > > 741598.
> > > > > > > Registered office: PO Box 41, North Harbour, Portsmouth,
> Hampshire
> > > > PO6
> > > > > > 3AU
> > > > > > >
> > > > > > >
> > > > > >
> > > >
> > >
>


Re: [VOTE] KIP-612: Ability to Limit Connection Creation Rate on Brokers

2020-05-19 Thread Rajini Sivaram
+1 (binding)

Thanks for the KIP, Anna!

Regards,

Rajini


On Tue, May 19, 2020 at 9:32 AM Alexandre Dupriez <
alexandre.dupr...@gmail.com> wrote:

> +1 (non-binding)
>
> Thank you for the KIP!
>
>
> Le mar. 19 mai 2020 à 07:57, David Jacot  a écrit :
> >
> > +1 (non-binding)
> >
> > Thanks for the KIP, Anna!
> >
> > On Tue, May 19, 2020 at 7:12 AM Satish Duggana  >
> > wrote:
> >
> > > +1 (non-binding)
> > > Thanks Anna for the nice feature to control the connection creation
> rate
> > > from the clients.
> > >
> > > On Tue, May 19, 2020 at 8:16 AM Gwen Shapira 
> wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > Thank you for driving this, Anna
> > > >
> > > > On Mon, May 18, 2020 at 4:55 PM Anna Povzner 
> wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I would like to start the vote on KIP-612: Ability to limit
> connection
> > > > > creation rate on brokers.
> > > > >
> > > > > For reference, here is the KIP wiki:
> > > > >
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-612%3A+Ability+to+Limit+Connection+Creation+Rate+on+Brokers
> > > > >
> > > > > And discussion thread:
> > > > >
> > > > >
> > > >
> > >
> https://lists.apache.org/thread.html/r61162661fa307d0bc5c8326818bf223a689c49e1c828c9928ee26969%40%3Cdev.kafka.apache.org%3E
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Anna
> > > > >
> > > >
> > > >
> > > > --
> > > > Gwen Shapira
> > > > Engineering Manager | Confluent
> > > > 650.450.2760 | @gwenshap
> > > > Follow us: Twitter | blog
> > > >
> > >
>


Re: [VOTE] KIP 585: Filter and conditional SMTs

2020-05-19 Thread Gunnar Morling
+1 (non-binding)

Thanks for working on this, Tom! This KIP will be very useful for
connectors like Debezium.

--Gunnar

Am Fr., 15. Mai 2020 um 20:02 Uhr schrieb Konstantine Karantasis
:
>
> +1 (binding)
>
> Thanks Tom.
>
> Konstantine
>
> On Fri, May 15, 2020 at 5:03 AM Andrew Schofield 
> wrote:
>
> > +1 (non-binding)
> >
> > Thanks for the KIP. This will be very useful.
> >
> > Andrew Schofield
> >
> > On 13/05/2020, 10:14, "Tom Bentley"  wrote:
> >
> > Hi,
> >
> > I'd like to start a vote on KIP-585: Filter and conditional SMTs
> >
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-585%3A+Filter+and+Conditional+SMTs
> >
> > Those involved in the discussion seem to be positively disposed to the
> > idea, but in the absence of any committer participation it's been
> > difficult
> > to find a consensus on how these things should be configured. What's
> > presented here seemed to be the option which people preferred overall.
> >
> > Kind regards,
> >
> > Tom
> >
> >


Re: [VOTE] KIP-597: MirrorMaker2 internal topics Formatters

2020-05-19 Thread Mickael Maison
Thanks Konstantine for the feedback (and vote)!

1) I've added example commands using the new formatters

2) I updated the Compatibility section to be more explicit about the
need for recompilation

3) Good point, updated
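
As a rough illustration of points 1 and 2, a custom formatter built against the
new public interface could look like the sketch below. The interface shape
assumed here (a writeTo(ConsumerRecord, PrintStream) method plus configure and
close hooks) is my reading of the KIP and may differ from the final API:

import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.MessageFormatter;

public class PlainTextValueFormatter implements MessageFormatter {

    @Override
    public void configure(Map<String, ?> configs) { } // no configuration needed

    @Override
    public void writeTo(ConsumerRecord<byte[], byte[]> record, PrintStream output) {
        // Print topic, partition, offset and the raw value so internal-topic
        // payloads (e.g. MirrorMaker2 heartbeats or checkpoints) can be inspected.
        byte[] value = record.value();
        output.printf("%s-%d@%d: %s%n", record.topic(), record.partition(), record.offset(),
                value == null ? "null" : new String(value, StandardCharsets.UTF_8));
    }

    @Override
    public void close() { }
}

It would then be passed to the console consumer with
--formatter com.example.PlainTextValueFormatter, analogous to the
GroupMetadataMessageFormatter example quoted below.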

On Tue, May 19, 2020 at 3:18 AM Konstantine Karantasis
 wrote:
>
> Thanks Mickael.
> I think it's useful to enable specialized message formatters by adding this
> interface to the public API.
>
> You have my vote: +1 (binding)
>
> Just a few optional comments below:
>
> 1. Would you mind adding the equivalent command line example in the places
> where you have an example output?
>
> Something equivalent to
> ./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic
> __consumer_offsets --from-beginning --formatter
> "kafka.coordinator.group.GroupMetadataManager\$GroupMetadataMessageFormatter"
>
> but with the equivalent formatter classes and expected topic names.
>
> 2. I have to note that breaking old formatters by requiring recompilation
> could be avoided if we didn't change kafka.common.MessageFormatter to
> extend the new org.apache.kafka.common.MessageFormatter. We could maintain
> both, while the old one would remain deprecated and we could attempt to
> instantiate one or the other type when reading the config and use either of
> the two different types in the few places in ConsoleConsumer that a
> formatter is used. But I admit that for this use case, it's not worth
> maintaining both types. The interface wasn't public before anyways.
>
> Given that, my small request would be to rephrase in the compatibility
> section to say something like:
> 'Existing MessageFormatters implementations will require no changes beyond
> recompilation.' or similar. Because, to be precise, existing formatters
> _won't_ work if they are given as a parameter to a 2.6 console consumer,
> without recompilation as you mention.
>
> 3. Finally, a minor comment on skipping the use of the `public` specifier
> in the interface because it's redundant.
>
> Best regards,
> Konstantine
>
> On Mon, May 18, 2020 at 3:26 PM Maulin Vasavada 
> wrote:
>
> > +1 (non-binding)
> >
> > On Mon, May 18, 2020 at 8:49 AM Mickael Maison 
> > wrote:
> >
> > > Bumping this thread as KIP freeze is approaching.
> > >
> > > It's a pretty small change and I have a PR ready:
> > > https://github.com/apache/kafka/pull/8604
> > >
> > > Thanks
> > >
> > > On Mon, May 4, 2020 at 5:26 PM Ryanne Dolan 
> > wrote:
> > > >
> > > > +1, non-binding
> > > >
> > > > On Mon, May 4, 2020, 9:24 AM Christopher Egerton 
> > > > wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > On Mon, May 4, 2020 at 5:02 AM Edoardo Comar 
> > > wrote:
> > > > >
> > > > > > +1 (non-binding)
> > > > > > Thanks Mickael
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Edoardo Comar
> > > > > >
> > > > > > Event Streams for IBM Cloud
> > > > > > IBM UK Ltd, Hursley Park, SO21 2JN
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > From:   Mickael Maison 
> > > > > > To: dev 
> > > > > > Date:   04/05/2020 11:45
> > > > > > Subject:[EXTERNAL] [VOTE] KIP-597: MirrorMaker2 internal
> > > topics
> > > > > > Formatters
> > > > > >
> > > > > >
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I'd like to start a vote on KIP-597:
> > > > > >
> > > > > >
> > > > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-597%3A+MirrorMaker2+internal+topics+Formatters
> > > > > >
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Unless stated otherwise above:
> > > > > > IBM United Kingdom Limited - Registered in England and Wales with
> > > number
> > > > > > 741598.
> > > > > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire
> > > PO6
> > > > > 3AU
> > > > > >
> > > > > >
> > > > >
> > >
> >


Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Serhii Bilyk
Hi,
Please, tell me how to unsubscribe.
Thanks,
Serhii

On Mon, 18 May 2020, 23:44 Aakash Shah,  wrote:

> Hi Chris,
>
> I agree with your point.
>
> Randall, Konstantine, do you guys mind weighing in on any benefit of adding
> asynchronous functionality using a Future in the KIP right now? It seems to
> me that it only provides user control on when the thread will be blocked,
> and if we are going to process all the futures at once in a batch at the
> end, why not support batch processing in a future KIP, since it is not too
> difficult now that we are adding an interface. I'm not sure I see any gain
> beyond some user control that could increase throughput - but at the same
> time, as I mentioned before, I don't think throughput is a factor we need
> to consider much with error reporting. We don't really need or necessarily
> want a higher throughput on error reporting, as ideally, there should not
> be a high volume of errant records.
>
> Thanks,
> Aakash
>
> On Mon, May 18, 2020 at 1:22 PM Chris Egerton  wrote:
>
> > Hi Aakash,
> >
> > Yep, that's pretty much it. I'd also like to emphasize that we should be
> > identifying practical use cases for whatever API we provide. Giving
> > developers a future that can be made synchronous with little effort seems
> > flexible, but if that's all that developers are going to do with it
> anyway,
> > why make it a future at all? We should have some idea of how people would
> > use a future that doesn't just hinge on them blocking on it immediately,
> > and isn't more easily-addressed by a different API (such as one with
> batch
> > reporting).
> >
> > Cheers,
> >
> > Chris
> >
> > On Mon, May 18, 2020 at 1:17 PM Aakash Shah  wrote:
> >
> > > Hi all,
> > >
> > > Chris, I see your points about whether Futures provide much benefit at
> > all
> > > as they are not truly fully asynchronous.
> > >
> > > Correct me if I am wrong, but I think what you are trying to point out
> is
> > > that if we have the option to add additional functionality later (in a
> > > simpler way too since we are introducing a new interface), we should
> > > provide functionality that we know will provide value immediately and
> not
> > > cause any developer/user burden.
> > >
> > > In that case, I think the main area we have to come to a consensus on
> is
> > -
> > > how much control do we want to provide to the developer/user in this
> KIP
> > > considering that we can add the functionality relatively easily later?
> > >
> > > Randall, Konstantine, what do you think about adding it later vs now?
> > >
> > > Thanks,
> > > Aakash
> > >
> > > On Mon, May 18, 2020 at 12:45 PM Chris Egerton 
> > > wrote:
> > >
> > > > Hi Aakash,
> > > >
> > > > I asked this earlier about whether futures were the right way to go,
> if
> > > we
> > > > wanted to enable asynchronous behavior at all:
> > > >
> > > > > I'm still unclear on how futures are going to provide any benefit
> to
> > > > developers, though. Blocking on the return of such a future slightly
> > > later
> > > > on in the process of handling records is still blocking, and to be
> done
> > > > truly asynchronously without blocking processing of non-errant
> records,
> > > > would have to be done on a separate thread. It's technically possible
> > for
> > > > users to cache all of these futures and instead of invoking "get" on
> > > them,
> > > > simply check whether they're complete or not via "isDone", but this
> > seems
> > > > like an anti-pattern.
> > > >
> > > > > What is the benefit of wrapping this in a future?
> > > >
> > > > As far as I can tell, there hasn't been a practical example yet where
> > the
> > > > flexibility provided by a future would actually be beneficial in
> > writing
> > > a
> > > > connector. It'd be great if we could find one. One possible use case
> > > might
> > > > be processing records received in "SinkTask::put" without having to
> > block
> > > > for each errant record report before sending non-errant records to
> the
> > > > sink. However, this could also be addressed by allowing for batch
> > > reporting
> > > > of errant records instead of accepting a single record at a time; the
> > > task
> > > > would track errant records as it processes them in "put" and report
> > them
> > > > all en-masse after all non-errant records have been processed.
> > > >
> > > > With regards to the precedent of using futures for asynchronous
> APIs, I
> > > > think we should make sure that whatever API we decide on is actually
> > > useful
> > > > for the cases it serves. There's plenty of precedent for
> callback-based
> > > > asynchronous APIs in Kafka with both "Producer::send" and
> > > > "Consumer::commitAsync"; the question here shouldn't be about what's
> > done
> > > > in different APIs, but what would work for this one in particular.
> > > >
> > > > Finally, it's also been brought up that if we're going to introduce a
> > new
> > > > error reporter interface, we can always modify that interface later
> on
> > to
> > > > go

Re: [VOTE] KIP-612: Ability to Limit Connection Creation Rate on Brokers

2020-05-19 Thread Alexandre Dupriez
+1 (non-binding)

Thank you for the KIP!


Le mar. 19 mai 2020 à 07:57, David Jacot  a écrit :
>
> +1 (non-binding)
>
> Thanks for the KIP, Anna!
>
> On Tue, May 19, 2020 at 7:12 AM Satish Duggana 
> wrote:
>
> > +1 (non-binding)
> > Thanks Anna for the nice feature to control the connection creation rate
> > from the clients.
> >
> > On Tue, May 19, 2020 at 8:16 AM Gwen Shapira  wrote:
> >
> > > +1 (binding)
> > >
> > > Thank you for driving this, Anna
> > >
> > > On Mon, May 18, 2020 at 4:55 PM Anna Povzner  wrote:
> > >
> > > > Hi All,
> > > >
> > > > I would like to start the vote on KIP-612: Ability to limit connection
> > > > creation rate on brokers.
> > > >
> > > > For reference, here is the KIP wiki:
> > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-612%3A+Ability+to+Limit+Connection+Creation+Rate+on+Brokers
> > > >
> > > > And discussion thread:
> > > >
> > > >
> > >
> > https://lists.apache.org/thread.html/r61162661fa307d0bc5c8326818bf223a689c49e1c828c9928ee26969%40%3Cdev.kafka.apache.org%3E
> > > >
> > > > Thanks,
> > > >
> > > > Anna
> > > >
> > >
> > >
> > > --
> > > Gwen Shapira
> > > Engineering Manager | Confluent
> > > 650.450.2760 | @gwenshap
> > > Follow us: Twitter | blog
> > >
> >


Re: [DISCUSS] KIP-612: Ability to Limit Connection Creation Rate on Brokers

2020-05-19 Thread Alexandre Dupriez
Hi Anna,

Thank you for your answers.

900. Sorry, I forgot the connection creation rate metric already
allows users to detect excessive rates. I agree per-IP metrics would
expose to a high-cardinality risk.
902. Thank you for the pointers. I will explore the existing stress
test in Trogdor.

Thanks,
Alexandre

Le mar. 19 mai 2020 à 00:51, Anna Povzner  a écrit :
>
> I realized the KIP freeze is on May 20. I will start the voting thread now.
>
> On Mon, May 18, 2020 at 3:19 PM Anna Povzner  wrote:
>
> > Thanks everyone for the feedback. I will start a voting thread tomorrow
> > morning if there are no more comments.
> >
> > Regards,
> > Anna
> >
> > On Mon, May 18, 2020 at 2:06 PM Anna Povzner  wrote:
> >
> >> Hi Boyang,
> >>
> >> This KIP does not change the protocol with clients. The behavior is the
> >> same as with KIP-402 where the broker delays accepting new connections when
> >> the limit for the number of connections is reached. This KIP adds another
> >> reason for the delay (when the rate is reached). It is similar to dropping a
> >> connection when the per-IP limit is reached, except that this KIP delays the
> >> response or may still accept the connection. The client may time out while
> >> waiting for the connection, and then retry.
> >>
> >> Thanks,
> >> Anna
> >>
> >> On Mon, May 18, 2020 at 12:54 PM Boyang Chen 
> >> wrote:
> >>
> >>> Hey Anna,
> >>>
> >>> thanks for the KIP. Will this change be applied as one type of quota
> >>> violation, which for client side should be retriable? For EOS model
> >>> before
> >>> 2.6, the Streams client creates one producer for each input partition, so
> >>> it is actually possible to create thousands of producers when the service
> >>> is up. Just want to clarify what's the expected behavior to be seen on
> >>> the
> >>> client side?
> >>>
> >>> On Mon, May 18, 2020 at 12:04 PM Anna Povzner  wrote:
> >>>
> >>> > Hi Alexandre,
> >>> >
> >>> > Thanks for your comments. My answers are below:
> >>> >
> >>> > 900. The KIP does not propose any new metrics because we already have
> >>> > metrics that will let us monitor connection attempts and the amount of
> >>> time
> >>> > the broker delays accepting new connections:
> >>> > 1. We have a per-listener (and per-processor) metric for connection
> >>> > creation rate:
> >>> >
> >>> >
> >>> kafka.server:type=socket-server-metrics,listener={listener_name},networkProcessor={#},name=connection-creation-rate
> >>> > 2. We have per-listener metrics that track the amount of time Acceptor
> >>> is
> >>> > blocked from accepting connections:
> >>> >
> >>> >
> >>> kafka.network:type=Acceptor,name=AcceptorBlockedPercent,listener={listener_name}
> >>> > Note that adding per IP JMX metrics may end up adding a lot of
> >>> overhead,
> >>> > especially for clusters with a large number of clients and many
> >>> different
> >>> > IP addresses. If we ever want to add the metric, perhaps we could
> >>> propose a
> >>> > separate KIP, but that would require some more evaluation of potential
> >>> > overhead.
> >>> >
> >>> > 901. Yes, I updated the wiki with the approach for enforcing per IP
> >>> limits
> >>> > (not dropping right away), as I described in my response to Rajini.
> >>> >
> >>> > 902. Any additional stress testing is always super useful. I am going
> >>> to
> >>> > have PR with the first half of the KIP ready soon (broker-wider and
> >>> > per-listener limits). Perhaps it could be worthwhile to see if it makes
> >>> > sense to add stress testing to muckrake tests. Also, check out
> >>> connection
> >>> > stress workloads in Trogdor and whether they are sufficient or could be
> >>> > extended:
> >>> >
> >>> >
> >>> https://github.com/apache/kafka/tree/trunk/tools/src/main/java/org/apache/kafka/trogdor/workload
> >>> >
> >>> > Regards,
> >>> > Anna
> >>> >
> >>> > On Mon, May 18, 2020 at 8:57 AM Rajini Sivaram <
> >>> rajinisiva...@gmail.com>
> >>> > wrote:
> >>> >
> >>> > > Hi Anna,
> >>> > >
> >>> > > Thanks for the response, sounds good.
> >>> > >
> >>> > > Regards,
> >>> > >
> >>> > > Rajini
> >>> > >
> >>> > >
> >>> > > On Sun, May 17, 2020 at 1:38 AM Anna Povzner 
> >>> wrote:
> >>> > >
> >>> > > > Hi Rajini,
> >>> > > >
> >>> > > > Thanks for reviewing the KIP!
> >>> > > >
> >>> > > > I agree with your suggestion to make per-IP connection rate quota a
> >>> > > dynamic
> >>> > > > quota for entity name IP. This will allow configuring connection
> >>> rate
> >>> > > for a
> >>> > > > particular IP as well. I updated the wiki accordingly.
> >>> > > >
> >>> > > > Your second concern makes sense -- rejecting the connection right
> >>> away
> >>> > > will
> >>> > > > likely cause a new connection from the same client. I am concerned
> >>> > about
> >>> > > > delaying new connections for processing later, because if the
> >>> > connections
> >>> > > > keep coming with the high rate, there may be potentially a large
> >>> > backlog
> >>> > > > and connections may start timing out before the broker gets to
> >>> > processing
> >>> > > > them. For 

[jira] [Created] (KAFKA-10021) When reading to the end of the config log, check if fetch.max.wait.ms is greater than worker.sync.timeout.ms

2020-05-19 Thread Sanjana Kaundinya (Jira)
Sanjana Kaundinya created KAFKA-10021:
-

 Summary: When reading to the end of the config log, check if 
fetch.max.wait.ms is greater than worker.sync.timeout.ms
 Key: KAFKA-10021
 URL: https://issues.apache.org/jira/browse/KAFKA-10021
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Sanjana Kaundinya


Currently in the Connect code in DistributedHerder.java, we see the following 
piece of code

 

if (!canReadConfigs && !readConfigToEnd(workerSyncTimeoutMs))
    return; // Safe to return and tick immediately because readConfigToEnd will do the backoff for us

where the workerSyncTimeoutMs passed in is the timeout given to read to the end 
of the config log. This is a bug, as we should check whether fetch.max.wait.ms is 
greater than worker.sync.timeout.ms and, if it is, use worker.sync.timeout.ms as 
the fetch.max.wait.ms. A better fix would be to use the AdminClient to read to 
the end of the log, but at a minimum we should check the configs.
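
A minimal sketch of the interim check described above (the method and values
are illustrative only; the actual fix may look different):

public final class ConfigLogReadTimeouts {
    private ConfigLogReadTimeouts() { }

    // Never let the consumer's fetch.max.wait.ms exceed worker.sync.timeout.ms
    // when reading the config topic to the end; otherwise a single fetch can
    // outlive the sync timeout that bounds the whole read.
    public static long effectiveFetchMaxWaitMs(long fetchMaxWaitMs, long workerSyncTimeoutMs) {
        return Math.min(fetchMaxWaitMs, workerSyncTimeoutMs);
    }

    public static void main(String[] args) {
        // Typical case: fetch.max.wait.ms (500 ms default) is already below the sync timeout.
        System.out.println(effectiveFetchMaxWaitMs(500L, 3000L));   // 500
        // Problematic case this issue describes: an overridden fetch.max.wait.ms gets capped.
        System.out.println(effectiveFetchMaxWaitMs(60000L, 3000L)); // 3000
    }
}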



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] KIP-607: Add Metrics to Record the Memory Used by RocksDB to Kafka Streams

2020-05-19 Thread Bruno Cadonna
Thank you for voting!

This KIP passes with:
4 binding +1
1 non-binding +1
0 -1

Best,
Bruno

On Fri, May 15, 2020 at 11:34 PM Matthias J. Sax  wrote:
>
> +1 (binding)
>
> -Matthias
>
> On 5/15/20 11:48 AM, John Roesler wrote:
> > Thanks, Bruno!
> >
> > I’m +1 (binding)
> >
> > -John
> >
> > On Fri, May 15, 2020, at 11:32, Sophie Blee-Goldman wrote:
> >> Thanks Bruno! +1 (non-binding)
> >>
> >> Sophie
> >>
> >> On Fri, May 15, 2020 at 8:15 AM Bill Bejeck  wrote:
> >>
> >>> Thanks for the KIP!
> >>>
> >>> +1 (binding)
> >>>
> >>> -Bill
> >>>
> >>> On Fri, May 15, 2020 at 11:12 AM Guozhang Wang  wrote:
> >>>
>  +1.
> 
>  Thanks!
> 
>  On Fri, May 15, 2020 at 1:36 AM Bruno Cadonna 
> >>> wrote:
> 
> > Hi all,
> >
> > I'd like to call for votes on
> >
> > KIP-607: Add Metrics to Record the Memory Used by RocksDB to Kafka
>  Streams
> >
> > The KIP can be found here
> >
> >
> 
> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-607%3A+Add+Metrics+to+Kafka+Streams+to+Record+the+Memory+Used+by+RocksDB
> >
> > The discussion thread can be found here:
> >
> >
> 
> >>> http://mail-archives.apache.org/mod_mbox/kafka-dev/202005.mbox/%3CCADR0NwzJBJa6WihnpmGj0R%2BYPVrojq4Kg_hOArNEytHAG-tZAQ%40mail.gmail.com%3E
> >
> > Best,
> > Bruno
> >
> 
> 
>  --
>  -- Guozhang
> 
> >>>
> >>
>


Re: [DISCUSS] KIP-610: Error Reporting in Sink Connectors

2020-05-19 Thread Arjun Satish
Hi Randall,

Thanks for the explanation! Excellent point about guaranteeing offsets in
the async case.

If we can guarantee that the offsets will be advanced only after the bad
records are reported, then is there any value in the Future<> return type?
I feel we can declare the function with a void return type:

void report(SinkRecord failedRecord, Throwable error)

that works asynchronously, and advances offsets only after the DLQ producer
(and other reporters) complete successfully (as you explained).

This actually alleviates my concern of what this Future<> actually means.
Since a failure to report should kill the tasks, there is no reason for the
connector to ever wait on the get(). And if we are guaranteeing that the
offsets are only advanced when the errors are reported, then this becomes a
double win:

1. connector developers can literally fire and forget failed records.
2. offsets are correctly advanced on errors being reported. Failure to
report error will kill the task, and the last committed offset will be the
correct one.

The main contract would simply be to call report() before preCommit() or
before put() returns in the task, so the framework knows that there
are error records reported, and those need to finish before the offsets can
be advanced.

I think I'd be pretty excited about this API, and if we all agree, then
let's go ahead with this?

Best,


On Mon, May 18, 2020 at 8:46 PM Aakash Shah  wrote:

> Hi Randall,
>
> I really appreciate the highly detailed explanation. It clears up the
> advantages of an asynchronous design using Futures, specifically because
> get() does not necessarily need to be called due to the guarantee put in
> place by the framework that you mentioned. I think that if this guarantee
> exists, AND it allows users to simply call report() without having to block
> with get() with the added throughput, this is a good way to go.
>
> Chris, Arjun, I'd like to hear if you guys are on board with this
> suggestion. I'll update the KIP with the guarantee section.
>
> Hope we can come to a consensus around this area.
>
> Let me know what you think.
>
> Thanks,
> Aakash
>
> On Mon, May 18, 2020 at 4:24 PM Randall Hauch  wrote:
>
> > Hi, Chris, Aakash, and others:
> >
> > First of all, apologies for the extremely long email. Secondly, thanks
> for
> > the input on this KIP. The timing is unfortunately, but I do believe
> we're
> > agreed on most points.
> >
> > Chris asked earlier:
> >
> > > I'm still unclear on how futures are going to provide any benefit to
> > > developers, though.
> >
> >
> > Let me rephrase this, because I think it's obvious that futures are just
> a
> > way for the sink task to know, if needed, *when* the reporter has
> > successfully recorded the report of a bad record:
> >
> > What is the benefit of an asynchronous `report(...)`, and is it worth the
> > additional complexity?
> >
> >
> > I would agree that this complexity is not worthwhile *if* every sink task
> > implementation were expected to call `get()` on the future right after it's
> > called. But I hope to show that the asynchronous API as described by
> > KIP-610 (as of the time I sent this email) is valuable, simple to use,
> > flexible, and not onerous for many sink task implementations, because many
> > sink task implementations will not *have* to use the future -- as long as
> > we add one *additional* guarantee to the current KIP (which I'll define
> > later on).
> >
> > To show this, I'd like to walk through an example of what happens when a
> > sink task runs. I'm going to describe a scenario that involves multiple
> > `put(...)` followed by a single commit of offsets, since this is the
> > pattern the WorkerSinkTask uses, and because the "commit" part is essential
> > to the guarantee I believe we need to make. Specifically, consider one task
> > for a sink connector consuming a single topic partition, and the calls made
> > by WorkerSinkTask:
> >
> > 1. put(...) with records with offsets 1-1000
> > 2. put(...) with records with offsets 1001-2000
> > 3. put(...) with records with offsets 2001-3000
> > 4. put(...) with records with offsets 3001-4000
> > 5. preCommit(...) with offset 4000
> >
> > Now, let's say that the sink task calls `report(...)` on six records at
> > offsets 10, 11, 1010, 1011, 2010, and 2011, and let's look at what happens
> > when `report(...)` is synchronous and asynchronous. For simplicity, we're
> > going to assume that records are sent to the external system in batches of
> > 100.
> >
> > Let's consider the synchronous `report(...)` case first. The task basically
> > goes through the following sequence:
> > a) processes records 1-9
> > b) report record 10, blocking until record 10 is written to the DLQ topic
> > c) report record 11, blocking until record 11 is written to the DLQ topic
> > d) process records 12-102
> > e) send batch of 100 records (offsets 1-102, minus 10 and 11) to external
> > system
> > f) process records 103-202
> > g) 

Re: [DISCUSS] KIP-590: Redirect Zookeeper Mutation Protocols to The Controller

2020-05-19 Thread David Jacot
Hi Boyang,

I've got another question regarding the auto topic creation case. The KIP
says: "Currently the target broker shall just utilize its own ZK client to
create
internal topics, which is disallowed in the bridge release. For above
scenarios,
non-controller broker shall just forward a CreateTopicRequest to the
controller
instead and let controller take care of the rest, while waiting for the
response
in the meantime." There will be no request to forward in this case, right?
Instead,
a CreateTopicsRequest is created and sent to the controller node.

When the CreateTopicsRequest is sent as a side effect of the
MetadataRequest,
it would be good to know the principal and the clientId in the controller
(quota,
audit, etc.). Do you plan to use the Envelope API for this case as well or
to call
the regular API directly? Another way to phrase it would be: Shall the
internal
CreateTopicsRequest be sent with the original metadata (principal,
clientId, etc.)
of the MetadataRequest or as an admin request?

Best,
David

On Fri, May 8, 2020 at 2:04 AM Guozhang Wang  wrote:

> Just to add a bit more FYI here related to the last question from David:
> in KIP-595 while implementing the new requests we are also adding a
> "KafkaNetworkChannel" which is used for brokers to send vote / fetch
> records, so besides the discussion on listeners I think implementation wise
> we can also consider consolidating a lot of those into the same call-trace
> as well -- of course this is not related to public APIs so maybe just needs
> to be coordinated among developments:
>
> 1. Broker -> Controller: ISR Change, Topic Creation, Admin Redirect
> (KIP-497).
> 2. Controller -> Broker: LeaderAndISR / MetadataUpdate; though these are
> likely going to be deprecated post KIP-500.
> 3. Txn Coordinator -> Broker: TxnMarker
>
>
> Guozhang
>
> On Wed, May 6, 2020 at 8:58 PM Boyang Chen 
> wrote:
>
> > Hey David,
> >
> > thanks for the feedbacks!
> >
> > On Wed, May 6, 2020 at 2:06 AM David Jacot  wrote:
> >
> > > Hi Boyang,
> > >
> > > While re-reading the KIP, I've got few small questions/comments:
> > >
> > > 1. When auto topic creation is enabled, brokers will send a
> > > CreateTopicRequest
> > > to the controller instead of writing to ZK directly. It means that
> > > creation of these topics is subject to being rejected with an error if a
> > > CreateTopicPolicy is used. Today,
> > > it bypasses the policy entirely. I suppose that clusters allowing auto
> > > topic creation
> > > don't have a policy in place, so it is not a big deal. I suggest calling
> > > out the limitation explicitly in the KIP though.
> > >
> > > That's a good idea, will add to the KIP.
> >
> >
> > > 2. In the same vein as my first point. How do you plan to handle errors
> > > when internal
> > > topics are created by a broker? Do you plan to retry retryable errors
> > > indefinitely?
> > >
> > > I checked a bit on the admin client handling of the create topic RPC.
> > It seems that
> > the only retriable exceptions at the moment are NOT_CONTROLLER and
> > REQUEST_TIMEOUT.
> > So I guess we just need to retry on these exceptions?
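> >
> > Purely as an illustration of the client-facing analogue (topic name,
> > bootstrap server, and backoff are placeholders, not anything from the KIP),
> > "retry only the retriable errors" with the admin client could look roughly
> > like the sketch below; the broker-side forwarding would of course not go
> > through the admin client:
> >
> > import java.util.Collections;
> > import java.util.Properties;
> > import java.util.concurrent.ExecutionException;
> > import org.apache.kafka.clients.admin.Admin;
> > import org.apache.kafka.clients.admin.AdminClientConfig;
> > import org.apache.kafka.clients.admin.NewTopic;
> > import org.apache.kafka.common.errors.RetriableException;
> >
> > public class CreateTopicWithRetry {
> >     public static void main(String[] args) throws Exception {
> >         Properties props = new Properties();
> >         props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
> >         try (Admin admin = Admin.create(props)) {
> >             NewTopic topic = new NewTopic("example-topic", 1, (short) 1);
> >             while (true) {
> >                 try {
> >                     admin.createTopics(Collections.singleton(topic)).all().get();
> >                     return;                // topic created successfully
> >                 } catch (ExecutionException e) {
> >                     if (e.getCause() instanceof RetriableException) {
> >                         Thread.sleep(100); // back off, then retry retriable errors
> >                     } else {
> >                         throw e;           // non-retriable: give up
> >                     }
> >                 }
> >             }
> >         }
> >     }
> > }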
> >
> >
> > > 3. Could you clarify which listener will be used for the internal
> > requests?
> > > Do you plan
> > > to use the control plane listener or perhaps the inter-broker listener?
> > >
> > > As we discussed in the KIP, currently the internal design for
> > broker->controller channel has not been
> > done yet, and I feel it makes sense to consolidate redirect RPC and
> > internal topic creation RPC to this future channel,
> > whose details will be fleshed out in the near future; right now some
> > controller refactoring effort is still WIP.
> >
> >
> > > Thanks,
> > > David
> > >
> > > On Mon, May 4, 2020 at 9:37 AM Sönke Liebau
> > >  wrote:
> > >
> > > > Ah, I see, thanks for the clarification!
> > > >
> > > > Shouldn't be an issue I think. My understanding of KIPs was always that
> > > > they are mostly intended as a place to discuss and agree changes up front,
> > > > whereas tracking the actual releases that things go into should be handled
> > > > in Jira.
> > > > So maybe we just create new jiras for any subsequent work and either link
> > > > those or make them subtasks (even though this jira is already a subtask
> > > > itself); that should allow us to properly track all releases that work goes
> > > > into.
> > > >
> > > > Thanks for your work on this!!
> > > >
> > > > Best,
> > > > Sönke
> > > >
> > > >
> > > > On Sat, 2 May 2020 at 00:31, Boyang Chen  >
> > > > wrote:
> > > >
> > > > > Sure thing Sonke, what I suggest is that usual KIPs get accepted to go
> > > > > into the next release. It could span a couple of releases because of
> > > > > engineering time, but no change has to be shipped in specific future
> > > > > releases, like the backward incompatible change for KafkaPrincipal. But
> > > > > I guess it's not really a blocker, as long as we 

Re: [VOTE]: KIP-604: Remove ZooKeeper Flags from the Administrative Tools

2020-05-19 Thread David Jacot
+1 (non-binding).

Thanks for the KIP.

On Fri, May 15, 2020 at 12:41 AM Guozhang Wang  wrote:

> +1.
>
> Thanks Colin!
>
> Guozhang
>
> On Tue, May 12, 2020 at 3:45 PM Colin McCabe  wrote:
>
> > Hi all,
> >
> > I'd like to start a vote on KIP-604: Remove ZooKeeper Flags from the
> > Administrative Tools.
> >
> > As a reminder, this KIP is for the next major release of Kafka, the 3.0
> > release.   So it won't go into the upcoming 2.6 release.  It's a pretty
> > small change that just removes the --zookeeper flags from some tools and
> > removes a deprecated tool.  We haven't decided exactly when we'll do 3.0
> > but I believe we will certainly want this change in that release.
> >
> > The KIP does contain one small change relevant to Kafka 2.6: adding
> > support for --if-exists and --if-not-exists in combination with the
> > --bootstrap-server flag.
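> >
> > For example (host, topic name, and counts below are just placeholders), once
> > that change lands you should be able to run something like:
> >
> >   bin/kafka-topics.sh --bootstrap-server localhost:9092 --create \
> >     --topic my-topic --partitions 3 --replication-factor 1 --if-not-exists
> >
> > and have the command succeed whether or not the topic already exists.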
> >
> > best,
> > Colin
> >
>
>
> --
> -- Guozhang
>