Jenkins build is back to normal : kafka-trunk-jdk7 #3130

2018-01-29 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-206: Add support for UUID serialization and deserialization

2018-01-29 Thread Ewen Cheslack-Postava
+1 (binding)

On Fri, Jan 26, 2018 at 9:16 AM, Colin McCabe  wrote:

> +1 (non-binding)
>
>
>
> On Fri, Jan 26, 2018, at 08:29, Ted Yu wrote:
> > +1
> >
> > On Fri, Jan 26, 2018 at 7:00 AM, Brandon Kirchner <
> > brandon.kirch...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I would like to (re)start the voting process for KIP-206:
> > >
> > > *https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 206%3A+Add+support+for+UUID+serialization+and+deserialization
> > >  > > 206%3A+Add+support+for+UUID+serialization+and+deserialization>*
> > >
> > > The KIP adds a UUID serializer and deserializer. Possible
> implementation
> > > can be seen here --
> > >
> > > https://github.com/apache/kafka/pull/4438
> > >
> > > Original discussion and voting thread can be seen here --
> > > http://search-hadoop.com/m/Kafka/uyzND1dlgePJY7l9?subj=+
> > > DISCUSS+KIP+206+Add+support+for+UUID+serialization+and+deserialization
> > >
> > >
> > > Thanks!
> > > Brandon K.
> > >
>
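
A minimal sketch of how the proposed serializer would be used with a producer. The
class name and package (org.apache.kafka.common.serialization.UUIDSerializer) follow
the implementation in PR #4438 and are assumptions until the KIP lands.

{code}
// Sketch only: assumes the UUIDSerializer proposed in PR #4438 ends up in
// org.apache.kafka.common.serialization, alongside the existing serializers.
import java.util.Properties;
import java.util.UUID;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class UuidKeyExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.UUIDSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<UUID, String> producer = new KafkaProducer<>(props)) {
            // The key is serialized as the UUID's string representation.
            producer.send(new ProducerRecord<>("example-topic", UUID.randomUUID(), "hello"));
        }
    }
}
{code}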


Build failed in Jenkins: kafka-trunk-jdk9 #343

2018-01-29 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] KAFKA-3625: Add public test utils for Kafka Streams (#4402)

[wangguoz] KAFKA-6451: Simplifying KStreamReduce and KStreamAggregate

[github] MINOR: Code refacotring in KTable-KTable Join (#4486)

--
[...truncated 1.45 MB...]

kafka.integration.TopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.TopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.TopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.TopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.TopicMetadataTest > testAutoCreateTopicWithInvalidReplication 
STARTED

kafka.integration.TopicMetadataTest > testAutoCreateTopicWithInvalidReplication 
PASSED

kafka.integration.TopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.TopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooHigh 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooHigh 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooHigh 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooHigh 
PASSED

kafka.integration.MinIsrConfigTest > testDefaultKafkaConfig STARTED

kafka.integration.MinIsrConfigTest > testDefaultKafkaConfig PASSED

kafka.integration.FetcherTest > testFetcher STARTED

kafka.integration.FetcherTest > testFetcher PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.zk.KafkaZkClientTest > testBrokerRegistrationMethods STARTED

kafka.zk.KafkaZkClientTest > testBrokerRegistrationMethods PASSED

kafka.zk.KafkaZkClientTest > testSetGetAndDeletePartitionReassignment STARTED

kafka.zk.KafkaZkClientTest > testSetGetAndDeletePartitionReassignment PASSED

kafka.zk.KafkaZkClientTest > testGetDataAndVersion STARTED

kafka.zk.KafkaZkClientTest > testGetDataAndVersion PASSED

kafka.zk.KafkaZkClientTest > testGetChildren STARTED

kafka.zk.KafkaZkClientTest > testGetChildren PASSED

kafka.zk.KafkaZkClientTest > testSetAndGetConsumerOffset STARTED

kafka.zk.KafkaZkClientTest > testSetAndGetConsumerOffset PASSED

kafka.zk.KafkaZkClientTest > testClusterIdMethods STARTED

kafka.zk.KafkaZkClientTest > testClusterIdMethods PASSED

kafka.zk.KafkaZkClientTest > testEntityConfigManagementMethods STARTED

kafka.zk.KafkaZkClientTest > testEntityConfigManagementMethods PASSED

kafka.zk.KafkaZkClientTest > testCreateRecursive STARTED

kafka.zk.KafkaZkClientTest > testCreateRecursive PASSED

kafka.zk.KafkaZkClientTest > testGetConsumerOffsetNoData STARTED

kafka.zk.KafkaZkClientTest > testGetConsumerOffsetNoData PASSED

kafka.zk.KafkaZkClientTest > testDeleteTopicPathMethods STARTED

kafka.zk.KafkaZkClientTest > testDeleteTopicPathMethods PASSED

kafka.zk.KafkaZkClientTest > testAclManagementMethods STARTED

kafka.zk.KafkaZkClientTest > testAclManagementMethods PASSED

kafka.zk.KafkaZkClientTest > testPreferredReplicaElectionMethods STARTED

kafka.zk.KafkaZkClientTest > testPreferredReplicaElectionMethods PASSED

kafka.zk.KafkaZkClientTest > testPropagateLogDir STARTED

kafka.zk.KafkaZkClientTest > testPropagateLogDir PASSED

kafka.zk.KafkaZkClientTest > testGetDataAndStat STARTED

kafka.zk.KafkaZkClientTest > 

Build failed in Jenkins: kafka-trunk-jdk8 #2368

2018-01-29 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] MINOR: Fix some streams web doc nits (#4411)

--
[...truncated 1.46 MB...]
org.apache.kafka.streams.integration.RestoreIntegrationTest > 
shouldRestoreState STARTED

org.apache.kafka.streams.integration.RestoreIntegrationTest > 
shouldRestoreState PASSED

org.apache.kafka.streams.integration.RestoreIntegrationTest > 
shouldSuccessfullyStartWhenLoggingDisabled STARTED

org.apache.kafka.streams.integration.RestoreIntegrationTest > 
shouldSuccessfullyStartWhenLoggingDisabled PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeftInner[caching enabled = true] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeftInner[caching enabled = true] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeftOuter[caching enabled = true] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeftOuter[caching enabled = true] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeftLeft[caching enabled = true] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeftLeft[caching enabled = true] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterLeft[caching enabled = true] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterLeft[caching enabled = true] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testInner[caching enabled = true] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testInner[caching enabled = true] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuter[caching enabled = true] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuter[caching enabled = true] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeft[caching enabled = true] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeft[caching enabled = true] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testInnerInner[caching enabled = true] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testInnerInner[caching enabled = true] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testInnerOuter[caching enabled = true] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testInnerOuter[caching enabled = true] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testInnerLeft[caching enabled = true] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testInnerLeft[caching enabled = true] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterInner[caching enabled = true] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterInner[caching enabled = true] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterOuter[caching enabled = true] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterOuter[caching enabled = true] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeftInner[caching enabled = false] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeftInner[caching enabled = false] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeftOuter[caching enabled = false] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeftOuter[caching enabled = false] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeftLeft[caching enabled = false] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeftLeft[caching enabled = false] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterLeft[caching enabled = false] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterLeft[caching enabled = false] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testInner[caching enabled = false] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testInner[caching enabled = false] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuter[caching enabled = false] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuter[caching enabled = false] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testLeft[caching enabled = false] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 

[jira] [Created] (KAFKA-6500) Can not build aggregatedJavadoc

2018-01-29 Thread Pavel Timofeev (JIRA)
Pavel Timofeev created KAFKA-6500:
-

 Summary: Can not build aggregatedJavadoc
 Key: KAFKA-6500
 URL: https://issues.apache.org/jira/browse/KAFKA-6500
 Project: Kafka
  Issue Type: Bug
  Components: documentation
Reporter: Pavel Timofeev


I'm trying to build Kafka according to the instructions provided on GitHub: 
https://github.com/apache/kafka

I followed the instructions and managed to build a jar with
{code}
./gradlew jar
{code}

But I'm unable to build aggregated javadoc with
{code}
./gradlew aggregatedJavadoc
{code}

It fails with both OpenJDK 1.7 and 1.8.
It tells me:
{noformat}
FAILURE: Build failed with an exception.

* What went wrong:
A problem was found with the configuration of task ':aggregatedJavadoc'.
> No value has been specified for property 'outputDirectory'.
{noformat}

Are the instructions missing some steps?
Or is it a bug in the build process?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: kafka-trunk-jdk7 #3129

2018-01-29 Thread Apache Jenkins Server
See 

--
Started by an SCM change
Started by an SCM change
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H23 (ubuntu xenial) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
https://github.com/apache/kafka.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:825)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1092)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1123)
at hudson.scm.SCM.checkout(SCM.java:495)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1202)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1724)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:421)
Caused by: hudson.plugins.git.GitException: Command "git config 
remote.origin.url https://github.com/apache/kafka.git; returned status code 4:
stdout: 
stderr: error: failed to write new configuration file 


at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1970)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1938)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1934)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:1572)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:1584)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.setRemoteUrl(CliGitAPIImpl.java:1218)
at hudson.plugins.git.GitAPI.setRemoteUrl(GitAPI.java:160)
at sun.reflect.GeneratedMethodAccessor44.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:922)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:896)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:853)
at hudson.remoting.UserRequest.perform(UserRequest.java:207)
at hudson.remoting.UserRequest.perform(UserRequest.java:53)
at hudson.remoting.Request$2.run(Request.java:358)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Suppressed: hudson.remoting.Channel$CallSiteStackTrace: Remote call to 
H23
at 
hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1693)
at hudson.remoting.UserResponse.retrieve(UserRequest.java:310)
at hudson.remoting.Channel.call(Channel.java:908)
at 
hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:281)
at com.sun.proxy.$Proxy110.setRemoteUrl(Unknown Source)
at 
org.jenkinsci.plugins.gitclient.RemoteGitImpl.setRemoteUrl(RemoteGitImpl.java:295)
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:813)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1092)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1123)
at hudson.scm.SCM.checkout(SCM.java:495)
at 
hudson.model.AbstractProject.checkout(AbstractProject.java:1202)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at 
jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1724)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at 

[jira] [Created] (KAFKA-6499) Avoid creating dummy checkpoint files with no state stores

2018-01-29 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-6499:


 Summary: Avoid creating dummy checkpoint files with no state stores
 Key: KAFKA-6499
 URL: https://issues.apache.org/jira/browse/KAFKA-6499
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Guozhang Wang


Today, for a Streams task that contains no state stores, its processor state 
manager still writes a dummy checkpoint file that contains only a few header 
characters (version, size). This introduces unnecessary disk IO and should be 
avoidable.
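
A minimal sketch of the proposed guard, using hypothetical names rather than the
actual ProcessorStateManager code:

{code}
// Illustrative sketch only; class and field names are hypothetical.
import java.util.Map;
import java.util.function.Consumer;
import org.apache.kafka.common.TopicPartition;

class CheckpointSketch {
    private final Map<String, Object> registeredStores;
    private final Consumer<Map<TopicPartition, Long>> checkpointWriter;

    CheckpointSketch(Map<String, Object> registeredStores,
                     Consumer<Map<TopicPartition, Long>> checkpointWriter) {
        this.registeredStores = registeredStores;
        this.checkpointWriter = checkpointWriter;
    }

    void maybeCheckpoint(Map<TopicPartition, Long> offsets) {
        // Proposed behavior: with no state stores and no offsets to record, skip the
        // write entirely instead of producing a file containing only version and size.
        if (registeredStores.isEmpty() && offsets.isEmpty()) {
            return;
        }
        checkpointWriter.accept(offsets);
    }
}
{code}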



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : kafka-1.0-jdk7 #137

2018-01-29 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-trunk-jdk7 #3128

2018-01-29 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] MINOR: Fix some streams web doc nits (#4411)

--
[...truncated 405.18 KB...]
kafka.controller.ControllerIntegrationTest > 
testLeaderAndIsrWhenEntireIsrOfflineAndUncleanLeaderElectionEnabled STARTED

kafka.controller.ControllerIntegrationTest > 
testLeaderAndIsrWhenEntireIsrOfflineAndUncleanLeaderElectionEnabled PASSED

kafka.controller.ControllerIntegrationTest > 
testBackToBackPreferredReplicaLeaderElections STARTED

kafka.controller.ControllerIntegrationTest > 
testBackToBackPreferredReplicaLeaderElections PASSED

kafka.controller.ControllerIntegrationTest > testEmptyCluster STARTED

kafka.controller.ControllerIntegrationTest > testEmptyCluster PASSED

kafka.controller.ControllerIntegrationTest > testPreferredReplicaLeaderElection 
STARTED

kafka.controller.ControllerIntegrationTest > testPreferredReplicaLeaderElection 
PASSED

kafka.controller.ControllerEventManagerTest > testEventThatThrowsException 
STARTED

kafka.controller.ControllerEventManagerTest > testEventThatThrowsException 
PASSED

kafka.controller.ControllerEventManagerTest > testSuccessfulEvent STARTED

kafka.controller.ControllerEventManagerTest > testSuccessfulEvent PASSED

kafka.controller.PartitionStateMachineTest > 
testNonexistentPartitionToNewPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testNonexistentPartitionToNewPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransitionErrorCodeFromCreateStates STARTED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransitionErrorCodeFromCreateStates PASSED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToNonexistentPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToNonexistentPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOfflineTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOfflineTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOfflinePartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOfflinePartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidNonexistentPartitionToOnlinePartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidNonexistentPartitionToOnlinePartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidNonexistentPartitionToOfflinePartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidNonexistentPartitionToOfflinePartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOnlineTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOnlineTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransitionZkUtilsExceptionFromCreateStates 
STARTED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransitionZkUtilsExceptionFromCreateStates 
PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidNewPartitionToNonexistentPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidNewPartitionToNonexistentPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidOnlinePartitionToNewPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidOnlinePartitionToNewPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransitionErrorCodeFromStateLookup STARTED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransitionErrorCodeFromStateLookup PASSED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOnlineTransitionForControlledShutdown STARTED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOnlineTransitionForControlledShutdown PASSED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransitionZkUtilsExceptionFromStateLookup 
STARTED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransitionZkUtilsExceptionFromStateLookup 
PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidOnlinePartitionToNonexistentPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidOnlinePartitionToNonexistentPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidOfflinePartitionToNewPartitionTransition STARTED


[jira] [Resolved] (KAFKA-6451) Simplify KStreamReduce

2018-01-29 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-6451.
--
   Resolution: Fixed
Fix Version/s: 1.1.0

> Simplify KStreamReduce
> --
>
> Key: KAFKA-6451
> URL: https://issues.apache.org/jira/browse/KAFKA-6451
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Matthias J. Sax
>Assignee: Tanvi Jaywant
>Priority: Minor
>  Labels: beginner, newbie
> Fix For: 1.1.0
>
>
> If we do aggregations, we drop records with {{key=null}} or {{value=null}}. 
> However, in {{KStreamReduce}} we only early exit if {{key=null}} and process 
> {{value=null}} – even if we only update the state with its old value and 
> also only send the old value downstream (i.e., we still compute the correct 
> result), it's undesired and wasteful and we should early exit on 
> {{value=null}}, too.
> This problem might occur for {{KStreamAggregate}} or other processors, too, 
> and we need to double check those to make sure we implement consistent 
> behavior.
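
A minimal sketch of the early-exit guard described above, using a simplified
processor shape rather than the actual KStreamReduce code:

{code}
// Illustrative sketch only; simplified shape, not the actual KStreamReduce processor.
import java.util.HashMap;
import java.util.Map;

class ReduceSketch<K, V> {
    interface Reducer<T> { T apply(T oldAgg, T newValue); }

    private final Map<K, V> store = new HashMap<>();
    private final Reducer<V> reducer;

    ReduceSketch(Reducer<V> reducer) { this.reducer = reducer; }

    void process(K key, V value) {
        // Early exit on a null key OR a null value: a null value cannot change the
        // reduction result, so skip the store update and the downstream forward.
        if (key == null || value == null) {
            return;
        }
        V oldAgg = store.get(key);
        store.put(key, oldAgg == null ? value : reducer.apply(oldAgg, value));
    }
}
{code}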



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: kafka-trunk-jdk9 #342

2018-01-29 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] MINOR: Fix some streams web doc nits (#4411)

--
[...truncated 1.03 MB...]
org.apache.kafka.common.record.LegacyRecordTest > testChecksum[365] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[365] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[365] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testFields[365] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testFields[365] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[366] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[366] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[366] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[366] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testFields[366] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testFields[366] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[367] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[367] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[367] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[367] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testFields[367] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testFields[367] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[368] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[368] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[368] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[368] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testFields[368] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testFields[368] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[369] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[369] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[369] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[369] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testFields[369] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testFields[369] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[370] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[370] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[370] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[370] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testFields[370] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testFields[370] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[371] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[371] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[371] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[371] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testFields[371] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testFields[371] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[372] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[372] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[372] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[372] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testFields[372] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testFields[372] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[373] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[373] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[373] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[373] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testFields[373] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testFields[373] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[374] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[374] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[374] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[374] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testFields[374] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testFields[374] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[375] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testChecksum[375] PASSED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[375] STARTED

org.apache.kafka.common.record.LegacyRecordTest > testEquality[375] 

Kafka Consumers not rebalancing.

2018-01-29 Thread satyajit vegesna
Hi All,

I was experimenting with the new consumer API and have a question regarding
the rebalance process.

I start a consumer group with a single thread and make the thread sleep while
processing the records retrieved from the first consumer.poll call. I made
sure the Thread.sleep time goes beyond the session timeout and was expecting
to see a rebalance on the consumer group.

But when I try to get the state of the consumer group using the command
below,

/opt/confluent-4.0.0/bin/kafka-consumer-groups  --group
"consumer-grievances-group15" --bootstrap-server
xxx:9092,:9092,:9092  --describe

I get the result below:

Consumer group 'consumer-grievances-group15' has no active members.


TOPIC                            PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG    CONSUMER-ID  HOST  CLIENT-ID
TELMATEQA.grievances.grievances  0          125855          152037          26182  -            -     -

The same happens in the multi-threaded consumer group scenario, and *going
one step further, I was able to let the thread move from the sleeping state
back to the running state and could see that the consumer group picked up
from where it left off.*

My only question is: why isn't the rebalancing happening in this scenario?
My expectation was that the threads would rebalance and start from the
committed offset.

Regards,
Satyajit.
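
A minimal sketch of the experiment described above (broker list, topic name, and
timeouts are illustrative). Note that on clients from 0.10.1 onward heartbeats are
sent from a background thread, so a long sleep between polls does not by itself
exceed session.timeout.ms; the member is only evicted after max.poll.interval.ms
passes without another poll().

{code}
// Sketch of the experiment; broker list, topic, and timeouts are illustrative.
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SlowConsumerExperiment {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "consumer-grievances-group15");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("session.timeout.ms", "10000");
        // Since 0.10.1, heartbeats come from a background thread, so a long sleep below
        // does not trip session.timeout.ms; eviction happens only after
        // max.poll.interval.ms (default 5 minutes) passes without another poll().
        props.put("max.poll.interval.ms", "300000");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("TELMATEQA.grievances.grievances"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                System.out.println("polled " + records.count() + " records");
                Thread.sleep(30_000); // simulated slow processing, longer than the session timeout
                consumer.commitSync();
            }
        }
    }
}
{code}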


[jira] [Resolved] (KAFKA-6280) Allow for additional archive types to be loaded from plugin.path in Connect

2018-01-29 Thread Konstantine Karantasis (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis resolved KAFKA-6280.
---
   Resolution: Fixed
Fix Version/s: (was: 1.1.0)
   1.0.0
   0.11.0.2

Has been fixed as part of: 

https://issues.apache.org/jira/browse/KAFKA-6087

> Allow for additional archive types to be loaded from plugin.path in Connect
> ---
>
> Key: KAFKA-6280
> URL: https://issues.apache.org/jira/browse/KAFKA-6280
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 1.0.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Minor
> Fix For: 0.11.0.2, 1.0.0
>
>
> In addition to uber {{.jar}} archives, it seems it would be nice if one could 
> also load {{.zip}} archives of appropriately packaged Connect plugins. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Choose the number of partitions/topics

2018-01-29 Thread Chicolo, Robert (rchic...@student.cccs.edu)
It goes beyond the throughput that Kafka can support. You have to decide what 
degree of parallelism your application can support. If processing one message 
depends on processing another message, that limits the degree to which you can 
process in parallel. The stream can be parallelized depending on how much time 
processing a message takes and on the desired response times.


From: Maria Pilar 
Sent: Monday, January 29, 2018 8:58:17 AM
To: dev@kafka.apache.org; us...@kafka.apache.org
Subject: Choose the number of partitions/topics

Hi everyone

I have designed an integration between two systems through our Kafka Streams
API, and the requirements make it unclear how to properly choose the number of
partitions/topics.

This is the use case:

My producer will send 28 different types of events, so I have decided to
create 28 topics.

The maximum size of one message will be 4,096 bytes and the total volume will
be about 2,469.888 MB/day.

The retention will be 2 days.

By default I'm thinking of one partition per topic, which, per Confluent's
recommendation, can handle about 10 MB/second.

However, the requirement for the consumer is minimal latency (under 3
seconds), so I am thinking of creating more leader partitions per topic to
parallelize and achieve the throughput.

Do you know the best practice or formula to define this properly?

Thanks
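
A back-of-the-envelope sketch using the commonly cited rule of thumb
partitions >= max(target throughput / per-partition producer throughput,
target throughput / per-partition consumer throughput); the per-partition figures
below are illustrative assumptions, not measurements from any real cluster.

{code}
// Back-of-the-envelope sketch; per-partition throughput figures are assumptions.
public class PartitionSizingSketch {
    public static void main(String[] args) {
        double totalMbPerDay = 2469.888;                  // volume from the question
        double targetMbPerSec = totalMbPerDay / 86_400;   // ~0.03 MB/s across all topics

        double producerMbPerSecPerPartition = 10.0;       // assumed, per the ~10 MB/s guideline
        double consumerMbPerSecPerPartition = 20.0;       // assumed

        long needed = (long) Math.ceil(Math.max(
                targetMbPerSec / producerMbPerSecPerPartition,
                targetMbPerSec / consumerMbPerSecPerPartition));

        // At this volume a single partition per topic already covers throughput;
        // additional partitions mainly buy consumer parallelism for the latency target.
        System.out.println("Partitions needed for throughput: " + Math.max(needed, 1));
    }
}
{code}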


Re: [VOTE] KIP-232: Detect outdated metadata using leaderEpoch and partitionEpoch

2018-01-29 Thread Dong Lin
Hey Colin,

On Mon, Jan 29, 2018 at 11:23 AM, Colin McCabe  wrote:

> > On Mon, Jan 29, 2018 at 10:35 AM, Dong Lin  wrote:
> >
> > > Hey Colin,
> > >
> > > I understand that the KIP will adds overhead by introducing
> per-partition
> > > partitionEpoch. I am open to alternative solutions that does not incur
> > > additional overhead. But I don't see a better way now.
> > >
> > > IMO the overhead in the FetchResponse may not be that much. We probably
> > > should discuss the percentage increase rather than the absolute number
> > > increase. Currently after KIP-227, per-partition header has 23 bytes.
> This
> > > KIP adds another 4 bytes. Assume the records size is 10KB, the
> percentage
> > > increase is 4 / (23 + 1) = 0.03%. It seems negligible, right?
>
> Hi Dong,
>
> Thanks for the response.  I agree that the FetchRequest / FetchResponse
> overhead should be OK, now that we have incremental fetch requests and
> responses.  However, there are a lot of cases where the percentage increase
> is much greater.  For example, if a client is doing full MetadataRequests /
> Responses, we have some math kind of like this per partition:
>
> > UpdateMetadataRequestPartitionState => topic partition controller_epoch
> leader  leader_epoch partition_epoch isr zk_version replicas
> offline_replicas
> > 14 bytes:  topic => string (assuming about 10 byte topic names)
> > 4 bytes:  partition => int32
> > 4  bytes: conroller_epoch => int32
> > 4  bytes: leader => int32
> > 4  bytes: leader_epoch => int32
> > +4 EXTRA bytes: partition_epoch => int32<-- NEW
> > 2+4+4+4 bytes: isr => [int32] (assuming 3 in the ISR)
> > 4 bytes: zk_version => int32
> > 2+4+4+4 bytes: replicas => [int32] (assuming 3 replicas)
> > 2  offline_replicas => [int32] (assuming no offline replicas)
>
> Assuming I added that up correctly, the per-partition overhead goes from
> 64 bytes per partition to 68, a 6.2% increase.
>
> We could do similar math for a lot of the other RPCs.  And you will have a
> similar memory and garbage collection impact on the brokers since you have
> to store all this extra state as well.
>

That is correct. IMO the metadata is only updated periodically, so it is
probably not a big deal if we increase it by 6%. The FetchResponse and
ProduceRequest are probably the only requests that are bounded by bandwidth
throughput.


>
> > >
> > > I agree that we can probably save more space by using partition ID so
> that
> > > we no longer needs the string topic name. The similar idea has also
> been
> > > put in the Rejected Alternative section in KIP-227. While this idea is
> > > promising, it seems orthogonal to the goal of this KIP. Given that
> there is
> > > already many work to do in this KIP, maybe we can do the partition ID
> in a
> > > separate KIP?
>
> I guess my thinking is that the goal here is to replace an identifier
> which can be re-used (the tuple of topic name, partition ID) with an
> identifier that cannot be re-used (the tuple of topic name, partition ID,
> partition epoch) in order to gain better semantics.  As long as we are
> replacing the identifier, why not replace it with an identifier that has
> important performance advantages?  The KIP freeze for the next release has
> already passed, so there is time to do this.
>

In general it can be easier for discussion and implementation if we split a
larger task into smaller, independent tasks. For example, KIP-112 and KIP-113
both deal with JBOD support, and KIP-31, KIP-32 and KIP-33 are about timestamp
support. Opinions on this can be subjective, though.

IMO the change to switch from (topic, partition ID) to partitionEpoch in all
requests/responses requires us to go through every request one by one. It may
not be hard, but it can be time consuming and tedious. At a high level, the
goal and the changes for that are orthogonal to the changes required in this
KIP. That is the main reason I think we can split them into two KIPs.


> On Mon, Jan 29, 2018, at 10:54, Dong Lin wrote:
> > I think it is possible to move to entirely use partitionEpoch instead of
> > (topic, partition) to identify a partition. Client can obtain the
> > partitionEpoch -> (topic, partition) mapping from MetadataResponse. We
> > probably need to figure out a way to assign partitionEpoch to existing
> > partitions in the cluster. But this should be doable.
> >
> > This is a good idea. I think it will save us some space in the
> > request/response. The actual space saving in percentage probably depends
> on
> > the amount of data and the number of partitions of the same topic. I just
> > think we can do it in a separate KIP.
>
> Hmm.  How much extra work would be required?  It seems like we are already
> changing almost every RPC that involves topics and partitions, already
> adding new per-partition state to ZooKeeper, already changing how clients
> interact with partitions.  Is there some other big piece of work we'd have
> to do to move 

Re: [VOTE] KIP-232: Detect outdated metadata using leaderEpoch and partitionEpoch

2018-01-29 Thread Colin McCabe
> On Mon, Jan 29, 2018 at 10:35 AM, Dong Lin  wrote:
> 
> > Hey Colin,
> >
> > I understand that the KIP will adds overhead by introducing per-partition
> > partitionEpoch. I am open to alternative solutions that does not incur
> > additional overhead. But I don't see a better way now.
> >
> > IMO the overhead in the FetchResponse may not be that much. We probably
> > should discuss the percentage increase rather than the absolute number
> > increase. Currently after KIP-227, per-partition header has 23 bytes. This
> > KIP adds another 4 bytes. Assume the records size is 10KB, the percentage
> > increase is 4 / (23 + 1) = 0.03%. It seems negligible, right?

Hi Dong,

Thanks for the response.  I agree that the FetchRequest / FetchResponse 
overhead should be OK, now that we have incremental fetch requests and 
responses.  However, there are a lot of cases where the percentage increase is 
much greater.  For example, if a client is doing full MetadataRequests / 
Responses, we have some math kind of like this per partition:

> UpdateMetadataRequestPartitionState => topic partition controller_epoch 
> leader  leader_epoch partition_epoch isr zk_version replicas offline_replicas
> 14 bytes:  topic => string (assuming about 10 byte topic names)
> 4 bytes:  partition => int32
> 4  bytes: conroller_epoch => int32
> 4  bytes: leader => int32
> 4  bytes: leader_epoch => int32
> +4 EXTRA bytes: partition_epoch => int32<-- NEW
> 2+4+4+4 bytes: isr => [int32] (assuming 3 in the ISR)
> 4 bytes: zk_version => int32
> 2+4+4+4 bytes: replicas => [int32] (assuming 3 replicas)
> 2  offline_replicas => [int32] (assuming no offline replicas)

Assuming I added that up correctly, the per-partition overhead goes from 64 
bytes per partition to 68, a 6.2% increase.

We could do similar math for a lot of the other RPCs.  And you will have a 
similar memory and garbage collection impact on the brokers since you have to 
store all this extra state as well.

> >
> > I agree that we can probably save more space by using partition ID so that
> > we no longer needs the string topic name. The similar idea has also been
> > put in the Rejected Alternative section in KIP-227. While this idea is
> > promising, it seems orthogonal to the goal of this KIP. Given that there is
> > already many work to do in this KIP, maybe we can do the partition ID in a
> > separate KIP?

I guess my thinking is that the goal here is to replace an identifier which can 
be re-used (the tuple of topic name, partition ID) with an identifier that 
cannot be re-used (the tuple of topic name, partition ID, partition epoch) in 
order to gain better semantics.  As long as we are replacing the identifier, 
why not replace it with an identifier that has important performance 
advantages?  The KIP freeze for the next release has already passed, so there 
is time to do this.

On Mon, Jan 29, 2018, at 10:54, Dong Lin wrote:
> I think it is possible to move to entirely use partitionEpoch instead of
> (topic, partition) to identify a partition. Client can obtain the
> partitionEpoch -> (topic, partition) mapping from MetadataResponse. We
> probably need to figure out a way to assign partitionEpoch to existing
> partitions in the cluster. But this should be doable.
> 
> This is a good idea. I think it will save us some space in the
> request/response. The actual space saving in percentage probably depends on
> the amount of data and the number of partitions of the same topic. I just
> think we can do it in a separate KIP.

Hmm.  How much extra work would be required?  It seems like we are already 
changing almost every RPC that involves topics and partitions, already adding 
new per-partition state to ZooKeeper, already changing how clients interact 
with partitions.  Is there some other big piece of work we'd have to do to move 
to partition IDs that we wouldn't need for partition epochs?  I guess we'd have 
to find a way to support regular expression-based topic subscriptions.  If we 
split this into multiple KIPs, wouldn't we end up changing all that RPCs and ZK 
state a second time?  Also, I'm curious if anyone has done any proof of concept 
GC, memory, and network usage measurements on switching topic names for topic 
IDs.

best,
Colin

> 
> 
> 
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On Mon, Jan 29, 2018 at 10:18 AM, Colin McCabe  wrote:
> >
> >> On Fri, Jan 26, 2018, at 12:17, Dong Lin wrote:
> >> > Hey Colin,
> >> >
> >> >
> >> > On Fri, Jan 26, 2018 at 10:16 AM, Colin McCabe 
> >> wrote:
> >> >
> >> > > On Thu, Jan 25, 2018, at 16:47, Dong Lin wrote:
> >> > > > Hey Colin,
> >> > > >
> >> > > > Thanks for the comment.
> >> > > >
> >> > > > On Thu, Jan 25, 2018 at 4:15 PM, Colin McCabe 
> >> > > wrote:
> >> > > >
> >> > > > > On Wed, Jan 24, 2018, at 21:07, Dong Lin wrote:
> >> > > > > > Hey Colin,
> >> > > > > >
> >> > > > > > Thanks for reviewing the KIP.
> >> > > > > >
> 

Re: [DISCUSS] KIP-86: Configurable SASL callback handlers

2018-01-29 Thread Rajini Sivaram
Hi all,

To simplify dynamic update of SASL configs in future (e.g add a new SASL
mechanism with a new callback handler or Login), I have separated out the
broker-side configs with a mechanism prefix in the property name (similar
to listener prefix) instead of including all the classes together as a map.
This will also make it easier to configure new Login classes when new
mechanisms are introduced alongside other mechanisms on a single listener.

Please let me know if you have any concerns.

Thank you,

Rajini
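
A sketch of what mechanism-prefixed broker-side properties could look like under
this scheme; the listener name and handler/login class names are illustrative, and
the exact property names are assumptions based on the prefixing described above.

{code}
# Sketch of mechanism-prefixed broker-side SASL properties following the scheme
# described above (listener name and class names are illustrative).
listener.name.sasl_ssl.plain.sasl.server.callback.handler.class=com.example.PlainServerCallbackHandler
listener.name.sasl_ssl.scram-sha-256.sasl.server.callback.handler.class=com.example.ScramServerCallbackHandler
listener.name.sasl_ssl.plain.sasl.login.class=com.example.CustomLogin
{code}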


On Wed, Jan 17, 2018 at 6:54 AM, Rajini Sivaram 
wrote:

> Hi all,
>
> I have made some updates to this KIP to simplify addition of new SASL
> mechanisms:
>
>1. The Login interface has been made configurable as well (we have had
>this interface for quite some time and it has been stable).
>2.  The callback handler properties for client-side and server-side
>have been separated out since we would never use the same classes for both.
>
> Any feedback or suggestions are welcome.
>
> Thank you,
>
> Rajini
>
>
> On Mon, Apr 3, 2017 at 12:55 PM, Rajini Sivaram 
> wrote:
>
>> If there are no other concerns or suggestions on this KIP, I will start
>> vote later this week.
>>
>> Thank you...
>>
>> Regards,
>>
>> Rajini
>>
>> On Thu, Mar 30, 2017 at 9:42 PM, Rajini Sivaram 
>> wrote:
>>
>>> I have made a minor change to the callback handler interface to pass in
>>> the JAAS configuration entries in *configure,* to work with the
>>> multiple listener configuration introduced in KIP-103. I have also renamed
>>> the interface to AuthenticateCallbackHandler instead of AuthCallbackHandler
>>> to avoid confusion with authorization.
>>>
>>> I have rebased and updated the PR (https://github.com/apache/kaf
>>> ka/pull/2022) as well. Please let me know if there are any other
>>> comments or suggestions to move this forward.
>>>
>>> Thank you...
>>>
>>> Regards,
>>>
>>> Rajini
>>>
>>>
>>> On Thu, Dec 15, 2016 at 3:11 PM, Rajini Sivaram 
>>> wrote:
>>>
 Ismael,

 The reason for choosing CallbackHandler interface as the configurable
 interface is flexibility. As you say, we could instead define a simpler
 PlainCredentialProvider and ScramCredentialProvider. But that would tie
 users to Kafka's SaslServer implementation for PLAIN and SCRAM.
 SaslServer/SaslClient implementations are already pluggable using
 standard
 Java security provider mechanism. Callback handlers are the
 configuration
 mechanism for SaslServer/SaslClient. By making the handlers
 configurable,
 SASL becomes fully configurable for mechanisms supported by Kafka as
 well
 as custom mechanisms. From the 'Scenarios' section in the KIP, a simpler
 PLAIN/SCRAM interface satisfies the first two, but configurable callback
 handlers enables all five. I agree that most users don't require this
 level
 of flexibility, but we have had discussions about custom mechanisms in
 the
 past for integration with existing authentication servers. So I think
 it is
 a feature worth supporting.

 On Thu, Dec 15, 2016 at 2:21 PM, Ismael Juma  wrote:

 > Thanks Rajini, your answers make sense to me. One more general point:
 we
 > are following the JAAS callback architecture and exposing that to the
 user
 > where the user has to write code like:
 >
 > @Override
 > public void handle(Callback[] callbacks) throws IOException,
 > UnsupportedCallbackException {
 > String username = null;
 > for (Callback callback: callbacks) {
 > if (callback instanceof NameCallback)
 > username = ((NameCallback) callback).getDefaultName();
 > else if (callback instanceof PlainAuthenticateCallback) {
 > PlainAuthenticateCallback plainCallback =
 > (PlainAuthenticateCallback) callback;
 > boolean authenticated = authenticate(username,
 > plainCallback.password());
 > plainCallback.authenticated(authenticated);
 > } else
 > throw new UnsupportedCallbackException(callback);
 > }
 > }
 >
 > protected boolean authenticate(String username, char[] password)
 throws
 > IOException {
 > if (username == null)
 > return false;
 > else {
 > String expectedPassword =
 > JaasUtils.jaasConfig(LoginType.SERVER.contextName(), "user_" +
 username,
 > PlainLoginModule.class.getName());
 > return Arrays.equals(password,
 expectedPassword.toCharArray()
 > );
 > }
 > }
 >
 > Since we need to create a new callback type for Plain, Scram and so
 on, is
 > it really worth it to do it this way? For example, 

[jira] [Created] (KAFKA-6498) Add RocksDB statistics via Streams metrics

2018-01-29 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-6498:


 Summary: Add RocksDB statistics via Streams metrics
 Key: KAFKA-6498
 URL: https://issues.apache.org/jira/browse/KAFKA-6498
 Project: Kafka
  Issue Type: Improvement
  Components: metrics, streams
Reporter: Guozhang Wang


RocksDB's own stats can be exposed programmatically via 
{{Options.statistics()}}, and the JNI {{Statistics}} class already implements 
many useful metrics. However, these stats are not exposed directly via Streams 
today, so any users who want access to them have to interact with the 
underlying RocksDB directly, not through Streams.

We should expose such stats programmatically via Streams metrics so that users 
can investigate them without having to access RocksDB directly.
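
Today the manual route usually goes through a RocksDBConfigSetter (registered via
the rocksdb.config.setter config), which exposes each store's Options; a sketch is
below. How statistics are actually enabled and read off Options depends on the
bundled RocksDB version, so that part is only indicated in a comment.

{code}
// Sketch: where the RocksDB Options can be reached today for manual wiring.
// The statistics-enabling call itself depends on the bundled RocksDB version.
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

public class StatisticsConfigSetter implements RocksDBConfigSetter {
    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        // Enable RocksDB's Statistics on `options` here and hand it to your own metrics
        // reporter; this ticket proposes having Streams do this wiring and expose the
        // numbers through its own metrics instead.
    }
}
{code}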




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-232: Detect outdated metadata using leaderEpoch and partitionEpoch

2018-01-29 Thread Dong Lin
I think it is possible to move to entirely use partitionEpoch instead of
(topic, partition) to identify a partition. Client can obtain the
partitionEpoch -> (topic, partition) mapping from MetadataResponse. We
probably need to figure out a way to assign partitionEpoch to existing
partitions in the cluster. But this should be doable.

This is a good idea. I think it will save us some space in the
request/response. The actual space saving in percentage probably depends on
the amount of data and the number of partitions of the same topic. I just
think we can do it in a separate KIP.



On Mon, Jan 29, 2018 at 10:35 AM, Dong Lin  wrote:

> Hey Colin,
>
> I understand that the KIP will adds overhead by introducing per-partition
> partitionEpoch. I am open to alternative solutions that does not incur
> additional overhead. But I don't see a better way now.
>
> IMO the overhead in the FetchResponse may not be that much. We probably
> should discuss the percentage increase rather than the absolute number
> increase. Currently after KIP-227, per-partition header has 23 bytes. This
> KIP adds another 4 bytes. Assume the records size is 10KB, the percentage
> increase is 4 / (23 + 1) = 0.03%. It seems negligible, right?
>
> I agree that we can probably save more space by using partition ID so that
> we no longer needs the string topic name. The similar idea has also been
> put in the Rejected Alternative section in KIP-227. While this idea is
> promising, it seems orthogonal to the goal of this KIP. Given that there is
> already many work to do in this KIP, maybe we can do the partition ID in a
> separate KIP?
>
> Thanks,
> Dong
>
>
>
>
>
>
>
>
>
>
> On Mon, Jan 29, 2018 at 10:18 AM, Colin McCabe  wrote:
>
>> On Fri, Jan 26, 2018, at 12:17, Dong Lin wrote:
>> > Hey Colin,
>> >
>> >
>> > On Fri, Jan 26, 2018 at 10:16 AM, Colin McCabe 
>> wrote:
>> >
>> > > On Thu, Jan 25, 2018, at 16:47, Dong Lin wrote:
>> > > > Hey Colin,
>> > > >
>> > > > Thanks for the comment.
>> > > >
>> > > > On Thu, Jan 25, 2018 at 4:15 PM, Colin McCabe 
>> > > wrote:
>> > > >
>> > > > > On Wed, Jan 24, 2018, at 21:07, Dong Lin wrote:
>> > > > > > Hey Colin,
>> > > > > >
>> > > > > > Thanks for reviewing the KIP.
>> > > > > >
>> > > > > > If I understand you right, you maybe suggesting that we can use
>> a
>> > > global
>> > > > > > metadataEpoch that is incremented every time controller updates
>> > > metadata.
>> > > > > > The problem with this solution is that, if a topic is deleted
>> and
>> > > created
>> > > > > > again, user will not know whether that the offset which is
>> stored
>> > > before
>> > > > > > the topic deletion is no longer valid. This motivates the idea
>> to
>> > > include
>> > > > > > per-partition partitionEpoch. Does this sound reasonable?
>> > > > >
>> > > > > Hi Dong,
>> > > > >
>> > > > > Perhaps we can store the last valid offset of each deleted topic
>> in
>> > > > > ZooKeeper.  Then, when a topic with one of those names gets
>> > > re-created, we
>> > > > > can start the topic at the previous end offset rather than at 0.
>> This
>> > > > > preserves immutability.  It is no more burdensome than having to
>> > > preserve a
>> > > > > "last epoch" for the deleted partition somewhere, right?
>> > > > >
>> > > >
>> > > > My concern with this solution is that the number of zookeeper nodes
>> get
>> > > > more and more over time if some users keep deleting and creating
>> topics.
>> > > Do
>> > > > you think this can be a problem?
>> > >
>> > > Hi Dong,
>> > >
>> > > We could expire the "partition tombstones" after an hour or so.  In
>> > > practice this would solve the issue for clients that like to destroy
>> and
>> > > re-create topics all the time.  In any case, doesn't the current
>> proposal
>> > > add per-partition znodes as well that we have to track even after the
>> > > partition is deleted?  Or did I misunderstand that?
>> > >
>> >
>> > Actually the current KIP does not add per-partition znodes. Could you
>> > double check? I can fix the KIP wiki if there is anything misleading.
>>
>> Hi Dong,
>>
>> I double-checked the KIP, and I can see that you are in fact using a
>> global counter for initializing partition epochs.  So, you are correct, it
>> doesn't add per-partition znodes for partitions that no longer exist.
>>
>> >
>> > If we expire the "partition tomstones" after an hour, and the topic is
>> > re-created after more than an hour since the topic deletion, then we are
>> > back to the situation where user can not tell whether the topic has been
>> > re-created or not, right?
>>
>> Yes, with an expiration period, it would not ensure immutability-- you
>> could effectively reuse partition names and they would look the same.
>>
>> >
>> >
>> > >
>> > > It's not really clear to me what should happen when a topic is
>> destroyed
>> > > and re-created with new data.  Should consumers continue to be able to
>> > > consume?  

[jira] [Created] (KAFKA-6497) streams#store ambiguous InvalidStateStoreException

2018-01-29 Thread Pegerto Fernandez (JIRA)
Pegerto Fernandez created KAFKA-6497:


 Summary: streams#store ambiguous InvalidStateStoreException
 Key: KAFKA-6497
 URL: https://issues.apache.org/jira/browse/KAFKA-6497
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 1.0.0
Reporter: Pegerto Fernandez


When using the Streams API:

When deploying new materialized views, access to the store always throws 
InvalidStateStoreException. That can be caused by a rebalance, but it is also 
caused by a topology that has not yet produced the view, for example when the 
input topic is empty.

In this case, when the topology is running but the local store does not 
contain any data, the behaviour should be different, for example waiting 
versus assuming the expected key is not found.

I am relatively new to Streams, so please correct me if I missed something.
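
For context, the usual workaround is to retry until the store becomes queryable,
as in the sketch below (store name and value types are illustrative); the ambiguity
described above is that the exception does not distinguish a rebalance from a store
the topology simply has not populated yet.

{code}
// Common workaround sketch; store name and value types are illustrative.
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.errors.InvalidStateStoreException;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class StoreLookup {
    static ReadOnlyKeyValueStore<String, Long> waitForStore(KafkaStreams streams)
            throws InterruptedException {
        while (true) {
            try {
                return streams.store("my-view-store", QueryableStoreTypes.<String, Long>keyValueStore());
            } catch (InvalidStateStoreException notReadyYet) {
                // Thrown both during a rebalance and before the topology has written anything
                // to the store -- the ambiguity this ticket describes.
                Thread.sleep(100);
            }
        }
    }
}
{code}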



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-6079) Idempotent production for source connectors

2018-01-29 Thread Randall Hauch (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch resolved KAFKA-6079.
--
Resolution: Won't Fix

Marking this as WONTFIX since it is *already possible since AK 0.11* to 
configure the Connect worker producers [to use idempotent 
delivery|https://kafka.apache.org/documentation/#upgrade_11_exactly_once_semantics]:

{quote}
Idempotent delivery ensures that messages are delivered exactly once to a 
particular topic partition during the lifetime of a single producer.
{quote}

That eliminates duplicate events due to retries, but a failure of a Connector 
worker might still mean records are written to Kafka but the worker fails 
before it can commit the latest offsets; when it restarts, it begins from the 
last committed offsets and this may certainly result in duplicate messages that 
were written before the failure.

See KAFKA-6080 for changing Connect to support exactly-once semantics for 
source connectors.
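
For reference, a sketch of that worker-level override (shown for a distributed
worker's properties file; the producer. prefix forwards the setting to the producers
the worker creates):

{code}
# Sketch: Connect worker configuration (e.g. connect-distributed.properties), assuming
# brokers and workers on AK >= 0.11. The "producer." prefix passes the setting through
# to the producers the worker creates for source connectors.
producer.enable.idempotence=true
# This removes duplicates caused by producer retries; it does not prevent duplicates
# from a worker restarting at older committed source offsets (see KAFKA-6080).
{code}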

> Idempotent production for source connectors
> ---
>
> Key: KAFKA-6079
> URL: https://issues.apache.org/jira/browse/KAFKA-6079
> Project: Kafka
>  Issue Type: New Feature
>  Components: KafkaConnect
>Reporter: Antony Stubbs
>Priority: Major
>
> Idempotent production for source connection to reduce duplicates at least 
> from retires.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP 226 - Dynamic Broker Configuration

2018-01-29 Thread Rajini Sivaram
Hi all,

I made a change to the KIP to specify sasl.jaas.config property for brokers
separately for each mechanism, rather than together in a single login
context as we do today in the JAAS config file. This will make it easier to
add new mechanisms to a running broker (this is not in scope for this KIP,
but we might want to do in future).

Please let me know if you have any concerns.

Thank you,

Rajini
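
A sketch of the resulting per-mechanism broker properties (listener name and
credentials are illustrative):

{code}
# Sketch: per-mechanism sasl.jaas.config for a broker listener, following the change
# described above (listener name and credentials are illustrative).
listener.name.sasl_ssl.scram-sha-256.sasl.jaas.config=\
  org.apache.kafka.common.security.scram.ScramLoginModule required;
listener.name.sasl_ssl.plain.sasl.jaas.config=\
  org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="admin" password="admin-secret" user_admin="admin-secret";
{code}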


On Thu, Jan 25, 2018 at 3:51 AM, Viktor Somogyi 
wrote:

> Yea, if other commands seem to follow this pattern, I'll update KIP-248 as
> well :). Also introducing those arguments in the current ConfigCommand also
> makes sense from the migration point of view too as it will be introduced
> in 1.1 which makes it somewhat easier for KIP-248.
>
> On Wed, Jan 24, 2018 at 6:46 PM, Rajini Sivaram 
> wrote:
>
> > Hi Ismael,
> >
> > Yes, that makes sense. Looking at the command line options for different
> > tools, we seem to be using *--command-config  *in the
> commands
> > that currently talk to the new AdminClient (DelegationTokenCommand,
> > ConsumerGroupCommand, DeleteRecordsCommand). So perhaps it makes sense to
> > do the same for ConfigCommand as well. I will update KIP-226 with the two
> > options *--bootstrap-server* and *--command-config*.
> >
> > Viktor, what do you think?
> >
> > At the moment, I think many in the community are busy due to the code
> > freeze next week, but hopefully we should get more feedback on KIP-248
> soon
> > after.
> >
> > Thank you,
> >
> > Rajini
> >
> > On Wed, Jan 24, 2018 at 5:41 AM, Viktor Somogyi  >
> > wrote:
> >
> > > Hi all,
> > >
> > > I'd also like to as the community here who were participating the
> > > discussion of KIP-226 to take a look at KIP-248 (that is making
> > > kafka-configs.sh fully function with AdminClient and a Java based
> > > ConfigCommand). It would be much appreciated to get feedback on that as
> > it
> > > plays an important role for KIP-226 and other long-waited features.
> > >
> > > Thanks,
> > > Viktor
> > >
> > > On Wed, Jan 24, 2018 at 6:56 AM, Ismael Juma 
> wrote:
> > >
> > > > Hi Rajini,
> > > >
> > > > I think the proposal makes sense. One suggestion: can we just allow
> the
> > > > config to be passed? That is, leave out the properties config for
> now.
> > > >
> > > > On Tue, Jan 23, 2018 at 3:01 PM, Rajini Sivaram <
> > rajinisiva...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Since we are running out of time to get the whole ConfigCommand
> > > converted
> > > > > to using the new AdminClient for 1.1.0 (KIP-248), we need a way to
> > > enable
> > > > > ConfigCommand to handle broker config updates (implemented by
> > KIP-226).
> > > > As
> > > > > a simple first step, it would make sense to use the existing
> > > > ConfigCommand
> > > > > tool to perform broker config updates enabled by this KIP. Since
> > config
> > > > > validation and password encryption are performed by the broker,
> this
> > > will
> > > > > be easier to do with the new AdminClient. To do this, we need to
> add
> > > > > command line options for new admin client to kafka-configs.sh.
> > Dynamic
> > > > > broker config updates alone will be done under KIP-226 using the
> new
> > > > admin
> > > > > client to make this feature usable. The new command line options
> > > > > (consistent with KIP-248) that will be added to ConfigCommand will
> > be:
> > > > >
> > > > >- --bootstrap-server *host:port*
> > > > >- --adminclient.config *config-file*
> > > > >- --adminclient.properties *k1=v1,k2=v2*
> > > > >
> > > > > If anyone has any concerns about these options being added to
> > > > > kafka-configs.sh, please let me know. Otherwise, I will update
> > KIP-226
> > > > and
> > > > > add the options to one of the KIP-226 PRs.
> > > > >
> > > > > Thank you,
> > > > >
> > > > > Rajini
> > > > >
> > > > > On Wed, Jan 10, 2018 at 5:14 AM, Ismael Juma 
> > > wrote:
> > > > >
> > > > > > Thanks Rajini. Sounds good.
> > > > > >
> > > > > > Ismael
> > > > > >
> > > > > > On Wed, Jan 10, 2018 at 11:41 AM, Rajini Sivaram <
> > > > > rajinisiva...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Ismael,
> > > > > > >
> > > > > > > I have updated the KIP to use AES-256 if available and AES-128
> > > > > otherwise
> > > > > > > for password encryption. Looking at GCM, it looks like GCM is
> > > > typically
> > > > > > > used with a variable initialization vector, while we are using
> a
> > > > > random,
> > > > > > > but constant IV per-password. Also, AES/GCM is not supported by
> > > > Java7.
> > > > > > > Since the authentication and performance benefits of GCM are
> not
> > > > > required
> > > > > > > for this scenario, I am thinking I will leave the default as
> CBC,
> > > but
> > > > > > make
> > > > > > > sure we test GCM as well so that users have the choice.
> > > > > > >
> > > > > > > On Wed, Jan 10, 2018 at 1:01 AM, Colin 

Re: [VOTE] KIP-232: Detect outdated metadata using leaderEpoch and partitionEpoch

2018-01-29 Thread Dong Lin
Hey Colin,

I understand that the KIP adds overhead by introducing a per-partition
partitionEpoch. I am open to alternative solutions that do not incur
additional overhead, but I don't see a better way now.

IMO the overhead in the FetchResponse may not be that much. We should probably
discuss the percentage increase rather than the absolute increase. Currently,
after KIP-227, the per-partition header is 23 bytes, and this KIP adds another
4 bytes. Assuming the record data is 10 KB per partition, the percentage
increase is 4 / (23 + 10 * 1024) ~ 0.04%. It seems negligible, right?
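
A quick back-of-the-envelope check of the figures above, assuming a 23-byte 
per-partition header and 10 KB of record data per partition:

public class FetchOverheadEstimate {
    public static void main(String[] args) {
        double header = 23;          // per-partition header bytes after KIP-227
        double extra = 4;            // bytes added by this KIP
        double records = 10 * 1024;  // assumed record bytes per partition
        double pct = extra / (header + records) * 100;
        System.out.printf("extra per-partition overhead: %.3f%%%n", pct); // ~0.039%
    }
}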

I agree that we can probably save more space by using partition IDs so that
we no longer need the string topic name. A similar idea was also put in the
Rejected Alternatives section of KIP-227. While this idea is promising, it
seems orthogonal to the goal of this KIP. Given that there is already a lot of
work to do in this KIP, maybe we can do partition IDs in a separate KIP?

Thanks,
Dong

On Mon, Jan 29, 2018 at 10:18 AM, Colin McCabe  wrote:

> On Fri, Jan 26, 2018, at 12:17, Dong Lin wrote:
> > Hey Colin,
> >
> >
> > On Fri, Jan 26, 2018 at 10:16 AM, Colin McCabe 
> wrote:
> >
> > > On Thu, Jan 25, 2018, at 16:47, Dong Lin wrote:
> > > > Hey Colin,
> > > >
> > > > Thanks for the comment.
> > > >
> > > > On Thu, Jan 25, 2018 at 4:15 PM, Colin McCabe 
> > > wrote:
> > > >
> > > > > On Wed, Jan 24, 2018, at 21:07, Dong Lin wrote:
> > > > > > Hey Colin,
> > > > > >
> > > > > > Thanks for reviewing the KIP.
> > > > > >
> > > > > > If I understand you right, you maybe suggesting that we can use a
> > > global
> > > > > > metadataEpoch that is incremented every time controller updates
> > > metadata.
> > > > > > The problem with this solution is that, if a topic is deleted and
> > > created
> > > > > > again, user will not know whether that the offset which is stored
> > > before
> > > > > > the topic deletion is no longer valid. This motivates the idea to
> > > include
> > > > > > per-partition partitionEpoch. Does this sound reasonable?
> > > > >
> > > > > Hi Dong,
> > > > >
> > > > > Perhaps we can store the last valid offset of each deleted topic in
> > > > > ZooKeeper.  Then, when a topic with one of those names gets
> > > re-created, we
> > > > > can start the topic at the previous end offset rather than at 0.
> This
> > > > > preserves immutability.  It is no more burdensome than having to
> > > preserve a
> > > > > "last epoch" for the deleted partition somewhere, right?
> > > > >
> > > >
> > > > My concern with this solution is that the number of zookeeper nodes
> get
> > > > more and more over time if some users keep deleting and creating
> topics.
> > > Do
> > > > you think this can be a problem?
> > >
> > > Hi Dong,
> > >
> > > We could expire the "partition tombstones" after an hour or so.  In
> > > practice this would solve the issue for clients that like to destroy
> and
> > > re-create topics all the time.  In any case, doesn't the current
> proposal
> > > add per-partition znodes as well that we have to track even after the
> > > partition is deleted?  Or did I misunderstand that?
> > >
> >
> > Actually the current KIP does not add per-partition znodes. Could you
> > double check? I can fix the KIP wiki if there is anything misleading.
>
> Hi Dong,
>
> I double-checked the KIP, and I can see that you are in fact using a
> global counter for initializing partition epochs.  So, you are correct, it
> doesn't add per-partition znodes for partitions that no longer exist.
>
> >
> > If we expire the "partition tomstones" after an hour, and the topic is
> > re-created after more than an hour since the topic deletion, then we are
> > back to the situation where user can not tell whether the topic has been
> > re-created or not, right?
>
> Yes, with an expiration period, it would not ensure immutability-- you
> could effectively reuse partition names and they would look the same.
>
> >
> >
> > >
> > > It's not really clear to me what should happen when a topic is
> destroyed
> > > and re-created with new data.  Should consumers continue to be able to
> > > consume?  We don't know where they stopped consuming from the previous
> > > incarnation of the topic, so messages may have been lost.  Certainly
> > > consuming data from offset X of the new incarnation of the topic may
> give
> > > something totally different from what you would have gotten from
> offset X
> > > of the previous incarnation of the topic.
> > >
> >
> > With the current KIP, if a consumer consumes a topic based on the last
> > remembered (offset, partitionEpoch, leaderEpoch), and if the topic is
> > re-created, consume will throw InvalidPartitionEpochException because the
> > previous partitionEpoch will be different from the current
> partitionEpoch.
> > This is described in the Proposed Changes -> Consumption after topic
> > deletion in the KIP. I can improve the KIP if there is anything not
> 

Re: [VOTE] KIP-232: Detect outdated metadata using leaderEpoch and partitionEpoch

2018-01-29 Thread Colin McCabe
On Fri, Jan 26, 2018, at 12:17, Dong Lin wrote:
> Hey Colin,
> 
> 
> On Fri, Jan 26, 2018 at 10:16 AM, Colin McCabe  wrote:
> 
> > On Thu, Jan 25, 2018, at 16:47, Dong Lin wrote:
> > > Hey Colin,
> > >
> > > Thanks for the comment.
> > >
> > > On Thu, Jan 25, 2018 at 4:15 PM, Colin McCabe 
> > wrote:
> > >
> > > > On Wed, Jan 24, 2018, at 21:07, Dong Lin wrote:
> > > > > Hey Colin,
> > > > >
> > > > > Thanks for reviewing the KIP.
> > > > >
> > > > > If I understand you right, you maybe suggesting that we can use a
> > global
> > > > > metadataEpoch that is incremented every time controller updates
> > metadata.
> > > > > The problem with this solution is that, if a topic is deleted and
> > created
> > > > > again, user will not know whether that the offset which is stored
> > before
> > > > > the topic deletion is no longer valid. This motivates the idea to
> > include
> > > > > per-partition partitionEpoch. Does this sound reasonable?
> > > >
> > > > Hi Dong,
> > > >
> > > > Perhaps we can store the last valid offset of each deleted topic in
> > > > ZooKeeper.  Then, when a topic with one of those names gets
> > re-created, we
> > > > can start the topic at the previous end offset rather than at 0.  This
> > > > preserves immutability.  It is no more burdensome than having to
> > preserve a
> > > > "last epoch" for the deleted partition somewhere, right?
> > > >
> > >
> > > My concern with this solution is that the number of zookeeper nodes get
> > > more and more over time if some users keep deleting and creating topics.
> > Do
> > > you think this can be a problem?
> >
> > Hi Dong,
> >
> > We could expire the "partition tombstones" after an hour or so.  In
> > practice this would solve the issue for clients that like to destroy and
> > re-create topics all the time.  In any case, doesn't the current proposal
> > add per-partition znodes as well that we have to track even after the
> > partition is deleted?  Or did I misunderstand that?
> >
> 
> Actually the current KIP does not add per-partition znodes. Could you
> double check? I can fix the KIP wiki if there is anything misleading.

Hi Dong,

I double-checked the KIP, and I can see that you are in fact using a global 
counter for initializing partition epochs.  So, you are correct, it doesn't add 
per-partition znodes for partitions that no longer exist.

> 
> If we expire the "partition tomstones" after an hour, and the topic is
> re-created after more than an hour since the topic deletion, then we are
> back to the situation where user can not tell whether the topic has been
> re-created or not, right?

Yes, with an expiration period, it would not ensure immutability-- you could 
effectively reuse partition names and they would look the same.

> 
> 
> >
> > It's not really clear to me what should happen when a topic is destroyed
> > and re-created with new data.  Should consumers continue to be able to
> > consume?  We don't know where they stopped consuming from the previous
> > incarnation of the topic, so messages may have been lost.  Certainly
> > consuming data from offset X of the new incarnation of the topic may give
> > something totally different from what you would have gotten from offset X
> > of the previous incarnation of the topic.
> >
> 
> With the current KIP, if a consumer consumes a topic based on the last
> remembered (offset, partitionEpoch, leaderEpoch), and if the topic is
> re-created, consume will throw InvalidPartitionEpochException because the
> previous partitionEpoch will be different from the current partitionEpoch.
> This is described in the Proposed Changes -> Consumption after topic
> deletion in the KIP. I can improve the KIP if there is anything not clear.

Thanks for the clarification.  It sounds like what you really want is 
immutability-- i.e., to never "really" reuse partition identifiers.  And you do 
this by making the partition name no longer the "real" identifier.

My big concern about this KIP is that it seems like an anti-scalability 
feature.  Now we are adding 4 extra bytes for every partition in the 
FetchResponse and Request, for example.  That could be 40 kb per request, if 
the user has 10,000 partitions.  And of course, the KIP also makes massive 
changes to UpdateMetadataRequest, MetadataResponse, OffsetCommitRequest, 
OffsetFetchResponse, LeaderAndIsrRequest, ListOffsetResponse, etc. which will 
also increase their size on the wire and in memory.

One thing that we talked a lot about in the past is replacing partition names 
with IDs.  IDs have a lot of really nice features.  They take up much less 
space in memory than strings (especially 2-byte Java strings).  They can often 
be allocated on the stack rather than the heap (important when you are dealing 
with hundreds of thousands of them).  They can be efficiently deserialized and 
serialized.  If we use 64-bit ones, we will never run out of IDs, which means 
that they can always be unique per 

Re: [DISCUSS] KIP-212: Enforce set of legal characters for connector names

2018-01-29 Thread Colin McCabe
Sorry if this has been answered elsewhere.  But what is the rationale behind 
Connect explicitly escaping ' " and / when used in connector names?  Shouldn't 
that be handled by percent-encoding when using REST endpoints?

best,
Colin
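
For illustration, a minimal sketch of the behaviour described below 
(percent-decode the name, then reject any ISO control character); this is 
only a sketch of the check, not the actual Connect implementation, and the 
sample names are made up:

import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public final class ConnectorNameCheck {

    // Rejects names containing an ISO control character after percent-decoding.
    static boolean isValid(String rawName) throws Exception {
        String name = URLDecoder.decode(rawName, StandardCharsets.UTF_8.name());
        return name.chars().noneMatch(Character::isISOControl);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("my-connector"));    // true
        System.out.println(isValid("bad%0Aconnector")); // false: %0A decodes to a newline
    }
}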


On Fri, Jan 26, 2018, at 10:21, Sönke Liebau wrote:
> I spent some time with the code today to try and hit the Jan 30th deadline
> for 1.1.
> I'm not entirely done yet, there is one weird test failure that I need to
> investigate, but I expect to be able to commit a new version sometime
> tomorrow.
> However, I just wanted to describe the current behavior of the code up
> front to see if everybody agrees with it. There are a few peculiarities /
> decisions we may still need to take on whether to extend the logic a little.
> 
> Currently connector names are rejected when the method
> Character.isISOControl returns true for any position of the string. This
> catches ASCII 0 through 31 as well as 127 through 159, and the escape
> sequences \r \b \t \n \f as well as \u representations of these characters.
> Percent encoding is decoded before performing these checks, so that can't be
> used to sneak anything past the check.
> 
> Other escape sequences that are unknown to Java cause an exception (unknown
> escape character) to be thrown somewhere in the REST classes - I believe
> there is not much we can do about that.
> 
> So far all is well, on to the stuff I am unsure about.
> There are three Java escape sequences remaining: \' \" and \\
> Currently these are not unescaped by the code and would show up in the
> connector name exactly like that - which means there is no way to get a
> single \ in a connector name. Percent-encoded backslashes are also converted
> to \\.
> 
> Do we want to substitute this (it would be a finite list of three
> substitutions) at the risk of this maybe causing issues somewhere else in
> the code because we created an illegal escape sequence again, or are we
> happy with that behavior for now?
> 
> Kind regards,
> Sönke
> 
> [1]
> https://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isISOControl(int)
> 
> On 21.01.2018 at 23:35, "Sönke Liebau"  wrote:
> 
> > I've updated the KIP to prohibit using control characters in connector
> > names - will create a vote thread tomorrow unless I hear back on
> > necessary changes from anybody.
> >
> > Current proposal is to ban all control characters including newline
> > etc. as well as their escape sequences. I have not specifically listed
> > the escape sequences as I will have to dig into that topic a bit more
> > to come up with a useful solution I think, but the general principle
> > is stated.
> >
> > Let me know what you think.
> >
> > Best regards,
> > Sönke
> >
> > On Sun, Jan 21, 2018 at 8:37 PM, Sönke Liebau
> >  wrote:
> > > Hi everybody,
> > >
> > > I was out of touch for personal reasons the entire week, apologies.
> > > I'll update the KIP tonight and kick of a vote tomorrow morning if no
> > > one objects until then. That gives a little less than two full days
> > > for voting until the deadline kicks in - might work out if everybody
> > > is happy with it.
> > >
> > > Best regards,
> > > Sönke
> > >
> > > On Sat, Jan 20, 2018 at 12:38 AM, Randall Hauch 
> > wrote:
> > >> Sonke,
> > >>
> > >> Have you had a chance to update the KIP and kick off a VOTE thread? We
> > need
> > >> to do this ASAP if we want this to make the KIP deadline for 1.1, which
> > is
> > >> Jan 23!
> > >>
> > >> On Tue, Jan 16, 2018 at 10:33 PM, Ewen Cheslack-Postava <
> > e...@confluent.io>
> > >> wrote:
> > >>
> > >>> Sonke,
> > >>>
> > >>> I'm fine filtering some control characters. The trimming also seems
> > like it
> > >>> might be *somewhat* moot because the way connector names work in
> > standalone
> > >>> mode is limited by ConfigDef, which already does trimming of settings.
> > Not
> > >>> a great reason to be restrictive, but we'd partly just be codifying
> > what's
> > >>> there.
> > >>>
> > >>> I just generally have a distaste for being restrictive without a clear
> > >>> reason. In this case I don't think it has any significant impact.
> > >>>
> > >>> KIP freeze is nearing and this seems like a simple improvement and a
> > PR is
> > >>> already available (modulo any changes re: control characters). I'll
> > start
> > >>> reviewing the PR, do you want to make any last updates about control
> > >>> characters in the KIP and kick off a VOTE thread?
> > >>>
> > >>> -Ewen
> > >>>
> > >>> On Fri, Jan 12, 2018 at 1:43 PM, Colin McCabe 
> > wrote:
> > >>>
> > >>> > On Fri, Jan 12, 2018, at 08:03, Sönke Liebau wrote:
> > >>> > > Hi everybody,
> > >>> > >
> > >>> > > from reading the discussion I understand that we have two things
> > still
> > >>> > > open for discussen.
> > >>> > >
> > >>> > >  Ewen is still a bit on the fence about whether or not we trim
> > >>> > > whitespace characters but seems to favor not doing it due to 

Re: [VOTE] KIP-232: Detect outdated metadata using leaderEpoch and partitionEpoch

2018-01-29 Thread Jun Rao
Just some clarification on the current fencing logic. Currently, if the
producer uses acks=-1, a write will only succeed if the write is received
by all in-sync replicas (i.e., committed). This is true even when min.isr
is set since we first wait for a message to be committed and then check the
min.isr requirement. KIP-250 may change that, but we can discuss the
implication there. In the case where you have 3 replicas A, B and C, and A
and B are partitioned off and C becomes the new leader, the old leader A
can't commit new messages with the current ISR of {A, B, C} since C won't
be fetching from A. A will then try to persist the reduced ISR of just A
and B to ZK. This will fail since the ZK version is outdated. Then, no new
message can be committed by A. This is how we fence off the writer.

Currently, there is no fencing for the readers. We can potentially fence
off the reads in the old leader A after a timeout. However, we still need
to decide what to do within the timeout, especially when A is presented
with an offset that's larger than its last offset.

Thanks,

Jun

On Fri, Jan 26, 2018 at 12:17 PM, Dong Lin  wrote:

> Hey Colin,
>
>
> On Fri, Jan 26, 2018 at 10:16 AM, Colin McCabe  wrote:
>
> > On Thu, Jan 25, 2018, at 16:47, Dong Lin wrote:
> > > Hey Colin,
> > >
> > > Thanks for the comment.
> > >
> > > On Thu, Jan 25, 2018 at 4:15 PM, Colin McCabe 
> > wrote:
> > >
> > > > On Wed, Jan 24, 2018, at 21:07, Dong Lin wrote:
> > > > > Hey Colin,
> > > > >
> > > > > Thanks for reviewing the KIP.
> > > > >
> > > > > If I understand you right, you maybe suggesting that we can use a
> > global
> > > > > metadataEpoch that is incremented every time controller updates
> > metadata.
> > > > > The problem with this solution is that, if a topic is deleted and
> > created
> > > > > again, user will not know whether that the offset which is stored
> > before
> > > > > the topic deletion is no longer valid. This motivates the idea to
> > include
> > > > > per-partition partitionEpoch. Does this sound reasonable?
> > > >
> > > > Hi Dong,
> > > >
> > > > Perhaps we can store the last valid offset of each deleted topic in
> > > > ZooKeeper.  Then, when a topic with one of those names gets
> > re-created, we
> > > > can start the topic at the previous end offset rather than at 0.
> This
> > > > preserves immutability.  It is no more burdensome than having to
> > preserve a
> > > > "last epoch" for the deleted partition somewhere, right?
> > > >
> > >
> > > My concern with this solution is that the number of zookeeper nodes get
> > > more and more over time if some users keep deleting and creating
> topics.
> > Do
> > > you think this can be a problem?
> >
> > Hi Dong,
> >
> > We could expire the "partition tombstones" after an hour or so.  In
> > practice this would solve the issue for clients that like to destroy and
> > re-create topics all the time.  In any case, doesn't the current proposal
> > add per-partition znodes as well that we have to track even after the
> > partition is deleted?  Or did I misunderstand that?
> >
>
> Actually the current KIP does not add per-partition znodes. Could you
> double check? I can fix the KIP wiki if there is anything misleading.
>
> If we expire the "partition tomstones" after an hour, and the topic is
> re-created after more than an hour since the topic deletion, then we are
> back to the situation where user can not tell whether the topic has been
> re-created or not, right?
>
>
> >
> > It's not really clear to me what should happen when a topic is destroyed
> > and re-created with new data.  Should consumers continue to be able to
> > consume?  We don't know where they stopped consuming from the previous
> > incarnation of the topic, so messages may have been lost.  Certainly
> > consuming data from offset X of the new incarnation of the topic may give
> > something totally different from what you would have gotten from offset X
> > of the previous incarnation of the topic.
> >
>
> With the current KIP, if a consumer consumes a topic based on the last
> remembered (offset, partitionEpoch, leaderEpoch), and if the topic is
> re-created, consume will throw InvalidPartitionEpochException because the
> previous partitionEpoch will be different from the current partitionEpoch.
> This is described in the Proposed Changes -> Consumption after topic
> deletion in the KIP. I can improve the KIP if there is anything not clear.
>
>
> > By choosing to reuse the same (topic, partition, offset) 3-tuple, we have
>
> chosen to give up immutability.  That was a really bad decision.  And now
> > we have to worry about time dependencies, stale cached data, and all the
> > rest.  We can't completely fix this inside Kafka no matter what we do,
> > because not all that cached data is inside Kafka itself.  Some of it may
> be
> > in systems that Kafka has sent data to, such as other daemons, SQL
> > databases, streams, and so forth.
> 

Choose the number of partitions/topics

2018-01-29 Thread Maria Pilar
Hi everyone

I have designed an integration between 2 systems through our Kafka stream API,
and the requirements are unclear for choosing the number of
partitions/topics properly.

That is the use case:

My producer will send 28 different types of events, so I have decided to
create 28 topics.

The max size for one message will be 4,096 bytes and the total volume will be
about 2,469.888 MB/day.

The retention will be 2 days.

By default I'm thinking of one partition, which as recommended by Confluent
can handle about 10 MB/second.

However, the requirement for the consumer is minimum latency (sub 3 seconds),
so I am thinking of creating more leader partitions per topic to parallelize
and achieve the throughput.

Do you know the best practice or formula to define this properly?

Thanks
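
A commonly quoted rule of thumb is partitions >= max(T/p, T/c), where T is 
the target throughput and p and c are the measured per-partition producer and 
consumer throughput. A rough sketch with the figures from this thread (the 
consumer figure and the even spread across the 28 topics are assumptions):

public class PartitionEstimate {
    public static void main(String[] args) {
        double totalMbPerSec = 2469.888 / (24 * 60 * 60);  // ~0.03 MB/s overall
        double perTopicMbPerSec = totalMbPerSec / 28;       // assumed even spread
        double producerMbPerSecPerPartition = 10.0;         // the 10 MB/s figure above
        double consumerMbPerSecPerPartition = 10.0;         // assumed; measure your consumer
        long partitions = Math.max(1, (long) Math.ceil(Math.max(
                perTopicMbPerSec / producerMbPerSecPerPartition,
                perTopicMbPerSec / consumerMbPerSecPerPartition)));
        System.out.println("Estimated partitions per topic: " + partitions); // 1 here
    }
}

At this volume a single partition per topic is enough for throughput; extra 
partitions mainly buy consumer parallelism, and very high partition counts 
can add end-to-end latency, so it is worth measuring p and c for the real 
workload before going much higher.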


RE: [DISCUSS]KIP-235 DNS alias and secured connections

2018-01-29 Thread Skrzypek, Jonathan
Hi,

Yes I believe this might address what you're seeing as well.

Jonathan Skrzypek 
Middleware Engineering
Messaging Engineering
Goldman Sachs International

-Original Message-
From: Stephane Maarek [mailto:steph...@simplemachines.com.au] 
Sent: 06 December 2017 10:43
To: dev@kafka.apache.org
Subject: RE: [DISCUSS]KIP-235 DNS alias and secured connections

Hi Jonathan

I think this will be very useful. I reported something similar here :
https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_KAFKA-2D4781=DwIFaQ=7563p3e2zaQw0AB1wrFVgyagb2IE5rTZOYPxLxfZlX4=nNmJlu1rR_QFAPdxGlafmDu9_r6eaCbPOM0NM1EHo-E=3R1dVnw5Ttyz1YbVIMSRNMz2gjWsQmbTNXl63kwXvKo=MywacMwh18eVH_NvLY6Ffhc3CKMh43Tai3WMUf9PsjM=
 

Please confirm your kip will address it ?

Stéphane

On 6 Dec. 2017 8:20 pm, "Skrzypek, Jonathan" 
wrote:

> True, amended the KIP, thanks.
>
> Jonathan Skrzypek
> Middleware Engineering
> Messaging Engineering
> Goldman Sachs International
>
>
> -Original Message-
> From: Tom Bentley [mailto:t.j.bent...@gmail.com]
> Sent: 05 December 2017 18:19
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS]KIP-235 DNS alias and secured connections
>
> Hi Jonathan,
>
> It might be worth mentioning in the KIP that this is necessary only 
> for
> *Kerberos* on SASL, and not other SASL mechanisms. Reading the JIRA it 
> makes sense, but I was confused up until that point.
>
> Cheers,
>
> Tom
>
> On 5 December 2017 at 17:53, Skrzypek, Jonathan 
> 
> wrote:
>
> > Hi,
> >
> > I would like to discuss a KIP I've submitted :
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.or
> > g_ 
> > confluence_display_KAFKA_KIP-2D=DwIBaQ=7563p3e2zaQw0AB1wrFVgyagb
> > 2I 
> > E5rTZOYPxLxfZlX4=nNmJlu1rR_QFAPdxGlafmDu9_r6eaCbPOM0NM1EHo-E=GWK
> > XA
> > ILbqxFU2j7LtoOx9MZ00uy_jJcGWWIG92CyAuc=fv5WAkOgLhVOmF4vhEzq_39CWnE
> > o0 q0AJbqhAuDFDT0= 
> > 235%3A+Add+DNS+alias+support+for+secured+connection
> >
> > Feedback and suggestions welcome !
> >
> > Regards,
> > Jonathan Skrzypek
> > Middleware Engineering
> > Messaging Engineering
> > Goldman Sachs International
> > Christchurch Court - 10-15 Newgate Street London EC1A 7HD
> > Tel: +442070512977
> >
> >
>


[jira] [Resolved] (KAFKA-6484) 'ConsumerGroupCommand' performance optimization for old consumer describe group

2018-01-29 Thread Randall Hauch (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch resolved KAFKA-6484.
--
Resolution: Duplicate

> 'ConsumerGroupCommand' performance optimization for old consumer describe 
> group
> ---
>
> Key: KAFKA-6484
> URL: https://issues.apache.org/jira/browse/KAFKA-6484
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin
>Affects Versions: 1.0.0
>Reporter: HongLiang
>Priority: Major
> Attachments: ConsumerGroupCommand.diff
>
>
> ConsumerGroupCommand describe group performance optimization:
> a 3x performance improvement compared to trunk (1.0+) and a 10x improvement 
> compared to 0.10.2.1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6496) NAT and Kafka

2018-01-29 Thread Ronald van de Kuil (JIRA)
Ronald van de Kuil created KAFKA-6496:
-

 Summary: NAT and Kafka
 Key: KAFKA-6496
 URL: https://issues.apache.org/jira/browse/KAFKA-6496
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 1.0.0
Reporter: Ronald van de Kuil


Hi,

As far as I know, Kafka itself does not support NAT, based on a test that I did 
with my physical router.

 

I can imagine that a real use case exists where NAT is desirable. For example, 
an OpenStack installation where Kafka hides behind floating IP addresses.

 

Are there any plans to make Kafka NAT-friendly?

 

Best Regards,

Ronald



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6495) Race condition when creating and deleting quickly topics may lead to metrics leak

2018-01-29 Thread Mickael Maison (JIRA)
Mickael Maison created KAFKA-6495:
-

 Summary: Race condition when creating and deleting quickly topics 
may lead to metrics leak
 Key: KAFKA-6495
 URL: https://issues.apache.org/jira/browse/KAFKA-6495
 Project: Kafka
  Issue Type: Bug
Reporter: Mickael Maison


The issue is described in [https://github.com/apache/kafka/pull/4204]

Once a topic has been created, if it gets deleted when a delayed fetch is in 
the purgatory, the topic metrics can get recreated.

PR 4204 is a simple workaround that greatly reduces the timing window but does 
not fix this completely.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6494) Extend ConfigCommand to update broker config using new AdminClient

2018-01-29 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-6494:
-

 Summary: Extend ConfigCommand to update broker config using new 
AdminClient
 Key: KAFKA-6494
 URL: https://issues.apache.org/jira/browse/KAFKA-6494
 Project: Kafka
  Issue Type: Sub-task
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 1.1.0


Add --bootstrap-server and --command-config options for new AdminClient. Update 
ConfigCommand to use new AdminClient for dynamic broker config updates in 
KIP-226. Full conversion of ConfigCommand to new AdminClient will be done later 
under KIP-248.
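
For reference, a minimal sketch of a dynamic broker config update through the 
new AdminClient, which is roughly what the extended ConfigCommand will drive; 
the bootstrap address, broker id and config key below are placeholder 
assumptions:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class BrokerConfigUpdate {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed
        AdminClient admin = AdminClient.create(props);
        try {
            // Target a single broker's dynamic config (broker id "0" is an example).
            ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "0");
            Config update = new Config(Collections.singletonList(
                    new ConfigEntry("log.cleaner.threads", "2"))); // example dynamic config
            admin.alterConfigs(Collections.singletonMap(broker, update)).all().get();
        } finally {
            admin.close();
        }
    }
}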



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: kafka-trunk-jdk9 #341

2018-01-29 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] MINOR: Optimize KTable-KTable join value getter supplier (#4458)

--
[...truncated 1.74 MB...]
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testInnerLeft[caching enabled = false] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterInner[caching enabled = false] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterInner[caching enabled = false] PASSED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterOuter[caching enabled = false] STARTED

org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterOuter[caching enabled = false] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInnerRepartitioned[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInnerRepartitioned[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuterRepartitioned[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuterRepartitioned[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInner[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInner[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuter[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuter[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testLeft[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testLeft[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testMultiInner[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testMultiInner[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testLeftRepartitioned[caching enabled = true] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testLeftRepartitioned[caching enabled = true] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInnerRepartitioned[caching enabled = false] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInnerRepartitioned[caching enabled = false] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuterRepartitioned[caching enabled = false] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuterRepartitioned[caching enabled = false] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInner[caching enabled = false] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testInner[caching enabled = false] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuter[caching enabled = false] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testOuter[caching enabled = false] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testLeft[caching enabled = false] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testLeft[caching enabled = false] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testMultiInner[caching enabled = false] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testMultiInner[caching enabled = false] PASSED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testLeftRepartitioned[caching enabled = false] STARTED

org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest > 
testLeftRepartitioned[caching enabled = false] PASSED

org.apache.kafka.streams.integration.RestoreIntegrationTest > 
shouldProcessDataFromStoresWithLoggingDisabled STARTED

org.apache.kafka.streams.integration.RestoreIntegrationTest > 
shouldProcessDataFromStoresWithLoggingDisabled PASSED

org.apache.kafka.streams.integration.RestoreIntegrationTest > 
shouldRestoreState STARTED

org.apache.kafka.streams.integration.RestoreIntegrationTest > 
shouldRestoreState PASSED

org.apache.kafka.streams.integration.RestoreIntegrationTest > 
shouldSuccessfullyStartWhenLoggingDisabled STARTED
ERROR: No tool found matching GRADLE_3_4_RC_2_HOME
ERROR: Could not install GRADLE_4_3_HOME
java.lang.NullPointerException

org.apache.kafka.streams.integration.RestoreIntegrationTest > 
shouldSuccessfullyStartWhenLoggingDisabled PASSED