Re: [VOTE] 2.0.0 RC3

2018-07-26 Thread Guozhang Wang
+1.

Validated the following:

1. quick start on binary (2.12)
2. unit test on source
3. javadoc
4. web doc
5. included jars (2.12).


Thanks Rajini!


Guozhang


On Wed, Jul 25, 2018 at 8:10 AM, Ron Dagostino  wrote:

> +1 (non-binding)
>
> Built from source and exercised the new SASL/OAUTHBEARER functionality with
> unsecured tokens.
>
> Thanks, Rajini -- apologies for KAFKA-7182.
>
> Ron
>
> On Tue, Jul 24, 2018 at 5:10 PM Vahid S Hashemian <
> vahidhashem...@us.ibm.com>
> wrote:
>
> > +1 (non-binding)
> >
> > Built from source and ran quickstart successfully with both Java 8 and
> > Java 9 on Ubuntu.
> > Thanks Rajini!
> >
> > --Vahid
> >
> >
> >
> >
> > From:   Rajini Sivaram 
> > To: dev , Users ,
> > kafka-clients 
> > Date:   07/24/2018 08:33 AM
> > Subject:[VOTE] 2.0.0 RC3
> >
> >
> >
> > Hello Kafka users, developers and client-developers,
> >
> >
> > This is the fourth candidate for release of Apache Kafka 2.0.0.
> >
> >
> > This is a major version release of Apache Kafka. It includes 40 new KIPs
> > and several critical bug fixes. Please see the 2.0.0 release plan for more
> > details:
> >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80448820
> >
> >
> >
> > A few notable highlights:
> >
> >- Prefixed wildcard ACLs (KIP-290), Fine grained ACLs for CreateTopics
> >(KIP-277)
> >- SASL/OAUTHBEARER implementation (KIP-255)
> >- Improved quota communication and customization of quotas (KIP-219,
> >KIP-257)
> >- Efficient memory usage for down conversion (KIP-283)
> >- Fix log divergence between leader and follower during fast leader
> >failover (KIP-279)
> >- Drop support for Java 7 and remove deprecated code including old
> > scala
> >clients
> >- Connect REST extension plugin, support for externalizing secrets and
> >improved error handling (KIP-285, KIP-297, KIP-298 etc.)
> >- Scala API for Kafka Streams and other Streams API improvements
> >(KIP-270, KIP-150, KIP-245, KIP-251 etc.)
> >
> >
> > Release notes for the 2.0.0 release:
> >
> > http://home.apache.org/~rsivaram/kafka-2.0.0-rc3/RELEASE_NOTES.html
> >
> >
> >
> > *** Please download, test and vote by Friday July 27, 4pm PT.
> >
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> >
> > http://kafka.apache.org/KEYS
> >
> >
> >
> > * Release artifacts to be voted upon (source and binary):
> >
> > http://home.apache.org/~rsivaram/kafka-2.0.0-rc3/
> >
> >
> >
> > * Maven artifacts to be voted upon:
> >
> > https://repository.apache.org/content/groups/staging/
> >
> >
> >
> > * Javadoc:
> >
> > http://home.apache.org/~rsivaram/kafka-2.0.0-rc3/javadoc/
> >
> >
> >
> > * Tag to be voted upon (off 2.0 branch) is the 2.0.0 tag:
> >
> > https://github.com/apache/kafka/releases/tag/2.0.0-rc3
> >
> >
> > * Documentation:
> >
> > http://kafka.apache.org/20/documentation.html
> >
> >
> >
> > * Protocol:
> >
> > http://kafka.apache.org/20/protocol.html
> >
> >
> >
> > * Successful Jenkins builds for the 2.0 branch:
> >
> > Unit/integration tests:
> > https://builds.apache.org/job/kafka-2.0-jdk8/90/
> >
> >
> > System tests:
> > https://jenkins.confluent.io/job/system-test-kafka/job/2.0/41/
> >
> >
> >
> > /**
> >
> >
> > Thanks,
> >
> >
> >
> > Rajini
> >
> >
> >
> >
> >
>



-- 
-- Guozhang


Jenkins build is back to normal : kafka-trunk-jdk10 #322

2018-07-26 Thread Apache Jenkins Server
See 




[DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-07-26 Thread Boyang Chen
Hey friends,


I would like to open a discussion thread on KIP-345:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Reduce+multiple+consumer+rebalances+by+specifying+member+id


This KIP tries to reduce unnecessary consumer rebalances by maintaining the
consumer member id across rebalance generations. I have verified the approach
on our internal Streams application, and it could reduce rebalance time to a
few seconds during a rolling restart of the service.
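
For readers who want a concrete picture, here is a minimal consumer sketch of
the idea, assuming the member id is supplied through a consumer config; the
config key shown ("group.instance.id") is illustrative only, since the KIP is
still under discussion:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class StaticMemberConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-streams-app");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            // Illustrative static member id (config name is a placeholder; see KIP-345).
            // Reusing the same id across restarts would let the coordinator recognize
            // the returning member instead of triggering a fresh rebalance.
            props.put("group.instance.id", "instance-1");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("my-topic"));
                // poll loop elided
            }
        }
    }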


Let me know your thoughts, thank you!


Best,

Boyang


Build failed in Jenkins: kafka-trunk-jdk8 #2842

2018-07-26 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-7192: Wipe out if EOS is turned on and checkpoint file does not

--
[...truncated 428.79 KB...]

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask STARTED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask PASSED

kafka.utils.SchedulerTest > testNonPeriodicTask STARTED

kafka.utils.SchedulerTest > testNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testRestart STARTED

kafka.utils.SchedulerTest > testRestart PASSED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler STARTED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler PASSED

kafka.utils.SchedulerTest > testPeriodicTask STARTED

kafka.utils.SchedulerTest > testPeriodicTask PASSED

kafka.utils.LoggingTest > testLog4jControllerIsRegistered STARTED

kafka.utils.LoggingTest > testLog4jControllerIsRegistered PASSED

kafka.utils.LoggingTest > testLogName STARTED

kafka.utils.LoggingTest > testLogName PASSED

kafka.utils.LoggingTest > testLogNameOverride STARTED

kafka.utils.LoggingTest > testLogNameOverride PASSED

kafka.utils.ZkUtilsTest > testGetSequenceIdMethod STARTED

kafka.utils.ZkUtilsTest > testGetSequenceIdMethod PASSED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testGetAllPartitionsTopicWithoutPartitions STARTED

kafka.utils.ZkUtilsTest > testGetAllPartitionsTopicWithoutPartitions PASSED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath STARTED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath PASSED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing STARTED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing PASSED

kafka.utils.ZkUtilsTest > testGetLeaderIsrAndEpochForPartition STARTED

kafka.utils.ZkUtilsTest > testGetLeaderIsrAndEpochForPartition PASSED

kafka.utils.CoreUtilsTest > testGenerateUuidAsBase64 STARTED

kafka.utils.CoreUtilsTest > testGenerateUuidAsBase64 PASSED

kafka.utils.CoreUtilsTest > testAbs STARTED

kafka.utils.CoreUtilsTest > testAbs PASSED

kafka.utils.CoreUtilsTest > testReplaceSuffix STARTED

kafka.utils.CoreUtilsTest > testReplaceSuffix PASSED

kafka.utils.CoreUtilsTest > testCircularIterator STARTED

kafka.utils.CoreUtilsTest > testCircularIterator PASSED

kafka.utils.CoreUtilsTest > testReadBytes STARTED

kafka.utils.CoreUtilsTest > testReadBytes PASSED

kafka.utils.CoreUtilsTest > testCsvList STARTED

kafka.utils.CoreUtilsTest > testCsvList PASSED

kafka.utils.CoreUtilsTest > testReadInt STARTED

kafka.utils.CoreUtilsTest > testReadInt PASSED

kafka.utils.CoreUtilsTest > testAtomicGetOrUpdate STARTED

kafka.utils.CoreUtilsTest > testAtomicGetOrUpdate PASSED

kafka.utils.CoreUtilsTest > testUrlSafeBase64EncodeUUID STARTED

kafka.utils.CoreUtilsTest > testUrlSafeBase64EncodeUUID PASSED

kafka.utils.CoreUtilsTest > testCsvMap STARTED

kafka.utils.CoreUtilsTest > testCsvMap PASSED

kafka.utils.CoreUtilsTest > testInLock STARTED

kafka.utils.CoreUtilsTest > testInLock PASSED

kafka.utils.CoreUtilsTest > testTryAll STARTED

kafka.utils.CoreUtilsTest > testTryAll PASSED

kafka.utils.CoreUtilsTest > testSwallow STARTED

kafka.utils.CoreUtilsTest > testSwallow PASSED

kafka.utils.IteratorTemplateTest > testIterator STARTED

kafka.utils.IteratorTemplateTest > testIterator PASSED

kafka.utils.json.JsonValueTest > testJsonObjectIterator STARTED

kafka.utils.json.JsonValueTest > testJsonObjectIterator PASSED

kafka.utils.json.JsonValueTest > testDecodeLong STARTED

kafka.utils.json.JsonValueTest > testDecodeLong PASSED

kafka.utils.json.JsonValueTest > testAsJsonObject STARTED

kafka.utils.json.JsonValueTest > testAsJsonObject PASSED

kafka.utils.json.JsonValueTest > testDecodeDouble STARTED

kafka.utils.json.JsonValueTest > testDecodeDouble PASSED

kafka.utils.json.JsonValueTest > testDecodeOption STARTED

kafka.utils.json.JsonValueTest > testDecodeOption PASSED

kafka.utils.json.JsonValueTest > testDecodeString STARTED

kafka.utils.json.JsonValueTest > testDecodeString PASSED

kafka.utils.json.JsonValueTest > testJsonValueToString STARTED

kafka.utils.json.JsonValueTest > testJsonValueToString PASSED

kafka.utils.json.JsonValueTest > testAsJsonObjectOption STARTED

kafka.utils.json.JsonValueTest > testAsJsonObjectOption PASSED

kafka.utils.json.JsonValueTest > testAsJsonArrayOption STARTED

kafka.utils.json.JsonValueTest > testAsJsonArrayOption PASSED

kafka.utils.json.JsonValueTest > testAsJsonArray STARTED

kafka.utils.json.JsonValueTest > testAsJsonArray PASSED

kafka.utils.json.JsonValueTest > testJsonValueHashCode STARTED

kafka.utils.json.JsonValueTest > testJsonValueHashCode 

Build failed in Jenkins: kafka-trunk-jdk10 #321

2018-07-26 Thread Apache Jenkins Server
See 

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H31 (ubuntu xenial) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
https://github.com/apache/kafka.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:888)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
at hudson.scm.SCM.checkout(SCM.java:504)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags 
--progress https://github.com/apache/kafka.git 
+refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: error: Could not read 24d0d3351a43f52f6f688143672d7c57c5f19595
error: Could not read 1e98b2aa5027eb60ee1a112a7f8ec45b9c47
error: Could not read 689bac3c6fcae927ccee89c8d4fd54c2fae83be1
remote: Counting objects: 3597, done.
[... git fetch progress output omitted ...]

Jenkins build is back to normal : kafka-1.1-jdk7 #170

2018-07-26 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-trunk-jdk10 #320

2018-07-26 Thread Apache Jenkins Server
See 

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H23 (ubuntu xenial) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
https://github.com/apache/kafka.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:888)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
at hudson.scm.SCM.checkout(SCM.java:504)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags 
--progress https://github.com/apache/kafka.git 
+refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: error: Could not read 0f3affc0f40751dc8fd064b36b6e859728f63e37
error: Could not read 95fbb2e03f4fe79737c71632e0ef2dfdcfb85a69
error: Could not read 08c465028d057ac23cdfe6d57641fe40240359dd
remote: Counting objects: 6107, done.
[... git fetch progress output omitted ...]

[jira] [Resolved] (KAFKA-7192) State-store can desynchronise with changelog

2018-07-26 Thread Guozhang Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-7192.
--
Resolution: Fixed

> State-store can desynchronise with changelog
> 
>
> Key: KAFKA-7192
> URL: https://issues.apache.org/jira/browse/KAFKA-7192
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.1.1
>Reporter: Jon Bates
>Assignee: Guozhang Wang
>Priority: Critical
>  Labels: bugs
> Fix For: 2.1.0
>
>
> n.b. this bug has been verified with exactly-once processing enabled
> Consider the following scenario:
>  * A record, N is read into a Kafka topology
>  * the state store is updated
>  * the topology crashes
> h3. *Expected behaviour:*
>  # Node is restarted
>  # Offset was never updated, so record N is reprocessed
>  # State-store is reset to position N-1
>  # Record is reprocessed
> h3. *Actual Behaviour*
>  # Node is restarted
>  # Record N is reprocessed (good)
>  # The state store has the state from the previous processing
> I'd consider this a corruption of the state-store, hence the critical 
> Priority, although High may be more appropriate.
> I wrote a proof-of-concept here, which demonstrates the problem on Linux:
> [https://github.com/spadger/kafka-streams-sad-state-store]
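
For context, a minimal Streams application with exactly-once processing enabled
(the mode under which the desync was observed) looks roughly like the sketch
below; application and topic names are placeholders, and the linked repository
above remains the actual proof of concept:

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;

    public class EosStateStoreExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "eos-demo");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            // Exactly-once processing; the count() store below is backed by a changelog
            // topic that is expected to stay in sync with the local state.
            props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);

            StreamsBuilder builder = new StreamsBuilder();
            builder.stream("input-topic").groupByKey().count();

            new KafkaStreams(builder.build(), props).start();
        }
    }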



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: kafka-trunk-jdk8 #2841

2018-07-26 Thread Apache Jenkins Server
See 


Changes:

[lindong28] KAFKA-7126; Reduce number of rebalance for large consumer group 
after a

--
[...truncated 885.29 KB...]

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask STARTED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask PASSED

kafka.utils.SchedulerTest > testNonPeriodicTask STARTED

kafka.utils.SchedulerTest > testNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testRestart STARTED

kafka.utils.SchedulerTest > testRestart PASSED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler STARTED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler PASSED

kafka.utils.SchedulerTest > testPeriodicTask STARTED

kafka.utils.SchedulerTest > testPeriodicTask PASSED

kafka.utils.LoggingTest > testLog4jControllerIsRegistered STARTED

kafka.utils.LoggingTest > testLog4jControllerIsRegistered PASSED

kafka.utils.LoggingTest > testLogName STARTED

kafka.utils.LoggingTest > testLogName PASSED

kafka.utils.LoggingTest > testLogNameOverride STARTED

kafka.utils.LoggingTest > testLogNameOverride PASSED

kafka.utils.ZkUtilsTest > testGetSequenceIdMethod STARTED

kafka.utils.ZkUtilsTest > testGetSequenceIdMethod PASSED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testGetAllPartitionsTopicWithoutPartitions STARTED

kafka.utils.ZkUtilsTest > testGetAllPartitionsTopicWithoutPartitions PASSED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath STARTED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath PASSED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing STARTED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing PASSED

kafka.utils.ZkUtilsTest > testGetLeaderIsrAndEpochForPartition STARTED

kafka.utils.ZkUtilsTest > testGetLeaderIsrAndEpochForPartition PASSED

kafka.utils.CoreUtilsTest > testGenerateUuidAsBase64 STARTED

kafka.utils.CoreUtilsTest > testGenerateUuidAsBase64 PASSED

kafka.utils.CoreUtilsTest > testAbs STARTED

kafka.utils.CoreUtilsTest > testAbs PASSED

kafka.utils.CoreUtilsTest > testReplaceSuffix STARTED

kafka.utils.CoreUtilsTest > testReplaceSuffix PASSED

kafka.utils.CoreUtilsTest > testCircularIterator STARTED

kafka.utils.CoreUtilsTest > testCircularIterator PASSED

kafka.utils.CoreUtilsTest > testReadBytes STARTED

kafka.utils.CoreUtilsTest > testReadBytes PASSED

kafka.utils.CoreUtilsTest > testCsvList STARTED

kafka.utils.CoreUtilsTest > testCsvList PASSED

kafka.utils.CoreUtilsTest > testReadInt STARTED

kafka.utils.CoreUtilsTest > testReadInt PASSED

kafka.utils.CoreUtilsTest > testAtomicGetOrUpdate STARTED

kafka.utils.CoreUtilsTest > testAtomicGetOrUpdate PASSED

kafka.utils.CoreUtilsTest > testUrlSafeBase64EncodeUUID STARTED

kafka.utils.CoreUtilsTest > testUrlSafeBase64EncodeUUID PASSED

kafka.utils.CoreUtilsTest > testCsvMap STARTED

kafka.utils.CoreUtilsTest > testCsvMap PASSED

kafka.utils.CoreUtilsTest > testInLock STARTED

kafka.utils.CoreUtilsTest > testInLock PASSED

kafka.utils.CoreUtilsTest > testTryAll STARTED

kafka.utils.CoreUtilsTest > testTryAll PASSED

kafka.utils.CoreUtilsTest > testSwallow STARTED

kafka.utils.CoreUtilsTest > testSwallow PASSED

kafka.utils.IteratorTemplateTest > testIterator STARTED

kafka.utils.IteratorTemplateTest > testIterator PASSED

kafka.utils.json.JsonValueTest > testJsonObjectIterator STARTED

kafka.utils.json.JsonValueTest > testJsonObjectIterator PASSED

kafka.utils.json.JsonValueTest > testDecodeLong STARTED

kafka.utils.json.JsonValueTest > testDecodeLong PASSED

kafka.utils.json.JsonValueTest > testAsJsonObject STARTED

kafka.utils.json.JsonValueTest > testAsJsonObject PASSED

kafka.utils.json.JsonValueTest > testDecodeDouble STARTED

kafka.utils.json.JsonValueTest > testDecodeDouble PASSED

kafka.utils.json.JsonValueTest > testDecodeOption STARTED

kafka.utils.json.JsonValueTest > testDecodeOption PASSED

kafka.utils.json.JsonValueTest > testDecodeString STARTED

kafka.utils.json.JsonValueTest > testDecodeString PASSED

kafka.utils.json.JsonValueTest > testJsonValueToString STARTED

kafka.utils.json.JsonValueTest > testJsonValueToString PASSED

kafka.utils.json.JsonValueTest > testAsJsonObjectOption STARTED

kafka.utils.json.JsonValueTest > testAsJsonObjectOption PASSED

kafka.utils.json.JsonValueTest > testAsJsonArrayOption STARTED

kafka.utils.json.JsonValueTest > testAsJsonArrayOption PASSED

kafka.utils.json.JsonValueTest > testAsJsonArray STARTED

kafka.utils.json.JsonValueTest > testAsJsonArray PASSED

kafka.utils.json.JsonValueTest > testJsonValueHashCode STARTED

kafka.utils.json.JsonValueTest > 

Jenkins build is back to normal : kafka-2.0-jdk8 #92

2018-07-26 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-1.1-jdk7 #169

2018-07-26 Thread Apache Jenkins Server
See 

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H24 (ubuntu xenial) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
https://github.com/apache/kafka.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:888)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
at hudson.scm.SCM.checkout(SCM.java:504)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags 
--progress https://github.com/apache/kafka.git 
+refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: error: Could not read 060fd7c5cc2a34d43871af8452a07a84ee32bd20
error: Could not read 1785c72ff1ef79f46217661abcf9b20ad2c4faf6
error: Could not read e9f6f2bdcef91f79d128e75e91a1184461ec3de6
error: Could not read 38e4be686ccd8906f936c0c010d628956c2844a7
error: missing object referenced by 'refs/tags/1.0.2'
error: Could not read f54ba7cf8528b5082471a86850ba951244a990e1
error: Could not read 50595d48edeaaf2c0ac255c1514af50b339f7d30
error: Could not read 2a121f7b1d402825cde176be346de8f91c6c7c81
remote: Counting objects: 7575, done.
[... git fetch progress output omitted ...]

Build failed in Jenkins: kafka-trunk-jdk10 #319

2018-07-26 Thread Apache Jenkins Server
See 


Changes:

[lindong28] KAFKA-7126; Reduce number of rebalance for large consumer group 
after a

--
[...truncated 1.54 MB...]

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldSupportEpochsThatDoNotStartFromZero STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldSupportEpochsThatDoNotStartFromZero PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUnsupportedIfRequestedEpochLessThanFirstEpoch STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUnsupportedIfRequestedEpochLessThanFirstEpoch PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnNextAvailableEpochIfThereIsNoExactEpochForTheOneRequested STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnNextAvailableEpochIfThereIsNoExactEpochForTheOneRequested PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldAddEpochAndMessageOffsetToCache STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldAddEpochAndMessageOffsetToCache PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldDropEntriesOnEpochBoundaryWhenRemovingLatestEntries STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldDropEntriesOnEpochBoundaryWhenRemovingLatestEntries PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateSavedOffsetWhenOffsetToClearToIsBetweenEpochs STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateSavedOffsetWhenOffsetToClearToIsBetweenEpochs PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotResetEpochHistoryTailIfUndefinedPassed STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotResetEpochHistoryTailIfUndefinedPassed PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUnsupportedIfNoEpochRecorded STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUnsupportedIfNoEpochRecorded PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUnsupportedIfNoEpochRecordedAndUndefinedEpochRequested STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUnsupportedIfNoEpochRecordedAndUndefinedEpochRequested PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldRetainLatestEpochOnClearAllEarliest STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldRetainLatestEpochOnClearAllEarliest PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldPersistEpochsBetweenInstances STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldPersistEpochsBetweenInstances PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotClearAnythingIfOffsetToFirstOffset STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotClearAnythingIfOffsetToFirstOffset PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotLetOffsetsGoBackwardsEvenIfEpochsProgress STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotLetOffsetsGoBackwardsEvenIfEpochsProgress PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldGetFirstOffsetOfSubsequentEpochWhenOffsetRequestedForPreviousEpoch STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldGetFirstOffsetOfSubsequentEpochWhenOffsetRequestedForPreviousEpoch PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateOffsetBetweenEpochBoundariesOnClearEarliest2 STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateOffsetBetweenEpochBoundariesOnClearEarliest2 PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearEarliestOnEmptyCache 
STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearEarliestOnEmptyCache 
PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldPreserveResetOffsetOnClearEarliestIfOneExists STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldPreserveResetOffsetOnClearEarliestIfOneExists PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateOffsetBetweenEpochBoundariesOnClearEarliest STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateOffsetBetweenEpochBoundariesOnClearEarliest PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnInvalidOffsetIfEpochIsRequestedWhichIsNotCurrentlyTracked STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnInvalidOffsetIfEpochIsRequestedWhichIsNotCurrentlyTracked PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldFetchEndOffsetOfEmptyCache 
STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldFetchEndOffsetOfEmptyCache 
PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldRetainLatestEpochOnClearAllEarliestAndUpdateItsOffset STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldRetainLatestEpochOnClearAllEarliestAndUpdateItsOffset PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearAllEntries STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearAllEntries PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearLatestOnEmptyCache 
STARTED


Jenkins build is back to normal : kafka-trunk-jdk8 #2840

2018-07-26 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-1.1-jdk7 #168

2018-07-26 Thread Apache Jenkins Server
See 

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H24 (ubuntu xenial) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
https://github.com/apache/kafka.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:888)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
at hudson.scm.SCM.checkout(SCM.java:504)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags 
--progress https://github.com/apache/kafka.git 
+refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: error: Could not read 060fd7c5cc2a34d43871af8452a07a84ee32bd20
error: Could not read 1785c72ff1ef79f46217661abcf9b20ad2c4faf6
error: Could not read e9f6f2bdcef91f79d128e75e91a1184461ec3de6
error: Could not read 38e4be686ccd8906f936c0c010d628956c2844a7
error: missing object referenced by 'refs/tags/1.0.2'
error: Could not read f54ba7cf8528b5082471a86850ba951244a990e1
error: Could not read 50595d48edeaaf2c0ac255c1514af50b339f7d30
error: Could not read 2a121f7b1d402825cde176be346de8f91c6c7c81
remote: Counting objects: 7575, done.
[... git fetch progress output omitted ...]

Re: [DISCUSS] KIP-320: Allow fetchers to detect and handle log truncation

2018-07-26 Thread Anna Povzner
Hi Jason,

Thanks for the update. I agree with the current proposal.

Two minor comments:
1) In the “API Changes” section, the first paragraph says that “users can catch the
more specific exception type and use the new `seekToNearest()` API defined
below”. Since LogTruncationException “will include the partitions that
were truncated and the offset of divergence”, shouldn’t the client use
seek(offset) to seek to the offset of divergence in response to the
exception?
2) In the “Protocol Changes” section, the OffsetsForLeaderEpoch subsection says “Note
that consumers will send a sentinel value (-1) for the current epoch and
the broker will simply disregard that validation”. Is that still true with
MetadataResponse containing the leader epoch?

Thanks,
Anna
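
To make Anna's first comment concrete, a rough sketch of the client-side
handling under discussion is shown below. LogTruncationException and its
divergentOffsets() accessor are assumed from the KIP draft and are not a
released API; process(...) stands in for application code.

    // Sketch only: exception type and accessor names are taken from the KIP draft.
    try {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
        process(records);
    } catch (LogTruncationException e) {
        // The KIP says the exception carries the truncated partitions and the
        // offset of divergence; seek back to that point and resume consuming.
        for (Map.Entry<TopicPartition, OffsetAndMetadata> entry :
                e.divergentOffsets().entrySet()) {
            consumer.seek(entry.getKey(), entry.getValue().offset());
        }
    }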

On Wed, Jul 25, 2018 at 1:44 PM Jason Gustafson  wrote:

> Hi All,
>
> I have made some updates to the KIP. As many of you know, a side project of
> mine has been specifying the Kafka replication protocol in TLA. You can
> check out the code here if you are interested:
> https://github.com/hachikuji/kafka-specification. In addition to
> uncovering
> a couple unknown bugs in the replication protocol (e.g.
> https://issues.apache.org/jira/browse/KAFKA-7128), this has helped me
> validate the behavior in this KIP. In fact, the original version I proposed
> had a weakness. I initially suggested letting the leader validate the
> expected epoch at the fetch offset. This made sense for the consumer in the
> handling of unclean leader election, but it was not strong enough to
> protect the follower in all cases. In order to make advancement of the high
> watermark safe, for example, the leader actually needs to be sure that
> every follower in the ISR matches its own epoch.
>
> I attempted to fix this problem by treating the epoch in the fetch request
> slightly differently for consumers and followers. For consumers, it would
> be the expected epoch of the record at the fetch offset, and the leader
> would raise a LOG_TRUNCATION error if the expectation failed. For
> followers, it would be the current epoch and the leader would require that
> it match its own epoch. This was unsatisfying both because of the
> inconsistency in behavior and because the consumer was left with the weaker
> fencing that we already knew was insufficient for the replicas. Ultimately
> I decided that we should make the behavior consistent and that meant that
> the consumer needed to act more like a following replica. Instead of
> checking for truncation while fetching, the consumer should check for
> truncation after leader changes. After checking for truncation, the
> consumer can then use the current epoch when fetching and get the stronger
> protection that it provides. What this means is that the Metadata API must
> include the current leader epoch. Given the problems we have had around
> stale metadata and how challenging they have been to debug, I'm convinced
> that this is a good idea in any case and it resolves the inconsistent
> behavior in the Fetch API. The downside is that there will be some
> additional overhead upon leader changes, but I don't think it is a major
> concern since leader changes are rare and the OffsetForLeaderEpoch request
> is cheap.
>
> This approach leaves the door open for some interesting follow up
> improvements. For example, now that we have the leader epoch in the
> Metadata request, we can implement similar fencing for the Produce API. And
> now that the consumer can reason about truncation, we could consider having
> a configuration to expose records beyond the high watermark. This would let
> users trade lower end-to-end latency for weaker durability semantics. It is
> sort of like having an acks=0 option for the consumer. Neither of these
> options are included in this KIP, I am just mentioning them as potential
> work for the future.
>
> Finally, based on the discussion in this thread, I have added the
> seekToCommitted API for the consumer. Please take a look and let me know
> what you think.
>
> Thanks,
> Jason
>
> On Fri, Jul 20, 2018 at 2:34 PM, Guozhang Wang  wrote:
>
> > Hi Jason,
> >
> > The proposed API seems reasonable to me too. Could you please also update
> > the wiki page (
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 320%3A+Allow+fetchers+to+detect+and+handle+log+truncation)
> > with a section say "workflow" on how the proposed API will be co-used
> with
> > others to:
> >
> > 1. consumer callers handling a LogTruncationException.
> > 2. consumer internals for handling a retriable
> UnknownLeaderEpochException.
> >
> >
> > Guozhang
> >
> >
> > On Tue, Jul 17, 2018 at 10:23 AM, Anna Povzner 
> wrote:
> >
> > > Hi Jason,
> > >
> > >
> > > I also like your proposal and agree that
> KafkaConsumer#seekToCommitted()
> > > is
> > > more intuitive as a way to initialize both consumer's position and its
> > > fetch state.
> > >
> > >
> > > My understanding that KafkaConsumer#seekToCommitted() is purely for
> > > clients
> > > who store 

Re: [DISCUSS] KIP-346 - Limit blast radius of log compaction failure

2018-07-26 Thread Stanislav Kozlovski
Hey guys,

@Colin - good point. I added some sentences mentioning recent improvements
in the introductory section.

*Disk Failure* - I tend to agree with what Colin said - once a disk fails,
you don't want to work with it again. As such, I've changed my mind and
believe that we should mark the LogDir (assuming it's a disk) as offline on
the first `IOException` encountered. This is the LogCleaner's current
behavior. We shouldn't change that.

*Respawning Threads* - I believe we should never re-spawn a thread. The
correct approach in my mind is to either have it stay dead or never let it
die in the first place.

*Uncleanable-partition-names metric* - Colin is right, this metric is
unneeded. Users can monitor the `uncleanable-partitions-count` metric and
inspect logs.


Hey Ray,

> 2) I'm 100% with James in agreement with setting up the LogCleaner to
> skip over problematic partitions instead of dying.
I think we can do this for every exception that isn't `IOException`. This
will future-proof us against bugs in the system and other unexpected errors.
Protecting yourself against unexpected failures is always a good thing in
my mind, but I also think that protecting yourself against bugs in the
software is sort of clunky. What does everybody think about this?
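
As a rough illustration of that skip-and-continue idea (not the actual
LogCleaner code, which lives in core and is written in Scala), the cleaner loop
could look something like the pseudo-Java below; grabDirtiestPartition,
markLogDirOffline and markUncleanable are placeholder names:

    // Illustrative pseudo-Java only; all helper names are placeholders.
    while (running) {
        PartitionLog log = grabDirtiestPartition();
        if (log == null) {
            continue;            // nothing to clean right now
        }
        try {
            clean(log);
        } catch (IOException e) {
            // Disk-level problem: keep today's behaviour and take the log dir offline.
            markLogDirOffline(log.dir(), e);
        } catch (Exception e) {
            // Any other failure: record the partition as uncleanable and keep going,
            // instead of letting the cleaner thread die.
            markUncleanable(log.topicPartition(), e);
        }
    }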

> 4) The only improvement I can think of is that if such an
> error occurs, then have the option (configuration setting?) to create a
> .skip file (or something similar).
This is a good suggestion. Have others also seen corruption generally
tied to the same segment?

On Wed, Jul 25, 2018 at 11:55 AM Dhruvil Shah  wrote:

> For the cleaner thread specifically, I do not think respawning will help at
> all because we are more than likely to run into the same issue again which
> would end up crashing the cleaner. Retrying makes sense for transient
> errors or when you believe some part of the system could have healed
> itself, both of which I think are not true for the log cleaner.
>
> On Wed, Jul 25, 2018 at 11:08 AM Ron Dagostino  wrote:
>
> > <<... an infinite loop which consumes resources and fires off continuous log
> > messages. >>
> > Hi Colin.  In case it could be relevant, one way to mitigate this effect
> is
> > to implement a backoff mechanism (if a second respawn is to occur then
> wait
> > for 1 minute before doing it; then if a third respawn is to occur wait
> for
> > 2 minutes before doing it; then 4 minutes, 8 minutes, etc. up to some max
> > wait time).
> >
> > I have no opinion on whether respawn is appropriate or not in this
> context,
> > but a mitigation like the increasing backoff described above may be
> > relevant in weighing the pros and cons.
> >
> > Ron
> >
> > On Wed, Jul 25, 2018 at 1:26 PM Colin McCabe  wrote:
> >
> > > On Mon, Jul 23, 2018, at 23:20, James Cheng wrote:
> > > > Hi Stanislav! Thanks for this KIP!
> > > >
> > > > I agree that it would be good if the LogCleaner were more tolerant of
> > > > errors. Currently, as you said, once it dies, it stays dead.
> > > >
> > > > Things are better now than they used to be. We have the metric
> > > >   kafka.log:type=LogCleanerManager,name=time-since-last-run-ms
> > > > which we can use to tell us if the threads are dead. And as of 1.1.0,
> > we
> > > > have KIP-226, which allows you to restart the log cleaner thread,
> > > > without requiring a broker restart.
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-226+-+Dynamic+Broker+Configuration
> > > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-226+-+Dynamic+Broker+Configuration
> > >
> > >
> > > > I've only read about this, I haven't personally tried it.
> > >
> > > Thanks for pointing this out, James!  Stanislav, we should probably
> add a
> > > sentence or two mentioning the KIP-226 changes somewhere in the KIP.
> > Maybe
> > > in the intro section?
> > >
> > > I think it's clear that requiring the users to manually restart the log
> > > cleaner is not a very good solution.  But it's good to know that it's a
> > > possibility on some older releases.
> > >
> > > >
> > > > Some comments:
> > > > * I like the idea of having the log cleaner continue to clean as many
> > > > partitions as it can, skipping over the problematic ones if possible.
> > > >
> > > > * If the log cleaner thread dies, I think it should automatically be
> > > > revived. Your KIP attempts to do that by catching exceptions during
> > > > execution, but I think we should go all the way and make sure that a
> > new
> > > > one gets created, if the thread ever dies.
> > >
> > > This is inconsistent with the way the rest of Kafka works.  We don't
> > > automatically re-create other threads in the broker if they terminate.
> > In
> > > general, if there is a serious bug in the code, respawning threads is
> > > likely to make things worse, by putting you in an infinite loop which
> > > consumes resources and fires off continuous log messages.
> > >
> > > >
> > > > * It might be worth trying to re-clean the 

Re: Discussion: New components in JIRA?

2018-07-26 Thread Ray Chiang

Thanks Guozhang.  I'm good with the way the documentation is now.

Is there any other procedure to follow to get "logging" and 
"mirrormaker" added as components or can we just request a JIRA admin to 
do that on this list?


-Ray

On 7/23/18 4:56 PM, Guozhang Wang wrote:

I've just updated the web docs on http://kafka.apache.org/contributing
accordingly.

On Mon, Jul 23, 2018 at 3:30 PM, khaireddine Rezgui <
khaireddine...@gmail.com> wrote:


Good job Ray for the wiki, it's clear enough.

Le 23 juil. 2018 10:17 PM, "Ray Chiang"  a écrit :

Okay, I've created a wiki page Reporting Issues in Apache Kafka
<https://cwiki.apache.org/confluence/display/KAFKA/Reporting+Issues+in+Apache+Kafka>.

I'd appreciate any feedback.  If this is good enough, I can file a JIRA
to change the link under "Bugs" in the "Project information" page.


-Ray


On 7/23/18 11:28 AM, Ray Chiang wrote:

Good point.  I'll look into adding some JIRA guidelines to the
documentation/wiki.

-Ray

On 7/22/18 10:23 AM, Guozhang Wang wrote:

Hello Ray,

Thanks for bringing this up. I'm generally +1 on the first two, while for
the last category, personally I felt leaving them as part of `tools` is
fine, but I'm also open for other opinions.

A more general question, though: today we do not have any guidelines
asking JIRA reporters to set the right component, i.e. it is purely
best-effort, and we cannot prevent reporters from adding new component
names. So far the project also does not have a tradition of managing
JIRA reports per component, as the goal is not to "separate" the project
into silos but to encourage everyone to get hands-on with every aspect of the
project.


Guozhang


On Fri, Jul 20, 2018 at 2:44 PM, Ray Chiang  wrote:


I've been doing a little bit of component cleanup in JIRA.  What do
people
think of adding
one or more of the following components?

- logging: For any consumer/producer/broker logging (i.e. log4j). This
should help disambiguate from the "log" component (i.e. Kafka
messages).

- mirrormaker: There are enough requests specific to MirrorMaker
that it
could be put into its own component.

- scripts: I'm a little more ambivalent about this one, but any of the
bin/*.sh script fixes could belong in their own category.  I'm not
sure if
other people feel strongly about how the "tools" component should be used
w.r.t. the run scripts.

Any thoughts?

-Ray









Re: [DISCUSS] KIP-346 - Limit blast radius of log compaction failure

2018-07-26 Thread Ray Chiang

Thanks for creating this KIP Stanislav.  My observations:

1) I agree with Colin that automatically re-launching dead threads
generally isn't a great idea.  Metrics and/or monitoring threads are
generally much safer.  And there's always the question of what happens if
the re-launcher itself dies.


2) I'm 100% with James in agreement with setting up the LogCleaner to 
skip over problematic partitions instead of dying.


3) There are a lot of "feature bloat" suggestions.  The way I see it,
a message could get corrupted in one of several ways:


3a) Message is corrupted by the leader partition saving to disk. 
Replicas have the same error.
3b) Message is corrupted by one of the replica partitions saving to 
disk.  Leader and other replica(s) unlikely to have the same error

3c) Disk corruption happens later (e.g. during partition move)

If we have the simplest solution, then all of the above will not cause 
the LogCleaner to crash and 3b/3c have a chance of manual recovery.


4) In most of the issues I've seen at work, the corruption seems to be
persistent on the same log segment (i.e. a 3b/3c type of
corruption).  The only improvement I can think of is that if such an
error occurs, then have the option (configuration setting?) to create a
.skip file (or something similar).  If the .skip file is
there, don't re-scan the segment.  If you want to retry or manage to fix
the issue manually (e.g. by copying from a replica), then the .skip file
can be deleted after the segment is fixed and the LogCleaner will try
again on the next iteration.


5) I'm in alignment with Colin's comment about hard drive failures.  By
the time you can reliably detect HDD hardware failures, it's less about
improving the LogCleaner and more about getting that data moved to a new
drive.


-Ray
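
A tiny sketch of what the .skip marker idea in (4) might look like; this is
hypothetical and not part of Kafka today, and the class and method names are
made up for illustration:

    import java.io.File;
    import java.io.IOException;

    // Hypothetical helper for the ".skip" marker idea; not part of Kafka today.
    final class SegmentSkipMarker {
        // e.g. 00000000000000000000.log -> 00000000000000000000.log.skip
        static boolean shouldSkip(File segmentFile) {
            return new File(segmentFile.getPath() + ".skip").exists();
        }

        // Written when cleaning this segment fails; an operator deletes the marker
        // after repairing the segment (e.g. copying it from a replica) so the
        // cleaner retries on its next pass.
        static void markSkipped(File segmentFile) throws IOException {
            new File(segmentFile.getPath() + ".skip").createNewFile();
        }
    }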

On 7/25/18 11:55 AM, Dhruvil Shah wrote:

For the cleaner thread specifically, I do not think respawning will help at
all because we are more than likely to run into the same issue again which
would end up crashing the cleaner. Retrying makes sense for transient
errors or when you believe some part of the system could have healed
itself, both of which I think are not true for the log cleaner.

On Wed, Jul 25, 2018 at 11:08 AM Ron Dagostino  wrote:


<< wrote:


On Mon, Jul 23, 2018, at 23:20, James Cheng wrote:

Hi Stanislav! Thanks for this KIP!

I agree that it would be good if the LogCleaner were more tolerant of
errors. Currently, as you said, once it dies, it stays dead.

Things are better now than they used to be. We have the metric
   kafka.log:type=LogCleanerManager,name=time-since-last-run-ms
which we can use to tell us if the threads are dead. And as of 1.1.0,

we

have KIP-226, which allows you to restart the log cleaner thread,
without requiring a broker restart.


https://cwiki.apache.org/confluence/display/KAFKA/KIP-226+-+Dynamic+Broker+Configuration

<

https://cwiki.apache.org/confluence/display/KAFKA/KIP-226+-+Dynamic+Broker+Configuration



I've only read about this, I haven't personally tried it.

Thanks for pointing this out, James!  Stanislav, we should probably add a
sentence or two mentioning the KIP-226 changes somewhere in the KIP.

Maybe

in the intro section?

I think it's clear that requiring the users to manually restart the log
cleaner is not a very good solution.  But it's good to know that it's a
possibility on some older releases.


Some comments:
* I like the idea of having the log cleaner continue to clean as many
partitions as it can, skipping over the problematic ones if possible.

* If the log cleaner thread dies, I think it should automatically be
revived. Your KIP attempts to do that by catching exceptions during
execution, but I think we should go all the way and make sure that a

new

one gets created, if the thread ever dies.

This is inconsistent with the way the rest of Kafka works.  We don't
automatically re-create other threads in the broker if they terminate.

In

general, if there is a serious bug in the code, respawning threads is
likely to make things worse, by putting you in an infinite loop which
consumes resources and fires off continuous log messages.


* It might be worth trying to re-clean the uncleanable partitions. I've
seen cases where an uncleanable partition later became cleanable. I
unfortunately don't remember how that happened, but I remember being
surprised when I discovered it. It might have been something like a
follower was uncleanable but after a leader election happened, the log
truncated and it was then cleanable again. I'm not sure.

James, I disagree.  We had this behavior in the Hadoop Distributed File
System (HDFS) and it was a constant source of user problems.

What would happen is disks would just go bad over time.  The DataNode
would notice this and take them offline.  But then, due to some
"optimistic" code, the DataNode would periodically try to re-add them to
the system.  Then one of two things would happen: the disk would just

fail

immediately 

Build failed in Jenkins: kafka-trunk-jdk10 #318

2018-07-26 Thread Apache Jenkins Server
See 

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H23 (ubuntu xenial) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
https://github.com/apache/kafka.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:888)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
at hudson.scm.SCM.checkout(SCM.java:504)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags 
--progress https://github.com/apache/kafka.git 
+refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: error: Could not read 0f3affc0f40751dc8fd064b36b6e859728f63e37
error: Could not read 95fbb2e03f4fe79737c71632e0ef2dfdcfb85a69
error: Could not read 08c465028d057ac23cdfe6d57641fe40240359dd
remote: Counting objects: 6057, done.
[git fetch progress counters (compressing/receiving objects) omitted; remainder of output truncated]

Re: [DISCUSS] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-07-26 Thread Vahid S Hashemian
Hi Jason,

That makes sense.
I have updated the KIP based on the recent feedback.

Thanks!
--Vahid




From:   Jason Gustafson 
To: dev 
Date:   07/25/2018 02:23 PM
Subject:Re: [DISCUSS] KIP-289: Improve the default group id 
behavior in KafkaConsumer



Hi Vahid,

I was thinking we'd only use the old API version if we had to. That is,
only if the user has explicitly configured "" as the group.id. Otherwise,
we'd just use the new one. Another option is to just drop support in the
client for the empty group id, but usually we allow a deprecation period
for changes like this.

-Jason
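
For anyone following along, a rough sketch of what a consumer with no configured
group.id looks like under this proposal (broker address and topic are placeholders);
it has to use manual assignment, since group management and offset commits require
a valid group id:

{noformat}
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class GrouplessConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // note: no group.id -- under the KIP this means "no group", not the legacy "" group
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(Collections.singletonList(new TopicPartition("my-topic", 0)));
            consumer.seekToBeginning(consumer.assignment());
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            // without a valid group id the application tracks its own positions
            System.out.println("read " + records.count() + " records");
        }
    }
}
{noformat}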

On Wed, Jul 25, 2018 at 12:49 PM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> Hi Jason,
>
> Thanks for additional clarification.
>
> So the next version of the OffsetCommit API will return an
> INVALID_GROUP_ID error for empty group ids; but on the client side we 
call
> the older version of the client until the next major release.
> The table below should summarize this.
>
> +-------+-----------+---------+------------------------+------------------+
> |       |           |             Client (group.id="")                    |
> |       |           +---------+------------------------+------------------+
> |       |           | pre-2.1 | 2.1                    | 3.0              |
> +-------+-----------+---------+------------------------+------------------+
> |  API  | V5 (cur.) | works   | works                  | works            |
> |       +-----------+---------+------------------------+------------------+
> |       | V6        | N/A     | N/A (calls V5/warning) | INVALID_GROUP_ID |
> +-------+-----------+---------+------------------------+------------------+
>
> Assumptions:
> * 2.1: The target release version for this KIP
> * 3.0: The next major release
>
> Please advise if you see an issue; otherwise, I'll update the KIP
> accordingly.
>
> Thanks!
> --Vahid
>
>
>
>
> From:   Jason Gustafson 
> To: dev 
> Date:   07/25/2018 12:08 AM
> Subject:***UNCHECKED*** Re: [DISCUSS] KIP-289: Improve the 
default
> group idbehavior in KafkaConsumer
>
>
>
> Hey Vahid,
>
> Sorry for the confusion. I think we all agree that going forward, we
> shouldn't support the empty group id, so the question is just around
> compatibility. I think we have to bump the OffsetCommit API version so
> that
> old clients which are unknowingly depending on the default empty group 
id
> will continue to work with new brokers. For new versions of the client, 
we
> can either drop support for the empty group id immediately or we can 
give
> users a grace period. I was thinking we would do the latter. We can 
change
> the default group.id, but in the case that a user has explicitly
> configured
> the empty group, then we can just use an old version of the OffsetCommit
> API which still supports it. In a future release, we can drop this 
support
> and only use the latest OffsetCommit version. Does that make sense?
>
> Thanks,
> Jason
>
>
> On Tue, Jul 24, 2018 at 12:36 PM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com> wrote:
>
> > Hi Jason,
> >
> > Thanks for clarifying.
> >
> > So if we are going to continue supporting the empty group id as before
> > (with only an addition of a deprecation warning), and disable
> > enable.auto.commit for the new default (null) group id on the client
> side,
> > do we really need to bump up the OffsetCommit version?
> >
> > You mentioned "If an explicit empty string is configured for the group
> id,
> > then maybe we keep the current behavior for compatibility" which makes
> > sense to me, but I find it in conflict with your earlier suggestion 
"we
> > just need to bump the OffsetCommit request API and only accept the
> offset
> > commit for older versions.". Maybe I'm missing something?
> >
> > Thanks!
> > --Vahid
> >
> >
> >
> >
> > From:   Jason Gustafson 
> > To: dev 
> > Date:   07/23/2018 10:52 PM
> > Subject:Re: [DISCUSS] KIP-289: Improve the default group id
> > behavior in KafkaConsumer
> >
> >
> >
> > Hey Vahid,
> >
> > Thanks for the updates. Just to clarify, I was suggesting that we
> disable
> > enable.auto.commit only if no explicit group.id is configured. If an
> > explicit empty string is configured for the group id, then maybe we 
keep
> > the current behavior for compatibility. We can log a warning 
mentioning
> > the
> > deprecation and we can use the old version of the OffsetCommit API 
that
> > allows the empty group id. In a later release, we can drop this 
support
> in
> > the client. Does that seem reasonable?
> >
> > By the way, instead of using the new ILLEGAL_OFFSET_COMMIT error code,
> > couldn't we use INVALID_GROUP_ID?
> >
> > Thanks,
> > Jason
> >
> >
> >
> > On Mon, Jul 23, 2018 at 5:14 PM, Stanislav Kozlovski
> >  > > wrote:
> >
> > > Hey Vahid,
> > >
> > > No I don't see an issue with it. I believe it to be the best 
approach.
> > >
> > > Best,
> > > Stanislav
> > >
> > > On Mon, Jul 23, 2018 at 12:41 PM Vahid S Hashemian <
> > > vahidhashem...@us.ibm.com> wrote:
> > >
> > > > Hi Stanislav,
> > > >
> > > > Thanks 

Build failed in Jenkins: kafka-trunk-jdk10 #317

2018-07-26 Thread Apache Jenkins Server
See 

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H23 (ubuntu xenial) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
https://github.com/apache/kafka.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:888)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
at hudson.scm.SCM.checkout(SCM.java:504)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags 
--progress https://github.com/apache/kafka.git 
+refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: error: Could not read 0f3affc0f40751dc8fd064b36b6e859728f63e37
error: Could not read 95fbb2e03f4fe79737c71632e0ef2dfdcfb85a69
error: Could not read 08c465028d057ac23cdfe6d57641fe40240359dd
remote: Counting objects: 6057, done.
[git fetch progress counters (compressing/receiving objects) omitted; remainder of output truncated]

Build failed in Jenkins: kafka-trunk-jdk10 #316

2018-07-26 Thread Apache Jenkins Server
See 


Changes:

[jason] KAFKA-5886; Introduce delivery.timeout.ms producer config (KIP-91)

[github] MINOR: Caching layer should forward record timestamp (#5423)

--
[...truncated 1.54 MB...]
kafka.utils.JsonTest > testEncodeAsBytes PASSED

kafka.utils.JsonTest > testEncodeAsString STARTED

kafka.utils.JsonTest > testEncodeAsString PASSED

kafka.utils.ShutdownableThreadTest > testShutdownWhenCalledAfterThreadStart 
STARTED

kafka.utils.ShutdownableThreadTest > testShutdownWhenCalledAfterThreadStart 
PASSED

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask STARTED

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask STARTED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask PASSED

kafka.utils.SchedulerTest > testNonPeriodicTask STARTED

kafka.utils.SchedulerTest > testNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testRestart STARTED

kafka.utils.SchedulerTest > testRestart PASSED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler STARTED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler PASSED

kafka.utils.SchedulerTest > testPeriodicTask STARTED

kafka.utils.SchedulerTest > testPeriodicTask PASSED

kafka.utils.LoggingTest > testLog4jControllerIsRegistered STARTED

kafka.utils.LoggingTest > testLog4jControllerIsRegistered PASSED

kafka.utils.LoggingTest > testLogName STARTED

kafka.utils.LoggingTest > testLogName PASSED

kafka.utils.LoggingTest > testLogNameOverride STARTED

kafka.utils.LoggingTest > testLogNameOverride PASSED

kafka.utils.ZkUtilsTest > testGetSequenceIdMethod STARTED

kafka.utils.ZkUtilsTest > testGetSequenceIdMethod PASSED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testGetAllPartitionsTopicWithoutPartitions STARTED

kafka.utils.ZkUtilsTest > testGetAllPartitionsTopicWithoutPartitions PASSED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath STARTED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath PASSED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing STARTED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing PASSED

kafka.utils.ZkUtilsTest > testGetLeaderIsrAndEpochForPartition STARTED

kafka.utils.ZkUtilsTest > testGetLeaderIsrAndEpochForPartition PASSED

kafka.utils.CoreUtilsTest > testGenerateUuidAsBase64 STARTED

kafka.utils.CoreUtilsTest > testGenerateUuidAsBase64 PASSED

kafka.utils.CoreUtilsTest > testAbs STARTED

kafka.utils.CoreUtilsTest > testAbs PASSED

kafka.utils.CoreUtilsTest > testReplaceSuffix STARTED

kafka.utils.CoreUtilsTest > testReplaceSuffix PASSED

kafka.utils.CoreUtilsTest > testCircularIterator STARTED

kafka.utils.CoreUtilsTest > testCircularIterator PASSED

kafka.utils.CoreUtilsTest > testReadBytes STARTED

kafka.utils.CoreUtilsTest > testReadBytes PASSED

kafka.utils.CoreUtilsTest > testCsvList STARTED

kafka.utils.CoreUtilsTest > testCsvList PASSED

kafka.utils.CoreUtilsTest > testReadInt STARTED

kafka.utils.CoreUtilsTest > testReadInt PASSED

kafka.utils.CoreUtilsTest > testAtomicGetOrUpdate STARTED

kafka.utils.CoreUtilsTest > testAtomicGetOrUpdate PASSED

kafka.utils.CoreUtilsTest > testUrlSafeBase64EncodeUUID STARTED

kafka.utils.CoreUtilsTest > testUrlSafeBase64EncodeUUID PASSED

kafka.utils.CoreUtilsTest > testCsvMap STARTED

kafka.utils.CoreUtilsTest > testCsvMap PASSED

kafka.utils.CoreUtilsTest > testInLock STARTED

kafka.utils.CoreUtilsTest > testInLock PASSED

kafka.utils.CoreUtilsTest > testTryAll STARTED

kafka.utils.CoreUtilsTest > testTryAll PASSED

kafka.utils.CoreUtilsTest > testSwallow STARTED

kafka.utils.CoreUtilsTest > testSwallow PASSED

kafka.utils.IteratorTemplateTest > testIterator STARTED

kafka.utils.IteratorTemplateTest > testIterator PASSED

kafka.utils.json.JsonValueTest > testJsonObjectIterator STARTED

kafka.utils.json.JsonValueTest > testJsonObjectIterator PASSED

kafka.utils.json.JsonValueTest > testDecodeLong STARTED

kafka.utils.json.JsonValueTest > testDecodeLong PASSED

kafka.utils.json.JsonValueTest > testAsJsonObject STARTED

kafka.utils.json.JsonValueTest > testAsJsonObject PASSED

kafka.utils.json.JsonValueTest > testDecodeDouble STARTED

kafka.utils.json.JsonValueTest > testDecodeDouble PASSED

kafka.utils.json.JsonValueTest > testDecodeOption STARTED

kafka.utils.json.JsonValueTest > testDecodeOption PASSED

kafka.utils.json.JsonValueTest > testDecodeString STARTED

kafka.utils.json.JsonValueTest > testDecodeString PASSED

kafka.utils.json.JsonValueTest > testJsonValueToString STARTED

kafka.utils.json.JsonValueTest > testJsonValueToString PASSED

kafka.utils.json.JsonValueTest > 

[jira] [Created] (KAFKA-7211) MM should handle timeouts in commitSync

2018-07-26 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-7211:
--

 Summary: MM should handle timeouts in commitSync
 Key: KAFKA-7211
 URL: https://issues.apache.org/jira/browse/KAFKA-7211
 Project: Kafka
  Issue Type: Improvement
Reporter: Jason Gustafson


Now that we have KIP-266, the user can override `default.api.timeout.ms` for 
the consumer so that commitSync does not block indefinitely. MM needs to be 
updated to handle TimeoutException. We may also need some logic to handle 
deleted topics. If MM attempts to commit an offset for a deleted topic, the 
call will time out, and we should probably check whether the topic exists and remove 
the offset if it doesn't.
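
A rough sketch of the kind of handling this asks for (the helper name is hypothetical, 
not MM's actual code; the explicit Duration overload is the KIP-266 addition):

{noformat}
import java.time.Duration;
import java.util.Map;
import java.util.Set;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.errors.TimeoutException;

public class MirrorMakerCommitSketch {
    static void commitWithTimeoutHandling(KafkaConsumer<byte[], byte[]> consumer,
                                          Map<TopicPartition, OffsetAndMetadata> offsets) {
        try {
            // bounded either by default.api.timeout.ms or by an explicit timeout
            consumer.commitSync(offsets, Duration.ofSeconds(30));
        } catch (TimeoutException e) {
            // the topic may have been deleted: drop offsets for topics that no longer exist
            // (offsets must be a mutable map for removeIf to work)
            Set<String> existing = consumer.listTopics().keySet();
            offsets.keySet().removeIf(tp -> !existing.contains(tp.topic()));
            // then retry or give up according to MM's error-handling policy
        }
    }
}
{noformat}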





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Processor API StateStore and Recovery with State Machines question.

2018-07-26 Thread Adam Bellemare
Thanks Guozhang, I appreciate the explanation. That cleared up a lot for me
and confirmed what I thought.

Based on the above, in my scenario, the state machine could get into an
incorrect state. (Just for whoever may be watching this thread).
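
For anyone following along, a minimal sketch of the state machine described in the
quoted message below, written against the Processor API (the class and the store name
"block-state" are made up for illustration):

{noformat}
import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class AlternateBlockProcessor extends AbstractProcessor<String, Integer> {
    private KeyValueStore<String, Boolean> blockState;

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        super.init(context);
        // "block-state" must be registered with the topology, with changelogging enabled
        this.blockState = (KeyValueStore<String, Boolean>) context.getStateStore("block-state");
    }

    @Override
    public void process(String key, Integer value) {
        Boolean blockNext = blockState.get(key);
        boolean block = blockNext != null && blockNext;
        if (!block) {
            context().forward(key, value);  // (A,1) and (A,3) pass through
        }                                   // (A,2) is dropped
        blockState.put(key, !block);        // flip the per-key state
    }
}
{noformat}

As discussed in the quoted explanation below, the store write, its changelog record,
and the source offset commit are only reconciled at commit time, so with at-least-once
this per-key flag can be replayed and flip differently after a crash.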

On Tue, Jul 24, 2018 at 6:20 PM, Guozhang Wang  wrote:

> Hello Adam,
>
> I figured that rather than answering your questions one-by-one, I'd give
> you a more general explanation of how consumer offset commits, the changelog,
> and the state store relate to each other.
>
> If you have a state store update processor, the state store maintenance
> workflow is this:
>
>
> 1) updating the state store:
>
> 1.a write to state store.
> 1.b write to changelog topic
>
>
> Note that 1.a) could be async: the state store may have caching enabled, and
> the state store itself may have a write buffer (e.g. RocksDB); 1.b) is also
> async with batching enabled, and the actual send request is done via another
> thread.
>
> So at the end of 1.b) any of the following is possible: the data has been written
> persistently to the local files of the state store but not yet sent to the changelog,
> or it has been sent to the changelog but not yet written persistently to local files,
> or both have happened, or neither has happened.
>
>
> 2) committing the state store:
>
> 2.a) flush state store (make sure all previous writes have been persisted)
> 2.b) flush on producer (make sure all previous writes to changelog topics
> have been acknowledged).
> 2.c) commit offset.
>
> That is, if committing succeeded, by the end of 2.c) all should be done,
> and everything is consistent.
>
> Now if there is a crash after 1.b) and before 2), then as said above, any of
> these scenarios may happen, but note that the consumer's offset will definitely NOT
> be committed yet (that only happens in 2.c) ), so upon restarting the
> data will be re-processed, and hence either the state store's image or the
> changelog may contain duplicated results, aka "at-least-once".
>
> 3) Finally, when exactly-once is enabled, if there are any crashes, the
> changelog topic / state store will be "rewound" (I omit the implementation
> details here, but just assume that logically, we can rewind them) to the
> previously successful commit, so `exactly-once` is guaranteed.
>
>
> Guozhang
>
> On Sun, Jul 22, 2018 at 5:29 PM, Adam Bellemare 
> wrote:
>
> > Hi Folks
> >
> > I have a quick question about a scenario that I would appreciate some
> > insight on. This is related to a KIP I am working on, but I wanted to
> break
> > this out into its own scenario to reach a wider audience. In this
> scenario,
> > I am using builder.internalTopologyBuilder to create the following
> within
> > the internals of Kafka Streaming:
> >
> > 1) Internal Topic Source (builder.internalTopologyBuilder.addSource(...)
> )
> >
> > 2) ProcessorSupplier with StateStore, Changelogging enabled. For the
> > purpose of this question, this processor is a very simple state machine.
> > All it does is alternately block every other event, of a given key, from
> > processing. For instance:
> > (A,1)
> > (A,2)
> > (A,3)
> > It would block the propagation of (A,2). The state of the system after
> > processing each event is:
> > blockNext = true
> > blockNext = false
> > blockNext = true
> >
> > The expectation is that this component would always block the same event,
> in
> > any failure mode and subsequent recovery (ie: ALWAYS blocks (A,2), but
> not
> > (A,1) or (A,3) ). In other words, it would maintain perfect state in
> > accordance with the offsets of the upstream and downstream elements.
> >
> > 3) The third component is a KTable with a Materialized StateStore where I
> > want to sink the remaining events. It is also backed by a change log. The
> > events arriving would be:
> > (A,1)
> > (A,3)
> >
> > The components are ordered as:
> > 1 -> 2 -> 3
> >
> >
> > Note that I am keeping the state machine in a separate state store. My
> main
> > questions are:
> >
> > 1) Will this workflow be consistent in all manners of failure? For
> example,
> > are the state stores change logs fully written to internal topics before
> > the offset is updated for the consumer in #1?
> >
> > 2) Is it possible that one State Store with changelogging will be logged
> to
> > Kafka safely (say component #3) but the other (#2) will not be, prior to
> a
> > sudden, hard termination of the node?
> >
> > 3) Is the alternate possible, where #2 is backed up to its Kafka Topic
> but
> > #3 is not? Does the ordering of the topology matter in this case?
> >
> > 4) Is it possible that the state store #2 is updated and logged, but the
> > source topic (#1) offset is not updated?
> >
> > In all of these cases, my main concern is keeping the state and the
> > expected output consistent. For any failure mode, will I be able to
> recover
> > to a fully consistent state given the requirements of the state machine
> in
> > #2?
> >
> > Though this is a trivial example, I am not certain about the dynamics
> > between maintaining state, 

[jira] [Created] (KAFKA-7210) Add system test for log compaction

2018-07-26 Thread Manikumar (JIRA)
Manikumar created KAFKA-7210:


 Summary: Add system test for log compaction
 Key: KAFKA-7210
 URL: https://issues.apache.org/jira/browse/KAFKA-7210
 Project: Kafka
  Issue Type: Improvement
Reporter: Manikumar
Assignee: Manikumar
 Fix For: 2.1.0


Currently we have the TestLogCleaning tool for stress testing log compaction. This 
JIRA is to integrate the tool into the system tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7126) Reduce number of rebalance for large consumer groups after a topic is created

2018-07-26 Thread Dong Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin resolved KAFKA-7126.
-
Resolution: Fixed

> Reduce number of rebalance for large consumer groups after a topic is created
> -
>
> Key: KAFKA-7126
> URL: https://issues.apache.org/jira/browse/KAFKA-7126
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Dong Lin
>Assignee: Jon Lee
>Priority: Major
> Fix For: 2.0.0, 2.1.0
>
> Attachments: 1.diff
>
>
> For a group of 200 MirrorMaker consumers with pattern-based topic 
> subscription, a single topic creation caused 50 rebalances for each of these 
> consumers over a 5 minute period. This causes the MM to significantly lag 
> behind during this 5 minute period and the clusters may be considerably 
> out-of-sync during this period.
> Ideally we would like to trigger only 1 rebalance in the MM group after a 
> topic is created. And conceptually it should be doable.
>  
> Here is the explanation of this repeated consumer rebalance based on the 
> consumer rebalance logic in the latest Kafka code:
> 1) A topic of 10 partitions are created in the cluster and it matches the 
> subscription pattern of the MM consumers.
> 2) The leader of the MM consumer group detects the new topic after metadata 
> refresh. It triggers rebalance.
> 3) At time T0, the first rebalance finishes. 10 consumers are assigned 1 
> partition of this topic. The other 190 consumers are not assigned any 
> partition of this topic. At this moment, the newly created topic will appear 
> in `ConsumerCoordinator.subscriptions.subscription` for those consumers who 
> are assigned a partition of this topic or who have refreshed metadata before 
> time T0.
> 4) In the common case, half of the consumers have refreshed metadata before 
> the leader of the consumer group refreshed metadata. Thus around 100 + 10 = 
> 110 consumers have the newly created topic in 
> `ConsumerCoordinator.subscriptions.subscription`. The other 90 consumers do 
> not have this topic in `ConsumerCoordinator.subscriptions.subscription`.
> 5) For those 90 consumers, if any consumer refreshes metadata, it will add 
> this topic to `ConsumerCoordinator.subscriptions.subscription`, which causes 
> `ConsumerCoordinator.rejoinNeededOrPending()` to return true and triggers 
> another rebalance. If a few consumers refresh metadata almost at the same 
> time, they will jointly trigger one rebalance. Otherwise, they each trigger a 
> separate rebalance.
> 6) The default metadata.max.age.ms is 5 minutes. Thus in the worst case, 
> which is probably also the average case if the number of consumers in the group 
> is large, the latest consumer will refresh its metadata 5 minutes after T0, 
> and the rebalance will be repeated during this 5 minute interval.
>  
>  
>  
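
For reference, a minimal sketch of the consumer setup the description covers: pattern
subscription plus metadata.max.age.ms, which bounds when each member notices a newly
created topic (endpoint, group id and pattern are placeholders):

{noformat}
import java.time.Duration;
import java.util.Collection;
import java.util.Properties;
import java.util.regex.Pattern;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class PatternSubscriberSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "mm-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
        // each member only notices a new matching topic on its own metadata refresh,
        // which happens at most metadata.max.age.ms (default 5 minutes) after creation
        props.put(ConsumerConfig.METADATA_MAX_AGE_CONFIG, "300000");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Pattern.compile(".*"), new ConsumerRebalanceListener() {
                @Override
                public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                    System.out.println("rebalance: revoked " + partitions.size() + " partitions");
                }
                @Override
                public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                    System.out.println("rebalance: assigned " + partitions.size() + " partitions");
                }
            });
            consumer.poll(Duration.ofSeconds(1));
        }
    }
}
{noformat}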



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7209) Kafka stream does not rebalance when one node gets down

2018-07-26 Thread Yogesh BG (JIRA)
Yogesh BG created KAFKA-7209:


 Summary: Kafka stream does not rebalance when one node gets down
 Key: KAFKA-7209
 URL: https://issues.apache.org/jira/browse/KAFKA-7209
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.11.0.1
Reporter: Yogesh BG


I use Kafka Streams 0.11.0.1 with session timeout 6, retries set to Integer.MAX_VALUE, 
and the default backoff time.

I have 3 nodes running a Kafka cluster of 3 brokers,
and I am running 3 Kafka Streams instances with the same application.id;
each node has one broker and one Kafka Streams application.
Everything works fine during setup.
I bring down one node, so one Kafka broker and one streaming app are down.
Now I see exceptions in the other two streaming apps and the group never gets 
rebalanced; I waited for hours and it never comes back to normal.
Is there anything I am missing?
I also tried calling stream.close, cleanup and restart when one broker is down, 
but this also doesn't help.
Can anyone help me?

One thing I observed lately is that Kafka topics with one partition get reassigned, 
but I have topics with 16 partitions and replication factor 3, and they never settle.
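
For reference, a sketch of how the overrides mentioned above (session timeout, retries) 
can be passed through StreamsConfig; the application id, broker list, and values are 
placeholders, not a recommendation:

{noformat}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsResilienceConfigSketch {
    public static Properties buildConfig() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG,
                "broker1:9092,broker2:9092,broker3:9092");
        // consumer-side session timeout (the report mentions raising this)
        props.put(StreamsConfig.consumerPrefix(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG), 60000);
        // producer-side retries set very high so transient broker loss keeps being retried
        props.put(StreamsConfig.producerPrefix(ProducerConfig.RETRIES_CONFIG), Integer.MAX_VALUE);
        return props;
    }
}
{noformat}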



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-4690) IllegalStateException using DeleteTopicsRequest when delete.topic.enable=false

2018-07-26 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-4690.
--
Resolution: Duplicate

Resolving as duplicate of KAFKA-5975.

> IllegalStateException using DeleteTopicsRequest when delete.topic.enable=false
> --
>
> Key: KAFKA-4690
> URL: https://issues.apache.org/jira/browse/KAFKA-4690
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.0
> Environment: OS X
>Reporter: Jon Chiu
>Assignee: Manikumar
>Priority: Major
> Attachments: delete-topics-request.java
>
>
> There is no indication as to why the delete request fails. Perhaps an error 
> code?
> This can be reproduced with the following steps:
> 1. Start ZK and 1 broker (with default {{delete.topic.enable=false}})
> 2. Create a topic test
> {noformat}
> bin/kafka-topics.sh --zookeeper localhost:2181 \
>   --create --topic test --partition 1 --replication-factor 1
> {noformat}
> 3. Delete topic by sending a DeleteTopicsRequest
> 4. An error is returned
> {noformat}
> org.apache.kafka.common.errors.DisconnectException
> {noformat}
> or 
> {noformat}
> java.lang.IllegalStateException: Attempt to retrieve exception from future 
> which hasn't failed
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.exception(RequestFuture.java:99)
>   at 
> io.confluent.adminclient.KafkaAdminClient.send(KafkaAdminClient.java:195)
>   at 
> io.confluent.adminclient.KafkaAdminClient.deleteTopic(KafkaAdminClient.java:152)
> {noformat}
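
For anyone hitting this with the Java AdminClient today, a sketch of the equivalent 
call and where the error surfaces (bootstrap address is a placeholder; the exact error 
you see depends on the broker version, per KAFKA-5975):

{noformat}
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

public class DeleteTopicSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            try {
                admin.deleteTopics(Collections.singleton("test")).all().get();
            } catch (ExecutionException e) {
                // on brokers with the KAFKA-5975 fix, delete.topic.enable=false surfaces a
                // clear topic-deletion-disabled error here instead of an opaque failure
                System.err.println("delete failed: " + e.getCause());
            }
        }
    }
}
{noformat}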



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7208) AlterConfigsRequest with null config value throws NullPointerException

2018-07-26 Thread Sean Policarpio (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Policarpio resolved KAFKA-7208.

Resolution: Not A Problem

Oops, realised I misunderstood how this API request works; ConfigEntries 
represents the entire map of overrides you want to apply for the resource 
(topic in this case). So to delete an override, it just needs to not be 
specified in the map submitted by the API, in which case the default value will 
take over.

The only issue here is that the documentation is still wrong, in that it says 
the config value is a nullable String.
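
For anyone else who lands here, a sketch of dropping an override with the Java 
AdminClient by resubmitting only the overrides to keep (broker address and the 
retention.ms entry are placeholders):

{noformat}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class DropTopicOverrideSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "foo");
            // alterConfigs replaces the topic's full override set: omitting
            // file.delete.delay.ms drops that override so the broker default applies again
            Config desired = new Config(Collections.singletonList(
                    new ConfigEntry("retention.ms", "604800000")));
            admin.alterConfigs(Collections.singletonMap(topic, desired)).all().get();
        }
    }
}
{noformat}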

> AlterConfigsRequest with null config value throws NullPointerException
> --
>
> Key: KAFKA-7208
> URL: https://issues.apache.org/jira/browse/KAFKA-7208
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, config, core
>Affects Versions: 1.1.1
>Reporter: Sean Policarpio
>Priority: Major
>
> The following exception is thrown from the Kafka server when an 
> AlterConfigsRequest API request is made via the binary protocol where the 
> CONFIG_ENTRY's CONFIG_VALUE is set to null:
> {noformat}
> [2018-07-26 15:53:01,487] ERROR [Admin Manager on Broker 1000]: Error 
> processing alter configs request for resource Resource(type=TOPIC, 
> name='foo'}, config 
> org.apache.kafka.common.requests.AlterConfigsRequest$Config@692d8300 
> (kafka.server.AdminManager)
> java.lang.NullPointerException
>   at java.util.Hashtable.put(Hashtable.java:459)
>   at java.util.Properties.setProperty(Properties.java:166)
>   at 
> kafka.server.AdminManager$$anonfun$alterConfigs$1$$anonfun$apply$18.apply(AdminManager.scala:357)
>   at 
> kafka.server.AdminManager$$anonfun$alterConfigs$1$$anonfun$apply$18.apply(AdminManager.scala:356)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at 
> kafka.server.AdminManager$$anonfun$alterConfigs$1.apply(AdminManager.scala:356)
>   at 
> kafka.server.AdminManager$$anonfun$alterConfigs$1.apply(AdminManager.scala:339)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at kafka.server.AdminManager.alterConfigs(AdminManager.scala:339)
>   at 
> kafka.server.KafkaApis.handleAlterConfigsRequest(KafkaApis.scala:1994)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:143)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> As a first guess, I'd say the issue is happening 
> [here|https://github.com/apache/kafka/blob/49db5a63c043b50c10c2dfd0648f8d74ee917b6a/core/src/main/scala/kafka/server/AdminManager.scala#L361],
>  since HashTable/Property can't take a null value.
> The reason I'm sending a null for the configuration value is I assumed that 
> since the API documentation says that the value is nullable (see 
> [here|http://kafka.apache.org/protocol.html#The_Messages_AlterConfigs] and 
> [here|https://github.com/apache/kafka/blob/49db5a63c043b50c10c2dfd0648f8d74ee917b6a/clients/src/main/java/org/apache/kafka/common/requests/AlterConfigsRequest.java#L53]),
>  this meant the (override) configuration itself would be deleted when a null 
> value was received.
> Contradictory to that, I did notice null checks throughout the code, like 
> [here|https://github.com/apache/kafka/blob/49db5a63c043b50c10c2dfd0648f8d74ee917b6a/clients/src/main/java/org/apache/kafka/common/requests/AlterConfigsRequest.java#L92].
> If null is in fact not meant to be accepted by the binary API, this probably 
> needs to be addressed at the protocol level. Further to that, can someone 
> show me how to then remove a configuration override via the binary API?
> For clarity's sake, here's what my request looked like (pseudo/Rust):
> {noformat}
> AlterConfigsRequest {
>   resources: [
> Resource
> {
>   resource_type: 2,
>   resource_name: "foo",
>   config_entries: [
> ConfigEntry
> {
>   config_name: "file.delete.delay.ms",
>   

Re: [VOTE] KIP-342 - Add support for custom SASL extensions in OAuthBearer authentication

2018-07-26 Thread Rajini Sivaram
Hi Stanislav,

Thanks for the KIP.

+1 (binding)

Regards,

Rajini


On Wed, Jul 25, 2018 at 6:41 PM, Ron Dagostino  wrote:

> +1 (Non-binding).  Thanks for the KIP and the PR, Stanislav.
>
> Ron
>
> On Wed, Jul 25, 2018 at 1:04 PM Stanislav Kozlovski <
> stanis...@confluent.io>
> wrote:
>
> > Hey everbody,
> >
> > I'd like to start a vote thread for KIP-342 Add support for custom SASL
> > extensions in OAuthBearer authentication
> > <
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 342%3A+Add+support+for+Custom+SASL+extensions+in+
> OAuthBearer+authentication
> > >
> >
> > --
> > Best,
> > Stanislav
> >
>


Re: [DISCUSS] KIP-342 Add Customizable SASL extensions to OAuthBearer authentication

2018-07-26 Thread Rajini Sivaram
Looks good. Thanks, Stanislav.
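
For readers skimming the thread, a minimal client-side sketch of the unsecured setup
with a custom extension under this KIP (the extension name "traceId" and its value are
hypothetical; "auth" is reserved and would be rejected):

{noformat}
import java.util.Properties;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.config.SaslConfigs;

public class UnsecuredOAuthBearerExtensionSketch {
    public static Properties clientProps() {
        Properties props = new Properties();
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_PLAINTEXT");
        props.put(SaslConfigs.SASL_MECHANISM, "OAUTHBEARER");
        // unsecuredLoginStringClaim_sub sets the principal claim for unsecured tokens;
        // unsecuredLoginExtension_<name> is the new option this KIP adds
        props.put(SaslConfigs.SASL_JAAS_CONFIG,
                "org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required"
                        + " unsecuredLoginStringClaim_sub=\"alice\""
                        + " unsecuredLoginExtension_traceId=\"123\";");
        return props;
    }
}
{noformat}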

On Wed, Jul 25, 2018 at 7:46 PM, Stanislav Kozlovski  wrote:

> Hi Rajini,
>
> I updated the KIP. Please check if the clarification is okay
>
> On Wed, Jul 25, 2018 at 10:49 AM Rajini Sivaram 
> wrote:
>
> > Hi Stanislav,
> >
> > 1. Can you clarify the following line in the KIP in the 'Public
> Interfaces'
> > section? When you are reading the KIP for the first time, it sounds like
> we
> > adding a new Kafka config. But we are adding JAAS config options with a
> > prefix that can be used with the default unsecured bearer tokens. We
> could
> > include the example in this section or at least link to the example.
> >
> >- New config option for default, unsecured bearer tokens -
> >`unsecuredLoginExtension_`.
> >
> >
> > 2. Can you add the package for SaslExtensionsCallback class?
> >
> >
> > On Tue, Jul 24, 2018 at 10:03 PM, Stanislav Kozlovski <
> > stanis...@confluent.io> wrote:
> >
> > > Hi Ron,
> > >
> > > Thanks for the suggestions. I have applied them to the KIP.
> > >
> > > On Tue, Jul 24, 2018 at 1:39 PM Ron Dagostino 
> wrote:
> > >
> > > > Hi Stanislav.  The statement "New config option for
> > > OAuthBearerLoginModule"
> > > > is technically incorrect; it should be "New config option for
> default,
> > > > unsecured bearer tokens" since that is what provides the
> functionality
> > > (as
> > > > opposed to the login module, which does not).  Also, please state
> that
> > > > "auth" is not supported as a custom extension name with any
> > > > SASL/OAUTHBEARER mechanism, including the unsecured one, since it is
> > > > reserved by the spec for what is normally sent in the HTTP
> > Authorization
> > > > header; an attempt to use it will result in a configuration exception.
> > > >
> > > > Finally, please also state that while the OAuthBearerLoginModule and
> > the
> > > > OAuthBearerSaslClient will be changed to request the extensions from
> > its
> > > > callback handler, for backwards compatibility it is not necessary for
> > the
> > > > callback handler to support SaslExtensionsCallback -- any
> > > > UnsupportedCallbackException that is thrown will be ignored and no
> > > > extensions will be added.
> > > >
> > > > Ron
> > > >
> > > > On Tue, Jul 24, 2018 at 11:20 AM Stanislav Kozlovski <
> > > > stanis...@confluent.io>
> > > > wrote:
> > > >
> > > > > Hey everybody,
> > > > >
> > > > > I have updated the KIP to reflect the latest changes as best as I
> > > could.
> > > > If
> > > > > there aren't more suggestions, I intent to start the [VOTE] thread
> > > > > tomorrow.
> > > > >
> > > > > Best,
> > > > > Stanislav
> > > > >
> > > > > On Tue, Jul 24, 2018 at 6:34 AM Ron Dagostino 
> > > wrote:
> > > > >
> > > > > > Hi Stanislav.  Could you update the KIP to reflect the latest
> > > > definition
> > > > > of
> > > > > > SaslExtensions and confirm or correct the impact it has to the
> > > > > > SCRAM-related classes?  I'm not sure if the currently-described
> > > impact
> > > > is
> > > > > > still accurate.  Also, could you mention the changes to
> > > > > > OAuthBearerUnsecuredLoginCallbackHandler in the text in
> addition to
> > > > > giving
> > > > > > the examples?  The examples show the new
> > > > > > unsecuredLoginExtension_ feature, but that
> feature
> > is
> > > > not
> > > > > > described anywhere prior to it appearing there.
> > > > > >
> > > > > > Ron
> > > > > >
> > > > > > On Mon, Jul 23, 2018 at 1:42 PM Ron Dagostino  >
> > > > wrote:
> > > > > >
> > > > > > > Hi Rajini.  I think a class is fine as long as we make sure the
> > > > > semantics
> > > > > > > of immutability are clear -- it would have to be a value class,
> > and
> > > > any
> > > > > > > constructor that accepts a Map as input would have to copy that
> > Map
> > > > > > rather
> > > > > > > than store it in a member variable.  Similarly, any Map that it
> > > might
> > > > > > > return would have to be unmodifiable.
> > > > > > >
> > > > > > > Ron
> > > > > > >
> > > > > > > On Mon, Jul 23, 2018 at 12:24 PM Rajini Sivaram <
> > > > > rajinisiva...@gmail.com
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Hi Ron, Stanislav,
> > > > > > >>
> > > > > > >> I agree with Stanislav that it would be better to leave
> > > > > `SaslExtensions`
> > > > > > >> as
> > > > > > >> a class rather than make it an interface. We don''t really
> > expect
> > > > > users
> > > > > > to
> > > > > > >> extends this class, so it is convenient to have an
> > implementation
> > > > > since
> > > > > > >> users need to create an instance. The class provided by the
> > public
> > > > API
> > > > > > >> should be sufficient in the vast majority of the cases. Ron,
> do
> > > you
> > > > > > agree?
> > > > > > >>
> > > > > > >> On Mon, Jul 23, 2018 at 11:35 AM, Ron Dagostino <
> > > rndg...@gmail.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >> > Hi Stanislav.  See
> > > > https://tools.ietf.org/html/rfc7628#section-3.1,
> > > > > > and
> > > > > > >> > that section refers to the 

CVE-2018-1288: Authenticated Kafka clients may interfere with data replication

2018-07-26 Thread Rajini Sivaram
CVE-2018-1288: Authenticated Kafka clients may interfere with data
replication



Severity: Moderate



Vendor: The Apache Software Foundation



Versions Affected:

Apache Kafka 0.9.0.0 to 0.9.0.1, 0.10.0.0 to 0.10.2.1, 0.11.0.0 to
0.11.0.2, 1.0.0



Description:

Authenticated Kafka users may perform an action reserved for the broker via a
manually created fetch request, interfering with data replication and resulting
in data loss.



Mitigation:

Apache Kafka users should upgrade to one of the following versions where
this vulnerability has been fixed.


   - 0.10.2.2 or higher
   - 0.11.0.3 or higher
   - 1.0.1 or higher
   - 1.1.0 or higher



Acknowledgements:

We would like to thank Edoardo Comar and Mickael Maison for reporting this
issue and providing a resolution.



Regards,


Rajini


CVE-2017-12610: Authenticated Kafka clients may impersonate other users

2018-07-26 Thread Rajini Sivaram
CVE-2017-12610: Authenticated Kafka clients may impersonate other users


Severity: Moderate



Vendor: The Apache Software Foundation



Versions Affected:

Apache Kafka 0.10.0.0 to 0.10.2.1, 0.11.0.0 to 0.11.0.1



Description:

Authenticated Kafka clients may impersonate other users via a manually crafted
protocol message with SASL/PLAIN or SASL/SCRAM authentication when using
the built-in PLAIN or SCRAM server implementations in Apache Kafka.



Mitigation:

Apache Kafka users should upgrade to one of the following versions where
this vulnerability has been fixed:


   - 0.10.2.2 or higher
   - 0.11.0.2 or higher
   - 1.0.0 or higher



Acknowledgements:

This issue was reported by Rajini Sivaram.



Regards,


Rajini


Re: [Vote] KIP-321: Update TopologyDescription to better represent Source and Sink Nodes

2018-07-26 Thread Guozhang Wang
+1

On Wed, Jul 25, 2018 at 11:13 PM, Matthias J. Sax 
wrote:

> +1 (binding)
>
> -Matthias
>
> On 7/25/18 7:47 PM, Ted Yu wrote:
> > +1
> >
> > On Wed, Jul 25, 2018 at 7:24 PM Nishanth Pradeep 
> > wrote:
> >
> >> Hello,
> >>
> >> I'm calling a vote for KIP-321:
> >>
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-321%3A+Update+
> TopologyDescription+to+better+represent+Source+and+Sink+Nodes
> >>
> >> Best,
> >> Nishanth Pradeep
> >>
> >
>
>


-- 
-- Guozhang


Re: [Vote] KIP-321: Update TopologyDescription to better represent Source and Sink Nodes

2018-07-26 Thread Matthias J. Sax
+1 (binding)

-Matthias

On 7/25/18 7:47 PM, Ted Yu wrote:
> +1
> 
> On Wed, Jul 25, 2018 at 7:24 PM Nishanth Pradeep 
> wrote:
> 
>> Hello,
>>
>> I'm calling a vote for KIP-321:
>>
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-321%3A+Update+TopologyDescription+to+better+represent+Source+and+Sink+Nodes
>>
>> Best,
>> Nishanth Pradeep
>>
> 



signature.asc
Description: OpenPGP digital signature


[jira] [Created] (KAFKA-7208) AlterConfigsRequest with null config value throws NullPointerException

2018-07-26 Thread Sean Policarpio (JIRA)
Sean Policarpio created KAFKA-7208:
--

 Summary: AlterConfigsRequest with null config value throws 
NullPointerException
 Key: KAFKA-7208
 URL: https://issues.apache.org/jira/browse/KAFKA-7208
 Project: Kafka
  Issue Type: Bug
  Components: admin, config, core
Affects Versions: 1.1.1
Reporter: Sean Policarpio


The following exception is thrown from the Kafka server when an 
AlterConfigsRequest API request is made via the binary protocol where the 
CONFIG_ENTRY's CONFIG_VALUE is set to null:
{noformat}
[2018-07-26 15:53:01,487] ERROR [Admin Manager on Broker 1000]: Error processing alter configs request for resource Resource(type=TOPIC, name='foo'}, config org.apache.kafka.common.requests.AlterConfigsRequest$Config@692d8300 (kafka.server.AdminManager)
java.lang.NullPointerException
	at java.util.Hashtable.put(Hashtable.java:459)
	at java.util.Properties.setProperty(Properties.java:166)
	at kafka.server.AdminManager$$anonfun$alterConfigs$1$$anonfun$apply$18.apply(AdminManager.scala:357)
	at kafka.server.AdminManager$$anonfun$alterConfigs$1$$anonfun$apply$18.apply(AdminManager.scala:356)
	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
	at kafka.server.AdminManager$$anonfun$alterConfigs$1.apply(AdminManager.scala:356)
	at kafka.server.AdminManager$$anonfun$alterConfigs$1.apply(AdminManager.scala:339)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at kafka.server.AdminManager.alterConfigs(AdminManager.scala:339)
	at kafka.server.KafkaApis.handleAlterConfigsRequest(KafkaApis.scala:1994)
	at kafka.server.KafkaApis.handle(KafkaApis.scala:143)
	at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
	at java.lang.Thread.run(Thread.java:745)
{noformat}
As a first guess, I'd say the issue is happening 
[here|https://github.com/apache/kafka/blob/49db5a63c043b50c10c2dfd0648f8d74ee917b6a/core/src/main/scala/kafka/server/AdminManager.scala#L361],
 since HashTable/Property can't take a null value.

The reason I'm sending a null for the configuration value is I assumed that 
since the API documentation says that the value is nullable (see 
[here|http://kafka.apache.org/protocol.html#The_Messages_AlterConfigs] and 
[here|https://github.com/apache/kafka/blob/49db5a63c043b50c10c2dfd0648f8d74ee917b6a/clients/src/main/java/org/apache/kafka/common/requests/AlterConfigsRequest.java#L53]),
 this meant the (override) configuration itself would be deleted when a null 
value was received.

Contradictory to that, I did notice null checks throughout the code, like 
[here|https://github.com/apache/kafka/blob/49db5a63c043b50c10c2dfd0648f8d74ee917b6a/clients/src/main/java/org/apache/kafka/common/requests/AlterConfigsRequest.java#L92].

If null is in fact not meant to be accepted by the binary API, this probably 
needs to be addressed at the protocol level. Further to that, can someone show 
me how to then remove a configuration override via the binary API?

For clarity's sake, here's what my request looked like (pseudo/Rust):
{noformat}
AlterConfigsRequest { 
  resources: [ 
Resource { 
  resource_type: 2, 
  resource_name: "foo", 
  config_entries: [ 
ConfigEntry { 
  config_name: "file.delete.delay.ms", 
  config_value: None // serialized as a 16-bit '-1' 
} 
   ] 
  } 
 ], 
 validate_only: false 
}
{noformat}
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)