Re: [DISCUSS] KIP 141 - ProducerRecordBuilder Interface

2017-05-03 Thread Ewen Cheslack-Postava
Stephane,

VOTES are really on-demand based on the author, but obviously it's good to
come to some level of consensus in the DISCUSS thread before initiating a
vote. I think the request for comments/votes on your 3 options is a
reasonable way to gauge current opinions.

For myself, I think either 1 or 3 are good options, and I think at least
Matthias & Jay are in agreement -- basically have one preferred, but
possibly support 2 approaches for awhile.

I think 3 is the right way to go long term -- I don't expect so many more
built-in fields to be added, but then again I didn't expect this much churn
this quickly (headers were a surprise for me). We've gotten to enough
parameters that a builder is more effective. It sucks a bit for existing
users that rely on the constructors, but a full major release cycle (at the
minimum) is a pretty significant window, and we can always choose to extend
the window longer if we want to give people more time to transition. To me,
the biggest problem is all the tutorials and content that we *can't*
control -- there's a ton of code and tutorials out there that will still
reference the constructors, and those will last far longer than any
deprecation period we put in place.

-Ewen

On Wed, May 3, 2017 at 5:46 PM, Stephane Maarek <
steph...@simplemachines.com.au> wrote:

> How do votes works?
>
> I feel there are 3 options right here, and I’d like a pre vote before a
> real vote?
> 1) Adding constructors. Could get messy over time, especially with headers
> coming into play, and future possible improvement to the message format
> 2) Adding a builder / nicer looking API (like fluent) to help build a
> ProducerRecord in a safe way. Issue here are two ways of building a
> ProducerRecord can bring confusion
> 3) Same as 2), but deprecating all the constructors. May be too much of an
> aggressive strategy
>
>
> I’m happy to go over 2), update the docs, and tell people this is the
> “preferred” way. Won’t outdate all the literature on Kafka, but I feel this
> set people up for success in the future.
> Thoughts  / pre vote?
>
> On 3/5/17, 4:20 pm, "Ewen Cheslack-Postava"  wrote:
>
> I understand the convenience of pointing at a JIRA/PR, but can we put
> the
> concrete changes proposed in the JIRA (under "Proposed Changes"). I
> don't
> think voting on the KIP would be reasonable otherwise since the changes
> under vote could change arbitrarily...
>
> I'm increasingly skeptical of adding more convenience constructors --
> the
> current patch adds timestamps, we're about to add headers as well (for
> core, for Connect we have
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 145+-+Expose+Record+Headers+in+Kafka+Connect
> in flight). It just continues to get messier over time.
>
> I think builders in the right context are useful, as long as they
> exceed a
> certain number of parameters (SchemaBuilder in Connect is an artifact
> of
> that position). I don't think a transition period with 2 ways to
> construct
> an object is actually a problem -- if there's always an "all N
> parameters"
> version of the constructor, all other constructors are just convenience
> shortcuts, but the Builder provides a shorthand.
>
> I also agree w/ Ismael that deprecating to aggressively is bad -- we
> added
> the APIs instead of a builder and there's not any real maintenance
> cost, so
> why add the deprecation? I don't want to suggest actually adding such
> an
> annotation, but the real issue here is that one API will become
> "preferred"
> for some time.
>
> -Ewen
>
> On Tue, May 2, 2017 at 1:15 AM, Ismael Juma  wrote:
>
> > Hi Matthias,
> >
> > Deprecating widely used APIs is a big deal. Build warnings are a
> nuisance
> > and can potentially break the build for those who have a
> zero-warnings
> > policy (which is good practice). It creates a bunch of busy work for
> our
> > users and various resources like books, blog posts, etc. become out
> of
> > date.
> >
> > This does not mean that we should not do it, but the benefit has to
> be
> > worth it and we should not do it lightly.
> >
> > Ismael
> >
> > On Sat, Apr 29, 2017 at 6:52 PM, Matthias J. Sax <
> matth...@confluent.io>
> > wrote:
> >
> > > I understand that we cannot just break stuff (btw: also not for
> > > Streams!). But deprecating does not break anything, so I don't
> think
> > > it's a big deal to change the API as long as we keep the old API as
> > > deprecated.
> > >
> > >
> > > -Matthias
> > >
> > > On 4/29/17 9:28 AM, Jay Kreps wrote:
> > > > Hey Matthias,
> > > >
> > > > Yeah I agree, I'm not against change as a general thing! I also
> think
> > if
> > > > you look back on the last two years, we completely rewrote the
> producer
> > > and
> > > > consumer APIs, reworked the 

[jira] [Commented] (KAFKA-4923) Add Exactly-Once Semantics to Streams

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996186#comment-15996186
 ] 

ASF GitHub Bot commented on KAFKA-4923:
---

GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/2974

KAFKA-4923: Add Exaclty-Once Semantics to Streams (testing)

 - add broker compatibility system tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka 
kafka-4923-add-eos-to-streams-add-broker-check-and-system-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2974.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2974


commit f92b7168a572967656290a6ad75d4799823d4392
Author: Matthias J. Sax 
Date:   2017-05-03T18:28:59Z

KAFKA-4923: Add Exaclty-Once Semantics to Streams (testing)
 - add broker compatibility system tests




> Add Exactly-Once Semantics to Streams
> -
>
> Key: KAFKA-4923
> URL: https://issues.apache.org/jira/browse/KAFKA-4923
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>  Labels: kip
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-129%3A+Streams+Exactly-Once+Semantics



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2974: KAFKA-4923: Add Exaclty-Once Semantics to Streams ...

2017-05-03 Thread mjsax
GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/2974

KAFKA-4923: Add Exaclty-Once Semantics to Streams (testing)

 - add broker compatibility system tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka 
kafka-4923-add-eos-to-streams-add-broker-check-and-system-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2974.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2974


commit f92b7168a572967656290a6ad75d4799823d4392
Author: Matthias J. Sax 
Date:   2017-05-03T18:28:59Z

KAFKA-4923: Add Exaclty-Once Semantics to Streams (testing)
 - add broker compatibility system tests




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5171) TC should not accept empty string transactional id

2017-05-03 Thread Umesh Chaudhary (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996180#comment-15996180
 ] 

Umesh Chaudhary commented on KAFKA-5171:


[~guozhang] , Please review the PR. 

> TC should not accept empty string transactional id
> --
>
> Key: KAFKA-5171
> URL: https://issues.apache.org/jira/browse/KAFKA-5171
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Guozhang Wang
> Fix For: 0.11.0.0
>
>
> Currently on TC, both `null` and `empty string` will be accepted and a new 
> pid will be returned. However, on the producer client end empty string 
> transactional id is not allowed, and if user specifically set it with empty 
> string RTE will be thrown.
> We can make TC's behavior consistent with client to also reject empty string 
> transactional id.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2973: Kafka 5171:TC should not accept empty string trans...

2017-05-03 Thread umesh9794
GitHub user umesh9794 opened a pull request:

https://github.com/apache/kafka/pull/2973

Kafka 5171:TC should not accept empty string transactional id

This is an initial PR. Changed the unit tests accordingly as per the 
expectation from TC.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/umesh9794/kafka KAFKA-5171

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2973.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2973


commit f2b35a8e56d83301a42a09fef61a9ba752acce70
Author: umesh9794 
Date:   2017-04-28T05:14:04Z

KAFKA-5137 : Controlled shutdown timeout message improvement

commit b64a78d71d1f8723be556d12ee89401efcf5e6d2
Author: umesh chaudhary 
Date:   2017-05-02T05:50:29Z

KAFKA-5137 : Controlled shutdown timeout message improvement

commit 88528e0339f21d0616c7784cbdade9569cfa7927
Author: umesh chaudhary 
Date:   2017-05-04T02:46:26Z

Merge branch 'trunk' of https://github.com/apache/kafka into local

commit deac443b6779c100c4698071611d9d8d82b3b405
Author: umesh chaudhary 
Date:   2017-05-04T04:52:04Z

KAFKA-5171 : TC should not accept empty string transactional id

commit f86200785bd0ad63957ff2a654ac7ea8b7d1c94e
Author: umesh chaudhary 
Date:   2017-05-04T04:57:16Z

KAFKA-5171 : TC should not accept empty string transactional id




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk7 #2146

2017-05-03 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] KAFKA-5045: Clarify on KTable APIs for queryable stores

--
[...truncated 2.12 MB...]
:36:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 commitTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP,

  ^
:37:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 expireTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP) {

  ^
:524:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
  if (offsetAndMetadata.expireTimestamp == 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP)

^
:335:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
  if (partitionData.timestamp == 
OffsetCommitRequest.DEFAULT_TIMESTAMP)
 ^
:94:
 class ProducerConfig in package producer is deprecated: This class has been 
deprecated and will be removed in a future release. Please use 
org.apache.kafka.clients.producer.ProducerConfig instead.
val producerConfig = new ProducerConfig(props)
 ^
:95:
 method fetchTopicMetadata in object ClientUtils is deprecated: This method has 
been deprecated and will be removed in a future release.
fetchTopicMetadata(topics, brokers, producerConfig, correlationId)
^
:187:
 object ProducerRequestStatsRegistry in package producer is deprecated: This 
object has been deprecated and will be removed in a future release.
ProducerRequestStatsRegistry.removeProducerRequestStats(clientId)
^
:129:
 method readFromReadableChannel in class NetworkReceive is deprecated: see 
corresponding Javadoc for more information.
  response.readFromReadableChannel(channel)
   ^
:335:
 value timestamp in class PartitionData is deprecated: see corresponding 
Javadoc for more information.
  if (partitionData.timestamp == 
OffsetCommitRequest.DEFAULT_TIMESTAMP)
^
:338:
 value timestamp in class PartitionData is deprecated: see corresponding 
Javadoc for more information.
offsetRetention + partitionData.timestamp
^
:603:
 method offsetData in class ListOffsetRequest is deprecated: see corresponding 
Javadoc for more information.
val (authorizedRequestInfo, unauthorizedRequestInfo) = 
offsetRequest.offsetData.asScala.partition {
 ^
:603:
 class PartitionData in object ListOffsetRequest is deprecated: see 
corresponding Javadoc for more information.
val (authorizedRequestInfo, unauthorizedRequestInfo) = 
offsetRequest.offsetData.asScala.partition {

  ^
:608:
 constructor PartitionData in class 

Re: Exiting a streams app at end of stream?

2017-05-03 Thread Matthias J. Sax
Hi,

there is KIP for this idea, but nobody or working on it actively atm:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-95%3A+Incremental+Batch+Processing+for+Kafka+Streams


Right now, you can only check the committed offsets and terminate the
app accordingly.


-Matthias

On 5/3/17 5:32 AM, Thomas Becker wrote:
> We have had a number of situations where we need to migrate data in a
> Kafka topic to a new topic that is keyed differently. Stream processing
> is a good fit for this use-case with one exception: there is no easy
> way to know when your "migration job" is finished. Has any thought been
> given to adding an "end of stream" notion to Kafka Streams, and a
> corresponding mode to exit the application when all input streams have
> hit it?
> 
> --
> 
> 
> Tommy Becker
> 
> Senior Software Engineer
> 
> O +1 919.460.4747
> 
> tivo.com
> 
> 
> 
> 
> This email and any attachments may contain confidential and privileged 
> material for the sole use of the intended recipient. Any review, copying, or 
> distribution of this email (or any attachments) by others is prohibited. If 
> you are not the intended recipient, please contact the sender immediately and 
> permanently delete this email and any attachments. No employee or agent of 
> TiVo Inc. is authorized to conclude any binding agreement on behalf of TiVo 
> Inc. by email. Binding agreements with TiVo Inc. may only be made by a signed 
> written agreement.
> 



signature.asc
Description: OpenPGP digital signature


[jira] [Commented] (KAFKA-5172) CachingSessionStore doesn't fetchPrevious correctly.

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996038#comment-15996038
 ] 

ASF GitHub Bot commented on KAFKA-5172:
---

GitHub user KyleWinkelman opened a pull request:

https://github.com/apache/kafka/pull/2972

[KAFKA-5172] Fix fetchPrevious to find the correct session.

Change fetchPrevious to use findSessions with the proper key and timestamps 
rather than using fetch.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/KyleWinkelman/kafka 
CachingSessionStore-fetchPrevious

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2972.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2972


commit c09a1ace192877ef9f633e14288b680f374b3792
Author: Kyle Winkelman 
Date:   2017-05-04T02:03:50Z

Fix fetchPrevious to find the correct session.




> CachingSessionStore doesn't fetchPrevious correctly.
> 
>
> Key: KAFKA-5172
> URL: https://issues.apache.org/jira/browse/KAFKA-5172
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Kyle Winkelman
>
> When using KStreamSessionWindowAggregate by calling 
> KGroupedStream#aggregate() a CachingSessionStore is created.
> This causes the following chain of method calls when a new record that 
> requires removing others from the store appear:
> KStreamSessionWindowAggregate
> CachingSessionStore.remove(Windowed)
> CachingSessionStore.put(Windowed, V)
> ThreadCache.put(String, Bytes *containing Windowed info*, LRUCacheEntry)
> ThreadCache.maybeEvict(String)
> NamedCache.evict()
> NamedCache.flush(LRUNode *containing Bytes and LRUCacheEntry from 
> ThreadCache#put*)
> DirtyEntryFlushListener *defined in CachingSessionStore line 80* 
> .apply(ThreadCache.DirtyEntry *containing Bytes and LRUCacheEntry from 
> ThreadCache#put*)
> CachingSessionStore.putAndMaybeForward(ThreadCache.DirtyEntry *containing 
> Bytes and LRUCacheEntry from ThreadCache#put*, InternalProcessorContext)
> CachingSessionStore.fetchPrevious(Bytes *containing Windowed info*)
> RocksDBSessionStore.fetch(Bytes *containing Windowed info*)
> RocksDBSessionStore.findSessions *on line 48* (Bytes *containing Windowed 
> info*, 0, Long.MAX_VALUE)
> MeteredSegmentedByteStore.fetch(Bytes *containing Windowed info*, 0, 
> Long.MAX_VALUE)
> ChangeLoggingSegmentedByteStore.fetch(Bytes *containing Windowed info*, 0, 
> Long.MAX_VALUE)
> RocksDBSegmentedBytesStore.fetch(Bytes *containing Windowed info*, 0, 
> Long.MAX_VALUE)
> SessionKeySchema.lower/upperRange(Bytes *containing Windowed info*, Long)
> ** in this method the already Windowed gets Windowed again *
> The point of showing all this is to point out that the windowed gets windowed 
> and because it passes the 0, Long.MAX_VALUE it searches for a strange key and 
> searches all times for it. I think the fetchPrevious method of 
> CachingSessionStore should be changed to call the 
> byteStores.findSessions(Bytes.wrap(serdes.rawKey(key.key())), 
> key.window().start(), key.window().end()). 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2972: [KAFKA-5172] Fix fetchPrevious to find the correct...

2017-05-03 Thread KyleWinkelman
GitHub user KyleWinkelman opened a pull request:

https://github.com/apache/kafka/pull/2972

[KAFKA-5172] Fix fetchPrevious to find the correct session.

Change fetchPrevious to use findSessions with the proper key and timestamps 
rather than using fetch.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/KyleWinkelman/kafka 
CachingSessionStore-fetchPrevious

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2972.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2972


commit c09a1ace192877ef9f633e14288b680f374b3792
Author: Kyle Winkelman 
Date:   2017-05-04T02:03:50Z

Fix fetchPrevious to find the correct session.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-5172) CachingSessionStore doesn't fetchPrevious correctly.

2017-05-03 Thread Kyle Winkelman (JIRA)
Kyle Winkelman created KAFKA-5172:
-

 Summary: CachingSessionStore doesn't fetchPrevious correctly.
 Key: KAFKA-5172
 URL: https://issues.apache.org/jira/browse/KAFKA-5172
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Kyle Winkelman


When using KStreamSessionWindowAggregate by calling KGroupedStream#aggregate() 
a CachingSessionStore is created.
This causes the following chain of method calls when a new record that requires 
removing others from the store appear:

KStreamSessionWindowAggregate
CachingSessionStore.remove(Windowed)
CachingSessionStore.put(Windowed, V)
ThreadCache.put(String, Bytes *containing Windowed info*, LRUCacheEntry)
ThreadCache.maybeEvict(String)
NamedCache.evict()
NamedCache.flush(LRUNode *containing Bytes and LRUCacheEntry from 
ThreadCache#put*)
DirtyEntryFlushListener *defined in CachingSessionStore line 80* 
.apply(ThreadCache.DirtyEntry *containing Bytes and LRUCacheEntry from 
ThreadCache#put*)
CachingSessionStore.putAndMaybeForward(ThreadCache.DirtyEntry *containing Bytes 
and LRUCacheEntry from ThreadCache#put*, InternalProcessorContext)
CachingSessionStore.fetchPrevious(Bytes *containing Windowed info*)
RocksDBSessionStore.fetch(Bytes *containing Windowed info*)
RocksDBSessionStore.findSessions *on line 48* (Bytes *containing Windowed 
info*, 0, Long.MAX_VALUE)
MeteredSegmentedByteStore.fetch(Bytes *containing Windowed info*, 0, 
Long.MAX_VALUE)
ChangeLoggingSegmentedByteStore.fetch(Bytes *containing Windowed info*, 0, 
Long.MAX_VALUE)
RocksDBSegmentedBytesStore.fetch(Bytes *containing Windowed info*, 0, 
Long.MAX_VALUE)
SessionKeySchema.lower/upperRange(Bytes *containing Windowed info*, Long)
** in this method the already Windowed gets Windowed again *

The point of showing all this is to point out that the windowed gets windowed 
and because it passes the 0, Long.MAX_VALUE it searches for a strange key and 
searches all times for it. I think the fetchPrevious method of 
CachingSessionStore should be changed to call the 
byteStores.findSessions(Bytes.wrap(serdes.rawKey(key.key())), 
key.window().start(), key.window().end()). 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk8 #1479

2017-05-03 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] KAFKA-5045: Clarify on KTable APIs for queryable stores

--
[...truncated 1.65 MB...]

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete STARTED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
PASSED


[GitHub] kafka pull request #2971: [Minor] Remove use of aggSerde in RocksDBSessionSt...

2017-05-03 Thread KyleWinkelman
GitHub user KyleWinkelman opened a pull request:

https://github.com/apache/kafka/pull/2971

[Minor] Remove use of aggSerde in RocksDBSessionStore.

RocksDBSessionStore wasn't properly using the default aggSerde if no Serde 
was supplied.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/KyleWinkelman/kafka 
RocksDBSessionStore-fix-aggSerde-use

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2971.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2971


commit 20b624a83a1244cec39c0f9b423b40831ef2fd05
Author: Kyle Winkelman 
Date:   2017-05-04T01:23:23Z

Remove use of aggSerde in RocksDBSessionStore and substitue serdes




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #2970: Kafka-5160; KIP-98 Broker side support for TxnOffs...

2017-05-03 Thread apurvam
GitHub user apurvam opened a pull request:

https://github.com/apache/kafka/pull/2970

Kafka-5160; KIP-98 Broker side support for TxnOffsetCommitRequest

This patch adds support for the `TxnOffsetCommitRequest` added in KIP-98. 
Desired handling for this request is [described 
here](https://docs.google.com/document/d/11Jqy_GjUGtdXJK94XGsEIK7CP1SnQGdp2eF0wSw9ra8/edit#bookmark=id.55yzhvkppi6m)
 . 

The functionality includes handling the stable state of receiving 
`TxnOffsetCommitRequests` and materializing results only when the commit marker 
for the transaction is received. It also handles partition emigration and 
immigration and rebuilds the required data structures on these events.

Tests are included for the basic stable state functionality. Still need to 
add tests for the immigration/emigration functionality.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apurvam/kafka 
KAFKA-5160-broker-side-support-for-txnoffsetcommitrequest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2970.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2970


commit 925e104ca60bccb6a57ba96c1aaa4135f8734dab
Author: Apurva Mehta 
Date:   2017-05-02T22:07:04Z

WIP

commit 27760828c7672b31050cc0c02e9e6e3f6256a0db
Author: Apurva Mehta 
Date:   2017-05-03T06:59:32Z

WIP commit:

 1. Able to write txn offset commits to the log and delay
materialization until the commit marker appears.
 2. Able to recover these commits on partition load.

Todo:

 1. Tests for the above.
 2. Materialize or drop cached offset commits when the transaction
marker is received.

commit 00ab335cc0423c3544b052cccb5cb47e0e42dcc8
Author: Apurva Mehta 
Date:   2017-05-03T07:06:30Z

Small simplification

commit f8139073e7c25781b521c855420ba83b5a1c252a
Author: Apurva Mehta 
Date:   2017-05-03T07:07:54Z

fix indentation

commit 4a4bbd39ed71edc04ecd31c5000a1b83c73483d3
Author: Apurva Mehta 
Date:   2017-05-03T23:03:05Z

Code complete barring integration. Now to add unit tests

commit 8993b93d2e9565016d9be528b7451e1c0a865a35
Author: Apurva Mehta 
Date:   2017-05-04T01:00:25Z

Added the first test cases

commit 684ccea1b35a44c19e148f9c9b44cebd5bcb4fa9
Author: Apurva Mehta 
Date:   2017-05-04T01:16:10Z

Completed the functional test cases




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-5171) TC should not accept empty string transactional id

2017-05-03 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-5171:


 Summary: TC should not accept empty string transactional id
 Key: KAFKA-5171
 URL: https://issues.apache.org/jira/browse/KAFKA-5171
 Project: Kafka
  Issue Type: Sub-task
Reporter: Guozhang Wang


Currently on TC, both `null` and `empty string` will be accepted and a new pid 
will be returned. However, on the producer client end empty string 
transactional id is not allowed, and if user specifically set it with empty 
string RTE will be thrown.

We can make TC's behavior consistent with client to also reject empty string 
transactional id.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-146: Classloading Isolation in Connect

2017-05-03 Thread Stephane Maarek
Glad you find the feedback useful !
Definitely all the ideas should be split in reasonable length KIPs. I just want 
to make sure the ideas are not lost. I won’t create the subsequent KIPs because 
I’m not good enough to implement the changes, but happy to keep on providing 
feedback alongside the way. 

Regarding the versioning comments: yes there’s a version of Connector, but how 
is that referenced in the config? I believe no config exposes a “version” 
field, which would tie a configuration to a connector version?
Regarding shipping connect with a few connectors, that’s fine, but once a 
capability to pull from maven is here, I’d rather have a vanilla lightweight 
connect. Anyway, discussions for later. 
 
 

On 4/5/17, 4:17 am, "Konstantine Karantasis"  wrote:

Thank you Stephane,

your comments bring interesting and useful subjects to the discussion. I'm
adding my replies below Ewen's comments.


On Tue, May 2, 2017 at 10:15 PM, Ewen Cheslack-Postava 
wrote:

> On Tue, May 2, 2017 at 10:01 PM, Stephane Maarek <
> steph...@simplemachines.com.au> wrote:
>
> Excellent feedback, Stephane!
>
> Thanks for the work, it’s definitely needed!
> > I’d like to suggest to take it one step further.
> >
> > To me, I’d like to see Kafka Connect the same way we have Docker and
> > Docker repositories.
> >
> > Here’s how I would envision the flow:
> > - Kafka Connect workers are just workers. They come with no jars
> whatsoever
> > - The REST API allow you to add a config to the connect cluster
> > - The workers, seeing the config, pull the jars from the available
> > (maven?) repositories (public or private)
> >
>
> I think supporting this mode is really valuable. It seems *really*
> attractive if you have some easily accessible, scalable, centralized
> storage for connectors (i.e. some central distributed FS).
>
> But having to jump through these hoops (which presumably include some 
extra
> URLs to point to the connectors, some definition of what that URL points 
to
> e.g. zip of jars? uberjar? something with a more complex spec?) just to do
> a quickstart seems like quite a bit of overhead. I think we should 
consider
> both a) incremental progress that opens up new options but doesn't prevent
> us from the ideal end state and b) all local testing/dev/prod use cases
> (which is also why I still like having the plain old CLASSPATH option
> available).
>
> I think the proposal leaves open the scope for this -- it doesn't specify
> connector-specific overrides, but that's obviously something that could be
> added easily.
>
>
Regarding 1) the workers carrying no modules, I believe we are pretty close
to this situation today, at least in terms of connectors. Still, I feel
that whether we bundle a few basic modules with the framework is orthogonal
to class loading isolation. I agree with you that Connectors, Converters
and Transformations should be external modules that are loaded by the
framework. But having a few basic ones shipped with Connect simplifies
on-boarding and quickstarts significantly.

Regarding 2) and 3) these are very good to have, and definitely belong to
the near term vision for Kafka Connect. Many interesting things to do here,
from programmatically fetching connectors and their dependencies from maven
repos (using something like Aether maybe), to deciding to support certain
types of module bundling such as zip, uberjars etc. However dealing with
the issue of extended discoverability at this point seems to broaden
significantly the scope of this KIP. I think it's more practical (and
probably faster) to proceed in phases. I estimate that after this KIP,
subsequent KIPs will be intuitive and quite transparent.



> > - Classpath isolation comes into play so that the pulled jar doesn’t
> > interact with other connectors
> > - Additionally, I believe the config should have a “tag” or “version”
> > (like docker really), so that
> > o you can run different versions of the same connector on your connect
> > cluster
> > o configurations are strongly linked to a connector (right now if I
> update
> > my connector jars, I may break my configuration)
> >
>
> We have versions on Connectors. We don't have them on transformations or
> converters, which was definitely an oversight -- we should figure out how
> to get them in there.
>
> I think none of the proposals here rule out taking advantage of versioning
> in the future -- would it be helpful to add a "Future Work" section that
> gives an idea of how we'd extend this in the future to handle versions as
> well?
>
>
Dealing with versioning was left out of the scope of 

Re: [DISCUSS] KIP 141 - ProducerRecordBuilder Interface

2017-05-03 Thread Stephane Maarek
How do votes works?

I feel there are 3 options right here, and I’d like a pre vote before a real 
vote? 
1) Adding constructors. Could get messy over time, especially with headers 
coming into play, and future possible improvement to the message format
2) Adding a builder / nicer looking API (like fluent) to help build a 
ProducerRecord in a safe way. Issue here are two ways of building a 
ProducerRecord can bring confusion
3) Same as 2), but deprecating all the constructors. May be too much of an 
aggressive strategy
 

I’m happy to go over 2), update the docs, and tell people this is the 
“preferred” way. Won’t outdate all the literature on Kafka, but I feel this set 
people up for success in the future.
Thoughts  / pre vote? 

On 3/5/17, 4:20 pm, "Ewen Cheslack-Postava"  wrote:

I understand the convenience of pointing at a JIRA/PR, but can we put the
concrete changes proposed in the JIRA (under "Proposed Changes"). I don't
think voting on the KIP would be reasonable otherwise since the changes
under vote could change arbitrarily...

I'm increasingly skeptical of adding more convenience constructors -- the
current patch adds timestamps, we're about to add headers as well (for
core, for Connect we have

https://cwiki.apache.org/confluence/display/KAFKA/KIP-145+-+Expose+Record+Headers+in+Kafka+Connect
in flight). It just continues to get messier over time.

I think builders in the right context are useful, as long as they exceed a
certain number of parameters (SchemaBuilder in Connect is an artifact of
that position). I don't think a transition period with 2 ways to construct
an object is actually a problem -- if there's always an "all N parameters"
version of the constructor, all other constructors are just convenience
shortcuts, but the Builder provides a shorthand.

I also agree w/ Ismael that deprecating to aggressively is bad -- we added
the APIs instead of a builder and there's not any real maintenance cost, so
why add the deprecation? I don't want to suggest actually adding such an
annotation, but the real issue here is that one API will become "preferred"
for some time.

-Ewen

On Tue, May 2, 2017 at 1:15 AM, Ismael Juma  wrote:

> Hi Matthias,
>
> Deprecating widely used APIs is a big deal. Build warnings are a nuisance
> and can potentially break the build for those who have a zero-warnings
> policy (which is good practice). It creates a bunch of busy work for our
> users and various resources like books, blog posts, etc. become out of
> date.
>
> This does not mean that we should not do it, but the benefit has to be
> worth it and we should not do it lightly.
>
> Ismael
>
> On Sat, Apr 29, 2017 at 6:52 PM, Matthias J. Sax 
> wrote:
>
> > I understand that we cannot just break stuff (btw: also not for
> > Streams!). But deprecating does not break anything, so I don't think
> > it's a big deal to change the API as long as we keep the old API as
> > deprecated.
> >
> >
> > -Matthias
> >
> > On 4/29/17 9:28 AM, Jay Kreps wrote:
> > > Hey Matthias,
> > >
> > > Yeah I agree, I'm not against change as a general thing! I also think
> if
> > > you look back on the last two years, we completely rewrote the 
producer
> > and
> > > consumer APIs, reworked the binary protocol many times over, and added
> > the
> > > connector and stream processing apis, both major new additions. So I
> > don't
> > > think we're in too much danger of stagnating!
> > >
> > > My two cents was just around breaking compatibility for trivial 
changes
> > > like constructor => builder. I think this only applies to the 
producer,
> > > consumer, and connect apis which are heavily embedded in hundreds of
> > > ecosystem components that depend on them. This is different from 
direct
> > > usage. If we break the streams api it is really no big deal---apps 
just
> > > need to rebuild when they upgrade, not the end of the world at all.
> > However
> > > because many intermediate things depend on the Kafka producer you can
> > cause
> > > these weird situations where your app depends on two third party 
things
> > > that use Kafka and each requires different, incompatible versions. We
> did
> > > this a lot in earlier versions of Kafka and it was the cause of much
> > angst
> > > (and an ingrained general reluctance to upgrade) from our users.
> > >
> > > I still think we may have to break things, i just don't think we 
should
> > do
> > > it for things like builders vs direct constructors which i think are
> kind
> > > of a debatable matter of taste.
> > >
> > > -Jay
> > >
> > >
> > >
> > > On Mon, Apr 24, 2017 at 

[GitHub] kafka pull request #2969: doc typo

2017-05-03 Thread smferguson
GitHub user smferguson opened a pull request:

https://github.com/apache/kafka/pull/2969

doc typo



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/smferguson/kafka doc_typo

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2969.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2969


commit 1196ea26a960bfaaf520dde84f46c98bc3663b80
Author: Scott Ferguson 
Date:   2017-05-04T00:43:20Z

doc typo




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-5170) KafkaAdminClientIntegration test should wait until metadata is propagated to all brokers

2017-05-03 Thread Colin P. McCabe (JIRA)
Colin P. McCabe created KAFKA-5170:
--

 Summary: KafkaAdminClientIntegration test should wait until 
metadata is propagated to all brokers
 Key: KAFKA-5170
 URL: https://issues.apache.org/jira/browse/KAFKA-5170
 Project: Kafka
  Issue Type: Bug
  Components: admin
Affects Versions: 0.11.0.0
Reporter: Colin P. McCabe
Assignee: Colin P. McCabe
 Fix For: 0.11.0.0


The KafkaAdminClientIntegration test and its subclasses should wait until the 
metadata is propagated to all brokers.  We have seen a few test failures that 
resulted from some brokers having partial metadata.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk8 #1478

2017-05-03 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] MINOR: Fix error logged if not enough alive brokers for transactions

[wangguoz] KAFKA-5144: renamed variables in MinTimestampTracker and added 
comments

[wangguoz] KAFKA-5055: Fix Kafka Streams skipped-records-rate sensor bug

--
[...truncated 840.61 KB...]

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete STARTED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED


[jira] [Updated] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-05-03 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xavier Léauté updated KAFKA-5150:
-
Status: Patch Available  (was: Open)

> LZ4 decompression is 4-5x slower than Snappy on small batches / messages
> 
>
> Key: KAFKA-5150
> URL: https://issues.apache.org/jira/browse/KAFKA-5150
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.1, 0.8.2.2, 0.11.0.0, 0.10.2.1
>Reporter: Xavier Léauté
>Assignee: Xavier Léauté
> Fix For: 0.11.0.0
>
>
> I benchmarked RecordsIteratorDeepRecordsIterator instantiation on small batch 
> sizes with small messages after observing some performance bottlenecks in the 
> consumer. 
> For batch sizes of 1 with messages of 100 bytes, LZ4 heavily underperforms 
> compared to Snappy (see benchmark below). Most of our time is currently spent 
> allocating memory blocks in KafkaLZ4BlockInputStream, due to the fact that we 
> default to larger 64kB block sizes. Some quick testing shows we could improve 
> performance by almost an order of magnitude for small batches and messages if 
> we re-used buffers between instantiations of the input stream.
> [Benchmark 
> Code|https://github.com/xvrl/kafka/blob/small-batch-lz4-benchmark/clients/src/test/java/org/apache/kafka/common/record/DeepRecordsIteratorBenchmark.java#L86]
> {code}
> Benchmark  (compressionType)  
> (messageSize)   Mode  Cnt   Score   Error  Units
> DeepRecordsIteratorBenchmark.measureSingleMessageLZ4  
>   100  thrpt   20   84802.279 ±  1983.847  ops/s
> DeepRecordsIteratorBenchmark.measureSingleMessage SNAPPY  
>   100  thrpt   20  407585.747 ±  9877.073  ops/s
> DeepRecordsIteratorBenchmark.measureSingleMessage   NONE  
>   100  thrpt   20  579141.634 ± 18482.093  ops/s
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5045) KTable materialization and improved semantics

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995882#comment-15995882
 ] 

ASF GitHub Bot commented on KAFKA-5045:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2832


> KTable materialization and improved semantics
> -
>
> Key: KAFKA-5045
> URL: https://issues.apache.org/jira/browse/KAFKA-5045
> Project: Kafka
>  Issue Type: Task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Eno Thereska
>Assignee: Eno Thereska
>  Labels: kip
> Fix For: 0.11.0.0
>
>
> This is the JIRA for KIP-114: KTable materialization and improved semantics 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-114%3A+KTable+materialization+and+improved+semantics



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2832: KAFKA-5045: KTable cleanup

2017-05-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2832


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-5045) KTable materialization and improved semantics

2017-05-03 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-5045.
--
Resolution: Fixed

Issue resolved by pull request 2832
[https://github.com/apache/kafka/pull/2832]

> KTable materialization and improved semantics
> -
>
> Key: KAFKA-5045
> URL: https://issues.apache.org/jira/browse/KAFKA-5045
> Project: Kafka
>  Issue Type: Task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Eno Thereska
>Assignee: Eno Thereska
>  Labels: kip
> Fix For: 0.11.0.0
>
>
> This is the JIRA for KIP-114: KTable materialization and improved semantics 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-114%3A+KTable+materialization+and+improved+semantics



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Reopened] (KAFKA-2837) FAILING TEST: kafka.api.ProducerBounceTest > testBrokerFailure

2017-05-03 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang reopened KAFKA-2837:
--

Saw this test failure again:

{code}

kafka.api.ProducerBounceTest > testBrokerFailure FAILED
java.net.SocketTimeoutException
at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at 
java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
at 
org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:85)
at 
kafka.network.BlockingChannel.readCompletely(BlockingChannel.scala:129)
at kafka.network.BlockingChannel.receive(BlockingChannel.scala:120)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:100)
at 
kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:84)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:133)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:133)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:133)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:32)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:132)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:132)
at 
kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:132)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:32)
at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:131)
at 
kafka.api.ProducerBounceTest$$anonfun$2.apply(ProducerBounceTest.scala:116)
at 
kafka.api.ProducerBounceTest$$anonfun$2.apply(ProducerBounceTest.scala:113)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at 
kafka.api.ProducerBounceTest.testBrokerFailure(ProducerBounceTest.scala:113)
{code}

> FAILING TEST: kafka.api.ProducerBounceTest > testBrokerFailure 
> ---
>
> Key: KAFKA-2837
> URL: https://issues.apache.org/jira/browse/KAFKA-2837
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 0.9.0.0
>Reporter: Gwen Shapira
>Assignee: jin xing
>  Labels: newbie
> Fix For: 0.10.0.0
>
>
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> kafka.api.ProducerBounceTest.testBrokerFailure(ProducerBounceTest.scala:117)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at 

[jira] [Commented] (KAFKA-4801) Transient test failure (part 2): ConsumerBounceTest.testConsumptionWithBrokerFailures

2017-05-03 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995872#comment-15995872
 ] 

Guozhang Wang commented on KAFKA-4801:
--

[~hachikuji] Saw this happen again but with a different stack trace:

{code}
kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures FAILED
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at 
kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:105)
at 
kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:84)
{code}

> Transient test failure (part 2): 
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> -
>
> Key: KAFKA-4801
> URL: https://issues.apache.org/jira/browse/KAFKA-4801
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Armin Braun
>Assignee: Jason Gustafson
>Priority: Minor
>  Labels: transient-system-test-failure
>
> There is still some (but very little ... when reproducing this you need more 
> than 100 runs in half the cases statistically) instability left in the test
> {code}
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> {code}
> Resulting in this exception being thrown at a relatively low rate (I'd say 
> def less than 0.5% of all runs on my machine).
> {code}
> kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures FAILED
> java.lang.IllegalArgumentException: You can only check the position for 
> partitions assigned to this consumer.
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1271)
> at 
> kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:96)
> at 
> kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:69)
> {code}
> this was also reported in a comment to the original KAFKA-4198
> https://issues.apache.org/jira/browse/KAFKA-4198?focusedCommentId=15765468=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15765468



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4801) Transient test failure (part 2): ConsumerBounceTest.testConsumptionWithBrokerFailures

2017-05-03 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995863#comment-15995863
 ] 

Jason Gustafson commented on KAFKA-4801:


After my patch was merged, this seems to now be showing up as this:
{code}
kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures FAILED
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be 
completed since the group has already rebalanced and assigned the partitions to 
another member. This means that the time between subsequent calls to poll() was 
longer than the configured max.poll.interval.ms, which typically implies that 
the poll loop is spending too much time message processing. You can address 
this either by increasing the session timeout or by reducing the maximum size 
of batches returned in poll() with max.poll.records.
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:776)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:722)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:797)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:778)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:486)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:346)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:260)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:206)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:182)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:589)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1104)
at 
kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:113)
at 
kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:84)
{code}
This suggests that the consumer is falling out of the group in the test case 
for whatever reason.

> Transient test failure (part 2): 
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> -
>
> Key: KAFKA-4801
> URL: https://issues.apache.org/jira/browse/KAFKA-4801
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Armin Braun
>Assignee: Jason Gustafson
>Priority: Minor
>  Labels: transient-system-test-failure
>
> There is still some (but very little ... when reproducing this you need more 
> than 100 runs in half the cases statistically) instability left in the test
> {code}
> ConsumerBounceTest.testConsumptionWithBrokerFailures
> {code}
> Resulting in this exception being thrown at a relatively low rate (I'd say 
> def less than 0.5% of all runs on my machine).
> {code}
> kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures FAILED
> java.lang.IllegalArgumentException: You can only check the position for 
> partitions assigned to this consumer.
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1271)
> at 
> kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:96)
> at 
> kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:69)
> {code}
> this was also reported in a comment to the original KAFKA-4198
> https://issues.apache.org/jira/browse/KAFKA-4198?focusedCommentId=15765468=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15765468



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk7 #2145

2017-05-03 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] MINOR: Fix error logged if not enough alive brokers for transactions

[wangguoz] KAFKA-5144: renamed variables in MinTimestampTracker and added 
comments

[wangguoz] KAFKA-5055: Fix Kafka Streams skipped-records-rate sensor bug

--
[...truncated 841.48 KB...]
kafka.integration.SaslSslTopicMetadataTest > testAutoCreateTopicWithCollision 
STARTED

kafka.integration.SaslSslTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.SaslSslTopicMetadataTest > testAliveBrokerListWithNoTopics 
STARTED

kafka.integration.SaslSslTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.SaslSslTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.SaslSslTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslSslTopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.SaslSslTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.SaslSslTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.SaslSslTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslSslTopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.SaslSslTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslSslTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.SaslSslTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.SaslSslTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.SaslSslTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete STARTED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete PASSED

kafka.integration.UncleanLeaderElectionTest > testUncleanLeaderElectionEnabled 
STARTED

kafka.integration.UncleanLeaderElectionTest > testUncleanLeaderElectionEnabled 
PASSED

kafka.integration.UncleanLeaderElectionTest > 
testCleanLeaderElectionDisabledByTopicOverride STARTED

kafka.integration.UncleanLeaderElectionTest > 
testCleanLeaderElectionDisabledByTopicOverride PASSED

kafka.integration.UncleanLeaderElectionTest > testUncleanLeaderElectionDisabled 
STARTED

kafka.integration.UncleanLeaderElectionTest > testUncleanLeaderElectionDisabled 
PASSED

kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionInvalidTopicOverride STARTED

kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionInvalidTopicOverride PASSED

kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionEnabledByTopicOverride STARTED

kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionEnabledByTopicOverride PASSED

kafka.integration.MinIsrConfigTest > testDefaultKafkaConfig STARTED

kafka.integration.MinIsrConfigTest > testDefaultKafkaConfig PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithCollision STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithCollision PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 

Build failed in Jenkins: kafka-trunk-jdk8 #1477

2017-05-03 Thread Apache Jenkins Server
See 


Changes:

[rajinisivaram] KAFKA-4703: Test with two SASL_SSL listeners with different JAAS

--
[...truncated 1.65 MB...]
kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithCollision PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata 
STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
STARTED

kafka.integration.PlaintextTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.PlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata STARTED

kafka.integration.PlaintextTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAutoCreateTopicWithInvalidReplication PASSED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown STARTED

kafka.integration.PlaintextTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete STARTED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 

[jira] [Commented] (KAFKA-5167) streams task gets stuck after re-balance due to LockException

2017-05-03 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995777#comment-15995777
 ] 

Matthias J. Sax commented on KAFKA-5167:


[~Narendra Kumar] Thanks for reporting this. We fixed couple of bugs with 
regard to rebalance and state directory locks in {{0.10.2.1}} that got released 
last week. Can you try the new release to see if it got already fixed there?

> streams task gets stuck after re-balance due to LockException
> -
>
> Key: KAFKA-5167
> URL: https://issues.apache.org/jira/browse/KAFKA-5167
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Narendra Kumar
> Attachments: logs.txt
>
>
> During rebalance processor node's close() method gets called two times once 
> from StreamThread.suspendTasksAndState() and once from 
> StreamThread.closeNonAssignedSuspendedTasks(). I have some instance filed 
> which I am closing in processor's close method. This instance's close method 
> throws some exception if I call close more than once. Because of this 
> exception, the Kafka streams does not attempt to close the statemanager ie.  
> task.closeStateManager(true) is never called. When a task moves from one 
> thread to another within same machine the task blocks trying to get lock on 
> state directory which is still held by unclosed statemanager and keep 
> throwing the following exception:
> 2017-04-30 12:34:17 WARN  StreamThread:1214 - Could not create task 0_1. Will 
> retry.
> org.apache.kafka.streams.errors.LockException: task [0_1] Failed to lock the 
> state directory for task 0_1
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:100)
>   at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:864)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1237)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:967)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$600(StreamThread.java:69)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:234)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:259)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:352)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:290)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1029)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:592)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:361)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5114) Clarify meaning of logs in Introduction: Topics and Logs

2017-05-03 Thread Michael Ernest (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995767#comment-15995767
 ] 

Michael Ernest commented on KAFKA-5114:
---

What if a partition were described as a structured (ordered?) commit log, and a 
topic described as a collection of one or more partitions? In that way the 
hierarchy of these abstractions is clear, and it is also clear the properties 
of the partition are of course an aspect of its controlling topic.

> Clarify meaning of logs in Introduction: Topics and Logs
> 
>
> Key: KAFKA-5114
> URL: https://issues.apache.org/jira/browse/KAFKA-5114
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Michael Ernest
>Priority: Minor
>
> The term log is ambiguous in this section:
> * To describe a partition as a 'structured commit log'
> * To describe a topic as a partitioned log
> Then there's this sentence under Distribution: "The partitions of the log are 
> distributed over the servers in the Kafka cluster with each server handling 
> data and requests for a share of the partitions"
> In that last sentence, replacing 'log' with 'topic' would be clearer.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk7 #2144

2017-05-03 Thread Apache Jenkins Server
See 


Changes:

[rajinisivaram] KAFKA-4703: Test with two SASL_SSL listeners with different JAAS

--
[...truncated 1.64 MB...]

kafka.server.DynamicConfigChangeTest > testDefaultClientIdQuotaConfigChange 
STARTED

kafka.server.DynamicConfigChangeTest > testDefaultClientIdQuotaConfigChange 
PASSED

kafka.server.DynamicConfigChangeTest > testQuotaInitialization STARTED

kafka.server.DynamicConfigChangeTest > testQuotaInitialization PASSED

kafka.server.DynamicConfigChangeTest > testUserQuotaConfigChange STARTED

kafka.server.DynamicConfigChangeTest > testUserQuotaConfigChange PASSED

kafka.server.DynamicConfigChangeTest > testClientIdQuotaConfigChange STARTED

kafka.server.DynamicConfigChangeTest > testClientIdQuotaConfigChange PASSED

kafka.server.DynamicConfigChangeTest > testUserClientIdQuotaChange STARTED

kafka.server.DynamicConfigChangeTest > testUserClientIdQuotaChange PASSED

kafka.server.DynamicConfigChangeTest > shouldParseReplicationQuotaProperties 
STARTED

kafka.server.DynamicConfigChangeTest > shouldParseReplicationQuotaProperties 
PASSED

kafka.server.DynamicConfigChangeTest > 
shouldParseRegardlessOfWhitespaceAroundValues STARTED

kafka.server.DynamicConfigChangeTest > 
shouldParseRegardlessOfWhitespaceAroundValues PASSED

kafka.server.DynamicConfigChangeTest > testDefaultUserQuotaConfigChange STARTED

kafka.server.DynamicConfigChangeTest > testDefaultUserQuotaConfigChange PASSED

kafka.server.DynamicConfigChangeTest > shouldParseReplicationQuotaReset STARTED

kafka.server.DynamicConfigChangeTest > shouldParseReplicationQuotaReset PASSED

kafka.server.DynamicConfigChangeTest > testDefaultUserClientIdQuotaConfigChange 
STARTED

kafka.server.DynamicConfigChangeTest > testDefaultUserClientIdQuotaConfigChange 
PASSED

kafka.server.DynamicConfigChangeTest > testConfigChangeOnNonExistingTopic 
STARTED

kafka.server.DynamicConfigChangeTest > testConfigChangeOnNonExistingTopic PASSED

kafka.server.DynamicConfigChangeTest > testConfigChange STARTED

kafka.server.DynamicConfigChangeTest > testConfigChange PASSED

kafka.server.ServerGenerateBrokerIdTest > testGetSequenceIdMethod STARTED

kafka.server.ServerGenerateBrokerIdTest > testGetSequenceIdMethod PASSED

kafka.server.ServerGenerateBrokerIdTest > testBrokerMetadataOnIdCollision 
STARTED

kafka.server.ServerGenerateBrokerIdTest > testBrokerMetadataOnIdCollision PASSED

kafka.server.ServerGenerateBrokerIdTest > testAutoGenerateBrokerId STARTED

kafka.server.ServerGenerateBrokerIdTest > testAutoGenerateBrokerId PASSED

kafka.server.ServerGenerateBrokerIdTest > testMultipleLogDirsMetaProps STARTED

kafka.server.ServerGenerateBrokerIdTest > testMultipleLogDirsMetaProps PASSED

kafka.server.ServerGenerateBrokerIdTest > testDisableGeneratedBrokerId STARTED

kafka.server.ServerGenerateBrokerIdTest > testDisableGeneratedBrokerId PASSED

kafka.server.ServerGenerateBrokerIdTest > testUserConfigAndGeneratedBrokerId 
STARTED

kafka.server.ServerGenerateBrokerIdTest > testUserConfigAndGeneratedBrokerId 
PASSED

kafka.server.ServerGenerateBrokerIdTest > 
testConsistentBrokerIdFromUserConfigAndMetaProps STARTED

kafka.server.ServerGenerateBrokerIdTest > 
testConsistentBrokerIdFromUserConfigAndMetaProps PASSED

kafka.server.AdvertiseBrokerTest > testBrokerAdvertiseHostNameAndPortToZK 
STARTED

kafka.server.AdvertiseBrokerTest > testBrokerAdvertiseHostNameAndPortToZK PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
PASSED

kafka.server.SaslPlaintextReplicaFetchTest > testReplicaFetcherThread STARTED

kafka.server.SaslPlaintextReplicaFetchTest > testReplicaFetcherThread PASSED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestWithUnsupportedVersion STARTED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestWithUnsupportedVersion PASSED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestBeforeSaslHandshakeRequest STARTED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestBeforeSaslHandshakeRequest PASSED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestAfterSaslHandshakeRequest STARTED

kafka.server.SaslApiVersionsRequestTest > 
testApiVersionsRequestAfterSaslHandshakeRequest PASSED

kafka.server.ServerGenerateClusterIdTest > 
testAutoGenerateClusterIdForKafkaClusterParallel STARTED

kafka.server.ServerGenerateClusterIdTest > 
testAutoGenerateClusterIdForKafkaClusterParallel PASSED

kafka.server.ServerGenerateClusterIdTest > testAutoGenerateClusterId STARTED

kafka.server.ServerGenerateClusterIdTest > testAutoGenerateClusterId PASSED


[jira] [Updated] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-05-03 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xavier Léauté updated KAFKA-5150:
-
Fix Version/s: 0.11.0.0

> LZ4 decompression is 4-5x slower than Snappy on small batches / messages
> 
>
> Key: KAFKA-5150
> URL: https://issues.apache.org/jira/browse/KAFKA-5150
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.8.2.2, 0.9.0.1, 0.11.0.0, 0.10.2.1
>Reporter: Xavier Léauté
>Assignee: Xavier Léauté
> Fix For: 0.11.0.0
>
>
> I benchmarked RecordsIteratorDeepRecordsIterator instantiation on small batch 
> sizes with small messages after observing some performance bottlenecks in the 
> consumer. 
> For batch sizes of 1 with messages of 100 bytes, LZ4 heavily underperforms 
> compared to Snappy (see benchmark below). Most of our time is currently spent 
> allocating memory blocks in KafkaLZ4BlockInputStream, due to the fact that we 
> default to larger 64kB block sizes. Some quick testing shows we could improve 
> performance by almost an order of magnitude for small batches and messages if 
> we re-used buffers between instantiations of the input stream.
> [Benchmark 
> Code|https://github.com/xvrl/kafka/blob/small-batch-lz4-benchmark/clients/src/test/java/org/apache/kafka/common/record/DeepRecordsIteratorBenchmark.java#L86]
> {code}
> Benchmark  (compressionType)  
> (messageSize)   Mode  Cnt   Score   Error  Units
> DeepRecordsIteratorBenchmark.measureSingleMessageLZ4  
>   100  thrpt   20   84802.279 ±  1983.847  ops/s
> DeepRecordsIteratorBenchmark.measureSingleMessage SNAPPY  
>   100  thrpt   20  407585.747 ±  9877.073  ops/s
> DeepRecordsIteratorBenchmark.measureSingleMessage   NONE  
>   100  thrpt   20  579141.634 ± 18482.093  ops/s
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5144) MinTimestampTracker uses confusing variable names

2017-05-03 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995746#comment-15995746
 ] 

Eno Thereska commented on KAFKA-5144:
-

[~mihbor] thanks. I have a PR that cleans up the code further, would love your 
comments: https://github.com/apache/kafka/pull/2728

> MinTimestampTracker uses confusing variable names
> -
>
> Key: KAFKA-5144
> URL: https://issues.apache.org/jira/browse/KAFKA-5144
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
>Reporter: Michal Borowiecki
>Assignee: Michal Borowiecki
>Priority: Trivial
> Fix For: 0.11.0.0
>
>
> --When adding elements MinTimestampTracker removes all existing elements 
> greater than the added element.--
> --Perhaps I've missed something and this is intended behaviour but I can't 
> find any evidence for that in comments or tests.--
> {{MinTimestampTracker}} maintains a list of elements ascending by timestamp, 
> but calls the list var {{descendingSubsequence}} -- it also get the largest 
> element at the end and stores it in a var called {{minElem}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5169) KafkaConsumer.close should be idempotent

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995742#comment-15995742
 ] 

ASF GitHub Bot commented on KAFKA-5169:
---

GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/2968

KAFKA-5169: KafkaConsumer.close should be idempotent



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka kafka-5169-consumer-close

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2968.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2968


commit ed23400dcae41acfa0c946c760b4a4fff90c5c1d
Author: Matthias J. Sax 
Date:   2017-05-03T21:41:31Z

KAFKA-5169: KafkaConsumer.close should be idempotent




> KafkaConsumer.close should be idempotent
> 
>
> Key: KAFKA-5169
> URL: https://issues.apache.org/jira/browse/KAFKA-5169
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.2.0, 0.10.2.1
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.11.0.0
>
>
> Closing a {{KafkaConsumer}} twice result in {{IllegalStateException}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5169) KafkaConsumer.close should be idempotent

2017-05-03 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5169:
---
Fix Version/s: 0.11.0.0
Affects Version/s: 0.10.2.1
   0.10.2.0
   Status: Patch Available  (was: Open)

> KafkaConsumer.close should be idempotent
> 
>
> Key: KAFKA-5169
> URL: https://issues.apache.org/jira/browse/KAFKA-5169
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.2.0, 0.10.2.1
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.11.0.0
>
>
> Closing a {{KafkaConsumer}} twice result in {{IllegalStateException}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2968: KAFKA-5169: KafkaConsumer.close should be idempote...

2017-05-03 Thread mjsax
GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/2968

KAFKA-5169: KafkaConsumer.close should be idempotent



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka kafka-5169-consumer-close

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2968.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2968


commit ed23400dcae41acfa0c946c760b4a4fff90c5c1d
Author: Matthias J. Sax 
Date:   2017-05-03T21:41:31Z

KAFKA-5169: KafkaConsumer.close should be idempotent




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5055) Kafka Streams skipped-records-rate sensor producing nonzero values even when FailOnInvalidTimestamp is used as extractor

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995701#comment-15995701
 ] 

ASF GitHub Bot commented on KAFKA-5055:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2949


> Kafka Streams skipped-records-rate sensor producing nonzero values even when 
> FailOnInvalidTimestamp is used as extractor
> 
>
> Key: KAFKA-5055
> URL: https://issues.apache.org/jira/browse/KAFKA-5055
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Nikki Thean
>Assignee: Davor Poldrugo
> Fix For: 0.11.0.0
>
>
> According to the code and the documentation for this metric, the only reason 
> for a skipped record is an invalid timestamp, except that a) I am reading 
> from a topic that is populated solely by Kafka Connect and b) I am using 
> `FailOnInvalidTimestamp` as the timestamp extractor.
> Either I'm missing something in the documentation (i.e. another reason for 
> skipped records) or there is a bug in the code that calculates this metric.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5055) Kafka Streams skipped-records-rate sensor producing nonzero values even when FailOnInvalidTimestamp is used as extractor

2017-05-03 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-5055:
-
   Resolution: Fixed
Fix Version/s: 0.11.0.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2949
[https://github.com/apache/kafka/pull/2949]

> Kafka Streams skipped-records-rate sensor producing nonzero values even when 
> FailOnInvalidTimestamp is used as extractor
> 
>
> Key: KAFKA-5055
> URL: https://issues.apache.org/jira/browse/KAFKA-5055
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Nikki Thean
>Assignee: Davor Poldrugo
> Fix For: 0.11.0.0
>
>
> According to the code and the documentation for this metric, the only reason 
> for a skipped record is an invalid timestamp, except that a) I am reading 
> from a topic that is populated solely by Kafka Connect and b) I am using 
> `FailOnInvalidTimestamp` as the timestamp extractor.
> Either I'm missing something in the documentation (i.e. another reason for 
> skipped records) or there is a bug in the code that calculates this metric.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995673#comment-15995673
 ] 

ASF GitHub Bot commented on KAFKA-5150:
---

GitHub user xvrl opened a pull request:

https://github.com/apache/kafka/pull/2967

KAFKA-5150 reduce lz4 decompression overhead

- reuse decompression buffers, keeping one per thread
- switch lz4 input stream to operate directly on ByteBuffers
- more tests with both compressible / incompressible data, multiple
  blocks, and various other combinations to increase code coverage
- fixes bug that would cause EOFException instead of invalid block size
  for invalid incompressible blocks

Overall this improves LZ4 decompression performance by up to 23x for small 
batches.
Most improvements are seen for batches of size 1 with messages on the order 
of ~100B.
At least 10x improvements for for batch sizes of < 10 messages, with 
messages of < 10kB

See benchmark code and results here
https://gist.github.com/xvrl/05132e0643513df4adf842288be86efd

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xvrl/kafka kafka-5150

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2967.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2967


commit 0efc6e7f15b6994a6665da5975e69c77426cf904
Author: Xavier Léauté 
Date:   2017-05-03T20:40:45Z

KAFKA-5150 reduce lz4 decompression overhead

- reuse decompression buffers, keeping one per thread
- switch lz4 input stream to operate directly on ByteBuffers
- more tests with both compressible / incompressible data, multiple
  blocks, and various other combinations to increase code coverage
- fixes bug that would cause EOFException instead of invalid block size
  for invalid incompressible blocks




> LZ4 decompression is 4-5x slower than Snappy on small batches / messages
> 
>
> Key: KAFKA-5150
> URL: https://issues.apache.org/jira/browse/KAFKA-5150
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.8.2.2, 0.9.0.1, 0.11.0.0, 0.10.2.1
>Reporter: Xavier Léauté
>Assignee: Xavier Léauté
>
> I benchmarked RecordsIteratorDeepRecordsIterator instantiation on small batch 
> sizes with small messages after observing some performance bottlenecks in the 
> consumer. 
> For batch sizes of 1 with messages of 100 bytes, LZ4 heavily underperforms 
> compared to Snappy (see benchmark below). Most of our time is currently spent 
> allocating memory blocks in KafkaLZ4BlockInputStream, due to the fact that we 
> default to larger 64kB block sizes. Some quick testing shows we could improve 
> performance by almost an order of magnitude for small batches and messages if 
> we re-used buffers between instantiations of the input stream.
> [Benchmark 
> Code|https://github.com/xvrl/kafka/blob/small-batch-lz4-benchmark/clients/src/test/java/org/apache/kafka/common/record/DeepRecordsIteratorBenchmark.java#L86]
> {code}
> Benchmark  (compressionType)  
> (messageSize)   Mode  Cnt   Score   Error  Units
> DeepRecordsIteratorBenchmark.measureSingleMessageLZ4  
>   100  thrpt   20   84802.279 ±  1983.847  ops/s
> DeepRecordsIteratorBenchmark.measureSingleMessage SNAPPY  
>   100  thrpt   20  407585.747 ±  9877.073  ops/s
> DeepRecordsIteratorBenchmark.measureSingleMessage   NONE  
>   100  thrpt   20  579141.634 ± 18482.093  ops/s
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2967: KAFKA-5150 reduce lz4 decompression overhead

2017-05-03 Thread xvrl
GitHub user xvrl opened a pull request:

https://github.com/apache/kafka/pull/2967

KAFKA-5150 reduce lz4 decompression overhead

- reuse decompression buffers, keeping one per thread
- switch lz4 input stream to operate directly on ByteBuffers
- more tests with both compressible / incompressible data, multiple
  blocks, and various other combinations to increase code coverage
- fixes bug that would cause EOFException instead of invalid block size
  for invalid incompressible blocks

Overall this improves LZ4 decompression performance by up to 23x for small 
batches.
Most improvements are seen for batches of size 1 with messages on the order 
of ~100B.
At least 10x improvements for for batch sizes of < 10 messages, with 
messages of < 10kB

See benchmark code and results here
https://gist.github.com/xvrl/05132e0643513df4adf842288be86efd

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xvrl/kafka kafka-5150

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2967.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2967


commit 0efc6e7f15b6994a6665da5975e69c77426cf904
Author: Xavier Léauté 
Date:   2017-05-03T20:40:45Z

KAFKA-5150 reduce lz4 decompression overhead

- reuse decompression buffers, keeping one per thread
- switch lz4 input stream to operate directly on ByteBuffers
- more tests with both compressible / incompressible data, multiple
  blocks, and various other combinations to increase code coverage
- fixes bug that would cause EOFException instead of invalid block size
  for invalid incompressible blocks




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5144) MinTimestampTracker uses confusing variable names

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995635#comment-15995635
 ] 

ASF GitHub Bot commented on KAFKA-5144:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2948


> MinTimestampTracker uses confusing variable names
> -
>
> Key: KAFKA-5144
> URL: https://issues.apache.org/jira/browse/KAFKA-5144
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
>Reporter: Michal Borowiecki
>Assignee: Michal Borowiecki
>Priority: Trivial
> Fix For: 0.11.0.0
>
>
> --When adding elements MinTimestampTracker removes all existing elements 
> greater than the added element.--
> --Perhaps I've missed something and this is intended behaviour but I can't 
> find any evidence for that in comments or tests.--
> {{MinTimestampTracker}} maintains a list of elements ascending by timestamp, 
> but calls the list var {{descendingSubsequence}} -- it also get the largest 
> element at the end and stores it in a var called {{minElem}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2948: KAFKA-5144 renamed and added comments to make it c...

2017-05-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2948


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-5144) MinTimestampTracker uses confusing variable names

2017-05-03 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-5144:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2948
[https://github.com/apache/kafka/pull/2948]

> MinTimestampTracker uses confusing variable names
> -
>
> Key: KAFKA-5144
> URL: https://issues.apache.org/jira/browse/KAFKA-5144
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
>Reporter: Michal Borowiecki
>Assignee: Michal Borowiecki
>Priority: Trivial
> Fix For: 0.11.0.0
>
>
> --When adding elements MinTimestampTracker removes all existing elements 
> greater than the added element.--
> --Perhaps I've missed something and this is intended behaviour but I can't 
> find any evidence for that in comments or tests.--
> {{MinTimestampTracker}} maintains a list of elements ascending by timestamp, 
> but calls the list var {{descendingSubsequence}} -- it also get the largest 
> element at the end and stores it in a var called {{minElem}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-5114) Clarify meaning of logs in Introduction: Topics and Logs

2017-05-03 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang reassigned KAFKA-5114:


Assignee: (was: Michael Ernest)

> Clarify meaning of logs in Introduction: Topics and Logs
> 
>
> Key: KAFKA-5114
> URL: https://issues.apache.org/jira/browse/KAFKA-5114
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Michael Ernest
>Priority: Minor
>
> The term log is ambiguous in this section:
> * To describe a partition as a 'structured commit log'
> * To describe a topic as a partitioned log
> Then there's this sentence under Distribution: "The partitions of the log are 
> distributed over the servers in the Kafka cluster with each server handling 
> data and requests for a share of the partitions"
> In that last sentence, replacing 'log' with 'topic' would be clearer.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-5114) Clarify meaning of logs in Introduction: Topics and Logs

2017-05-03 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang reassigned KAFKA-5114:


Assignee: Michael Ernest

> Clarify meaning of logs in Introduction: Topics and Logs
> 
>
> Key: KAFKA-5114
> URL: https://issues.apache.org/jira/browse/KAFKA-5114
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Michael Ernest
>Assignee: Michael Ernest
>Priority: Minor
>
> The term log is ambiguous in this section:
> * To describe a partition as a 'structured commit log'
> * To describe a topic as a partitioned log
> Then there's this sentence under Distribution: "The partitions of the log are 
> distributed over the servers in the Kafka cluster with each server handling 
> data and requests for a share of the partitions"
> In that last sentence, replacing 'log' with 'topic' would be clearer.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Request to add to the contributor list

2017-05-03 Thread Guozhang Wang
Hi Amit,

I have added you to the contributor list and assigned 4996 to you. Cheers.

Will also review your PR soon.


Guozhang


On Tue, May 2, 2017 at 5:43 AM, Amit Daga  wrote:

> Hello,
>
> Wondering if you were able to look into it?
>
> Thanks,
> Amit Daga
>
> On Sun, Apr 30, 2017 at 1:29 PM, Amit Daga  wrote:
>
> > Hello Team,
> >
> > Hope you are doing well.
> >
> > My name is Amit Daga. This is to request you to add me to the contributor
> > list. Also if KAFKA-4996 issue is still not assigned, I would like to
> work
> > on it.
> >
> > Thanks,
> > Amit
> >
>



-- 
-- Guozhang


[jira] [Assigned] (KAFKA-4996) Fix findbugs multithreaded correctness warnings for streams

2017-05-03 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang reassigned KAFKA-4996:


Assignee: Amit Daga

> Fix findbugs multithreaded correctness warnings for streams
> ---
>
> Key: KAFKA-4996
> URL: https://issues.apache.org/jira/browse/KAFKA-4996
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Colin P. McCabe
>Assignee: Amit Daga
>  Labels: newbie
>
> Fix findbugs multithreaded correctness warnings for streams
> {code}
> Multithreaded correctness Warnings
>   
>   
> 
>   
>   
>   
> 
>Code Warning   
>   
>   
> 
>AT   Sequence of calls to java.util.concurrent.ConcurrentHashMap may not 
> be atomic in 
> org.apache.kafka.streams.state.internals.Segments.getOrCreateSegment(long, 
> ProcessorContext) 
> 
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.KafkaStreams.stateListener; locked 66% of time   
>   
>   
> 
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.processor.internals.StreamThread.stateListener; 
> locked 66% of time
>   
>  
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.processor.TopologyBuilder.applicationId; locked 50% 
> of time   
>   
>  
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.context; locked 
> 66% of time   
>   
> 
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.state.internals.CachingWindowStore.cache; locked 60% 
> of time   
>   
> 
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.state.internals.CachingWindowStore.context; locked 
> 66% of time   
>   
>   
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.state.internals.CachingWindowStore.name; locked 60% 
> of time   
>   
>  
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.state.internals.CachingWindowStore.serdes; locked 
> 70% of time   
>   
>
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.state.internals.RocksDBStore.db; locked 63% of time  
>   
>   
> 
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.state.internals.RocksDBStore.serdes; locked 76% of 
> time  
>   
>   
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2954: MINOR: Fix error logged if not enough alive broker...

2017-05-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2954


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Reopened] (KAFKA-4583) KafkaConsumerTest.testGracefulClose transient failure

2017-05-03 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang reopened KAFKA-4583:
--

Saw this failure again but with a different stack trace:

https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/3432/testReport/junit/org.apache.kafka.clients.consumer/KafkaConsumerTest/testGracefulClose/

{code}
Stacktrace

java.util.concurrent.TimeoutException
at java.util.concurrent.FutureTask.get(FutureTask.java:205)
at 
org.apache.kafka.clients.consumer.KafkaConsumerTest.consumerCloseTest(KafkaConsumerTest.java:1319)
at 
org.apache.kafka.clients.consumer.KafkaConsumerTest.testGracefulClose(KafkaConsumerTest.java:1205)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:147)
at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:129)
at 
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)
at 
org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)
at 
org.gradle.internal.concurrent.StoppableExecutorImpl$1.run(StoppableExecutorImpl.java:46)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 

[jira] [Commented] (KAFKA-4703) Test with two SASL_SSL listeners with different JAAS contexts

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995557#comment-15995557
 ] 

ASF GitHub Bot commented on KAFKA-4703:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2506


> Test with two SASL_SSL listeners with different JAAS contexts
> -
>
> Key: KAFKA-4703
> URL: https://issues.apache.org/jira/browse/KAFKA-4703
> Project: Kafka
>  Issue Type: Test
>Reporter: Ismael Juma
>Assignee: Balint Molnar
>  Labels: newbie
> Fix For: 0.11.0.0
>
>
> [~rsivaram] suggested the following in 
> https://github.com/apache/kafka/pull/2406
> {quote}
> I think this feature allows two SASL_SSL listeners, one for external and one 
> for internal and the two can use different mechanisms and different JAAS 
> contexts. That makes the multi-mechanism configuration neater. I think it 
> will be useful to have an integration test for this, perhaps change 
> SaslMultiMechanismConsumerTest.
> {quote}
> And my reply:
> {quote}
> I think it's a bit tricky to support multiple listeners in 
> KafkaServerTestHarness. Maybe it's easier to do the test you suggest in 
> MultipleListenersWithSameSecurityProtocolTest.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2506: KAFKA-4703 Test with two SASL_SSL listeners with d...

2017-05-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2506


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-4703) Test with two SASL_SSL listeners with different JAAS contexts

2017-05-03 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-4703:
--
   Resolution: Fixed
Fix Version/s: 0.11.0.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2506
[https://github.com/apache/kafka/pull/2506]

> Test with two SASL_SSL listeners with different JAAS contexts
> -
>
> Key: KAFKA-4703
> URL: https://issues.apache.org/jira/browse/KAFKA-4703
> Project: Kafka
>  Issue Type: Test
>Reporter: Ismael Juma
>Assignee: Balint Molnar
>  Labels: newbie
> Fix For: 0.11.0.0
>
>
> [~rsivaram] suggested the following in 
> https://github.com/apache/kafka/pull/2406
> {quote}
> I think this feature allows two SASL_SSL listeners, one for external and one 
> for internal and the two can use different mechanisms and different JAAS 
> contexts. That makes the multi-mechanism configuration neater. I think it 
> will be useful to have an integration test for this, perhaps change 
> SaslMultiMechanismConsumerTest.
> {quote}
> And my reply:
> {quote}
> I think it's a bit tricky to support multiple listeners in 
> KafkaServerTestHarness. Maybe it's easier to do the test you suggest in 
> MultipleListenersWithSameSecurityProtocolTest.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Topic Creation programatically

2017-05-03 Thread Colin McCabe
Hi Zishan,

The best practice will depend on the situation.  In most cases, creating
the topic using the CLI creation command prior to running your code is
easier than creating it programmatically.  There will be a new
AdminClient API for creating topics programmatically in the 0.11
release; prior to that, your main option would be using topic
auto-creation.  I would not recommend using topic auto-creation, because
you will not be able to set the configuration that is used on the new
topic.

best,
Colin


On Tue, Apr 11, 2017, at 01:51, Zishan Ali Saiyed wrote:
> Hi Team,
> 
> I am using kafka integrated with java client. I have a question " What is
> the best practice to create topic using programmatically or using CLI
> topic
> creation command?"
> 
> 
> Thanks,
> Zishan Ali


[jira] [Updated] (KAFKA-5168) Cleanup delayed produce purgatory during partition emmigration

2017-05-03 Thread Damian Guy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Guy updated KAFKA-5168:
--
Status: Patch Available  (was: Open)

> Cleanup delayed produce purgatory during partition emmigration 
> ---
>
> Key: KAFKA-5168
> URL: https://issues.apache.org/jira/browse/KAFKA-5168
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Damian Guy
>Assignee: Damian Guy
>
> When partitions are emmigrated we need to forceComplete any partition in the 
> replica managers delayed produce purgatory so that they can error out. This 
> needs to be done after the partition has been removed from the 
> ownedPartitions map so that they can error out with NOT_COORDINATOR



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4996) Fix findbugs multithreaded correctness warnings for streams

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995430#comment-15995430
 ] 

ASF GitHub Bot commented on KAFKA-4996:
---

GitHub user amitdaga opened a pull request:

https://github.com/apache/kafka/pull/2966

KAFKA-4996: Fix findbugs multithreaded correctness warnings for streams



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amitdaga/kafka findbugs-streams-multithread

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2966.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2966


commit bf7c4cf3588112e8d9518b43ed086aa36588d75f
Author: Amit Daga 
Date:   2017-04-30T17:58:39Z

Fixing multithread correctness warning in streams




> Fix findbugs multithreaded correctness warnings for streams
> ---
>
> Key: KAFKA-4996
> URL: https://issues.apache.org/jira/browse/KAFKA-4996
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Colin P. McCabe
>  Labels: newbie
>
> Fix findbugs multithreaded correctness warnings for streams
> {code}
> Multithreaded correctness Warnings
>   
>   
> 
>   
>   
>   
> 
>Code Warning   
>   
>   
> 
>AT   Sequence of calls to java.util.concurrent.ConcurrentHashMap may not 
> be atomic in 
> org.apache.kafka.streams.state.internals.Segments.getOrCreateSegment(long, 
> ProcessorContext) 
> 
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.KafkaStreams.stateListener; locked 66% of time   
>   
>   
> 
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.processor.internals.StreamThread.stateListener; 
> locked 66% of time
>   
>  
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.processor.TopologyBuilder.applicationId; locked 50% 
> of time   
>   
>  
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.state.internals.CachingKeyValueStore.context; locked 
> 66% of time   
>   
> 
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.state.internals.CachingWindowStore.cache; locked 60% 
> of time   
>   
> 
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.state.internals.CachingWindowStore.context; locked 
> 66% of time   
>   
>   
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.state.internals.CachingWindowStore.name; locked 60% 
> of time   
>   
>  
>IS   Inconsistent synchronization of 
> org.apache.kafka.streams.state.internals.CachingWindowStore.serdes; locked 
> 70% of time   
> 

[GitHub] kafka pull request #2966: KAFKA-4996: Fix findbugs multithreaded correctness...

2017-05-03 Thread amitdaga
GitHub user amitdaga opened a pull request:

https://github.com/apache/kafka/pull/2966

KAFKA-4996: Fix findbugs multithreaded correctness warnings for streams



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amitdaga/kafka findbugs-streams-multithread

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2966.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2966


commit bf7c4cf3588112e8d9518b43ed086aa36588d75f
Author: Amit Daga 
Date:   2017-04-30T17:58:39Z

Fixing multithread correctness warning in streams




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-143: Controller Health Metrics

2017-05-03 Thread Joel Koshy
On Wed, May 3, 2017 at 10:54 AM, Onur Karaman 
wrote:

> Regarding the ControllerState and the potential for overlap, I think it
> depends on our definition of controller state. While KAFKA-5028 allows only
> a single ControllerEvent to be processed at a time, it still allows
> interleavings for long-lasting actions like partition reassignment and
> topic deletion. For instance, a topic can get created while another topic
> is undergoing partition reassignment. In that sense, there is overlap.
> However, in the sense of the ControllerEvents being processed, there can be
> no overlap.
>

Yes - that is roughly what I was thinking (although deletes are no longer
long running). Also, what is the "steady-state" controller state? Idle?
What about a broker that is not the controller? Would you need a separate
idle-not-controller state? Given that most of the state changes are short
we would just see blips in the best case and nothing in the worst case
(depending on how often metrics get sampled). It would only help if you
want to visually detect any transitions that are taking an inordinate
duration.



> > 1. Yes, the long term goal is to migrate the metrics on the broker to
>> > kafka-metrics. Since many people are using Yammer reporters, we probably
>> > need to support a few popular ones in kafka-metrics before migrating.
>> Until
>> > that happens, we probably want to stick with the Yammer metrics on the
>> > server side unless we depend on features from kafka-metrics (e.g,
>> quota).
>>
>
Ok - my thought was since we are already using kafka-metrics for quotas and
selector metrics we could just do the same for this (and any *new* metrics
on the broker).


> 4. Metrics #2 and #3. The issue with relying on metric #1 is that the
>> > latter is sensitive to the frequency of metric collection. For example,
>> if
>> > the starting of the controller takes 30 secs and the metric is only
>> > collected once a minute, one may not know the latency with just metric
>> #1,
>> > but will know the latency with metrics #2 and #3. Are you concerned
>> about
>> > the memory overhead of histograms? It doesn't seem that a couple of more
>> > histograms will hurt.
>>
>
No I don't have concerns about the histograms - just wondering if it is
useful enough to have these in the first place, but your summary makes
sense.

Joel


> >
>> > Hi, Isamel,
>> >
>> > Thanks the for proposal. A couple of more comments.,
>> >
>> > 10. It would be useful to add a new metrics for the controller queue
>> size.
>> > kafka.controller:type=ControllerStats,name=QueueSize
>> >
>> > 11. It would also be useful to know how long an event is waiting in the
>> > controller queue before being processing. Perhaps, we can add a
>> histogram
>> > metric like the following.
>> > kafka.controller:type=ControllerStats,name=QueueTimeMs
>> >
>> > Jun
>> >
>> > On Thu, Apr 27, 2017 at 11:39 AM, Joel Koshy 
>> wrote:
>> >
>> > > Thanks for the KIP - couple of comments:
>> > > - Do you intend to actually use yammer metrics? or use kafka-metrics
>> and
>> > > split the timer into an explicit rate and time? I think long term we
>> > ought
>> > > to move off yammer and use kafka-metrics only. Actually either is
>> fine,
>> > but
>> > > we should ideally use only one in the long term - and I thought the
>> plan
>> > > was to use kafka-metrics.
>> > > - metric #9 appears to be redundant since we already have per-API
>> request
>> > > rate and time metrics.
>> > > - Same for metric #4, #5 (as there are request stats for
>> > > DeleteTopicRequest - although it is possible for users to trigger
>> deletes
>> > > via ZK)
>> > > - metric #2, #3 are potentially useful, but a bit overkill for a
>> > > histogram. Alternative is to stick to last known value, but that
>> doesn't
>> > > play well with alerts if a high value isn't reset/decayed. Perhaps
>> metric
>> > > #1 would be sufficient to gauge slow start/resignation transitions.
>> > > - metric #1 - some of the states may actually overlap
>> > > - I don't actually understand the semantics of metric #6. Is it rate
>> of
>> > > partition reassignment triggers? does the number of partitions matter?
>> > >
>> > > Joel
>> > >
>> > > On Thu, Apr 27, 2017 at 8:04 AM, Tom Crayford 
>> > > wrote:
>> > >
>> > >> Ismael,
>> > >>
>> > >> Great, that sounds lovely.
>> > >>
>> > >> I'd like a `Timer` (using yammer metrics parlance) over how long it
>> took
>> > >> to
>> > >> process the event, so we can get at p99 and max times spent
>> processing
>> > >> things. Maybe we could even do a log at warning level if event
>> > processing
>> > >> takes over some timeout?
>> > >>
>> > >> Thanks
>> > >>
>> > >> Tom
>> > >>
>> > >> On Thu, Apr 27, 2017 at 3:59 PM, Ismael Juma 
>> wrote:
>> > >>
>> > >> > Hi Tom,
>> > >> >
>> > >> > Yes, the plan is to merge KAFKA-5028 first and then use a lock-free
>> > >> > approach for the new  metrics. I considered 

[jira] [Commented] (KAFKA-5168) Cleanup delayed produce purgatory during partition emmigration

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995395#comment-15995395
 ] 

ASF GitHub Bot commented on KAFKA-5168:
---

GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/2965

KAFKA-5168: Cleanup delayed produce purgatory during partition emmigration

remove operations from the replica manager's producer purgatory on 
transaction emmigration

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka cpkafka-711

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2965.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2965


commit e93df465bee73f2b25fc18608b22a2b7fba1e175
Author: Damian Guy 
Date:   2017-05-03T18:35:52Z

remove operations from the replica manager's producer purgatory on 
transaction emmigration




> Cleanup delayed produce purgatory during partition emmigration 
> ---
>
> Key: KAFKA-5168
> URL: https://issues.apache.org/jira/browse/KAFKA-5168
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Damian Guy
>Assignee: Damian Guy
>
> When partitions are emmigrated we need to forceComplete any partition in the 
> replica managers delayed produce purgatory so that they can error out. This 
> needs to be done after the partition has been removed from the 
> ownedPartitions map so that they can error out with NOT_COORDINATOR



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2965: KAFKA-5168: Cleanup delayed produce purgatory duri...

2017-05-03 Thread dguy
GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/2965

KAFKA-5168: Cleanup delayed produce purgatory during partition emmigration

remove operations from the replica manager's producer purgatory on 
transaction emmigration

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka cpkafka-711

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2965.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2965


commit e93df465bee73f2b25fc18608b22a2b7fba1e175
Author: Damian Guy 
Date:   2017-05-03T18:35:52Z

remove operations from the replica manager's producer purgatory on 
transaction emmigration




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-5059) Implement Transactional Coordinator

2017-05-03 Thread Damian Guy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Guy updated KAFKA-5059:
--
Fix Version/s: 0.11.0.0

> Implement Transactional Coordinator
> ---
>
> Key: KAFKA-5059
> URL: https://issues.apache.org/jira/browse/KAFKA-5059
> Project: Kafka
>  Issue Type: New Feature
>  Components: core
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 0.11.0.0
>
>
> This covers the implementation of the transaction coordinator to support 
> transactions, as described in KIP-98: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5168) Cleanup delayed produce purgatory during partition emmigration

2017-05-03 Thread Damian Guy (JIRA)
Damian Guy created KAFKA-5168:
-

 Summary: Cleanup delayed produce purgatory during partition 
emmigration 
 Key: KAFKA-5168
 URL: https://issues.apache.org/jira/browse/KAFKA-5168
 Project: Kafka
  Issue Type: Sub-task
  Components: core
Reporter: Damian Guy
Assignee: Damian Guy


When partitions are emmigrated we need to forceComplete any partition in the 
replica managers delayed produce purgatory so that they can error out. This 
needs to be done after the partition has been removed from the ownedPartitions 
map so that they can error out with NOT_COORDINATOR



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-03 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995380#comment-15995380
 ] 

Matthias J. Sax commented on KAFKA-5154:


Thanks. We tried to reproduce, too. But no success yet... Let's see how quickly 
we can track it down.

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
> stream-thread [StreamThread-1] Streams application error during processing:
>  java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> 2017-05-01T00:02:00,038 INFO  StreamThread-1 
> org.apache.kafka.clients.producer.KafkaProducer.close() @689 - Closing the 
> Kafka 

Re: [DISCUSS] KIP-146: Classloading Isolation in Connect

2017-05-03 Thread Konstantine Karantasis
Thank you Stephane,

your comments bring interesting and useful subjects to the discussion. I'm
adding my replies below Ewen's comments.


On Tue, May 2, 2017 at 10:15 PM, Ewen Cheslack-Postava 
wrote:

> On Tue, May 2, 2017 at 10:01 PM, Stephane Maarek <
> steph...@simplemachines.com.au> wrote:
>
> Excellent feedback, Stephane!
>
> Thanks for the work, it’s definitely needed!
> > I’d like to suggest to take it one step further.
> >
> > To me, I’d like to see Kafka Connect the same way we have Docker and
> > Docker repositories.
> >
> > Here’s how I would envision the flow:
> > - Kafka Connect workers are just workers. They come with no jars
> whatsoever
> > - The REST API allow you to add a config to the connect cluster
> > - The workers, seeing the config, pull the jars from the available
> > (maven?) repositories (public or private)
> >
>
> I think supporting this mode is really valuable. It seems *really*
> attractive if you have some easily accessible, scalable, centralized
> storage for connectors (i.e. some central distributed FS).
>
> But having to jump through these hoops (which presumably include some extra
> URLs to point to the connectors, some definition of what that URL points to
> e.g. zip of jars? uberjar? something with a more complex spec?) just to do
> a quickstart seems like quite a bit of overhead. I think we should consider
> both a) incremental progress that opens up new options but doesn't prevent
> us from the ideal end state and b) all local testing/dev/prod use cases
> (which is also why I still like having the plain old CLASSPATH option
> available).
>
> I think the proposal leaves open the scope for this -- it doesn't specify
> connector-specific overrides, but that's obviously something that could be
> added easily.
>
>
Regarding 1) the workers carrying no modules, I believe we are pretty close
to this situation today, at least in terms of connectors. Still, I feel
that whether we bundle a few basic modules with the framework is orthogonal
to class loading isolation. I agree with you that Connectors, Converters
and Transformations should be external modules that are loaded by the
framework. But having a few basic ones shipped with Connect simplifies
on-boarding and quickstarts significantly.

Regarding 2) and 3) these are very good to have, and definitely belong to
the near term vision for Kafka Connect. Many interesting things to do here,
from programmatically fetching connectors and their dependencies from maven
repos (using something like Aether maybe), to deciding to support certain
types of module bundling such as zip, uberjars etc. However dealing with
the issue of extended discoverability at this point seems to broaden
significantly the scope of this KIP. I think it's more practical (and
probably faster) to proceed in phases. I estimate that after this KIP,
subsequent KIPs will be intuitive and quite transparent.



> > - Classpath isolation comes into play so that the pulled jar doesn’t
> > interact with other connectors
> > - Additionally, I believe the config should have a “tag” or “version”
> > (like docker really), so that
> > o you can run different versions of the same connector on your connect
> > cluster
> > o configurations are strongly linked to a connector (right now if I
> update
> > my connector jars, I may break my configuration)
> >
>
> We have versions on Connectors. We don't have them on transformations or
> converters, which was definitely an oversight -- we should figure out how
> to get them in there.
>
> I think none of the proposals here rule out taking advantage of versioning
> in the future -- would it be helpful to add a "Future Work" section that
> gives an idea of how we'd extend this in the future to handle versions as
> well?
>
>
Dealing with versioning was left out of the scope of this KIP because
Transformations and Converters currently are not versioned. I like the idea
of adding a Future Work section. I'll add one to highlight the intention to
support versioning.


>
> > I know this is a bit out of scope, but if major changes are coming to
> > connect then these are my two cents.
> >
>
> :) I think we can address most of these incrementally, but keep the scope
> here a bit smaller and more manageable (and hopefully get it into
> 0.11.0.0!). I just don't want perfect to be the enemy of good -- getting
> the first step in as long as it doesn't cause problems down the line seems
> like a good step.
>
>
> >
> > Finally, maybe extend that construct to Transformers. The ability to
> > externalise transformers as jars would democratize their usage IMO
> >
>
> This should definitely happen! It's important for pretty much any pluggable
> component, though I think Transformations will, on average, have the fewest
> extra dependencies. Connectors have the most, followed by Converters, then
> Transformations. But I think Konstantine has mentioned these in the KIP --
> if it could be clearer in the proposed changes, perhaps you 

Re: [DISCUSS] KIP-146: Classloading Isolation in Connect

2017-05-03 Thread Konstantine Karantasis
Thanks Ewen. I'm replying inline as well.

On Tue, May 2, 2017 at 11:24 AM, Ewen Cheslack-Postava 
wrote:

> Thanks for the KIP.
>
> A few responses inline, followed by additional comments.
>
> On Mon, May 1, 2017 at 9:50 PM, Konstantine Karantasis <
> konstant...@confluent.io> wrote:
>
> > Gwen, Randall thank you for your very insightful observations. I'm glad
> you
> > find this first draft to be an adequate platform for discussion.
> >
> > I'll attempt replying to your comments in order.
> >
> > Gwen, I also debated exactly the same two options: a) interpreting
> absence
> > of module path as a user's intention to turn off isolation and b)
> > explicitly using an additional boolean property. A few reasons why I went
> > with b) in this first draft are:
> > 1) As Randall mentions, to leave the option of using a default value
> open.
> > If not immediately in the first version of isolation, maybe in the
> future.
> > 2) I didn't like the implicit character of the choice of interpreting an
> > empty string as a clear intention to turn isolation off by the user. Half
> > the time could be just that users forget to set a location, although
> they'd
> > like to use class loading isolation.
> > 3) There's a slim possibility that in rare occasions a user might want to
> > avoid even the slightest increase in memory consumption due to class
> > loading duplication. I admit this should be very rare, but given the
> other
> > concerns and that we would really like to keep the isolation
> implementation
> > simple, the option to turn off this feature by using only one additional
> > config property might not seem too excessive. At least at the start of
> this
> > discussion.
> > 4) Debugging during development might be simpler in some cases.
> > 5) Finally, as you mention, this could allow for smoother upgrades.
> >
>
> I'm not sure any of these keep you from removing the extra config. Is there
> any reason you couldn't have clean support for relying on the CLASSPATH
> while still supporting the classloaders? Then getting people onto the new
> classloaders does require documentation for how to install connectors, but
> that's pretty minimal. And we don't break existing installations where
> people are just adding to the CLASSPATH. It seems like this:
>
> 1. Allows you to set a default. Isolation is always enabled, but we won't
> include any paths/directories we already use. Setting a default just
> requires specifying a new location where we'd hold these directories.
> 2. It doesn't require the implicit choice -- you actually never turn off
> isolation, but still support the regular CLASSPATH with an empty list of
> isolated loaders
> 3. The user can still use CLASSPATH if they want to minimize classloader
> overhead
> 4. Debugging can still use CLASSPATH
> 5. Upgrades just work.
>

Falling back to CLASSPATH for non-isolated mode makes sense. The extra
config property was suggested proactively, as well as for clarity and
handling of defaults. But it's much better if we can do without it. Will be
removed.


>
>
> > Randall, regarding your comments:
> > 1) To keep its focus narrow, this KIP, as well as the first
> implementation
> > of isolation in Connect, assume filesystem based discovery. With careful
> > implementation, transitioning to discovery schemes that support broader
> > URIs I believe should be easy in the future.
> >
>
> Maybe just mention a couple of quick examples in the KIP. When described
> inline it might be more obvious that it will extend cleanly.
>
>
There's an example for a filesystem-based structure. I will enhance it.



> > 2) The example you give makes a good point. However I'm inclined to say
> > that such cases should be addressed more as exceptions rather than as
> being
> > the common case. Therefore, I wouldn't see all dependencies imported by
> the
> > framework as required to be filtered out, because in that case we lose
> the
> > advantage of isolation between the framework and the connectors (and we
> are
> > left only with isolation between connectors).
>
> 3) I tried to abstract implementation details in this the KIP, but you are
> > right. Even though filtering here is mainly used semantically rather than
> > literally, it gives an implementation hint that we could avoid.
> >
>
> I think we're missing another option -- don't do filtering and require that
> those dependencies are correctly filtered out of the modules. If we want to
> be nicer about this, we could also detect maybe 2 or 3 classes while
> scanning for Connectors/Converters/Transformations that indicate the
> classloader has jars that it shouldn't and warn about it. I can't think of
> that many that would be an issue -- basically connect-api, connect-runtime
> if they really mess it up, and maybe slf4j.
>
>
This sounds a bit more strict, which is not necessarily a bad thing. Still,
it seems to also keep the subject under the category of implementation
decisions.

> 4) In the same spirit as in 3) 

Re: [DISCUSS] KIP-143: Controller Health Metrics

2017-05-03 Thread Onur Karaman
Regarding the ControllerState and the potential for overlap, I think it
depends on our definition of controller state. While KAFKA-5028 allows only
a single ControllerEvent to be processed at a time, it still allows
interleavings for long-lasting actions like partition reassignment and
topic deletion. For instance, a topic can get created while another topic
is undergoing partition reassignment. In that sense, there is overlap.
However, in the sense of the ControllerEvents being processed, there can be
no overlap.

I also think adding the QueueSize and QueueTimeMs metrics could be a
premature move. I completely agree that these metrics would be valuable
given KAFKA-5028. However, I'm not sure whether the controller event queue
and controller thread as implemented today is actually here to stay or if
it's merely a first step in the controller redesign. Especially when
considering the possibility of moving away from the simple synchronous
zookeeper apis and having better control over handling zookeeper
disconnects and session expirations, it's possible that the queue and
thread could actually get ripped out of the controller and being part of
something more general, making these two controller-level metrics invalid.

On Wed, May 3, 2017 at 7:55 AM, Ismael Juma  wrote:

> Thanks for the feedback Tom, Joel and Jun.
>
> I updated the KIP in the following way:
>
> 1. Removed ControlledShutdownRateAndTimeMs
> 2. Added QueueSize and QueueTimeMs
> 3. Renamed FailedIsrUpdateRate to FailedIsrUpdatesPerSec for consistency
> with other metrics in the Partition class
> 4. Mentioned that Yammer metrics will be used
> 5. Noted that no Controller locks are acquired when retrieving metric
> values
>
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?
> pageId=69407758=6=4
>
> If there are no additional concerns, I will start a vote tomorrow.
>
> Ismael
>
> On Tue, May 2, 2017 at 2:35 AM, Jun Rao  wrote:
>
> > Hi, Joel,
> >
> > 1. Yes, the long term goal is to migrate the metrics on the broker to
> > kafka-metrics. Since many people are using Yammer reporters, we probably
> > need to support a few popular ones in kafka-metrics before migrating.
> Until
> > that happens, we probably want to stick with the Yammer metrics on the
> > server side unless we depend on features from kafka-metrics (e.g, quota).
> >
> > 2. Thanks for Onur, we now have moved to a single threaded model. So,
> none
> > of the event in metric #1 will overlap.
> >
> > 3. Metric #6 just track the rate/latency each time the controller is
> called
> > to initiate or resume the processing of a reassginment request.
> >
> > 4. Metrics #2 and #3. The issue with relying on metric #1 is that the
> > latter is sensitive to the frequency of metric collection. For example,
> if
> > the starting of the controller takes 30 secs and the metric is only
> > collected once a minute, one may not know the latency with just metric
> #1,
> > but will know the latency with metrics #2 and #3. Are you concerned about
> > the memory overhead of histograms? It doesn't seem that a couple of more
> > histograms will hurt.
> >
> > 5. Metric #9. Agreed. After KAFKA-5028, this will be reflected in the
> > remoteTimeMs of the controlled shutdown request.
> >
> > 6. Metrics #4 and #5. They are actually a bit different. The local time
> of
> > createTopic/deleteTopic just includes the time to add/delete the topic
> path
> > in ZK. The remote time includes the time that the controller processes
> the
> > request plus the time for the metadata to be propagated back to the
> > controller. So, knowing just the portion of the time spent in the
> > controller can still be useful.
> >
> > Hi, Isamel,
> >
> > Thanks the for proposal. A couple of more comments.,
> >
> > 10. It would be useful to add a new metrics for the controller queue
> size.
> > kafka.controller:type=ControllerStats,name=QueueSize
> >
> > 11. It would also be useful to know how long an event is waiting in the
> > controller queue before being processing. Perhaps, we can add a histogram
> > metric like the following.
> > kafka.controller:type=ControllerStats,name=QueueTimeMs
> >
> > Jun
> >
> > On Thu, Apr 27, 2017 at 11:39 AM, Joel Koshy 
> wrote:
> >
> > > Thanks for the KIP - couple of comments:
> > > - Do you intend to actually use yammer metrics? or use kafka-metrics
> and
> > > split the timer into an explicit rate and time? I think long term we
> > ought
> > > to move off yammer and use kafka-metrics only. Actually either is fine,
> > but
> > > we should ideally use only one in the long term - and I thought the
> plan
> > > was to use kafka-metrics.
> > > - metric #9 appears to be redundant since we already have per-API
> request
> > > rate and time metrics.
> > > - Same for metric #4, #5 (as there are request stats for
> > > DeleteTopicRequest - although it is possible for users to trigger
> deletes
> > > via ZK)
> > > - metric #2, #3 are 

Re: [DISCUSS] KIP-137: Enhance TopicCommand --describe to show topics marked for deletion

2017-05-03 Thread Mickael Maison
Yes it's what I was thinking when writing this up, JSON output would
be nice. I'll be happy to have a look at it. I'm guessing that would
require another KIP ?

On Wed, May 3, 2017 at 8:26 AM, Ismael Juma  wrote:
> Yeah, structured output for the CLI tools would be great. 3 digit number
> JIRA, nice. :)
>
> Ismael
>
> On Wed, May 3, 2017 at 7:05 AM, Ewen Cheslack-Postava 
> wrote:
>
>> Since everything is whitespace delimited anyway, I don't think we should
>> worry about the compatibility issue. We don't guarantee this unstructured
>> output format. I think it is fine to say that any parser that doesn't do
>> something straightforward and reliable like splitting the line by
>> whitespace then checking the : prefixed value to determine if it is usable
>> is ok to break.
>>
>> Long term, we should really just get more structured output formats for the
>> command line tools, a la https://issues.apache.org/jira/browse/KAFKA-313.
>>
>> -Ewen
>>
>> On Wed, Apr 26, 2017 at 2:59 AM, Ismael Juma  wrote:
>>
>> > Right, the reason for inserting it before the configs is that
>> > MarkedForDeletion is a fixed length field while configs is a variable
>> > length field. The fact that MarkedForDeletion is optional and typically
>> not
>> > set means that it's also justifiable to place it after the configs. So,
>> I'm
>> > OK either way.
>> >
>> > Ismael
>> >
>> > On Wed, Apr 26, 2017 at 10:42 AM, Mickael Maison <
>> mickael.mai...@gmail.com
>> > >
>> > wrote:
>> >
>> > > Thanks for the feedback.
>> > >
>> > > I had the same thinking as James. Also we plan to only add the
>> > > MarkedForDeletion field for topics pending deletion as the output of
>> > > --describe is already pretty dense and most topics are never pending
>> > > deletion.
>> > >
>> > > The only reason I came up to insert it in the middle is if Configs is
>> > > long, then MarkedForDeletion could be pushed on a new line/off-screen.
>> > > Am I missing something ?
>> > >
>> > > That said, I don't have a strong opinion about it and if most people
>> > > prefer it the other way around I'll be happy to update the KIP.
>> > >
>> > > On Wed, Apr 26, 2017 at 12:25 AM, James Cheng 
>> > > wrote:
>> > > > Having "MarkedForDeletion" before "Configs" may break anyone who is
>> > > parsing this output, since they may be expecting the 4th string to be
>> > > "Configs".
>> > > >
>> > > > I know that the Compatibility section already says that people
>> parsing
>> > > this may have to adjust their parsing logic, so maybe that covers my
>> > > concern already. But inserting the new MarkedForDeletion word into the
>> > > middle of the string seems like it'll break parsing more than just
>> > adding a
>> > > new value at the end.
>> > > >
>> > > > I'm fine either way, though.
>> > > >
>> > > > -James
>> > > >
>> > > >> On Apr 25, 2017, at 9:38 AM, Vahid S Hashemian <
>> > > vahidhashem...@us.ibm.com> wrote:
>> > > >>
>> > > >> Thanks for the KIP Mickael.
>> > > >> Looks good. I also prefer 'MarkedForDeletion' before 'Configs'.
>> > > >>
>> > > >> --Vahid
>> > > >>
>> > > >>
>> > > >>
>> > > >> From:   Ismael Juma 
>> > > >> To: dev@kafka.apache.org
>> > > >> Date:   04/25/2017 04:15 AM
>> > > >> Subject:Re: [DISCUSS] KIP-137: Enhance TopicCommand
>> --describe
>> > > to
>> > > >> show topics marked for deletion
>> > > >> Sent by:isma...@gmail.com
>> > > >>
>> > > >>
>> > > >>
>> > > >> Thanks for the KIP. Would it make sense for MarkedForDeletion to be
>> > > before
>> > > >> `Configs`? I can see arguments both ways, so I was wondering what
>> your
>> > > >> thoughts were?
>> > > >>
>> > > >> Ismael
>> > > >>
>> > > >> On Thu, Mar 30, 2017 at 5:39 PM, Mickael Maison <
>> > > mickael.mai...@gmail.com>
>> > > >> wrote:
>> > > >>
>> > > >>> Hi all,
>> > > >>>
>> > > >>> We created KIP-137: Enhance TopicCommand --describe to show topics
>> > > >>> marked for deletion
>> > > >>>
>> > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > > >>>
>> > > >> 137%3A+Enhance+TopicCommand+--describe+to+show+topics+marked
>> > > +for+deletion
>> > > >>>
>> > > >>> Please help review the KIP. You feedback is appreciated!
>> > > >>>
>> > > >>> Thanks
>> > > >>>
>> > > >>
>> > > >>
>> > > >>
>> > > >>
>> > > >
>> > >
>> >
>>


[jira] [Commented] (KAFKA-5004) poll() timeout not enforced when connecting to 0.10.0 broker

2017-05-03 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995257#comment-15995257
 ] 

Colin P. McCabe commented on KAFKA-5004:


Thanks for filing this, [~mjsax].  I think the severity is mitigated somewhat 
by the fact that there has to be a client-side bug (polling thread dies) to 
trigger the bad behavior.

bq. IMHO, a "clean" solution would be, to disable the heartbeat thread if the 
client connects to 0.10.0 broker and sends heartbeats on poll() as 0.10.0 
consumer does. Not sure, how complex this would be to do though.

I think this would be a bit risky since we'd be adding code that only ever gets 
used in a very obscure error path when talking to 0.10.0 brokers.  It's not 
likely to be well-tested.

bq. [~cmccabe] had the idea to set a "flag" on the heartbeat thread each time 
poll() is called, and let the heartbeat thread stop if max.poll.interval.ms 
passed and flag got not "renewed".

Yeah, this might be a good option.

> poll() timeout not enforced when connecting to 0.10.0 broker
> 
>
> Key: KAFKA-5004
> URL: https://issues.apache.org/jira/browse/KAFKA-5004
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 0.10.2.0
>Reporter: Matthias J. Sax
>
> In 0.10.1, heartbeat thread and new poll timeout {{max.poll.interval.ms}} got 
> introduced via KIP-62. In 0.10.2, we added client-broker backward 
> compatibility.
> Now, if a 0.10.2 client connects to a 0.10.0 broker, the broker only 
> understand the heartbeat timeout but not the poll timeout, while the client 
> is still using the heartbeat background threat. Thus, the new client config 
> {{max.poll.interval.ms}} is ignored.
> In the worst case, the polling threat might die while the heartbeat thread is 
> still up. Thus, the broker would not timeout the client and no rebalance 
> would be triggered while at the same time the client is effectively dead not 
> making any progress in its assigned partitions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5167) streams task gets stuck after re-balance due to LockException

2017-05-03 Thread Narendra Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Narendra Kumar updated KAFKA-5167:
--
 Attachment: logs.txt
Description: 
During rebalance processor node's close() method gets called two times once 
from StreamThread.suspendTasksAndState() and once from 
StreamThread.closeNonAssignedSuspendedTasks(). I have some instance filed which 
I am closing in processor's close method. This instance's close method throws 
some exception if I call close more than once. Because of this exception, the 
Kafka streams does not attempt to close the statemanager ie.  
task.closeStateManager(true) is never called. When a task moves from one thread 
to another within same machine the task blocks trying to get lock on state 
directory which is still held by unclosed statemanager and keep throwing the 
following exception:

2017-04-30 12:34:17 WARN  StreamThread:1214 - Could not create task 0_1. Will 
retry.
org.apache.kafka.streams.errors.LockException: task [0_1] Failed to lock the 
state directory for task 0_1
at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:100)
at 
org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
at 
org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
at 
org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:864)
at 
org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1237)
at 
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210)
at 
org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:967)
at 
org.apache.kafka.streams.processor.internals.StreamThread.access$600(StreamThread.java:69)
at 
org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:234)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:259)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:352)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:290)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1029)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:592)
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:361)


  was:
During rebalance processor node's close() method gets called two times. I have 
some instance filed which I am closing in processor's close method. This 
instance's close method throws some exception if I call close more than once. 
Because of this exception, the Kafka streams does not attempt to close the 
statemanager ie.  task.closeStateManager(true) is never called. When a task 
moves from one thread to another within same machine the task blocks trying to 
get lock on state directory which is still held by unclosed statemanager and 
keep throwing the following exception:

2017-04-30 12:34:17 WARN  StreamThread:1214 - Could not create task 0_1. Will 
retry.
org.apache.kafka.streams.errors.LockException: task [0_1] Failed to lock the 
state directory for task 0_1
at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:100)
at 
org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
at 
org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
at 
org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:864)
at 
org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1237)
at 
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210)
at 
org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:967)
at 
org.apache.kafka.streams.processor.internals.StreamThread.access$600(StreamThread.java:69)
at 
org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:234)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:259)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:352)
at 

[jira] [Created] (KAFKA-5167) streams task gets stuck after re-balance due to LockException

2017-05-03 Thread Narendra Kumar (JIRA)
Narendra Kumar created KAFKA-5167:
-

 Summary: streams task gets stuck after re-balance due to 
LockException
 Key: KAFKA-5167
 URL: https://issues.apache.org/jira/browse/KAFKA-5167
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.10.2.0
Reporter: Narendra Kumar


During rebalance processor node's close() method gets called two times. I have 
some instance filed which I am closing in processor's close method. This 
instance's close method throws some exception if I call close more than once. 
Because of this exception, the Kafka streams does not attempt to close the 
statemanager ie.  task.closeStateManager(true) is never called. When a task 
moves from one thread to another within same machine the task blocks trying to 
get lock on state directory which is still held by unclosed statemanager and 
keep throwing the following exception:

2017-04-30 12:34:17 WARN  StreamThread:1214 - Could not create task 0_1. Will 
retry.
org.apache.kafka.streams.errors.LockException: task [0_1] Failed to lock the 
state directory for task 0_1
at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:100)
at 
org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
at 
org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
at 
org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:864)
at 
org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1237)
at 
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210)
at 
org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:967)
at 
org.apache.kafka.streams.processor.internals.StreamThread.access$600(StreamThread.java:69)
at 
org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:234)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:259)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:352)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:290)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1029)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:592)
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:361)




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Questions about Development Practices

2017-05-03 Thread Colin McCabe
On Sun, Apr 2, 2017, at 13:28, Daan Rennings wrote:
> Dear developers,
> 
> When going over your way of working adopted with Apache Kafka, I was
> wondering about the following:
> 
> *1) Visualizing Technical Debt*
> Based on my findings (with FindBugs, CheckStyle and JaCoCo), I concluded
> that Kafka's codebase has a good overall quality with regard to the
> architecture and individual lines of code (great work!). I was however
> wondering, have you ever considered to also include reports from e.g.
> Sonarqube to make this quality more visuable? I believe it's nice for
> programmers to see that they are delivering a good job (through e.g. the
> Sonarqube panel after fixing some issues with regard to technical debt),
> but technical debt would also become even more manageable through the
> adoption of such a tool (with regard to the latter, one could also think
> of
> e.g. using CodeCity).

Hi Daan,

That's interesting.  Jenkins provides some of the functionality you
describe, in terms of giving developers a dashboard to look at.  I think
the biggest improvement we could make is probably improving test
stability there.  There have also been proposals to add a test coverage
checking tool to our Jenkins pre-commit build, which would be a welcome
improvement for anyone who wants to contribute.

> 
> *2) Adaptations from FindBugs and CheckStyle's defaults*
> Based on the findbugs-exclude.xml and checkstyle.xml, I found that you
> have
> decided to deviate from some default values (e.g. exluding bugs with
> regard
> to MS (Malicious code vulnerabilities) and NPathComplexity of max 500
> instead of the default 200). Is there any documentation on the decision
> made for these deviations? Or, if not, could you elaborate upon your
> choices?

In general, we found that the findbugs MS warnings were not practical to
fix.  For example, it would have meant getting rid of all public array
fields, as well as any function that returned a mutable reference to
anything inside a class.  This would have required a rewrite of many
internal classes, for unclear benefit.  Kafka was also originally a
Scala project, so there is a philosophy of sometimes leaving internal
data fields public or package-private, as is typically done in Scala. 
We have been moving away from this in the Java code, and towards putting
things behind accessors, but a lot of code still eschews them.  These
issues tend to be more important when you are writing library APIs than
in internal code.

With regard to NPathComplexity, I think that the reasoning was that
setting it to 200 was causing it to fail too much existing code.  I'm
actually not a big fan of this checkstyle rule in the first place, since
I think the complexity of code is better assessed by a human than by
checkstyle.  I do think the naming and whitespace rules enforced by
checkstyle are useful.

cheers,
Colin

> 
> Thank you very much in advance!
> 
> Kind regards,
> 
> Daan Rennings
> 
> P.S. I am with a team of students from Delft University of Technology,
> trying to analyze Apache Kafka as part of the course "IN4315 Software
> Architecture" which will publish it's findings in a GitBook (for more
> information, please have a look at
> https://avandeursen.com/2017/01/15/the-collaborative-software-architecture-course/).
> Answers to both questions would be useful for our analysis of Apache
> Kafka.


[jira] [Created] (KAFKA-5166) Add option "dry run" to Streams application reset tool

2017-05-03 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-5166:
--

 Summary: Add option "dry run" to Streams application reset tool
 Key: KAFKA-5166
 URL: https://issues.apache.org/jira/browse/KAFKA-5166
 Project: Kafka
  Issue Type: Improvement
  Components: streams, tools
Affects Versions: 0.10.2.0
Reporter: Matthias J. Sax
Priority: Minor


We want to add an option to Streams application reset tool, that allow for a 
"dry run". Ie, only prints what topics would get modified/deleted without 
actually applying any actions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is back to normal : kafka-trunk-jdk8 #1476

2017-05-03 Thread Apache Jenkins Server
See 




[jira] [Commented] (KAFKA-4985) kafka-acls should resolve dns names and accept ip ranges

2017-05-03 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995159#comment-15995159
 ] 

Colin P. McCabe commented on KAFKA-4985:


Hmm.  The problem with resolving hostnames client-side is that it would cause a 
lot of confusion when resolution happened differently client-side versus 
server-side.  It's probably better just to use IPs to be unambiguous.

Allowing patterns would be a nice improvement.  In the past, we've held back 
from this since we didn't want to be tied to a particular regular expression 
implementation. Maybe if we could find a fast and standard one, we could use 
that, though.

> kafka-acls should resolve dns names and accept ip ranges
> 
>
> Key: KAFKA-4985
> URL: https://issues.apache.org/jira/browse/KAFKA-4985
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Reporter: Ryan P
>
> Per KAFKA-2869 it looks like a conscious decision was made to move away from 
> using hostnames for authorization purposes. 
> This is fine however IP addresses are terrible inconvenient compared to 
> hostname with regard to configuring ACLs. 
> I'd like to propose the following two improvements to make managing these 
> ACLs easier for end-users. 
> 1. Allow for simple patterns to be matched 
> i.e --allow-host 10.17.81.11[1-9] 
> 2. Allow for hostnames to be used even if they are resolved on the client 
> side. Simple pattern matching on hostnames would be a welcome addition as well
> i.e. --allow-host host.name.com
> Accepting a comma delimited list of hostnames and ip addresses would also be 
> helpful.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-143: Controller Health Metrics

2017-05-03 Thread Ismael Juma
Thanks for the feedback Tom, Joel and Jun.

I updated the KIP in the following way:

1. Removed ControlledShutdownRateAndTimeMs
2. Added QueueSize and QueueTimeMs
3. Renamed FailedIsrUpdateRate to FailedIsrUpdatesPerSec for consistency
with other metrics in the Partition class
4. Mentioned that Yammer metrics will be used
5. Noted that no Controller locks are acquired when retrieving metric values

https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=69407758=6=4

If there are no additional concerns, I will start a vote tomorrow.

Ismael

On Tue, May 2, 2017 at 2:35 AM, Jun Rao  wrote:

> Hi, Joel,
>
> 1. Yes, the long term goal is to migrate the metrics on the broker to
> kafka-metrics. Since many people are using Yammer reporters, we probably
> need to support a few popular ones in kafka-metrics before migrating. Until
> that happens, we probably want to stick with the Yammer metrics on the
> server side unless we depend on features from kafka-metrics (e.g, quota).
>
> 2. Thanks for Onur, we now have moved to a single threaded model. So, none
> of the event in metric #1 will overlap.
>
> 3. Metric #6 just track the rate/latency each time the controller is called
> to initiate or resume the processing of a reassginment request.
>
> 4. Metrics #2 and #3. The issue with relying on metric #1 is that the
> latter is sensitive to the frequency of metric collection. For example, if
> the starting of the controller takes 30 secs and the metric is only
> collected once a minute, one may not know the latency with just metric #1,
> but will know the latency with metrics #2 and #3. Are you concerned about
> the memory overhead of histograms? It doesn't seem that a couple of more
> histograms will hurt.
>
> 5. Metric #9. Agreed. After KAFKA-5028, this will be reflected in the
> remoteTimeMs of the controlled shutdown request.
>
> 6. Metrics #4 and #5. They are actually a bit different. The local time of
> createTopic/deleteTopic just includes the time to add/delete the topic path
> in ZK. The remote time includes the time that the controller processes the
> request plus the time for the metadata to be propagated back to the
> controller. So, knowing just the portion of the time spent in the
> controller can still be useful.
>
> Hi, Isamel,
>
> Thanks the for proposal. A couple of more comments.,
>
> 10. It would be useful to add a new metrics for the controller queue size.
> kafka.controller:type=ControllerStats,name=QueueSize
>
> 11. It would also be useful to know how long an event is waiting in the
> controller queue before being processing. Perhaps, we can add a histogram
> metric like the following.
> kafka.controller:type=ControllerStats,name=QueueTimeMs
>
> Jun
>
> On Thu, Apr 27, 2017 at 11:39 AM, Joel Koshy  wrote:
>
> > Thanks for the KIP - couple of comments:
> > - Do you intend to actually use yammer metrics? or use kafka-metrics and
> > split the timer into an explicit rate and time? I think long term we
> ought
> > to move off yammer and use kafka-metrics only. Actually either is fine,
> but
> > we should ideally use only one in the long term - and I thought the plan
> > was to use kafka-metrics.
> > - metric #9 appears to be redundant since we already have per-API request
> > rate and time metrics.
> > - Same for metric #4, #5 (as there are request stats for
> > DeleteTopicRequest - although it is possible for users to trigger deletes
> > via ZK)
> > - metric #2, #3 are potentially useful, but a bit overkill for a
> > histogram. Alternative is to stick to last known value, but that doesn't
> > play well with alerts if a high value isn't reset/decayed. Perhaps metric
> > #1 would be sufficient to gauge slow start/resignation transitions.
> > - metric #1 - some of the states may actually overlap
> > - I don't actually understand the semantics of metric #6. Is it rate of
> > partition reassignment triggers? does the number of partitions matter?
> >
> > Joel
> >
> > On Thu, Apr 27, 2017 at 8:04 AM, Tom Crayford 
> > wrote:
> >
> >> Ismael,
> >>
> >> Great, that sounds lovely.
> >>
> >> I'd like a `Timer` (using yammer metrics parlance) over how long it took
> >> to
> >> process the event, so we can get at p99 and max times spent processing
> >> things. Maybe we could even do a log at warning level if event
> processing
> >> takes over some timeout?
> >>
> >> Thanks
> >>
> >> Tom
> >>
> >> On Thu, Apr 27, 2017 at 3:59 PM, Ismael Juma  wrote:
> >>
> >> > Hi Tom,
> >> >
> >> > Yes, the plan is to merge KAFKA-5028 first and then use a lock-free
> >> > approach for the new  metrics. I considered mentioning that in the KIP
> >> > given KAFKA-5120, but didn't in the end. I'll add it to make it clear.
> >> >
> >> > Regarding locks, they are removed by KAFKA-5028, as you say. So, if I
> >> > understand correctly, you are suggesting an event processing rate
> metric
> >> > with event type as a tag? Onur and Jun, 

[jira] [Commented] (KAFKA-5155) Messages can be deleted prematurely when some producers use timestamps and some not

2017-05-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/KAFKA-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994831#comment-15994831
 ] 

Petr Plavjaník commented on KAFKA-5155:
---

Hi [~huxi_2b] and [~mihbor],

this defect is about the potential data loss that has occurred in our test 
scenario and not about the ordering of messages based on their timestamps. We 
were using a non-Java producer that used version 0 of the message format 
(without timestamp) but the first message in each partition was written by 
KafkaProducer in Java that used version 1 message format with timestamp. Some 
messages were lost (written to the log but deleted before they were read) when 
the first retention was done. The fix for it should not change how timestamps 
are used elsewhere just make sure that time-based retention.
It is listed in section _Potential breaking changes in 0.10.1.0_: {quote}The 
log retention time is no longer based on last modified time of the log 
segments. Instead it will be based on the largest timestamp of the messages in 
a log segment.{quote} 

But it can be surprising that old producers create messages with no timestamps 
and that that these are not taken into consideration when the segment is 
deleted. When I first read it I thought that timestamp of messages is the log 
append timestamp. The circumstances when the data loss has occurred are quite 
rare (a segment where the message at the beginning of the log segment has a 
timestamp and the rest do not) but data loss is not good in any case.

I am not sure what is the right way to fix it. One way is just to change the 
{{deleteRetenionMsBreachedSegments()}} to account for {{lastModified}} 
timestamp as before.

{code}
  private def deleteRetenionMsBreachedSegments() : Int = {
if (config.retentionMs < 0) return 0
val startMs = time.milliseconds
deleteOldSegments(startMs - Math.max(_.largestTimestamp, _.lastModified) > 
config.retentionMs)
  }
{code}

The other way is to use current time if {{appendInfo.maxTimestamp}} is {{-1}} 
in {{Log.append()}}. This also affects log segment rolling. But 
{{LogSegment.maxTimestampSoFar}} can be changed by {{LogSegment.truncateTo()}} 
to lower values and then {{Log.deleteRetenionMsBreachedSegments()}} that used 
{{LogSegment.largestTimestamp}} (uses {{LogSegment.maxTimestampSoFar}}) will 
not take messages with no timestamp into account.

> Messages can be deleted prematurely when some producers use timestamps and 
> some not
> ---
>
> Key: KAFKA-5155
> URL: https://issues.apache.org/jira/browse/KAFKA-5155
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.2.0
>Reporter: Petr Plavjaník
>
> Some messages can be deleted prematurely and never read in following 
> scenario. A producer uses timestamps and produces messages that are appended 
> to the beginning of a log segment. Other producer produces messages without a 
> timestamp. In that case the largest timestamp is made by the old messages 
> with a timestamp and new messages with the timestamp does not influence and 
> the log segment with old and new messages can be delete immediately after the 
> last new message with no timestamp is appended. When all appended messages 
> have no timestamp, then they are not deleted because {{lastModified}} 
> attribute of a {{LogSegment}} is used.
> New test case to {{kafka.log.LogTest}} that fails:
> {code}
>   @Test
>   def 
> shouldNotDeleteTimeBasedSegmentsWhenTimestampIsNotProvidedForSomeMessages() {
> val retentionMs = 1000
> val old = TestUtils.singletonRecords("test".getBytes, timestamp = 0)
> val set = TestUtils.singletonRecords("test".getBytes, timestamp = -1, 
> magicValue = 0)
> val log = createLog(set.sizeInBytes, retentionMs = retentionMs)
> // append some messages to create some segments
> log.append(old)
> for (_ <- 0 until 12)
>   log.append(set)
> assertEquals("No segment should be deleted", 0, log.deleteOldSegments())
>   }
> {code}
> It can be prevented by using {{def largestTimestamp = 
> Math.max(maxTimestampSoFar, lastModified)}} in LogSegment, or by using 
> current timestamp when messages with timestamp {{-1}} are appended.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk8 #1475

2017-05-03 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFA-4378: Fix Scala 2.12 "eta-expansion of zero-argument method"

--
[...truncated 3.51 MB...]
kafka.utils.CommandLineUtilsTest > testParseSingleArg PASSED

kafka.utils.CommandLineUtilsTest > testParseArgs STARTED

kafka.utils.CommandLineUtilsTest > testParseArgs PASSED

kafka.utils.CommandLineUtilsTest > testParseEmptyArgAsValid STARTED

kafka.utils.CommandLineUtilsTest > testParseEmptyArgAsValid PASSED

kafka.utils.ReplicationUtilsTest > testUpdateLeaderAndIsr STARTED

kafka.utils.ReplicationUtilsTest > testUpdateLeaderAndIsr PASSED

kafka.utils.ReplicationUtilsTest > testGetLeaderIsrAndEpochForPartition STARTED

kafka.utils.ReplicationUtilsTest > testGetLeaderIsrAndEpochForPartition PASSED

kafka.utils.JsonTest > testJsonEncoding STARTED

kafka.utils.JsonTest > testJsonEncoding PASSED

kafka.utils.ShutdownableThreadTest > testShutdownWhenCalledAfterThreadStart 
STARTED

kafka.utils.ShutdownableThreadTest > testShutdownWhenCalledAfterThreadStart 
PASSED

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask STARTED

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask STARTED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask PASSED

kafka.utils.SchedulerTest > testNonPeriodicTask STARTED

kafka.utils.SchedulerTest > testNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testRestart STARTED

kafka.utils.SchedulerTest > testRestart PASSED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler STARTED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler PASSED

kafka.utils.SchedulerTest > testPeriodicTask STARTED

kafka.utils.SchedulerTest > testPeriodicTask PASSED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath STARTED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath PASSED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing STARTED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing PASSED

kafka.utils.IteratorTemplateTest > testIterator STARTED

kafka.utils.IteratorTemplateTest > testIterator PASSED

kafka.utils.UtilsTest > testGenerateUuidAsBase64 STARTED

kafka.utils.UtilsTest > testGenerateUuidAsBase64 PASSED

kafka.utils.UtilsTest > testAbs STARTED

kafka.utils.UtilsTest > testAbs PASSED

kafka.utils.UtilsTest > testReplaceSuffix STARTED

kafka.utils.UtilsTest > testReplaceSuffix PASSED

kafka.utils.UtilsTest > testCircularIterator STARTED

kafka.utils.UtilsTest > testCircularIterator PASSED

kafka.utils.UtilsTest > testReadBytes STARTED

kafka.utils.UtilsTest > testReadBytes PASSED

kafka.utils.UtilsTest > testCsvList STARTED

kafka.utils.UtilsTest > testCsvList PASSED

kafka.utils.UtilsTest > testReadInt STARTED

kafka.utils.UtilsTest > testReadInt PASSED

kafka.utils.UtilsTest > testUrlSafeBase64EncodeUUID STARTED

kafka.utils.UtilsTest > testUrlSafeBase64EncodeUUID PASSED

kafka.utils.UtilsTest > testCsvMap STARTED

kafka.utils.UtilsTest > testCsvMap PASSED

kafka.utils.UtilsTest > testInLock STARTED

kafka.utils.UtilsTest > testInLock PASSED

kafka.utils.UtilsTest > testSwallow STARTED

kafka.utils.UtilsTest > testSwallow PASSED

kafka.producer.AsyncProducerTest > testFailedSendRetryLogic STARTED

kafka.producer.AsyncProducerTest > testFailedSendRetryLogic PASSED

kafka.producer.AsyncProducerTest > testQueueTimeExpired STARTED

kafka.producer.AsyncProducerTest > testQueueTimeExpired PASSED

kafka.producer.AsyncProducerTest > testPartitionAndCollateEvents STARTED

kafka.producer.AsyncProducerTest > testPartitionAndCollateEvents PASSED

kafka.producer.AsyncProducerTest > testBatchSize STARTED

kafka.producer.AsyncProducerTest > testBatchSize PASSED

kafka.producer.AsyncProducerTest > testSerializeEvents STARTED

kafka.producer.AsyncProducerTest > testSerializeEvents PASSED

kafka.producer.AsyncProducerTest > testProducerQueueSize STARTED

kafka.producer.AsyncProducerTest > testProducerQueueSize PASSED

kafka.producer.AsyncProducerTest > testRandomPartitioner STARTED

kafka.producer.AsyncProducerTest > testRandomPartitioner PASSED

kafka.producer.AsyncProducerTest > testInvalidConfiguration STARTED

kafka.producer.AsyncProducerTest > testInvalidConfiguration PASSED

kafka.producer.AsyncProducerTest > testInvalidPartition STARTED

kafka.producer.AsyncProducerTest > testInvalidPartition PASSED

kafka.producer.AsyncProducerTest > testNoBroker STARTED

kafka.producer.AsyncProducerTest > testNoBroker PASSED

kafka.producer.AsyncProducerTest > testProduceAfterClosed STARTED

kafka.producer.AsyncProducerTest > testProduceAfterClosed PASSED


Exiting a streams app at end of stream?

2017-05-03 Thread Thomas Becker
We have had a number of situations where we need to migrate data in a
Kafka topic to a new topic that is keyed differently. Stream processing
is a good fit for this use-case with one exception: there is no easy
way to know when your "migration job" is finished. Has any thought been
given to adding an "end of stream" notion to Kafka Streams, and a
corresponding mode to exit the application when all input streams have
hit it?

--


Tommy Becker

Senior Software Engineer

O +1 919.460.4747

tivo.com




This email and any attachments may contain confidential and privileged material 
for the sole use of the intended recipient. Any review, copying, or 
distribution of this email (or any attachments) by others is prohibited. If you 
are not the intended recipient, please contact the sender immediately and 
permanently delete this email and any attachments. No employee or agent of TiVo 
Inc. is authorized to conclude any binding agreement on behalf of TiVo Inc. by 
email. Binding agreements with TiVo Inc. may only be made by a signed written 
agreement.


Build failed in Jenkins: kafka-trunk-jdk7 #2143

2017-05-03 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-3754; Add GC log retention policy to limit size of log

--
[...truncated 837.81 KB...]
kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithSuccessOnAddPartitionsWhenStateIsCompleteAbort PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldWaitForCommitToCompleteOnHandleInitPidAndExistingTransactionInPrepareAbortState
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldWaitForCommitToCompleteOnHandleInitPidAndExistingTransactionInPrepareAbortState
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldNotRetryOnCommitWhenAppendToLogFailsWithNotCoordinator STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldNotRetryOnCommitWhenAppendToLogFailsWithNotCoordinator PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithErrorsNoneOnAddPartitionWhenNoErrorsAndPartitionsTheSame 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithErrorsNoneOnAddPartitionWhenNoErrorsAndPartitionsTheSame PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithInvalidRequestOnEndTxnWhenTransactionalIdIsEmpty STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithInvalidRequestOnEndTxnWhenTransactionalIdIsEmpty PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendCompleteAbortToLogOnEndTxnWhenStatusIsOngoingAndResultIsAbort 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendCompleteAbortToLogOnEndTxnWhenStatusIsOngoingAndResultIsAbort PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendCompleteCommitToLogOnEndTxnWhenStatusIsOngoingAndResultIsCommit 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendCompleteCommitToLogOnEndTxnWhenStatusIsOngoingAndResultIsCommit 
PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithInvalidRequestAddPartitionsToTransactionWhenTransactionalIdIsEmpty
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithInvalidRequestAddPartitionsToTransactionWhenTransactionalIdIsEmpty
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendPrepareAbortToLogOnEndTxnWhenStatusIsOngoingAndResultIsAbort STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendPrepareAbortToLogOnEndTxnWhenStatusIsOngoingAndResultIsAbort PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAcceptInitPidAndReturnNextPidWhenTransactionalIdIsNull STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAcceptInitPidAndReturnNextPidWhenTransactionalIdIsNull PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRemoveTransactionsForPartitionOnEmigration STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRemoveTransactionsForPartitionOnEmigration PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldWaitForCommitToCompleteOnHandleInitPidAndExistingTransactionInPrepareCommitState
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldWaitForCommitToCompleteOnHandleInitPidAndExistingTransactionInPrepareCommitState
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnInvalidTxnRequestOnEndTxnRequestWhenStatusIsCompleteCommitAndResultIsNotCommit
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnInvalidTxnRequestOnEndTxnRequestWhenStatusIsCompleteCommitAndResultIsNotCommit
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnOkOnEndTxnWhenStatusIsCompleteCommitAndResultIsCommit STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnOkOnEndTxnWhenStatusIsCompleteCommitAndResultIsCommit PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionsOnAddPartitionsWhenStateIsPrepareCommit 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionsOnAddPartitionsWhenStateIsPrepareCommit 
PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldIncrementEpochAndUpdateMetadataOnHandleInitPidWhenExistingCompleteTransaction
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldIncrementEpochAndUpdateMetadataOnHandleInitPidWhenExistingCompleteTransaction
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionOnAddPartitionsWhenStateIsPrepareAbort 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionOnAddPartitionsWhenStateIsPrepareAbort 
PASSED


[jira] [Commented] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-03 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994646#comment-15994646
 ] 

Lukas Gemela commented on KAFKA-5154:
-

Ok I've made it running with kafka streams tag 0.10.2.0 plus Guozhang's patch, 
I let you know if the problem occurs again

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
> stream-thread [StreamThread-1] Streams application error during processing:
>  java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> 2017-05-01T00:02:00,038 INFO  StreamThread-1 
> org.apache.kafka.clients.producer.KafkaProducer.close() @689 - Closing the 

Build failed in Jenkins: kafka-trunk-jdk7 #2142

2017-05-03 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFA-4378: Fix Scala 2.12 "eta-expansion of zero-argument method"

--
[...truncated 841.97 KB...]
kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithSuccessOnAddPartitionsWhenStateIsCompleteAbort PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldWaitForCommitToCompleteOnHandleInitPidAndExistingTransactionInPrepareAbortState
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldWaitForCommitToCompleteOnHandleInitPidAndExistingTransactionInPrepareAbortState
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldNotRetryOnCommitWhenAppendToLogFailsWithNotCoordinator STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldNotRetryOnCommitWhenAppendToLogFailsWithNotCoordinator PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithErrorsNoneOnAddPartitionWhenNoErrorsAndPartitionsTheSame 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithErrorsNoneOnAddPartitionWhenNoErrorsAndPartitionsTheSame PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithInvalidRequestOnEndTxnWhenTransactionalIdIsEmpty STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithInvalidRequestOnEndTxnWhenTransactionalIdIsEmpty PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendCompleteAbortToLogOnEndTxnWhenStatusIsOngoingAndResultIsAbort 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendCompleteAbortToLogOnEndTxnWhenStatusIsOngoingAndResultIsAbort PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendCompleteCommitToLogOnEndTxnWhenStatusIsOngoingAndResultIsCommit 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendCompleteCommitToLogOnEndTxnWhenStatusIsOngoingAndResultIsCommit 
PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithInvalidRequestAddPartitionsToTransactionWhenTransactionalIdIsEmpty
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithInvalidRequestAddPartitionsToTransactionWhenTransactionalIdIsEmpty
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendPrepareAbortToLogOnEndTxnWhenStatusIsOngoingAndResultIsAbort STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAppendPrepareAbortToLogOnEndTxnWhenStatusIsOngoingAndResultIsAbort PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAcceptInitPidAndReturnNextPidWhenTransactionalIdIsNull STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldAcceptInitPidAndReturnNextPidWhenTransactionalIdIsNull PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRemoveTransactionsForPartitionOnEmigration STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRemoveTransactionsForPartitionOnEmigration PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldWaitForCommitToCompleteOnHandleInitPidAndExistingTransactionInPrepareCommitState
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldWaitForCommitToCompleteOnHandleInitPidAndExistingTransactionInPrepareCommitState
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnInvalidTxnRequestOnEndTxnRequestWhenStatusIsCompleteCommitAndResultIsNotCommit
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnInvalidTxnRequestOnEndTxnRequestWhenStatusIsCompleteCommitAndResultIsNotCommit
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnOkOnEndTxnWhenStatusIsCompleteCommitAndResultIsCommit STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldReturnOkOnEndTxnWhenStatusIsCompleteCommitAndResultIsCommit PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionsOnAddPartitionsWhenStateIsPrepareCommit 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionsOnAddPartitionsWhenStateIsPrepareCommit 
PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldIncrementEpochAndUpdateMetadataOnHandleInitPidWhenExistingCompleteTransaction
 STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldIncrementEpochAndUpdateMetadataOnHandleInitPidWhenExistingCompleteTransaction
 PASSED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionOnAddPartitionsWhenStateIsPrepareAbort 
STARTED

kafka.coordinator.transaction.TransactionCoordinatorTest > 
shouldRespondWithConcurrentTransactionOnAddPartitionsWhenStateIsPrepareAbort 
PASSED


[jira] [Created] (KAFKA-5165) Kafka Logs Cleanup Not happening, Huge File Growth - Windows

2017-05-03 Thread Manikandan P (JIRA)
Manikandan P created KAFKA-5165:
---

 Summary: Kafka Logs Cleanup Not happening, Huge File Growth - 
Windows
 Key: KAFKA-5165
 URL: https://issues.apache.org/jira/browse/KAFKA-5165
 Project: Kafka
  Issue Type: Bug
 Environment: windows, Kafka Server(Version: 0.9.0.1)
Reporter: Manikandan P


We had set the below configuration: Retention hours as 1, Retention bytes as 
150 MB in the server.properties in the Kafka Server(Version: 0.9.0.1). Also 
modified other settings as given below.

log.dirs=/tmp/kafka-logs  
log.retention.hours=1
log.retention.bytes=157286400
log.segment.bytes=1073741824
log.retention.check.interval.ms=30
log.cleaner.enable=true
log.cleanup.policy=delete
After checking few days, Size of the Kafka log folder too huge as 13.2 GB. We 
have seen that Topic Offset getting updated and ignores the Old data but Log 
File doesnt reduce and has all the Old Data and become too huge. Could you help 
us to find out why Kafka is not deleting the logs(Physically). Do we need to 
change any configuration ?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-3754) Kafka default -Xloggc settings should include GC log rotation flags

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994593#comment-15994593
 ] 

ASF GitHub Bot commented on KAFKA-3754:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1431


> Kafka default -Xloggc settings should include GC log rotation flags
> ---
>
> Key: KAFKA-3754
> URL: https://issues.apache.org/jira/browse/KAFKA-3754
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ryan P
>Assignee: Ryan P
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> By default kafka-run-class.sh defines it's GC settings like so:
> KAFKA_GC_LOG_OPTS="-Xloggc:$LOG_DIR/$GC_LOG_FILE_NAME -verbose:gc 
> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps "
> This has to potential to generate incredibly large log files when left 
> unmanaged. 
> Instead it should include some sort of default GC log retention policy by 
> adding the following flags:
> -XX:+UseGCLogFileRotation 
> -XX:NumberOfGCLogFiles= 10
>  -XX:GCLogFileSize=100M
> http://www.oracle.com/technetwork/java/javase/7u2-relnotes-1394228.html 
> details these flags and their defaults



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #1431: KAFKA-3754 Add GC log retention policy to limit si...

2017-05-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1431


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-3754) Kafka default -Xloggc settings should include GC log rotation flags

2017-05-03 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3754.

   Resolution: Fixed
Fix Version/s: 0.11.0.0

Issue resolved by pull request 1431
[https://github.com/apache/kafka/pull/1431]

> Kafka default -Xloggc settings should include GC log rotation flags
> ---
>
> Key: KAFKA-3754
> URL: https://issues.apache.org/jira/browse/KAFKA-3754
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ryan P
>Assignee: Ryan P
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> By default kafka-run-class.sh defines it's GC settings like so:
> KAFKA_GC_LOG_OPTS="-Xloggc:$LOG_DIR/$GC_LOG_FILE_NAME -verbose:gc 
> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps "
> This has to potential to generate incredibly large log files when left 
> unmanaged. 
> Instead it should include some sort of default GC log retention policy by 
> adding the following flags:
> -XX:+UseGCLogFileRotation 
> -XX:NumberOfGCLogFiles= 10
>  -XX:GCLogFileSize=100M
> http://www.oracle.com/technetwork/java/javase/7u2-relnotes-1394228.html 
> details these flags and their defaults



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-3754) Kafka default -Xloggc settings should include GC log rotation flags

2017-05-03 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3754:
---
Description: 
By default kafka-run-class.sh defines it's GC settings like so:

KAFKA_GC_LOG_OPTS="-Xloggc:$LOG_DIR/$GC_LOG_FILE_NAME -verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps "

This has to potential to generate incredibly large log files when left 
unmanaged. 

Instead it should include some sort of default GC log retention policy by 
adding the following flags:
-XX:+UseGCLogFileRotation 
-XX:NumberOfGCLogFiles= 10
 -XX:GCLogFileSize=100M

http://www.oracle.com/technetwork/java/javase/7u2-relnotes-1394228.html details 
these flags and their defaults


  was:
By default kafka-run-class.sh defines it's GC settings like so:

KAFKA_GC_LOG_OPTS="-Xloggc:$LOG_DIR/$GC_LOG_FILE_NAME -verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps "

This has to potential to generate incredibly large log files when left 
unmanaged. 

Instead it should include some sort of default GC log retention policy by 
adding the following flags:
-XX:+UseGCLogFileRotation 
-XX:NumberOfGCLogFiles= 
 -XX:GCLogFileSize=

http://www.oracle.com/technetwork/java/javase/7u2-relnotes-1394228.html details 
these flags and their defaults



> Kafka default -Xloggc settings should include GC log rotation flags
> ---
>
> Key: KAFKA-3754
> URL: https://issues.apache.org/jira/browse/KAFKA-3754
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ryan P
>Assignee: Ryan P
>Priority: Minor
>
> By default kafka-run-class.sh defines it's GC settings like so:
> KAFKA_GC_LOG_OPTS="-Xloggc:$LOG_DIR/$GC_LOG_FILE_NAME -verbose:gc 
> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps "
> This has to potential to generate incredibly large log files when left 
> unmanaged. 
> Instead it should include some sort of default GC log retention policy by 
> adding the following flags:
> -XX:+UseGCLogFileRotation 
> -XX:NumberOfGCLogFiles= 10
>  -XX:GCLogFileSize=100M
> http://www.oracle.com/technetwork/java/javase/7u2-relnotes-1394228.html 
> details these flags and their defaults



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-5155) Messages can be deleted prematurely when some producers use timestamps and some not

2017-05-03 Thread Michal Borowiecki (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994575#comment-15994575
 ] 

Michal Borowiecki edited comment on KAFKA-5155 at 5/3/17 9:44 AM:
--

Hi [~huxi_2b],
Personally, I feel the similarity is superficial. KAFKA-4398 is about consuming 
messages in timestamp order, which challenges the current design and basically 
calls out for a new feature.

This ticket on the other hand is reporting a defect, with potential data loss, 
which violates the at-least-once semantics.
However, it does not challenge the design, simply points out that one line of 
code needs changing to cater for the case when msgs with and without timestamps 
are appended to the same segment, which IMHO is a non-contentious bugfix.



was (Author: mihbor):
Hi @huxi,
Personally, I feel the similarity is superficial. KAFKA-4398 is about consuming 
messages in timestamp order, which challenges the current design and basically 
calls out for a new feature.

This ticket on the other hand is reporting a defect, with potential data loss, 
which violates the at-least-once semantics.
However, it does not challenge the design, simply points out that one line of 
code needs changing to cater for the case when msgs with and without timestamps 
are appended to the same segment, which IMHO is a non-contentious bugfix.


> Messages can be deleted prematurely when some producers use timestamps and 
> some not
> ---
>
> Key: KAFKA-5155
> URL: https://issues.apache.org/jira/browse/KAFKA-5155
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.2.0
>Reporter: Petr Plavjaník
>
> Some messages can be deleted prematurely and never read in following 
> scenario. A producer uses timestamps and produces messages that are appended 
> to the beginning of a log segment. Other producer produces messages without a 
> timestamp. In that case the largest timestamp is made by the old messages 
> with a timestamp and new messages with the timestamp does not influence and 
> the log segment with old and new messages can be delete immediately after the 
> last new message with no timestamp is appended. When all appended messages 
> have no timestamp, then they are not deleted because {{lastModified}} 
> attribute of a {{LogSegment}} is used.
> New test case to {{kafka.log.LogTest}} that fails:
> {code}
>   @Test
>   def 
> shouldNotDeleteTimeBasedSegmentsWhenTimestampIsNotProvidedForSomeMessages() {
> val retentionMs = 1000
> val old = TestUtils.singletonRecords("test".getBytes, timestamp = 0)
> val set = TestUtils.singletonRecords("test".getBytes, timestamp = -1, 
> magicValue = 0)
> val log = createLog(set.sizeInBytes, retentionMs = retentionMs)
> // append some messages to create some segments
> log.append(old)
> for (_ <- 0 until 12)
>   log.append(set)
> assertEquals("No segment should be deleted", 0, log.deleteOldSegments())
>   }
> {code}
> It can be prevented by using {{def largestTimestamp = 
> Math.max(maxTimestampSoFar, lastModified)}} in LogSegment, or by using 
> current timestamp when messages with timestamp {{-1}} are appended.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4421) Update release process so that Scala 2.12 artifacts are published

2017-05-03 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994544#comment-15994544
 ] 

Ismael Juma commented on KAFKA-4421:


The release script that Ewen wrote handles this automatically:

https://github.com/apache/kafka/pull/2795/files

So, we can close this once that script is merged. Once we move to Java 8, we 
can enable Scala 2.12 by default and simplify the release script.

> Update release process so that Scala 2.12 artifacts are published
> -
>
> Key: KAFKA-4421
> URL: https://issues.apache.org/jira/browse/KAFKA-4421
> Project: Kafka
>  Issue Type: Sub-task
>  Components: build
>Reporter: Ismael Juma
> Fix For: 0.11.0.0
>
>
> Since Scala 2.12 requires Java 8 while Kafka still supports Java 7, the *All 
> commands don't include Scala 2.12. As such, simply running releaseTarGzAll 
> won't generate the Scala 2.12 artifacts and we also need to run `./gradlew 
> releaseTagGz -PscalaVersion=2.12`.
> The following page needs to be updated to include this and any other change 
> required:
> https://cwiki.apache.org/confluence/display/KAFKA/Release+Process



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (KAFKA-4378) Address 2.12 eta-expansion warnings

2017-05-03 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-4378.

   Resolution: Fixed
Fix Version/s: 0.11.0.0

> Address 2.12 eta-expansion warnings
> ---
>
> Key: KAFKA-4378
> URL: https://issues.apache.org/jira/browse/KAFKA-4378
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Bernard Leach
>Assignee: Bernard Leach
> Fix For: 0.11.0.0
>
>
> The 2.12 compiler generates warnings about zero-argument eta-expansion.  
> Update the code to make the expansion explicit.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP 145 - Expose Record Headers in Kafka Connect

2017-05-03 Thread Michael Pearce
Hi Ewen,

As code I think helps, as I don’t think I explained what I meant very well.

I have pushed what I was thinking to the branch/pr.
https://github.com/apache/kafka/pull/2942

The key bits added on top here are:
new ConnectHeader that holds the header key (as string) and then header value 
object header value schema

new SubjectConverter which allows exposing a subject, in this case the subject 
is the key. - this can be used to register the header type in repos like schema 
registry, or in my case below in a property file.


We can default the subject converter to String based of Byte based where all 
header values are treated safely as String or byte[] type.

But this way you could add in your own converter which could be more 
sophisticated and convert the header based on the key.

The main part is to have access to the key, so you can look up the header value 
type, based on the key from somewhere, aka a properties file, or some central 
repo (aka schema repo), where the repo subject could be the topic + key, or 
just key if key type is global, and the schema could be primitive, String, 
byte[] or even can be more elaborate.

Cheers
Mike

On 03/05/2017, 06:00, "Ewen Cheslack-Postava"  wrote:

Michael,

Aren't JMS headers an example where the variety is a problem? Unless I'm
misunderstanding, there's not even a fixed serialization format expected
for them since JMS defines the runtime types, not the wire format. For
example, we have JMSCorrelationID (String), JMSExpires (Long), and
JMSReplyTo (Destination). These are simply run time types, so we'd need
either (a) a different serializer/deserializer for each or (b) a
serializer/deserializer that can handle all of them (e.g. Avro, JSON, etc).

What is the actual serialized format of the different fields? And if it's
not specified anywhere in the KIP, why should using the well-known type for
the header key (e.g. use StringSerializer, IntSerializer, etc) be better or
worse than using a general serialization format (e.g. Avro, JSON)? And if
the latter is the choice, how do you decide on the format?

-Ewen

On Tue, May 2, 2017 at 12:48 PM, Michael André Pearce <
michael.andre.pea...@me.com> wrote:

> Hi Ewan,
>
> So on the point of JMS the predefined/standardised JMS and JMSX headers
> have predefined types. So these can be serialised/deserialised 
accordingly.
>
> Custom jms headers agreed could be a bit more difficult but on the 80/20
> rule I would agree mostly they're string values and as anyhow you can hold
> bytes as a string it wouldn't cause any issue, defaulting to that.
>
> But I think easily we maybe able to do one better.
>
> Obviously can override the/config the headers converter but we can supply
> a default converter could take a config file with key to type mapping?
>
> Allowing people to maybe define/declare a header key with the expected
> type in some property file? To support string, byte[] and primitives? And
> undefined headers just either default to String or byte[]
>
> We could also pre define known headers like the jms ones mentioned above.
>
> E.g
>
> AwesomeHeader1=boolean
> AwesomeHeader2=long
> JMSCorrelationId=String
> JMSXGroupId=String
>
>
> What you think?
>
>
> Cheers
> Mike
>
>
>
>
>
>
> Sent from my iPhone
>
> > On 2 May 2017, at 18:45, Ewen Cheslack-Postava 
> wrote:
> >
> > A couple of thoughts:
> >
> > First, agreed that we definitely want to expose header functionality.
> Thank
> > you Mike for starting the conversation! Even if Connect doesn't do
> anything
> > special with it, there's value in being able to access/set headers.
> >
> > On motivation -- I think there are much broader use cases. When thinking
> > about exposing headers, I'd actually use Replicator as only a minor
> > supporting case. The reason is that it is a very uncommon case where
> there
> > is zero impedance mismatch between the source and sink of the data since
> > they are both Kafka. This means you don't need to think much about data
> > formats/serialization. I think the JMS use case is a better example 
since
> > JMS headers and Kafka headers don't quite match up. Here's a quick list
> of
> > use cases I can think of off the top of my head:
> >
> > 1. Include headers from other systems that support them: JMS (or really
> any
> > MQ), HTTP
> > 2. Other connector-specific headers. For example, from JDBC maybe the
> table
> > the data comes from is a header; for a CDC connector you might include
> the
> > binlog offset as a header.
> > 3. Interceptor/SMT-style use cases for annotating things like provenance
> of
> > data:
> > 3a. Generically w/ 

Re: [DISCUSS]: KIP-149: Enabling key access in ValueTransformer, ValueMapper, and ValueJoiner

2017-05-03 Thread Jeyhun Karimov
Hi Mathieu,

Thanks for feedback. I followed similar approach and updated PR and KIP
accordingly. I tried to guard the key in Processors sending a copy of an
actual key.
Because I am doing deep copy of an object, I think memory can be bottleneck
in some use-cases.

Cheers,
Jeyhun

On Tue, May 2, 2017 at 5:10 PM Mathieu Fenniak 
wrote:

> Hi Jeyhun,
>
> This approach would change ValueMapper (...etc) to be classes, rather than
> interfaces, which is also a backwards incompatible change.  An alternative
> approach that would be backwards compatible would be to define new
> interfaces, and provide overrides where those interfaces are used.
>
> Unfortunately, making the key parameter as "final" doesn't change much
> about guarding against key change.  It only prevents the parameter variable
> from being reassigned.  If the key type is a mutable object (eg. byte[]),
> it can still be mutated. (eg. key[0] = 0).  But I'm not really sure there's
> much that can be done about that.
>
> Mathieu
>
>
> On Mon, May 1, 2017 at 5:39 PM, Jeyhun Karimov 
> wrote:
>
> > Thanks for comments.
> >
> > The concerns makes sense. Although we can guard for immutable keys in
> > current implementation (with few changes), I didn't consider backward
> > compatibility.
> >
> > In this case 2 solutions come to my mind. In both cases, user accesses
> the
> > key in Object type, as passing extra type parameter will break
> > backwards-compatibility.  So user has to cast to actual key type.
> >
> > 1. Firstly, We can overload apply method with 2 argument (key and value)
> > and force key to be *final*. By doing this,  I think we can address both
> > backward-compatibility and guarding against key change.
> >
> > 2. Secondly, we can create class KeyAccess like:
> >
> > public class KeyAccess {
> > Object key;
> > public void beforeApply(final Object key) {
> > this.key = key;
> > }
> > public Object getKey() {
> > return key;
> > }
> > }
> >
> > We can extend *ValueMapper, ValueJoiner* and *ValueTransformer* from
> > *KeyAccess*. Inside processor (for example *KTableMapValuesProcessor*)
> > before calling *mapper.apply(value)* we can set the *key* by
> > *mapper.beforeApply(key)*. As a result, user can use *getKey()* to access
> > the key inside *apply(value)* method.
> >
> >
> > Cheers,
> > Jeyhun
> >
> >
> >
> >
> > On Mon, May 1, 2017 at 7:24 PM Matthias J. Sax 
> > wrote:
> >
> > > Jeyhun,
> > >
> > > thanks a lot for the KIP!
> > >
> > > I think there are two issues we need to address:
> > >
> > > (1) The KIP does not consider backward compatibility. Users did
> complain
> > > about this in past releases already, and as the user base grows, we
> > > should not break backward compatibility in future releases anymore.
> > > Thus, we should think of a better way to allow key access.
> > >
> > > Mathieu's comment goes into the same direction
> > >
> > > >> On the other hand, the number of compile failures that would need to
> > be
> > > >> fixed from this change is unfortunate. :-)
> > >
> > > (2) Another concern is, that there is no guard to prevent user code to
> > > modify the key. This might corrupt partitioning if users do alter the
> > > key (accidentally -- or users are just not aware that they are not
> > > allowed to modify the provided key object) and thus break the
> > > application. (This was the original motivation to not provide the key
> in
> > > the first place -- it's guards against modification.)
> > >
> > >
> > > -Matthias
> > >
> > >
> > >
> > > On 5/1/17 6:31 AM, Mathieu Fenniak wrote:
> > > > Hi Jeyhun,
> > > >
> > > > I just want to add my voice that, I too, have wished for access to
> the
> > > > record key during a mapValues or similar operation.
> > > >
> > > > On the other hand, the number of compile failures that would need to
> be
> > > > fixed from this change is unfortunate. :-)  But at least it would all
> > be
> > > a
> > > > pretty clear and easy change.
> > > >
> > > > Mathieu
> > > >
> > > >
> > > > On Mon, May 1, 2017 at 6:55 AM, Jeyhun Karimov  >
> > > wrote:
> > > >
> > > >> Dear community,
> > > >>
> > > >> I want to share KIP-149 [1] based on issues KAFKA-4218 [2],
> KAFKA-4726
> > > [3],
> > > >> KAFKA-3745 [4]. The related PR can be found at [5].
> > > >> I would like to get your comments.
> > > >>
> > > >> [1]
> > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > >> 149%3A+Enabling+key+access+in+ValueTransformer%2C+
> > > >> ValueMapper%2C+and+ValueJoiner
> > > >> [2] https://issues.apache.org/jira/browse/KAFKA-4218
> > > >> [3] https://issues.apache.org/jira/browse/KAFKA-4726
> > > >> [4] https://issues.apache.org/jira/browse/KAFKA-3745
> > > >> [5] https://github.com/apache/kafka/pull/2946
> > > >>
> > > >>
> > > >> Cheers,
> > > >> Jeyhun
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> -Cheers
> > > >>
> > > >> Jeyhun
> > > >>
> > > >
> > >
> > 

[jira] [Commented] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-03 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994475#comment-15994475
 ] 

Lukas Gemela commented on KAFKA-5154:
-

[~mjsax] should not be a problem, I can make it running on one of our boxes. 
Not sure if it helps though, we were not able to reproduce the issue since

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
> stream-thread [StreamThread-1] Streams application error during processing:
>  java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> 2017-05-01T00:02:00,038 INFO  StreamThread-1 
> 

RE: Kafka Connect and Partitions

2017-05-03 Thread david.franklin
Hi Randall,
Many thanks for your and Gwen's help with this - it's very reassuring that help 
is at hand in such circumstances :)
All the best,
David

-Original Message-
From: Randall Hauch [mailto:rha...@gmail.com] 
Sent: 02 May 2017 21:01
To: dev@kafka.apache.org
Subject: Re: Kafka Connect and Partitions

Hi, David.

Excellent. I'm glad that you've solved the puzzle.
Best regards,

Randall

On Tue, May 2, 2017 at 9:18 AM,  wrote:

> Hi Gwen/Randall,
>
> I think I've finally understood, more or less, how partitioning relates to
> SourceRecords.
>
> Because I was using the SourceRecord constructor that doesn't provide
> values for key and key schema, the key is null.  The DefaultPartioner
> appears to map null to a constant value rather than round-robin across all
> of the partitions :(
> SourceRecord(Map sourcePartition, Map
> sourceOffset, String topic, Schema valueSchema, Object value)
>
> Another SourceRecord constructor enables the partition to be specified but
> I'd prefer not to use this as I don't want to couple the non-Kafka source
> side to Kafka by making it aware of topic partitions - this would also
> presumably involve coupling it to the Cluster so that the number of
> partitions in a topic can be determined :(
> SourceRecord(Map sourcePartition, Map
> sourceOffset, String topic, Integer partition, Schema keySchema, Object
> key, Schema valueSchema, Object value)
>
> Instead, if I use the SourceRecord constructor that also takes arguments
> for the key and key schema (making them take the same values as the value
> and value schema in my application), then the custom partitioner /
> producer.partitioner.class property is not required and the data is
> distributed across the partitions :)
> SourceRecord(Map sourcePartition, Map
> sourceOffset, String topic, Integer partition, Schema keySchema, Object
> key, Schema valueSchema, Object value)
>
> Many thanks once again for your guidance.  I think this puzzle is now
> solved :)
> Best wishes,
> David
>
> -Original Message-
> From: Randall Hauch [mailto:rha...@gmail.com]
> Sent: 28 April 2017 16:08
> To: dev@kafka.apache.org
> Subject: Re: Kafka Connect and Partitions
>
> The source connector creates SourceRecord object and can set a number of
> fields, including the message's key and value, the Kafka topic name and, if
> desired, the Kafka topic partition number. If the connector does se the the
> topic partition to a non-null value, then that's the partition to which
> Kafka Connect will write the message; otherwise, the customer partitioner
> (e.g., your custom partitioner) used by the Kafka Connect producer will
> choose/compute the partition based purely upon the key and value byte
> arrays. Note that if the connector doesn't set the topic partition number
> and no special producer partitioner is specified, the default hash-based
> Kafka partitioner will be used.
>
> So, the connector can certainly set the topic partition number, and it may
> be easier to do it there since the connector actually has the key and
> values before they are serialized. But no matter what, the connector is the
> only thing that sets the message key in the source record.
>
> BTW, the SourceRecord's "source position" and "source offset" are actually
> the connector-defined information about the source and where the connector
> has read in that source. Don't confuse these with the topic name or topic
> partition number.
>
> Hope that helps.
>
> Randall
>
> On Fri, Apr 28, 2017 at 7:15 AM,  wrote:
>
> > Hi Gwen,
> >
> > Having added a custom partitioner (via the producer.partitioner.class
> > property in worker.properties) that simply randomly selects a partition,
> > the data is now written evenly across all the partitions :)
> >
> > The root of my confusion regarding why the default partitioner writes all
> > data to the same partition is that I don't understand how the
> SourceRecords
> > returned in the source task poll() method are used by the partitioner.
> The
> > data that is passed to the partitioner includes a key Object (which is an
> > empty byte array - presumably this is a bad idea!), and a value Object
> > (which is a non-empty byte array):
> >
> > @Override
> > public int partition(String topic, Object key, byte[] keyBytes,
> Object
> > value, byte[] valueBytes, Cluster cluster) {
> > System.out.println(String.format(
> > "### PARTITION key[%s][%s][%d] value[%s][%s][%d]",
> > key, key.getClass().getSimpleName(), keyBytes.length,
> > value, value.getClass().getSimpleName(),
> > valueBytes.length));
> >
> > =>
> > ### PARTITION key[[B@584f599f][byte[]][0] value[[B@73cc0cd8][byte[]][
> 236]
> >
> > However, I don't understand how the above key and value are derived from
> > the SourceRecord attributes which, in my application's case, is as
> 

Re: [DISCUSS] KIP-137: Enhance TopicCommand --describe to show topics marked for deletion

2017-05-03 Thread Ismael Juma
Yeah, structured output for the CLI tools would be great. 3 digit number
JIRA, nice. :)

Ismael

On Wed, May 3, 2017 at 7:05 AM, Ewen Cheslack-Postava 
wrote:

> Since everything is whitespace delimited anyway, I don't think we should
> worry about the compatibility issue. We don't guarantee this unstructured
> output format. I think it is fine to say that any parser that doesn't do
> something straightforward and reliable like splitting the line by
> whitespace then checking the : prefixed value to determine if it is usable
> is ok to break.
>
> Long term, we should really just get more structured output formats for the
> command line tools, a la https://issues.apache.org/jira/browse/KAFKA-313.
>
> -Ewen
>
> On Wed, Apr 26, 2017 at 2:59 AM, Ismael Juma  wrote:
>
> > Right, the reason for inserting it before the configs is that
> > MarkedForDeletion is a fixed length field while configs is a variable
> > length field. The fact that MarkedForDeletion is optional and typically
> not
> > set means that it's also justifiable to place it after the configs. So,
> I'm
> > OK either way.
> >
> > Ismael
> >
> > On Wed, Apr 26, 2017 at 10:42 AM, Mickael Maison <
> mickael.mai...@gmail.com
> > >
> > wrote:
> >
> > > Thanks for the feedback.
> > >
> > > I had the same thinking as James. Also we plan to only add the
> > > MarkedForDeletion field for topics pending deletion as the output of
> > > --describe is already pretty dense and most topics are never pending
> > > deletion.
> > >
> > > The only reason I came up to insert it in the middle is if Configs is
> > > long, then MarkedForDeletion could be pushed on a new line/off-screen.
> > > Am I missing something ?
> > >
> > > That said, I don't have a strong opinion about it and if most people
> > > prefer it the other way around I'll be happy to update the KIP.
> > >
> > > On Wed, Apr 26, 2017 at 12:25 AM, James Cheng 
> > > wrote:
> > > > Having "MarkedForDeletion" before "Configs" may break anyone who is
> > > parsing this output, since they may be expecting the 4th string to be
> > > "Configs".
> > > >
> > > > I know that the Compatibility section already says that people
> parsing
> > > this may have to adjust their parsing logic, so maybe that covers my
> > > concern already. But inserting the new MarkedForDeletion word into the
> > > middle of the string seems like it'll break parsing more than just
> > adding a
> > > new value at the end.
> > > >
> > > > I'm fine either way, though.
> > > >
> > > > -James
> > > >
> > > >> On Apr 25, 2017, at 9:38 AM, Vahid S Hashemian <
> > > vahidhashem...@us.ibm.com> wrote:
> > > >>
> > > >> Thanks for the KIP Mickael.
> > > >> Looks good. I also prefer 'MarkedForDeletion' before 'Configs'.
> > > >>
> > > >> --Vahid
> > > >>
> > > >>
> > > >>
> > > >> From:   Ismael Juma 
> > > >> To: dev@kafka.apache.org
> > > >> Date:   04/25/2017 04:15 AM
> > > >> Subject:Re: [DISCUSS] KIP-137: Enhance TopicCommand
> --describe
> > > to
> > > >> show topics marked for deletion
> > > >> Sent by:isma...@gmail.com
> > > >>
> > > >>
> > > >>
> > > >> Thanks for the KIP. Would it make sense for MarkedForDeletion to be
> > > before
> > > >> `Configs`? I can see arguments both ways, so I was wondering what
> your
> > > >> thoughts were?
> > > >>
> > > >> Ismael
> > > >>
> > > >> On Thu, Mar 30, 2017 at 5:39 PM, Mickael Maison <
> > > mickael.mai...@gmail.com>
> > > >> wrote:
> > > >>
> > > >>> Hi all,
> > > >>>
> > > >>> We created KIP-137: Enhance TopicCommand --describe to show topics
> > > >>> marked for deletion
> > > >>>
> > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > >>>
> > > >> 137%3A+Enhance+TopicCommand+--describe+to+show+topics+marked
> > > +for+deletion
> > > >>>
> > > >>> Please help review the KIP. You feedback is appreciated!
> > > >>>
> > > >>> Thanks
> > > >>>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >
> > >
> >
>


Re: [VOTE] KIP-86: Configurable SASL callback handlers

2017-05-03 Thread Rajini Sivaram
Can we have some more reviews or votes for this KIP to include in 0.11.0.0?
It is not a breaking change and the code is ready for integration, so it
will be good to get it in if possible.

Ismael/Jun, since you had reviewed the KIP earlier, can you let me know if
I can do anything more to get your votes?


Thank you,

Rajini


On Mon, Apr 10, 2017 at 12:18 PM, Edoardo Comar  wrote:

> +1 (non binding)
> many thanks Rajini !
>
> --
> Edoardo Comar
> IBM MessageHub
> eco...@uk.ibm.com
> IBM UK Ltd, Hursley Park, SO21 2JN
>
> IBM United Kingdom Limited Registered in England and Wales with number
> 741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6
> 3AU
>
>
>
> From:   Rajini Sivaram 
> To: dev@kafka.apache.org
> Date:   06/04/2017 10:53
> Subject:[VOTE] KIP-86: Configurable SASL callback handlers
>
>
>
> Hi all,
>
> I would like to start the voting process for KIP-86:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 86%3A+Configurable+SASL+callback+handlers
>
>
> The KIP makes callback handlers for SASL configurable to make it simpler
> to
> integrate with custom authentication database or custom authentication
> servers. This is particularly useful for SASL/PLAIN where the
> implementation in Kafka based on credentials stored in jaas.conf is not
> suitable for production use. It is also useful for SCRAM in environments
> where ZooKeeper is not secure.
>
> Thank you...
>
> Regards,
>
> Rajini
>
>
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>


Re: [VOTE] KIP-137: Enhance TopicCommand --describe to show topics marked for deletion

2017-05-03 Thread Rajini Sivaram
Thanks for the KIP, Mickael.

+1 (binding)

Regards,

Rajini

On Wed, May 3, 2017 at 7:06 AM, Ewen Cheslack-Postava 
wrote:

> +1 (binding)
>
> Thanks for helping improve the CLI tools!
>
> -Ewen
>
> On Tue, May 2, 2017 at 8:25 AM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com
> > wrote:
>
> > +1 (non-binding)
> >
> > Thanks Mickael.
> >
> > --Vahid
> >
> >
> >
> >
> > From:   Ismael Juma 
> > To: dev@kafka.apache.org
> > Date:   05/02/2017 03:46 AM
> > Subject:Re: [VOTE] KIP-137: Enhance TopicCommand --describe to
> > show topics marked for deletion
> > Sent by:isma...@gmail.com
> >
> >
> >
> > Thanks for the KIP, +1 (binding).
> >
> > Ismael
> >
> > On Tue, May 2, 2017 at 11:43 AM, Mickael Maison <
> mickael.mai...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I'd like to start the vote on KIP-137:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >
> > 137%3A+Enhance+TopicCommand+--describe+to+show+topics+
> marked+for+deletion
> > >
> > > Thanks,
> > >
> >
> >
> >
> >
> >
>


Re: [DISCUSS] KIP-136: Add Listener name and Security Protocol name to SelectorMetrics tags

2017-05-03 Thread Edoardo Comar
Thanks Ismael, 
I meant for this change to only apply to brokers, not clients, I'm going 
to update the KIP
--
Edoardo Comar
IBM MessageHub
eco...@uk.ibm.com
IBM UK Ltd, Hursley Park, SO21 2JN

IBM United Kingdom Limited Registered in England and Wales with number 
741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 
3AU



From:   Ismael Juma 
To: dev@kafka.apache.org
Date:   02/05/2017 20:20
Subject:Re: [DISCUSS] KIP-136: Add Listener name and Security 
Protocol name to SelectorMetrics tags
Sent by:isma...@gmail.com



Edoardo,

Are you planning to do this only in the broker? Listener names only exist
at the broker level. Also, clients only have a single security protocol, 
so
maybe we should do this for brokers only. In that case, the compatibility
impact is also lower because more users rely on the Yammer metrics 
instead.

I would suggest you start the voting thread after clarifying if this 
change
only applies at the broker level.

Ismael

On Tue, Apr 25, 2017 at 1:19 PM, Ismael Juma  wrote:

> Thanks for the KIP. I think it makes sense to have those tags. My only
> question is regarding the compatibility impact. We don't have a good
> compatibility story when it comes to adding tags to existing metrics 
since
> the JmxReporter adds the tags to the object name.
>
> Ismael
>
> On Thu, Mar 30, 2017 at 4:51 PM, Edoardo Comar  
wrote:
>
>> Hi all,
>>
>> We created KIP-136: Add Listener name and Security Protocol name to
>> SelectorMetrics tags
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-136%
>> 3A+Add+Listener+name+and+Security+Protocol+name+to+SelectorMetrics+tags
>>
>> Please help review the KIP. You feedback is appreciated!
>>
>> cheers,
>> Edo
>> --
>> Edoardo Comar
>> IBM MessageHub
>> eco...@uk.ibm.com
>> IBM UK Ltd, Hursley Park, SO21 2JN
>>
>> IBM United Kingdom Limited Registered in England and Wales with number
>> 741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants. 
PO6
>> 3AU
>> Unless stated otherwise above:
>> IBM United Kingdom Limited - Registered in England and Wales with 
number
>> 741598.
>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 
3AU
>>
>
>



Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU


Re: [DISCUSS] KIP 141 - ProducerRecordBuilder Interface

2017-05-03 Thread Ewen Cheslack-Postava
I understand the convenience of pointing at a JIRA/PR, but can we put the
concrete changes proposed in the JIRA (under "Proposed Changes"). I don't
think voting on the KIP would be reasonable otherwise since the changes
under vote could change arbitrarily...

I'm increasingly skeptical of adding more convenience constructors -- the
current patch adds timestamps, we're about to add headers as well (for
core, for Connect we have
https://cwiki.apache.org/confluence/display/KAFKA/KIP-145+-+Expose+Record+Headers+in+Kafka+Connect
in flight). It just continues to get messier over time.

I think builders in the right context are useful, as long as they exceed a
certain number of parameters (SchemaBuilder in Connect is an artifact of
that position). I don't think a transition period with 2 ways to construct
an object is actually a problem -- if there's always an "all N parameters"
version of the constructor, all other constructors are just convenience
shortcuts, but the Builder provides a shorthand.

I also agree w/ Ismael that deprecating to aggressively is bad -- we added
the APIs instead of a builder and there's not any real maintenance cost, so
why add the deprecation? I don't want to suggest actually adding such an
annotation, but the real issue here is that one API will become "preferred"
for some time.

-Ewen

On Tue, May 2, 2017 at 1:15 AM, Ismael Juma  wrote:

> Hi Matthias,
>
> Deprecating widely used APIs is a big deal. Build warnings are a nuisance
> and can potentially break the build for those who have a zero-warnings
> policy (which is good practice). It creates a bunch of busy work for our
> users and various resources like books, blog posts, etc. become out of
> date.
>
> This does not mean that we should not do it, but the benefit has to be
> worth it and we should not do it lightly.
>
> Ismael
>
> On Sat, Apr 29, 2017 at 6:52 PM, Matthias J. Sax 
> wrote:
>
> > I understand that we cannot just break stuff (btw: also not for
> > Streams!). But deprecating does not break anything, so I don't think
> > it's a big deal to change the API as long as we keep the old API as
> > deprecated.
> >
> >
> > -Matthias
> >
> > On 4/29/17 9:28 AM, Jay Kreps wrote:
> > > Hey Matthias,
> > >
> > > Yeah I agree, I'm not against change as a general thing! I also think
> if
> > > you look back on the last two years, we completely rewrote the producer
> > and
> > > consumer APIs, reworked the binary protocol many times over, and added
> > the
> > > connector and stream processing apis, both major new additions. So I
> > don't
> > > think we're in too much danger of stagnating!
> > >
> > > My two cents was just around breaking compatibility for trivial changes
> > > like constructor => builder. I think this only applies to the producer,
> > > consumer, and connect apis which are heavily embedded in hundreds of
> > > ecosystem components that depend on them. This is different from direct
> > > usage. If we break the streams api it is really no big deal---apps just
> > > need to rebuild when they upgrade, not the end of the world at all.
> > However
> > > because many intermediate things depend on the Kafka producer you can
> > cause
> > > these weird situations where your app depends on two third party things
> > > that use Kafka and each requires different, incompatible versions. We
> did
> > > this a lot in earlier versions of Kafka and it was the cause of much
> > angst
> > > (and an ingrained general reluctance to upgrade) from our users.
> > >
> > > I still think we may have to break things, i just don't think we should
> > do
> > > it for things like builders vs direct constructors which i think are
> kind
> > > of a debatable matter of taste.
> > >
> > > -Jay
> > >
> > >
> > >
> > > On Mon, Apr 24, 2017 at 9:40 AM, Matthias J. Sax <
> matth...@confluent.io>
> > > wrote:
> > >
> > >> Hey Jay,
> > >>
> > >> I understand your concern, and for sure, we will need to keep the
> > >> current constructors deprecated for a long time (ie, many years).
> > >>
> > >> But if we don't make the move, we will not be able to improve. And I
> > >> think warnings about using deprecated APIs is an acceptable price to
> > >> pay. And the API improvements will help new people who adopt Kafka to
> > >> get started more easily.
> > >>
> > >> Otherwise Kafka might end up as many other enterprise software with a
> > >> lots of old stuff that is kept forever because nobody has the guts to
> > >> improve/change it.
> > >>
> > >> Of course, we can still improve the docs of the deprecated
> constructors,
> > >> too.
> > >>
> > >> Just my two cents.
> > >>
> > >>
> > >> -Matthias
> > >>
> > >> On 4/23/17 3:37 PM, Jay Kreps wrote:
> > >>> Hey guys,
> > >>>
> > >>> I definitely think that the constructors could have been better
> > designed,
> > >>> but I think given that they're in heavy use I don't think this
> proposal
> > >>> will improve things. Deprecating constructors just leaves everyone
> with
> 

Re: [DISCUSS] KIP-137: Enhance TopicCommand --describe to show topics marked for deletion

2017-05-03 Thread Ewen Cheslack-Postava
Since everything is whitespace delimited anyway, I don't think we should
worry about the compatibility issue. We don't guarantee this unstructured
output format. I think it is fine to say that any parser that doesn't do
something straightforward and reliable like splitting the line by
whitespace then checking the : prefixed value to determine if it is usable
is ok to break.

Long term, we should really just get more structured output formats for the
command line tools, a la https://issues.apache.org/jira/browse/KAFKA-313.

-Ewen

On Wed, Apr 26, 2017 at 2:59 AM, Ismael Juma  wrote:

> Right, the reason for inserting it before the configs is that
> MarkedForDeletion is a fixed length field while configs is a variable
> length field. The fact that MarkedForDeletion is optional and typically not
> set means that it's also justifiable to place it after the configs. So, I'm
> OK either way.
>
> Ismael
>
> On Wed, Apr 26, 2017 at 10:42 AM, Mickael Maison  >
> wrote:
>
> > Thanks for the feedback.
> >
> > I had the same thinking as James. Also we plan to only add the
> > MarkedForDeletion field for topics pending deletion as the output of
> > --describe is already pretty dense and most topics are never pending
> > deletion.
> >
> > The only reason I came up to insert it in the middle is if Configs is
> > long, then MarkedForDeletion could be pushed on a new line/off-screen.
> > Am I missing something ?
> >
> > That said, I don't have a strong opinion about it and if most people
> > prefer it the other way around I'll be happy to update the KIP.
> >
> > On Wed, Apr 26, 2017 at 12:25 AM, James Cheng 
> > wrote:
> > > Having "MarkedForDeletion" before "Configs" may break anyone who is
> > parsing this output, since they may be expecting the 4th string to be
> > "Configs".
> > >
> > > I know that the Compatibility section already says that people parsing
> > this may have to adjust their parsing logic, so maybe that covers my
> > concern already. But inserting the new MarkedForDeletion word into the
> > middle of the string seems like it'll break parsing more than just
> adding a
> > new value at the end.
> > >
> > > I'm fine either way, though.
> > >
> > > -James
> > >
> > >> On Apr 25, 2017, at 9:38 AM, Vahid S Hashemian <
> > vahidhashem...@us.ibm.com> wrote:
> > >>
> > >> Thanks for the KIP Mickael.
> > >> Looks good. I also prefer 'MarkedForDeletion' before 'Configs'.
> > >>
> > >> --Vahid
> > >>
> > >>
> > >>
> > >> From:   Ismael Juma 
> > >> To: dev@kafka.apache.org
> > >> Date:   04/25/2017 04:15 AM
> > >> Subject:Re: [DISCUSS] KIP-137: Enhance TopicCommand --describe
> > to
> > >> show topics marked for deletion
> > >> Sent by:isma...@gmail.com
> > >>
> > >>
> > >>
> > >> Thanks for the KIP. Would it make sense for MarkedForDeletion to be
> > before
> > >> `Configs`? I can see arguments both ways, so I was wondering what your
> > >> thoughts were?
> > >>
> > >> Ismael
> > >>
> > >> On Thu, Mar 30, 2017 at 5:39 PM, Mickael Maison <
> > mickael.mai...@gmail.com>
> > >> wrote:
> > >>
> > >>> Hi all,
> > >>>
> > >>> We created KIP-137: Enhance TopicCommand --describe to show topics
> > >>> marked for deletion
> > >>>
> > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >>>
> > >> 137%3A+Enhance+TopicCommand+--describe+to+show+topics+marked
> > +for+deletion
> > >>>
> > >>> Please help review the KIP. You feedback is appreciated!
> > >>>
> > >>> Thanks
> > >>>
> > >>
> > >>
> > >>
> > >>
> > >
> >
>