[jira] [Commented] (KAFKA-5545) Kafka Stream not able to successfully restart over new broker ip

2017-07-10 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081698#comment-16081698
 ] 

Guozhang Wang commented on KAFKA-5545:
--

Thanks for the update. So when you mentioned

{code}
One more thing: if I issue close within the connection timeout, all goes well.
But if I issue close after the connection timeout, the threads are stuck.
{code}

Did you mean that when you call `close(timeout)` you set the timeout parameter 
to the session timeout value, or that the broker IP change event happens within 
the session timeout right after the streams app was started, triggering the 
close / restart of the streams app?

> Kafka Stream not able to successfully restart over new broker ip
> 
>
> Key: KAFKA-5545
> URL: https://issues.apache.org/jira/browse/KAFKA-5545
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
>Reporter: Yogesh BG
>Priority: Critical
> Attachments: kafkastreams.log
>
>
> Hi,
> I have one Kafka broker and one Kafka Streams application.
> Initially the Kafka Streams app connects and starts processing data. Then I 
> restart the broker. When the broker restarts, a new IP is assigned.
> In the Kafka Streams app I have a thread that runs every 5 minutes and checks 
> whether the broker IP has changed; if it has, we clean up the stream, rebuild 
> the topology (also tried reusing the topology), and start the stream again. I 
> end up with the following exceptions.
> 11:04:08.032 [StreamThread-38] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-38] Creating active task 0_5 with assigned 
> partitions [PR-5]
> 11:04:08.033 [StreamThread-41] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-41] Creating active task 0_1 with assigned 
> partitions [PR-1]
> 11:04:08.036 [StreamThread-34] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-34] Creating active task 0_7 with assigned 
> partitions [PR-7]
> 11:04:08.036 [StreamThread-37] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-37] Creating active task 0_3 with assigned 
> partitions [PR-3]
> 11:04:08.036 [StreamThread-45] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-45] Creating active task 0_0 with assigned 
> partitions [PR-0]
> 11:04:08.037 [StreamThread-36] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-36] Creating active task 0_4 with assigned 
> partitions [PR-4]
> 11:04:08.037 [StreamThread-43] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-43] Creating active task 0_6 with assigned 
> partitions [PR-6]
> 11:04:08.038 [StreamThread-48] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-48] Creating active task 0_2 with assigned 
> partitions [PR-2]
> 11:04:09.034 [StreamThread-38] WARN  o.a.k.s.p.internals.StreamThread - Could 
> not create task 0_5. Will retry.
> org.apache.kafka.streams.errors.LockException: task [0_5] Failed to lock the 
> state directory for task 0_5
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:100)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:864)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1237)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:967)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$600(StreamThread.java:69)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:234)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:259)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-
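The stack trace above ends in {{StreamThread$AbstractTaskCreator.retryWithBackoff}}, the loop that keeps retrying task creation after a {{LockException}}. As an illustration only (this is not Kafka's actual implementation; names and timings are invented), the pattern can be sketched as:

```java
import java.util.function.Supplier;

// Hedged sketch of a retry-with-backoff loop like the one named in the
// stack trace. Illustrative only, not Kafka's code.
public class RetryWithBackoff {
    // Retries the action until it succeeds or the deadline passes,
    // doubling the wait between attempts.
    public static boolean retryWithBackoff(Supplier<Boolean> action, long deadlineMs) {
        long backoffMs = 50;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < deadlineMs) {
            if (action.get()) {
                return true;  // e.g. the state-dir lock was finally acquired
            }
            try {
                Thread.sleep(backoffMs);  // back off before retrying
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            backoffMs *= 2;
        }
        return false;  // still failing at the deadline, give up
    }

    public static void main(String[] args) {
        // Succeeds on the third attempt.
        int[] calls = {0};
        boolean ok = retryWithBackoff(() -> ++calls[0] >= 3, 5_000);
        System.out.println(ok + " after " + calls[0] + " attempts");
    }
}
```

If the lock holder never releases (as reported here), such a loop keeps logging "Will retry." until the deadline, which matches the repeated WARN lines in the attached log.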

[jira] [Commented] (KAFKA-5576) Support Power platform by updating rocksdb

2017-07-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081664#comment-16081664
 ] 

ASF GitHub Bot commented on KAFKA-5576:
---

GitHub user yussufsh opened a pull request:

https://github.com/apache/kafka/pull/3519

KAFKA-5576: increase the rocksDB version to 5.5.1 for Power support



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yussufsh/kafka rockdbupgrade

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3519.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3519


commit f3434bb624442c1f5929bb2591c479675544c7a8
Author: root 
Date:   2017-07-11T04:48:00Z

increase the rocksDB version to 5.5.1 for Power support
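The bump itself is essentially a one-line dependency change. The snippet below is a hedged sketch using the standard RocksDB JNI artifact coordinates; Kafka's build centralizes versions (e.g. in gradle/dependencies.gradle), so the exact file and property name may differ from this:

```groovy
// Sketch only: Kafka's build keeps dependency versions in a central file,
// so the real change is likely a single version-string edit there.
ext {
    rocksDBVersion = "5.5.1"  // per this PR, needed for Power (ppc64le) support
}

dependencies {
    compile "org.rocksdb:rocksdbjni:${rocksDBVersion}"
}
```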




> Support Power platform by updating rocksdb
> --
>
> Key: KAFKA-5576
> URL: https://issues.apache.org/jira/browse/KAFKA-5576
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.1.0
> Environment: $ cat /etc/lsb-release
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=14.04
> DISTRIB_CODENAME=trusty
> DISTRIB_DESCRIPTION="Ubuntu 14.04.2 LTS"
> $ uname -a
> Linux pts00432-vm20 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 
> 17:42:36 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
>Reporter: Yussuf Shaikh
>Priority: Minor
> Fix For: 0.11.1.0
>
> Attachments: KAFKA-5576.patch
>
>
> Many test cases are failing with one of the following exceptions related to 
> rocksdb.
> 1. java.lang.NoClassDefFoundError: Could not initialize class 
> org.rocksdb.Options
> at 
> org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:119)
> at 
> org.apache.kafka.streams.state.internals.Segment.openDB(Segment.java:40)
> 2. java.lang.UnsatisfiedLinkError: /tmp/librocksdbjni4427030040392983276.so: 
> /lib64/ld64.so.2: version `GLIBC_2.22' not found (required by 
> /tmp/librocksdbjni4427030040392983276.so)
> at java.lang.ClassLoader$NativeLibrary.load(Native Method)
> at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
> at java.lang.Runtime.load0(Runtime.java:809)
> at java.lang.System.load(System.java:1086)
> at 
> org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:78)
> 3. java.lang.AssertionError: Condition not met within timeout 3. 
> Expecting 3 records from topic output-topic-2 while only received 0: []
> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:274)
> at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:160)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5545) Kafka Stream not able to successfully restart over new broker ip

2017-07-10 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-5545:
-
Affects Version/s: 0.10.2.1

> Kafka Stream not able to successfully restart over new broker ip
> 
>
> Key: KAFKA-5545
> URL: https://issues.apache.org/jira/browse/KAFKA-5545
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
>Reporter: Yogesh BG
>Priority: Critical
> Attachments: kafkastreams.log
>
>
> Hi,
> I have one Kafka broker and one Kafka Streams application.
> Initially the Kafka Streams app connects and starts processing data. Then I 
> restart the broker. When the broker restarts, a new IP is assigned.
> In the Kafka Streams app I have a thread that runs every 5 minutes and checks 
> whether the broker IP has changed; if it has, we clean up the stream, rebuild 
> the topology (also tried reusing the topology), and start the stream again. I 
> end up with the following exceptions.
> 11:04:08.032 [StreamThread-38] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-38] Creating active task 0_5 with assigned 
> partitions [PR-5]
> 11:04:08.033 [StreamThread-41] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-41] Creating active task 0_1 with assigned 
> partitions [PR-1]
> 11:04:08.036 [StreamThread-34] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-34] Creating active task 0_7 with assigned 
> partitions [PR-7]
> 11:04:08.036 [StreamThread-37] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-37] Creating active task 0_3 with assigned 
> partitions [PR-3]
> 11:04:08.036 [StreamThread-45] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-45] Creating active task 0_0 with assigned 
> partitions [PR-0]
> 11:04:08.037 [StreamThread-36] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-36] Creating active task 0_4 with assigned 
> partitions [PR-4]
> 11:04:08.037 [StreamThread-43] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-43] Creating active task 0_6 with assigned 
> partitions [PR-6]
> 11:04:08.038 [StreamThread-48] INFO  o.a.k.s.p.internals.StreamThread - 
> stream-thread [StreamThread-48] Creating active task 0_2 with assigned 
> partitions [PR-2]
> 11:04:09.034 [StreamThread-38] WARN  o.a.k.s.p.internals.StreamThread - Could 
> not create task 0_5. Will retry.
> org.apache.kafka.streams.errors.LockException: task [0_5] Failed to lock the 
> state directory for task 0_5
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:100)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:864)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1237)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1210)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:967)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$600(StreamThread.java:69)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:234)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:259)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:352)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator

[jira] [Created] (KAFKA-5581) Streams can be smarter in deciding when to create changelog topics for state stores

2017-07-10 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-5581:


 Summary: Streams can be smarter in deciding when to create 
changelog topics for state stores
 Key: KAFKA-5581
 URL: https://issues.apache.org/jira/browse/KAFKA-5581
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Guozhang Wang


Today Streams backs every state store with a changelog topic by default, unless 
the user overrides this via {{disableLogging}} when creating the state store / 
materializing the KTable. However, there are a few cases where a separate 
changelog topic is not required because we can re-use an existing topic instead. 
A few examples:

1) If a KTable is read directly from a source topic and is materialized, i.e.

{code}
table1 = builder.table("topic1", "store1");
{code}

In this case {{table1}}'s changelog topic can just be {{topic1}}, and we do not 
need to create a separate {{table1-changelog}} topic.

2) If a KTable is materialized and then sent directly to a sink topic with 
the same key, e.g.

{code}
table1 = stream.groupBy(...).aggregate("state1").to("topic2");
{code}

In this case {{state1}}'s changelog topic can just be {{topic2}}, and we do not 
need to create a separate {{state1-changelog}} topic.

3) If a KStream is materialized for joins and the streams come directly from 
topics, e.g.:

{code}
stream1 = builder.stream("topic1");
stream2 = builder.stream("topic2");
stream3 = stream1.join(stream2, windows);  // stream1 and stream2 are 
materialized with a changelog topic
{code}

Since stream materialization is append-only, we do not need a changelog for the 
state stores either; we can just use the source topics {{topic1}} and {{topic2}}.

4) When a simple transformation or join operation generates a new KTable that 
needs to be materialized with a state store, we can use the changelog topic of 
the upstream KTable and apply the transformation logic upon restoration instead 
of creating a new changelog topic. For example:

{code}
table1 = builder.table("topic1");
table2 = table1.filter(..).join(table3); // table2 needs to be materialized for 
joining
{code}

We can set the {{getter}} function of {{table2}}'s materialized store, say 
{{state2}}, to read from {{topic1}} and then apply the filter operator, instead 
of creating a new {{state2-changelog}} topic.

5) more use cases ...

We can build a general internal optimization that determines when and how to 
set the changelog topic for these materialized stores at runtime, when 
generating the topology.
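The cases above reduce to one decision rule per store. The sketch below is purely illustrative (none of these names exist in Kafka); it just captures the proposed fallback order:

```java
// Illustrative sketch (not Kafka code) of the decision rule this ticket
// proposes: reuse an existing topic as the changelog when one is available.
public class ChangelogDecision {
    // sourceTopic: non-null when the store is materialized directly from a
    // source topic (case 1). sinkTopic: non-null when the store's output goes
    // straight to a sink with the same key (case 2). Otherwise fall back to a
    // dedicated "<storeName>-changelog" topic, as Streams does today.
    public static String changelogTopicFor(String storeName,
                                           String sourceTopic,
                                           String sinkTopic) {
        if (sourceTopic != null) return sourceTopic;  // case 1: reuse "topic1"
        if (sinkTopic != null) return sinkTopic;      // case 2: reuse "topic2"
        return storeName + "-changelog";              // default behavior
    }

    public static void main(String[] args) {
        System.out.println(changelogTopicFor("store1", "topic1", null)); // topic1
        System.out.println(changelogTopicFor("state1", null, "topic2")); // topic2
        System.out.println(changelogTopicFor("state2", null, null));     // state2-changelog
    }
}
```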



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5580) Group rebalance delay only used with generation 0

2017-07-10 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-5580:
--

 Summary: Group rebalance delay only used with generation 0
 Key: KAFKA-5580
 URL: https://issues.apache.org/jira/browse/KAFKA-5580
 Project: Kafka
  Issue Type: Bug
Reporter: Jason Gustafson
Assignee: Damian Guy


We use {{InitialDelayedJoin}} when a group is empty for the purpose of 
[KIP-134| 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance].
 However, the logic only kicks in if the generation is 0, which is not 
necessarily the case when the group is empty. When all members in the group 
depart, we bump up the generation and write an entry to the log with the state 
set to empty; we do not reset the generation to 0.
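The bug boils down to testing the wrong predicate. The sketch below is hypothetical (these method names are not the broker's), but it shows why an empty group with a bumped generation skips the KIP-134 delay:

```java
// Hypothetical sketch of the reported bug: the delayed-rebalance path tests
// generation == 0, but an empty group can have a non-zero generation because
// the generation is bumped, not reset, when the last member leaves.
public class RebalanceDelay {
    // Buggy condition described in the ticket.
    public static boolean delayInitialJoinBuggy(int generation, boolean groupIsEmpty) {
        return generation == 0;
    }

    // Intended behavior: apply the KIP-134 delay whenever the group is empty.
    public static boolean delayInitialJoinFixed(int generation, boolean groupIsEmpty) {
        return groupIsEmpty;
    }

    public static void main(String[] args) {
        // All members left: the group is empty but the generation was bumped to 5.
        System.out.println(delayInitialJoinBuggy(5, true));  // false: delay wrongly skipped
        System.out.println(delayInitialJoinFixed(5, true));  // true: delay applied
    }
}
```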





[jira] [Commented] (KAFKA-5571) Possible deadlock during shutdown in setState in kafka streams 10.2

2017-07-10 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081255#comment-16081255
 ] 

Guozhang Wang commented on KAFKA-5571:
--

Is this already fixed in 0.11.0? [~enothereska] [~damianguy]

> Possible deadlock during shutdown in setState in kafka streams 10.2
> ---
>
> Key: KAFKA-5571
> URL: https://issues.apache.org/jira/browse/KAFKA-5571
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
>Reporter: Greg Fodor
>Assignee: Eno Thereska
> Attachments: kafka-streams.deadlock.log
>
>
> I'm running a 10.2 job across 5 nodes with 32 stream threads on each node, 
> and I find that when we gracefully shut all of them down at once via an 
> ansible script, some of the nodes end up freezing -- at a glance, the attached 
> thread dump implies a deadlock between stream threads trying to update their 
> state via setState. We haven't had this problem before, but it may or may not 
> be related to changes in 10.2 (we are upgrading from 10.0 to 10.2).
> When we gracefully shut down all nodes simultaneously, what typically happens 
> is that some subset of the nodes do not shut down completely but go through a 
> rebalance first. It seems this deadlock requires the rebalance to occur 
> simultaneously with the graceful shutdown; if we happen to shut them down and 
> no rebalance happens, I don't believe the deadlock is triggered.
> The deadlock appears related to the state change handlers being subscribed 
> across threads and to the fact that StreamThread#setState and 
> StreamStateListener#onChange are both synchronized methods.
> Another thing worth mentioning is that one of the transformers used in the 
> job has a close() method that can take 10-15 seconds to finish, since it needs 
> to flush some data to a database. Having a long close() method combined with 
> a rebalance during a shutdown across many threads may be necessary for 
> reproduction.





[jira] [Comment Edited] (KAFKA-5571) Possible deadlock during shutdown in setState in kafka streams 10.2

2017-07-10 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081255#comment-16081255
 ] 

Guozhang Wang edited comment on KAFKA-5571 at 7/10/17 10:11 PM:


Is this already fixed in 0.11.0? [~enothereska] [~damianguy] If not, we should 
also think about how to fix it in 0.11.0 / trunk, besides backporting the fix to 
0.10.2 for a potential 0.10.2.2 release.


was (Author: guozhang):
Is this already fixed in 0.11.0? [~enothereska] [~damianguy]

> Possible deadlock during shutdown in setState in kafka streams 10.2
> ---
>
> Key: KAFKA-5571
> URL: https://issues.apache.org/jira/browse/KAFKA-5571
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
>Reporter: Greg Fodor
>Assignee: Eno Thereska
> Attachments: kafka-streams.deadlock.log
>
>
> I'm running a 10.2 job across 5 nodes with 32 stream threads on each node, 
> and I find that when we gracefully shut all of them down at once via an 
> ansible script, some of the nodes end up freezing -- at a glance, the attached 
> thread dump implies a deadlock between stream threads trying to update their 
> state via setState. We haven't had this problem before, but it may or may not 
> be related to changes in 10.2 (we are upgrading from 10.0 to 10.2).
> When we gracefully shut down all nodes simultaneously, what typically happens 
> is that some subset of the nodes do not shut down completely but go through a 
> rebalance first. It seems this deadlock requires the rebalance to occur 
> simultaneously with the graceful shutdown; if we happen to shut them down and 
> no rebalance happens, I don't believe the deadlock is triggered.
> The deadlock appears related to the state change handlers being subscribed 
> across threads and to the fact that StreamThread#setState and 
> StreamStateListener#onChange are both synchronized methods.
> Another thing worth mentioning is that one of the transformers used in the 
> job has a close() method that can take 10-15 seconds to finish, since it needs 
> to flush some data to a database. Having a long close() method combined with 
> a rebalance during a shutdown across many threads may be necessary for 
> reproduction.





[jira] [Commented] (KAFKA-5566) Instable test QueryableStateIntegrationTest.shouldAllowToQueryAfterThreadDied

2017-07-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081263#comment-16081263
 ] 

ASF GitHub Bot commented on KAFKA-5566:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3504


> Instable test QueryableStateIntegrationTest.shouldAllowToQueryAfterThreadDied
> -
>
> Key: KAFKA-5566
> URL: https://issues.apache.org/jira/browse/KAFKA-5566
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.11.1.0
>
>
> This test failed about 4 times in the last 24h. Always the same stack trace 
> so far:
> {noformat}
> java.lang.AssertionError: Condition not met within timeout 3. wait for 
> agg to be '123'
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:274)
>   at 
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest.shouldAllowToQueryAfterThreadDied(QueryableStateIntegrationTest.java:793)
> {noformat}
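The failing assertion comes from a poll-until-true helper. As a rough sketch of that pattern (the poll interval and message format below are illustrative, not the exact TestUtils code):

```java
import java.util.function.BooleanSupplier;

// Sketch of the TestUtils.waitForCondition pattern the failing assertion
// comes from: poll a condition until it holds or a timeout elapses.
public class WaitFor {
    public static void waitForCondition(BooleanSupplier condition,
                                        long timeoutMs,
                                        String message) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                // This is the AssertionError the flaky test reports.
                throw new AssertionError("Condition not met within timeout "
                        + timeoutMs + ". " + message);
            }
            try {
                Thread.sleep(10);  // poll again shortly
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~50 ms, well within the timeout.
        waitForCondition(() -> System.currentTimeMillis() - start > 50,
                         1000, "wait for elapsed time");
        System.out.println("condition met");
    }
}
```

A flaky failure of this shape means the system under test never reached the expected state within the window, not that the helper itself is broken.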





[jira] [Created] (KAFKA-5579) SchemaBuilder.type(Schema.Type) should not allow null.

2017-07-10 Thread Jeremy Custenborder (JIRA)
Jeremy Custenborder created KAFKA-5579:
--

 Summary: SchemaBuilder.type(Schema.Type) should not allow null.
 Key: KAFKA-5579
 URL: https://issues.apache.org/jira/browse/KAFKA-5579
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Jeremy Custenborder
Assignee: Jeremy Custenborder
Priority: Minor








[jira] [Commented] (KAFKA-5579) SchemaBuilder.type(Schema.Type) should not allow null.

2017-07-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081212#comment-16081212
 ] 

ASF GitHub Bot commented on KAFKA-5579:
---

GitHub user jcustenborder opened a pull request:

https://github.com/apache/kafka/pull/3517

KAFKA-5579 check for null.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jcustenborder/kafka KAFKA-5579

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3517.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3517


commit 645b10b1b6ba780d9d4bdbae8c213c26df63c205
Author: Jeremy Custenborder 
Date:   2017-07-10T21:45:17Z

KAFKA-5579 check for null.
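The shape of such a fix is a guard clause at the top of the builder method. The sketch below is hypothetical: the nested enum, exception type, and class name stand in for Connect's real {{Schema.Type}} and {{SchemaBuilder}}, and the actual PR may throw a different exception:

```java
// Hedged sketch of the fix's shape: a builder's type(...) rejecting null
// instead of deferring the failure to some later call.
public class SchemaBuilderSketch {
    enum Type { STRING, INT32, STRUCT }  // stand-in for Schema.Type

    private final Type type;

    private SchemaBuilderSketch(Type type) { this.type = type; }

    public static SchemaBuilderSketch type(Type type) {
        if (type == null) {
            // Fail fast with a clear message rather than a later NPE.
            throw new IllegalArgumentException("type cannot be null");
        }
        return new SchemaBuilderSketch(type);
    }

    public static void main(String[] args) {
        try {
            type(null);
            System.out.println("no exception");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```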




> SchemaBuilder.type(Schema.Type) should not allow null.
> --
>
> Key: KAFKA-5579
> URL: https://issues.apache.org/jira/browse/KAFKA-5579
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Jeremy Custenborder
>Assignee: Jeremy Custenborder
>Priority: Minor
>






[jira] [Commented] (KAFKA-5571) Possible deadlock during shutdown in setState in kafka streams 10.2

2017-07-10 Thread Damian Guy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081169#comment-16081169
 ] 

Damian Guy commented on KAFKA-5571:
---

It looks like this happens because {{KafkaStreams#close}} is blocked waiting 
for the threads to shut down while it holds the lock on the {{KafkaStreams}} 
instance. The threads reporting back in the {{onChange}} method then try to 
acquire the same lock, which of course they cannot get until the {{close}} 
method has finished. When calling the no-arg version of {{KafkaStreams#close}}, 
the lock is held indefinitely. One "workaround" for this in 0.10.2 would be to 
use {{KafkaStreams#close(time, TimeUnit)}}.
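This lock-up can be sketched with plain JDK locks. The demo below is not Kafka code: {{tryLock}} stands in for entering a synchronized method so the example terminates and reports the contention instead of actually deadlocking:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Minimal sketch of the hang described above: close() holds the instance
// lock while waiting for stream threads, and each thread's onChange()
// callback needs that same lock to report its state change.
public class CloseDeadlockSketch {
    public static boolean onChangeCanRun(ReentrantLock instanceLock) {
        // A stream thread reporting a state change needs the instance lock.
        if (instanceLock.tryLock()) {
            instanceLock.unlock();
            return true;
        }
        return false;  // blocked: close() still holds the lock
    }

    public static void main(String[] args) throws InterruptedException {
        ReentrantLock instanceLock = new ReentrantLock();
        CountDownLatch checked = new CountDownLatch(1);
        boolean[] ranWhileClosing = new boolean[1];

        instanceLock.lock();  // close() acquires the instance lock...
        Thread streamThread = new Thread(() -> {
            ranWhileClosing[0] = onChangeCanRun(instanceLock);  // ...so this fails
            checked.countDown();
        });
        streamThread.start();
        checked.await();      // close() "waits for the thread" here
        instanceLock.unlock();

        System.out.println("onChange ran while close held lock: " + ranWhileClosing[0]);
    }
}
```

With real synchronized methods there is no tryLock escape hatch, which is why the no-arg {{close()}} hangs forever and the timed overload is the 0.10.2 workaround.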

> Possible deadlock during shutdown in setState in kafka streams 10.2
> ---
>
> Key: KAFKA-5571
> URL: https://issues.apache.org/jira/browse/KAFKA-5571
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
>Reporter: Greg Fodor
>Assignee: Eno Thereska
> Attachments: kafka-streams.deadlock.log
>
>
> I'm running a 10.2 job across 5 nodes with 32 stream threads on each node, 
> and I find that when we gracefully shut all of them down at once via an 
> ansible script, some of the nodes end up freezing -- at a glance, the attached 
> thread dump implies a deadlock between stream threads trying to update their 
> state via setState. We haven't had this problem before, but it may or may not 
> be related to changes in 10.2 (we are upgrading from 10.0 to 10.2).
> When we gracefully shut down all nodes simultaneously, what typically happens 
> is that some subset of the nodes do not shut down completely but go through a 
> rebalance first. It seems this deadlock requires the rebalance to occur 
> simultaneously with the graceful shutdown; if we happen to shut them down and 
> no rebalance happens, I don't believe the deadlock is triggered.
> The deadlock appears related to the state change handlers being subscribed 
> across threads and to the fact that StreamThread#setState and 
> StreamStateListener#onChange are both synchronized methods.
> Another thing worth mentioning is that one of the transformers used in the 
> job has a close() method that can take 10-15 seconds to finish, since it needs 
> to flush some data to a database. Having a long close() method combined with 
> a rebalance during a shutdown across many threads may be necessary for 
> reproduction.





[jira] [Updated] (KAFKA-5461) KIP-168: Add GlobalTopicCount metric per cluster

2017-07-10 Thread Abhishek Mendhekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Mendhekar updated KAFKA-5461:
--
Summary: KIP-168: Add GlobalTopicCount metric per cluster  (was: KIP-168: 
Add TotalTopicCount metric per cluster)

> KIP-168: Add GlobalTopicCount metric per cluster
> 
>
> Key: KAFKA-5461
> URL: https://issues.apache.org/jira/browse/KAFKA-5461
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Abhishek Mendhekar
>Assignee: Abhishek Mendhekar
> Fix For: 0.11.0.1
>
>
> See KIP 
> [here|https://cwiki.apache.org/confluence/display/KAFKA/KIP-168%3A+Add+TopicCount+metric+per+cluster]
> Email discussion 
> [here|http://mail-archives.apache.org/mod_mbox/kafka-dev/201706.mbox/%3CCAMcwe-ugep-UiSn9TkKEMwwTM%3DAzGC4jPro9LnyYRezyZg_NKA%40mail.gmail.com%3E]





[jira] [Commented] (KAFKA-5157) Options for handling corrupt data during deserialization

2017-07-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080880#comment-16080880
 ] 

ASF GitHub Bot commented on KAFKA-5157:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3423


> Options for handling corrupt data during deserialization
> 
>
> Key: KAFKA-5157
> URL: https://issues.apache.org/jira/browse/KAFKA-5157
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Eno Thereska
>Assignee: Eno Thereska
>  Labels: user-experience
> Fix For: 0.11.1.0
>
>
> When there is badly formatted data in the source topics, deserialization 
> throws a runtime exception all the way up to the user. And since 
> deserialization happens before the record is ever processed at the beginning 
> of the topology, today there is no way to handle such errors at the user-app 
> level.
> We should consider allowing users to handle such "poison pills" in a 
> customizable way.
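The customizable handling this ticket asks for amounts to a pluggable callback consulted on each deserialization failure. The sketch below is illustrative only: the interface, enum, and "deserializer" are invented stand-ins, not the API that actually landed in Kafka:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of "poison pill" handling: a pluggable handler decides whether a
// deserialization failure skips the record or fails the application.
public class PoisonPillSketch {
    enum Response { CONTINUE, FAIL }

    interface DeserializationHandler {
        Response handle(byte[] rawRecord, Exception cause);
    }

    // Deserialize a batch, consulting the handler on each failure.
    public static List<Integer> deserializeAll(List<String> raw,
                                               DeserializationHandler handler) {
        List<Integer> out = new ArrayList<>();
        for (String r : raw) {
            try {
                out.add(Integer.parseInt(r));  // stand-in deserializer
            } catch (NumberFormatException e) {
                if (handler.handle(r.getBytes(), e) == Response.FAIL) {
                    throw new RuntimeException("Deserialization failed: " + r, e);
                }
                // CONTINUE: skip the poison pill and keep processing
            }
        }
        return out;
    }

    public static void main(String[] args) {
        DeserializationHandler logAndContinue = (rec, e) -> Response.CONTINUE;
        System.out.println(deserializeAll(List.of("1", "oops", "3"), logAndContinue));
    }
}
```

A "log and fail" handler preserves today's behavior, while "log and continue" lets the topology survive a single bad record.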





[jira] [Commented] (KAFKA-4931) stop script fails due 4096 ps output limit

2017-07-10 Thread Vahid Hashemian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080885#comment-16080885
 ] 

Vahid Hashemian commented on KAFKA-4931:


[~tombentley] What exactly has the 4096-character limit? Is it the output of 
{{ps ax}}? Or something else?
For me, on an Ubuntu machine, the output of {{ps ax}} is over 50K characters 
long, and the stop script works fine.

> stop script fails due 4096 ps output limit
> --
>
> Key: KAFKA-4931
> URL: https://issues.apache.org/jira/browse/KAFKA-4931
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.10.2.0
>Reporter: Amit Jain
>Assignee: Tom Bentley
>Priority: Minor
>  Labels: patch-available
>
> When run, the script bin/zookeeper-server-stop.sh fails to stop the ZooKeeper 
> server process if the ps output exceeds the 4096-character limit on Linux. I 
> think instead of ps we could use ${JAVA_HOME}/bin/jps -vl | grep QuorumPeerMain, 
> which would correctly stop the ZooKeeper process. Currently we are using kill 
> PIDS=$(ps ax | grep java | grep -i QuorumPeerMain | grep -v grep | awk 
> '{print $1}')
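The reported failure mode, reduced to its essence: if the ps output line is cut off before the class name appears, the grep in the pipeline can never match. A small sketch (the command line below is fabricated for illustration):

```java
// Sketch of the failure mode: a ps line truncated at 4096 characters can
// hide a class name that appears past that point, so the PID lookup that
// greps for QuorumPeerMain finds nothing.
public class PsTruncation {
    public static boolean matchesAfterTruncation(String commandLine, int limit) {
        String truncated = commandLine.substring(0, Math.min(limit, commandLine.length()));
        return truncated.contains("QuorumPeerMain");
    }

    public static void main(String[] args) {
        // A very long java command line with the class name at the end.
        String line = "java " + "-Dx=y ".repeat(900)
                + "org.apache.zookeeper.server.quorum.QuorumPeerMain";
        System.out.println("full line matches:      "
                + matchesAfterTruncation(line, Integer.MAX_VALUE));
        System.out.println("truncated line matches: "
                + matchesAfterTruncation(line, 4096));
    }
}
```

Whether ps actually truncates at 4096 characters depends on the platform and ps implementation, which is exactly the clarification requested above; `jps -vl` sidesteps the question by reading JVM metadata instead of the process table's command line.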





[jira] [Comment Edited] (KAFKA-4931) stop script fails due 4096 ps output limit

2017-07-10 Thread Vahid Hashemian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080885#comment-16080885
 ] 

Vahid Hashemian edited comment on KAFKA-4931 at 7/10/17 7:01 PM:
-

[~tombentley] What exactly has the 4096-character limit? Is it the output of 
{{ps ax}}? Or something else?
For me, on an Ubuntu machine, the output of {{ps ax}} is over 50K characters 
long, and the stop script works fine.
Could you please clarify? Thanks.


was (Author: vahid):
[~tombentley] What exactly has the 4096 character limit? Is it the output of 
{{ps ax}}? Or something else?
For me in an Ubuntu machine, the output of {{ps ax}} is over 50K long, and the 
stop script works fine.

> stop script fails due 4096 ps output limit
> --
>
> Key: KAFKA-4931
> URL: https://issues.apache.org/jira/browse/KAFKA-4931
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.10.2.0
>Reporter: Amit Jain
>Assignee: Tom Bentley
>Priority: Minor
>  Labels: patch-available
>
> When run, the script bin/zookeeper-server-stop.sh fails to stop the ZooKeeper 
> server process if the ps output exceeds the 4096-character limit on Linux. I 
> think instead of ps we could use ${JAVA_HOME}/bin/jps -vl | grep QuorumPeerMain, 
> which would correctly stop the ZooKeeper process. Currently we are using kill 
> PIDS=$(ps ax | grep java | grep -i QuorumPeerMain | grep -v grep | awk 
> '{print $1}')





[jira] [Resolved] (KAFKA-5157) Options for handling corrupt data during deserialization

2017-07-10 Thread Damian Guy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Guy resolved KAFKA-5157.
---
Resolution: Fixed

Issue resolved by pull request 3423
[https://github.com/apache/kafka/pull/3423]

> Options for handling corrupt data during deserialization
> 
>
> Key: KAFKA-5157
> URL: https://issues.apache.org/jira/browse/KAFKA-5157
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Eno Thereska
>Assignee: Eno Thereska
>  Labels: user-experience
> Fix For: 0.11.1.0
>
>
> When there is badly formatted data in the source topics, deserialization will 
> throw a runtime exception all the way to the users. And since deserialization 
> happens at the very beginning of the topology, before the record is ever 
> processed, today there is no way to handle such errors at the user-app level.
> We should consider allowing users to handle such "poison pills" in a 
> customizable way.
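As a rough illustration of the "customizable" handling the issue asks for, here is a hedged plain-Java sketch of the skip-on-error pattern. `SafeDeserializer` and `skipOnError` are illustrative names, not part of the Kafka API:

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class SafeDeserializer {

    // Wrap a deserialization function so a poison pill yields null
    // (meaning "skip this record") instead of a RuntimeException
    // propagating through the topology.
    static <T> Function<byte[], T> skipOnError(Function<byte[], T> inner) {
        return bytes -> {
            try {
                return inner.apply(bytes);
            } catch (RuntimeException e) {
                return null;
            }
        };
    }

    public static void main(String[] args) {
        Function<byte[], Integer> parse =
                bytes -> Integer.parseInt(new String(bytes, StandardCharsets.UTF_8));
        Function<byte[], Integer> safe = skipOnError(parse);
        System.out.println(safe.apply("42".getBytes(StandardCharsets.UTF_8)));   // 42
        System.out.println(safe.apply("oops".getBytes(StandardCharsets.UTF_8))); // null
    }
}
```

The pull request referenced above resolved this with a pluggable handler configured on the application rather than a wrapper like this.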



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5578) Streams Task Assignor should consider the staleness of state directories when allocating tasks

2017-07-10 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080805#comment-16080805
 ] 

Guozhang Wang commented on KAFKA-5578:
--

Hey [~damianguy], do you mean that there could be multiple threads that have the 
state dir for a given task while that task is not on any thread's {{previous 
active tasks}} list? In the StreamPartitionAssignor we chose priorities as:

1. Previous active tasks: this information comes from {{removeActiveTask}}, so 
at most one thread could ever have this task in its list.
2. Previous standby tasks: this information is collected from the state dir, so 
multiple threads could have this task in their lists.
3. Pick a random one with load balancing in mind.

Your concern is that if there is no candidate for priority 1), then we may have 
multiple candidates for priority 2) that actually have different restoration 
costs, is that correct? I wonder how often, in practice, we could ever hit that 
issue given the state dir cleanup process.

> Streams Task Assignor should consider the staleness of state directories when 
> allocating tasks
> --
>
> Key: KAFKA-5578
> URL: https://issues.apache.org/jira/browse/KAFKA-5578
> Project: Kafka
>  Issue Type: Bug
>Reporter: Damian Guy
>
> During task assignment we use the presence of a state directory to decide 
> which instances should be assigned the task. We first choose previous active 
> tasks, but then fall back to the existence of a state dir. Unfortunately we 
> don't take into account the recency of the data in the available state dirs. 
> So in the case where a task has run on many instances, it may be that we 
> choose an instance that has relatively old data.
> When doing task assignment we should take into consideration the age of the 
> data in the state dirs. We could use the data from the checkpoint files to 
> determine which instance is most up-to-date and attempt to assign accordingly 
> (obviously making sure that tasks are still balanced across the available 
> instances)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5578) Streams Task Assignor should consider the staleness of state directories when allocating tasks

2017-07-10 Thread Damian Guy (JIRA)
Damian Guy created KAFKA-5578:
-

 Summary: Streams Task Assignor should consider the staleness of 
state directories when allocating tasks
 Key: KAFKA-5578
 URL: https://issues.apache.org/jira/browse/KAFKA-5578
 Project: Kafka
  Issue Type: Bug
Reporter: Damian Guy


During task assignment we use the presence of a state directory to decide which 
instances should be assigned the task. We first choose previous active tasks, 
but then fall back to the existence of a state dir. Unfortunately we don't take 
into account the recency of the data in the available state dirs. So in the case 
where a task has run on many instances, it may be that we choose an instance 
that has relatively old data.

When doing task assignment we should take into consideration the age of the 
data in the state dirs. We could use the data from the checkpoint files to 
determine which instance is most up-to-date and attempt to assign accordingly 
(obviously making sure that tasks are still balanced across available instances)
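A hedged sketch of the proposed tie-break: among instances that only have a state dir (no previous-active claim), prefer the one whose checkpointed offset is highest, i.e. whose local copy needs the least restoration. `pickInstance` and the instance names are illustrative, not the actual StreamPartitionAssignor API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

public class StalenessAwareAssignor {

    // Choose the candidate instance with the highest checkpointed offset,
    // since a higher offset means fresher local state and less restore work.
    static String pickInstance(Map<String, Long> checkpointOffsetByInstance) {
        return checkpointOffsetByInstance.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow(NoSuchElementException::new);
    }

    public static void main(String[] args) {
        Map<String, Long> candidates = new HashMap<>();
        candidates.put("instance-a", 1_000L);  // stale copy
        candidates.put("instance-b", 9_500L);  // nearly caught up
        System.out.println(pickInstance(candidates)); // instance-b
    }
}
```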



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5536) Tools splitted between Java and Scala implementation

2017-07-10 Thread Paolo Patierno (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080704#comment-16080704
 ] 

Paolo Patierno commented on KAFKA-5536:
---

Thanks Ismael !

1. Good. I can recover the previous work I have done using argparse4j. In any 
case my idea is to mimic the same behaviour we have with the joptsimple lib in 
order to avoid problems.
2. Regarding sharing the code: CommandLineUtils leverages the joptsimple lib, so 
it's not possible to just move it to argparse4j in Java and then share it with 
the other Scala tools (especially because the migration will be slow, tool by 
tool rather than all at once). So it seems the better solution for now is just 
rewriting it for the tools module.

Let's see how things will evolve on TopicCommand, the first tool that will be 
the subject of this big migration.

> Tools splitted between Java and Scala implementation
> 
>
> Key: KAFKA-5536
> URL: https://issues.apache.org/jira/browse/KAFKA-5536
> Project: Kafka
>  Issue Type: Wish
>Reporter: Paolo Patierno
>
> Hi,
> is there any specific reason why the tools are split between Java and Scala 
> implementations?
> Maybe it would be better to have only one language for all of them.
> What do you think?
> Thanks,
> Paolo



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5562) Do streams state directory cleanup on a single thread

2017-07-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080653#comment-16080653
 ] 

ASF GitHub Bot commented on KAFKA-5562:
---

GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/3516

KAFKA-5562: execute state dir cleanup on single thread

Use a single `StateDirectory` per streams instance.
Use threadId to determine which thread owns the lock.
Only allow the owning thread to unlock.
Execute cleanup on a scheduled thread in `KafkaStreams`
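The lock-ownership rule described in this PR summary can be sketched as follows; `StateDirLocks` and its methods are illustrative stand-ins, not the actual `StateDirectory` code:

```java
import java.util.concurrent.ConcurrentHashMap;

public class StateDirLocks {

    // Maps a task directory to the id of the thread that locked it.
    private final ConcurrentHashMap<String, Long> ownerByTaskDir = new ConcurrentHashMap<>();

    // Acquire the lock for a task dir; re-entrant for the owning thread.
    boolean lock(String taskDir) {
        long me = Thread.currentThread().getId();
        Long owner = ownerByTaskDir.putIfAbsent(taskDir, me);
        return owner == null || owner == me;
    }

    // Only the owning thread may release: remove(key, value) fails for others.
    boolean unlock(String taskDir) {
        return ownerByTaskDir.remove(taskDir, Thread.currentThread().getId());
    }

    public static void main(String[] args) {
        StateDirLocks locks = new StateDirLocks();
        System.out.println(locks.lock("0_1"));   // true: acquired
        System.out.println(locks.lock("0_1"));   // true: re-entrant for owner
        System.out.println(locks.unlock("0_1")); // true: owner releases
        System.out.println(locks.unlock("0_1")); // false: already released
    }
}
```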

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka kafka-5562

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3516.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3516


commit 3ef084d34be8fafb368a5e9ceced1339b345d5f2
Author: Damian Guy 
Date:   2017-07-07T08:44:37Z

execute state dir cleanup on single thread




> Do streams state directory cleanup on a single thread
> -
>
> Key: KAFKA-5562
> URL: https://issues.apache.org/jira/browse/KAFKA-5562
> Project: Kafka
>  Issue Type: Bug
>Reporter: Damian Guy
>Assignee: Damian Guy
>
> Currently in streams we clean up old state directories every so often (as 
> defined by {{state.cleanup.delay.ms}}). However, every StreamThread runs the 
> cleanup, which is both unnecessary and can potentially lead to race 
> conditions.
> It would be better to perform the state cleanup on a single thread and only 
> when the {{KafkaStreams}} instance is in a running state.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5577) WindowedStreamPartitioner does not provide topic name to serializer

2017-07-10 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-5577:
--

 Summary: WindowedStreamPartitioner does not provide topic name to 
serializer
 Key: KAFKA-5577
 URL: https://issues.apache.org/jira/browse/KAFKA-5577
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.10.2.0
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
 Fix For: 0.11.0.0, 0.10.2.1


WindowedStreamPartitioner does not provide topic name to serializer.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5536) Tools splitted between Java and Scala implementation

2017-07-10 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080562#comment-16080562
 ] 

Ismael Juma commented on KAFKA-5536:


Thanks for the PR. Answers to the questions:

1. argparse4j was deemed to be more flexible and chosen for the newer Java 
tools. It would probably make sense to use that for all Java tools. But we need 
to make sure we have tests to ensure that we don't break compatibility.

2. There are two options for CLI utils:
* If we think it makes sense to share it between tools and core: rewrite in 
Java in the common package in the clients module, update Scala code to use it.
* Otherwise: rewrite in Java in the tools module.

> Tools splitted between Java and Scala implementation
> 
>
> Key: KAFKA-5536
> URL: https://issues.apache.org/jira/browse/KAFKA-5536
> Project: Kafka
>  Issue Type: Wish
>Reporter: Paolo Patierno
>
> Hi,
> is there any specific reason why the tools are split between Java and Scala 
> implementations?
> Maybe it would be better to have only one language for all of them.
> What do you think?
> Thanks,
> Paolo



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-5577) WindowedStreamPartitioner does not provide topic name to serializer

2017-07-10 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-5577.

Resolution: Fixed

Fixed via https://github.com/apache/kafka/pull/2776

> WindowedStreamPartitioner does not provide topic name to serializer
> ---
>
> Key: KAFKA-5577
> URL: https://issues.apache.org/jira/browse/KAFKA-5577
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.11.0.0, 0.10.2.1
>
>
> WindowedStreamPartitioner does not provide topic name to serializer.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (KAFKA-3268) Refactor existing CLI scripts to use new AdminClient

2017-07-10 Thread Viktor Somogyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi reassigned KAFKA-3268:
-

Assignee: Viktor Somogyi

> Refactor existing CLI scripts to use new AdminClient
> 
>
> Key: KAFKA-3268
> URL: https://issues.apache.org/jira/browse/KAFKA-3268
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Grant Henke
>Assignee: Viktor Somogyi
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3268) Refactor existing CLI scripts to use new AdminClient

2017-07-10 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080385#comment-16080385
 ] 

Grant Henke commented on KAFKA-3268:


Not at all. Feel free.

> Refactor existing CLI scripts to use new AdminClient
> 
>
> Key: KAFKA-3268
> URL: https://issues.apache.org/jira/browse/KAFKA-3268
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Grant Henke
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3268) Refactor existing CLI scripts to use new AdminClient

2017-07-10 Thread Viktor Somogyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080358#comment-16080358
 ] 

Viktor Somogyi commented on KAFKA-3268:
---

[~granthenke] do you mind if I pick this up?

> Refactor existing CLI scripts to use new AdminClient
> 
>
> Key: KAFKA-3268
> URL: https://issues.apache.org/jira/browse/KAFKA-3268
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Grant Henke
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4107) Support offset reset capability in Kafka Connect

2017-07-10 Thread JIRA

[ 
https://issues.apache.org/jira/browse/KAFKA-4107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080346#comment-16080346
 ] 

Sönke Liebau commented on KAFKA-4107:
-

Sounds good to me. I'll investigate the issue a little and draft a high level 
KIP as basis for further discussion.

> Support offset reset capability in Kafka Connect
> 
>
> Key: KAFKA-4107
> URL: https://issues.apache.org/jira/browse/KAFKA-4107
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Jason Gustafson
>
> It would be useful in some cases to be able to reset connector offsets. For 
> example, if a topic in Kafka corresponding to a source database is 
> accidentally deleted (or deleted because of corrupt data), an administrator 
> may want to reset offsets and reproduce the log from the beginning. It may 
> also be useful to have support for overriding offsets, but that seems like a 
> less likely use case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5536) Tools splitted between Java and Scala implementation

2017-07-10 Thread Paolo Patierno (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080163#comment-16080163
 ] 

Paolo Patierno commented on KAFKA-5536:
---

Hi [~ijuma],

I have opened a PR on https://issues.apache.org/jira/browse/KAFKA-5561, just to 
evaluate whether we agree on that path for refactoring the first tool, 
TopicCommand, from Scala to Java while at the same time using the Admin client 
API.
Regarding command line argument parsing, I used the joptsimple lib as in the 
Scala version, even though I see that the few Java tools use argparse4j. With 
joptsimple the transition from Scala to Java is smooth and the logic can stay 
the same.
I see that the original TopicCommand uses some utility classes from "core" 
(CommandLineUtils and CoreUtils, for example).
I rewrote CommandLineUtils in Java (putting it inside "tools"), but I was 
wondering if we should reuse the Scala one; in that case we would have to add 
"core" as a dependency of "tools". I have a doubt about that.

Can you take a look, please?

> Tools splitted between Java and Scala implementation
> 
>
> Key: KAFKA-5536
> URL: https://issues.apache.org/jira/browse/KAFKA-5536
> Project: Kafka
>  Issue Type: Wish
>Reporter: Paolo Patierno
>
> Hi,
> is there any specific reason why the tools are split between Java and Scala 
> implementations?
> Maybe it would be better to have only one language for all of them.
> What do you think?
> Thanks,
> Paolo



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5561) Rewrite TopicCommand using the new Admin client

2017-07-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080159#comment-16080159
 ] 

ASF GitHub Bot commented on KAFKA-5561:
---

GitHub user ppatierno opened a pull request:

https://github.com/apache/kafka/pull/3514

KAFKA-5561: Rewrite TopicCommand using the new Admin client (** WIP do not 
merge **)

Start on porting TopicCommand from Scala to Java using Admin Client API as 
well

Please do not merge; this is just for evaluation with other contributors and 
committers as a starting point for a migration of the tools from Scala to Java. 
Other considerations here:

https://issues.apache.org/jira/browse/KAFKA-5536

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppatierno/kafka kafka-5561

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3514.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3514


commit a4146eeab9960d183733bc195516839fc2f76601
Author: ppatierno 
Date:   2017-07-10T10:51:53Z

Start on porting TopicCommand from Scala to Java using Admin Client API as 
well




> Rewrite TopicCommand using the new Admin client
> ---
>
> Key: KAFKA-5561
> URL: https://issues.apache.org/jira/browse/KAFKA-5561
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Paolo Patierno
>Assignee: Paolo Patierno
>
> Hi, 
> as suggested in the https://issues.apache.org/jira/browse/KAFKA-3331, it 
> could be great to have the TopicCommand using the new Admin client instead of 
> the way it works today.
> As pushed by [~gwenshap] in the above JIRA, I'm going to work on it.
> Thanks,
> Paolo



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-1044) change log4j to slf4j

2017-07-10 Thread Viktor Somogyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080123#comment-16080123
 ] 

Viktor Somogyi commented on KAFKA-1044:
---

[~ewencp] could you please look at my change when you have time?
An important question has been asked in the review: should we use scala logging?

> change log4j to slf4j 
> --
>
> Key: KAFKA-1044
> URL: https://issues.apache.org/jira/browse/KAFKA-1044
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.8.0
>Reporter: shijinkui
>Assignee: Viktor Somogyi
>Priority: Minor
>  Labels: newbie
>
> Can you change log4j to slf4j? In my project I use logback, and it conflicts 
> with log4j.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-5576) Support Power platform by updating rocksdb

2017-07-10 Thread Yussuf Shaikh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080092#comment-16080092
 ] 

Yussuf Shaikh edited comment on KAFKA-5576 at 7/10/17 10:07 AM:


RocksDB team have applied the fixes for Power platform in their latest release. 
Hence propose to update the rocksdb version to 5.4.7/5.5.1 which fixes the 
issues on Power.

All tests ran successfully after making below change in gradle dependencies 
file:
rocksDB: "5.5.1"


was (Author: yussufshaikh):
RocksDB team have applied the fixes for Power platform in their latest release. 
Hence propose to update the rocksdb version to 5.4.7 which fixes the issues on 
Power.

All tests ran successfully after making below change in gradle dependencies 
file:
rocksDB: "5.5.1"

> Support Power platform by updating rocksdb
> --
>
> Key: KAFKA-5576
> URL: https://issues.apache.org/jira/browse/KAFKA-5576
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.1.0
> Environment: $ cat /etc/lsb-release
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=14.04
> DISTRIB_CODENAME=trusty
> DISTRIB_DESCRIPTION="Ubuntu 14.04.2 LTS"
> $ uname -a
> Linux pts00432-vm20 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 
> 17:42:36 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
>Reporter: Yussuf Shaikh
>Priority: Minor
> Attachments: KAFKA-5576.patch
>
>
> Many test cases are failing with one of the following exceptions related to 
> rocksdb.
> 1. java.lang.NoClassDefFoundError: Could not initialize class 
> org.rocksdb.Options
> at 
> org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:119)
> at 
> org.apache.kafka.streams.state.internals.Segment.openDB(Segment.java:40)
> 2. java.lang.UnsatisfiedLinkError: /tmp/librocksdbjni4427030040392983276.so: 
> /lib64/ld64.so.2: version `GLIBC_2.22' not found (required by 
> /tmp/librocksdbjni4427030040392983276.so)
> at java.lang.ClassLoader$NativeLibrary.load(Native Method)
> at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
> at java.lang.Runtime.load0(Runtime.java:809)
> at java.lang.System.load(System.java:1086)
> at 
> org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:78)
> 3. java.lang.AssertionError: Condition not met within timeout 3. 
> Expecting 3 records from topic output-topic-2 while only received 0: []
> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:274)
> at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:160)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5576) Support Power platform by updating rocksdb

2017-07-10 Thread Yussuf Shaikh (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yussuf Shaikh updated KAFKA-5576:
-
Attachment: KAFKA-5576.patch

> Support Power platform by updating rocksdb
> --
>
> Key: KAFKA-5576
> URL: https://issues.apache.org/jira/browse/KAFKA-5576
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.1.0
> Environment: $ cat /etc/lsb-release
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=14.04
> DISTRIB_CODENAME=trusty
> DISTRIB_DESCRIPTION="Ubuntu 14.04.2 LTS"
> $ uname -a
> Linux pts00432-vm20 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 
> 17:42:36 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
>Reporter: Yussuf Shaikh
>Priority: Minor
> Attachments: KAFKA-5576.patch
>
>
> Many test cases are failing with one of the following exceptions related to 
> rocksdb.
> 1. java.lang.NoClassDefFoundError: Could not initialize class 
> org.rocksdb.Options
> at 
> org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:119)
> at 
> org.apache.kafka.streams.state.internals.Segment.openDB(Segment.java:40)
> 2. java.lang.UnsatisfiedLinkError: /tmp/librocksdbjni4427030040392983276.so: 
> /lib64/ld64.so.2: version `GLIBC_2.22' not found (required by 
> /tmp/librocksdbjni4427030040392983276.so)
> at java.lang.ClassLoader$NativeLibrary.load(Native Method)
> at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
> at java.lang.Runtime.load0(Runtime.java:809)
> at java.lang.System.load(System.java:1086)
> at 
> org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:78)
> 3. java.lang.AssertionError: Condition not met within timeout 3. 
> Expecting 3 records from topic output-topic-2 while only received 0: []
> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:274)
> at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:160)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-5576) Support Power platform by updating rocksdb

2017-07-10 Thread Yussuf Shaikh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080092#comment-16080092
 ] 

Yussuf Shaikh edited comment on KAFKA-5576 at 7/10/17 10:08 AM:


RocksDB team have applied the fixes for Power platform in their latest release. 
Hence propose to update the rocksdb version to 5.4.7 or 5.5.1 which fixes the 
issues on Power.

All tests ran successfully after making below change in gradle dependencies 
file:
rocksDB: "5.5.1"


was (Author: yussufshaikh):
RocksDB team have applied the fixes for Power platform in their latest release. 
Hence propose to update the rocksdb version to 5.4.7/5.5.1 which fixes the 
issues on Power.

All tests ran successfully after making below change in gradle dependencies 
file:
rocksDB: "5.5.1"

> Support Power platform by updating rocksdb
> --
>
> Key: KAFKA-5576
> URL: https://issues.apache.org/jira/browse/KAFKA-5576
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.1.0
> Environment: $ cat /etc/lsb-release
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=14.04
> DISTRIB_CODENAME=trusty
> DISTRIB_DESCRIPTION="Ubuntu 14.04.2 LTS"
> $ uname -a
> Linux pts00432-vm20 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 
> 17:42:36 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
>Reporter: Yussuf Shaikh
>Priority: Minor
> Attachments: KAFKA-5576.patch
>
>
> Many test cases are failing with one of the following exceptions related to 
> rocksdb.
> 1. java.lang.NoClassDefFoundError: Could not initialize class 
> org.rocksdb.Options
> at 
> org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:119)
> at 
> org.apache.kafka.streams.state.internals.Segment.openDB(Segment.java:40)
> 2. java.lang.UnsatisfiedLinkError: /tmp/librocksdbjni4427030040392983276.so: 
> /lib64/ld64.so.2: version `GLIBC_2.22' not found (required by 
> /tmp/librocksdbjni4427030040392983276.so)
> at java.lang.ClassLoader$NativeLibrary.load(Native Method)
> at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
> at java.lang.Runtime.load0(Runtime.java:809)
> at java.lang.System.load(System.java:1086)
> at 
> org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:78)
> 3. java.lang.AssertionError: Condition not met within timeout 3. 
> Expecting 3 records from topic output-topic-2 while only received 0: []
> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:274)
> at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:160)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5576) Support Power platform by updating rocksdb

2017-07-10 Thread Yussuf Shaikh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080092#comment-16080092
 ] 

Yussuf Shaikh commented on KAFKA-5576:
--

The RocksDB team has applied the fixes for the Power platform in their latest 
release. Hence I propose updating the rocksdb version to 5.4.7, which fixes the 
issues on Power.

All tests ran successfully after making the below change in the gradle 
dependencies file:
rocksDB: "5.5.1"

> Support Power platform by updating rocksdb
> --
>
> Key: KAFKA-5576
> URL: https://issues.apache.org/jira/browse/KAFKA-5576
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.1.0
> Environment: $ cat /etc/lsb-release
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=14.04
> DISTRIB_CODENAME=trusty
> DISTRIB_DESCRIPTION="Ubuntu 14.04.2 LTS"
> $ uname -a
> Linux pts00432-vm20 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 
> 17:42:36 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
>Reporter: Yussuf Shaikh
>Priority: Minor
>
> Many test cases are failing with one of the following exceptions related to 
> rocksdb.
> 1. java.lang.NoClassDefFoundError: Could not initialize class 
> org.rocksdb.Options
> at 
> org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:119)
> at 
> org.apache.kafka.streams.state.internals.Segment.openDB(Segment.java:40)
> 2. java.lang.UnsatisfiedLinkError: /tmp/librocksdbjni4427030040392983276.so: 
> /lib64/ld64.so.2: version `GLIBC_2.22' not found (required by 
> /tmp/librocksdbjni4427030040392983276.so)
> at java.lang.ClassLoader$NativeLibrary.load(Native Method)
> at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
> at java.lang.Runtime.load0(Runtime.java:809)
> at java.lang.System.load(System.java:1086)
> at 
> org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:78)
> 3. java.lang.AssertionError: Condition not met within timeout 3. 
> Expecting 3 records from topic output-topic-2 while only received 0: []
> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:274)
> at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:160)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5576) Support Power platform by updating rocksdb

2017-07-10 Thread Yussuf Shaikh (JIRA)
Yussuf Shaikh created KAFKA-5576:


 Summary: Support Power platform by updating rocksdb
 Key: KAFKA-5576
 URL: https://issues.apache.org/jira/browse/KAFKA-5576
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.11.1.0
 Environment: $ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.2 LTS"
$ uname -a
Linux pts00432-vm20 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 
17:42:36 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
Reporter: Yussuf Shaikh
Priority: Minor


Many test cases are failing with one of the following exceptions related to 
rocksdb.

1. java.lang.NoClassDefFoundError: Could not initialize class 
org.rocksdb.Options
at 
org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:119)
at 
org.apache.kafka.streams.state.internals.Segment.openDB(Segment.java:40)
2. java.lang.UnsatisfiedLinkError: /tmp/librocksdbjni4427030040392983276.so: 
/lib64/ld64.so.2: version `GLIBC_2.22' not found (required by 
/tmp/librocksdbjni4427030040392983276.so)
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
at java.lang.Runtime.load0(Runtime.java:809)
at java.lang.System.load(System.java:1086)
at 
org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:78)
3. java.lang.AssertionError: Condition not met within timeout 3. Expecting 
3 records from topic output-topic-2 while only received 0: []
at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:274)
at 
org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:160)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5560) LogManager should be able to create new logs based on free disk space

2017-07-10 Thread huxihx (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huxihx updated KAFKA-5560:
--
Labels: kips  (was: )

> LogManager should be able to create new logs based on free disk space
> -
>
> Key: KAFKA-5560
> URL: https://issues.apache.org/jira/browse/KAFKA-5560
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.11.0.0
>Reporter: huxihx
>  Labels: kips
>
> Currently, the log manager chooses a directory configured in `log.dirs` by 
> calculating the number of partitions in each directory and choosing the one 
> with the fewest partitions. But in some real production scenarios where the 
> data volumes of partitions are not even, some disks become nearly full while 
> others have plenty of free space, which leads to a poor data distribution.
> We should offer a new strategy that lets the log manager honor the actual 
> free disk space and choose the directory with the most space available. Maybe 
> a new broker configuration parameter is needed, `log.directory.strategy` for 
> instance. Perhaps this also needs a new KIP.
> Does it make sense?
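The proposed selection strategy can be sketched as follows: rank the configured log directories by usable space instead of partition count. `mostFreeSpace` and the directory names are illustrative, not the actual LogManager API (in a real broker the map values would come from something like `File.getUsableSpace()`):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

public class LogDirChooser {

    // Pick the log.dirs entry with the most usable bytes, so new partitions
    // land on the least-full disk rather than the one with the fewest partitions.
    static String mostFreeSpace(Map<String, Long> usableBytesByDir) {
        return usableBytesByDir.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow(NoSuchElementException::new);
    }

    public static void main(String[] args) {
        Map<String, Long> dirs = new LinkedHashMap<>();
        dirs.put("/data/kafka-logs-1", 5L * 1024 * 1024 * 1024);   // ~5 GiB free
        dirs.put("/data/kafka-logs-2", 120L * 1024 * 1024 * 1024); // ~120 GiB free
        System.out.println(mostFreeSpace(dirs)); // /data/kafka-logs-2
    }
}
```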





[jira] [Commented] (KAFKA-5575) SchemaBuilder should have a method to clone an existing Schema.

2017-07-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16079890#comment-16079890
 ] 

ASF GitHub Bot commented on KAFKA-5575:
---

GitHub user jcustenborder opened a pull request:

https://github.com/apache/kafka/pull/3511

KAFKA-5575 - Add SchemaBuilder.from

Added `SchemaBuilder.from` method which allows creating a schema builder 
prepopulated with the details from the schema. Added tests for structs, maps, 
arrays, primitives, and logical types.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jcustenborder/kafka KAFKA-5575

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3511.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3511


commit 1acb99a6b96b50958956659c770a055c3fd4d133
Author: Jeremy Custenborder 
Date:   2017-07-10T05:36:38Z

KAFKA-5575 Added `SchemaBuilder.from` method which allows creating a schema 
builder prepopulated with the details from the schema. Added tests for structs, 
maps, arrays, primitives, and logical types.




> SchemaBuilder should have a method to clone an existing Schema.
> ---
>
> Key: KAFKA-5575
> URL: https://issues.apache.org/jira/browse/KAFKA-5575
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Jeremy Custenborder
>Assignee: Jeremy Custenborder
>Priority: Minor
>
> Now that Transformations have landed in Kafka Connect, we should have an 
> easy way to make quick modifications to schemas. For example, changing the 
> name of a schema shouldn't take much more than this:
> {code:java}
> return SchemaBuilder.from(Schema.STRING_SCHEMA).name("MyNewName").build()
> {code}
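The cloning pattern behind the proposed `SchemaBuilder.from` can be sketched with simplified stand-in types (`MiniSchema` and `MiniSchemaBuilder` are hypothetical and deliberately tiny; they are not Kafka Connect's classes, which also carry doc strings, defaults, and logical-type metadata):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for an immutable Connect-style schema.
final class MiniSchema {
    final String type;
    final String name;
    final Map<String, String> fields; // field name -> field type

    MiniSchema(String type, String name, Map<String, String> fields) {
        this.type = type;
        this.name = name;
        this.fields = fields;
    }
}

final class MiniSchemaBuilder {
    private final String type;
    private String name;
    private final Map<String, String> fields = new LinkedHashMap<>();

    MiniSchemaBuilder(String type) {
        this.type = type;
    }

    // The `from` idea: prepopulate a builder with an existing schema's
    // details, so callers only override what they want to change.
    static MiniSchemaBuilder from(MiniSchema schema) {
        MiniSchemaBuilder b = new MiniSchemaBuilder(schema.type);
        b.name = schema.name;
        b.fields.putAll(schema.fields);
        return b;
    }

    MiniSchemaBuilder name(String name) {
        this.name = name;
        return this;
    }

    MiniSchema build() {
        // Copy the field map so the built schema is independent of the builder.
        return new MiniSchema(type, name, new LinkedHashMap<>(fields));
    }
}
```

Usage mirrors the one-liner above: `MiniSchemaBuilder.from(original).name("MyNewName").build()` yields a schema identical to `original` except for the name.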


