[jira] [Updated] (KAFKA-9466) Add documentation for new stream EOS change

2020-01-22 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-9466:
---
Component/s: streams
 docs

> Add documentation for new stream EOS change
> ---
>
> Key: KAFKA-9466
> URL: https://issues.apache.org/jira/browse/KAFKA-9466
> Project: Kafka
>  Issue Type: Sub-task
>  Components: docs, streams
>Reporter: Boyang Chen
>Assignee: Matthias J. Sax
>Priority: Major
>
> We shall fill in more details when we actually reach this stage.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9464) Close the producer in completeShutdown

2020-01-22 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax reassigned KAFKA-9464:
--

Assignee: Ted Yu

> Close the producer in completeShutdown
> --
>
> Key: KAFKA-9464
> URL: https://issues.apache.org/jira/browse/KAFKA-9464
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
>
> In StreamThread#completeShutdown, the producer (if not null) should be closed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9465) Enclose consumer call with catching InvalidOffsetException

2020-01-22 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021807#comment-17021807
 ] 

Matthias J. Sax commented on KAFKA-9465:


Added you to the list of contributors – you can now self-assign tickets.

> Enclose consumer call with catching InvalidOffsetException
> --
>
> Key: KAFKA-9465
> URL: https://issues.apache.org/jira/browse/KAFKA-9465
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
>
> In maybeUpdateStandbyTasks, the try block encloses restoreConsumer.poll and 
> record handling.
> Since InvalidOffsetException is thrown only by restoreConsumer.poll, we should 
> enclose only this call in the try block.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9465) Enclose consumer call with catching InvalidOffsetException

2020-01-22 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax reassigned KAFKA-9465:
--

Assignee: Ted Yu

> Enclose consumer call with catching InvalidOffsetException
> --
>
> Key: KAFKA-9465
> URL: https://issues.apache.org/jira/browse/KAFKA-9465
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
>
> In maybeUpdateStandbyTasks, the try block encloses restoreConsumer.poll and 
> record handling.
> Since InvalidOffsetException is thrown only by restoreConsumer.poll, we should 
> enclose only this call in the try block.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8532) controller-event-thread deadlock with zk-session-expiry-handler0

2020-01-22 Thread Jun Rao (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021795#comment-17021795
 ] 

Jun Rao commented on KAFKA-8532:


[~lbdai3190]: To verify this, I wrote the following short program. I created 
an instance of KafkaZkClient and registered a StateChangeHandler that blocks in 
beforeInitializingSession().
{code:java}
package kafka.tools

import kafka.zk.KafkaZkClient
import kafka.zookeeper.StateChangeHandler
import org.apache.kafka.common.utils.Time

object Mytest {

  // visible for testing
  private[tools] val RecordIndent = "|"

  def main(args: Array[String]): Unit = {
    val kafkaZkClient = KafkaZkClient("localhost:2181", false, 6000, 6000, 10,
      Time.SYSTEM, "mytest")
    kafkaZkClient.registerStateChangeHandler(new StateChangeHandler {
      override val name: String = "mytest"
      override def afterInitializingSession(): Unit = {
        throw new IllegalStateException
      }
      override def beforeInitializingSession(): Unit = {
        Thread.sleep(Integer.MAX_VALUE) // block forever
      }
    })

    println("zookeeper client state: " + kafkaZkClient.currentZooKeeper.getState)
    Thread.sleep(20000) // the 20 sec window in which to pause the process
    try {
      val children = kafkaZkClient.getChildren("/")
      println("child nodes are " + children)
    } catch {
      case t: Throwable =>
        println("hit exception " + t)
    }
    println("zookeeper client state: " + kafkaZkClient.currentZooKeeper.getState)
  }
}{code}
 

I then started the ZooKeeper server and ran the above program 
(bin/kafka-run-class.sh kafka.tools.Mytest). I waited until the ZooKeeper 
client got to the CONNECTED state (but before the 20 sec sleep completed) and 
did "kill -STOP" to pause the program. I waited for another 6 seconds for the 
ZK session to expire. Then I did "kill -CONT" to resume the program. The 
following is the output that I got:
{code:java}
zookeeper client state: CONNECTED
[2020-01-22 21:47:01,693] WARN Client session timed out, have not heard from 
server in 17038ms for sessionid 0x1002ced93e6 
(org.apache.zookeeper.ClientCnxn)
[2020-01-22 21:47:03,351] WARN Unable to reconnect to ZooKeeper service, 
session 0x1002ced93e6 has expired (org.apache.zookeeper.ClientCnxn)
hit exception org.apache.zookeeper.KeeperException$SessionExpiredException: 
KeeperErrorCode = Session expired for /
zookeeper client state: CLOSED 
{code}
 

As you can see, this program verifies a few things. (1) If a ZK session 
expires, the state of ZK client transitions to CLOSED (not CONNECTING). (2) If 
ZooKeeperClient.handleRequests() is called after the ZK session has expired, 
the call doesn't block and returns a Session expired error code. (3) Even if 
beforeInitializingSession() blocks, ZooKeeperClient.handleRequests() is not 
blocked on an expired session.

> controller-event-thread deadlock with zk-session-expiry-handler0
> 
>
> Key: KAFKA-8532
> URL: https://issues.apache.org/jira/browse/KAFKA-8532
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.1
>Reporter: leibo
>Priority: Blocker
> Attachments: controllerDeadLockAnalysis-2020-01-20.png, js.log, 
> js0.log, js1.log, js2.log
>
>
> We have observed a serious deadlock between the controller-event-thread and 
> the zk-session-expiry-handler thread. When this issue occurs, the only way to 
> recover the Kafka cluster is to restart the Kafka server. The following is the 
> jstack log of the controller-event-thread and zk-session-expiry-handler thread.
> "zk-session-expiry-handler0" #163089 daemon prio=5 os_prio=0 
> tid=0x7fcc9c01 nid=0xfb22 waiting on condition [0x7fcbb01f8000]
>  java.lang.Thread.State: WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for <0x0005ee3f7000> (a 
> java.util.concurrent.CountDownLatch$Sync)
>  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) // 
> waiting for the controller-event-thread to process the Expire event
>  at 
> kafka.controller.KafkaController$Expire.waitUntilProcessingStarted(KafkaController.scala:1533)
>  at 
> kafka.controller.KafkaController$$anon$7.beforeInitializingSession(KafkaController.scala:173)
>  at 
> kafka.zookeeper.ZooKeeperClient.callBeforeInitializingSession(ZooKeeperClient.scala:408)
>  at 
> kafka.zookeeper.ZooKeeperClient.$anonfun$reinitialize$1(ZooKeeperClient.scala:374)

[jira] [Updated] (KAFKA-9468) config.storage.topic partition count issue is hard to debug

2020-01-22 Thread Evelyn Bayes (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evelyn Bayes updated KAFKA-9468:

Description: 
When you run Connect distributed with 2 or more workers and 
config.storage.topic has more than one partition, you can end up with one of the 
workers rebalancing endlessly:

[2020-01-13 12:53:23,535] INFO [Worker clientId=connect-1, 
groupId=connect-cluster] Current config state offset 37 is behind group 
assignment 63, reading to end of config log 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)
 [2020-01-13 12:53:23,584] INFO [Worker clientId=connect-1, 
groupId=connect-cluster] Finished reading to end of log and updated config 
snapshot, new config log offset: 37 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)
 [2020-01-13 12:53:23,584] INFO [Worker clientId=connect-1, 
groupId=connect-cluster] Current config state offset 37 does not match group 
assignment 63. Forcing rebalance. 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)

 

In case anyone viewing this doesn't know: you are only ever meant to create 
this topic with one partition.

 

*Suggested Solution*

Make the Connect worker check the partition count when it starts; if the 
partition count is > 1, Kafka Connect stops and logs the reason why.

I think this is reasonable, as it would stop users who are just starting out 
from building it incorrectly, and it would be easy to fix early. For those 
upgrading, this would easily be caught in a PRE-PROD environment. And even if 
they upgraded directly in PROD, you would only be impacted if you upgraded all 
Connect workers at the same time.

  was:
When you run Connect distributed with 2 or more workers and 
config.storage.topic has more than one partition, you can end up with one of the 
workers rebalancing endlessly:

[2020-01-13 12:53:23,535] INFO [Worker clientId=connect-1, 
groupId=connect-cluster] Current config state offset 37 is behind group 
assignment 63, reading to end of config log 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)
[2020-01-13 12:53:23,584] INFO [Worker clientId=connect-1, 
groupId=connect-cluster] Finished reading to end of log and updated config 
snapshot, new config log offset: 37 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)
[2020-01-13 12:53:23,584] INFO [Worker clientId=connect-1, 
groupId=connect-cluster] Current config state offset 37 does not match group 
assignment 63. Forcing rebalance. 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)

 

*Suggested Solution*

Make the Connect worker check the partition count when it starts; if the 
partition count is > 1, Kafka Connect stops and logs the reason why.

I think this is reasonable, as it would stop users who are just starting out 
from building it incorrectly, and it would be easy to fix early. For those 
upgrading, this would easily be caught in a PRE-PROD environment. And even if 
they upgraded directly in PROD, you would only be impacted if you upgraded all 
Connect workers at the same time.


> config.storage.topic partition count issue is hard to debug
> ---
>
> Key: KAFKA-9468
> URL: https://issues.apache.org/jira/browse/KAFKA-9468
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 1.0.2, 1.1.1, 2.0.1, 2.1.1, 2.2.2, 2.4.0, 2.3.1
>Reporter: Evelyn Bayes
>Priority: Minor
>
> When you run Connect distributed with 2 or more workers and 
> config.storage.topic has more than one partition, you can end up with one of 
> the workers rebalancing endlessly:
> [2020-01-13 12:53:23,535] INFO [Worker clientId=connect-1, 
> groupId=connect-cluster] Current config state offset 37 is behind group 
> assignment 63, reading to end of config log 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
>  [2020-01-13 12:53:23,584] INFO [Worker clientId=connect-1, 
> groupId=connect-cluster] Finished reading to end of log and updated config 
> snapshot, new config log offset: 37 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
>  [2020-01-13 12:53:23,584] INFO [Worker clientId=connect-1, 
> groupId=connect-cluster] Current config state offset 37 does not match group 
> assignment 63. Forcing rebalance. 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
>  
> In case anyone viewing this doesn't know: you are only ever meant to 
> create this topic with one partition.
>  
> *Suggested Solution*
> Make the Connect worker check the partition count when it starts; if the 
> partition count is > 1, Kafka Connect stops and logs the reason why.
> I think this is reasonable, as it would stop users who are just starting out 
> from building it incorrectly, and it would be easy to fix early. For those 
> upgrading, this would easily be caught in a PRE-PROD environment. And even if 
> they upgraded directly in PROD, you would only be impacted if you upgraded 
> all Connect workers at the same time.

[jira] [Created] (KAFKA-9468) config.storage.topic partition count issue is hard to debug

2020-01-22 Thread Evelyn Bayes (Jira)
Evelyn Bayes created KAFKA-9468:
---

 Summary: config.storage.topic partition count issue is hard to 
debug
 Key: KAFKA-9468
 URL: https://issues.apache.org/jira/browse/KAFKA-9468
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Affects Versions: 2.3.1, 2.4.0, 2.2.2, 2.1.1, 2.0.1, 1.1.1, 1.0.2
Reporter: Evelyn Bayes


When you run Connect distributed with 2 or more workers and 
config.storage.topic has more than one partition, you can end up with one of the 
workers rebalancing endlessly:

[2020-01-13 12:53:23,535] INFO [Worker clientId=connect-1, 
groupId=connect-cluster] Current config state offset 37 is behind group 
assignment 63, reading to end of config log 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)
[2020-01-13 12:53:23,584] INFO [Worker clientId=connect-1, 
groupId=connect-cluster] Finished reading to end of log and updated config 
snapshot, new config log offset: 37 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)
[2020-01-13 12:53:23,584] INFO [Worker clientId=connect-1, 
groupId=connect-cluster] Current config state offset 37 does not match group 
assignment 63. Forcing rebalance. 
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)

 

*Suggested Solution*

Make the Connect worker check the partition count when it starts; if the 
partition count is > 1, Kafka Connect stops and logs the reason why.

I think this is reasonable, as it would stop users who are just starting out 
from building it incorrectly, and it would be easy to fix early. For those 
upgrading, this would easily be caught in a PRE-PROD environment. And even if 
they upgraded directly in PROD, you would only be impacted if you upgraded all 
Connect workers at the same time.
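
A hedged sketch of the suggested startup check, using the public Java 
AdminClient (the method placement, config plumbing, and exception type are 
illustrative assumptions, not the actual Connect code):
{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.config.ConfigException;

final class ConfigTopicCheck {
  // Fail fast at worker startup instead of rebalancing endlessly later.
  static void verifySinglePartition(final Properties adminProps, final String configTopic)
      throws Exception {
    try (AdminClient admin = AdminClient.create(adminProps)) {
      final TopicDescription description = admin.describeTopics(Collections.singleton(configTopic))
          .all().get().get(configTopic);
      final int partitions = description.partitions().size();
      if (partitions > 1) {
        throw new ConfigException("Topic '" + configTopic + "' supplied via config.storage.topic"
            + " has " + partitions + " partitions; it must have exactly one.");
      }
    }
  }
}
{code}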



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9467) Multiple wallclock punctuators may be scheduled after a rebalance

2020-01-22 Thread Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021696#comment-17021696
 ] 

Sophie Blee-Goldman commented on KAFKA-9467:


Obviously option 3 is a lot more work and may not be worth it, but the option 
is always there. That means we can choose to go with 1 or 2 now, and if it 
seems like we made the wrong choice based on actual user needs, we can always 
add a new type and implement the other option.

> Multiple wallclock punctuators may be scheduled after a rebalance
> -
>
> Key: KAFKA-9467
> URL: https://issues.apache.org/jira/browse/KAFKA-9467
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Sophie Blee-Goldman
>Priority: Major
>
> In the eager rebalancing protocol*, Streams will suspend all tasks at the 
> beginning of a rebalance and then resume those which have been reassigned to 
> the same StreamThread. Part of suspending and resuming a task involves 
> closing and reinitializing the topology, specifically calling Processor#close 
> followed by Processor#init. If a wallclock punctuator is scheduled as part of 
> init, it will be scheduled again after every rebalance. Streams does not 
> cancel existing punctuators during suspension, and does not tell users they 
> must cancel punctuations themselves during Processor#close.
> This can cause multiple punctuators to build up over time, which has the 
> apparent effect of increasing the net punctuation rate for wallclock 
> punctuators. (The same technically occurs with event-time punctuators, but 
> the punctuation times are anchored relative to a fixed point and only one 
> will be triggered at a time, so there is no increased punctuation rate).
> There are several options at this point:
> A) Clear/cancel any existing punctuators during task suspension
> B) Push it to the user to cancel their punctuators in Processor#close, and 
> update the documentation and examples to clarify this.
> C) Leave existing punctuators alone during suspension, and instead block new 
> ones from being scheduled on top during re-initialization.
> One drawback of options A and B is that cancelling/rescheduling punctuators 
> can mean a punctuation is never triggered if rebalances are more frequent 
> than the punctuation interval. Even if they are still triggered, the 
> effective punctuation interval will actually decrease as each rebalance 
> delays the punctuation.
> Of course, if the task _does_ get migrated to another thread/instance the 
> punctuation would be reset anyways with option C, since we do not currently 
> store/persist the punctuation information anywhere. The wallclock semantics 
> are somewhat loosely defined, but I think most users would not consider any 
> of these a proper fix on their own as it just pushes the issue in the other 
> direction.
> Of course, if we were to anchor the wallclock punctuations to a fixed time 
> then this would not be a problem. At that point it seems reasonable to just 
> leave it up to the user to cancel the punctuation during Processor#close, 
> similar to any other kind of resource that must be cleaned up. Even if users 
> forgot to do so it wouldn't affect the actual behavior, just causes unused 
> punctuators to build up. See https://issues.apache.org/jira/browse/KAFKA-7699.
> Given this, I think the options for a complete solution are:
> 1) Implement KAFKA-7699 and then do A or B
> 2) Persist the current punctuation schedule while migrating a task 
> (presumably in the Subscription userdata) and then do C
> Choosing the best course of action here is probably blocked on a decision on 
> whether or not we want to anchor wallclock punctuations (KAFKA-7699). If we 
> can't get consensus on that, we could always
> 3) Introduce a third type of punctuation, then do both 1 and 2 (for the new 
> "anchored-wall-clock" type and the existing "wall-clock" type, respectively).
>  
> -*Another naive workaround for this issue is to turn on/upgrade to 
> cooperative rebalancing, which will not suspend and resume all active tasks 
> during a rebalance, and only suspend tasks that will be immediately closed 
> and migrated to another instance or StreamThread. Of course, this will still 
> cause the punctuation to be reset for tasks that _are_ actually 
> closed/migrated, so practically speaking it's identical to option C alone.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9467) Multiple wallclock punctuators may be scheduled after a rebalance

2020-01-22 Thread Sophie Blee-Goldman (Jira)
Sophie Blee-Goldman created KAFKA-9467:
--

 Summary: Multiple wallclock punctuators may be scheduled after a 
rebalance
 Key: KAFKA-9467
 URL: https://issues.apache.org/jira/browse/KAFKA-9467
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Sophie Blee-Goldman


In the eager rebalancing protocol*, Streams will suspend all tasks at the 
beginning of a rebalance and then resume those which have been reassigned to 
the same StreamThread. Part of suspending and resuming a task involves closing 
and reinitializing the topology, specifically calling Processor#close followed 
by Processor#init. If a wallclock punctuator is scheduled as part of init, it 
will be scheduled again after every rebalance. Streams does not cancel 
existing punctuators during suspension, and does not tell users they must 
cancel punctuations themselves during Processor#close.
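
For illustration, a hedged sketch of the user-side pattern that option B below 
would require (the processor type, interval, and store-free topology are made 
up):
{code:java}
import java.time.Duration;
import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.Cancellable;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.PunctuationType;

public class CancellingProcessor extends AbstractProcessor<String, String> {
  private Cancellable punctuator;

  @Override
  public void init(final ProcessorContext context) {
    super.init(context);
    // scheduled on every (re)init -- without the cancel in close(), each
    // suspend/resume cycle stacks another wallclock punctuator
    punctuator = context.schedule(Duration.ofSeconds(10),
        PunctuationType.WALL_CLOCK_TIME, timestamp -> { /* periodic work */ });
  }

  @Override
  public void process(final String key, final String value) { }

  @Override
  public void close() {
    if (punctuator != null) {
      punctuator.cancel();
    }
  }
}
{code}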

This can cause multiple punctuators to build up over time, which has the 
apparent effect of increasing the net punctuation rate for wallclock 
punctuators. (The same technically occurs with event-time punctuators, but the 
punctuation times are anchored relative to a fixed point and only one will be 
triggered at a time, so there is no increased punctuation rate).

There are several options at this point:

A) Clear/cancel any existing punctuators during task suspension

B) Push it to the user to cancel their punctuators in Processor#close, and 
update the documentation and examples to clarify this.

C) Leave existing punctuators alone during suspension, and instead block new 
ones from being scheduled on top during re-initialization.

One drawback of options A and B is that cancelling/rescheduling punctuators can 
mean a punctuation is never triggered if rebalances are more frequent than the 
punctuation interval. Even if they are still triggered, the effective 
punctuation interval will actually decrease as each rebalance delays the 
punctuation.

Of course, if the task _does_ get migrated to another thread/instance the 
punctuation would be reset anyways with option C, since we do not currently 
store/persist the punctuation information anywhere. The wallclock semantics are 
somewhat loosely defined, but I think most users would not consider any of 
these a proper fix on their own as it just pushes the issue in the other 
direction.

Of course, if we were to anchor the wallclock punctuations to a fixed time then 
this would not be a problem. At that point it seems reasonable to just leave it 
up to the user to cancel the punctuation during Processor#close, similar to any 
other kind of resource that must be cleaned up. Even if users forgot to do so 
it wouldn't affect the actual behavior, just causes unused punctuators to build 
up. See https://issues.apache.org/jira/browse/KAFKA-7699.

Given this, I think the options for a complete solution are:

1) Implement KAFKA-7699 and then do A or B

2) Persist the current punctuation schedule while migrating a task (presumably 
in the Subscription userdata) and then do C

Choosing the best course of action here is probably blocked on a decision on 
whether or not we want to anchor wallclock punctuations (KAFKA-7699). If we 
can't get consensus on that, we could always

3) Introduce a third type of punctuation, then do both 1 and 2 (for the new 
"anchored-wall-clock" type and the existing "wall-clock" type, respectively).

 

-*Another naive workaround for this issue is to turn on/upgrade to cooperative 
rebalancing, which will not suspend and resume all active tasks during a 
rebalance, and only suspend tasks that will be immediately closed and migrated 
to another instance or StreamThread. Of course, this will still cause the 
punctuation to be reset for tasks that _are_ actually closed/migrated, so 
practically speaking it's identical to option C alone.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-9082) Move ConfigCommand to use KafkaAdminClient APIs

2020-01-22 Thread Brian Byrne (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Byrne resolved KAFKA-9082.

Resolution: Duplicate

The outstanding work to be completed is now identical to KAFKA-7740. Marking as 
duplicate.

> Move ConfigCommand to use KafkaAdminClient APIs
> ---
>
> Key: KAFKA-9082
> URL: https://issues.apache.org/jira/browse/KAFKA-9082
> Project: Kafka
>  Issue Type: Sub-task
>  Components: admin
>Reporter: Brian Byrne
>Assignee: Brian Byrne
>Priority: Critical
> Fix For: 2.5.0
>
>
> The ConfigCommand currently only supports a subset of commands when 
> interacting with the KafkaAdminClient (as opposed to ZooKeeper directly). It 
> needs to be brought up to parity for KIP-500 work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9082) Move ConfigCommand to use KafkaAdminClient APIs

2020-01-22 Thread Brian Byrne (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Byrne updated KAFKA-9082:
---
Fix Version/s: 2.5.0

> Move ConfigCommand to use KafkaAdminClient APIs
> ---
>
> Key: KAFKA-9082
> URL: https://issues.apache.org/jira/browse/KAFKA-9082
> Project: Kafka
>  Issue Type: Sub-task
>  Components: admin
>Reporter: Brian Byrne
>Assignee: Brian Byrne
>Priority: Critical
> Fix For: 2.5.0
>
>
> The ConfigCommand currently only supports a subset of commands when 
> interacting with the KafkaAdminClient (as opposed to ZooKeeper directly). It 
> needs to be brought up to parity for KIP-500 work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9466) Add documentation for new stream EOS change

2020-01-22 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-9466:
---
Description: We shall fill in more details when we actually reach this 
stage.

> Add documentation for new stream EOS change
> ---
>
> Key: KAFKA-9466
> URL: https://issues.apache.org/jira/browse/KAFKA-9466
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Matthias J. Sax
>Priority: Major
>
> We shall fill in more details when we actually reach this stage.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9466) Add documentation for new stream EOS change

2020-01-22 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-9466:
--

 Summary: Add documentation for new stream EOS change
 Key: KAFKA-9466
 URL: https://issues.apache.org/jira/browse/KAFKA-9466
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Matthias J. Sax






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-7740) Kafka Admin Client should be able to manage user/client configurations for users and clients

2020-01-22 Thread Brian Byrne (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Byrne updated KAFKA-7740:
---
Fix Version/s: 2.5.0

> Kafka Admin Client should be able to manage user/client configurations for 
> users and clients
> 
>
> Key: KAFKA-7740
> URL: https://issues.apache.org/jira/browse/KAFKA-7740
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 1.1.0, 1.1.1, 2.0.0, 2.1.0
> Environment: linux
>Reporter: Yaodong Yang
>Assignee: Brian Byrne
>Priority: Major
>  Labels: features
> Fix For: 2.5.0
>
>
> Right now, Kafka Admin Client only allows users to change the configuration 
> of brokers and topics. There are some use cases where users want to set or 
> update quota configurations for users and clients through Kafka Admin Client. 
> Without this new capability, users have to manually talk to ZooKeeper for 
> this, which poses other challenges for customers.
> Considering we already have the framework for the more complex broker and 
> topic configuration changes, it seems straightforward to add support for 
> alterConfig and describeConfig for users and clients as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-7740) Kafka Admin Client should be able to manage user/client configurations for users and clients

2020-01-22 Thread Brian Byrne (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021612#comment-17021612
 ] 

Brian Byrne commented on KAFKA-7740:


This will be resolved as a function of KIP-546, which is planned for the 2.5.0 
release. Adjusted the ticket to reflect this.

> Kafka Admin Client should be able to manage user/client configurations for 
> users and clients
> 
>
> Key: KAFKA-7740
> URL: https://issues.apache.org/jira/browse/KAFKA-7740
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 1.1.0, 1.1.1, 2.0.0, 2.1.0
> Environment: linux
>Reporter: Yaodong Yang
>Assignee: Brian Byrne
>Priority: Major
>  Labels: features
> Fix For: 2.5.0
>
>
> Right now, Kafka Admin Client only allows users to change the configuration 
> of brokers and topics. There are some use cases where users want to set or 
> update quota configurations for users and clients through Kafka Admin Client. 
> Without this new capability, users have to manually talk to ZooKeeper for 
> this, which poses other challenges for customers.
> Considering we already have the framework for the more complex broker and 
> topic configuration changes, it seems straightforward to add support for 
> alterConfig and describeConfig for users and clients as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9465) Enclose consumer call with catching InvalidOffsetException

2020-01-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021585#comment-17021585
 ] 

ASF GitHub Bot commented on KAFKA-9465:
---

tedyu commented on pull request #8001: KAFKA-9465: Enclose consumer call with 
catching InvalidOffsetException
URL: https://github.com/apache/kafka/pull/8001
 
 
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enclose consumer call with catching InvalidOffsetException
> --
>
> Key: KAFKA-9465
> URL: https://issues.apache.org/jira/browse/KAFKA-9465
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Minor
>
> In maybeUpdateStandbyTasks, the try block encloses restoreConsumer.poll and 
> record handling.
> Since InvalidOffsetException is thrown only by restoreConsumer.poll, we should 
> enclose only this call in the try block.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9465) Enclose consumer call with catching InvalidOffsetException

2020-01-22 Thread Ted Yu (Jira)
Ted Yu created KAFKA-9465:
-

 Summary: Enclose consumer call with catching InvalidOffsetException
 Key: KAFKA-9465
 URL: https://issues.apache.org/jira/browse/KAFKA-9465
 Project: Kafka
  Issue Type: Improvement
Reporter: Ted Yu


In maybeUpdateStandbyTasks, the try block encloses restoreConsumer.poll and 
record handling.
Since InvalidOffsetException is thrown only by restoreConsumer.poll, we should 
enclose only this call in the try block.
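
A minimal sketch of that narrowing, assuming the names from the ticket (the 
real method also resets the affected partitions after the exception, which is 
omitted here):
{code:java}
import java.time.Duration;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.InvalidOffsetException;

final class StandbyPollSketch {
  // Keep only the call that can throw InvalidOffsetException inside the try;
  // record handling then happens outside the narrowed block.
  static ConsumerRecords<byte[], byte[]> pollStandby(final Consumer<byte[], byte[]> restoreConsumer) {
    try {
      return restoreConsumer.poll(Duration.ZERO);
    } catch (final InvalidOffsetException recoverable) {
      // the restore consumer's position is no longer valid; caller re-seeks
      return ConsumerRecords.empty();
    }
  }
}
{code}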



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9417) Integration test for new EOS model with vanilla Producer and Consumer

2020-01-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021558#comment-17021558
 ] 

ASF GitHub Bot commented on KAFKA-9417:
---

abbccdda commented on pull request #8000: (WIP) KAFKA-9417: New Integration 
Test for KIP-447
URL: https://github.com/apache/kafka/pull/8000
 
 
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Integration test for new EOS model with vanilla Producer and Consumer
> -
>
> Key: KAFKA-9417
> URL: https://issues.apache.org/jira/browse/KAFKA-9417
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> We would like to extend the `TransactionMessageCopier` to use the new 
> subscription-mode consumer and do a system test based on that, in order to 
> verify that the new semantics actually work.
> We also want to make sure backward compatibility is maintained by using the 
> group metadata API in existing tests as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9417) Integration test for new EOS model with vanilla Producer and Consumer

2020-01-22 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-9417:
---
Description: 
We would like to extend the `TransactionMessageCopier` to use the new 
subscription-mode consumer and do a system test based on that, in order to 
verify that the new semantics actually work.

We also want to make sure backward compatibility is maintained by using the 
group metadata API in existing tests as well.

  was:We would like to extend the `TransactionMessageCopier` to use the new 
subscription-mode consumer and do a system test based on that, in order to 
verify that the new semantics actually work.


> Integration test for new EOS model with vanilla Producer and Consumer
> -
>
> Key: KAFKA-9417
> URL: https://issues.apache.org/jira/browse/KAFKA-9417
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> We would like to extend the `TransactionMessageCopier` to use the new 
> subscription-mode consumer and do a system test based on that, in order to 
> verify that the new semantics actually work.
> We also want to make sure backward compatibility is maintained by using the 
> group metadata API in existing tests as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9417) Integration test for new EOS model with vanilla Producer and Consumer

2020-01-22 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-9417:
---
Summary: Integration test for new EOS model with vanilla Producer and 
Consumer  (was: Add extension to the TransactionsTest.transactions_test for new 
EOS model)

> Integration test for new EOS model with vanilla Producer and Consumer
> -
>
> Key: KAFKA-9417
> URL: https://issues.apache.org/jira/browse/KAFKA-9417
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> We would like to extend the `TransactionMessageCopier` to use the new 
> subscription-mode consumer and do a system test based on that, in order to 
> verify that the new semantics actually work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9464) Close the producer in completeShutdown

2020-01-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021548#comment-17021548
 ] 

ASF GitHub Bot commented on KAFKA-9464:
---

tedyu commented on pull request #7999: KAFKA-9464: Close the producer in 
completeShutdown
URL: https://github.com/apache/kafka/pull/7999
 
 
   In StreamThread#completeShutdown, the producer (if not null) should be 
closed.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Close the producer in completeShutdown
> --
>
> Key: KAFKA-9464
> URL: https://issues.apache.org/jira/browse/KAFKA-9464
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> In StreamThread#completeShutdown, the producer (if not null) should be closed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-9418) Add new sendOffsets API to include consumer group metadata

2020-01-22 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-9418.

Fix Version/s: 2.5.0
   Resolution: Fixed

> Add new sendOffsets API to include consumer group metadata
> --
>
> Key: KAFKA-9418
> URL: https://issues.apache.org/jira/browse/KAFKA-9418
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.5.0
>
>
> Add the consumer group metadata as part of the producer 
> sendOffsetsToTransaction API to enable proper fencing under KIP-447.
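
For reference, a hedged sketch of the resulting call pattern per the PR 
referenced below (the variable setup is assumed):
{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.common.TopicPartition;

final class GroupMetadataCommitSketch {
  // Passing the consumer's group metadata (rather than just the group id)
  // lets the broker fence zombie producers under KIP-447.
  static void commit(final Producer<byte[], byte[]> producer,
                     final Consumer<byte[], byte[]> consumer,
                     final TopicPartition partition, final long lastOffset) {
    final Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
    offsets.put(partition, new OffsetAndMetadata(lastOffset + 1));
    producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
  }
}
{code}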



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9418) Add new sendOffsets API to include consumer group metadata

2020-01-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021546#comment-17021546
 ] 

ASF GitHub Bot commented on KAFKA-9418:
---

hachikuji commented on pull request #7952: KAFKA-9418: Add new 
sendOffsetsToTransaction API to KafkaProducer
URL: https://github.com/apache/kafka/pull/7952
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add new sendOffsets API to include consumer group metadata
> --
>
> Key: KAFKA-9418
> URL: https://issues.apache.org/jira/browse/KAFKA-9418
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> Add the consumer group metadata as part of the producer 
> sendOffsetsToTransaction API to enable proper fencing under KIP-447.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9464) Close the producer in completeShutdown

2020-01-22 Thread Ted Yu (Jira)
Ted Yu created KAFKA-9464:
-

 Summary: Close the producer in completeShutdown
 Key: KAFKA-9464
 URL: https://issues.apache.org/jira/browse/KAFKA-9464
 Project: Kafka
  Issue Type: Bug
Reporter: Ted Yu


In StreamThread#completeShutdown, the producer (if not null) should be closed.
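
A minimal sketch of the proposed change, assuming the field layout of 
StreamThread (the rest of the shutdown sequence is elided):
{code:java}
import org.apache.kafka.clients.producer.Producer;

final class CompleteShutdownSketch {
  private Producer<byte[], byte[]> producer; // may be null depending on configuration

  void completeShutdown() {
    // ... close consumers, tasks, and state as today ...
    if (producer != null) {
      producer.close(); // the proposed addition: release the producer's resources
    }
  }
}
{code}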



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8377) KTable#transformValue might lead to incorrect result in joins

2020-01-22 Thread Andy Coates (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021338#comment-17021338
 ] 

Andy Coates commented on KAFKA-8377:


This also causes https://github.com/confluentinc/ksql/issues/4131

> KTable#transformValue might lead to incorrect result in joins
> -
>
> Key: KAFKA-8377
> URL: https://issues.apache.org/jira/browse/KAFKA-8377
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0
>Reporter: Matthias J. Sax
>Assignee: Aishwarya Pradeep Kumar
>Priority: Major
>  Labels: newbie++
>
> Kafka Streams uses an optimization to not materialize every result KTable. If 
> a non-materialized KTable is input to a join, the lookup into the table 
> results in a lookup of the parents table plus a call to the operator. For 
> example,
> {code:java}
> KTable nonMaterialized = materializedTable.filter(...);
> KTable table2 = ...
> table2.join(nonMaterialized, ...);{code}
> If there is a table2 input record, the lookup to the other side is performed 
> as a lookup into materializedTable plus applying the filter().
> For a stateless operation like filter, this is safe. However, 
> #transformValues() might have an attached state store. Hence, when an input 
> record r is processed by #transformValues() with current state S, it might 
> produce an output record r' (that is not materialized). When the join later 
> does a lookup to get r from the parent table, there is no guarantee that 
> #transformValues() again produces r' because its state might not be the same 
> any longer. A similar issue applies to a stateless #transformValues() that 
> accesses the `ProcessorContext` – when the `ProcessorContext` is accessed a 
> second time (when processing the data from the upstream lookup, to recompute 
> the store content), the `ProcessorContext` would return different data (i.e., 
> now the data of the currently processed record).
> Hence, it seems to be required to always materialize the result of a 
> KTable#transformValues() operation if there is state or if `ProcessorContext` 
> is used – one issue is that we don't know upfront if `ProcessorContext` is 
> used and thus might be conservative and always materialize the result (maybe 
> this can be somewhat optimized for operations like `filter`, though). Note 
> that if there were a consecutive filter() after transformValues(), it would 
> also be OK to materialize the filter() result. Furthermore, if there is no 
> downstream join(), materialization is also not required.
> Basically, it seems to be unsafe to apply `KTableValueGetter` on a 
> #transformValues()` operator.
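
Until that is decided, one hedged user-side workaround is to materialize the 
transformValues() result explicitly, so the join's value getter reads a store 
instead of re-invoking the transformer; a sketch (store name and types are 
illustrative):
{code:java}
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.ValueTransformerWithKeySupplier;
import org.apache.kafka.streams.state.KeyValueStore;

final class TransformValuesWorkaround {
  static KTable<String, String> materialize(
      final KTable<String, String> table,
      final ValueTransformerWithKeySupplier<String, String, String> supplier) {
    // passing a Materialized forces a physical store, so the join's value
    // getter does not re-run the (possibly stateful) transformer
    return table.transformValues(supplier,
        Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("transformed-store"));
  }
}
{code}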



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9460) Enable TLSv1.2 by default and disable all others protocol versions

2020-01-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021304#comment-17021304
 ] 

ASF GitHub Bot commented on KAFKA-9460:
---

nizhikov commented on pull request #7998: KAFKA-9460: Enable TLSv1.2 by default 
and disable all others protocol versions
URL: https://github.com/apache/kafka/pull/7998
 
 
   This PR by default disables all SSL protocols except TLSv1.2.
   Changes are discussed in KIP-553.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable TLSv1.2 by default and disable all others protocol versions
> --
>
> Key: KAFKA-9460
> URL: https://issues.apache.org/jira/browse/KAFKA-9460
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Reporter: Nikolay Izhikov
>Assignee: Nikolay Izhikov
>Priority: Major
>  Labels: needs-kip
>
> In KAFKA-7251, support for TLS 1.3 was introduced.
> For now, only TLS 1.2 and TLS 1.3 are recommended for use; other versions of 
> TLS are considered obsolete:
> https://www.rfc-editor.org/info/rfc8446
> https://en.wikipedia.org/wiki/Transport_Layer_Security#History_and_development
> But testing of TLS 1.3 is incomplete for now.
> We should enable only current versions of the TLS protocol by default, to 
> provide users with only secure implementations.
> Users can enable obsolete versions of TLS via the configuration if they want 
> to. 
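
For context, a hedged example of what users can already do explicitly today 
with the existing public client configs (the KIP would only change the 
default):
{code:java}
import java.util.Properties;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.config.SslConfigs;

final class Tls12OnlyConfig {
  static Properties clientProps() {
    final Properties props = new Properties();
    props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");
    // restrict the handshake to TLSv1.2; obsolete versions stay disabled
    props.put(SslConfigs.SSL_ENABLED_PROTOCOLS_CONFIG, "TLSv1.2");
    return props;
  }
}
{code}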



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-7273) Converters should have access to headers.

2020-01-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021270#comment-17021270
 ] 

ASF GitHub Bot commented on KAFKA-7273:
---

rhauch commented on pull request #7489: KAFKA-7273 Clarification on mutability 
of headers passed to Converter…
URL: https://github.com/apache/kafka/pull/7489
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Converters should have access to headers.
> -
>
> Key: KAFKA-7273
> URL: https://issues.apache.org/jira/browse/KAFKA-7273
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Jeremy Custenborder
>Assignee: Jeremy Custenborder
>Priority: Major
> Fix For: 2.4.0
>
>
> I found myself wanting to build a converter that stored additional type 
> information within headers. The converter interface does not allow a 
> developer to access the headers in a Converter. I'm not suggesting that we 
> change the method for serializing them, rather that 
> *org.apache.kafka.connect.header.Headers* be passed in for *fromConnectData* 
> and *toConnectData*. For example, something like this:
> {code:java}
> import java.util.Map;
> import org.apache.kafka.connect.data.Schema;
> import org.apache.kafka.connect.data.SchemaAndValue;
> import org.apache.kafka.connect.header.Headers;
> public interface Converter {
>   default byte[] fromConnectData(String topic, Headers headers, Schema schema, Object value) {
>     return fromConnectData(topic, schema, value);
>   }
>   default SchemaAndValue toConnectData(String topic, Headers headers, byte[] payload) {
>     return toConnectData(topic, payload);
>   }
>   void configure(Map<String, ?> configs, boolean isKey);
>   byte[] fromConnectData(String topic, Schema schema, Object value);
>   SchemaAndValue toConnectData(String topic, byte[] payload);
> }
> {code}
> This would be a similar approach to what was already done with 
> ExtendedDeserializer and ExtendedSerializer in the Kafka client.
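
For illustration, a hedged sketch of the kind of converter the proposed 
interface would enable (the class name, header key, and string-only payload 
handling are all invented):
{code:java}
import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaAndValue;
import org.apache.kafka.connect.header.Headers;

// Written against the *proposed* interface above, not the shipping Connect API.
public class TypeTaggingConverter implements Converter {
  @Override
  public void configure(Map<String, ?> configs, boolean isKey) { }

  @Override
  public byte[] fromConnectData(String topic, Headers headers, Schema schema, Object value) {
    // the point of the proposal: the converter can now record type metadata
    headers.addString("connect.schema.type", schema.type().name());
    return fromConnectData(topic, schema, value);
  }

  @Override
  public byte[] fromConnectData(String topic, Schema schema, Object value) {
    return value == null ? null : value.toString().getBytes(StandardCharsets.UTF_8);
  }

  @Override
  public SchemaAndValue toConnectData(String topic, byte[] payload) {
    return new SchemaAndValue(Schema.OPTIONAL_STRING_SCHEMA,
        payload == null ? null : new String(payload, StandardCharsets.UTF_8));
  }
}
{code}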



--
This message was sent by Atlassian Jira
(v8.3.4#803005)