[jira] [Commented] (KAFKA-5694) Add ChangeReplicaDirRequest and DescribeReplicaDirRequest (KIP-113)
[ https://issues.apache.org/jira/browse/KAFKA-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113958#comment-16113958 ] ASF GitHub Bot commented on KAFKA-5694: --- GitHub user lindong28 opened a pull request: https://github.com/apache/kafka/pull/3621 KAFKA-5694; Add ChangeReplicaDirRequest and DescribeReplicaDirRequest (KIP-113) You can merge this pull request into a Git repository by running: $ git pull https://github.com/lindong28/kafka KAFKA-5694 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3621.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3621 commit 0c3262f53de8a99da0d1e2b6b7e817a4c570353b Author: Dong LinDate: 2017-08-02T19:10:07Z KAFKA-5694; Add ChangeReplicaDirRequest and DescribeReplicaDirRequest (KIP-113) > Add ChangeReplicaDirRequest and DescribeReplicaDirRequest (KIP-113) > --- > > Key: KAFKA-5694 > URL: https://issues.apache.org/jira/browse/KAFKA-5694 > Project: Kafka > Issue Type: New Feature >Reporter: Dong Lin >Assignee: Dong Lin > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (KAFKA-5665) Incorrect interruption invoking method used for Heartbeat thread
[ https://issues.apache.org/jira/browse/KAFKA-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huxihx resolved KAFKA-5665. --- Resolution: Not A Bug > Incorrect interruption invoking method used for Heartbeat thread > - > > Key: KAFKA-5665 > URL: https://issues.apache.org/jira/browse/KAFKA-5665 > Project: Kafka > Issue Type: Bug > Components: consumer >Affects Versions: 0.11.0.0 >Reporter: huxihx >Assignee: huxihx >Priority: Minor > > When interrupting the background heartbeat thread, `Thread.interrupted();` is > used. Actually, `Thread.currentThread().interrupt();` should be used to > restore the interruption status. An alternative fix is to remove > `Thread.interrupted();` since HeartbeatThread extends Thread and all code > higher up on the call stack is controlled, so we could safely swallow this > exception. Either way, `Thread.interrupted();` should not be used here. It's a > test method, not an action. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
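The interruption semantics at issue in KAFKA-5665 can be checked in isolation. Below is a minimal standalone Java sketch (illustrative names, not Kafka code): `Thread.interrupted()` is a static *test-and-clear* method, so calling it after catching an `InterruptedException` discards the interruption status, while `Thread.currentThread().interrupt()` restores it for code higher up the stack.

```java
public class InterruptDemo {
    // Thread.interrupted() tests AND clears the calling thread's interrupt flag;
    // Thread.currentThread().interrupt() sets (restores) it.
    static boolean statusAfterTestAndClear() {
        Thread.currentThread().interrupt();            // simulate a caught InterruptedException
        Thread.interrupted();                          // the call KAFKA-5665 flags: clears the status
        return Thread.currentThread().isInterrupted(); // false -> interruption status is lost
    }

    static boolean statusAfterRestore() {
        Thread.interrupted();                          // start from a clean flag
        Thread.currentThread().interrupt();            // the recommended restore call
        boolean status = Thread.currentThread().isInterrupted(); // true -> status preserved
        Thread.interrupted();                          // clean up so later code is unaffected
        return status;
    }

    public static void main(String[] args) {
        System.out.println(statusAfterTestAndClear()); // false
        System.out.println(statusAfterRestore());      // true
    }
}
```

This is why the resolution "Not A Bug" hinges on HeartbeatThread controlling its whole call stack: losing the flag is only safe when no caller above needs to observe it.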
[jira] [Commented] (KAFKA-5665) Incorrect interruption invoking method used for Heartbeat thread
[ https://issues.apache.org/jira/browse/KAFKA-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113866#comment-16113866 ] ASF GitHub Bot commented on KAFKA-5665: --- Github user huxihx closed the pull request at: https://github.com/apache/kafka/pull/3586 > Incorrect interruption invoking method used for Heartbeat thread > - > > Key: KAFKA-5665 > URL: https://issues.apache.org/jira/browse/KAFKA-5665 > Project: Kafka > Issue Type: Bug > Components: consumer >Affects Versions: 0.11.0.0 >Reporter: huxihx >Assignee: huxihx >Priority: Minor > > When interrupting the background heartbeat thread, `Thread.interrupted();` is > used. Actually, `Thread.currentThread().interrupt();` should be used to > restore the interruption status. An alternative way to solve is to remove > `Thread.interrupted();` since HeartbeatThread extends Thread and all code > higher up on the call stack is controlled, so we could safely swallow this > exception. Anyway, `Thread.interrupted();` should not be used here. It's a > test method not an action. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5700) producer missed header information when splitting batches
[ https://issues.apache.org/jira/browse/KAFKA-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113861#comment-16113861 ] ASF GitHub Bot commented on KAFKA-5700: --- GitHub user huxihx opened a pull request: https://github.com/apache/kafka/pull/3620 KAFKA-5700: Producer should not drop header information when splitting batches Producer should not drop header information when splitting batches. This PR also corrects a minor typo in Sender.java, where `spitting and retrying` should be `splitting and retrying`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/huxihx/kafka KAFKA-5700 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3620.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3620 commit 105ba9051f53d1872c3fd13efb4fad991183e651 Author: huxihxDate: 2017-08-04T03:21:26Z KAFKA-5700: Producer should not drop header information when splitting big batches. This PR also corrects a minor typo in Sender.java, where `spitting and retrying` should be `splitting and retrying`. > producer missed header information when splitting batches > - > > Key: KAFKA-5700 > URL: https://issues.apache.org/jira/browse/KAFKA-5700 > Project: Kafka > Issue Type: Bug > Components: producer >Affects Versions: 0.11.0.0 >Reporter: huxihx >Assignee: huxihx > > In `ProducerBatch.tryAppendForSplit`, invoking > `this.recordsBuilder.append(timestamp, key, value);` missed the header > information in the ProducerRecord. Should invoke this like : > `this.recordsBuilder.append(timestamp, key, value, headers);` -- This message was sent by Atlassian JIRA (v6.4.14#64029)
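The shape of the KAFKA-5700 bug is easy to see with hypothetical stand-ins for the record and builder classes (these are deliberately not the real `ProducerRecord`/`MemoryRecordsBuilder` APIs): when a batch is split and its records are re-appended via the overload that lacks a `headers` parameter, the headers silently vanish.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins used only to illustrate the bug shape.
class Rec {
    final String key, value;
    final Map<String, byte[]> headers;
    Rec(String key, String value, Map<String, byte[]> headers) {
        this.key = key; this.value = value; this.headers = headers;
    }
}

class Builder {
    final List<Rec> out = new ArrayList<>();
    // Overload without headers: header information is silently dropped.
    void append(String key, String value) { out.add(new Rec(key, value, Map.of())); }
    // Overload with headers: header information travels with the record.
    void append(String key, String value, Map<String, byte[]> headers) {
        out.add(new Rec(key, value, headers));
    }
}

public class SplitDemo {
    // Re-appending a batch's records into smaller batches must forward `headers`,
    // the argument the buggy call in tryAppendForSplit omitted.
    static Builder split(List<Rec> batch, boolean forwardHeaders) {
        Builder b = new Builder();
        for (Rec r : batch) {
            if (forwardHeaders) b.append(r.key, r.value, r.headers); // fixed call
            else b.append(r.key, r.value);                           // buggy call
        }
        return b;
    }
}
```

With `forwardHeaders` false, every record in the rebuilt batch comes out with an empty header map, which is the observable symptom the ticket describes.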
[jira] [Commented] (KAFKA-5678) When the broker graceful shutdown occurs, the producer side sends timeout.
[ https://issues.apache.org/jira/browse/KAFKA-5678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113845#comment-16113845 ] cuiyang commented on KAFKA-5678: [~becket_qin] I have totally gotten your point, thanks again :) > When the broker graceful shutdown occurs, the producer side sends timeout. > -- > > Key: KAFKA-5678 > URL: https://issues.apache.org/jira/browse/KAFKA-5678 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.9.0.0, 0.10.0.0, 0.11.0.0 >Reporter: tuyang > > Test environment as follows. > 1.Kafka version:0.9.0.1 > 2.Cluster with 3 broker which with broker id A,B,C > 3.Topic with 6 partitions with 2 replicas,with 2 leader partitions at each > broker. > We can reproduce the problem as follows. > 1.we send message as quickly as possible with ack -1. > 2.if partition p0's leader is on broker A and we graceful shutdown broker > A,but we send a message to p0 before the leader is reelect, so the message > can be appended to the leader replica successful, but if the follower replica > not catch it as quickly as possible, so the shutting down broker will create > a delayProduce for this request to wait complete until request.timeout.ms . > 3.because of the controllerShutdown request from broker A, then the p0 > partition leader will reelect > , then the replica on broker A will become follower before complete shut > down.then the delayProduce will not be trigger to complete until expire. > 4.if broker A shutdown cost too long, then the producer will get response > after request.timeout.ms, which results in increase the producer send latency > when we are restarting broker one by one. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5358) Consumer perf tool should count rebalance time separately
[ https://issues.apache.org/jira/browse/KAFKA-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113750#comment-16113750 ] Jason Gustafson commented on KAFKA-5358: [~huxi_2b] It seems like the discussion thread on the dev list has died down. So perhaps it is time to start a vote (also on the dev list)? > Consumer perf tool should count rebalance time separately > - > > Key: KAFKA-5358 > URL: https://issues.apache.org/jira/browse/KAFKA-5358 > Project: Kafka > Issue Type: Improvement >Reporter: Jason Gustafson >Assignee: huxihx > > It would be helpful to measure rebalance time separately in the performance > tool so that throughput between different versions can be compared more > easily in spite of improvements such as > https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance. > At the moment, running the perf tool on 0.11.0 or trunk for a short amount > of time will present a severely skewed picture since the overall time will be > dominated by the join group delay. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
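The separation being proposed amounts to bookkeeping around three timestamps. A sketch (illustrative only, not the actual consumer perf tool code): record when the tool starts, when the first records arrive (the window dominated by the join-group rebalance delay), and when the run ends.

```java
public class PerfTiming {
    // Illustrative split of a perf run into rebalance (join) time and
    // consumption time, so short runs are not skewed by the join delay.
    static long[] split(long startMs, long firstRecordsMs, long endMs) {
        long rebalanceMs = firstRecordsMs - startMs; // dominated by the join-group delay
        long consumeMs = endMs - firstRecordsMs;     // steady-state fetch window
        return new long[] { rebalanceMs, consumeMs };
    }
}
```

Reporting throughput against `consumeMs` rather than the whole run removes the skew described above.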
[jira] [Assigned] (KAFKA-5698) Sort processor node based on name suffix in TopologyDescription.toString()
[ https://issues.apache.org/jira/browse/KAFKA-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang reassigned KAFKA-5698: Assignee: Guozhang Wang > Sort processor node based on name suffix in TopologyDescription.toString() > -- > > Key: KAFKA-5698 > URL: https://issues.apache.org/jira/browse/KAFKA-5698 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Guozhang Wang >Assignee: Guozhang Wang > Labels: newbie++ > > Today when we print the topology via the {{Topology#describe()#toString}}, > the processor nodes are not sorted inside the > {{TopologyDescription.toString()}} function. For example, for the word count > demo topology we get: > {code} > Sub-topologies: > Sub-topology: 0 > Processor: KSTREAM-FILTER-05(stores: []) --> > KSTREAM-SINK-04 <-- KSTREAM-MAP-02 > Source: KSTREAM-SOURCE-00(topics: streams-wordcount-input) --> > KSTREAM-FLATMAPVALUES-01 > Processor: KSTREAM-FLATMAPVALUES-01(stores: []) --> > KSTREAM-MAP-02 <-- KSTREAM-SOURCE-00 > Processor: KSTREAM-MAP-02(stores: []) --> > KSTREAM-FILTER-05 <-- KSTREAM-FLATMAPVALUES-01 > Sink: KSTREAM-SINK-04(topic: Counts-repartition) <-- > KSTREAM-FILTER-05 > Sub-topology: 1 > Source: KSTREAM-SOURCE-06(topics: Counts-repartition) --> > KSTREAM-AGGREGATE-03 > Sink: KSTREAM-SINK-08(topic: streams-wordcount-output) <-- > KTABLE-TOSTREAM-07 > Processor: KTABLE-TOSTREAM-07(stores: []) --> > KSTREAM-SINK-08 <-- KSTREAM-AGGREGATE-03 > Processor: KSTREAM-AGGREGATE-03(stores: [Counts]) --> > KTABLE-TOSTREAM-07 <-- KSTREAM-SOURCE-06 > {code} > While ideally we want: > {code} > Sub-topologies: > Sub-topology: 0 > Source: KSTREAM-SOURCE-00(topics: streams-wordcount-input) --> > KSTREAM-FLATMAPVALUES-01 > Processor: KSTREAM-FLATMAPVALUES-01(stores: []) --> > KSTREAM-MAP-02 <-- KSTREAM-SOURCE-00 > Processor: KSTREAM-MAP-02(stores: []) --> > KSTREAM-FILTER-05 <-- KSTREAM-FLATMAPVALUES-01 > Processor: KSTREAM-FILTER-05(stores: []) --> > KSTREAM-SINK-04 <-- KSTREAM-MAP-02 > Sink: 
KSTREAM-SINK-04(topic: Counts-repartition) <-- > KSTREAM-FILTER-05 > Sub-topology: 1 > Source: KSTREAM-SOURCE-06(topics: Counts-repartition) --> > KSTREAM-AGGREGATE-03 > Processor: KSTREAM-AGGREGATE-03(stores: [Counts]) --> > KTABLE-TOSTREAM-07 <-- KSTREAM-SOURCE-06 > Processor: KTABLE-TOSTREAM-07(stores: []) --> > KSTREAM-SINK-08 <-- KSTREAM-AGGREGATE-03 > Sink: KSTREAM-SINK-08(topic: streams-wordcount-output) <-- > KTABLE-TOSTREAM-07 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
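The ordering the ticket title asks for, sorting nodes by the numeric suffix of their generated names, can be sketched as follows (illustrative code, not the Streams implementation; the linked PR ultimately sorted by sub-tree size instead):

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class NodeSort {
    // Orders node names like "KSTREAM-FILTER-05" by their numeric name suffix,
    // which yields the source-to-sink order shown in the desired output above.
    static List<String> sortBySuffix(List<String> names) {
        return names.stream()
                .sorted(Comparator.comparingInt(NodeSort::suffix))
                .collect(Collectors.toList());
    }

    static int suffix(String name) {
        return Integer.parseInt(name.substring(name.lastIndexOf('-') + 1));
    }
}
```

Name suffixes work here because the topology builder assigns them in construction order, so lower suffixes sit upstream of higher ones within a sub-topology.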
[jira] [Commented] (KAFKA-5698) Sort processor node based on name suffix in TopologyDescription.toString()
[ https://issues.apache.org/jira/browse/KAFKA-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113746#comment-16113746 ] ASF GitHub Bot commented on KAFKA-5698: --- GitHub user guozhangwang opened a pull request: https://github.com/apache/kafka/pull/3618 KAFKA-5698: Sort processor nodes based on its sub-tree size 1. Sort processor nodes within a sub-topology by its sub-tree size: nodes with largest sizes are source nodes and hence printed earlier. 2. Minor: start newlines for predecessor and successor. 3. Minor: space between processor nodes and stores / topics; maintain `[]` for the topic names. You can merge this pull request into a Git repository by running: $ git pull https://github.com/guozhangwang/kafka K5698-topology-description-sorting Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3618.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3618 commit bce7ea7d448fe15bebc45e4a12de5831c3f727db Author: Guozhang WangDate: 2017-08-04T00:43:30Z sort processor nodes based on its sub-tree size > Sort processor node based on name suffix in TopologyDescription.toString() > -- > > Key: KAFKA-5698 > URL: https://issues.apache.org/jira/browse/KAFKA-5698 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Guozhang Wang > Labels: newbie++ > > Today when we print the topology via the {{Topology#describe()#toString}}, > the processor nodes are not sorted inside the > {{TopologyDescription.toString()}} function. 
For example, for the word count > demo topology we get: > {code} > Sub-topologies: > Sub-topology: 0 > Processor: KSTREAM-FILTER-05(stores: []) --> > KSTREAM-SINK-04 <-- KSTREAM-MAP-02 > Source: KSTREAM-SOURCE-00(topics: streams-wordcount-input) --> > KSTREAM-FLATMAPVALUES-01 > Processor: KSTREAM-FLATMAPVALUES-01(stores: []) --> > KSTREAM-MAP-02 <-- KSTREAM-SOURCE-00 > Processor: KSTREAM-MAP-02(stores: []) --> > KSTREAM-FILTER-05 <-- KSTREAM-FLATMAPVALUES-01 > Sink: KSTREAM-SINK-04(topic: Counts-repartition) <-- > KSTREAM-FILTER-05 > Sub-topology: 1 > Source: KSTREAM-SOURCE-06(topics: Counts-repartition) --> > KSTREAM-AGGREGATE-03 > Sink: KSTREAM-SINK-08(topic: streams-wordcount-output) <-- > KTABLE-TOSTREAM-07 > Processor: KTABLE-TOSTREAM-07(stores: []) --> > KSTREAM-SINK-08 <-- KSTREAM-AGGREGATE-03 > Processor: KSTREAM-AGGREGATE-03(stores: [Counts]) --> > KTABLE-TOSTREAM-07 <-- KSTREAM-SOURCE-06 > {code} > While ideally we want: > {code} > Sub-topologies: > Sub-topology: 0 > Source: KSTREAM-SOURCE-00(topics: streams-wordcount-input) --> > KSTREAM-FLATMAPVALUES-01 > Processor: KSTREAM-FLATMAPVALUES-01(stores: []) --> > KSTREAM-MAP-02 <-- KSTREAM-SOURCE-00 > Processor: KSTREAM-MAP-02(stores: []) --> > KSTREAM-FILTER-05 <-- KSTREAM-FLATMAPVALUES-01 > Processor: KSTREAM-FILTER-05(stores: []) --> > KSTREAM-SINK-04 <-- KSTREAM-MAP-02 > Sink: KSTREAM-SINK-04(topic: Counts-repartition) <-- > KSTREAM-FILTER-05 > Sub-topology: 1 > Source: KSTREAM-SOURCE-06(topics: Counts-repartition) --> > KSTREAM-AGGREGATE-03 > Processor: KSTREAM-AGGREGATE-03(stores: [Counts]) --> > KTABLE-TOSTREAM-07 <-- KSTREAM-SOURCE-06 > Processor: KTABLE-TOSTREAM-07(stores: []) --> > KSTREAM-SINK-08 <-- KSTREAM-AGGREGATE-03 > Sink: KSTREAM-SINK-08(topic: streams-wordcount-output) <-- > KTABLE-TOSTREAM-07 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5467) setting offset retention minutes to a lower value is not reflecting
[ https://issues.apache.org/jira/browse/KAFKA-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113745#comment-16113745 ] Jason Gustafson commented on KAFKA-5467: [~divyapaulraj92] Can you clarify the steps that you took to produce this problem? How were you able to determine that the offsets were not being removed? > setting offset retention minutes to a lower value is not reflecting > --- > > Key: KAFKA-5467 > URL: https://issues.apache.org/jira/browse/KAFKA-5467 > Project: Kafka > Issue Type: Bug > Components: offset manager >Affects Versions: 0.10.1.1 >Reporter: Divya > > We have been observing offsets to be unknown and saw that our offset > retention time was lesser than the log retention period. Inorder to recreate > the same in test environment, we set the offset.retention.minutes to 1 minute > and the log retention time to 168 hours. There were no events written for > more than an hour but still the offsets were not turning to unknown. (The > offset clean interval was 10 minutes.) I would like to know the reason on why > the offset did not turn to unknown in an hour. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5407) Mirrormaker dont start after upgrade
[ https://issues.apache.org/jira/browse/KAFKA-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113733#comment-16113733 ] Jason Gustafson commented on KAFKA-5407: [~fvegaucr] Can you attach the broker logs from that time period? I am hoping to find an uncaught exception at that time on the broker. > Mirrormaker dont start after upgrade > > > Key: KAFKA-5407 > URL: https://issues.apache.org/jira/browse/KAFKA-5407 > Project: Kafka > Issue Type: Bug > Components: consumer >Affects Versions: 0.10.2.1 > Environment: Operating system > CentOS 6.8 > HW > Board Mfg : HP > Board Product : ProLiant DL380p Gen8 > CPU's x2 > Product Manufacturer : Intel > Product Name : Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz > Memory Type : DDR3 SDRAM > SDRAM Capacity: 2048 MB > Total Memory: : 64GB > Hardrives size and layout: > 9 drives using jbod > drive size 3.6TB each >Reporter: Fernando Vega >Priority: Critical > > Currently Im upgrading the cluster from 0.8.2-beta to 0.10.2.1 > So I followed the rolling procedure: > Here the config files: > Consumer > {noformat} > # > # Cluster: repl > # Topic list(goes into command line): > REPL-ams1-global,REPL-atl1-global,REPL-sjc2-global,REPL-ams1-global-PN_HXIDMAP_.*,REPL-atl1-global-PN_HXIDMAP_.*,REPL-sjc2-global-PN_HXIDMAP_.*,REPL-ams1-global-PN_HXCONTEXTUALV2_.*,REPL-atl1-global-PN_HXCONTEXTUALV2_.*,REPL-sjc2-global-PN_HXCONTEXTUALV2_.* > bootstrap.servers=app001:9092,app002:9092,app003:9092,app004:9092 > group.id=hkg1_cluster > auto.commit.interval.ms=6 > partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor > {noformat} > Producer > {noformat} > hkg1 > # # Producer > # # hkg1 > bootstrap.servers=app001:9092,app002:9092,app003:9092,app004:9092 > compression.type=gzip > acks=0 > {noformat} > Broker > {noformat} > auto.leader.rebalance.enable=true > delete.topic.enable=true > socket.receive.buffer.bytes=1048576 > socket.send.buffer.bytes=1048576 > default.replication.factor=2 > 
auto.create.topics.enable=true > num.partitions=1 > num.network.threads=8 > num.io.threads=40 > log.retention.hours=1 > log.roll.hours=1 > num.replica.fetchers=8 > zookeeper.connection.timeout.ms=3 > zookeeper.session.timeout.ms=3 > inter.broker.protocol.version=0.10.2 > log.message.format.version=0.8.2 > {noformat} > I tried also using stock configuraiton with no luck. > The error that I get is this: > {noformat} > 2017-06-07 12:24:45,476] INFO ConsumerConfig values: > auto.commit.interval.ms = 6 > auto.offset.reset = latest > bootstrap.servers = [app454.sjc2.mytest.com:9092, > app455.sjc2.mytest.com:9092, app456.sjc2.mytest.com:9092, > app457.sjc2.mytest.com:9092, app458.sjc2.mytest.com:9092, > app459.sjc2.mytest.com:9092] > check.crcs = true > client.id = MirrorMaker_hkg1-1 > connections.max.idle.ms = 54 > enable.auto.commit = false > exclude.internal.topics = true > fetch.max.bytes = 52428800 > fetch.max.wait.ms = 500 > fetch.min.bytes = 1 > group.id = MirrorMaker_hkg1 > heartbeat.interval.ms = 3000 > interceptor.classes = null > key.deserializer = class > org.apache.kafka.common.serialization.ByteArrayDeserializer > max.partition.fetch.bytes = 1048576 > max.poll.interval.ms = 30 > max.poll.records = 500 > metadata.max.age.ms = 30 > metric.reporters = [] > metrics.num.samples = 2 > metrics.recording.level = INFO > metrics.sample.window.ms = 3 > partition.assignment.strategy = > [org.apache.kafka.clients.consumer.RoundRobinAssignor] > receive.buffer.bytes = 65536 > reconnect.backoff.ms = 50 > request.timeout.ms = 305000 > retry.backoff.ms = 100 > sasl.jaas.config = null > sasl.kerberos.kinit.cmd = /usr/bin/kinit > sasl.kerberos.min.time.before.relogin = 6 > sasl.kerberos.service.name = null > sasl.kerberos.ticket.renew.jitter = 0.05 > sasl.kerberos.ticket.renew.window.factor = 0.8 > sasl.mechanism = GSSAPI > security.protocol = PLAINTEXT > send.buffer.bytes = 131072 > session.timeout.ms = 1 > ssl.cipher.suites = null > ssl.enabled.protocols = [TLSv1.2, TLSv1.1, 
TLSv1] > ssl.endpoint.identification.algorithm = null > ssl.key.password = null > ssl.keymanager.algorithm = SunX509 > ssl.keystore.location = null > ssl.keystore.password = null > ssl.keystore.type = JKS > ssl.protocol = TLS > ssl.provider = null > ssl.secure.random.implementation = null > ssl.trustmanager.algorithm = PKIX > ssl.truststore.location = null > ssl.truststore.password = null >
[jira] [Created] (KAFKA-5699) Validate and Create connector endpoint should take the same format message body
Robin Moffatt created KAFKA-5699: Summary: Validate and Create connector endpoint should take the same format message body Key: KAFKA-5699 URL: https://issues.apache.org/jira/browse/KAFKA-5699 Project: Kafka Issue Type: Improvement Components: KafkaConnect Reporter: Robin Moffatt Priority: Minor It's a fairly ugly UX to want to 'do the right thing' and validate a connector, but to have to do so with a different message body than that used for a POST to /connectors. Can the format be standardised across the calls (and for a PUT to //config too)? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-2507) Replace ControlledShutdown{Request,Response} with org.apache.kafka.common.requests equivalent
[ https://issues.apache.org/jira/browse/KAFKA-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113677#comment-16113677 ] Ismael Juma commented on KAFKA-2507: [~hachikuji] adapted the Java request classes so that they support the non-standard ControlledShutdown V0 request header. So, the Scala classes were removed, and direct upgrades from 0.8.x are still supported. > Replace ControlledShutdown{Request,Response} with > org.apache.kafka.common.requests equivalent > - > > Key: KAFKA-2507 > URL: https://issues.apache.org/jira/browse/KAFKA-2507 > Project: Kafka > Issue Type: Sub-task >Reporter: Ismael Juma >Assignee: Jason Gustafson > Fix For: 1.0.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (KAFKA-2507) Replace ControlledShutdown{Request,Response} with org.apache.kafka.common.requests equivalent
[ https://issues.apache.org/jira/browse/KAFKA-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma resolved KAFKA-2507. Resolution: Fixed Assignee: Jason Gustafson (was: Grant Henke) Fix Version/s: (was: 2.0.0) 1.0.0 > Replace ControlledShutdown{Request,Response} with > org.apache.kafka.common.requests equivalent > - > > Key: KAFKA-2507 > URL: https://issues.apache.org/jira/browse/KAFKA-2507 > Project: Kafka > Issue Type: Sub-task >Reporter: Ismael Juma >Assignee: Jason Gustafson > Fix For: 1.0.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (KAFKA-2959) Remove temporary mapping to deserialize functions in RequestChannel
[ https://issues.apache.org/jira/browse/KAFKA-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma resolved KAFKA-2959. Resolution: Fixed Assignee: Jason Gustafson Fix Version/s: 1.0.0 > Remove temporary mapping to deserialize functions in RequestChannel > > > Key: KAFKA-2959 > URL: https://issues.apache.org/jira/browse/KAFKA-2959 > Project: Kafka > Issue Type: Sub-task >Reporter: Grant Henke >Assignee: Jason Gustafson > Fix For: 1.0.0 > > > Once the old Request & Response objects are no longer used we can delete the > legacy mapping maintained in RequestChannel.scala -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-1120) Controller could miss a broker state change
[ https://issues.apache.org/jira/browse/KAFKA-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113645#comment-16113645 ] Onur Karaman commented on KAFKA-1120: - Alright I might know what's happening. Here's the red flag: {code} > grep -r "Newly added brokers" . ./kafka_2.11-0.11.0.0/logs/controller.log:[2017-08-03 13:40:09,121] INFO [Controller 1]: Newly added brokers: 1, deleted brokers: , all live brokers: 1 (kafka.controller.KafkaController) ./kafka_2.11-0.11.0.0/logs/controller.log:[2017-08-03 13:40:27,172] INFO [Controller 1]: Newly added brokers: 2, deleted brokers: , all live brokers: 1,2 (kafka.controller.KafkaController) ./kafka_2.11-0.11.0.0/logs/controller.log:[2017-08-03 13:47:15,215] INFO [Controller 1]: Newly added brokers: , deleted brokers: , all live brokers: 1,2 (kafka.controller.KafkaController) ./kafka_2.11-0.11.0.0/logs/controller.log:[2017-08-03 13:47:17,927] INFO [Controller 1]: Newly added brokers: , deleted brokers: , all live brokers: 1,2 (kafka.controller.KafkaController) {code} Here's the relevant code in BrokerChange.process: {code} val curBrokers = zkUtils.getAllBrokersInCluster().toSet val curBrokerIds = curBrokers.map(_.id) val liveOrShuttingDownBrokerIds = controllerContext.liveOrShuttingDownBrokerIds val newBrokerIds = curBrokerIds -- liveOrShuttingDownBrokerIds val deadBrokerIds = liveOrShuttingDownBrokerIds -- curBrokerIds {code} Basically the ControlledShutdown event took so long to process that the BrokerChange corresponding to the killed broker (3rd BrokerChange in the above snippet) and BrokerChange corresponding to the restarted broker (4th BrokerChange in the above snippet) are queued up waiting for ControlledShutdown's completion. By the time these BrokerChange events get processed, the restarted broker is already registered in zookeeper, causing the broker to appear in both controllerContext.liveOrShuttingDownBrokerIds and the brokers listed in zookeeper. 
This means the controller will not execute the onBrokerFailure in the 3rd BrokerChange and will also not execute onBrokerJoin in the 4th BrokerChange. I'm not sure of the fix. Broker generations as defined in the redesign doc in KAFKA-5027 would work but I'm not sure if it's strictly required. > Controller could miss a broker state change > > > Key: KAFKA-1120 > URL: https://issues.apache.org/jira/browse/KAFKA-1120 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.8.1 >Reporter: Jun Rao > Labels: reliability > Fix For: 1.0.0 > > > When the controller is in the middle of processing a task (e.g., preferred > leader election, broker change), it holds a controller lock. During this > time, a broker could have de-registered and re-registered itself in ZK. After > the controller finishes processing the current task, it will start processing > the logic in the broker change listener. However, it will see no broker > change and therefore won't do anything to the restarted broker. This broker > will be in a weird state since the controller doesn't inform it to become the > leader of any partition. Yet, the cached metadata in other brokers could > still list that broker as the leader for some partitions. Client requests > routed to that broker will then get a TopicOrPartitionNotExistException. This > broker will continue to be in this bad state until it's restarted again. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
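The race comes straight out of that set arithmetic. A small Java sketch of the same two diffs (a sketch, not the Scala original) shows why a bounce that completes before the queued events are processed is invisible: with the ZK registrations and the controller's live set both equal to {1, 2}, both diffs come out empty even though broker 2 died and re-registered in between.

```java
import java.util.HashSet;
import java.util.Set;

public class BrokerChangeDemo {
    // Mirrors the quoted BrokerChange.process logic:
    //   newBrokerIds  = curBrokerIds -- liveOrShuttingDownBrokerIds
    //   deadBrokerIds = liveOrShuttingDownBrokerIds -- curBrokerIds
    static Set<Integer> newBrokerIds(Set<Integer> zkBrokers, Set<Integer> liveBrokers) {
        Set<Integer> s = new HashSet<>(zkBrokers);
        s.removeAll(liveBrokers); // brokers in ZK but unknown to the controller
        return s;
    }

    static Set<Integer> deadBrokerIds(Set<Integer> zkBrokers, Set<Integer> liveBrokers) {
        Set<Integer> s = new HashSet<>(liveBrokers);
        s.removeAll(zkBrokers);   // brokers the controller knows but ZK no longer lists
        return s;
    }
}
```

Because both sets are empty, neither onBrokerFailure nor onBrokerJoin fires, which is exactly the missed state change the ticket describes.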
[jira] [Created] (KAFKA-5698) Sort processor node based on name suffix in TopologyDescription.toString()
Guozhang Wang created KAFKA-5698: Summary: Sort processor node based on name suffix in TopologyDescription.toString() Key: KAFKA-5698 URL: https://issues.apache.org/jira/browse/KAFKA-5698 Project: Kafka Issue Type: Bug Components: streams Reporter: Guozhang Wang Today when we print the topology via the {{Topology#describe()#toString}}, the processor nodes are not sorted inside the {{TopologyDescription.toString()}} function. For example, for the word count demo topology we get: {code} Sub-topologies: Sub-topology: 0 Processor: KSTREAM-FILTER-05(stores: []) --> KSTREAM-SINK-04 <-- KSTREAM-MAP-02 Source: KSTREAM-SOURCE-00(topics: streams-wordcount-input) --> KSTREAM-FLATMAPVALUES-01 Processor: KSTREAM-FLATMAPVALUES-01(stores: []) --> KSTREAM-MAP-02 <-- KSTREAM-SOURCE-00 Processor: KSTREAM-MAP-02(stores: []) --> KSTREAM-FILTER-05 <-- KSTREAM-FLATMAPVALUES-01 Sink: KSTREAM-SINK-04(topic: Counts-repartition) <-- KSTREAM-FILTER-05 Sub-topology: 1 Source: KSTREAM-SOURCE-06(topics: Counts-repartition) --> KSTREAM-AGGREGATE-03 Sink: KSTREAM-SINK-08(topic: streams-wordcount-output) <-- KTABLE-TOSTREAM-07 Processor: KTABLE-TOSTREAM-07(stores: []) --> KSTREAM-SINK-08 <-- KSTREAM-AGGREGATE-03 Processor: KSTREAM-AGGREGATE-03(stores: [Counts]) --> KTABLE-TOSTREAM-07 <-- KSTREAM-SOURCE-06 {code} While ideally we want: {code} Sub-topologies: Sub-topology: 0 Source: KSTREAM-SOURCE-00(topics: streams-wordcount-input) --> KSTREAM-FLATMAPVALUES-01 Processor: KSTREAM-FLATMAPVALUES-01(stores: []) --> KSTREAM-MAP-02 <-- KSTREAM-SOURCE-00 Processor: KSTREAM-MAP-02(stores: []) --> KSTREAM-FILTER-05 <-- KSTREAM-FLATMAPVALUES-01 Processor: KSTREAM-FILTER-05(stores: []) --> KSTREAM-SINK-04 <-- KSTREAM-MAP-02 Sink: KSTREAM-SINK-04(topic: Counts-repartition) <-- KSTREAM-FILTER-05 Sub-topology: 1 Source: KSTREAM-SOURCE-06(topics: Counts-repartition) --> KSTREAM-AGGREGATE-03 Processor: KSTREAM-AGGREGATE-03(stores: [Counts]) --> KTABLE-TOSTREAM-07 <-- KSTREAM-SOURCE-06 Processor: 
KTABLE-TOSTREAM-07(stores: []) --> KSTREAM-SINK-08 <-- KSTREAM-AGGREGATE-03 Sink: KSTREAM-SINK-08(topic: streams-wordcount-output) <-- KTABLE-TOSTREAM-07 {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5697) StreamThread.close() need to interrupt the stream threads to break the loop
Guozhang Wang created KAFKA-5697: Summary: StreamThread.close() need to interrupt the stream threads to break the loop Key: KAFKA-5697 URL: https://issues.apache.org/jira/browse/KAFKA-5697 Project: Kafka Issue Type: Bug Components: streams Reporter: Guozhang Wang In {{StreamThread.close()}} we currently do nothing but set the state, hoping the stream thread may eventually check it and shut itself down. However, under certain scenarios the thread may get blocked within a single loop and hence will never check on this state enum. For example, its {{consumer.poll}} call triggers {{ensureCoordinatorReady()}}, which will block until the coordinator can be found. If the coordinator broker is never up and running, then the Streams instance will be blocked forever. A simple way to reproduce this issue is to start the word count demo without starting the ZK / Kafka broker; it will then get stuck in a single loop, and even `ctrl-C` will not stop it since the state it sets will never be read by the thread: {code} [2017-08-03 15:17:39,981] WARN Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient) [2017-08-03 15:17:40,046] WARN Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient) [2017-08-03 15:17:40,101] WARN Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient) [2017-08-03 15:17:40,206] WARN Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient) [2017-08-03 15:17:40,261] WARN Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient) [2017-08-03 15:17:40,366] WARN Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient) [2017-08-03 15:17:40,472] WARN Connection to node -1 could not be established. 
Broker may not be available. (org.apache.kafka.clients.NetworkClient) ^C[2017-08-03 15:17:40,580] WARN Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
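The fix direction the ticket title implies, interrupting the thread instead of only setting a state flag, can be sketched in plain Java (illustrative only; `Thread.sleep` stands in for the blocking `consumer.poll` / `ensureCoordinatorReady()` call):

```java
public class CloseDemo {
    // A thread blocked inside an interruptible call never re-checks a state
    // flag; an interrupt breaks the blocking call so the loop can observe
    // shutdown. Returns true if the worker terminated within timeoutMs.
    static boolean closeByInterrupt(long timeoutMs) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000); // stands in for consumer.poll blocking on the coordinator
            } catch (InterruptedException e) {
                // interrupted: fall through and let the thread exit its loop
            }
        });
        worker.start();
        worker.interrupt(); // what close() should do in addition to setting the state enum
        worker.join(timeoutMs);
        return !worker.isAlive();
    }
}
```

The interrupt is delivered even if it races ahead of the blocking call: `Thread.sleep` (like the consumer's blocking paths) throws immediately when the flag is already set.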
[jira] [Commented] (KAFKA-5621) The producer should retry expired batches when retries are enabled
[ https://issues.apache.org/jira/browse/KAFKA-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113590#comment-16113590 ] Sumant Tambe commented on KAFKA-5621: - Thread closed. Please share your thoughts on KIP-91 [DISCUSS] thread instead. > The producer should retry expired batches when retries are enabled > -- > > Key: KAFKA-5621 > URL: https://issues.apache.org/jira/browse/KAFKA-5621 > Project: Kafka > Issue Type: Bug >Reporter: Apurva Mehta >Assignee: Apurva Mehta > Fix For: 1.0.0 > > > Today, when a batch is expired in the accumulator, a {{TimeoutException}} is > raised to the user. > It might be better the producer to retry the expired batch rather up to the > configured number of retries. This is more intuitive from the user's point of > view. > Further the proposed behavior makes it easier for applications like mirror > maker to provide ordering guarantees even when batches expire. Today, they > would resend the expired batch and it would get added to the back of the > queue, causing the output ordering to be different from the input ordering. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (KAFKA-5684) KStreamPrintProcessor as customized KStreamPeekProcessor
[ https://issues.apache.org/jira/browse/KAFKA-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113499#comment-16113499 ] Guozhang Wang edited comment on KAFKA-5684 at 8/3/17 9:10 PM: -- I have a slightly different idea: 1. Remove the `defaultKeyValueMapper` from {{KStreamImpl}}, and let the overloaded `print` and `writeAsText` functions without the mapper parameter pass in null values (note that in general we prefer NOT to pass nulls but to replace them with the default mapper as early as possible in the call trace; this is an exception since in {{KStreamImpl}} we cannot access the {{context}} object to get the default serdes). 2. Remove the `KeyValueMapper` from the constructor of {{PrintForeachAction}}, but add an API to set the {{KeyValueMapper}} after the object is constructed. 3. Still keep the {{KStreamPrint}} class, but let {{KStreamPrint}} extend {{KStreamPeek}} and let {{KStreamPrintProcessor}} extend {{KStreamPeekProcessor}}; let {{KStreamPrint}} pass the user-specified serdes to the constructor of {{KStreamPrintProcessor}}. 4. Override {{KStreamPrintProcessor}}'s `init` function to construct the default mapper (potentially based on the default serdes from the context) if one is not passed in, and then call `setMapper` on the action field. The advantage is that we can still abstract the internals from the public APIs completely, with some tradeoff in code complexity. was (Author: guozhang): I have a slightly different idea: 1. Remove the `defaultKeyValueMapper` from {{KStreamImpl}}, and let the overloaded `print` and `writeAsText` functions without the mapper parameter pass in null values (note that in general we prefer NOT to pass nulls but to replace them with the default mapper as early as possible in the call trace; this is an exception since in {{KStreamImpl}} we cannot access the {{context}} object to get the default serdes). 2. Remove the `KeyValueMapper` from the constructor of {{PrintForeachAction}}, but add an API to set the {{KeyValueMapper}} after the object is constructed. 3. Still keep the {{KStreamPrint}} class, but let {{KStreamPrint}} extend {{KStreamPeek}} and let {{KStreamPrintProcessor}} extend {{KStreamPeekProcessor}}; let {{KStreamPrint}} pass the user-specified serdes to the constructor of {{KStreamPrintProcessor}}. 4. Override {{KStreamPrintProcessor}}'s `init` function to construct the default mapper (potentially based on the default serdes from the context) if one is not passed in, and then call `setMapper` on the action field. > KStreamPrintProcessor as customized KStreamPeekProcessor > > > Key: KAFKA-5684 > URL: https://issues.apache.org/jira/browse/KAFKA-5684 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Paolo Patierno >Assignee: Paolo Patierno >Priority: Minor > > Hi, > the {{KStreamPrintProcessor}} is implemented from scratch (from the > {{AbstractProcessor}}) and the same goes for the related supplier. > It looks to me like it's just a special {{KStreamPeekProcessor}} with > forwardDownStream set to false that adds the ability to specify the Serdes > instances used if keys/values are bytes. > At the same time, used by a {{print()}} method, it provides a fast way to print > data flowing through the pipeline (while using just {{peek()}} you need to > write the code). > I think it could be useful to refactor the {{KStreamPrintProcessor}} > to derive from {{KStreamPeekProcessor}}, customizing its behavior. > Thanks, > Paolo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5684) KStreamPrintProcessor as customized KStreamPeekProcessor
[ https://issues.apache.org/jira/browse/KAFKA-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113499#comment-16113499 ] Guozhang Wang commented on KAFKA-5684: -- I have a slightly different idea: 1. Remove the `defaultKeyValueMapper` from {{KStreamImpl}}, and let the overloaded `print` and `writeAsText` functions without the mapper parameter pass in null values (note that in general we prefer NOT to pass nulls but to replace them with the default mapper as early as possible in the call trace; this is an exception since in {{KStreamImpl}} we cannot access the {{context}} object to get the default serdes). 2. Remove the `KeyValueMapper` from the constructor of {{PrintForeachAction}}, but add an API to set the {{KeyValueMapper}} after the object is constructed. 3. Still keep the {{KStreamPrint}} class, but let {{KStreamPrint}} extend {{KStreamPeek}} and let {{KStreamPrintProcessor}} extend {{KStreamPeekProcessor}}; let {{KStreamPrint}} pass the user-specified serdes to the constructor of {{KStreamPrintProcessor}}. 4. Override {{KStreamPrintProcessor}}'s `init` function to construct the default mapper (potentially based on the default serdes from the context) if one is not passed in, and then call `setMapper` on the action field. > KStreamPrintProcessor as customized KStreamPeekProcessor > > > Key: KAFKA-5684 > URL: https://issues.apache.org/jira/browse/KAFKA-5684 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Paolo Patierno >Assignee: Paolo Patierno >Priority: Minor > > Hi, > the {{KStreamPrintProcessor}} is implemented from scratch (from the > {{AbstractProcessor}}) and the same goes for the related supplier. > It looks to me like it's just a special {{KStreamPeekProcessor}} with > forwardDownStream set to false that adds the ability to specify the Serdes > instances used if keys/values are bytes. 
> At the same time, used by a {{print()}} method, it provides a fast way to print > data flowing through the pipeline (while using just {{peek()}} you need to > write the code). > I think it could be useful to refactor the {{KStreamPrintProcessor}} > to derive from {{KStreamPeekProcessor}}, customizing its behavior. > Thanks, > Paolo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
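The refactoring discussed for KAFKA-5684 — print as a peek whose action writes records out and which forwards nothing downstream — can be sketched with plain-Java stand-ins. The class names below are illustrative simplifications, not Kafka's actual internals:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Stand-in for KStreamPeekProcessor: runs an action on each record and
// optionally forwards the record downstream.
class PeekProcessor<K, V> {
    private final BiConsumer<K, V> action;
    private final boolean forwardDownStream;
    final List<String> downstream = new ArrayList<>(); // stand-in for forward()

    PeekProcessor(BiConsumer<K, V> action, boolean forwardDownStream) {
        this.action = action;
        this.forwardDownStream = forwardDownStream;
    }

    void process(K key, V value) {
        action.accept(key, value);
        if (forwardDownStream) {
            downstream.add(key + ":" + value);
        }
    }
}

// Stand-in for KStreamPrintProcessor: just a peek with a printing action
// and forwarding disabled, rather than a from-scratch processor.
class PrintProcessor<K, V> extends PeekProcessor<K, V> {
    PrintProcessor(List<String> out) {
        super((k, v) -> out.add(k + ", " + v), false);
    }
}
```

Processing a record through a `PrintProcessor` appends the formatted output and forwards nothing, which is the behavior the issue describes for {{print()}} versus {{peek()}}.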
[jira] [Commented] (KAFKA-5621) The producer should retry expired batches when retries are enabled
[ https://issues.apache.org/jira/browse/KAFKA-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113406#comment-16113406 ] Ismael Juma commented on KAFKA-5621: Yes, let's start the KIP discussion in the mailing list so that more people can participate. > The producer should retry expired batches when retries are enabled > -- > > Key: KAFKA-5621 > URL: https://issues.apache.org/jira/browse/KAFKA-5621 > Project: Kafka > Issue Type: Bug >Reporter: Apurva Mehta >Assignee: Apurva Mehta > Fix For: 1.0.0 > > > Today, when a batch is expired in the accumulator, a {{TimeoutException}} is > raised to the user. > It might be better for the producer to retry the expired batch up to the > configured number of retries. This is more intuitive from the user's point of > view. > Further, the proposed behavior makes it easier for applications like mirror > maker to provide ordering guarantees even when batches expire. Today, they > would resend the expired batch and it would get added to the back of the > queue, causing the output ordering to be different from the input ordering. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5696) SourceConnector does not commit offset on reconfiguration
Oleg Kuznetsov created KAFKA-5696: - Summary: SourceConnector does not commit offset on reconfiguration Key: KAFKA-5696 URL: https://issues.apache.org/jira/browse/KAFKA-5696 Project: Kafka Issue Type: Bug Components: KafkaConnect Reporter: Oleg Kuznetsov Fix For: 0.10.0.2 I'm running a SourceConnector that reads files from storage and puts data in Kafka. In case of reconfiguration, I want offsets to be flushed. Say a file is completely processed, but its source records are not yet committed; in case of reconfiguration their offsets might be missing from the store. Is it possible to force committing offsets on reconfiguration? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (KAFKA-5638) Inconsistency in consumer group related ACLs
[ https://issues.apache.org/jira/browse/KAFKA-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1611#comment-1611 ] Vahid Hashemian edited comment on KAFKA-5638 at 8/3/17 7:50 PM: The current usage is probably not incorrect, because the implication you mentioned makes sense. However, it is inconsistent. I also don't know of any other inferred permission like this one. That's the reason I raised the issue. Unless there is a big push back, I would like to take the KIP approach and fix this inconsistency by dropping the {{Describe(Cluster)}} check from the API and introducing a {{Describe(Group)}} permission requirement. If there is push back, we can do the latter only and implement what you suggested above. If you are okay with this approach I'll start drafting the KIP. was (Author: vahid): The current usage is probably not incorrect, because the implication you mentioned makes sense. However, it is inconsistent. I also don't know of any other inferred permission like this one. That's the reason I raised the issue. Unless there is a big push back, I would like to take the KIP approach and fix this inconsistency by dropping the {{Describe(Cluster)}} check from the API and introducing a {{Describe(Group)}} group requirement. If there is push back, we can do the latter only and implement what you suggested above. If you are okay with this approach I'll start drafting the KIP. > Inconsistency in consumer group related ACLs > > > Key: KAFKA-5638 > URL: https://issues.apache.org/jira/browse/KAFKA-5638 > Project: Kafka > Issue Type: Bug > Components: security >Affects Versions: 0.11.0.0 >Reporter: Vahid Hashemian >Assignee: Vahid Hashemian >Priority: Minor > Labels: needs-kip > > Users can see all groups in the cluster (using consumer group’s {{--list}} > option) provided that they have {{Describe}} access to the cluster. 
It would > make more sense to modify that experience and limit what is listed in the > output to only those groups they have {{Describe}} access to. The reason is, > almost everything else is accessible by a user only if the access is > specifically granted (through ACL {{--add}}); and this scenario should not be > an exception. The potential change would be updating the minimum required > permission of {{ListGroup}} from {{Describe (Cluster)}} to {{Describe > (Group)}}. > We can also look at this issue from a different angle: A user with {{Read}} > access to a group can describe the group, but the same user would not see > anything when listing groups (assuming there is no {{Describe}} access to the > cluster). It makes more sense for this user to be able to list all groups > s/he can already describe. > It would be great to know if any user is relying on the existing behavior > (listing all consumer groups using a {{Describe (Cluster)}} ACL). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5638) Inconsistency in consumer group related ACLs
[ https://issues.apache.org/jira/browse/KAFKA-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1611#comment-1611 ] Vahid Hashemian commented on KAFKA-5638: The current usage is probably not incorrect, because the implication you mentioned makes sense. However, it is inconsistent. I also don't know of any other inferred permission like this one. That's the reason I raised the issue. Unless there is a big push back, I would like to take the KIP approach and fix this inconsistency by dropping the {{Describe(Cluster)}} check from the API and introducing a {{Describe(Group)}} group requirement. If there is push back, we can do the latter only and implement what you suggested above. If you are okay with this approach I'll start drafting the KIP. > Inconsistency in consumer group related ACLs > > > Key: KAFKA-5638 > URL: https://issues.apache.org/jira/browse/KAFKA-5638 > Project: Kafka > Issue Type: Bug > Components: security >Affects Versions: 0.11.0.0 >Reporter: Vahid Hashemian >Assignee: Vahid Hashemian >Priority: Minor > Labels: needs-kip > > Users can see all groups in the cluster (using consumer group’s {{--list}} > option) provided that they have {{Describe}} access to the cluster. It would > make more sense to modify that experience and limit what is listed in the > output to only those groups they have {{Describe}} access to. The reason is, > almost everything else is accessible by a user only if the access is > specifically granted (through ACL {{--add}}); and this scenario should not be > an exception. The potential change would be updating the minimum required > permission of {{ListGroup}} from {{Describe (Cluster)}} to {{Describe > (Group)}}. > We can also look at this issue from a different angle: A user with {{Read}} > access to a group can describe the group, but the same user would not see > anything when listing groups (assuming there is no {{Describe}} access to the > cluster). 
It makes more sense for this user to be able to list all groups > s/he can already describe. > It would be great to know if any user is relying on the existing behavior > (listing all consumer groups using a {{Describe (Cluster)}} ACL). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
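If KAFKA-5638 is resolved as proposed — requiring {{Describe (Group)}} instead of {{Describe (Cluster)}} for {{ListGroup}} — granting the group-level permission would look something like this with the ACL CLI of that era (principal, host, and group names are placeholders):

```shell
# Hypothetical usage once ListGroup requires Describe(Group): grant a user
# Describe on a specific consumer group instead of on the whole cluster.
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 \
  --add --allow-principal User:alice \
  --operation Describe \
  --group my-consumer-group
```

With the proposed semantics, `--list` would then show alice only the groups she holds {{Describe}} (or {{Read}}) access on, rather than all groups in the cluster.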
[jira] [Commented] (KAFKA-5678) When the broker graceful shutdown occurs, the producer side sends timeout.
[ https://issues.apache.org/jira/browse/KAFKA-5678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113064#comment-16113064 ] Jiangjie Qin commented on KAFKA-5678: - [~cuiyang] 1. Currently the request timeout is used in two places: A. The actual request timeout on the wire; in this case the producer will retry. B. When a batch has been sitting in the accumulator for more than the request timeout and the producer cannot make progress, the batch will be expired; this is not retriable. In the original design, in order to make progress, the producer needs to know the leader information of a partition and this information needs to be up to date. The current implementation of this is a little buggy: it checks whether there is an in-flight batch for a partition or not, but when max.in.flight.requests is set to 1 and a metadata refresh happens, this check may fail and expire the batch by mistake. Expiration on the producer side is a little trickier than it looks; KIP-91 is trying to address that. It looks like what you saw was the second case. Setting a higher request timeout is the way to go then. 2. The reason this problem happens during controlled shutdown is that during controlled shutdown the LeaderAndIsrRequests are not batched, but in other leader-movement scenarios the LeaderAndIsrRequests are batched. So this should not happen. > When the broker graceful shutdown occurs, the producer side sends timeout. > -- > > Key: KAFKA-5678 > URL: https://issues.apache.org/jira/browse/KAFKA-5678 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.9.0.0, 0.10.0.0, 0.11.0.0 >Reporter: tuyang > > Test environment as follows. > 1. Kafka version: 0.9.0.1 > 2. Cluster with 3 brokers, with broker ids A, B, C > 3. Topic with 6 partitions and 2 replicas, with 2 leader partitions on each > broker. > We can reproduce the problem as follows. > 1. We send messages as quickly as possible with acks=-1. > 2. Partition p0's leader is on broker A and we gracefully shut down broker A, but we send a message to p0 before the leader is re-elected, so the message > can be appended to the leader replica successfully; if the follower replica does not catch up quickly enough, the shutting-down broker will create > a delayedProduce for this request and wait for it to complete until request.timeout.ms. > 3. Because of the controlled-shutdown request from broker A, the p0 partition leader is re-elected, and the replica on broker A becomes a follower before the shutdown > completes; the delayedProduce is then not triggered to complete until it expires. > 4. If broker A's shutdown takes too long, the producer only gets a response > after request.timeout.ms, which increases producer send latency > when we restart brokers one by one. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
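The mitigation suggested in the comment above — raising the request timeout so batches are not expired while a broker finishes its controlled shutdown — is purely producer configuration. A sketch using standard producer config keys (the values themselves are illustrative, not recommendations):

```java
import java.util.Properties;

// Illustrative producer settings for the KAFKA-5678 scenario: a higher
// request.timeout.ms gives a gracefully shutting-down broker time to
// complete delayed produce requests before the client gives up.
public class ProducerTimeoutConfig {
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("acks", "all");                  // i.e. acks=-1, as in the report
        props.put("retries", "5");                 // retry retriable send errors
        props.put("request.timeout.ms", "60000");  // raised from the 30s default
        props.put("max.in.flight.requests.per.connection", "1"); // preserve ordering
        return props;
    }
}
```

These properties would be passed to the `KafkaProducer` constructor; note the comment's caveat that with `max.in.flight.requests.per.connection=1` the batch-expiration check itself was buggy at the time, which is what KIP-91 set out to fix.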
[jira] [Commented] (KAFKA-5638) Inconsistency in consumer group related ACLs
[ https://issues.apache.org/jira/browse/KAFKA-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113013#comment-16113013 ] Jason Gustafson commented on KAFKA-5638: Yes, compatibility is what I had in mind. I think my thought at the time was that {{Describe(Cluster)}} ought to imply {{Describe(Group:*)}}, but this may be the only case where we've used permission on one resource to imply permission on another (not sure about that). I guess we see this as incorrect usage? It would be nice to have some clear semantic guidelines for ACL usage since there do seem to be a few inconsistencies. I think there's certainly an argument for treating the missing {{Describe(Group)}} check as a bug, since listing the name of a group is less exposure than describing the group, which is already possible with {{Describe(Group)}} permission. On the other hand, if we wanted to clean up the ACL model at the same time and drop the {{Describe(Cluster)}} permission, then a KIP would be necessary. Thoughts? > Inconsistency in consumer group related ACLs > > > Key: KAFKA-5638 > URL: https://issues.apache.org/jira/browse/KAFKA-5638 > Project: Kafka > Issue Type: Bug > Components: security >Affects Versions: 0.11.0.0 >Reporter: Vahid Hashemian >Assignee: Vahid Hashemian >Priority: Minor > Labels: needs-kip > > Users can see all groups in the cluster (using consumer group’s {{--list}} > option) provided that they have {{Describe}} access to the cluster. It would > make more sense to modify that experience and limit what is listed in the > output to only those groups they have {{Describe}} access to. The reason is, > almost everything else is accessible by a user only if the access is > specifically granted (through ACL {{--add}}); and this scenario should not be > an exception. The potential change would be updating the minimum required > permission of {{ListGroup}} from {{Describe (Cluster)}} to {{Describe > (Group)}}. 
> We can also look at this issue from a different angle: A user with {{Read}} > access to a group can describe the group, but the same user would not see > anything when listing groups (assuming there is no {{Describe}} access to the > cluster). It makes more sense for this user to be able to list all groups > s/he can already describe. > It would be great to know if any user is relying on the existing behavior > (listing all consumer groups using a {{Describe (Cluster)}} ACL). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (KAFKA-4541) Add capability to create delegation token
[ https://issues.apache.org/jira/browse/KAFKA-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikumar reassigned KAFKA-4541: Assignee: Manikumar (was: Ashish Singh) > Add capability to create delegation token > - > > Key: KAFKA-4541 > URL: https://issues.apache.org/jira/browse/KAFKA-4541 > Project: Kafka > Issue Type: Sub-task > Components: security >Reporter: Ashish Singh >Assignee: Manikumar > > Add request/ response and server side handling to create delegation tokens. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-4541) Add capability to create delegation token
[ https://issues.apache.org/jira/browse/KAFKA-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112931#comment-16112931 ] ASF GitHub Bot commented on KAFKA-4541: --- GitHub user omkreddy opened a pull request: https://github.com/apache/kafka/pull/3616 KAFKA-4541: Support for delegation token mechanism You can merge this pull request into a Git repository by running: $ git pull https://github.com/omkreddy/kafka KAFKA-4541 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3616.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3616 commit 6f163e48f2a0a42e81250a57470b925ac2dc2fbf Author: Manikumar Reddy Date: 2017-07-26T11:41:52Z KAFKA-4541: Add ability to create/renew/expire/describe delegation tokens KAFKA-4542: DelegationToken based Authentication using SCRAM > Add capability to create delegation token > - > > Key: KAFKA-4541 > URL: https://issues.apache.org/jira/browse/KAFKA-4541 > Project: Kafka > Issue Type: Sub-task > Components: security >Reporter: Ashish Singh >Assignee: Ashish Singh > > Add request/ response and server side handling to create delegation tokens. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5692) Refactor PreferredReplicaLeaderElectionCommand to use AdminClient
[ https://issues.apache.org/jira/browse/KAFKA-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Bentley updated KAFKA-5692: --- Issue Type: Improvement (was: Bug) > Refactor PreferredReplicaLeaderElectionCommand to use AdminClient > - > > Key: KAFKA-5692 > URL: https://issues.apache.org/jira/browse/KAFKA-5692 > Project: Kafka > Issue Type: Improvement >Reporter: Tom Bentley >Assignee: Tom Bentley >Priority: Minor > Labels: kip > > The PreferredReplicaLeaderElectionCommand currently uses a direct connection > to zookeeper. The zookeeper dependency should be deprecated and an > AdminClient API created to be used instead. > This change will require a KIP. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-3268) Refactor existing CLI scripts to use KafkaAdminClient
[ https://issues.apache.org/jira/browse/KAFKA-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112603#comment-16112603 ] Tom Bentley commented on KAFKA-3268: [~viktorsomogyi] there is no tracking JIRA that I'm aware of. Rather than closing this one only to create a tracking one, let's just use this JIRA. You're right, both those other commands will also need to be changed, so creating JIRAs and KIPs for those is fine with me. > Refactor existing CLI scripts to use KafkaAdminClient > - > > Key: KAFKA-3268 > URL: https://issues.apache.org/jira/browse/KAFKA-3268 > Project: Kafka > Issue Type: Sub-task >Reporter: Grant Henke >Assignee: Viktor Somogyi > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-3268) Refactor existing CLI scripts to use KafkaAdminClient
[ https://issues.apache.org/jira/browse/KAFKA-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112591#comment-16112591 ] Viktor Somogyi commented on KAFKA-3268: --- [~tombentley], thanks for the heads up, I didn't know about these. I guess we can close this jira then? Do you know if there are any more jiras around refactoring the admin clients? Is there an umbrella jira which tracks the subtasks? I can see two tasks I'd be happy to do: refactoring the BrokerApiVersionsCommand command (this uses metadata request to collect the info, although through the deprecated AdminClient) and ConfigCommand (uses zookeeper directly, therefore probably needs kip). Shall I open jiras/kips accordingly? > Refactor existing CLI scripts to use KafkaAdminClient > - > > Key: KAFKA-3268 > URL: https://issues.apache.org/jira/browse/KAFKA-3268 > Project: Kafka > Issue Type: Sub-task >Reporter: Grant Henke >Assignee: Viktor Somogyi > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-3268) Refactor existing CLI scripts to use KafkaAdminClient
[ https://issues.apache.org/jira/browse/KAFKA-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi updated KAFKA-3268: -- Summary: Refactor existing CLI scripts to use KafkaAdminClient (was: Refactor existing CLI scripts to use new AdminClient) > Refactor existing CLI scripts to use KafkaAdminClient > - > > Key: KAFKA-3268 > URL: https://issues.apache.org/jira/browse/KAFKA-3268 > Project: Kafka > Issue Type: Sub-task >Reporter: Grant Henke >Assignee: Viktor Somogyi > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5561) Rewrite TopicCommand using the new Admin client
[ https://issues.apache.org/jira/browse/KAFKA-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112457#comment-16112457 ] Mickael Maison commented on KAFKA-5561: --- We have pretty restrictive create topic policies to ensure topics created by our users have specific settings (replication factor, partitions count, retention limit, etc). At the same time, we have a bunch of "internal" topics we use for monitoring, testing and those can have arbitrary settings that the policy wouldn't allow. > Rewrite TopicCommand using the new Admin client > --- > > Key: KAFKA-5561 > URL: https://issues.apache.org/jira/browse/KAFKA-5561 > Project: Kafka > Issue Type: Improvement > Components: tools >Reporter: Paolo Patierno >Assignee: Paolo Patierno > > Hi, > as suggested in the https://issues.apache.org/jira/browse/KAFKA-3331, it > could be great to have the TopicCommand using the new Admin client instead of > the way it works today. > As pushed by [~gwenshap] in the above JIRA, I'm going to work on it. > Thanks, > Paolo -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5674) max.connections.per.ip minimum value to be zero to allow IP address blocking
[ https://issues.apache.org/jira/browse/KAFKA-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112426#comment-16112426 ] Viktor Somogyi commented on KAFKA-5674: --- [~tmgstev] could you please review my PR once you have time? > max.connections.per.ip minimum value to be zero to allow IP address blocking > > > Key: KAFKA-5674 > URL: https://issues.apache.org/jira/browse/KAFKA-5674 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.11.0.0 >Reporter: Tristan Stevens >Assignee: Viktor Somogyi > > Currently the max.connections.per.ip (KAFKA-1512) config has a minimum value > of 1, however, as suggested in > https://issues.apache.org/jira/browse/KAFKA-1512?focusedCommentId=14051914, > having this with a minimum value of zero would allow IP-based filtering of > inbound connections (effectively prohibit those IP addresses from connecting > altogether). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
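If the minimum for {{max.connections.per.ip}} is lowered to zero as proposed, IP-based blocking would presumably combine a zero default with the existing per-IP overrides in the broker config, along these lines (addresses and counts are illustrative):

```properties
# Hypothetical broker settings once max.connections.per.ip may be 0:
# reject connections from all addresses by default, then allow specific ones.
max.connections.per.ip=0
max.connections.per.ip.overrides=10.0.0.5:100,10.0.0.6:100
```

The `max.connections.per.ip.overrides` property already exists (a comma-separated list of `host:count` entries); only the zero default would be new behavior.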
[jira] [Commented] (KAFKA-5693) TopicCreationPolicy and AlterConfigsPolicy overlap
[ https://issues.apache.org/jira/browse/KAFKA-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112403#comment-16112403 ] Tom Bentley commented on KAFKA-5693: Would this need a KIP? It's not changing a public API as such, but from a user's point of view, topic creations or topic config modifications that were permitted before would be rejected. Note that some aspects of this (but not all) are already included in [KIP-179|https://cwiki.apache.org/confluence/display/KAFKA/KIP-179+-+Change+ReassignPartitionsCommand+to+use+AdminClient] > TopicCreationPolicy and AlterConfigsPolicy overlap > -- > > Key: KAFKA-5693 > URL: https://issues.apache.org/jira/browse/KAFKA-5693 > Project: Kafka > Issue Type: Bug >Reporter: Tom Bentley >Priority: Minor > > The administrator of a cluster can configure a {{CreateTopicPolicy}}, which > has access to the topic configs as well as other metadata to make its > decision about whether a topic creation is allowed. Thus in theory the > decision could be based on a combination of the replication factor and > the topic configs, for example. > Separately there is an AlterConfigPolicy, which only has access to the > configs (and can apply to configurable entities other than just topics). > There are potential issues with this. For example, although the > CreateTopicPolicy is checked at creation time, it's not checked for any later > alterations to the topic config. So policies which depend on both the topic > configs and other topic metadata could be worked around by changing the > configs after creation. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
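The overlap described in KAFKA-5693 is easiest to see with a concrete rule. The real extension point is {{org.apache.kafka.server.policy.CreateTopicPolicy}} (KIP-108); the self-contained stand-in below only illustrates the shape of such a check, with a rule that spans both topic metadata and topic configs — exactly the kind of rule an alter-configs policy cannot re-verify after creation:

```java
import java.util.Map;

// Stand-in for a CreateTopicPolicy-style validation (the real interface is
// org.apache.kafka.server.policy.CreateTopicPolicy). The second rule mixes
// replication factor (metadata) with a topic config, so changing the config
// after creation would bypass it.
public class TopicPolicyCheck {
    /** Returns null if the request is acceptable, otherwise a violation message. */
    static String validate(int replicationFactor, Map<String, String> configs) {
        if (replicationFactor < 2) {
            return "replication factor must be at least 2";
        }
        // Rule spanning metadata and configs: under-replicated topics
        // may not enable unclean leader election.
        if (replicationFactor < 3
                && "true".equals(configs.get("unclean.leader.election.enable"))) {
            return "unclean leader election requires replication factor >= 3";
        }
        return null;
    }
}
```

A topic created with replication factor 2 and the default config passes, but an `alterConfigs` call later setting `unclean.leader.election.enable=true` would never be seen by this check — which is the gap the issue describes.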
[jira] [Updated] (KAFKA-5692) Refactor PreferredReplicaLeaderElectionCommand to use AdminClient
[ https://issues.apache.org/jira/browse/KAFKA-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Bentley updated KAFKA-5692: --- Labels: kip (was: needs-kip) > Refactor PreferredReplicaLeaderElectionCommand to use AdminClient > - > > Key: KAFKA-5692 > URL: https://issues.apache.org/jira/browse/KAFKA-5692 > Project: Kafka > Issue Type: Bug >Reporter: Tom Bentley >Assignee: Tom Bentley >Priority: Minor > Labels: kip > > The PreferredReplicaLeaderElectionCommand currently uses a direct connection > to zookeeper. The zookeeper dependency should be deprecated and an > AdminClient API created to be used instead. > This change will require a KIP. -- This message was sent by Atlassian JIRA (v6.4.14#64029)