[jira] [Created] (KAFKA-5590) Deleting a Kafka Topic Fails to Complete After Enabling the Ranger Kafka Plugin
Chaofeng Zhao created KAFKA-5590:
------------------------------------
             Summary: Deleting a Kafka Topic Fails to Complete After Enabling the Ranger Kafka Plugin
                 Key: KAFKA-5590
                 URL: https://issues.apache.org/jira/browse/KAFKA-5590
             Project: Kafka
          Issue Type: Bug
          Components: security
    Affects Versions: 0.10.0.0
         Environment: kafka and ranger under ambari
            Reporter: Chaofeng Zhao

Hi:
Recently I have been developing some applications with Kafka under Ranger. But after enabling the Ranger Kafka plugin I cannot delete a Kafka topic completely, even though 'delete.topic.enable=true' is set. I find that with the Ranger Kafka plugin enabled, the operation must be authorized. How can I delete a Kafka topic completely under Ranger? Thank you.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (KAFKA-5587) Processor got uncaught exception: NullPointerException
[ https://issues.apache.org/jira/browse/KAFKA-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086391#comment-16086391 ]

Rajini Sivaram commented on KAFKA-5587:
---------------------------------------

When a cluster is under load, delays in processing (e.g. produce requests) can cause connections to be muted for a long time, resulting in channel expiry. If there are staged receives corresponding to an expired channel, a completed receive may be delivered after the channel is removed from the channel list.

> Processor got uncaught exception: NullPointerException
> ------------------------------------------------------
>
>             Key: KAFKA-5587
>             URL: https://issues.apache.org/jira/browse/KAFKA-5587
>         Project: Kafka
>      Issue Type: Bug
>      Components: core
> Affects Versions: 0.10.1.1
>        Reporter: Dan
>        Assignee: Rajini Sivaram
>
> [2017-07-12 21:56:39,964] ERROR Processor got uncaught exception. (kafka.network.Processor)
> java.lang.NullPointerException
>     at kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:490)
>     at kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:487)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>     at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>     at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>     at kafka.network.Processor.processCompletedReceives(SocketServer.scala:487)
>     at kafka.network.Processor.run(SocketServer.scala:417)
>     at java.lang.Thread.run(Thread.java:745)
> Does anyone know the cause of this exception? What is its effect?
> When this exception occurred, the log also showed that the broker was frequently shrinking the ISR to itself. Are these two things related?

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
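The race Rajini describes can be sketched in plain Java: a receive completes while its channel is still registered, the channel is then expired and removed, and the processor later looks the channel up and dereferences the missing entry. The class and collection names below are hypothetical stand-ins, not the broker's actual `SocketServer` code; the point is the null guard on the lookup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StagedReceiveRace {
    // Hypothetical stand-ins for the broker's channel registry and the
    // staged receives that completed while their connection was muted.
    static List<String> processCompletedReceives(Map<String, Boolean> channels,
                                                 List<String> completedReceives) {
        List<String> processed = new ArrayList<>();
        for (String channelId : completedReceives) {
            Boolean channel = channels.get(channelId);
            if (channel == null) {
                // The channel expired and was removed while the receive was
                // staged; without this guard the missing entry is dereferenced
                // and the processor thread dies with a NullPointerException.
                continue;
            }
            processed.add(channelId);
        }
        return processed;
    }

    public static void main(String[] args) {
        Map<String, Boolean> channels = new HashMap<>();
        channels.put("ch-1", true);
        // "ch-2" was expired and removed, but its receive already completed.
        List<String> receives = List.of("ch-1", "ch-2");
        System.out.println(processCompletedReceives(channels, receives)); // [ch-1]
    }
}
```

The receive for the expired channel is simply skipped instead of crashing the processor thread.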
[jira] [Assigned] (KAFKA-5587) Processor got uncaught exception: NullPointerException
[ https://issues.apache.org/jira/browse/KAFKA-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajini Sivaram reassigned KAFKA-5587:
-------------------------------------

    Assignee: Rajini Sivaram

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (KAFKA-5588) ConsoleConsumer: useless --new-consumer option
[ https://issues.apache.org/jira/browse/KAFKA-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086070#comment-16086070 ]

Paolo Patierno commented on KAFKA-5588:
---------------------------------------

[~vahid] thanks for the clarification. Where could we check to be sure about that? By asking on the dev list?

> ConsoleConsumer: useless --new-consumer option
> ----------------------------------------------
>
>             Key: KAFKA-5588
>             URL: https://issues.apache.org/jira/browse/KAFKA-5588
>         Project: Kafka
>      Issue Type: Bug
>        Reporter: Paolo Patierno
>        Assignee: Paolo Patierno
>        Priority: Minor
>
> Hi,
> it seems to me that the --new-consumer option on the ConsoleConsumer is useless.
> The useOldConsumer var is tied to specifying --zookeeper on the command line, in which case the --bootstrap-server option (and --new-consumer) can't be used.
> If you use the --bootstrap-server option then the new consumer is used automatically, so there is no need for --new-consumer.
> It turns out that using the old or the new consumer depends only on using the --zookeeper or the --bootstrap-server option (which can't be used together, so I can't use the new consumer while connecting to ZooKeeper).
> It's also clear when you use --zookeeper for the old consumer: the help output says
> "Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper]"
> I'm going to remove the --new-consumer option from the tool.
> Thanks,
> Paolo.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (KAFKA-5588) ConsoleConsumer: useless --new-consumer option
[ https://issues.apache.org/jira/browse/KAFKA-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086040#comment-16086040 ]

Vahid Hashemian commented on KAFKA-5588:
----------------------------------------

If I'm not mistaken this requires a KIP due to potential impact to existing users.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Resolved] (KAFKA-5589) Bump dependency of Kafka 0.10.x to the latest one
[ https://issues.apache.org/jira/browse/KAFKA-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piotr Nowojski resolved KAFKA-5589.
-----------------------------------
    Resolution: Not A Problem

Sorry, created an issue in wrong project.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Created] (KAFKA-5589) Bump dependency of Kafka 0.10.x to the latest one
Piotr Nowojski created KAFKA-5589:
-------------------------------------
             Summary: Bump dependency of Kafka 0.10.x to the latest one
                 Key: KAFKA-5589
                 URL: https://issues.apache.org/jira/browse/KAFKA-5589
             Project: Kafka
          Issue Type: Improvement
          Components: KafkaConnect
            Reporter: Piotr Nowojski

We are using a pretty old Kafka version for 0.10. Besides the bug fixes and improvements made between 0.10.0.1 and 0.10.2.1, the 0.10.2.1 version is more similar to 0.11.0.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (KAFKA-5584) Incorrect log size for topics larger than 2 GB
[ https://issues.apache.org/jira/browse/KAFKA-5584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085542#comment-16085542 ]

Michal Borowiecki commented on KAFKA-5584:
------------------------------------------

Was this a regression in 0.11.0.0 specifically?

> Incorrect log size for topics larger than 2 GB
> ----------------------------------------------
>
>             Key: KAFKA-5584
>             URL: https://issues.apache.org/jira/browse/KAFKA-5584
>         Project: Kafka
>      Issue Type: Bug
>      Components: log
>        Reporter: Gregor Uhlenheuer
>        Priority: Critical
>          Labels: reliability
>         Fix For: 0.11.0.1
>     Attachments: Screen Shot 2017-07-12 at 09.10.53.png
>
> The {{size}} of a {{Log}} is calculated incorrectly due to an Integer overflow. For large topics (larger than 2 GB) this value overflows.
> This is easily observable in the reported metrics values of the path {{log.Log.partition.*.topic..Size}} (see attached screenshot).
> Moreover I think this breaks the size-based retention (via {{log.retention.bytes}} and {{retention.bytes}}) of large topics as well.
> I am not sure of the recommended workflow; should I open a pull request on GitHub with a fix?

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
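The overflow class of bug is easy to reproduce in isolation: summing segment sizes in a 32-bit accumulator wraps negative once the total passes 2 GiB, while a 64-bit accumulator does not. This is a minimal illustration of the arithmetic, not Kafka's actual `Log.size` code.

```java
public class LogSizeOverflow {
    static final int SEGMENT_SIZE = 1_073_741_824; // 1 GiB per segment

    // Buggy variant: the sum is computed in 32-bit arithmetic and wraps
    // past Integer.MAX_VALUE, producing a negative "size".
    static int sizeAsInt(int segments) {
        int size = 0;
        for (int i = 0; i < segments; i++) size += SEGMENT_SIZE;
        return size;
    }

    // Fixed variant: accumulate in a long, which holds up to 2^63 - 1 bytes.
    static long sizeAsLong(int segments) {
        long size = 0L;
        for (int i = 0; i < segments; i++) size += SEGMENT_SIZE;
        return size;
    }

    public static void main(String[] args) {
        System.out.println(sizeAsInt(3));  // -1073741824 (wrapped)
        System.out.println(sizeAsLong(3)); // 3221225472
    }
}
```

A negative size would also explain broken size-based retention: any comparison against `log.retention.bytes` sees a value far below the threshold.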
[jira] [Commented] (KAFKA-3239) Timing issue in controller metrics on topic delete
[ https://issues.apache.org/jira/browse/KAFKA-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085523#comment-16085523 ]

Mickael Maison commented on KAFKA-3239:
---------------------------------------

We are still seeing this with 0.10.2.1:
{quote}
[2017-04-19 14:56:26,251] ERROR Error printing regular metrics: (com.airbnb.metrics.StatsDReporter)
java.util.NoSuchElementException: key not found: [TOPIC_NAME_REDACTED,0]
    at scala.collection.MapLike$class.default(MapLike.scala:228)
    at scala.collection.AbstractMap.default(Map.scala:59)
    at scala.collection.mutable.HashMap.apply(HashMap.scala:65)
    at kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:210)
    at kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:208)
    at scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:114)
    at scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:113)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
    at scala.collection.TraversableOnce$class.count(TraversableOnce.scala:113)
    at scala.collection.AbstractTraversable.count(Traversable.scala:104)
    at kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply$mcI$sp(KafkaController.scala:208)
    at kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply(KafkaController.scala:205)
    at kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply(KafkaController.scala:205)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
    at kafka.controller.KafkaController$$anon$3.value(KafkaController.scala:204)
    at kafka.controller.KafkaController$$anon$3.value(KafkaController.scala:202)
    at com.airbnb.metrics.StatsDReporter.processGauge(StatsDReporter.java:163)
    at com.airbnb.metrics.StatsDReporter.processGauge(StatsDReporter.java:37)
    at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28)
    at com.airbnb.metrics.StatsDReporter.sendAMetric(StatsDReporter.java:131)
    at com.airbnb.metrics.StatsDReporter.sendAllKafkaMetrics(StatsDReporter.java:119)
    at com.airbnb.metrics.StatsDReporter.run(StatsDReporter.java:85)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:483)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:316)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:190)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1157)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627)
    at java.lang.Thread.run(Thread.java:809)
{quote}
The topic was stuck in the "Marked for deletion" state. Such topics don't have leaders, so the issue is probably with:
{quote}
controllerContext.partitionLeadershipInfo(topicPartition)
{quote}
We're trying out a fix at the moment; if it works I'll send a PR.

> Timing issue in controller metrics on topic delete
> --------------------------------------------------
>
>             Key: KAFKA-3239
>             URL: https://issues.apache.org/jira/browse/KAFKA-3239
>         Project: Kafka
>      Issue Type: Bug
>      Components: controller
> Affects Versions: 0.9.0.0
>        Reporter: Rajini Sivaram
>        Assignee: Rajini Sivaram
>
> Noticed this exception in our logs:
> {quote}
> java.util.NoSuchElementException: key not found: [sometopic,0]
>     at scala.collection.MapLike$class.default(MapLike.scala:228)
>     at scala.collection.AbstractMap.default(Map.scala:59)
>     at scala.collection.mutable.HashMap.apply(HashMap.scala:65)
>     at kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:209)
>     at kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:208)
>     at scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:114)
>     at scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:113)
>     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
>     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
>     at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
>     at
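The failure mode Mickael points at is an unguarded map lookup: Scala's `Map.apply` throws `NoSuchElementException` for a key that is absent, which is exactly the state of a topic stuck in "Marked for deletion" (no leadership entry at all). A defensive gauge skips missing keys instead of throwing. The Java sketch below uses hypothetical names, not the controller's actual code.

```java
import java.util.List;
import java.util.Map;

public class OfflinePartitionGauge {
    // Count partitions whose leader is offline (encoded here as -1), skipping
    // partitions that have no leadership entry at all. A direct lookup in the
    // style of Scala's Map.apply would instead throw NoSuchElementException
    // on the missing key and kill the metrics reporter.
    static long offlineCount(Map<String, Integer> leadershipInfo, List<String> partitions) {
        return partitions.stream()
                .filter(leadershipInfo::containsKey) // guard against missing entries
                .filter(p -> leadershipInfo.get(p) == -1)
                .count();
    }

    public static void main(String[] args) {
        Map<String, Integer> leadershipInfo = Map.of("live-0", 1, "dead-0", -1);
        // "deleting-0" is marked for deletion and has no leadership entry.
        List<String> partitions = List.of("live-0", "dead-0", "deleting-0");
        System.out.println(offlineCount(leadershipInfo, partitions)); // 1
    }
}
```

The gauge now reports 1 offline partition and simply ignores the half-deleted topic rather than crashing mid-report.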
[jira] [Commented] (KAFKA-5585) Failover in a replicated Cluster does not work
[ https://issues.apache.org/jira/browse/KAFKA-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085411#comment-16085411 ]

Thomas Bayer commented on KAFKA-5585:
-------------------------------------

[~huxi_2b] I did that with a list of all brokers, and with a single broker that was working.

> Failover in a replicated Cluster does not work
> ----------------------------------------------
>
>             Key: KAFKA-5585
>             URL: https://issues.apache.org/jira/browse/KAFKA-5585
>         Project: Kafka
>      Issue Type: Bug
>      Components: core
> Affects Versions: 0.11.0.0
>     Environment: Linux, Mac OSX
>        Reporter: Thomas Bayer
>     Attachments: broker_zookeeper_configs.zip, SimpleConsumer.java, SimpleProducer.java, Stress Test Windows.xlsx, test_project_files.zip
>
> Failover does not work in a cluster with 3 nodes and a replicated topic with factor 3.
> You can reproduce it as follows: set up 3 Kafka nodes and 1 ZooKeeper. Then create a topic with replication factor 3. Start a consumer. Stop a node. Write to the topic. Now you get warnings that the client cannot connect to a broker, and the consumer does not receive any messages.
> The same setup works like a charm with 0.10.2.1.
> Broker config:
> {{broker.id=1
> listeners=PLAINTEXT://:9091
> log.dirs=cluster/logs/node-1
> broker.id=2
> listeners=PLAINTEXT://:9092
> log.dirs=cluster/logs/node-2
> broker.id=3
> listeners=PLAINTEXT://:9093
> log.dirs=cluster/logs/node-3}}
> The rest of the config is from the distribution.
> Producer and consumer config: see attached files.
> *Log Consumer:*
> 2017-07-12 16:15:26 WARN ConsumerCoordinator:649 - Auto-commit of offsets {produktion-0=OffsetAndMetadata{offset=10, metadata=''}} failed for group a: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: The coordinator is not available.
> 2017-07-12 16:15:26 WARN NetworkClient:588 - Connection to node 2147483645 could not be established. Broker may not be available.
> 2017-07-12 16:15:26 WARN NetworkClient:588 - Connection to node 2 could not be established. Broker may not be available.
> *Log Producer:*
> {{2017-07-12 16:15:32 WARN NetworkClient:588 - Connection to node -1 could not be established. Broker may not be available.}}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (KAFKA-5585) Failover in a replicated Cluster does not work
[ https://issues.apache.org/jira/browse/KAFKA-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085403#comment-16085403 ]

huxihx commented on KAFKA-5585:
-------------------------------

[~tubayer] The producer complains that it failed to connect to node 2, whose port should be 9092. In the producer properties, you specify `bootstrap.servers` as "localhost:9092", which means it connects to node 2 only. Could you specify it as "localhost:9091,localhost:9092,localhost:9093" and retry?

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
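huxihx's suggestion is purely a client-configuration change: `bootstrap.servers` is only used for the initial metadata fetch, so listing every broker lets the client bootstrap even when one node is down. A minimal sketch of the producer properties, using the three ports from the ticket's broker config (the serializer classes are the standard Kafka ones, assumed here for completeness):

```java
import java.util.Properties;

public class BootstrapServersConfig {
    static Properties producerProps() {
        Properties props = new Properties();
        // List every broker, not just one: bootstrap only needs a single live
        // node from this list, so one stopped broker no longer blocks the
        // initial metadata request.
        props.put("bootstrap.servers", "localhost:9091,localhost:9092,localhost:9093");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerProps().getProperty("bootstrap.servers"));
    }
}
```

After bootstrapping, the client discovers the full cluster from metadata, so the list does not need to be exhaustive in general, but for a three-node test cluster listing all three is the simple, robust choice.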
[jira] [Assigned] (KAFKA-5431) LogCleaner stopped due to org.apache.kafka.common.errors.CorruptRecordException
[ https://issues.apache.org/jira/browse/KAFKA-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

huxihx reassigned KAFKA-5431:
-----------------------------
    Assignee: huxihx

> LogCleaner stopped due to org.apache.kafka.common.errors.CorruptRecordException
> -------------------------------------------------------------------------------
>
>             Key: KAFKA-5431
>             URL: https://issues.apache.org/jira/browse/KAFKA-5431
>         Project: Kafka
>      Issue Type: Bug
>      Components: core
> Affects Versions: 0.10.2.1
>        Reporter: Carsten Rietz
>        Assignee: huxihx
>          Labels: reliability
>         Fix For: 0.11.0.1
>
> Hey all,
> I have a strange problem with our UAT cluster of 3 Kafka brokers.
> The __consumer_offsets topic was replicated to two instances, and our disks ran full due to a wrong configuration of the log cleaner. We fixed the configuration and updated from 0.10.1.1 to 0.10.2.1.
> Today I increased the replication of the __consumer_offsets topic to 3 and triggered replication to the third broker via kafka-reassign-partitions.sh. That went well, but I get many errors like
> {code}
> [2017-06-12 09:59:50,342] ERROR Found invalid messages during fetch for partition [__consumer_offsets,18] offset 0 error Record size is less than the minimum record overhead (14) (kafka.server.ReplicaFetcherThread)
> [2017-06-12 09:59:50,342] ERROR Found invalid messages during fetch for partition [__consumer_offsets,24] offset 0 error Record size is less than the minimum record overhead (14) (kafka.server.ReplicaFetcherThread)
> {code}
> which I think are due to the full-disk event.
> The log cleaner threads died on these corrupt messages:
> {code}
> [2017-06-12 09:59:50,722] ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner)
> org.apache.kafka.common.errors.CorruptRecordException: Record size is less than the minimum record overhead (14)
> [2017-06-12 09:59:50,722] INFO [kafka-log-cleaner-thread-0], Stopped (kafka.log.LogCleaner)
> {code}
> Looking at the files I see that some are truncated and some are just empty:
> $ ls -lsh 00594653.log
> 0 -rw-r--r-- 1 user user 100M Jun 12 11:00 00594653.log
> Sadly I do not have the logs any more from the disk-full event itself.
> I have three questions:
> * What is the best way to clean this up? Deleting the old log files and restarting the brokers?
> * Why did Kafka not handle the disk-full event well? Is this only affecting the cleanup, or may we also lose data?
> * Is this maybe caused by the combination of upgrade and full disk?
> And last but not least: keep up the good work. Kafka is really performing well while being easy to administer and has good documentation!

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Comment Edited] (KAFKA-5431) LogCleaner stopped due to org.apache.kafka.common.errors.CorruptRecordException
[ https://issues.apache.org/jira/browse/KAFKA-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085357#comment-16085357 ]

huxihx edited comment on KAFKA-5431 at 7/13/17 8:41 AM:
--------------------------------------------------------

Seems this only happens when preallocate is enabled and topic is configured with 'compact'. I think only one tiny code change could solve both this issue and [KAFKA-5582|https://issues.apache.org/jira/browse/KAFKA-5582].

was (Author: huxi_2b):
Seems this only happens when preallocate is enabled and topic is configured with 'compact'. When I think only one tiny code change could solve both this issue and [KAFKA-5582|https://issues.apache.org/jira/browse/KAFKA-5582].

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (KAFKA-5582) Log compaction with preallocation enabled does not trim segments
[ https://issues.apache.org/jira/browse/KAFKA-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085382#comment-16085382 ]

huxihx commented on KAFKA-5582:
-------------------------------

Seems that in LogCleaner, `cleanSegments` should not set the length for the cleanable log segment files.

> Log compaction with preallocation enabled does not trim segments
> ----------------------------------------------------------------
>
>             Key: KAFKA-5582
>             URL: https://issues.apache.org/jira/browse/KAFKA-5582
>         Project: Kafka
>      Issue Type: Bug
> Affects Versions: 0.10.1.1
>     Environment: Linux, Windows
>        Reporter: Jason Aliyetti
>
> Unexpected behavior occurs when a topic is configured to preallocate files and has a retention policy of compact.
> When log compaction runs, the cleaner gathers groups of segments to consolidate based on the max segment size.
> When preallocation is enabled, all segments are that size, so each individual segment is considered for compaction.
> When compaction does occur, the resulting cleaned file is sized based on that same configuration. This means that you can have very large files on disk that contain little or no data, which partly defeats the point of compacting.
> The log cleaner should trim these segments so that they free up disk space and can be further compacted on subsequent runs.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (KAFKA-5431) LogCleaner stopped due to org.apache.kafka.common.errors.CorruptRecordException
[ https://issues.apache.org/jira/browse/KAFKA-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085380#comment-16085380 ]

ASF GitHub Bot commented on KAFKA-5431:
---------------------------------------

GitHub user huxihx opened a pull request:

    https://github.com/apache/kafka/pull/3525

    KAFKA-5431: cleanSegments should not set length for cleanable segment files

    For a compacted topic with preallocate enabled, during log cleaning, LogCleaner.cleanSegments does not have to pre-allocate the underlying file size since we only want to store the cleaned data in the file.
    It's believed that this fix should also solve KAFKA-5582.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/huxihx/kafka log_compact_test

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3525.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3525

commit e14436a2abb25c5b324efba5e431e5e1afb6e05a
Author: huxihx
Date:   2017-07-13T08:28:50Z

    KAFKA-5431: LogCleaner stopped due to org.apache.kafka.common.errors.CorruptRecordException

    For a compacted topic with preallocate enabled, during log cleaning, LogCleaner.cleanSegments does not have to pre-allocate the underlying file size since we only want to store the cleaned data in the file.
    It's believed that this fix should also solve KAFKA-5582.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
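The effect the PR describes can be mimicked with plain file APIs: a preallocated file keeps its full allocated length even when only a few bytes of cleaned data are written into it, unless it is trimmed back to the bytes actually written. This is an illustration of the preallocate-then-trim idea only; the real change lives in `LogCleaner.cleanSegments`, and the sizes and names here are made up.

```java
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class PreallocateTrim {
    // Write `data` into a file preallocated to `preallocatedBytes` and return
    // its length before and after trimming to the bytes actually written.
    static long[] writeAndTrim(Path path, long preallocatedBytes, byte[] data) throws Exception {
        try (RandomAccessFile f = new RandomAccessFile(path.toFile(), "rw")) {
            f.setLength(preallocatedBytes);  // what preallocate=true does up front
            f.write(data);
            long before = f.length();        // still the full preallocated size
            f.setLength(f.getFilePointer()); // trim to the cleaned data, as the fix intends
            return new long[] { before, f.length() };
        }
    }

    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("segment", ".log");
        long[] sizes = writeAndTrim(path, 1 << 20, "cleaned".getBytes());
        System.out.println(sizes[0] + " -> " + sizes[1]); // 1048576 -> 7
        Files.deleteIfExists(path);
    }
}
```

Without the trim, compaction output keeps its preallocated size on disk, which matches the reporter's `ls` output: a "100M" file containing almost no data.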
[jira] [Commented] (KAFKA-5431) LogCleaner stopped due to org.apache.kafka.common.errors.CorruptRecordException
[ https://issues.apache.org/jira/browse/KAFKA-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085357#comment-16085357 ] huxihx commented on KAFKA-5431: --- Seems this only happens when preallocate is enabled and topic is configured with 'compact'. When I think only one tiny code change could solve both this issue and [KAFKA-5582|https://issues.apache.org/jira/browse/KAFKA-5582]. > LogCleaner stopped due to > org.apache.kafka.common.errors.CorruptRecordException > --- > > Key: KAFKA-5431 > URL: https://issues.apache.org/jira/browse/KAFKA-5431 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.10.2.1 >Reporter: Carsten Rietz > Labels: reliability > Fix For: 0.11.0.1 > > > Hey all, > i have a strange problem with our uat cluster of 3 kafka brokers. > the __consumer_offsets topic was replicated to two instances and our disks > ran full due to a wrong configuration of the log cleaner. We fixed the > configuration and updated from 0.10.1.1 to 0.10.2.1 . > Today i increased the replication of the __consumer_offsets topic to 3 and > triggered replication to the third cluster via kafka-reassign-partitions.sh. > That went well but i get many errors like > {code} > [2017-06-12 09:59:50,342] ERROR Found invalid messages during fetch for > partition [__consumer_offsets,18] offset 0 error Record size is less than the > minimum record overhead (14) (kafka.server.ReplicaFetcherThread) > [2017-06-12 09:59:50,342] ERROR Found invalid messages during fetch for > partition [__consumer_offsets,24] offset 0 error Record size is less than the > minimum record overhead (14) (kafka.server.ReplicaFetcherThread) > {code} > Which i think are due to the full disk event. 
> The log cleaner threads died on these corrupt messages: > {code} > [2017-06-12 09:59:50,722] ERROR [kafka-log-cleaner-thread-0], Error due to > (kafka.log.LogCleaner) > org.apache.kafka.common.errors.CorruptRecordException: Record size is less > than the minimum record overhead (14) > [2017-06-12 09:59:50,722] INFO [kafka-log-cleaner-thread-0], Stopped > (kafka.log.LogCleaner) > {code} > Looking at the files I see that some are truncated and some are just empty: > $ ls -lsh 00594653.log > 0 -rw-r--r-- 1 user user 100M Jun 12 11:00 00594653.log > Sadly I no longer have the logs from the disk-full event itself. > I have three questions: > * What is the best way to clean this up? Deleting the old log files and > restarting the brokers? > * Why did Kafka not handle the disk-full event well? Is this only affecting > the cleanup or might we also lose data? > * Is this maybe caused by the combination of the upgrade and the full disk? > And last but not least: Keep up the good work. Kafka is really performing > well while being easy to administer and has good documentation! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
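The "minimum record overhead (14)" in the errors above is the size of an empty v0/v1 message (CRC 4 + magic 1 + attributes 1 + key length 4 + value length 4 bytes). A minimal Python sketch, assuming the pre-0.11 on-disk framing of an 8-byte offset plus a 4-byte size before each message, shows why the zero-filled tail of a preallocated segment parses as a record of size 0 and trips exactly this check:

```python
import struct

MIN_RECORD_OVERHEAD = 14  # crc(4) + magic(1) + attributes(1) + key len(4) + value len(4)
LOG_OVERHEAD = 12         # offset(8) + size(4) framing before each message

def scan_segment(data: bytes):
    """Yield (offset, size, ok) for each entry in a pre-0.11 log segment.

    Stops at the first entry whose declared size is below the minimum
    record overhead, e.g. the zero-filled tail of a preallocated file.
    """
    pos = 0
    while pos + LOG_OVERHEAD <= len(data):
        offset, size = struct.unpack_from(">qi", data, pos)
        if size < MIN_RECORD_OVERHEAD:
            yield (offset, size, False)   # corrupt entry / preallocated padding
            return
        yield (offset, size, True)
        pos += LOG_OVERHEAD + size

# One valid (empty) 14-byte message followed by a zero-filled "preallocated" tail:
segment = struct.pack(">qi", 0, 14) + b"\x00" * 14 + b"\x00" * 1024
entries = list(scan_segment(segment))
# The zero-filled tail decodes as offset 0, size 0 -> flagged as invalid.
```

This is only an illustration of the size check, not the broker's actual validation code; real segments carry CRCs and (in compacted topics) batching that this sketch ignores.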
[jira] [Issue Comment Deleted] (KAFKA-4628) Support KTable/GlobalKTable Joins
[ https://issues.apache.org/jira/browse/KAFKA-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] frank t updated KAFKA-4628: --- Comment: was deleted (was: Would it be possible to release just a patch for this? Maybe in an unstable release? Or a workaround? I am really facing a blocking problem that is not solvable without this: KTable join(final GlobalKTable globalTable, final KeyValueMapper keyMapper, final ValueJoiner joiner);) > Support KTable/GlobalKTable Joins > - > > Key: KAFKA-4628 > URL: https://issues.apache.org/jira/browse/KAFKA-4628 > Project: Kafka > Issue Type: Sub-task > Components: streams >Affects Versions: 0.10.2.0 >Reporter: Damian Guy > Fix For: 0.11.1.0 > > > In KIP-99 we have added support for GlobalKTables, however we don't currently > support KTable/GlobalKTable joins as they require materializing a state store > for the join. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-4628) Support KTable/GlobalKTable Joins
[ https://issues.apache.org/jira/browse/KAFKA-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085353#comment-16085353 ] frank t commented on KAFKA-4628: [~guozhang] Would it be possible to release just a patch for this? Or a workaround? I am really blocked without this: KTable join(final GlobalKTable globalTable, final KeyValueMapper keyMapper, final ValueJoiner joiner); > Support KTable/GlobalKTable Joins > - > > Key: KAFKA-4628 > URL: https://issues.apache.org/jira/browse/KAFKA-4628 > Project: Kafka > Issue Type: Sub-task > Components: streams >Affects Versions: 0.10.2.0 >Reporter: Damian Guy > Fix For: 0.11.1.0 > > > In KIP-99 we have added support for GlobalKTables, however we don't currently > support KTable/GlobalKTable joins as they require materializing a state store > for the join. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
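Since the requested join is not yet available, the semantics being asked for can be sketched in plain Python (all names here are hypothetical illustrations, not Kafka Streams API): each update on the left KTable is mapped to a lookup key into the globally replicated table and combined with a joiner.

```python
def ktable_globalktable_join(left, global_table, key_mapper, joiner):
    """Inner-join semantics of the requested KTable/GlobalKTable join:
    for each (key, value) in the left table, derive the global-table
    lookup key with key_mapper and combine matching values with joiner.
    """
    result = {}
    for k, v in left.items():
        gk = key_mapper(k, v)
        if gk in global_table:        # inner join: drop non-matching keys
            result[k] = joiner(v, global_table[gk])
    return result

orders = {"o1": {"customer": "c1", "qty": 2}}
customers = {"c1": "Alice"}           # global table, replicated to every instance
joined = ktable_globalktable_join(
    orders, customers,
    key_mapper=lambda k, v: v["customer"],
    joiner=lambda order, name: {**order, "customer_name": name},
)
```

The materialization problem the issue mentions is exactly the `result` dict here: a KTable join must keep this joined state (and update it on both sides' changes), which is what requires a state store.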
[jira] [Commented] (KAFKA-5585) Failover in a replicated Cluster does not work
[ https://issues.apache.org/jira/browse/KAFKA-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085330#comment-16085330 ] M. Manna commented on KAFKA-5585: - @huxih there is only one partition it seems. > Failover in a replicated Cluster does not work > -- > > Key: KAFKA-5585 > URL: https://issues.apache.org/jira/browse/KAFKA-5585 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.11.0.0 > Environment: Linux, Mac OSX >Reporter: Thomas Bayer > Attachments: broker_zookeeper_configs.zip, SimpleConsumer.java, > SimpleProducer.java, Stress Test Windows.xlsx, test_project_files.zip > > > Failover does not work in a cluster with 3 nodes and a replicated topic with > factor 3. > You can reproduce it as follows: Set up 3 Kafka nodes and 1 ZooKeeper. Then > create a topic with factor 3. Start a consumer. Stop a node. Write to the > topic. Now you get warnings that the client cannot connect to a broker. The > consumer does not receive any messages. > The same setup works like a charm with 0.10.2.1. > Broker Config: > {{broker.id=1 > listeners=PLAINTEXT://:9091 > log.dirs=cluster/logs/node-1 > broker.id=2 > listeners=PLAINTEXT://:9092 > log.dirs=cluster/logs/node-2 > broker.id=3 > listeners=PLAINTEXT://:9093 > log.dirs=cluster/logs/node-3}} > Rest of the config is from the distribution. > Producer and consumer config: see attached files > *Log Consumer:* > 2017-07-12 16:15:26 WARN ConsumerCoordinator:649 - Auto-commit of offsets > {produktion-0=OffsetAndMetadata{offset=10, metadata=''}} failed for group a: > Offset commit failed with a retriable exception. You should retry committing > offsets. The underlying error was: The coordinator is not available. > 2017-07-12 16:15:26 WARN NetworkClient:588 - Connection to node 2147483645 > could not be established. Broker may not be available. > 2017-07-12 16:15:26 WARN NetworkClient:588 - Connection to node 2 could not > be established. Broker may not be available. 
> *Log Producer:* > {{2017-07-12 16:15:32 WARN NetworkClient:588 - Connection to node -1 could > not be established. Broker may not be available.}} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5588) ConsoleConsumer: useless --new-consumer option
[ https://issues.apache.org/jira/browse/KAFKA-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085325#comment-16085325 ] ASF GitHub Bot commented on KAFKA-5588: --- GitHub user ppatierno opened a pull request: https://github.com/apache/kafka/pull/3524 KAFKA-5588: useless --new-consumer option Get rid of the --new-consumer option for the ConsoleConsumer You can merge this pull request into a Git repository by running: $ git pull https://github.com/ppatierno/kafka kafka-5588 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3524.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3524 commit f9309497551c7058696466c06755defddad6238c Author: ppatierno Date: 2017-07-13T07:53:16Z Get rid of the --new-consumer option for the ConsoleConsumer > ConsoleConsumer: useless --new-consumer option > -- > > Key: KAFKA-5588 > URL: https://issues.apache.org/jira/browse/KAFKA-5588 > Project: Kafka > Issue Type: Bug >Reporter: Paolo Patierno >Assignee: Paolo Patierno >Priority: Minor > > Hi, > it seems to me that the --new-consumer option on the ConsoleConsumer is > useless. > The useOldConsumer var is set when --zookeeper is specified on the command > line, but then the --bootstrap-server option (or --new-consumer) can't be > used. > If you use the --bootstrap-server option then the new consumer is used > automatically, so there is no need for --new-consumer. > It turns out that using the old or the new consumer just depends on passing > the --zookeeper or the --bootstrap-server option (which can't be used together, so I > can't use the new consumer connecting to ZooKeeper). > It's also clear when you use --zookeeper for the old consumer: the help > output says: > "Consider using the new consumer by passing [bootstrap-server] instead of > [zookeeper]" > I'm going to remove the --new-consumer option from the tool. > Thanks, > Paolo. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5588) ConsoleConsumer: useless --new-consumer option
[ https://issues.apache.org/jira/browse/KAFKA-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paolo Patierno updated KAFKA-5588: -- Description: Hi, it seems to me that the --new-consumer option on the ConsoleConsumer is useless. The useOldConsumer var is set when --zookeeper is specified on the command line, but then the --bootstrap-server option (or --new-consumer) can't be used. If you use the --bootstrap-server option then the new consumer is used automatically, so there is no need for --new-consumer. It turns out that using the old or the new consumer just depends on passing the --zookeeper or the --bootstrap-server option (which can't be used together, so I can't use the new consumer connecting to ZooKeeper). It's also clear when you use --zookeeper for the old consumer: the help output says: "Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper]" I'm going to remove the --new-consumer option from the tool. Thanks, Paolo. was: Hi, it seems to me that the --new-consumer option on the ConsoleConsumer is useless. The useOldConsumer var is set when --zookeeper is specified on the command line, but then the --bootstrap-server option (or --new-consumer) can't be used. If you use the --bootstrap-server option then the new consumer is used automatically, so there is no need for --new-consumer. It turns out that using the old or the new consumer just depends on passing the --zookeeper or the --bootstrap-server option (which can't be used together, so I can't use the new consumer connecting to ZooKeeper). I'm going to remove the --new-consumer option from the tool. Thanks, Paolo. > ConsoleConsumer: useless --new-consumer option > -- > > Key: KAFKA-5588 > URL: https://issues.apache.org/jira/browse/KAFKA-5588 > Project: Kafka > Issue Type: Bug >Reporter: Paolo Patierno >Assignee: Paolo Patierno >Priority: Minor > > Hi, > it seems to me that the --new-consumer option on the ConsoleConsumer is > useless. 
> The useOldConsumer var is set when --zookeeper is specified on the command > line, but then the --bootstrap-server option (or --new-consumer) can't be > used. > If you use the --bootstrap-server option then the new consumer is used > automatically, so there is no need for --new-consumer. > It turns out that using the old or the new consumer just depends on passing > the --zookeeper or the --bootstrap-server option (which can't be used together, so I > can't use the new consumer connecting to ZooKeeper). > It's also clear when you use --zookeeper for the old consumer: the help > output says: > "Consider using the new consumer by passing [bootstrap-server] instead of > [zookeeper]" > I'm going to remove the --new-consumer option from the tool. > Thanks, > Paolo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5588) ConsoleConsumer: useless --new-consumer option
Paolo Patierno created KAFKA-5588: - Summary: ConsoleConsumer: useless --new-consumer option Key: KAFKA-5588 URL: https://issues.apache.org/jira/browse/KAFKA-5588 Project: Kafka Issue Type: Bug Reporter: Paolo Patierno Assignee: Paolo Patierno Priority: Minor Hi, it seems to me that the --new-consumer option on the ConsoleConsumer is useless. The useOldConsumer var is set when --zookeeper is specified on the command line, but then the --bootstrap-server option (or --new-consumer) can't be used. If you use the --bootstrap-server option then the new consumer is used automatically, so there is no need for --new-consumer. It turns out that using the old or the new consumer just depends on passing the --zookeeper or the --bootstrap-server option (which can't be used together, so I can't use the new consumer connecting to ZooKeeper). I'm going to remove the --new-consumer option from the tool. Thanks, Paolo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
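The option rules described above can be sketched as a small validation function. This is a simplified, hypothetical model of the behavior the report describes, not the actual ConsoleConsumer code:

```python
def resolve_consumer(zookeeper=None, bootstrap_server=None, new_consumer=False):
    """Simplified model of the option rules described in KAFKA-5588:
    --zookeeper selects the old consumer, --bootstrap-server the new one,
    they are mutually exclusive, and --new-consumer adds no information.
    """
    if zookeeper and bootstrap_server:
        raise ValueError("--zookeeper and --bootstrap-server can't be used together")
    if bootstrap_server:
        return "new"          # --new-consumer is implied here, hence useless
    if zookeeper:
        if new_consumer:
            # New consumer can't connect via ZooKeeper.
            raise ValueError("can't use the new consumer with --zookeeper")
        return "old"
    raise ValueError("one of --zookeeper or --bootstrap-server is required")
```

Since the outcome is fully determined by which connection option is passed, the `new_consumer` flag never changes the result, which is the report's point.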
[jira] [Commented] (KAFKA-5585) Failover in a replicated Cluster does not work
[ https://issues.apache.org/jira/browse/KAFKA-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085285#comment-16085285 ] Thomas Bayer commented on KAFKA-5585: - [~huxi_2b] Leaders are available for all partitions. With the same configuration files, same producer and same consumer it works with 0.10.2.1. Then I stop the cluster, clean the logs and ZooKeeper folder, start it with the same config files with the 0.11.0.0 version and it didn't work. OS: Ubuntu 16.04 LTS and OS X 10.12.5 > Failover in a replicated Cluster does not work > -- > > Key: KAFKA-5585 > URL: https://issues.apache.org/jira/browse/KAFKA-5585 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.11.0.0 > Environment: Linux, Mac OSX >Reporter: Thomas Bayer > Attachments: broker_zookeeper_configs.zip, SimpleConsumer.java, > SimpleProducer.java, Stress Test Windows.xlsx, test_project_files.zip > > > Failover does not work in a cluster with 3 nodes and a replicated topic with > factor 3. > You can reproduce it as follows: Set up 3 Kafka nodes and 1 ZooKeeper. Then > create a topic with factor 3. Start a consumer. Stop a node. Write to the > topic. Now you get warnings that the client cannot connect to a broker. The > consumer does not receive any messages. > The same setup works like a charm with 0.10.2.1. > Broker Config: > {{broker.id=1 > listeners=PLAINTEXT://:9091 > log.dirs=cluster/logs/node-1 > broker.id=2 > listeners=PLAINTEXT://:9092 > log.dirs=cluster/logs/node-2 > broker.id=3 > listeners=PLAINTEXT://:9093 > log.dirs=cluster/logs/node-3}} > Rest of the config is from the distribution. > Producer and consumer config: see attached files > *Log Consumer:* > 2017-07-12 16:15:26 WARN ConsumerCoordinator:649 - Auto-commit of offsets > {produktion-0=OffsetAndMetadata{offset=10, metadata=''}} failed for group a: > Offset commit failed with a retriable exception. You should retry committing > offsets. 
The underlying error was: The coordinator is not available. > 2017-07-12 16:15:26 WARN NetworkClient:588 - Connection to node 2147483645 > could not be established. Broker may not be available. > 2017-07-12 16:15:26 WARN NetworkClient:588 - Connection to node 2 could not > be established. Broker may not be available. > *Log Producer:* > {{2017-07-12 16:15:32 WARN NetworkClient:588 - Connection to node -1 could > not be established. Broker may not be available.}} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
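A note on the odd node id in the consumer log: if the usual Java-client convention holds, coordinator connections use a synthetic node id of Integer.MAX_VALUE minus the broker id, so "node 2147483645" would be the group coordinator hosted on broker 2, the same broker the next warning says is unreachable. A small Python sketch of that decoding and of coordinator-partition selection (Python's hash() stands in for Java's String.hashCode, so the partition numbers are illustrative only):

```python
INT_MAX = 2**31 - 1  # Java Integer.MAX_VALUE

def coordinator_node_id(broker_id):
    # The Java consumer talks to the group coordinator through a synthetic
    # connection id of Integer.MAX_VALUE - broker_id, which is why the log
    # shows "node 2147483645" when broker 2 hosts the coordinator.
    return INT_MAX - broker_id

def offsets_partition(group_id, num_offsets_partitions=50):
    # The coordinator for a group is the leader of the __consumer_offsets
    # partition selected by hashing the group id (illustrative stand-in).
    return abs(hash(group_id)) % num_offsets_partitions

decoded_broker = INT_MAX - 2147483645  # -> broker 2, the stopped node
```

Under that reading, the consumer is stuck because the coordinator's __consumer_offsets partition did not fail over to another broker, which matches the "coordinator is not available" auto-commit failures above.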
[jira] [Created] (KAFKA-5587) Processor got uncaught exception: NullPointerException
Dan created KAFKA-5587: -- Summary: Processor got uncaught exception: NullPointerException Key: KAFKA-5587 URL: https://issues.apache.org/jira/browse/KAFKA-5587 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.10.1.1 Reporter: Dan [2017-07-12 21:56:39,964] ERROR Processor got uncaught exception. (kafka.network.Processor) java.lang.NullPointerException at kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:490) at kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:487) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at kafka.network.Processor.processCompletedReceives(SocketServer.scala:487) at kafka.network.Processor.run(SocketServer.scala:417) at java.lang.Thread.run(Thread.java:745) Anyone knows the cause of this exception? What's the effect of it? When this exception occurred, the log also showed that the broker was frequently shrinking ISR to itself. Are these two things interrelated? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
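Per the analysis of this issue, the NPE occurs when a completed receive is delivered for a channel that idle-connection expiry has already removed, so the channel lookup in processCompletedReceives returns null. A toy Python model of that race (names are illustrative, not the SocketServer code), with a guarded variant that drops the stale receive:

```python
# Toy model of the KAFKA-5587 race: a receive staged on a muted channel is
# completed after idle-connection expiry removed the channel, so looking the
# channel up yields None and dereferencing it fails.

channels = {"conn-1": {"id": "conn-1", "muted": True}}
completed_receives = [("conn-1", b"payload")]

# Idle-connection expiry removes the channel while its receive is staged:
channels.pop("conn-1")

def process_unguarded(source, payload):
    channel = channels.get(source)
    return channel["id"]  # None is not subscriptable -> the NPE analogue

def process_guarded(source, payload):
    channel = channels.get(source)
    if channel is None:
        return None  # channel already closed; drop the stale receive
    return channel["id"]

results = [process_guarded(src, p) for src, p in completed_receives]
```

The guarded version is only a sketch of the null-check idea, not the actual fix applied to SocketServer.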