[jira] [Created] (KAFKA-5590) Deleting Kafka Topic Fails to Complete After Enabling Ranger Kafka Plugin

2017-07-13 Thread Chaofeng Zhao (JIRA)
Chaofeng Zhao created KAFKA-5590:


 Summary: Deleting Kafka Topic Fails to Complete After Enabling Ranger 
Kafka Plugin
 Key: KAFKA-5590
 URL: https://issues.apache.org/jira/browse/KAFKA-5590
 Project: Kafka
  Issue Type: Bug
  Components: security
Affects Versions: 0.10.0.0
 Environment: Kafka and Ranger under Ambari
Reporter: Chaofeng Zhao


Hi,
Recently I have been developing some applications with Kafka under Ranger. But 
when I enable the Ranger Kafka plugin I cannot delete a Kafka topic completely, 
even though 'delete.topic.enable=true' is set. I found that with the Ranger 
Kafka plugin enabled, the operation must be authorized. How can I delete a 
Kafka topic completely under Ranger? Thank you.
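
For reference, topic deletion itself is issued with the standard admin tool (the 
broker address and topic name below are examples, not taken from this report); 
with the Ranger plugin enabled, the principal running it must additionally be 
granted the relevant topic permissions by a Ranger policy:

{code}
$ bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic my-topic
{code}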






[jira] [Commented] (KAFKA-5587) Processor got uncaught exception: NullPointerException

2017-07-13 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086391#comment-16086391
 ] 

Rajini Sivaram commented on KAFKA-5587:
---

When a cluster is under load, delays in processing (e.g. produce requests) can 
cause connections to be muted for a long time, resulting in channel expiry. If 
there are staged receives corresponding to an expired channel, a completed 
receive may be delivered after the channel is removed from the channel list.
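
A minimal sketch of the defensive handling this implies (hypothetical names, not 
the actual SocketServer code): re-resolve the channel when a completed receive is 
processed, and drop the receive if the channel has already been removed:

{code}
import scala.collection.mutable

object StagedReceiveRace extends App {
  final case class Receive(channelId: String, payload: String)

  // "conn-2" expired and was removed before its staged receive was delivered.
  val channels = mutable.Map("conn-1" -> "open-channel")
  val completedReceives = Seq(Receive("conn-1", "req-A"), Receive("conn-2", "req-B"))

  completedReceives.foreach { receive =>
    channels.get(receive.channelId) match {
      case Some(channel) => println(s"processing ${receive.payload} on $channel")
      case None          => println(s"dropping ${receive.payload}: channel already removed")
    }
  }
}
{code}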

> Processor got uncaught exception: NullPointerException
> --
>
> Key: KAFKA-5587
> URL: https://issues.apache.org/jira/browse/KAFKA-5587
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.1
>Reporter: Dan
>Assignee: Rajini Sivaram
>
> [2017-07-12 21:56:39,964] ERROR Processor got uncaught exception. 
> (kafka.network.Processor)
> java.lang.NullPointerException
> at 
> kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:490)
> at 
> kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:487)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at 
> kafka.network.Processor.processCompletedReceives(SocketServer.scala:487)
> at kafka.network.Processor.run(SocketServer.scala:417)
> at java.lang.Thread.run(Thread.java:745)
> Does anyone know the cause of this exception? What is its effect?
> When this exception occurred, the log also showed that the broker was 
> frequently shrinking the ISR to itself. Are these two things related?





[jira] [Assigned] (KAFKA-5587) Processor got uncaught exception: NullPointerException

2017-07-13 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram reassigned KAFKA-5587:
-

Assignee: Rajini Sivaram

> Processor got uncaught exception: NullPointerException
> --
>
> Key: KAFKA-5587
> URL: https://issues.apache.org/jira/browse/KAFKA-5587
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.1
>Reporter: Dan
>Assignee: Rajini Sivaram
>
> [2017-07-12 21:56:39,964] ERROR Processor got uncaught exception. 
> (kafka.network.Processor)
> java.lang.NullPointerException
> at 
> kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:490)
> at 
> kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:487)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at 
> kafka.network.Processor.processCompletedReceives(SocketServer.scala:487)
> at kafka.network.Processor.run(SocketServer.scala:417)
> at java.lang.Thread.run(Thread.java:745)
> Does anyone know the cause of this exception? What is its effect?
> When this exception occurred, the log also showed that the broker was 
> frequently shrinking the ISR to itself. Are these two things related?





[jira] [Commented] (KAFKA-5588) ConsoleConsumer: useless --new-consumer option

2017-07-13 Thread Paolo Patierno (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086070#comment-16086070
 ] 

Paolo Patierno commented on KAFKA-5588:
---

[~vahid] thanks for the clarification. Where could we check to be sure about 
that? By asking on the dev list?

> ConsoleConsumer: useless --new-consumer option
> --
>
> Key: KAFKA-5588
> URL: https://issues.apache.org/jira/browse/KAFKA-5588
> Project: Kafka
>  Issue Type: Bug
>Reporter: Paolo Patierno
>Assignee: Paolo Patierno
>Priority: Minor
>
> Hi,
> it seems to me that the --new-consumer option on the ConsoleConsumer is 
> useless.
> The useOldConsumer var is tied to specifying --zookeeper on the command line, 
> in which case the --bootstrap-server option (or --new-consumer) can't be used.
> If you use the --bootstrap-server option then the new consumer is used 
> automatically, so there is no need for --new-consumer.
> It turns out that using the old or new consumer just depends on passing the 
> --zookeeper or the --bootstrap-server option (they can't be used together, so I 
> can't use the new consumer connecting to ZooKeeper).
> It's also clear when you use --zookeeper for the old consumer; the help output 
> says:
> "Consider using the new consumer by passing [bootstrap-server] instead of 
> [zookeeper]"
> I'm going to remove the --new-consumer option from the tool.
> Thanks,
> Paolo.
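
For illustration, the two invocation styles contrasted in the description (host, 
port, and topic are assumed values):

{code}
$ bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test          # old consumer
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test   # new consumer
{code}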





[jira] [Commented] (KAFKA-5588) ConsoleConsumer: useless --new-consumer option

2017-07-13 Thread Vahid Hashemian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086040#comment-16086040
 ] 

Vahid Hashemian commented on KAFKA-5588:


If I'm not mistaken, this requires a KIP due to the potential impact on 
existing users.

> ConsoleConsumer: useless --new-consumer option
> --
>
> Key: KAFKA-5588
> URL: https://issues.apache.org/jira/browse/KAFKA-5588
> Project: Kafka
>  Issue Type: Bug
>Reporter: Paolo Patierno
>Assignee: Paolo Patierno
>Priority: Minor
>
> Hi,
> it seems to me that the --new-consumer option on the ConsoleConsumer is 
> useless.
> The useOldConsumer var is tied to specifying --zookeeper on the command line, 
> in which case the --bootstrap-server option (or --new-consumer) can't be used.
> If you use the --bootstrap-server option then the new consumer is used 
> automatically, so there is no need for --new-consumer.
> It turns out that using the old or new consumer just depends on passing the 
> --zookeeper or the --bootstrap-server option (they can't be used together, so I 
> can't use the new consumer connecting to ZooKeeper).
> It's also clear when you use --zookeeper for the old consumer; the help output 
> says:
> "Consider using the new consumer by passing [bootstrap-server] instead of 
> [zookeeper]"
> I'm going to remove the --new-consumer option from the tool.
> Thanks,
> Paolo.





[jira] [Resolved] (KAFKA-5589) Bump dependency of Kafka 0.10.x to the latest one

2017-07-13 Thread Piotr Nowojski (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski resolved KAFKA-5589.
---
Resolution: Not A Problem

Sorry, I created this issue in the wrong project.

> Bump dependency of Kafka 0.10.x to the latest one
> -
>
> Key: KAFKA-5589
> URL: https://issues.apache.org/jira/browse/KAFKA-5589
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Piotr Nowojski
>
> We are using a pretty old Kafka version for 0.10. Besides the bug fixes and 
> improvements made between 0.10.0.1 and 0.10.2.1, the 0.10.2.1 version is also 
> more similar to 0.11.0.





[jira] [Created] (KAFKA-5589) Bump dependency of Kafka 0.10.x to the latest one

2017-07-13 Thread Piotr Nowojski (JIRA)
Piotr Nowojski created KAFKA-5589:
-

 Summary: Bump dependency of Kafka 0.10.x to the latest one
 Key: KAFKA-5589
 URL: https://issues.apache.org/jira/browse/KAFKA-5589
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Reporter: Piotr Nowojski


We are using a pretty old Kafka version for 0.10. Besides the bug fixes and 
improvements made between 0.10.0.1 and 0.10.2.1, the 0.10.2.1 version is also 
more similar to 0.11.0.





[jira] [Commented] (KAFKA-5584) Incorrect log size for topics larger than 2 GB

2017-07-13 Thread Michal Borowiecki (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085542#comment-16085542
 ] 

Michal Borowiecki commented on KAFKA-5584:
--

Was this a regression in 0.11.0.0 specifically?

> Incorrect log size for topics larger than 2 GB
> --
>
> Key: KAFKA-5584
> URL: https://issues.apache.org/jira/browse/KAFKA-5584
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Reporter: Gregor Uhlenheuer
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.1
>
> Attachments: Screen Shot 2017-07-12 at 09.10.53.png
>
>
> The {{size}} of a {{Log}} is calculated incorrectly due to an Integer 
> overflow. For large topics (larger than 2 GB) this value overflows.
> This is easily observable in the reported metrics values of the path 
> {{log.Log.partition.*.topic..Size}} (see attached screenshot).
> Moreover I think this breaks the size-based retention (via 
> {{log.retention.bytes}} and {{retention.bytes}}) of large topics as well.
> I am not sure about the recommended workflow; should I open a pull request on 
> GitHub with a fix?
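
A toy illustration of the overflow (segment sizes assumed; this is not Kafka's 
code): summing per-segment sizes as Int wraps around once the total passes 
Int.MaxValue, while accumulating into a Long yields the true size:

{code}
object LogSizeOverflow extends App {
  val segmentSizes = Seq.fill(3)(1024 * 1024 * 1024) // three 1 GB segments

  val asInt: Int   = segmentSizes.sum               // wraps: prints -1073741824
  val asLong: Long = segmentSizes.map(_.toLong).sum // correct: 3221225472

  println(s"Int sum:  $asInt")
  println(s"Long sum: $asLong")
}
{code}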





[jira] [Commented] (KAFKA-3239) Timing issue in controller metrics on topic delete

2017-07-13 Thread Mickael Maison (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085523#comment-16085523
 ] 

Mickael Maison commented on KAFKA-3239:
---

We are still seeing this with 0.10.2.1:
{quote}
[2017-04-19 14:56:26,251] ERROR Error printing regular metrics: 
(com.airbnb.metrics.StatsDReporter)
java.util.NoSuchElementException: key not found: [TOPIC_NAME_REDACTED,0]
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:59)
at scala.collection.mutable.HashMap.apply(HashMap.scala:65)
at 
kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:210)
at 
kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:208)
at 
scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:114)
at 
scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:113)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at 
scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
at scala.collection.TraversableOnce$class.count(TraversableOnce.scala:113)
at scala.collection.AbstractTraversable.count(Traversable.scala:104)
at 
kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply$mcI$sp(KafkaController.scala:208)
at 
kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply(KafkaController.scala:205)
at 
kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply(KafkaController.scala:205)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
at kafka.controller.KafkaController$$anon$3.value(KafkaController.scala:204)
at kafka.controller.KafkaController$$anon$3.value(KafkaController.scala:202)
at com.airbnb.metrics.StatsDReporter.processGauge(StatsDReporter.java:163)
at com.airbnb.metrics.StatsDReporter.processGauge(StatsDReporter.java:37)
at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28)
at com.airbnb.metrics.StatsDReporter.sendAMetric(StatsDReporter.java:131)
at 
com.airbnb.metrics.StatsDReporter.sendAllKafkaMetrics(StatsDReporter.java:119)
at com.airbnb.metrics.StatsDReporter.run(StatsDReporter.java:85)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:483)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:316)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:190)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1157)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627)
at java.lang.Thread.run(Thread.java:809)
{quote}

The topic was stuck in the "Marked for deletion" state. Such topics don't have 
leaders, so the issue is probably with:
{quote}
controllerContext.partitionLeadershipInfo(topicPartition)
{quote} 

We're trying out a fix at the moment; if that works I'll send a PR.
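
A hypothetical illustration of that failure mode (names invented; not the 
controller code): applying a Scala mutable HashMap to a missing key throws 
NoSuchElementException, whereas a .get lookup lets the gauge skip partitions 
that have no leadership info:

{code}
import scala.collection.mutable

object MissingLeaderLookup extends App {
  val partitionLeadershipInfo = mutable.HashMap(("live-topic", 0) -> "broker-1")
  val deleted = ("deleted-topic", 0) // marked for deletion, so no leader entry

  // partitionLeadershipInfo(deleted) would throw java.util.NoSuchElementException
  partitionLeadershipInfo.get(deleted) match {
    case Some(leader) => println(s"leader: $leader")
    case None         => println("no leadership info; skip this partition")
  }
}
{code}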

> Timing issue in controller metrics on topic delete
> --
>
> Key: KAFKA-3239
> URL: https://issues.apache.org/jira/browse/KAFKA-3239
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.9.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>
> Noticed this exception in our logs:
> {quote}
> java.util.NoSuchElementException: key not found: [sometopic,0]
> at scala.collection.MapLike$class.default(MapLike.scala:228)
> at scala.collection.AbstractMap.default(Map.scala:59)
> at scala.collection.mutable.HashMap.apply(HashMap.scala:65)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:209)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:208)
> at 
> scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:114)
> at 
> scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:113)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
> at 

[jira] [Commented] (KAFKA-5585) Failover in a replicated Cluster does not work

2017-07-13 Thread Thomas Bayer (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085411#comment-16085411
 ] 

Thomas Bayer commented on KAFKA-5585:
-

[~huxi_2b] I did that with a list of all brokers and with a single broker that 
was working.

> Failover in a replicated Cluster does not work
> --
>
> Key: KAFKA-5585
> URL: https://issues.apache.org/jira/browse/KAFKA-5585
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.11.0.0
> Environment: Linux, Mac OSX
>Reporter: Thomas Bayer
> Attachments: broker_zookeeper_configs.zip, SimpleConsumer.java, 
> SimpleProducer.java, Stress Test Windows.xlsx, test_project_files.zip
>
>
> Failover does not work in a cluster with 3 nodes and a replicated topic with 
> factor 3.
> You can reproduce it as follows: set up 3 Kafka nodes and 1 ZooKeeper. Then 
> create a topic with replication factor 3. Start a consumer. Stop a node. Write 
> to the topic. Now you get warnings that the client cannot connect to a broker. 
> The consumer does not receive any messages.
> The same setup works like a charm with 0.10.2.1.
> Broker Config:
> {{broker.id=1
> listeners=PLAINTEXT://:9091
> log.dirs=cluster/logs/node-1
> broker.id=2
> listeners=PLAINTEXT://:9092
> log.dirs=cluster/logs/node-2
> broker.id=3
> listeners=PLAINTEXT://:9093
> log.dirs=cluster/logs/node-3}}
> Rest of the config is from the distribution.
> Producer and consumer config: see attached files
> *Log Consumer:*
> 2017-07-12 16:15:26 WARN  ConsumerCoordinator:649 - Auto-commit of offsets 
> {produktion-0=OffsetAndMetadata{offset=10, metadata=''}} failed for group a: 
> Offset commit failed with a retriable exception. You should retry committing 
> offsets. The underlying error was: The coordinator is not available.
> 2017-07-12 16:15:26 WARN  NetworkClient:588 - Connection to node 2147483645 
> could not be established. Broker may not be available.
> 2017-07-12 16:15:26 WARN  NetworkClient:588 - Connection to node 2 could not 
> be established. Broker may not be available.
> *Log Producer:*
> {{2017-07-12 16:15:32 WARN  NetworkClient:588 - Connection to node -1 could 
> not be established. Broker may not be available.}}





[jira] [Commented] (KAFKA-5585) Failover in a replicated Cluster does not work

2017-07-13 Thread huxihx (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085403#comment-16085403
 ] 

huxihx commented on KAFKA-5585:
---

[~tubayer] The producer complains that it failed to connect to node 2, whose 
port should be 9092. In the producer properties you specify `bootstrap.servers` 
as "localhost:9092", which means it connects to node 2 only. Could you specify 
it as "localhost:9091,localhost:9092,localhost:9093" and retry?
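
Sketched in code (serializer choices assumed; the ports come from the broker 
configs quoted below), the suggested setting would look like:

{code}
import java.util.Properties
import org.apache.kafka.clients.producer.KafkaProducer

val props = new Properties()
// List all three brokers so the client can fail over when one node is down.
props.put("bootstrap.servers", "localhost:9091,localhost:9092,localhost:9093")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

val producer = new KafkaProducer[String, String](props)
{code}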

> Failover in a replicated Cluster does not work
> --
>
> Key: KAFKA-5585
> URL: https://issues.apache.org/jira/browse/KAFKA-5585
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.11.0.0
> Environment: Linux, Mac OSX
>Reporter: Thomas Bayer
> Attachments: broker_zookeeper_configs.zip, SimpleConsumer.java, 
> SimpleProducer.java, Stress Test Windows.xlsx, test_project_files.zip
>
>
> Failover does not work in a cluster with 3 nodes and a replicated topic with 
> factor 3.
> You can reproduce it as follows: set up 3 Kafka nodes and 1 ZooKeeper. Then 
> create a topic with replication factor 3. Start a consumer. Stop a node. Write 
> to the topic. Now you get warnings that the client cannot connect to a broker. 
> The consumer does not receive any messages.
> The same setup works like a charm with 0.10.2.1.
> Broker Config:
> {{broker.id=1
> listeners=PLAINTEXT://:9091
> log.dirs=cluster/logs/node-1
> broker.id=2
> listeners=PLAINTEXT://:9092
> log.dirs=cluster/logs/node-2
> broker.id=3
> listeners=PLAINTEXT://:9093
> log.dirs=cluster/logs/node-3}}
> Rest of the config is from the distribution.
> Producer and consumer config: see attached files
> *Log Consumer:*
> 2017-07-12 16:15:26 WARN  ConsumerCoordinator:649 - Auto-commit of offsets 
> {produktion-0=OffsetAndMetadata{offset=10, metadata=''}} failed for group a: 
> Offset commit failed with a retriable exception. You should retry committing 
> offsets. The underlying error was: The coordinator is not available.
> 2017-07-12 16:15:26 WARN  NetworkClient:588 - Connection to node 2147483645 
> could not be established. Broker may not be available.
> 2017-07-12 16:15:26 WARN  NetworkClient:588 - Connection to node 2 could not 
> be established. Broker may not be available.
> *Log Producer:*
> {{2017-07-12 16:15:32 WARN  NetworkClient:588 - Connection to node -1 could 
> not be established. Broker may not be available.}}





[jira] [Assigned] (KAFKA-5431) LogCleaner stopped due to org.apache.kafka.common.errors.CorruptRecordException

2017-07-13 Thread huxihx (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huxihx reassigned KAFKA-5431:
-

Assignee: huxihx

> LogCleaner stopped due to 
> org.apache.kafka.common.errors.CorruptRecordException
> ---
>
> Key: KAFKA-5431
> URL: https://issues.apache.org/jira/browse/KAFKA-5431
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.1
>Reporter: Carsten Rietz
>Assignee: huxihx
>  Labels: reliability
> Fix For: 0.11.0.1
>
>
> Hey all,
> I have a strange problem with our UAT cluster of 3 Kafka brokers.
> The __consumer_offsets topic was replicated to two instances and our disks 
> ran full due to a wrong configuration of the log cleaner. We fixed the 
> configuration and updated from 0.10.1.1 to 0.10.2.1.
> Today I increased the replication of the __consumer_offsets topic to 3 and 
> triggered replication to the third instance via kafka-reassign-partitions.sh. 
> That went well, but I get many errors like:
> {code}
> [2017-06-12 09:59:50,342] ERROR Found invalid messages during fetch for 
> partition [__consumer_offsets,18] offset 0 error Record size is less than the 
> minimum record overhead (14) (kafka.server.ReplicaFetcherThread)
> [2017-06-12 09:59:50,342] ERROR Found invalid messages during fetch for 
> partition [__consumer_offsets,24] offset 0 error Record size is less than the 
> minimum record overhead (14) (kafka.server.ReplicaFetcherThread)
> {code}
> I think these are due to the disk-full event.
> The log cleaner threads died on these corrupt messages:
> {code}
> [2017-06-12 09:59:50,722] ERROR [kafka-log-cleaner-thread-0], Error due to  
> (kafka.log.LogCleaner)
> org.apache.kafka.common.errors.CorruptRecordException: Record size is less 
> than the minimum record overhead (14)
> [2017-06-12 09:59:50,722] INFO [kafka-log-cleaner-thread-0], Stopped  
> (kafka.log.LogCleaner)
> {code}
> Looking at the files I see that some are truncated and some are just empty:
> $ ls -lsh 00594653.log
> 0 -rw-r--r-- 1 user user 100M Jun 12 11:00 00594653.log
> Sadly I no longer have the logs from the disk-full event itself.
> I have three questions:
> * What is the best way to clean this up? Deleting the old log files and 
> restarting the brokers?
> * Why did Kafka not handle the disk-full event well? Is this only affecting 
> the cleanup, or may we also lose data?
> * Is this maybe caused by the combination of upgrade and disk full?
> And last but not least: Keep up the good work. Kafka is really performing 
> well while being easy to administer and has good documentation!





[jira] [Comment Edited] (KAFKA-5431) LogCleaner stopped due to org.apache.kafka.common.errors.CorruptRecordException

2017-07-13 Thread huxihx (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085357#comment-16085357
 ] 

huxihx edited comment on KAFKA-5431 at 7/13/17 8:41 AM:


Seems this only happens when preallocate is enabled and topic is configured 
with 'compact'. 

I think only one tiny code change could solve both this issue and 
[KAFKA-5582|https://issues.apache.org/jira/browse/KAFKA-5582].


was (Author: huxi_2b):
Seems this only happens when preallocate is enabled and topic is configured 
with 'compact'. When 

I think only one tiny code change could solve both this issue and 
[KAFKA-5582|https://issues.apache.org/jira/browse/KAFKA-5582].

> LogCleaner stopped due to 
> org.apache.kafka.common.errors.CorruptRecordException
> ---
>
> Key: KAFKA-5431
> URL: https://issues.apache.org/jira/browse/KAFKA-5431
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.1
>Reporter: Carsten Rietz
>  Labels: reliability
> Fix For: 0.11.0.1
>
>
> Hey all,
> I have a strange problem with our UAT cluster of 3 Kafka brokers.
> The __consumer_offsets topic was replicated to two instances and our disks 
> ran full due to a wrong configuration of the log cleaner. We fixed the 
> configuration and updated from 0.10.1.1 to 0.10.2.1.
> Today I increased the replication of the __consumer_offsets topic to 3 and 
> triggered replication to the third instance via kafka-reassign-partitions.sh. 
> That went well, but I get many errors like:
> {code}
> [2017-06-12 09:59:50,342] ERROR Found invalid messages during fetch for 
> partition [__consumer_offsets,18] offset 0 error Record size is less than the 
> minimum record overhead (14) (kafka.server.ReplicaFetcherThread)
> [2017-06-12 09:59:50,342] ERROR Found invalid messages during fetch for 
> partition [__consumer_offsets,24] offset 0 error Record size is less than the 
> minimum record overhead (14) (kafka.server.ReplicaFetcherThread)
> {code}
> I think these are due to the disk-full event.
> The log cleaner threads died on these corrupt messages:
> {code}
> [2017-06-12 09:59:50,722] ERROR [kafka-log-cleaner-thread-0], Error due to  
> (kafka.log.LogCleaner)
> org.apache.kafka.common.errors.CorruptRecordException: Record size is less 
> than the minimum record overhead (14)
> [2017-06-12 09:59:50,722] INFO [kafka-log-cleaner-thread-0], Stopped  
> (kafka.log.LogCleaner)
> {code}
> Looking at the files I see that some are truncated and some are just empty:
> $ ls -lsh 00594653.log
> 0 -rw-r--r-- 1 user user 100M Jun 12 11:00 00594653.log
> Sadly I no longer have the logs from the disk-full event itself.
> I have three questions:
> * What is the best way to clean this up? Deleting the old log files and 
> restarting the brokers?
> * Why did Kafka not handle the disk-full event well? Is this only affecting 
> the cleanup, or may we also lose data?
> * Is this maybe caused by the combination of upgrade and disk full?
> And last but not least: Keep up the good work. Kafka is really performing 
> well while being easy to administer and has good documentation!





[jira] [Commented] (KAFKA-5582) Log compaction with preallocation enabled does not trim segments

2017-07-13 Thread huxihx (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085382#comment-16085382
 ] 

huxihx commented on KAFKA-5582:
---

It seems that in LogCleaner, `cleanSegments` should not set the length of the 
cleanable log segment files.

> Log compaction with preallocation enabled does not trim segments
> 
>
> Key: KAFKA-5582
> URL: https://issues.apache.org/jira/browse/KAFKA-5582
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.1
> Environment: Linux, Windows
>Reporter: Jason Aliyetti
>
> Unexpected behavior occurs when a topic is configured to preallocate files 
> and has a retention policy of compact.
> When log compaction runs, the cleaner attempts to gather groups of segments 
> to consolidate based on the max segment size.  
> When preallocation is enabled all segments are that size and thus each 
> individual segment is considered for compaction.
> When compaction does occur, the resulting cleaned file is sized based on that 
> same configuration.  This means that you can have very large files on disk 
> that contain little or no data, which partly defeats the point of compacting. 
> The log cleaner should trim these segments.  That way they would free up disk 
> space and could be further compacted on subsequent runs.
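
A minimal sketch of that trimming (file name and write position invented; this 
is not the LogCleaner implementation): truncating the cleaned segment's 
FileChannel to the bytes actually written releases the preallocated tail:

{code}
import java.nio.channels.FileChannel
import java.nio.file.{Paths, StandardOpenOption}

object TrimSegment extends App {
  val path = Paths.get("cleaned-segment.log") // hypothetical cleaned segment file
  val channel = FileChannel.open(path, StandardOpenOption.CREATE, StandardOpenOption.WRITE)

  val bytesWritten = 4096L       // suppose this much real data survived compaction
  channel.truncate(bytesWritten) // drop the preallocated tail beyond the data
  channel.close()
}
{code}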





[jira] [Commented] (KAFKA-5431) LogCleaner stopped due to org.apache.kafka.common.errors.CorruptRecordException

2017-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085380#comment-16085380
 ] 

ASF GitHub Bot commented on KAFKA-5431:
---

GitHub user huxihx opened a pull request:

https://github.com/apache/kafka/pull/3525

KAFKA-5431: cleanSegments should not set length for cleanable segment files

For a compacted topic with preallocate enabled, during log cleaning, 
LogCleaner.cleanSegments does not have to pre-allocate the underlying file size 
since we only want to store the cleaned data in the file.

It's believed that this fix should also solve KAFKA-5582.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/huxihx/kafka log_compact_test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3525.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3525


commit e14436a2abb25c5b324efba5e431e5e1afb6e05a
Author: huxihx 
Date:   2017-07-13T08:28:50Z

KAFKA-5431: LogCleaner stopped due to 
org.apache.kafka.common.errors.CorruptRecordException

For a compacted topic with preallocate enabled, during log cleaning, 
LogCleaner.cleanSegments does not have to pre-allocate the underlying file size 
since we only want to store the cleaned data in the file.

It's believed that this fix should also solve KAFKA-5582.
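
As a hedged illustration of the difference (file names invented; this is not the 
patch itself): a destination file whose length is set up front stays at the full 
segment size, while one that merely grows ends up only as large as the cleaned 
data:

{code}
import java.io.RandomAccessFile

object PreallocationContrast extends App {
  // Pre-allocating: reserve the full max segment size regardless of content.
  val preallocated = new RandomAccessFile("cleaned-preallocated.log", "rw")
  preallocated.setLength(100L * 1024 * 1024) // 100 MB on disk, mostly empty
  preallocated.close()

  // Without pre-allocation: the file grows only as cleaned records are written.
  val growOnly = new RandomAccessFile("cleaned-grow-only.log", "rw")
  growOnly.write("surviving records".getBytes("UTF-8"))
  growOnly.close()
}
{code}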




> LogCleaner stopped due to 
> org.apache.kafka.common.errors.CorruptRecordException
> ---
>
> Key: KAFKA-5431
> URL: https://issues.apache.org/jira/browse/KAFKA-5431
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.1
>Reporter: Carsten Rietz
>  Labels: reliability
> Fix For: 0.11.0.1
>
>
> Hey all,
> I have a strange problem with our UAT cluster of 3 Kafka brokers.
> The __consumer_offsets topic was replicated to two instances and our disks 
> ran full due to a wrong configuration of the log cleaner. We fixed the 
> configuration and updated from 0.10.1.1 to 0.10.2.1.
> Today I increased the replication of the __consumer_offsets topic to 3 and 
> triggered replication to the third instance via kafka-reassign-partitions.sh. 
> That went well, but I get many errors like:
> {code}
> [2017-06-12 09:59:50,342] ERROR Found invalid messages during fetch for 
> partition [__consumer_offsets,18] offset 0 error Record size is less than the 
> minimum record overhead (14) (kafka.server.ReplicaFetcherThread)
> [2017-06-12 09:59:50,342] ERROR Found invalid messages during fetch for 
> partition [__consumer_offsets,24] offset 0 error Record size is less than the 
> minimum record overhead (14) (kafka.server.ReplicaFetcherThread)
> {code}
> I think these are due to the disk-full event.
> The log cleaner threads died on these corrupt messages:
> {code}
> [2017-06-12 09:59:50,722] ERROR [kafka-log-cleaner-thread-0], Error due to  
> (kafka.log.LogCleaner)
> org.apache.kafka.common.errors.CorruptRecordException: Record size is less 
> than the minimum record overhead (14)
> [2017-06-12 09:59:50,722] INFO [kafka-log-cleaner-thread-0], Stopped  
> (kafka.log.LogCleaner)
> {code}
> Looking at the files I see that some are truncated and some are just empty:
> $ ls -lsh 00594653.log
> 0 -rw-r--r-- 1 user user 100M Jun 12 11:00 00594653.log
> Sadly I no longer have the logs from the disk-full event itself.
> I have three questions:
> * What is the best way to clean this up? Deleting the old log files and 
> restarting the brokers?
> * Why did Kafka not handle the disk-full event well? Is this only affecting 
> the cleanup, or may we also lose data?
> * Is this maybe caused by the combination of upgrade and disk full?
> And last but not least: Keep up the good work. Kafka is really performing 
> well while being easy to administer and has good documentation!





[jira] [Commented] (KAFKA-5431) LogCleaner stopped due to org.apache.kafka.common.errors.CorruptRecordException

2017-07-13 Thread huxihx (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085357#comment-16085357
 ] 

huxihx commented on KAFKA-5431:
---

Seems this only happens when preallocate is enabled and topic is configured 
with 'compact'. When 

I think only one tiny code change could solve both this issue and 
[KAFKA-5582|https://issues.apache.org/jira/browse/KAFKA-5582].
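
For reference, a topic combining the two settings mentioned above could be 
created like this (broker address, topic name, and counts are assumed):

{code}
$ bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic compacted-topic \
    --partitions 1 --replication-factor 1 \
    --config cleanup.policy=compact --config preallocate=true
{code}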

> LogCleaner stopped due to 
> org.apache.kafka.common.errors.CorruptRecordException
> ---
>
> Key: KAFKA-5431
> URL: https://issues.apache.org/jira/browse/KAFKA-5431
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.1
>Reporter: Carsten Rietz
>  Labels: reliability
> Fix For: 0.11.0.1
>
>
> Hey all,
> I have a strange problem with our UAT cluster of 3 Kafka brokers.
> The __consumer_offsets topic was replicated to two instances and our disks 
> ran full due to a wrong configuration of the log cleaner. We fixed the 
> configuration and updated from 0.10.1.1 to 0.10.2.1.
> Today I increased the replication of the __consumer_offsets topic to 3 and 
> triggered replication to the third instance via kafka-reassign-partitions.sh. 
> That went well, but I get many errors like:
> {code}
> [2017-06-12 09:59:50,342] ERROR Found invalid messages during fetch for 
> partition [__consumer_offsets,18] offset 0 error Record size is less than the 
> minimum record overhead (14) (kafka.server.ReplicaFetcherThread)
> [2017-06-12 09:59:50,342] ERROR Found invalid messages during fetch for 
> partition [__consumer_offsets,24] offset 0 error Record size is less than the 
> minimum record overhead (14) (kafka.server.ReplicaFetcherThread)
> {code}
> I think these are due to the disk-full event.
> The log cleaner threads died on these corrupt messages:
> {code}
> [2017-06-12 09:59:50,722] ERROR [kafka-log-cleaner-thread-0], Error due to  
> (kafka.log.LogCleaner)
> org.apache.kafka.common.errors.CorruptRecordException: Record size is less 
> than the minimum record overhead (14)
> [2017-06-12 09:59:50,722] INFO [kafka-log-cleaner-thread-0], Stopped  
> (kafka.log.LogCleaner)
> {code}
> Looking at the files I see that some are truncated and some are just empty:
> $ ls -lsh 00594653.log
> 0 -rw-r--r-- 1 user user 100M Jun 12 11:00 00594653.log
> Sadly I no longer have the logs from the disk-full event itself.
> I have three questions:
> * What is the best way to clean this up? Deleting the old log files and 
> restarting the brokers?
> * Why did Kafka not handle the disk-full event well? Is this only affecting 
> the cleanup, or may we also lose data?
> * Is this maybe caused by the combination of upgrade and disk full?
> And last but not least: Keep up the good work. Kafka is really performing 
> well while being easy to administer and has good documentation!





[jira] [Issue Comment Deleted] (KAFKA-4628) Support KTable/GlobalKTable Joins

2017-07-13 Thread frank t (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

frank t updated KAFKA-4628:
---
Comment: was deleted

(was: Would it be possible to release only a patch for this? Maybe for an 
unstable release? Or a workaround? I really face a blocking problem that is not 
solvable without this:

KTable<K, R> join(final GlobalKTable<GK, GV> globalTable,
                  final KeyValueMapper<K, V, GK> keyMapper,
                  final ValueJoiner<V, GV, R> joiner);)

> Support KTable/GlobalKTable Joins
> -
>
> Key: KAFKA-4628
> URL: https://issues.apache.org/jira/browse/KAFKA-4628
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
> Fix For: 0.11.1.0
>
>
> In KIP-99 we added support for GlobalKTables; however, we don't currently 
> support KTable/GlobalKTable joins, as they require materializing a state store 
> for the join.





[jira] [Commented] (KAFKA-4628) Support KTable/GlobalKTable Joins

2017-07-13 Thread frank t (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085353#comment-16085353
 ] 

frank t commented on KAFKA-4628:


[~guozhang] Would it be possible to release only a patch for this? 
Or a workaround? I am really blocked without this:

KTable<K, R> join(final GlobalKTable<GK, GV> globalTable,
                  final KeyValueMapper<K, V, GK> keyMapper,
                  final ValueJoiner<V, GV, R> joiner);
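
As a possible interim workaround (topic, store, and type choices invented; the 
builder method names are assumed from the 0.10.2-era Java DSL), the 
KStream-GlobalKTable join that KIP-99 did ship can cover some of these cases:

{code}
import org.apache.kafka.streams.kstream.{KStreamBuilder, KeyValueMapper, ValueJoiner}

object StreamGlobalTableJoin extends App {
  val builder = new KStreamBuilder()

  val orders    = builder.stream[String, String]("orders")
  val customers = builder.globalTable[String, String]("customers", "customers-store")

  // KStream-GlobalKTable join; the KTable-GlobalKTable variant requested
  // above is what this ticket tracks.
  val enriched = orders.join(
    customers,
    new KeyValueMapper[String, String, String] {
      override def apply(key: String, order: String): String = key // derive lookup key
    },
    new ValueJoiner[String, String, String] {
      override def apply(order: String, customer: String): String = s"$order/$customer"
    })
}
{code}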

> Support KTable/GlobalKTable Joins
> -
>
> Key: KAFKA-4628
> URL: https://issues.apache.org/jira/browse/KAFKA-4628
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
> Fix For: 0.11.1.0
>
>
> In KIP-99 we added support for GlobalKTables; however, we don't currently 
> support KTable/GlobalKTable joins, as they require materializing a state store 
> for the join.





[jira] [Commented] (KAFKA-5585) Failover in a replicated Cluster does not work

2017-07-13 Thread M. Manna (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085330#comment-16085330
 ] 

M. Manna commented on KAFKA-5585:
-

@huxih there is only one partition, it seems.

> Failover in a replicated Cluster does not work
> --
>
> Key: KAFKA-5585
> URL: https://issues.apache.org/jira/browse/KAFKA-5585
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.11.0.0
> Environment: Linux, Mac OSX
>Reporter: Thomas Bayer
> Attachments: broker_zookeeper_configs.zip, SimpleConsumer.java, 
> SimpleProducer.java, Stress Test Windows.xlsx, test_project_files.zip
>
>
> Failover does not work in a cluster with 3 nodes and a replicated topic with 
> factor 3.
> You can reproduce it as follows: set up 3 Kafka nodes and 1 ZooKeeper. Then 
> create a topic with replication factor 3. Start a consumer. Stop a node. Write 
> to the topic. Now you get warnings that the client cannot connect to a broker. 
> The consumer does not receive any messages.
> The same setup works like a charm with 0.10.2.1.
> Broker Config:
> {{broker.id=1
> listeners=PLAINTEXT://:9091
> log.dirs=cluster/logs/node-1
> broker.id=2
> listeners=PLAINTEXT://:9092
> log.dirs=cluster/logs/node-2
> broker.id=3
> listeners=PLAINTEXT://:9093
> log.dirs=cluster/logs/node-3}}
> Rest of the config is from the distribution.
> Producer and consumer config: see attached files
> *Log Consumer:*
> 2017-07-12 16:15:26 WARN  ConsumerCoordinator:649 - Auto-commit of offsets 
> {produktion-0=OffsetAndMetadata{offset=10, metadata=''}} failed for group a: 
> Offset commit failed with a retriable exception. You should retry committing 
> offsets. The underlying error was: The coordinator is not available.
> 2017-07-12 16:15:26 WARN  NetworkClient:588 - Connection to node 2147483645 
> could not be established. Broker may not be available.
> 2017-07-12 16:15:26 WARN  NetworkClient:588 - Connection to node 2 could not 
> be established. Broker may not be available.
> *Log Producer:*
> {{2017-07-12 16:15:32 WARN  NetworkClient:588 - Connection to node -1 could 
> not be established. Broker may not be available.}}





[jira] [Commented] (KAFKA-5588) ConsoleConsumer: useless --new-consumer option

2017-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085325#comment-16085325
 ] 

ASF GitHub Bot commented on KAFKA-5588:
---

GitHub user ppatierno opened a pull request:

https://github.com/apache/kafka/pull/3524

KAFKA-5588: useless --new-consumer option

Get rid of the --new-consumer option for the ConsoleConsumer

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppatierno/kafka kafka-5588

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3524.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3524


commit f9309497551c7058696466c06755defddad6238c
Author: ppatierno 
Date:   2017-07-13T07:53:16Z

Get rid of the --new-consumer option for the ConsoleConsumer




> ConsoleConsumer: useless --new-consumer option
> --
>
> Key: KAFKA-5588
> URL: https://issues.apache.org/jira/browse/KAFKA-5588
> Project: Kafka
>  Issue Type: Bug
>Reporter: Paolo Patierno
>Assignee: Paolo Patierno
>Priority: Minor
>
> Hi,
> it seems to me that the --new-consumer option on the ConsoleConsumer is 
> useless.
> The useOldConsumer var is tied to specifying --zookeeper on the command line, 
> in which case the --bootstrap-server option (or --new-consumer) can't be used.
> If you use the --bootstrap-server option then the new consumer is used 
> automatically, so there is no need for --new-consumer.
> It turns out that using the old or new consumer just depends on passing the 
> --zookeeper or the --bootstrap-server option (they can't be used together, so I 
> can't use the new consumer connecting to ZooKeeper).
> It's also clear when you use --zookeeper for the old consumer; the help output 
> says:
> "Consider using the new consumer by passing [bootstrap-server] instead of 
> [zookeeper]"
> I'm going to remove the --new-consumer option from the tool.
> Thanks,
> Paolo.





[jira] [Updated] (KAFKA-5588) ConsoleConsumer: useless --new-consumer option

2017-07-13 Thread Paolo Patierno (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paolo Patierno updated KAFKA-5588:
--
Description: 
Hi,
it seems to me that the --new-consumer option on the ConsoleConsumer is useless.
The useOldConsumer var is tied to specifying --zookeeper on the command line, 
in which case the --bootstrap-server option (or --new-consumer) can't be used.
If you use the --bootstrap-server option then the new consumer is used 
automatically, so there is no need for --new-consumer.
It turns out that using the old or new consumer just depends on passing the 
--zookeeper or the --bootstrap-server option (they can't be used together, so I 
can't use the new consumer connecting to ZooKeeper).

It's also clear when you use --zookeeper for the old consumer; the help output 
says:

"Consider using the new consumer by passing [bootstrap-server] instead of 
[zookeeper]"

I'm going to remove the --new-consumer option from the tool.

Thanks,
Paolo.


  was:
Hi,
it seems to me that the --new-consumer option on the ConsoleConsumer is useless.
The useOldConsumer var is tied to specifying --zookeeper on the command line, 
in which case the --bootstrap-server option (or --new-consumer) can't be used.
If you use the --bootstrap-server option then the new consumer is used 
automatically, so there is no need for --new-consumer.
It turns out that using the old or new consumer just depends on passing the 
--zookeeper or the --bootstrap-server option (they can't be used together, so I 
can't use the new consumer connecting to ZooKeeper).
I'm going to remove the --new-consumer option from the tool.

Thanks,
Paolo.



> ConsoleConsumer: useless --new-consumer option
> --
>
> Key: KAFKA-5588
> URL: https://issues.apache.org/jira/browse/KAFKA-5588
> Project: Kafka
>  Issue Type: Bug
>Reporter: Paolo Patierno
>Assignee: Paolo Patierno
>Priority: Minor
>
> Hi,
> it seems to me that the --new-consumer option on the ConsoleConsumer is 
> useless.
> The useOldConsumer var is tied to specifying --zookeeper on the command line, 
> in which case the --bootstrap-server option (or --new-consumer) can't be used.
> If you use the --bootstrap-server option then the new consumer is used 
> automatically, so there is no need for --new-consumer.
> It turns out that using the old or new consumer just depends on passing the 
> --zookeeper or the --bootstrap-server option (they can't be used together, so I 
> can't use the new consumer connecting to ZooKeeper).
> It's also clear when you use --zookeeper for the old consumer; the help output 
> says:
> "Consider using the new consumer by passing [bootstrap-server] instead of 
> [zookeeper]"
> I'm going to remove the --new-consumer option from the tool.
> Thanks,
> Paolo.





[jira] [Created] (KAFKA-5588) ConsoleConsumer: useless --new-consumer option

2017-07-13 Thread Paolo Patierno (JIRA)
Paolo Patierno created KAFKA-5588:
-

 Summary: ConsoleConsumer: useless --new-consumer option
 Key: KAFKA-5588
 URL: https://issues.apache.org/jira/browse/KAFKA-5588
 Project: Kafka
  Issue Type: Bug
Reporter: Paolo Patierno
Assignee: Paolo Patierno
Priority: Minor


Hi,
it seems to me that the --new-consumer option on the ConsoleConsumer is useless.
The useOldConsumer var is tied to specifying --zookeeper on the command line, 
in which case the --bootstrap-server option (or --new-consumer) can't be used.
If you use the --bootstrap-server option then the new consumer is used 
automatically, so there is no need for --new-consumer.
It turns out that using the old or new consumer just depends on passing the 
--zookeeper or the --bootstrap-server option (they can't be used together, so I 
can't use the new consumer connecting to ZooKeeper).
I'm going to remove the --new-consumer option from the tool.

Thanks,
Paolo.






[jira] [Commented] (KAFKA-5585) Failover in a replicated Cluster does not work

2017-07-13 Thread Thomas Bayer (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085285#comment-16085285
 ] 

Thomas Bayer commented on KAFKA-5585:
-

[~huxi_2b] Leaders are available for all partitions. 

With the same configuration files, same producer and same consumer it works 
with 0.10.2.1. Then I stop the cluster, clean the logs and ZooKeeper folder, 
start it with the same config files with the 0.11.0.0 version, and it didn't 
work.

OS: Ubuntu 16.04 LTS and OS X 10.12.5

> Failover in a replicated Cluster does not work
> --
>
> Key: KAFKA-5585
> URL: https://issues.apache.org/jira/browse/KAFKA-5585
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.11.0.0
> Environment: Linux, Mac OSX
>Reporter: Thomas Bayer
> Attachments: broker_zookeeper_configs.zip, SimpleConsumer.java, 
> SimpleProducer.java, Stress Test Windows.xlsx, test_project_files.zip
>
>
> Failover does not work in a cluster with 3 nodes and a replicated topic with 
> factor 3.
> You can reproduce it as follows: set up 3 Kafka nodes and 1 ZooKeeper. Then 
> create a topic with replication factor 3. Start a consumer. Stop a node. Write 
> to the topic. Now you get warnings that the client cannot connect to a broker. 
> The consumer does not receive any messages.
> The same setup works like a charm with 0.10.2.1.
> Broker Config:
> {{broker.id=1
> listeners=PLAINTEXT://:9091
> log.dirs=cluster/logs/node-1
> broker.id=2
> listeners=PLAINTEXT://:9092
> log.dirs=cluster/logs/node-2
> broker.id=3
> listeners=PLAINTEXT://:9093
> log.dirs=cluster/logs/node-3}}
> Rest of the config is from the distribution.
> Producer and consumer config: see attached files
> *Log Consumer:*
> 2017-07-12 16:15:26 WARN  ConsumerCoordinator:649 - Auto-commit of offsets 
> {produktion-0=OffsetAndMetadata{offset=10, metadata=''}} failed for group a: 
> Offset commit failed with a retriable exception. You should retry committing 
> offsets. The underlying error was: The coordinator is not available.
> 2017-07-12 16:15:26 WARN  NetworkClient:588 - Connection to node 2147483645 
> could not be established. Broker may not be available.
> 2017-07-12 16:15:26 WARN  NetworkClient:588 - Connection to node 2 could not 
> be established. Broker may not be available.
> *Log Producer:*
> {{2017-07-12 16:15:32 WARN  NetworkClient:588 - Connection to node -1 could 
> not be established. Broker may not be available.}}





[jira] [Created] (KAFKA-5587) Processor got uncaught exception: NullPointerException

2017-07-13 Thread Dan (JIRA)
Dan created KAFKA-5587:
--

 Summary: Processor got uncaught exception: NullPointerException
 Key: KAFKA-5587
 URL: https://issues.apache.org/jira/browse/KAFKA-5587
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.10.1.1
Reporter: Dan


[2017-07-12 21:56:39,964] ERROR Processor got uncaught exception. 
(kafka.network.Processor)
java.lang.NullPointerException
at 
kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:490)
at 
kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:487)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at 
kafka.network.Processor.processCompletedReceives(SocketServer.scala:487)
at kafka.network.Processor.run(SocketServer.scala:417)
at java.lang.Thread.run(Thread.java:745)

Does anyone know the cause of this exception? What is its effect?
When this exception occurred, the log also showed that the broker was 
frequently shrinking the ISR to itself. Are these two things related?


