[jira] [Commented] (KAFKA-6052) Windows: Consumers not polling when isolation.level=read_committed

2018-03-21 Thread banqjdon (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408971#comment-16408971
 ] 

banqjdon commented on KAFKA-6052:
-

 The problem still exists in Kafka 1.0.1 on my Windows 10 machine; I just tested it.

> Windows: Consumers not polling when isolation.level=read_committed 
> ---
>
> Key: KAFKA-6052
> URL: https://issues.apache.org/jira/browse/KAFKA-6052
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, producer 
>Affects Versions: 0.11.0.0, 1.0.1
> Environment: Windows 10. All processes running in embedded mode.
>Reporter: Ansel Zandegran
>Assignee: Vahid Hashemian
>Priority: Major
>  Labels: transactions, windows
> Attachments: Prducer_Consumer.log, Separate_Logs.zip, kafka-logs.zip, 
> logFile.log
>
>
> *The same code is running fine in Linux.* I am trying to send a transactional 
> record with exactly-once semantics. These are my producer, consumer, and 
> broker setups. 
> public void sendWithTTemp(String topic, EHEvent event) {
> Properties props = new Properties();
> props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
>   "localhost:9092,localhost:9093,localhost:9094");
> //props.put("bootstrap.servers", 
> "34.240.248.190:9092,52.50.95.30:9092,52.50.95.30:9092");
> props.put(ProducerConfig.ACKS_CONFIG, "all");
> props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
> props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "1");
> props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
> props.put(ProducerConfig.LINGER_MS_CONFIG, 1);
> props.put("transactional.id", "TID" + transactionId.incrementAndGet());
> props.put(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG, "5000");
> Producer<String, String> producer =
> new KafkaProducer<>(props,
> new StringSerializer(),
> new StringSerializer());
> Logger.log(this, "Initializing transaction...");
> producer.initTransactions();
> Logger.log(this, "Initializing done.");
> try {
>   Logger.log(this, "Begin transaction...");
>   producer.beginTransaction();
>   Logger.log(this, "Begin transaction done.");
>   Logger.log(this, "Sending events...");
>   producer.send(new ProducerRecord<>(topic,
>  event.getKey().toString(),
>  event.getValue().toString()));
>   Logger.log(this, "Sending events done.");
>   Logger.log(this, "Committing...");
>   producer.commitTransaction();
>   Logger.log(this, "Committing done.");
> } catch (ProducerFencedException | OutOfOrderSequenceException
> | AuthorizationException e) {
>   producer.close();
>   e.printStackTrace();
> } catch (KafkaException e) {
>   producer.abortTransaction();
>   e.printStackTrace();
> }
> producer.close();
>   }
> *In Consumer*
> I have set isolation.level=read_committed
> *In 3 Brokers*
> I'm running with the following properties
>   Properties props = new Properties();
>   props.setProperty("broker.id", "" + i);
>   props.setProperty("listeners", "PLAINTEXT://:909" + (2 + i));
>   props.setProperty("log.dirs", Configuration.KAFKA_DATA_PATH + "\\B" + 
> i);
>   props.setProperty("num.partitions", "1");
>   props.setProperty("zookeeper.connect", "localhost:2181");
>   props.setProperty("zookeeper.connection.timeout.ms", "6000");
>   props.setProperty("min.insync.replicas", "2");
>   props.setProperty("offsets.topic.replication.factor", "2");
>   props.setProperty("offsets.topic.num.partitions", "1");
>   props.setProperty("transaction.state.log.num.partitions", "2");
>   props.setProperty("transaction.state.log.replication.factor", "2");
>   props.setProperty("transaction.state.log.min.isr", "2");
> I am not getting any records in the consumer. When I set 
> isolation.level=read_uncommitted, I get the records. I assume that the 
> records are not getting committed. What could be the problem? Log attached.
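For reference, here is a minimal read_committed consumer sketch matching the setup above; the group id and topic name are placeholders, not taken from the reporter's code:

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReadCommittedConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
                  "localhost:9092,localhost:9093,localhost:9094");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "read-committed-test"); // placeholder
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        // Only records from committed transactions are returned to poll().
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(
                props, new StringDeserializer(), new StringDeserializer());
        consumer.subscribe(Collections.singletonList("test-topic")); // placeholder
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(1000L);
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset=%d key=%s value=%s%n",
                                  record.offset(), record.key(), record.value());
            }
        }
    }
}
{code}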





[jira] [Commented] (KAFKA-6702) Wrong className in LoggerFactory.getLogger method

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408956#comment-16408956
 ] 

ASF GitHub Bot commented on KAFKA-6702:
---

hejiefang opened a new pull request #4754: [KAFKA-6702]Wrong className in 
LoggerFactory.getLogger method
URL: https://github.com/apache/kafka/pull/4754
 
 
   
https://issues.apache.org/jira/browse/KAFKA-6702
   Wrong className in LoggerFactory.getLogger method




> Wrong className in LoggerFactory.getLogger method
> -
>
> Key: KAFKA-6702
> URL: https://issues.apache.org/jira/browse/KAFKA-6702
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, streams
>Affects Versions: 1.2.0
>Reporter: JieFang.He
>Assignee: JieFang.He
>Priority: Major
>
> Wrong className in LoggerFactory.getLogger method
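The report does not name the affected classes, so the snippet below is only a hypothetical illustration of this bug class: a logger created with another class's literal, so its output is attributed to the wrong component:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class OtherProcessor { }

class MyProcessor {
    // Wrong: log lines from MyProcessor show up under "OtherProcessor",
    // which is the kind of copy-paste mistake this issue describes.
    private static final Logger badLog = LoggerFactory.getLogger(OtherProcessor.class);

    // Right: pass the enclosing class itself.
    private static final Logger log = LoggerFactory.getLogger(MyProcessor.class);
}
{code}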





[jira] [Created] (KAFKA-6702) Wrong className in LoggerFactory.getLogger method

2018-03-21 Thread JieFang.He (JIRA)
JieFang.He created KAFKA-6702:
-

 Summary: Wrong className in LoggerFactory.getLogger method
 Key: KAFKA-6702
 URL: https://issues.apache.org/jira/browse/KAFKA-6702
 Project: Kafka
  Issue Type: Improvement
  Components: clients, streams
Affects Versions: 1.2.0
Reporter: JieFang.He
Assignee: JieFang.He


Wrong className in LoggerFactory.getLogger method





[jira] [Commented] (KAFKA-6673) Segment and Stamped implement Comparable, but don't override equals.

2018-03-21 Thread Asutosh Pandya (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408935#comment-16408935
 ] 

Asutosh Pandya commented on KAFKA-6673:
---

PR: https://github.com/apache/kafka/pull/4745

> Segment and Stamped implement Comparable, but don't override equals.
> 
>
> Key: KAFKA-6673
> URL: https://issues.apache.org/jira/browse/KAFKA-6673
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Koen De Groote
>Priority: Minor
> Attachments: KAFKA_6673.patch
>
>
> The classes in question:
> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/internals/Segment.java
> and
> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/Stamped.java
> This came up while doing static analysis on the trunk branch of the codebase.
> As described by the analysis tool built into IntelliJ:
> {quote}
> Reports classes which implement java.lang.Comparable which do not override 
> equals(). If equals() is not overridden, the equals() implementation is not 
> consistent with the compareTo() implementation. If an object of such a class 
> is added to a collection such as java.util.SortedSet, this collection will 
> violate the contract of java.util.Set, which is defined in terms of equals().
> {quote}
>  
> Implementing equals for an object is generally a best practice, especially 
> considering this caveat, where it is the equals method, not compareTo, that 
> will be used.
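A self-contained illustration of that caveat (the class below is a simplified stand-in, not Kafka's actual Stamped): two instances that compareTo() orders as equal but equals() does not, so a TreeSet and a HashSet disagree about the set's contents:

{code:java}
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class Stamped implements Comparable<Stamped> {
    final long timestamp;
    final String value;

    Stamped(long timestamp, String value) {
        this.timestamp = timestamp;
        this.value = value;
    }

    @Override
    public int compareTo(Stamped other) {
        // Ordering considers only the timestamp...
        return Long.compare(timestamp, other.timestamp);
    }
    // ...but equals() is inherited from Object (reference identity),
    // so compareTo() == 0 does not imply equals() == true.
}

public class ComparableContract {
    public static void main(String[] args) {
        Stamped a = new Stamped(1L, "a");
        Stamped b = new Stamped(1L, "b");

        Set<Stamped> sorted = new TreeSet<>();
        sorted.add(a);
        sorted.add(b);      // dropped: TreeSet uses compareTo(), which says a == b

        Set<Stamped> hashed = new HashSet<>();
        hashed.add(a);
        hashed.add(b);      // kept: HashSet uses equals()/hashCode()

        System.out.println(sorted.size());  // 1
        System.out.println(hashed.size());  // 2
    }
}
{code}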





[jira] [Updated] (KAFKA-6457) Error: NOT_LEADER_FOR_PARTITION leads to NPE

2018-03-21 Thread Colin P. McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin P. McCabe updated KAFKA-6457:
---
Fix Version/s: 1.0.1

> Error: NOT_LEADER_FOR_PARTITION leads to NPE
> 
>
> Key: KAFKA-6457
> URL: https://issues.apache.org/jira/browse/KAFKA-6457
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Seweryn Habdank-Wojewodzki
>Priority: Major
> Fix For: 1.0.1
>
>
> One of our nodes was dead. Then the second one took over all responsibility.
> But in the meantime the streaming application crashed due to an NPE caused by 
> {{Error: NOT_LEADER_FOR_PARTITION}}.
> The stack trace is below.
>  
> Is it something expected?
>  
> {code:java}
> 2018-01-17 11:47:21 [my] [WARN ] Sender:251 - [Producer ...2018-01-17 
> 11:47:21 [my] [WARN ] Sender:251 - [Producer 
> clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-producer]
>  Got error produce response with correlation id 768962 on topic-partition 
> my_internal_topic-5, retrying (9 attempts left). Error: 
> NOT_LEADER_FOR_PARTITION
> 2018-01-17 11:47:21 [my] [WARN ] Sender:251 - [Producer 
> clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-producer]
>  Got error produce response with correlation id 768962 on topic-partition 
> my_internal_topic-7, retrying (9 attempts left). Error: 
> NOT_LEADER_FOR_PARTITION
> 2018-01-17 11:47:21 [my] [ERROR] AbstractCoordinator:296 - [Consumer 
> clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-consumer,
>  groupId=restreamer-my] Heartbeat thread for group restreamer-my failed due 
> to unexpected error
> java.lang.NullPointerException: null
>     at 
> org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:436) 
> ~[my-restreamer.jar:?]
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:395) 
> ~[my-restreamer.jar:?]
>     at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:460) 
> ~[my-restreamer.jar:?]
>     at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:238)
>  ~[my-restreamer.jar:?]
>     at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.pollNoWakeup(ConsumerNetworkClient.java:275)
>  ~[my-restreamer.jar:?]
>     at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatThread.run(AbstractCoordinator.java:934)
>  [my-restreamer.jar:?]
> {code}





[jira] [Resolved] (KAFKA-6457) Error: NOT_LEADER_FOR_PARTITION leads to NPE

2018-03-21 Thread Colin P. McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin P. McCabe resolved KAFKA-6457.

Resolution: Resolved

I believe this is a duplicate of KAFKA-6260

> Error: NOT_LEADER_FOR_PARTITION leads to NPE
> 
>
> Key: KAFKA-6457
> URL: https://issues.apache.org/jira/browse/KAFKA-6457
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Seweryn Habdank-Wojewodzki
>Priority: Major
>
> One of our nodes was dead. Then the second one took over all responsibility.
> But in the meantime the streaming application crashed due to an NPE caused by 
> {{Error: NOT_LEADER_FOR_PARTITION}}.
> The stack trace is below.
>  
> Is it something expected?
>  
> {code:java}
> 2018-01-17 11:47:21 [my] [WARN ] Sender:251 - [Producer ...2018-01-17 
> 11:47:21 [my] [WARN ] Sender:251 - [Producer 
> clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-producer]
>  Got error produce response with correlation id 768962 on topic-partition 
> my_internal_topic-5, retrying (9 attempts left). Error: 
> NOT_LEADER_FOR_PARTITION
> 2018-01-17 11:47:21 [my] [WARN ] Sender:251 - [Producer 
> clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-producer]
>  Got error produce response with correlation id 768962 on topic-partition 
> my_internal_topic-7, retrying (9 attempts left). Error: 
> NOT_LEADER_FOR_PARTITION
> 2018-01-17 11:47:21 [my] [ERROR] AbstractCoordinator:296 - [Consumer 
> clientId=restreamer-my-fef07ca9-b067-45c0-a5af-68b5a1730dac-StreamThread-1-consumer,
>  groupId=restreamer-my] Heartbeat thread for group restreamer-my failed due 
> to unexpected error
> java.lang.NullPointerException: null
>     at 
> org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:436) 
> ~[my-restreamer.jar:?]
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:395) 
> ~[my-restreamer.jar:?]
>     at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:460) 
> ~[my-restreamer.jar:?]
>     at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:238)
>  ~[my-restreamer.jar:?]
>     at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.pollNoWakeup(ConsumerNetworkClient.java:275)
>  ~[my-restreamer.jar:?]
>     at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatThread.run(AbstractCoordinator.java:934)
>  [my-restreamer.jar:?]
> {code}





[jira] [Updated] (KAFKA-6538) Enhance ByteStore exceptions with more context information

2018-03-21 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-6538:
-
Labels: newbie  (was: )

> Enhance ByteStore exceptions with more context information
> --
>
> Key: KAFKA-6538
> URL: https://issues.apache.org/jira/browse/KAFKA-6538
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 1.1.0
>Reporter: Matthias J. Sax
>Priority: Minor
>  Labels: newbie
>
> In KIP-182 we refactored all stores to be plain {{Bytes/byte[]}} stores and 
> only have concrete key/value types on outer layers/wrappers of the stores.
> For this reason, the innermost {{RocksDBStore}} can no longer provide useful 
> error messages if a put/get/delete operation fails, as it only handles plain 
> bytes.
> Therefore, we should enhance exceptions thrown from {{RocksDBStore}} in the 
> wrapping stores (KeyValueStore, WindowedStore, and SessionStore) with 
> information about the key/value for which the operation failed.
> Cf https://github.com/apache/kafka/pull/4518 that cleans up {{RocksDBStore}} 
> exceptions.
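A rough sketch of the direction proposed here, assuming the typed wrapping store catches the inner store's failure and rethrows it with the key attached; all class and method names below are invented for illustration, not Kafka's actual internals:

{code:java}
import java.util.function.Function;

// Hypothetical names: KeyedStoreException, RawStore, and TypedStore are not
// Kafka classes. The idea: the typed outer store adds the key that the
// bytes-only inner store cannot know about.
class KeyedStoreException extends RuntimeException {
    KeyedStoreException(String message, Throwable cause) {
        super(message, cause);
    }
}

interface RawStore {
    void put(byte[] key, byte[] value); // may throw from RocksDB-level failures
}

class TypedStore<K> {
    private final RawStore inner;
    private final Function<K, byte[]> keySerializer;

    TypedStore(RawStore inner, Function<K, byte[]> keySerializer) {
        this.inner = inner;
        this.keySerializer = keySerializer;
    }

    void put(K key, byte[] value) {
        try {
            inner.put(keySerializer.apply(key), value);
        } catch (RuntimeException e) {
            // Rethrow with the typed key, which the inner store never sees.
            throw new KeyedStoreException("Failed to put key " + key, e);
        }
    }
}
{code}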





[jira] [Resolved] (KAFKA-6692) Kafka Streams internal topics should be prefixed with an underscore

2018-03-21 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-6692.
--
Resolution: Not A Problem

> Kafka Streams internal topics should be prefixed with an underscore
> ---
>
> Key: KAFKA-6692
> URL: https://issues.apache.org/jira/browse/KAFKA-6692
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Yeva Byzek
>Priority: Major
>
> Issue: users cannot quickly or easily distinguish their explicitly created 
> topics from the internal topics that Kafka Streams uses, e.g. 
> {{*-changelog}} and {{*-repartition}}.
>  
> Proposed solution: Kafka Streams internal topics should be prefixed with an 
> underscore
> Value add: the ability to introspect a Kafka Streams application and *know* 
> which set of topics is associated with a topology.  This can help downstream 
> tooling do cool visualizations and tracking.





[jira] [Commented] (KAFKA-6697) JBOD configured broker should not die if log directory is invalid

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408605#comment-16408605
 ] 

ASF GitHub Bot commented on KAFKA-6697:
---

lindong28 opened a new pull request #4752: KAFKA-6697; JBOD configured broker 
should not die if log directory is invalid
URL: https://github.com/apache/kafka/pull/4752
 
 
   Currently a JBOD-configured broker will still die on startup if 
dir.getCanonicalPath() throws an IOException. We should mark such a log 
directory as offline, and the broker should keep running as long as there is a 
good disk.
   
   *Summary of testing strategy (including rationale)
   for the feature or bug fix. Unit and/or integration
   tests are expected for any behaviour change and
   system tests should be considered for larger changes.*
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   




> JBOD configured broker should not die if log directory is invalid
> -
>
> Key: KAFKA-6697
> URL: https://issues.apache.org/jira/browse/KAFKA-6697
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Dong Lin
>Assignee: Dong Lin
>Priority: Major
>
> Currently a JBOD-configured broker will still die on startup if 
> dir.getCanonicalPath() throws an IOException. We should mark such a log 
> directory as offline, and the broker should keep running as long as there is 
> a good disk.
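The broker is implemented in Scala, but the proposed behaviour can be sketched in Java roughly as follows (the structure and names are assumptions for illustration only):

{code:java}
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class LogDirScan {
    // Sketch of the proposed startup behaviour: a directory whose canonical
    // path cannot be resolved is marked offline instead of killing the broker.
    static List<String> resolveLogDirs(List<File> configuredDirs, List<File> offlineDirs) {
        List<String> liveDirs = new ArrayList<>();
        for (File dir : configuredDirs) {
            try {
                liveDirs.add(dir.getCanonicalPath());
            } catch (IOException e) {
                offlineDirs.add(dir);   // mark offline, keep the broker running
            }
        }
        if (liveDirs.isEmpty()) {
            // Only if *no* disk is usable should startup fail.
            throw new RuntimeException("No usable log directories");
        }
        return liveDirs;
    }
}
{code}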





[jira] [Assigned] (KAFKA-6683) ReplicaFetcher crashes with "Attempted to complete a transaction which was not started"

2018-03-21 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson reassigned KAFKA-6683:
--

Assignee: Jason Gustafson

> ReplicaFetcher crashes with "Attempted to complete a transaction which was 
> not started" 
> 
>
> Key: KAFKA-6683
> URL: https://issues.apache.org/jira/browse/KAFKA-6683
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 1.0.0
> Environment: os: GNU/Linux 
> arch: x86_64
> Kernel: 4.9.77
> jvm: OpenJDK 1.8.0
>Reporter: Chema Sanchez
>Assignee: Jason Gustafson
>Priority: Critical
> Attachments: server.properties
>
>
> We have been experiencing this issue lately when restarting or replacing 
> brokers of our Kafka clusters during maintenance operations.
> After a broker is restarted or replaced, it may perform normally for some 
> minutes and then suddenly throw the following exception and stop replicating 
> some partitions:
> {code:none}
> 2018-03-15 17:23:01,482] ERROR [ReplicaFetcher replicaId=12, leaderId=10, 
> fetcherId=0] Error due to (kafka.server.ReplicaFetcherThread)
> java.lang.IllegalArgumentException: Attempted to complete a transaction which 
> was not started
>     at 
> kafka.log.ProducerStateManager.completeTxn(ProducerStateManager.scala:720)
>     at kafka.log.Log.$anonfun$loadProducersFromLog$4(Log.scala:540)
>     at 
> kafka.log.Log.$anonfun$loadProducersFromLog$4$adapted(Log.scala:540)
>     at scala.collection.immutable.List.foreach(List.scala:389)
>     at 
> scala.collection.generic.TraversableForwarder.foreach(TraversableForwarder.scala:35)
>     at 
> scala.collection.generic.TraversableForwarder.foreach$(TraversableForwarder.scala:35)
>     at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:44)
>     at kafka.log.Log.loadProducersFromLog(Log.scala:540)
>     at kafka.log.Log.$anonfun$loadProducerState$5(Log.scala:521)
>     at kafka.log.Log.$anonfun$loadProducerState$5$adapted(Log.scala:514)
>     at scala.collection.Iterator.foreach(Iterator.scala:929)
>     at scala.collection.Iterator.foreach$(Iterator.scala:929)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1417)
>     at scala.collection.IterableLike.foreach(IterableLike.scala:71)
>     at scala.collection.IterableLike.foreach$(IterableLike.scala:70)
>     at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>     at kafka.log.Log.loadProducerState(Log.scala:514)
>     at kafka.log.Log.$anonfun$truncateTo$2(Log.scala:1487)
>     at 
> scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:12)
>     at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
>     at kafka.log.Log.truncateTo(Log.scala:1467)
>     at kafka.log.LogManager.$anonfun$truncateTo$2(LogManager.scala:454)
>     at 
> kafka.log.LogManager.$anonfun$truncateTo$2$adapted(LogManager.scala:445)
>     at 
> scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:789)
>     at scala.collection.immutable.Map$Map1.foreach(Map.scala:120)
>     at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:788)
>     at kafka.log.LogManager.truncateTo(LogManager.scala:445)
>     at 
> kafka.server.ReplicaFetcherThread.$anonfun$maybeTruncate$1(ReplicaFetcherThread.scala:281)
>     at scala.collection.Iterator.foreach(Iterator.scala:929)
>     at scala.collection.Iterator.foreach$(Iterator.scala:929)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1417)
>     at scala.collection.IterableLike.foreach(IterableLike.scala:71)
>     at scala.collection.IterableLike.foreach$(IterableLike.scala:70)
>     at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>     at 
> kafka.server.ReplicaFetcherThread.maybeTruncate(ReplicaFetcherThread.scala:265)
>     at 
> kafka.server.AbstractFetcherThread.$anonfun$maybeTruncate$2(AbstractFetcherThread.scala:135)
>     at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
>     at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:217)
>     at 
> kafka.server.AbstractFetcherThread.maybeTruncate(AbstractFetcherThread.scala:132)
>     at 
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:102)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64)
> [2018-03-15 17:23:01,497] INFO [ReplicaFetcher replicaId=12, leaderId=10, 
> fetcherId=0] Stopped (kafka.server.ReplicaFetcherThread)
> {code}
> As all brokers in a cluster are restarted during system updates, the issue 
> sometimes manifested in different brokers holding 

[jira] [Created] (KAFKA-6701) synchronize Log modification between delete cleanup and async delete

2018-03-21 Thread Sumant Tambe (JIRA)
Sumant Tambe created KAFKA-6701:
---

 Summary: synchronize Log modification between delete cleanup and 
async delete
 Key: KAFKA-6701
 URL: https://issues.apache.org/jira/browse/KAFKA-6701
 Project: Kafka
  Issue Type: Bug
Reporter: Sumant Tambe
Assignee: Sumant Tambe


Kafka broker crashes without any evident disk failures 

From [~becket_qin]: This looks like a bug in Kafka. When topic deletion and log 
retention cleanup happen at the same time, the log retention cleanup may see a 
ClosedChannelException after the log has been renamed for async deletion.

The root cause is that topic deletion should set the isClosed flag of the 
partition log to true, and retention should not attempt to delete old log 
segments when the log is closed.
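A minimal sketch of the described fix, with invented names (isClosed, deleteOldSegments) standing in for the real Log internals: retention re-checks the closed flag under the same lock that deletion takes:

{code:java}
// Hypothetical illustration of the race fix described above; not Kafka's
// actual Log class.
class PartitionLog {
    private final Object lock = new Object();
    private boolean isClosed = false;

    void closeForAsyncDelete() {
        synchronized (lock) {
            isClosed = true;   // topic deletion renames the log and closes it
        }
    }

    int deleteOldSegments() {
        synchronized (lock) {
            if (isClosed) {
                // Log already renamed for async deletion; retention cleanup
                // would hit a ClosedChannelException, so skip it.
                return 0;
            }
            // ... delete segments past the retention limit ...
            return 0;
        }
    }
}
{code}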





[jira] [Commented] (KAFKA-6699) When one of two Kafka nodes are dead, streaming API cannot handle messaging

2018-03-21 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408334#comment-16408334
 ] 

Matthias J. Sax commented on KAFKA-6699:


Thanks for reporting this. Is version 1.0 or 1.1 (to be released shortly) also 
affected? Can you provide some DEBUG-level log files? If I understand it 
correctly, Streams does not fail/crash, but also does not process any data? 

> When one of two Kafka nodes are dead, streaming API cannot handle messaging
> ---
>
> Key: KAFKA-6699
> URL: https://issues.apache.org/jira/browse/KAFKA-6699
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.2
>Reporter: Seweryn Habdank-Wojewodzki
>Priority: Major
>
> Dears,
> I am observing quite often that when a Kafka broker is partly dead (*), 
> applications that use the Streams API do nothing.
> (*) Partly dead in my case means that one of the two Kafka nodes is out of 
> order.
> Especially when the disk is full on one machine, the broker goes into some 
> strange state where the Streams API goes on vacation. It seems like the 
> regular producer/consumer API has no problem in such a case.
> Can you have a look at this matter?





[jira] [Commented] (KAFKA-6052) Windows: Consumers not polling when isolation.level=read_committed

2018-03-21 Thread Pegerto Fernandez (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408105#comment-16408105
 ] 

Pegerto Fernandez commented on KAFKA-6052:
--

Hello

We did some tests on 1.0.1 and the issue seems to be resolved, but this ticket 
is still open. Can anybody else confirm whether this is solved in 1.0.1?

Regards.

> Windows: Consumers not polling when isolation.level=read_committed 
> ---
>
> Key: KAFKA-6052
> URL: https://issues.apache.org/jira/browse/KAFKA-6052
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, producer 
>Affects Versions: 0.11.0.0
> Environment: Windows 10. All processes running in embedded mode.
>Reporter: Ansel Zandegran
>Assignee: Vahid Hashemian
>Priority: Major
>  Labels: windows
> Attachments: Prducer_Consumer.log, Separate_Logs.zip, kafka-logs.zip, 
> logFile.log
>
>
> *The same code is running fine in Linux.* I am trying to send a transactional 
> record with exactly-once semantics. These are my producer, consumer, and 
> broker setups. 
> public void sendWithTTemp(String topic, EHEvent event) {
> Properties props = new Properties();
> props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
>   "localhost:9092,localhost:9093,localhost:9094");
> //props.put("bootstrap.servers", 
> "34.240.248.190:9092,52.50.95.30:9092,52.50.95.30:9092");
> props.put(ProducerConfig.ACKS_CONFIG, "all");
> props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
> props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "1");
> props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
> props.put(ProducerConfig.LINGER_MS_CONFIG, 1);
> props.put("transactional.id", "TID" + transactionId.incrementAndGet());
> props.put(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG, "5000");
> Producer<String, String> producer =
> new KafkaProducer<>(props,
> new StringSerializer(),
> new StringSerializer());
> Logger.log(this, "Initializing transaction...");
> producer.initTransactions();
> Logger.log(this, "Initializing done.");
> try {
>   Logger.log(this, "Begin transaction...");
>   producer.beginTransaction();
>   Logger.log(this, "Begin transaction done.");
>   Logger.log(this, "Sending events...");
>   producer.send(new ProducerRecord<>(topic,
>  event.getKey().toString(),
>  event.getValue().toString()));
>   Logger.log(this, "Sending events done.");
>   Logger.log(this, "Committing...");
>   producer.commitTransaction();
>   Logger.log(this, "Committing done.");
> } catch (ProducerFencedException | OutOfOrderSequenceException
> | AuthorizationException e) {
>   producer.close();
>   e.printStackTrace();
> } catch (KafkaException e) {
>   producer.abortTransaction();
>   e.printStackTrace();
> }
> producer.close();
>   }
> *In Consumer*
> I have set isolation.level=read_committed
> *In 3 Brokers*
> I'm running with the following properties
>   Properties props = new Properties();
>   props.setProperty("broker.id", "" + i);
>   props.setProperty("listeners", "PLAINTEXT://:909" + (2 + i));
>   props.setProperty("log.dirs", Configuration.KAFKA_DATA_PATH + "\\B" + 
> i);
>   props.setProperty("num.partitions", "1");
>   props.setProperty("zookeeper.connect", "localhost:2181");
>   props.setProperty("zookeeper.connection.timeout.ms", "6000");
>   props.setProperty("min.insync.replicas", "2");
>   props.setProperty("offsets.topic.replication.factor", "2");
>   props.setProperty("offsets.topic.num.partitions", "1");
>   props.setProperty("transaction.state.log.num.partitions", "2");
>   props.setProperty("transaction.state.log.replication.factor", "2");
>   props.setProperty("transaction.state.log.min.isr", "2");
> I am not getting any records in the consumer. When I set 
> isolation.level=read_uncommitted, I get the records. I assume that the 
> records are not getting committed. What could be the problem? Log attached.





[jira] [Commented] (KAFKA-6700) allow producer performance test to run for a fixed duration

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408086#comment-16408086
 ] 

ASF GitHub Bot commented on KAFKA-6700:
---

cburroughs opened a new pull request #4748: KAFKA-6700; allow producer 
performance test to run for a fixed duration
URL: https://github.com/apache/kafka/pull/4748
 
 
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   




> allow producer performance test to run for a fixed duration
> ---
>
> Key: KAFKA-6700
> URL: https://issues.apache.org/jira/browse/KAFKA-6700
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Chris Burroughs
>Assignee: Chris Burroughs
>Priority: Minor
>
> Currently {{ProducerPerformance}} has options to generate a fixed number of 
> records with variable duration (i.e. until enough records have been generated). 
>  For some performance scenarios (such as generating load from multiple 
> instances) it is more convenient to run for a fixed duration and generate as 
> many records as possible within that window.





[jira] [Commented] (KAFKA-6662) Consumer use offsetsForTimes() get offset return None.

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408054#comment-16408054
 ] 

ASF GitHub Bot commented on KAFKA-6662:
---

wangzzu closed pull request #4717: KAFKA-6662: Consumer use offsetsForTimes() 
get offset return None.
URL: https://github.com/apache/kafka/pull/4717
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/core/src/main/scala/kafka/log/Log.scala 
b/core/src/main/scala/kafka/log/Log.scala
index f0050f54aef..33035f9a9c9 100644
--- a/core/src/main/scala/kafka/log/Log.scala
+++ b/core/src/main/scala/kafka/log/Log.scala
@@ -1120,7 +1120,10 @@ class Log(@volatile var dir: File,
   None
   }
 
-  targetSeg.flatMap(_.findOffsetByTimestamp(targetTimestamp, 
logStartOffset))
+  targetSeg match {
+case None => Some(TimestampOffset(RecordBatch.NO_TIMESTAMP, 
this.logEndOffset))
+case _ => targetSeg.flatMap(_.findOffsetByTimestamp(targetTimestamp))
+  }
 }
   }
 
diff --git a/core/src/main/scala/kafka/log/LogSegment.scala 
b/core/src/main/scala/kafka/log/LogSegment.scala
index 5970f42f6d9..52740d49c79 100755
--- a/core/src/main/scala/kafka/log/LogSegment.scala
+++ b/core/src/main/scala/kafka/log/LogSegment.scala
@@ -479,8 +479,12 @@ class LogSegment private[log] (val log: FileRecords,
 val position = offsetIndex.lookup(math.max(timestampOffset.offset, 
startingOffset)).position
 
 // Search the timestamp
-Option(log.searchForTimestamp(timestamp, position, startingOffset)).map { 
timestampAndOffset =>
-  TimestampOffset(timestampAndOffset.timestamp, timestampAndOffset.offset)
+if (position == 0 && timestampOffset.timestamp == -1){
+  Option(timestampOffset)
+} else {
+  Option(log.searchForTimestamp(timestamp, position, startingOffset)).map 
{ timestampAndOffset =>
+TimestampOffset(timestampAndOffset.timestamp, 
timestampAndOffset.offset)
+  }
 }
   }
 
diff --git a/core/src/test/scala/unit/kafka/log/LogSegmentTest.scala 
b/core/src/test/scala/unit/kafka/log/LogSegmentTest.scala
index c45ed0d2986..946ef91c3de 100644
--- a/core/src/test/scala/unit/kafka/log/LogSegmentTest.scala
+++ b/core/src/test/scala/unit/kafka/log/LogSegmentTest.scala
@@ -263,7 +263,7 @@ class LogSegmentTest {
 assertEquals(43, seg.findOffsetByTimestamp(430).get.offset)
 assertEquals(44, seg.findOffsetByTimestamp(431).get.offset)
 // Search beyond the last timestamp
-assertEquals(None, seg.findOffsetByTimestamp(491))
+assertEquals(49, seg.findOffsetByTimestamp(491).get.offset)
 // Search before the first indexed timestamp
 assertEquals(41, seg.findOffsetByTimestamp(401).get.offset)
 // Search before the first timestamp


 


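For context, a hedged consumer-side sketch of the offsetsForTimes() call this patch changes; the surrounding helper is illustrative, but a null entry in the returned map is exactly the behaviour the issue reports:

{code:java}
import java.util.Collections;
import java.util.Map;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

class OffsetsForTimesExample {
    static void seekToTimestamp(KafkaConsumer<String, String> consumer,
                                TopicPartition tp, long timestampMs) {
        Map<TopicPartition, OffsetAndTimestamp> result =
            consumer.offsetsForTimes(Collections.singletonMap(tp, timestampMs));
        OffsetAndTimestamp oat = result.get(tp);
        if (oat == null) {
            // This is the case the issue describes: no offset was found for
            // the timestamp (e.g. it is beyond the last record's timestamp).
            consumer.seekToEnd(Collections.singletonList(tp));
        } else {
            consumer.seek(tp, oat.offset());
        }
    }
}
{code}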


> Consumer use offsetsForTimes() get offset return None.
> --
>
> Key: KAFKA-6662
> URL: https://issues.apache.org/jira/browse/KAFKA-6662
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.0
>Reporter: Matt Wang
>Priority: Minor
>
> When we use the consumer's offsetsForTimes() method to get the 
> topic-partition offset, it sometimes returns null. The client log prints:
> {code:java}
> // 2018-03-15 11:54:05,239] DEBUG Collector TraceCollector dispatcher loop 
> interval 256 upload 0 retry 0 fail 0 
> (com.meituan.mtrace.collector.sg.AbstractCollector)
> [2018-03-15 11:54:05,241] DEBUG Set SASL client state to INITIAL 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2018-03-15 11:54:05,241] DEBUG Set SASL client state to INTERMEDIATE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2018-03-15 11:54:05,247] DEBUG Set SASL client state to COMPLETE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2018-03-15 11:54:05,247] DEBUG Initiating API versions fetch from node 53. 
> (org.apache.kafka.clients.NetworkClient)
> [2018-03-15 11:54:05,253] DEBUG Recorded API versions for node 53: 
> (Produce(0): 0 to 2 [usable: 2], Fetch(1): 0 to 3 [usable: 3], Offsets(2): 0 
> to 1 [usable: 1], Metadata(3): 0 to 2 [usable: 2], LeaderAndIsr(4): 0 
> [usable: 0], StopReplica(5): 0 [usable: 0], UpdateMetadata(6): 0 to 3 
> [usable: 3], ControlledShutdown(7): 1 [usable: 1], OffsetCommit(8): 0 to 2 

[jira] [Created] (KAFKA-6700) allow producer performance test to run for a fixed duration

2018-03-21 Thread Chris Burroughs (JIRA)
Chris Burroughs created KAFKA-6700:
--

 Summary: allow producer performance test to run for a fixed 
duration
 Key: KAFKA-6700
 URL: https://issues.apache.org/jira/browse/KAFKA-6700
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Reporter: Chris Burroughs
Assignee: Chris Burroughs


Currently {{ProducerPerformance}} has options to generate a fixed number of 
records with variable duration (i.e. until enough records have been generated).  
For some performance scenarios (such as generating load from multiple 
instances) it is more convenient to run for a fixed duration and generate as 
many records as possible within that window.
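The actual change is to {{ProducerPerformance}}'s argument handling (see the PR for the real flag), but the core idea is a deadline-driven send loop, sketched here with placeholder topic and record values:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

class FixedDurationLoad {
    static long runFor(long durationMs, String topic, Properties props) {
        long deadline = System.currentTimeMillis() + durationMs;
        long sent = 0;
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(
                props, new StringSerializer(), new StringSerializer())) {
            // Send as many records as possible until the deadline passes,
            // instead of stopping after a fixed record count.
            while (System.currentTimeMillis() < deadline) {
                producer.send(new ProducerRecord<>(topic, "key", "value"));
                sent++;
            }
        }
        return sent;
    }
}
{code}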





[jira] [Commented] (KAFKA-6689) Kafka not release .deleted file.

2018-03-21 Thread huxihx (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407938#comment-16407938
 ] 

huxihx commented on KAFKA-6689:
---

[~Ueng] I believe this issue actually results from the same root cause as 
[KAFKA-4614|https://issues.apache.org/jira/browse/KAFKA-4614]: both fail to 
release the mapped buffers until they are finally garbage collected. 
Could you upgrade Kafka to 0.10.2.0 to see if it still happens?

> Kafka not release .deleted file.
> 
>
> Key: KAFKA-6689
> URL: https://issues.apache.org/jira/browse/KAFKA-6689
> Project: Kafka
>  Issue Type: Bug
>  Components: config, controller, log
>Affects Versions: 0.10.1.1
> Environment: 3 Kafka Broker running on CentOS6.9(rungi3 VMs) 
>Reporter: A
>Priority: Critical
>  Labels: newbie
> Fix For: 0.10.1.1
>
>
> After Kafka cleaned up log .timeindex / .index files based on topic 
> retention, I can still lsof a lot of .index.deleted and .timeindex.deleted 
> files. 
> We have 3 brokers on 3 VMs; it happens on only 2 of them.
> [brokeer-01 ~]$ lsof -p 24324 | grep deleted | wc -l
> 28
> [broker-02 ~]$ lsof -p 12131 | grep deleted | wc -l
> 14599
> [broker-03 ~]$ lsof -p 3349 | grep deleted | wc -l
> 14523
>  
> Configuration on all 3 brokers is the same.  (Roll log every hour, retention 
> time 11 hours)
>  * INFO KafkaConfig values:
>  advertised.host.name = null
>  advertised.listeners = PLAINTEXT://Broker-02:9092
>  advertised.port = null
>  authorizer.class.name =
>  auto.create.topics.enable = true
>  auto.leader.rebalance.enable = true
>  background.threads = 10
>  broker.id = 2
>  broker.id.generation.enable = true
>  broker.rack = null
>  compression.type = producer
>  connections.max.idle.ms = 60
>  controlled.shutdown.enable = true
>  controlled.shutdown.max.retries = 3
>  controlled.shutdown.retry.backoff.ms = 5000
>  controller.socket.timeout.ms = 3
>  default.replication.factor = 3
>  delete.topic.enable = true
>  fetch.purgatory.purge.interval.requests = 1000
>  group.max.session.timeout.ms = 30
>  group.min.session.timeout.ms = 6000
>  host.name =
>  inter.broker.protocol.version = 0.10.1-IV2
>  leader.imbalance.check.interval.seconds = 300
>  leader.imbalance.per.broker.percentage = 10
>  listeners = null
>  log.cleaner.backoff.ms = 15000
>  log.cleaner.dedupe.buffer.size = 134217728
>  log.cleaner.delete.retention.ms = 8640
>  log.cleaner.enable = true
>  log.cleaner.io.buffer.load.factor = 0.9
>  log.cleaner.io.buffer.size = 524288
>  log.cleaner.io.max.bytes.per.second = 1.7976931348623157E308
>  log.cleaner.min.cleanable.ratio = 0.5
>  log.cleaner.min.compaction.lag.ms = 0
>  log.cleaner.threads = 1
>  log.cleanup.policy = [delete]
>  log.dir = /tmp/kafka-logs
>  log.dirs = /data/appdata/kafka/data
>  log.flush.interval.messages = 9223372036854775807
>  log.flush.interval.ms = null
>  log.flush.offset.checkpoint.interval.ms = 6
>  log.flush.scheduler.interval.ms = 9223372036854775807
>  log.index.interval.bytes = 4096
>  log.index.size.max.bytes = 10485760
>  log.message.format.version = 0.10.1-IV2
>  log.message.timestamp.difference.max.ms = 9223372036854775807
>  log.message.timestamp.type = CreateTime
>  log.preallocate = false
>  log.retention.bytes = -1
>  log.retention.check.interval.ms = 30
>  log.retention.hours = 11
>  log.retention.minutes = 660
>  log.retention.ms = 3960
>  log.roll.hours = 1
>  log.roll.jitter.hours = 0
>  log.roll.jitter.ms = null
>  log.roll.ms = null
>  log.segment.bytes = 1073741824
>  log.segment.delete.delay.ms = 6
>  max.connections.per.ip = 2147483647
>  max.connections.per.ip.overrides =
>  message.max.bytes = 112
>  metric.reporters = []
>  metrics.num.samples = 2
>  metrics.sample.window.ms = 3
>  min.insync.replicas = 2
>  num.io.threads = 16
>  num.network.threads = 16
>  num.partitions = 10
>  num.recovery.threads.per.data.dir = 3
>  num.replica.fetchers = 1
>  offset.metadata.max.bytes = 4096
>  offsets.commit.required.acks = -1
>  offsets.commit.timeout.ms = 5000
>  offsets.load.buffer.size = 5242880
>  offsets.retention.check.interval.ms = 60
>  offsets.retention.minutes = 1440
>  offsets.topic.compression.codec = 0
>  offsets.topic.num.partitions = 50
>  offsets.topic.replication.factor = 3
>  offsets.topic.segment.bytes = 104857600
>  port = 9092
>  principal.builder.class = class 
> org.apache.kafka.common.security.auth.DefaultPrincipalBuilder
>  producer.purgatory.purge.interval.requests = 1000
>  queued.max.requests = 3
>  quota.consumer.default = 9223372036854775807
>  quota.producer.default = 9223372036854775807
>  quota.window.num = 11
>  quota.window.size.seconds = 1
>  replica.fetch.backoff.ms = 1000
>  replica.fetch.max.bytes = 1048576
>  replica.fetch.min.bytes = 1
>  

[jira] [Commented] (KAFKA-6680) Fix config initialization in DynamicBrokerConfig

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407769#comment-16407769
 ] 

ASF GitHub Bot commented on KAFKA-6680:
---

rajinisivaram closed pull request #4740: MINOR: Document workaround for 
KAFKA-6680 for 1.1
URL: https://github.com/apache/kafka/pull/4740
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/docs/configuration.html b/docs/configuration.html
index df58ba76b77..7b975ba1d57 100644
--- a/docs/configuration.html
+++ b/docs/configuration.html
@@ -189,6 +189,12 @@ Adding and Removing Listeners
   Inter-broker listener must be configured using the static broker 
configuration inter.broker.listener.name
   or inter.broker.security.protocol.
 
+  Note: In Kafka version 1.1.0, thread config updates and listener 
updates are processed by a broker only if at least one other
+  broker config was configured dynamically prior to this update. To workaround 
this issue, a dummy property may be dynamically configured
+  prior to an update of thread/listener configs or the update may be retried 
after reverting the change. Dynamic update of listeners in
+  1.1.0 requires listener.security.protocol.map to be updated to 
a map containing security protocols for all the listeners,
+  even if listener name is a security protocol. These issues will be fixed in 
the next release.
+
   3.2 Topic-Level 
Configs
 
   Configurations pertinent to topics have both a server default as well an 
optional per-topic override. If no per-topic configuration is given the server 
default is used. The override can be set at topic creation time by giving one 
or more --config options. This example creates a topic named 
my-topic with a custom max message size and flush rate:


 




> Fix config initialization in DynamicBrokerConfig
> 
>
> Key: KAFKA-6680
> URL: https://issues.apache.org/jira/browse/KAFKA-6680
> Project: Kafka
>  Issue Type: Bug
>Reporter: Manikumar
>Assignee: Manikumar
>Priority: Major
> Fix For: 1.2.0
>
>
> The issues below were observed while testing the dynamic config update 
> feature:
> 1. {{kafkaConfig}} doesn't get updated during {{initialize}} if there are no 
> dynamic configs defined in ZK.
> 2. Update DynamicListenerConfig.validateReconfiguration() to check that new 
> listeners must be a subset of the listener map.





[jira] [Created] (KAFKA-6699) When one of two Kafka nodes are dead, streaming API cannot handle messaging

2018-03-21 Thread Seweryn Habdank-Wojewodzki (JIRA)
Seweryn Habdank-Wojewodzki created KAFKA-6699:
-

 Summary: When one of two Kafka nodes are dead, streaming API 
cannot handle messaging
 Key: KAFKA-6699
 URL: https://issues.apache.org/jira/browse/KAFKA-6699
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.11.0.2
Reporter: Seweryn Habdank-Wojewodzki


Dears,

I am observing quite often that when a Kafka broker is partly dead (*), 
applications that use the Streams API do nothing.

(*) Partly dead in my case means that one of the two Kafka nodes is out of 
order.

Especially when the disk is full on one machine, the broker goes into some 
strange state where the Streams API goes on vacation. It seems like the 
regular producer/consumer API has no problem in such a case.

Can you have a look at this matter?





[jira] [Commented] (KAFKA-6698) ConsumerBounceTest#testClose sometimes fails

2018-03-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407577#comment-16407577
 ] 

Ted Yu commented on KAFKA-6698:
---

Test output consisted of repeated occurrences of:
{code}
[2018-03-21 07:02:01,683] ERROR ZKShutdownHandler is not registered, so 
ZooKeeper server won't take any action on ERROR or SHUTDOWN server state 
changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2018-03-21 07:02:02,693] WARN Session 0x0 for server null, unexpected error, 
closing socket connection and attempting reconnect 
(org.apache.zookeeper.ClientCnxn:1162)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
[2018-03-21 07:02:03,794] WARN Session 0x0 for server null, unexpected error, 
closing socket connection and attempting reconnect 
(org.apache.zookeeper.ClientCnxn:1162)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
{code}

> ConsumerBounceTest#testClose sometimes fails
> 
>
> Key: KAFKA-6698
> URL: https://issues.apache.org/jira/browse/KAFKA-6698
> Project: Kafka
>  Issue Type: Test
>Reporter: Ted Yu
>Priority: Minor
>
> Saw the following in 
> https://builds.apache.org/job/kafka-1.1-jdk7/94/testReport/junit/kafka.api/ConsumerBounceTest/testClose/
>  :
> {code}
> org.apache.kafka.common.errors.TimeoutException: The consumer group command 
> timed out while waiting for group to initialize: 
> Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.
> {code}





[jira] [Created] (KAFKA-6698) ConsumerBounceTest#testClose sometimes fails

2018-03-21 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-6698:
-

 Summary: ConsumerBounceTest#testClose sometimes fails
 Key: KAFKA-6698
 URL: https://issues.apache.org/jira/browse/KAFKA-6698
 Project: Kafka
  Issue Type: Test
Reporter: Ted Yu


Saw the following in 
https://builds.apache.org/job/kafka-1.1-jdk7/94/testReport/junit/kafka.api/ConsumerBounceTest/testClose/
 :
{code}
org.apache.kafka.common.errors.TimeoutException: The consumer group command 
timed out while waiting for group to initialize: 
Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.
{code}





[jira] [Commented] (KAFKA-6054) ERROR "SubscriptionInfo - unable to decode subscription data: version=2" when upgrading from 0.10.0.0 to 0.10.2.1

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407503#comment-16407503
 ] 

ASF GitHub Bot commented on KAFKA-6054:
---

mjsax opened a new pull request #4746: KAFKA-6054: Fix upgrade path from Kafka 
Streams v0.10.0
URL: https://github.com/apache/kafka/pull/4746
 
 
   




> ERROR "SubscriptionInfo - unable to decode subscription data: version=2" when 
> upgrading from 0.10.0.0 to 0.10.2.1
> -
>
> Key: KAFKA-6054
> URL: https://issues.apache.org/jira/browse/KAFKA-6054
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
>Reporter: James Cheng
>Assignee: Matthias J. Sax
>Priority: Major
>  Labels: kip
> Fix For: 1.2.0
>
>
> KIP: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-268%3A+Simplify+Kafka+Streams+Rebalance+Metadata+Upgrade]
> We upgraded an app from kafka-streams 0.10.0.0 to 0.10.2.1. We did a rolling 
> upgrade of the app, so at one point there were both 0.10.0.0-based 
> instances and 0.10.2.1-based instances running.
> We observed the following stack trace:
> {code:java}
> 2017-10-11 07:02:19.964 [StreamThread-3] ERROR o.a.k.s.p.i.a.SubscriptionInfo 
> -
> unable to decode subscription data: version=2
> org.apache.kafka.streams.errors.TaskAssignmentException: unable to decode
> subscription data: version=2
> at 
> org.apache.kafka.streams.processor.internals.assignment.SubscriptionInfo.decode(SubscriptionInfo.java:113)
> at 
> org.apache.kafka.streams.processor.internals.StreamPartitionAssignor.assign(StreamPartitionAssignor.java:235)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.performAssignment(ConsumerCoordinator.java:260)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onJoinLeader(AbstractCoordinator.java:404)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.access$900(AbstractCoordinator.java:81)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:358)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:340)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658)
> at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
> at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
> at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:426)
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:278)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:243)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.ensurePartitionAssignment(ConsumerCoordinator.java:345)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:977)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:937)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:295)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:218)
> 
> {code}
> I spoke with [~mjsax] and he said this is a known issue that happens when you 
> have both 0.10.0.0 instances and