[jira] [Commented] (KAFKA-4391) On Windows, Kafka server stops with uncaught exception after coming back from sleep

2016-12-01 Thread Yiquan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712034#comment-15712034
 ] 

Yiquan Zhou commented on KAFKA-4391:


[~huxi_2b] Can you think of any possible workaround for this issue? For example, 
if we are using a single Kafka server, is there a way to stop Kafka from using 
the replication-offset-checkpoint file? Thanks.

> On Windows, Kafka server stops with uncaught exception after coming back from 
> sleep
> ---
>
> Key: KAFKA-4391
> URL: https://issues.apache.org/jira/browse/KAFKA-4391
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.9.0.1
> Environment: Windows 10, jdk1.8.0_111
> Reporter: Yiquan Zhou
>
> Steps to reproduce:
> 1. start the zookeeper
> $ bin\windows\zookeeper-server-start.bat config/zookeeper.properties
> 2. start the Kafka server with the default properties
> $ bin\windows\kafka-server-start.bat config/server.properties
> 3. put Windows into sleep mode for 1-2 hours
> 4. activate Windows again, an exception occurs in Kafka server console and 
> the server is stopped:
> {code:title=kafka console log}
> [2016-11-08 21:45:35,185] INFO Client session timed out, have not heard from 
> server in 10081379ms for sessionid 0x1584514da47, closing socket 
> connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:40,698] INFO zookeeper state changed (Disconnected) 
> (org.I0Itec.zkclient.ZkClient)
> [2016-11-08 21:45:43,029] INFO Opening socket connection to server 
> 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL 
> (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:43,044] INFO Socket connection established to 
> 127.0.0.1/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:43,158] INFO Unable to reconnect to ZooKeeper service, 
> session 0x1584514da47 has expired, closing socket connection 
> (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:43,158] INFO zookeeper state changed (Expired) 
> (org.I0Itec.zkclient.ZkClient)
> [2016-11-08 21:45:43,236] INFO Initiating client connection, 
> connectString=localhost:2181 sessionTimeout=6000 
> watcher=org.I0Itec.zkclient.ZkClient@11ca437b (org.apache.zookeeper.ZooKeeper)
> [2016-11-08 21:45:43,280] INFO EventThread shut down 
> (org.apache.zookeeper.ClientCnxn)
> log4j:ERROR Failed to rename [/controller.log] to 
> [/controller.log.2016-11-08-18].
> [2016-11-08 21:45:43,421] INFO Opening socket connection to server 
> 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL 
> (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:43,483] INFO Socket connection established to 
> 127.0.0.1/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:43,811] INFO Session establishment complete on server 
> 127.0.0.1/127.0.0.1:2181, sessionid = 0x1584514da470001, negotiated timeout = 
> 6000 (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:43,827] INFO zookeeper state changed (SyncConnected) 
> (org.I0Itec.zkclient.ZkClient)
> log4j:ERROR Failed to rename [/server.log] to [/server.log.2016-11-08-18].
> [2016-11-08 21:45:43,827] INFO Creating /controller (is it secure? false) 
> (kafka.utils.ZKCheckedEphemeral)
> [2016-11-08 21:45:44,014] INFO Result of znode creation is: OK 
> (kafka.utils.ZKCheckedEphemeral)
> [2016-11-08 21:45:44,014] INFO 0 successfully elected as leader 
> (kafka.server.ZookeeperLeaderElector)
> log4j:ERROR Failed to rename [/state-change.log] to 
> [/state-change.log.2016-11-08-18].
> [2016-11-08 21:45:44,421] INFO re-registering broker info in ZK for broker 0 
> (kafka.server.KafkaHealthcheck)
> [2016-11-08 21:45:44,436] INFO Creating /brokers/ids/0 (is it secure? false) 
> (kafka.utils.ZKCheckedEphemeral)
> [2016-11-08 21:45:44,686] INFO Result of znode creation is: OK 
> (kafka.utils.ZKCheckedEphemeral)
> [2016-11-08 21:45:44,686] INFO Registered broker 0 at path /brokers/ids/0 
> with addresses: PLAINTEXT -> EndPoint(192.168.0.15,9092,PLAINTEXT) 
> (kafka.utils.ZkUtils)
> [2016-11-08 21:45:44,686] INFO done re-registering broker 
> (kafka.server.KafkaHealthcheck)
> [2016-11-08 21:45:44,686] INFO Subscribing to /brokers/topics path to watch 
> for new topics (kafka.server.KafkaHealthcheck)
> [2016-11-08 21:45:45,046] INFO [ReplicaFetcherManager on broker 0] Removed 
> fetcher for partitions [test,0] (kafka.server.ReplicaFetcherManager)
> [2016-11-08 21:45:45,061] INFO New leader is 0 
> (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
> [2016-11-08 21:45:47,325] ERROR Uncaught exception in scheduled task 
> 'kafka-recovery-point-checkpoint' (kafka.utils.KafkaScheduler)
> java.io.IOException: File rename from 
> D:\tmp\kafka-logs\recovery-point-off

[jira] [Commented] (KAFKA-4391) On Windows, Kafka server stops with uncaught exception after coming back from sleep

2016-11-18 Thread Yiquan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15676418#comment-15676418
 ] 

Yiquan Zhou commented on KAFKA-4391:


I did some tests with 0.10.1.0 and still got the same issue, but with a 
different exception, probably because Files.move is now used instead of 
File.renameTo:
{code}
[2016-11-18 11:24:31,357] FATAL [Replica Manager on Broker 0]: Error writing to 
highwatermark file:  (kafka.server.ReplicaManager)
java.nio.file.FileAlreadyExistsException: 
D:\tmp\kafka-logs\replication-offset-checkpoint.tmp -> 
D:\tmp\kafka-logs\replication-offset-checkpoint
at 
sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:81)
at 
sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:387)
at 
sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287)
at java.nio.file.Files.move(Files.java:1345)
at 
org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:670)
at kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:74)
at 
kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:927)
at 
kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:924)
at 
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
at 
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at 
kafka.server.ReplicaManager.checkpointHighWatermarks(ReplicaManager.scala:924)
at 
kafka.server.ReplicaManager$$anonfun$1.apply$mcV$sp(ReplicaManager.scala:162)
at 
kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:58)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Suppressed: java.nio.file.AccessDeniedException: 
D:\tmp\kafka-logs\replication-offset-checkpoint.tmp -> 
D:\tmp\kafka-logs\replication-offset-checkpoint
at 
sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:83)
at 
sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:301)
at 
sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287)
at java.nio.file.Files.move(Files.java:1345)
at 
org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:667)
... 17 more
{code}

So maybe the use of File.renameTo is not the root cause of the issue.
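
To make the two exceptions in the trace above easier to read, here is a minimal 
sketch of an atomic-move-with-fallback pattern. It is an assumption about the 
general shape of such a helper, not a copy of Kafka's 
Utils.atomicMoveWithFallback; the class name CheckpointMoveSketch and the method 
name moveWithFallback are hypothetical. When both attempts fail, the second 
failure is thrown and the first is attached as a suppressed exception, which 
would match a FileAlreadyExistsException carrying a suppressed 
AccessDeniedException.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; not Kafka source code.
class CheckpointMoveSketch {

    // Try an atomic rename first. If it fails, fall back to a plain
    // replace-existing move and keep the first failure as a suppressed exception.
    static void moveWithFallback(Path source, Path target) throws IOException {
        try {
            Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException atomicFailure) {
            try {
                Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException fallbackFailure) {
                fallbackFailure.addSuppressed(atomicFailure);
                throw fallbackFailure;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a checkpoint write: a .tmp file replaces the existing file.
        Path dir = Files.createTempDirectory("checkpoint-sketch");
        Path tmp = dir.resolve("replication-offset-checkpoint.tmp");
        Path target = dir.resolve("replication-offset-checkpoint");
        Files.write(target, "old".getBytes(StandardCharsets.UTF_8));
        Files.write(tmp, "new".getBytes(StandardCharsets.UTF_8));

        moveWithFallback(tmp, target);
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

If another process or handle still holds the target file open, both moves can 
fail on Windows. The sketch does not reproduce that condition; it only shows 
how the nested exceptions would arise.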



[jira] [Comment Edited] (KAFKA-4391) On Windows, Kafka server stops with uncaught exception after coming back from sleep

2016-11-09 Thread Yiquan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15651828#comment-15651828
 ] 

Yiquan Zhou edited comment on KAFKA-4391 at 11/9/16 7:32 PM:
-

All the "Power options" settings are set to default values. In the real use 
case of my application, I know that this exception occurs on many other Windows 
machines.

Even if the disk is deactivated during sleep, the exception is thrown after the 
machine comes back from sleep mode, so I think that the disk and network are 
both running again by then. What could prevent the Kafka server from renaming a 
temporary file?


was (Author: yqzhou):
All the "Power options" settings are set to default values. In the real use 
case of my application, I know that this exception occurs on many other Windows 
machines.

Even if the disk is deactivated during the sleep, the exception is thrown after 
coming back from the sleep mode so I think that the disk and network are all 
running again. Why the Kafka server fails to rename a temporary file?


[jira] [Commented] (KAFKA-4391) On Windows, Kafka server stops with uncaught exception after coming back from sleep

2016-11-09 Thread Yiquan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15651828#comment-15651828
 ] 

Yiquan Zhou commented on KAFKA-4391:


All the "Power options" settings are set to default values. In the real use 
case of my application, I know that this exception occurs on many other Windows 
machines.

Even if the disk is deactivated during the sleep, the exception is thrown after 
coming back from the sleep mode so I think that the disk and network are all 
running again. Why the Kafka server fails to rename a temporary file?


[jira] [Updated] (KAFKA-4391) On Windows, Kafka server stops with uncaught exception after coming back from sleep

2016-11-08 Thread Yiquan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiquan Zhou updated KAFKA-4391:
---
Description: 
Steps to reproduce:
1. start the zookeeper
$ bin\windows\zookeeper-server-start.bat config/zookeeper.properties

2. start the Kafka server with the default properties
$ bin\windows\kafka-server-start.bat config/server.properties

3. put Windows into sleep mode for 1-2 hours

4. activate Windows again, an exception occurs in Kafka server console and the 
server is stopped:
{code:title=kafka console log}
[2016-11-08 21:45:35,185] INFO Client session timed out, have not heard from 
server in 10081379ms for sessionid 0x1584514da47, closing socket connection 
and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:40,698] INFO zookeeper state changed (Disconnected) 
(org.I0Itec.zkclient.ZkClient)
[2016-11-08 21:45:43,029] INFO Opening socket connection to server 
127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown 
error) (org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:43,044] INFO Socket connection established to 
127.0.0.1/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:43,158] INFO Unable to reconnect to ZooKeeper service, 
session 0x1584514da47 has expired, closing socket connection 
(org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:43,158] INFO zookeeper state changed (Expired) 
(org.I0Itec.zkclient.ZkClient)
[2016-11-08 21:45:43,236] INFO Initiating client connection, 
connectString=localhost:2181 sessionTimeout=6000 
watcher=org.I0Itec.zkclient.ZkClient@11ca437b (org.apache.zookeeper.ZooKeeper)
[2016-11-08 21:45:43,280] INFO EventThread shut down 
(org.apache.zookeeper.ClientCnxn)
log4j:ERROR Failed to rename [/controller.log] to 
[/controller.log.2016-11-08-18].
[2016-11-08 21:45:43,421] INFO Opening socket connection to server 
127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown 
error) (org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:43,483] INFO Socket connection established to 
127.0.0.1/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:43,811] INFO Session establishment complete on server 
127.0.0.1/127.0.0.1:2181, sessionid = 0x1584514da470001, negotiated timeout = 
6000 (org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:43,827] INFO zookeeper state changed (SyncConnected) 
(org.I0Itec.zkclient.ZkClient)
log4j:ERROR Failed to rename [/server.log] to [/server.log.2016-11-08-18].
[2016-11-08 21:45:43,827] INFO Creating /controller (is it secure? false) 
(kafka.utils.ZKCheckedEphemeral)
[2016-11-08 21:45:44,014] INFO Result of znode creation is: OK 
(kafka.utils.ZKCheckedEphemeral)
[2016-11-08 21:45:44,014] INFO 0 successfully elected as leader 
(kafka.server.ZookeeperLeaderElector)
log4j:ERROR Failed to rename [/state-change.log] to 
[/state-change.log.2016-11-08-18].
[2016-11-08 21:45:44,421] INFO re-registering broker info in ZK for broker 0 
(kafka.server.KafkaHealthcheck)
[2016-11-08 21:45:44,436] INFO Creating /brokers/ids/0 (is it secure? false) 
(kafka.utils.ZKCheckedEphemeral)
[2016-11-08 21:45:44,686] INFO Result of znode creation is: OK 
(kafka.utils.ZKCheckedEphemeral)
[2016-11-08 21:45:44,686] INFO Registered broker 0 at path /brokers/ids/0 with 
addresses: PLAINTEXT -> EndPoint(192.168.0.15,9092,PLAINTEXT) 
(kafka.utils.ZkUtils)
[2016-11-08 21:45:44,686] INFO done re-registering broker 
(kafka.server.KafkaHealthcheck)
[2016-11-08 21:45:44,686] INFO Subscribing to /brokers/topics path to watch for 
new topics (kafka.server.KafkaHealthcheck)
[2016-11-08 21:45:45,046] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [test,0] (kafka.server.ReplicaFetcherManager)
[2016-11-08 21:45:45,061] INFO New leader is 0 
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2016-11-08 21:45:47,325] ERROR Uncaught exception in scheduled task 
'kafka-recovery-point-checkpoint' (kafka.utils.KafkaScheduler)
java.io.IOException: File rename from 
D:\tmp\kafka-logs\recovery-point-offset-checkpoint.tmp to 
D:\tmp\kafka-logs\recovery-point-offset-checkpoint failed.
at kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:66)
at 
kafka.log.LogManager.kafka$log$LogManager$$checkpointLogsInDir(LogManager.scala:326)
at 
kafka.log.LogManager$$anonfun$checkpointRecoveryPointOffsets$1.apply(LogManager.scala:317)
at 
kafka.log.LogManager$$anonfun$checkpointRecoveryPointOffsets$1.apply(LogManager.scala:317)
at 
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at 
kafka.log.LogManager.checkpointRecoveryPointOffsets(LogManager.scala:317)
at 
kafka.log.LogManager$$anonfun$startup$3.apply$mcV$sp(LogManager.scala:201)
at 
kafka.utils.KafkaSchedul

[jira] [Created] (KAFKA-4391) On Windows, Kafka server stops with uncaught exception after coming back from sleep

2016-11-08 Thread Yiquan Zhou (JIRA)
Yiquan Zhou created KAFKA-4391:
--

Summary: On Windows, Kafka server stops with uncaught exception after coming back from sleep
Key: KAFKA-4391
URL: https://issues.apache.org/jira/browse/KAFKA-4391
Project: Kafka
Issue Type: Bug
Affects Versions: 0.9.0.1
Environment: Windows 10, jdk1.8.0_111
Reporter: Yiquan Zhou


Steps to reproduce:
1. start the zookeeper
$ bin/zookeeper-server-start.sh config/zookeeper.properties

2. start the Kafka server with the default properties
$ bin/kafka-server-start.sh config/server.properties

3. put Windows into sleep mode for 1-2 hours

4. activate Windows again, an exception occurs in Kafka server console and the 
server is stopped:
{code:title=kafka console log}
[2016-11-08 21:45:35,185] INFO Client session timed out, have not heard from 
server in 10081379ms for sessionid 0x1584514da47, closing socket connection 
and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:40,698] INFO zookeeper state changed (Disconnected) 
(org.I0Itec.zkclient.ZkClient)
[2016-11-08 21:45:43,029] INFO Opening socket connection to server 
127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown 
error) (org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:43,044] INFO Socket connection established to 
127.0.0.1/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:43,158] INFO Unable to reconnect to ZooKeeper service, 
session 0x1584514da47 has expired, closing socket connection 
(org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:43,158] INFO zookeeper state changed (Expired) 
(org.I0Itec.zkclient.ZkClient)
[2016-11-08 21:45:43,236] INFO Initiating client connection, 
connectString=localhost:2181 sessionTimeout=6000 
watcher=org.I0Itec.zkclient.ZkClient@11ca437b (org.apache.zookeeper.ZooKeeper)
[2016-11-08 21:45:43,280] INFO EventThread shut down 
(org.apache.zookeeper.ClientCnxn)
log4j:ERROR Failed to rename [/controller.log] to 
[/controller.log.2016-11-08-18].
[2016-11-08 21:45:43,421] INFO Opening socket connection to server 
127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown 
error) (org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:43,483] INFO Socket connection established to 
127.0.0.1/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:43,811] INFO Session establishment complete on server 
127.0.0.1/127.0.0.1:2181, sessionid = 0x1584514da470001, negotiated timeout = 
6000 (org.apache.zookeeper.ClientCnxn)
[2016-11-08 21:45:43,827] INFO zookeeper state changed (SyncConnected) 
(org.I0Itec.zkclient.ZkClient)
log4j:ERROR Failed to rename [/server.log] to [/server.log.2016-11-08-18].
[2016-11-08 21:45:43,827] INFO Creating /controller (is it secure? false) 
(kafka.utils.ZKCheckedEphemeral)
[2016-11-08 21:45:44,014] INFO Result of znode creation is: OK 
(kafka.utils.ZKCheckedEphemeral)
[2016-11-08 21:45:44,014] INFO 0 successfully elected as leader 
(kafka.server.ZookeeperLeaderElector)
log4j:ERROR Failed to rename [/state-change.log] to 
[/state-change.log.2016-11-08-18].
[2016-11-08 21:45:44,421] INFO re-registering broker info in ZK for broker 0 
(kafka.server.KafkaHealthcheck)
[2016-11-08 21:45:44,436] INFO Creating /brokers/ids/0 (is it secure? false) 
(kafka.utils.ZKCheckedEphemeral)
[2016-11-08 21:45:44,686] INFO Result of znode creation is: OK 
(kafka.utils.ZKCheckedEphemeral)
[2016-11-08 21:45:44,686] INFO Registered broker 0 at path /brokers/ids/0 with 
addresses: PLAINTEXT -> EndPoint(192.168.0.15,9092,PLAINTEXT) 
(kafka.utils.ZkUtils)
[2016-11-08 21:45:44,686] INFO done re-registering broker 
(kafka.server.KafkaHealthcheck)
[2016-11-08 21:45:44,686] INFO Subscribing to /brokers/topics path to watch for 
new topics (kafka.server.KafkaHealthcheck)
[2016-11-08 21:45:45,046] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [test,0] (kafka.server.ReplicaFetcherManager)
[2016-11-08 21:45:45,061] INFO New leader is 0 
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2016-11-08 21:45:47,325] ERROR Uncaught exception in scheduled task 
'kafka-recovery-point-checkpoint' (kafka.utils.KafkaScheduler)
java.io.IOException: File rename from 
D:\tmp\kafka-logs\recovery-point-offset-checkpoint.tmp to 
D:\tmp\kafka-logs\recovery-point-offset-checkpoint failed.
at kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:66)
at 
kafka.log.LogManager.kafka$log$LogManager$$checkpointLogsInDir(LogManager.scala:326)
at 
kafka.log.LogManager$$anonfun$checkpointRecoveryPointOffsets$1.apply(LogManager.scala:317)
at 
kafka.log.LogManager$$anonfun$checkpointRecoveryPointOffsets$1.apply(LogManager.scala:317)
at 
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.sc

[jira] [Commented] (KAFKA-4348) On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on Kafka server

2016-11-02 Thread Yiquan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15628551#comment-15628551
 ] 

Yiquan Zhou commented on KAFKA-4348:


Yes, it's very likely the same issue as KAFKA-3135. I can reproduce it with the 
code from that issue as well. I've tested with a larger value of 
receive.buffer.bytes (~64K), and it did work around the issue. So we can close 
this issue as a duplicate and I'll follow the other one. Thank you both for 
your help.
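
For anyone who wants to try the same workaround, here is a minimal sketch of 
consumer properties with a larger receive.buffer.bytes. The class name 
ConsumerBufferSketch, the helper consumerProps, and the group id are 
hypothetical, and the exact value of 64 * 1024 is an assumption based on the 
"~64K" mentioned above. It uses only java.util.Properties, so it runs without 
the Kafka client on the classpath.

```java
import java.util.Properties;

// Hypothetical illustration only; builds the configuration, not the consumer.
class ConsumerBufferSketch {

    // Build consumer properties with an explicit socket receive buffer size.
    static Properties consumerProps(int receiveBufferBytes) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "buffer-test"); // hypothetical group name
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // The setting tested as a workaround in this comment.
        props.put("receive.buffer.bytes", Integer.toString(receiveBufferBytes));
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps(64 * 1024);
        System.out.println(props.getProperty("receive.buffer.bytes")); // prints 65536
    }
}
```

The resulting Properties object could then be passed to the KafkaConsumer 
constructor in place of the configuration used in the reproduction code.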


> On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on 
> Kafka server
> -
>
> Key: KAFKA-4348
> URL: https://issues.apache.org/jira/browse/KAFKA-4348
> Project: Kafka
> Issue Type: Bug
> Components: consumer
> Affects Versions: 0.9.0.0, 0.9.0.1, 0.10.0.1
> Environment: Mac OS X El Capitan, Java 1.8.0_111
> Reporter: Yiquan Zhou
> Labels: consumer, mac, polling
>
> Steps to reproduce:
> 1. start the zookeeper and kafka server using the default properties from the 
> distribution: 
> $ bin/zookeeper-server-start.sh config/zookeeper.properties
> $ bin/kafka-server-start.sh config/server.properties 
> 2. create a Kafka consumer using the Java API KafkaConsumer.poll(long 
> timeout). It polls the records from the server every second (timeout set to 
> 1000) and prints the number of records polled. The code can be found here: 
> https://gist.github.com/yiquanzhou/a94569a2c4ec8992444c83f3c393f596
> 3. use bin/kafka-verifiable-producer.sh to generate some messages: 
> $ bin/kafka-verifiable-producer.sh --topic connect-test --max-messages 20 
> --broker-list localhost:9092
> wait until all 200k messages are generated and sent to the server. 
> 4. Run the consumer Java code. In the output console of the consumer, we can 
> see that the consumer starts to poll some records, then it polls 0 records 
> for several seconds before polling some more. like this:
> polled 27160 records
> polled 0 records
> polled 0 records
> polled 0 records
> polled 0 records
> polled 0 records
> polled 26886 records
> polled 26886 records
> polled 0 records
> polled 0 records
> polled 0 records
> polled 0 records
> polled 0 records
> polled 26701 records
> polled 26214 records
> The bug slows down message consumption considerably. In our use case, the 
> consumer wrongly assumes that all messages have been read from the topic.
> It is reproducible only on Mac OS X, not on Linux or Windows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4348) On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on Kafka server

2016-11-01 Thread Yiquan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15625312#comment-15625312
 ] 

Yiquan Zhou commented on KAFKA-4348:


I think I get the issue with the console consumer as well. Here is what I did 
with the kafka_2.11-0.9.0.1 distribution:

I started ZooKeeper and the Kafka server, then ran the verifiable producer to 
send 200k messages. Then I started the console consumer with the command:

$ bin/kafka-console-consumer.sh --zookeeper localhost:2181 --bootstrap-server 
localhost:9092 --topic connect-test --from-beginning --new-consumer 

I could see that messages got printed on the console, but then it froze for ~5s 
before printing some more, exactly the same behavior as when calling 
KafkaConsumer.poll. 

But if I run the console consumer without the --new-consumer option, the issue 
doesn't seem to occur. Messages are printed continuously.

I've run the tests on two MacBook Pros and got the issue on both, although they 
have similar configurations. Could the network settings have any impact on this 
issue?



[jira] [Commented] (KAFKA-4348) On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on Kafka server

2016-10-31 Thread Yiquan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15622015#comment-15622015
 ] 

Yiquan Zhou commented on KAFKA-4348:


Did you get any "polled 0 records" before polling all the records? Sometimes 
the consumer did poll all the records without interruption, like what you got, 
but only after several seconds of polling 0 records. This often happens right 
after the server starts. But if you run the test 4-5 more times, you might get 
the same result as I did.

If I set "enable.auto.commit" to true, here is the output that I got:

Mon Oct 31 11:25:39 CET 2016 polled 0 records // first poll
Mon Oct 31 11:25:40 CET 2016 polled 0 records
Mon Oct 31 11:25:41 CET 2016 polled 0 records
Mon Oct 31 11:25:42 CET 2016 polled 0 records
Mon Oct 31 11:25:43 CET 2016 polled 0 records
Mon Oct 31 11:25:43 CET 2016 polled 27171 records
Mon Oct 31 11:25:43 CET 2016 polled 26886 records
Mon Oct 31 11:25:43 CET 2016 polled 26886 records
Mon Oct 31 11:25:44 CET 2016 polled 0 records
Mon Oct 31 11:25:45 CET 2016 polled 0 records
Mon Oct 31 11:25:46 CET 2016 polled 0 records
Mon Oct 31 11:25:47 CET 2016 polled 0 records
Mon Oct 31 11:25:48 CET 2016 polled 0 records
Mon Oct 31 11:25:48 CET 2016 polled 26690 records
Mon Oct 31 11:25:49 CET 2016 polled 26214 records
Mon Oct 31 11:25:49 CET 2016 polled 26214 records
Mon Oct 31 11:25:49 CET 2016 polled 26214 records
Mon Oct 31 11:25:49 CET 2016 polled 13725 records
Mon Oct 31 11:25:50 CET 2016 polled 0 records
Mon Oct 31 11:25:51 CET 2016 polled 0 records
Mon Oct 31 11:25:52 CET 2016 polled 0 records
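
To quantify the stalls in output like the above, a small sketch that counts the longest run of consecutive empty polls may help. It assumes log lines contain the text "polled N records" as shown here and ignores the timestamps. The class and method names are hypothetical and not part of the gist.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for analysing the consumer's console output.
final class PollLogStats {
    // Matches the "polled N records" fragment anywhere in a line.
    private static final Pattern POLLED = Pattern.compile("polled (\\d+) records");

    // Returns the length of the longest run of consecutive "polled 0 records" lines.
    static int longestEmptyRun(List<String> lines) {
        int longest = 0;
        int current = 0;
        for (String line : lines) {
            Matcher m = POLLED.matcher(line);
            if (!m.find()) {
                continue; // skip lines that are not poll reports
            }
            if (Long.parseLong(m.group(1)) == 0) {
                current++;
                longest = Math.max(longest, current);
            } else {
                current = 0; // a non-empty poll ends the run
            }
        }
        return longest;
    }
}
```

Applied to the output above, the longest run is 5 empty polls. With one poll per second, that corresponds to a stall of roughly five seconds.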



[jira] [Commented] (KAFKA-4348) On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on Kafka server

2016-10-31 Thread Yiquan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15621657#comment-15621657
 ] 

Yiquan Zhou commented on KAFKA-4348:


No, the server is started with the default properties from the distribution, as 
is the producer. On the client side, the code I used to reproduce the issue is 
in the gist. Only the following properties are specified: "bootstrap.servers", 
"group.id", "key.deserializer", "value.deserializer" and optionally 
"auto.offset.reset". So I think it should not be too complicated to reproduce 
the issue.
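
For reference, a sketch of what such a minimal configuration might look like. The property keys are the ones listed above. The deserializer class and the helper's name are assumptions, since the gist itself is not reproduced here. Only java.util.Properties is used, so the snippet compiles without the Kafka client library.

```java
import java.util.Properties;

// Hypothetical helper that builds the minimal consumer settings listed above.
final class ReproConsumerConfig {
    // Assumption: string keys and values; the gist may use other deserializers.
    private static final String STRING_DESERIALIZER =
            "org.apache.kafka.common.serialization.StringDeserializer";

    static Properties build(String bootstrapServers, String groupId,
                            String autoOffsetReset) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer", STRING_DESERIALIZER);
        props.setProperty("value.deserializer", STRING_DESERIALIZER);
        // Optional setting: pass null to keep the consumer's default.
        if (autoOffsetReset != null) {
            props.setProperty("auto.offset.reset", autoOffsetReset);
        }
        return props;
    }
}
```

For example, build("localhost:9092", "test-group", "earliest") yields five properties, where the group name is a placeholder. Every other consumer setting keeps its default.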



[jira] [Commented] (KAFKA-4348) On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on Kafka server

2016-10-28 Thread Yiquan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614582#comment-15614582
 ] 

Yiquan Zhou commented on KAFKA-4348:


The issue occurs when there are large numbers of records on the server to be 
read. To reproduce the issue, I ran the verifiable producer (acks=1 by default, 
I think), waited until all 200k or more records had been sent 
("sent":200000,"acked":200000) and THEN started the consumer. The consumer 
reads the maximum number of records allowed by "max.partition.fetch.bytes" at 
each poll, then 0 records for several seconds, and then some more records.

If I start the consumer at the same time as or before running the producer, the 
consumer polls fewer records each time but continuously (no "polled 0 
records"). So it works in this case.
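
As a rough check on the link between "max.partition.fetch.bytes" and the record counts per poll, the arithmetic can be sketched as below. It assumes the consumer uses the default of 1048576 bytes (1 MiB) and that a full poll returns about one fetch worth of data from a single partition. The names are hypothetical, and the result is only an estimate of the average on-the-wire record size.

```java
// Hypothetical helper: estimates average record size from a full poll.
final class FetchSizeEstimate {
    // Default max.partition.fetch.bytes for the consumer (1 MiB); assumed unchanged.
    static final long DEFAULT_MAX_PARTITION_FETCH_BYTES = 1_048_576L;

    // Average bytes per record if one poll returned one full fetch.
    static double approxBytesPerRecord(long fetchBytes, long recordsPerPoll) {
        if (recordsPerPoll <= 0) {
            throw new IllegalArgumentException("recordsPerPoll must be positive");
        }
        return (double) fetchBytes / recordsPerPoll;
    }
}
```

With the 26214 records seen in several polls, this gives about 40 bytes per record, which would be consistent with each non-empty poll returning one full fetch.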



[jira] [Commented] (KAFKA-4348) On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on Kafka server

2016-10-27 Thread Yiquan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15610972#comment-15610972
 ] 

Yiquan Zhou commented on KAFKA-4348:


Thank you for looking into this issue. I just ran the test again with the 
consumer configuration "auto.offset.reset" set to "earliest" as you suggested, 
but I can still reproduce the issue. I also verified that in the code of my 
real application, where I got the issue, the value is set to "earliest" as 
well. 



[jira] [Updated] (KAFKA-4348) On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on Kafka server

2016-10-26 Thread Yiquan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiquan Zhou updated KAFKA-4348:
---
Description: 
Steps to reproduce:
1. start the zookeeper and kafka server using the default properties from the 
distribution: 
$ bin/zookeeper-server-start.sh config/zookeeper.properties
$ bin/kafka-server-start.sh config/server.properties 

2. create a Kafka consumer using the Java API KafkaConsumer.poll(long timeout). 
It polls the records from the server every second (timeout set to 1000) and 
prints the number of records polled. The code can be found here: 
https://gist.github.com/yiquanzhou/a94569a2c4ec8992444c83f3c393f596

3. use bin/kafka-verifiable-producer.sh to generate some messages: 
$ bin/kafka-verifiable-producer.sh --topic connect-test --max-messages 200000 
--broker-list localhost:9092
wait until all 200k messages are generated and sent to the server. 

4. Run the consumer Java code. In the output console of the consumer, we can 
see that the consumer starts to poll some records, then it polls 0 records for 
several seconds before polling some more, like this:

polled 27160 records
polled 0 records
polled 0 records
polled 0 records
polled 0 records
polled 0 records
polled 26886 records
polled 26886 records
polled 0 records
polled 0 records
polled 0 records
polled 0 records
polled 0 records
polled 26701 records
polled 26214 records

The bug slows down message consumption considerably. In our use case, the 
consumer wrongly assumes that all messages have been read from the topic.

It is reproducible only on Mac OS X, not on Linux or Windows.

  was:

Steps to reproduce:
1. start the zookeeper and kafka server using the default properties from the 
distribution: 
$ bin/zookeeper-server-start.sh config/zookeeper.properties
$ bin/kafka-server-start.sh config/server.properties 

2. create a Kafka consumer using the Java API KafkaConsumer.poll(long timeout). 
It polls the records from the server every second (timeout set to 1000) and 
prints the number of records polled. The code can be found here: 
https://gist.github.com/yiquanzhou/a94569a2c4ec8992444c83f3c393f596

3. use bin/kafka-verifiable-producer.sh to generate some messages: 
$ bin/kafka-verifiable-producer.sh --topic connect-test --max-messages 200000 
--broker-list localhost:9092
wait until all 200k messages are generated and sent to the server. 

4. Run the consumer Java code. In the output console of the consumer, we can 
see that the consumer starts to poll some records, then it polls 0 records for 
several seconds before polling some more. like this:

polled 27160 records
polled 0 records
polled 0 records
polled 0 records
polled 0 records
polled 0 records
polled 26886 records
polled 26886 records
polled 0 records
polled 0 records
polled 0 records
polled 0 records
polled 0 records
polled 26701 records
polled 26214 records

The impact of this bug is that it slows down the consumption of messages a lot. 
And if an eventListener is attached to the idle event, it could be triggered 
even if there are still messages on the server.

It is only reproducible on Mac OS X but neither on Linux nor Windows.



[jira] [Updated] (KAFKA-4348) On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on Kafka server

2016-10-26 Thread Yiquan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiquan Zhou updated KAFKA-4348:
---
Environment: Max OS X EI Capitan, Java 1.8.0_111  (was: Max OS X EI 
Capitan, Java 1.8.0_77)



[jira] [Updated] (KAFKA-4348) On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on Kafka server

2016-10-26 Thread Yiquan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiquan Zhou updated KAFKA-4348:
---
Environment: Mac OS X EI Capitan, Java 1.8.0_111  (was: Max OS X EI 
Capitan, Java 1.8.0_111)



[jira] [Created] (KAFKA-4348) On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on Kafka server

2016-10-26 Thread Yiquan Zhou (JIRA)
Yiquan Zhou created KAFKA-4348:
--

 Summary: On Mac OS, KafkaConsumer.poll returns 0 when there are 
still messages on Kafka server
 Key: KAFKA-4348
 URL: https://issues.apache.org/jira/browse/KAFKA-4348
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 0.10.0.1, 0.9.0.1, 0.9.0.0
 Environment: Max OS X EI Capitan, Java 1.8.0_77
Reporter: Yiquan Zhou



Steps to reproduce:
1. start the zookeeper and kafka server using the default properties from the 
distribution: 
$ bin/zookeeper-server-start.sh config/zookeeper.properties
$ bin/kafka-server-start.sh config/server.properties 

2. create a Kafka consumer using the Java API KafkaConsumer.poll(long timeout). 
It polls the records from the server every second (timeout set to 1000) and 
prints the number of records polled. The code can be found here: 
https://gist.github.com/yiquanzhou/a94569a2c4ec8992444c83f3c393f596

3. use bin/kafka-verifiable-producer.sh to generate some messages: 
$ bin/kafka-verifiable-producer.sh --topic connect-test --max-messages 200000 
--broker-list localhost:9092
wait until all 200k messages are generated and sent to the server. 

4. Run the consumer Java code. In the output console of the consumer, we can 
see that the consumer starts to poll some records, then it polls 0 records for 
several seconds before polling some more, like this:

polled 27160 records
polled 0 records
polled 0 records
polled 0 records
polled 0 records
polled 0 records
polled 26886 records
polled 26886 records
polled 0 records
polled 0 records
polled 0 records
polled 0 records
polled 0 records
polled 26701 records
polled 26214 records

This bug slows down message consumption considerably. And if an eventListener 
is attached to the idle event, it could be triggered even while there are still 
messages on the server.

It is reproducible only on Mac OS X, not on Linux or Windows.


