Hey everyone,

I have been running into a problem and have not been able to find enough documentation online to work out the root cause.

I have a 3-node Kafka cluster set up on three separate Windows 2012 servers. The cluster runs fine while ingesting data sent by Filebeat and Winlogbeat (the Elastic log collectors), and my Logstash cluster is configured to consume that data from Kafka. Pretty frequently, though, the data stops arriving at my Logstash cluster. When I check on Kafka, it looks like something keeps preventing connections from reaching the broker. When I run

    .\kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic winlogbeat

I get the following message:

    [2018-06-29 13:48:13,921] WARN [Consumer clientId=consumer-1, groupId=console-consumer-34649] Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
    [2018-06-29 13:48:14,999] WARN [Consumer clientId=consumer-1, groupId=console-consumer-34649] Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)

The only way I have found to fix it is to stop Kafka, remove all of the _consumer_offset files in C:\kafka\var\data, and then start Kafka again.

I see these errors in the Zookeeper logs:

    [2018-06-25 04:54:12,305] WARN Interrupted while waiting for message on queue (org.apache.zookeeper.server.quorum.QuorumCnxManager)
    java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
        at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1097)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:74)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:932)
    [2018-06-25 04:54:12,305] WARN Send worker leaving thread (org.apache.zookeeper.server.quorum.QuorumCnxManager)
    [2018-06-25 04:55:19,336] INFO Received connection request /10.1.17.93:57654 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
    [2018-06-25 04:55:23,711] WARN Exception reading or writing challenge: java.net.SocketException: Connection reset (org.apache.zookeeper.server.quorum.QuorumCnxManager)
    [2018-06-25 04:55:23,711] INFO Received connection request /10.1.17.93:57655 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
    [2018-06-25 04:55:23,711] WARN Exception reading or writing challenge: java.net.SocketException: Connection reset (org.apache.zookeeper.server.quorum.QuorumCnxManager)
    [2018-06-25 05:03:02,134] INFO Received connection request /10.1.17.93:59024 (org.apache.zookeeper.server.quorum.QuorumCnxManager)

And I see these errors on Kafka:

    log4j:ERROR Failed to rename [/logs/server.log] to [/logs/server.log.2018-06-25-13].
    log4j:ERROR Failed to rename [/logs/log-cleaner.log] to [/logs/log-cleaner.log.2018-06-25-13].
    log4j:ERROR Failed to rename [/logs/server.log] to [/logs/server.log.2018-06-25-14].
    log4j:ERROR Failed to rename [/logs/log-cleaner.log] to [/logs/log-cleaner.log.2018-06-25-14].
    log4j:ERROR Failed to rename [/logs/server.log] to [/logs/server.log.2018-06-25-15].

Can anyone help me out?

-Steve
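
P.S. In case it helps narrow things down, here is a minimal sketch of how the broker port could be probed independently of the Kafka tooling, to tell "nothing is listening on the port" apart from a problem inside the consumer. It assumes Python 3 is available on the server and that the broker listens on localhost:9092, as in the console-consumer command above. The function name broker_port_reachable is hypothetical and not part of Kafka.

```python
import socket


def broker_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds.

    This only checks that something accepts connections on the port.
    It does not speak the Kafka protocol, so a True result does not
    prove the broker is healthy.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when the connection cannot be made.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling broker_port_reachable("localhost", 9092) while the consumer is printing the warning would show whether the port itself is closed at that moment.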