Re: Apache zookeeper going down every 168 hours
Thanks for sharing the logs.

Kafka has a mechanism to mark a log dir as "failed" when an IOException happens during I/O operations, and the broker shuts down when all log dirs have been marked as failed. (Kafka allows configuring multiple log dirs for JBOD.)

From server.log, we can see that the Kafka broker shut down because of this:

    [2024-04-28 11:50:46,466] ERROR Shutdown broker because all log dirs in C:\kafka\data\logs have failed (kafka.log.LogManager)

And it seems the IOException that caused this situation shows the message below:

    The process cannot access the file because it is being used by another process.

I can find some similar issues on Windows in the Kafka JIRA (e.g. https://issues.apache.org/jira/browse/KAFKA-8172). I have never run Kafka on Windows so I'm not sure if it works, but you may try the patch on that ticket.

By the way, running Kafka on Windows can be challenging (especially in a production environment), so I recommend trying it on Linux (or at least on WSL).

Thanks,

On Sat, May 4, 2024 at 10:20 Yogeshkumar Annadurai wrote:
> [...]

--
Okada Haruki
ocadar...@gmail.com
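As an aside on the JBOD point above: a broker takes multiple log dirs via `log.dirs`, and with more than one dir a single failed dir degrades the broker instead of shutting it down. A minimal illustrative `server.properties` fragment (the paths are hypothetical, not from this thread):

```properties
# Multiple log dirs (JBOD); the broker spreads partitions across them.
# The broker only shuts down when ALL of these dirs are marked as failed.
log.dirs=/var/kafka/data1,/var/kafka/data2
```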
Re: Apache zookeeper going down every 168 hours
Hello,

We see a timeout error in server.log. The log files and properties files are attached for your reference.

regards
Yogeshkumar A

On Sat, May 4, 2024 at 5:27 AM Haruki Okada wrote:
> [...]
Re: Apache zookeeper going down every 168 hours
Hi.

log.retention shouldn't be related to this phenomenon. It sounds like we need to understand the situation more precisely before we can answer.

> apache zookeeper connection is going down automatically

How did you confirm this? In the ZooKeeper log?

Also, did you see any logs on the Kafka side (on stdout, in server.log, etc.)?

Thanks,

On Sat, May 4, 2024 at 6:48 Yogeshkumar Annadurai wrote:
> [...]

--
Okada Haruki
ocadar...@gmail.com
Apache zookeeper going down every 168 hours
Hello,

We are using Apache Kafka in a development environment, where the Apache ZooKeeper connection goes down automatically every 168 hours. We observed that log.retention.hours is set to 168 hours (7 days).

I would like to understand the configuration for this kind of scenario (the Kafka server goes down automatically; it says the broker connection cannot be established).

Regards
Yogeshkumar A
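For reference, the retention setting mentioned above and the broker's ZooKeeper connection settings are independent knobs in `server.properties`; a minimal sketch with common default values, shown only for illustration:

```properties
# How long log segments are retained before deletion;
# this governs data cleanup, not connection lifetime.
log.retention.hours=168

# Broker <-> ZooKeeper connection timeout; this (and the ZK session
# timeout) is what governs whether the ZK connection stays up.
zookeeper.connection.timeout.ms=18000
```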
Re: Failed to initialize processor KSTREAM-AGGREGATE-0000000001
Can you file a ticket for it? https://issues.apache.org/jira/browse/KAFKA

On 5/3/24 3:34 AM, Penumarthi Durga Prasad Chowdary wrote:
> [...]
Re: Failed to initialize processor KSTREAM-AGGREGATE-0000000001
Kafka versions 3.5.1 and 3.7.0; we're still encountering persistent issues. The Kafka Streams library is aligned with these Kafka versions.

Upon analysis of the logs, it seems that the problem may occur when a Kafka node disconnects from the Kafka Streams processes. This suspicion is supported by the abundance of network messages indicating disconnections, such as:

    org.apache.kafka.clients.NetworkClient
    ThreadName: kafka-streams-exec-0-test-store-6d676cf0-3910-4c25-bfad-ea2b98953db3-StreamThread-9
    Message: [Consumer clientId=kafka-streams-exec-0-test-store-6d676cf0-3910-4c25-bfad-ea2b98953db3-StreamThread-9-consumer, groupId=kafka-streams-exec-0-test-store] Node 102 disconnected.

On Mon, Apr 22, 2024 at 7:16 AM Matthias J. Sax wrote:
> Not sure either, but it sounds like a bug to me. Can you reproduce this
> reliably? What version are you using?
>
> It would be best if you could file a Jira ticket and we can take it from
> there.
>
> -Matthias
>
> On 4/21/24 5:38 PM, Penumarthi Durga Prasad Chowdary wrote:
> > Hi,
> > I have an issue in kafka-streams while constructing kafka-streams state
> > store windows (TimeWindow and SessionWindow). While kafka-streams is
> > processing data, the process sometimes intermittently throws the error
> > below:
> >
> >     ThreadName: kafka-streams-exec-0-test-store-6d676cf0-3910-4c25-bfad-ea2b98953db3-StreamThread-9
> >     TraceID: unknown CorelationID: eff36722-1430-4ffb-bf2e-c6e6cf6ae164
> >     Message: stream-client [kafka-streams-exec-0-test-store-6d676cf0-3910-4c25-bfad-ea2b98953db3] Replacing thread in the streams uncaught exception handler
> >     org.apache.kafka.streams.errors.StreamsException: failed to initialize processor KSTREAM-AGGREGATE-0000000001
> >         at org.apache.kafka.streams.processor.internals.ProcessorNode.init(ProcessorNode.java:115)
> >         at org.apache.kafka.streams.processor.internals.StreamTask.initializeTopology(StreamTask.java:986)
> >         at org.apache.kafka.streams.processor.internals.StreamTask.completeRestoration(StreamTask.java:271)
> >         at org.apache.kafka.streams.processor.internals.TaskManager.tryToCompleteRestoration(TaskManager.java:716)
> >         at org.apache.kafka.streams.processor.internals.StreamThread.initializeAndRestorePhase(StreamThread.java:901)
> >         at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:778)
> >         at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:617)
> >         at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:579)
> >     Caused by: java.lang.NullPointerException
> >         at org.apache.kafka.streams.kstream.internals.TimestampedTupleForwarder.<init>(TimestampedTupleForwarder.java:46)
> >         at org.apache.kafka.streams.kstream.internals.KStreamSessionWindowAggregate$KStreamSessionWindowAggregateProcessor.init(KStreamSessionWindowAggregate.java:138)
> >         at org.apache.kafka.streams.processor.internals.ProcessorNode.init(ProcessorNode.java:107)
> >         ... 7 more
> >
> > My understanding is that the state store is null, and at that time
> > stateStore.flush() gets invoked to send the data to the state store,
> > which leads to the above error. This error can be caught inside the
> > kafka-streams setUncaughtExceptionHandler:
> >
> >     streams.setUncaughtExceptionHandler(throwable -> {
> >         LOGGER.error("Exception in streams", throwable);
> >         return StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse.REPLACE_THREAD;
> >     });
> >
> > I'm uncertain about the exact reason for this issue. Everything seems
> > to be in order, including the Kafka cluster, and there are no errors in
> > Kafka Streams except for a few logs indicating node disconnections.
> > Is there a better way to handle this error?
> > When can this issue happen?
> > I would like to express my gratitude in advance for any assistance
> > provided.

--
Thanks,
Prasad
91-9030546248
How do we usually handle Node disconnected issue for kafka producer
Hi,

I am using the Kafka producer Java client via the Vert.x framework:
https://vertx.io/docs/apidocs/io/vertx/kafka/client/producer/KafkaProducer.html

There is a producer setting in Kafka:

    connections.max.idle.ms = 540000

So if there are no records to produce, then after 9 minutes I get this in my logs:

    [kafka-producer-network-thread | RecordProducer] [NetworkClient.java:977] - [Producer clientId=RecordProducer] Node -1 disconnected.

It looks like the Kafka producer object I created has lost its connection due to this setting. What are my options to ensure that the Kafka producer client does not close idle connections, reconnects, or keeps the connection alive even when no records arrive for a long time?

Thanks
Sachin
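The idle-connection behavior described above is driven by a producer-side setting, so one option is simply to raise it. A minimal sketch of the relevant producer properties (the value below is illustrative, not a recommendation from this thread):

```properties
# Default is 540000 ms (9 minutes); raise it so idle connections
# are kept open longer, e.g. 1 hour here.
connections.max.idle.ms=3600000
```

Note that the "Node -1 disconnected" line is logged when the client closes an idle connection; the producer transparently re-establishes the connection on the next send, so in many cases no configuration change is actually required.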