----- Forwarded Message -----
From: Elena Batranu <batranuel...@yahoo.com>
To: d...@kafka.apache.org <d...@kafka.apache.org>
Sent: Tuesday, May 16, 2023 at 09:54:02 AM GMT+3
Subject: Migration from zookeeper to kraft not working

Hello!

I have a problem with my Kafka configuration (Kafka 3.4). I'm trying to migrate from ZooKeeper to KRaft. I have 3 brokers, and one of them also hosts ZooKeeper. I want to restart my brokers one by one, without downtime, so I started by running with both the KRaft and the ZooKeeper configuration in place, to do the migration gradually. At this step my nodes are up, but I get the following errors in the KRaft logs:

[2023-05-16 06:35:19,485] DEBUG [BrokerToControllerChannelManager broker=0 name=quorum]: No controller provided, retrying after backoff (kafka.server.BrokerToControllerRequestThread)
[2023-05-16 06:35:19,585] DEBUG [BrokerToControllerChannelManager broker=0 name=quorum]: Controller isn't cached, looking for local metadata changes (kafka.server.BrokerToControllerRequestThread)
[2023-05-16 06:35:19,586] DEBUG [BrokerToControllerChannelManager broker=0 name=quorum]: No controller provided, retrying after backoff (kafka.server.BrokerToControllerRequestThread)
[2023-05-16 06:35:19,624] INFO [RaftManager nodeId=0] Node 3002 disconnected. (org.apache.kafka.clients.NetworkClient)
[2023-05-16 06:35:19,624] WARN [RaftManager nodeId=0] Connection to node 3002 (/192.168.25.172:9093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2023-05-16 06:35:19,642] INFO [RaftManager nodeId=0] Node 3001 disconnected. (org.apache.kafka.clients.NetworkClient)
[2023-05-16 06:35:19,642] WARN [RaftManager nodeId=0] Connection to node 3001 (/192.168.25.232:9093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2023-05-16 06:35:19,643] INFO [RaftManager nodeId=0] Node 3000 disconnected. (org.apache.kafka.clients.NetworkClient)
[2023-05-16 06:35:19,643] WARN [RaftManager nodeId=0] Connection to node 3000 (/192.168.25.146:9093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
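The WARN lines repeat for all three controller addresses, so it looks like nothing is listening on port 9093 on any of the hosts. To rule out networking I plan to check each controller port directly, with something like:

  nc -vz 192.168.25.146 9093
  nc -vz 192.168.25.232 9093
  nc -vz 192.168.25.172 9093

but since all three addresses fail the same way in the logs, I suspect the controller processes themselves never come up.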
I configured the controller on each broker; the file looks like this (license header snipped):

#
# This configuration file is intended for use in KRaft mode, where
# Apache ZooKeeper is not present. See config/kraft/README.md for details.
#

############################# Server Basics #############################

# The role of this server. Setting this puts us in KRaft mode
process.roles=controller

# The node id associated with this instance's roles
node.id=3000

# The connect string for the controller quorum
#controller.quorum.voters=3000@localhost:9093
controller.quorum.voters=3000@192.168.25.146:9093,3001@192.168.25.232:9093,3002@192.168.25.172:9093

############################# Socket Server Settings #############################

# The address the socket server listens on.
# Note that only the controller listeners are allowed here when `process.roles=controller`,
# and this listener should be consistent with the `controller.quorum.voters` value.
# FORMAT:
#   listeners = listener_name://host_name:port
# EXAMPLE:
#   listeners = PLAINTEXT://your.host.name:9092
listeners=CONTROLLER://:9093

# A comma-separated list of the names of the listeners used by the controller.
# This is required if running in KRaft mode.
controller.listener.names=CONTROLLER

# Maps listener names to security protocols; the default is for them to be the same.
# See the config documentation for more details.
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL

# The number of threads that the server uses for receiving requests from the network
# and sending responses to the network
num.network.threads=3

# The number of threads that the server uses for processing requests, which may include disk I/O
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400

# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400

# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600
############################# Log Basics #############################

# A comma separated list of directories under which to store log files
log.dirs=/data/kraft-controller-logs

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1

# The number of threads per data directory to be used for log recovery at startup
# and flushing at shutdown. This value is recommended to be increased for
# installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1

############################# Internal Topic Settings #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 is recommended to ensure availability, such as 3.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

############################# Log Flush Policy #############################

# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
#   1. Durability: Unflushed data may be lost if you are not using replication.
#   2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur
#      as there will be a lot of data to flush.
#   3. Throughput: The flush is generally the most expensive operation, and a small flush interval
#      may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.

# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000

# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000

############################# Log Retention Policy #############################

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.

# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=168

# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

# Enable the migration
zookeeper.metadata.migration.enable=true

# ZooKeeper client configuration
zookeeper.connect=localhost:2181

##############################################################################################
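One step I am not sure I did correctly: if I understand the 3.4 migration documentation, each KRaft controller's log directory (log.dirs=/data/kraft-controller-logs above) has to be formatted with the cluster ID of the existing ZooKeeper cluster before the controller process can start. Something like this, assuming ZooKeeper on localhost:2181 as in my config (the config path is just where I keep the file above):

  # Read the existing cluster ID from ZooKeeper (the "id" field of the /cluster/id znode)
  bin/zookeeper-shell.sh localhost:2181 get /cluster/id

  # Format the controller's metadata directory with that ID
  bin/kafka-storage.sh format --cluster-id <CLUSTER_ID> --config config/kraft/controller.properties

Could a missing format step explain why nothing answers on 9093?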
Also, this is my setup for the server.properties file (license header snipped again):

#
# This configuration file is intended for use in KRaft mode, where
# Apache ZooKeeper is not present. See config/kraft/README.md for details.
#

############################# Server Basics #############################

# The role of this server. Setting this puts us in KRaft mode
#process.roles=broker,controller
#process.roles=broker

# The node id associated with this instance's roles
#node.id=0
broker.id=0

# The connect string for the controller quorum
#controller.quorum.voters=3000@192.168.25.146:9093
controller.quorum.voters=3000@192.168.25.146:9093,3001@192.168.25.232:9093,3002@192.168.25.172:9093

############################# Socket Server Settings #############################

# The address the socket server listens on.
# Combined nodes (i.e. those with `process.roles=broker,controller`) must list the controller listener here at a minimum.
# If the broker listener is not defined, the default listener will use a host name that is equal to the value of
# java.net.InetAddress.getCanonicalHostName(), with PLAINTEXT listener name, and port 9092.
# FORMAT:
#   listeners = listener_name://host_name:port
# EXAMPLE:
#   listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://192.168.25.146:9092

# Name of listener used for communication between brokers.
#inter.broker.listener.name=PLAINTEXT

# Listener name, hostname and port the broker will advertise to clients.
# If not set, it uses the value for "listeners".
advertised.listeners=PLAINTEXT://192.168.25.146:9092

# A comma-separated list of the names of the listeners used by the controller.
# If no explicit mapping is set in `listener.security.protocol.map`, the default will be to use the PLAINTEXT protocol.
# This is required if running in KRaft mode.
controller.listener.names=CONTROLLER

# Maps listener names to security protocols; the default is for them to be the same.
# See the config documentation for more details.
#listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT

# The number of threads that the server uses for receiving requests from the network
# and sending responses to the network
num.network.threads=3

# The number of threads that the server uses for processing requests, which may include disk I/O
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400

# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400

# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600

############################# Log Basics #############################

# A comma separated list of directories under which to store log files
log.dirs=/data/kraft
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1

# The number of threads per data directory to be used for log recovery at startup
# and flushing at shutdown. This value is recommended to be increased for
# installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1

############################# Internal Topic Settings #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 is recommended to ensure availability, such as 3.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

############################# Log Flush Policy #############################

# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
#   1. Durability: Unflushed data may be lost if you are not using replication.
#   2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur
#      as there will be a lot of data to flush.
#   3. Throughput: The flush is generally the most expensive operation, and a small flush interval
#      may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.

# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000

# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000

############################# Log Retention Policy #############################

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.

# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=168

# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

zookeeper.connect=192.168.25.146:2181
zookeeper.metadata.migration.enable=true
inter.broker.protocol.version=3.4

##########################################################################

As the next step, I tried to comment out the ZooKeeper-related lines (on one of my brokers):

  zookeeper.connect=192.168.25.146:2181
  zookeeper.metadata.migration.enable=true
  inter.broker.protocol.version=3.4

and to put this in place of broker.id:

  process.roles=broker
  node.id=0

but after this Kafka isn't working anymore. All my brokers are in the same cluster, so I don't think it's a problem with the connectivity between them. I think I omitted something in the configuration files. I want to fully migrate to KRaft.
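My understanding from the documentation is that this last step is only valid once the migration itself has completed on the active controller, and that the broker's final KRaft config should keep the quorum and controller listener settings, roughly like this (this is my reading of the docs, not something I have verified):

  process.roles=broker
  node.id=0
  controller.quorum.voters=3000@192.168.25.146:9093,3001@192.168.25.232:9093,3002@192.168.25.172:9093
  controller.listener.names=CONTROLLER
  listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
  # zookeeper.connect, zookeeper.metadata.migration.enable and
  # inter.broker.protocol.version removed entirely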
Please take a look and tell me if you have any suggestions.
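P.S. Before restarting the brokers in KRaft mode I am also planning to watch the active controller's log for the migration completion message mentioned in the docs, with something like (the log path is a placeholder for wherever your controller logs go):

  grep "Completed migration of metadata from ZooKeeper to KRaft" /path/to/controller/server.log

I have not seen that message yet, which would make sense if the controllers never managed to start.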