Hello all,

Please provide some input to help resolve this issue. I have cross-verified that it is not a DNS issue, and I am not sure why we are facing it. Should I report a bug for this? Please let me know.

Regards,
Praveen Kumar K S

On Tue, Oct 27, 2020 at 8:41 PM Sabina Marx <sabina.m...@sneo.io> wrote:

> Does anyone have any idea what we can do?
>
> All Zookeepers (3) and Kafkas are running (5 nodes, meaning 5 physical
> hosts). Then I reboot one physical host. I still have the redundancy. But
> when the physical host comes up and zookeeper and then Kafka come up, Kafka
> times out and does not connect to the existing Kafka cluster.
>
> Log:
> Started Apache Kafka.
> INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
> INFO Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation (org.apache.zookeeper.common.X509Util)
> INFO Registered signal handlers for TERM, INT, HUP (org.apache.kafka.common.utils.LoggingSignalHandler)
> INFO starting (kafka.server.KafkaServer)
> INFO Connecting to zookeeper on X.X.X.X:2181,X.X.X.X:2181,X.X.X.X:2181 (kafka.server.KafkaServer)
> INFO [ZooKeeperClient Kafka server] Initializing a new session to X.X.X.X:2181,X.X.X.X:2181,X.X.X.X:2181. (kafka.zookeeper.ZooKeeperClient)
> INFO Client environment:zookeeper.version=3.5.8-f439ca583e70862c3068a1f2a7d4d068eec33315, built on 05/04/2020 15:53 GMT (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:host.name=Kafka03.X.X (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:java.version=11.0.8 (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:java.vendor=Debian (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:java.home=/usr/lib/jvm/java-11-openjdk-amd64 (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:java.class.path=/opt/kafka/bin/../libs/activation-1.1.1.jar:/opt/kafka/bin/../libs/aopalliance-repackaged-2.5.0.jar:/opt/kafka/bin/../libs/argparse4j-0.7.0.jar:/opt/kafka/bin/../libs/audience-annotations-0.5.0.j
> INFO Client environment:java.library.path=/usr/java/packages/lib:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:os.arch=amd64 (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:os.version=4.19.0-10-amd64 (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:user.name=it (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:user.home=/home/it (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:user.dir=/ (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:os.memory.free=980MB (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:os.memory.max=1024MB (org.apache.zookeeper.ZooKeeper)
> INFO Client environment:os.memory.total=1024MB (org.apache.zookeeper.ZooKeeper)
> INFO Initiating client connection, connectString=X.X.X.X:2181,X.X.X.X:2181,X.X.X.X:2181 sessionTimeout=18000 watcher=kafka.zookeeper.ZooKeeperClient$ZooKeeperClientWatcher$@48b67364 (org.apache.zookeeper.ZooKeeper)
> INFO jute.maxbuffer value is 4194304 Bytes (org.apache.zookeeper.ClientCnxnSocket)
> INFO zookeeper.request.timeout value is 0. feature enabled= (org.apache.zookeeper.ClientCnxn)
> INFO [ZooKeeperClient Kafka server] Waiting until connected. (kafka.zookeeper.ZooKeeperClient)
> INFO Opening socket connection to server kafka01.X.X/X.X.X.X:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> INFO Socket connection established, initiating session, client: /X.X.X.X:45952, server: kafka01.X.X/X.X.X.X:2181 (org.apache.zookeeper.ClientCnxn)
> WARN Client session timed out, have not heard from server in 6003ms for sessionid 0x0 (org.apache.zookeeper.ClientCnxn)
> INFO Client session timed out, have not heard from server in 6003ms for sessionid 0x0, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> INFO Opening socket connection to server kafka05.X.X/X.X.X.X:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> INFO Socket connection established, initiating session, client: /X.X.X.X:51582, server: kafka05.X.X/X.X.X.X:2181 (org.apache.zookeeper.ClientCnxn)
> WARN Client session timed out, have not heard from server in 6003ms for sessionid 0x0 (org.apache.zookeeper.ClientCnxn)
> INFO Client session timed out, have not heard from server in 6003ms for sessionid 0x0, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> INFO Opening socket connection to server X.X.X.X/X.X.X.X:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> INFO Socket connection established, initiating session, client: /X.X.X.X:44992, server: X.X.X.X/X.X.X.X:2181 (org.apache.zookeeper.ClientCnxn)
> INFO [ZooKeeperClient Kafka server] Closing. (kafka.zookeeper.ZooKeeperClient)
> WARN Client session timed out, have not heard from server in 6001ms for sessionid 0x0 (org.apache.zookeeper.ClientCnxn)
> INFO EventThread shut down for session: 0x0 (org.apache.zookeeper.ClientCnxn)
> INFO Session: 0x0 closed (org.apache.zookeeper.ZooKeeper)
> INFO [ZooKeeperClient Kafka server] Closed. (kafka.zookeeper.ZooKeeperClient)
> ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
> kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
>     at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:262)
>     at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:119)
>     at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1865)
>     at kafka.server.KafkaServer.createZkClient$1(KafkaServer.scala:419)
>     at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:444)
>     at kafka.server.KafkaServer.startup(KafkaServer.scala:222)
>     at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44)
>     at kafka.Kafka$.main(Kafka.scala:82)
>     at kafka.Kafka.main(Kafka.scala)
> INFO shutting down (kafka.server.KafkaServer)
> INFO shut down completed (kafka.server.KafkaServer)
> ERROR Exiting Kafka. (kafka.server.KafkaServerStartable)
> INFO shutting down (kafka.server.KafkaServer)
> Main process exited, code=exited, status=1/FAILURE
> Failed with result 'exit-code'.
>
> On 23.10.20, 10:48, "Praveen Kumar K S" <prav...@securelyshare.com> wrote:
>
> Hello,
>
> Can someone please help me understand what the issue is?
>
> Regards,
> Praveen Kumar K S
> +91-9986855625
>
>
> On Thu, Oct 22, 2020 at 6:52 AM Praveen Kumar K S <prav...@securelyshare.com> wrote:
>
> > Hello Experts,
> >
> > Any help to debug and resolve this issue is highly appreciated.
> >
> > Regards,
> > Praveen
> >
> > On Wed, 21 Oct, 2020, 11:26 Sabina Marx, <sabina.m...@sneo.io> wrote:
> >
> >> Hi Praveen,
> >>
> >> it seems to be the same problem; your log looks quite similar to mine.
> >> But I have no solution so far.
> >>
> >> Regards
> >> Sabina
> >>
> >> From: Praveen Kumar K S <prav...@securelyshare.com>
> >> Reply-To: "users@kafka.apache.org" <users@kafka.apache.org>
> >> Date: Tuesday, 20 October 2020 at 20:07
> >> To: "users@kafka.apache.org" <users@kafka.apache.org>
> >> Subject: Re: Client session timed out
> >>
> >> Hello,
> >>
> >> I'm not sure if I can add my issue to this thread, but it seems like I'm
> >> facing the same problem.
> >>
> >> KAFKA_VERSION=2.5.1
> >> ZK_VERSION=3.5.8
> >>
> >> I run a 3-node zookeeper cluster and a 3-node kafka cluster as docker
> >> containers in a docker swarm environment. When I install it for the first
> >> time, everything goes well. Zookeeper and Kafka are able to form the
> >> cluster. Services are healthy.
> >>
> >> But when I issue the docker update command, kafka does not come up, though
> >> the zookeeper cluster is healthy. Below is the sequence of steps.
> >>
> >> docker service update one_zookeeper --image x.x.x/v1/zookeeper:latest --force
> >> docker service update one_zookeeper1 --image x.x.x/v1/zookeeper:latest --force
> >> docker service update one_zookeeper2 --image x.x.x/v1/zookeeper:latest --force
> >>
> >> Zookeeper is healthy now. I'm able to query leader and follower.
> >>
> >> Now I update kafka, and it doesn't work.
> >> docker service update one_kafka --image x.x.com/v1/kafka:latest --force
> >>
> >> Please find the Kafka log attached.
> >>
> >> Although the kafka update has failed, I see that kafka1 and kafka2 are
> >> running and healthy.
> >>
> >> docker service ls | grep kafka
> >> one_kafka    replicated    0/1
> >> one_kafka1   replicated    1/1
> >> one_kafka2   replicated    1/1
> >>
> >> To cross-verify, I brought down the zookeeper and kafka services
> >> without data loss. I preserve
> >> zookeeperdata, zookeeperlogs, zookeepertxns and kafkadata, kafkalogs.
> >>
> >> docker stack remove one
> >> docker stack deploy -c cluster-zookeeper.yml one
> >> docker stack deploy -c cluster-kafka.yml one
> >>
> >> Now, all the services are healthy.
> >>
> >> I'm not sure why the kafka deployment fails only during an update. There
> >> is no change in the configuration of either zookeeper or kafka.
> >>
> >> Please help me resolve this issue and let me know if you need any
> >> additional details.
> >>
> >> Regards,
> >> Praveen Kumar K S
> >> +91-9986855625
> >>
> >>
> >> On Tue, Oct 20, 2020 at 3:54 PM Sabina Marx <sabina.m...@sneo.io> wrote:
> >> Yes, it's the same problem.
> >>
> >> On 19.10.20, 19:50, "Mich Talebzadeh" <mich.talebza...@gmail.com> wrote:
> >>
> >> Can you try to disable automatic start and, on the node just booted,
> >> start zookeeper first, check in the log that it is connected, and then
> >> start Kafka?
> >>
> >> I assume everything is set up OK, including in
> >> $KAFKA_HOME/config/server<N>.properties the values for broker.id, hostname,
> >> zookeeper.connect=<server1>:2181,<server2>:2181,<serverN>:2181 and also
> >> zookeeper.connection.timeout.ms=6000 (default)
> >>
> >> HTH
> >>
> >> LinkedIn *https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw*
> >>
> >> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> >> loss, damage or destruction of data or any other property which may arise
> >> from relying on this email's technical content is explicitly disclaimed.
> >> The author will in no case be liable for any monetary damages arising from
> >> such loss, damage or destruction.
> >>
> >>
> >> On Mon, 19 Oct 2020 at 18:17, Sabina Marx <sabina.m...@sneo.io> wrote:
> >>
> >> > Yes, you have it
> >> >
> >> > ________________________________
> >> > From: Mich Talebzadeh <mich.talebza...@gmail.com>
> >> > Sent: Monday, October 19, 2020 7:09:53 PM
> >> > To: users@kafka.apache.org <users@kafka.apache.org>
> >> > Subject: Re: Client session timed out
> >> >
> >> > Ok I think it is clearer now.
> >> >
> >> > As I understand it, all your Zookeepers and Kafkas are running (5 nodes
> >> > meaning 5 physical hosts?). Then you have to reboot one physical host. You
> >> > still have the redundancy. But when the physical host comes up and your
> >> > zookeeper and then Kafka come up, you have Kafka timing out and not
> >> > connecting to the existing Kafka cluster?
> >> >
> >> > Does that make sense?
> >> >
> >> >
> >> > On Mon, 19 Oct 2020 at 17:59, Sabina Marx <sabina.m...@sneo.io> wrote:
> >> >
> >> > > No, sorry, I'm not so good at explaining.
> >> > > The scenario is: the complete cluster is running, all zookeepers and all
> >> > > kafkas. And then I restart one server; the others are still running.
> >> > >
> >> > > ________________________________
> >> > > From: Mich Talebzadeh <mich.talebza...@gmail.com>
> >> > > Sent: Monday, October 19, 2020 6:46:49 PM
> >> > > To: users@kafka.apache.org <users@kafka.apache.org>
> >> > > Subject: Re: Client session timed out
> >> > >
> >> > > Can you please clarify: when you say you start one pair (Zookeeper and
> >> > > Kafka), what happens to the others? Do you keep them down?
> >> > >
> >> > >
> >> > > On Mon, 19 Oct 2020 at 17:33, Sabina Marx <sabina.m...@sneo.io> wrote:
> >> > >
> >> > > > Yes, it's a new setup. Kafka and zookeeper run as services and are
> >> > > > enabled, so they should start at system start. And kafka starts after
> >> > > > zookeeper.
> >> > > > If I stop everything and start it again, it works. But then, if I
> >> > > > restart one server with zookeeper and kafka, kafka gets the timeouts.
> >> > > >
> >> > > > On 19.10.20, 18:16, "Mich Talebzadeh" <mich.talebza...@gmail.com> wrote:
> >> > > >
> >> > > > OK so the issue seems to be the kafka cluster. Is this a new setup?
> >> > > >
> >> > > > HTH
> >> > > >
> >> > > >
> >> > > > On Mon, 19 Oct 2020 at 17:11, Sabina Marx <sabina.m...@sneo.io> wrote:
> >> > > >
> >> > > > > Thanks for your answer, but the zookeeper ensemble is running and
> >> > > > > ports are ok.
> >> > > > >
> >> > > > > On 19.10.20, 17:38, "Mich Talebzadeh" <mich.talebza...@gmail.com> wrote:
> >> > > > >
> >> > > > > Start the zookeeper ensemble first, before starting the Kafka
> >> > > > > cluster. The zookeeper nodes need to elect a leader and ensure that
> >> > > > > they all come online OK. Check ports 2181, 2888 and 3888 using
> >> > > > >
> >> > > > > netstat -plten|egrep '2181|2888|3888'
> >> > > > >
> >> > > > > tcp  0  0 :::2181                     :::*  LISTEN  1005  9934134  29170/java
> >> > > > > tcp  0  0 ::ffff:50.140.197.217:2888  :::*  LISTEN  1005  9935496  29170/java
> >> > > > > tcp  0  0 ::ffff:50.140.197.217:3888  :::*  LISTEN  1005  9935493  29170/java
> >> > > > >
> >> > > > > P.S. I assume you are talking about Apache Kafka here.
> >> > > > >
> >> > > > > HTH
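
A note on the port check above: a LISTEN entry only shows that the process has bound the port, not that the server answers requests. As a minimal sketch, assuming the ensemble accepts the `srvr` four-letter command (to my knowledge the only one whitelisted by default in ZooKeeper 3.5.x), each server could be asked for its mode from the rebooted host. The helper names `zk_four_letter`, `parse_mode` and `check_ensemble` are illustrative only, and the host names in the usage comment are hypothetical.

```python
import socket


def zk_four_letter(host, port=2181, cmd="srvr", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(reply):
    """Extract the 'Mode:' value (leader, follower, standalone) from a srvr reply."""
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return "unknown"


def check_ensemble(hosts, port=2181):
    """Return a dict mapping each host to its mode or to an error description."""
    results = {}
    for host in hosts:
        try:
            results[host] = parse_mode(zk_four_letter(host, port))
        except OSError as exc:  # refused, unreachable, DNS failure or timeout
            results[host] = f"unreachable ({exc})"
    return results


# Usage (hypothetical host names, take the real ones from zookeeper.connect):
#   for host, status in check_ensemble(["zk1.example", "zk2.example", "zk3.example"]).items():
#       print(host, status)
```

If a server accepts the TCP connection but never returns a reply, that would look much like the "Socket connection established" followed by "Client session timed out" pattern reported in this thread, and would point at the server side rather than at Kafka.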
> >> > > > >
> >> > > > >
> >> > > > > On Mon, 19 Oct 2020 at 14:42, Sabina Marx <sabina.m...@sneo.io> wrote:
> >> > > > >
> >> > > > > > Hi,
> >> > > > > >
> >> > > > > > I have a 5-node kafka cluster with 3 zookeepers. If I restart 1 node
> >> > > > > > (zookeeper and kafka), kafka gets:
> >> > > > > >
> >> > > > > > Client session timed out, have not heard from server in 6007ms for
> >> > > > > > sessionid 0x0 (org.apache.zookeeper.ClientCnxn)
> >> > > > > > Client session timed out, have not heard from server in 6007ms for
> >> > > > > > sessionid 0x0, closing socket connection and attempting reconnect
> >> > > > > > (org.apache.zookeeper.ClientCnxn)
> >> > > > > >
> >> > > > > > And my kafka service does not start.
> >> > > > > > I have set tickTime=6000 in zookeeper.properties, but that didn't
> >> > > > > > help. What can I do?
> >> > > > > >
> >> > > > > > Many thanks for your help.
> >> > > > > > Sabina
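
On the roughly 6000 ms figure in these messages: my understanding of the ZooKeeper Java client (an assumption, not something confirmed in this thread) is that it divides the requested session timeout by the number of servers in the connect string to get a per-server connect timeout, and uses two thirds of the session timeout as the read timeout. With sessionTimeout=18000 and the three servers shown in the log quoted earlier in the thread, that gives 6000 ms per server, which would match the 6001 to 6007 ms values reported. The server-side tickTime only bounds the negotiated session timeout (by default between 2x and 20x tickTime), which may be why setting tickTime=6000 made no difference. The function names below are illustrative only.

```python
def client_timeouts(session_timeout_ms, server_count):
    """Timeouts the ZooKeeper Java client is assumed to derive (in ms)."""
    return {
        # Time allowed per server while the session is not yet established.
        "connect_timeout_ms": session_timeout_ms // server_count,
        # Time allowed without hearing from the server once connected.
        "read_timeout_ms": session_timeout_ms * 2 // 3,
    }


def negotiable_session_range(tick_time_ms):
    """Default server-side bounds for the session timeout: 2x to 20x tickTime."""
    return (2 * tick_time_ms, 20 * tick_time_ms)


if __name__ == "__main__":
    # Values taken from the log in this thread: sessionTimeout=18000, 3 servers.
    print(client_timeouts(18000, 3))
    # tickTime=6000 as set in zookeeper.properties by the original poster.
    print(negotiable_session_range(6000))
```

Under these assumptions, three failed attempts of about 6 seconds each add up to the 18 seconds after which the broker gives up. Raising zookeeper.connection.timeout.ms in the broker configuration should give the broker more attempts, but it would not explain why the servers fail to answer the session request in the first place.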