[jira] [Commented] (ZOOKEEPER-3460) Zookeeper 3.4.13: keeps crashing after a repave in cloudnative environment.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882425#comment-16882425 ]

Chandrasekhar commented on ZOOKEEPER-3460:

Hi,

We are cleaning our workspaces and redeploying ...planned to repave and recreate. Can you provide some information on how to enable extra logging in ZooKeeper? We have the following in zoo.cfg:

    #
    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between
    # sending a request and getting an acknowledgement
    syncLimit=5
    # the directory where the snapshot is stored.
    dataDir=/var/zookeeper/data
    # the port at which the clients will connect
    clientPort=2181
    # the maximum number of client connections.
    # increase this if you need to handle more clients
    maxClientCnxns=60
    #minSessionTimeout=4000
    maxSessionTimeout=4
    #
    # Be sure to read the maintenance section of the
    # administrator guide before turning on autopurge.
    #
    # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
    #
    # The number of snapshots to retain in dataDir
    autopurge.snapRetainCount=3
    # Purge task interval in hours
    # Set to "0" to disable auto purge feature
    autopurge.purgeInterval=1

> Zookeeper 3.4.13: keeps crashing after a repave in cloudnative environment.
> ---
>
> Key: ZOOKEEPER-3460
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3460
> Project: ZooKeeper
> Issue Type: Bug
> Components: other
> Affects Versions: 3.4.13
> Environment: Kubernetes Cloud Native Environment.
> Reporter: Chandrasekhar
> Priority: Major
> Attachments: ZooKeeperDeploymentDescription.txt, ZookeeperCrashLog.txt
>
> We have used the minimal binary installation for Zookeeper and every time after repave the zookeeper keeps crashing with following logs...
> I have attached the zookeeper crash logs and deployment information.
> Is this related to one of the NULL Pointer Issues mentioned in https://issues.apache.org/jira/browse/ZOOKEEPER-3009 ?
> We are trying to find the exact issue here so our cloud native platform guys can help us further. Kindly let us know how to turn on debugging further.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
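Editorial note: the zoo.cfg quoted in the comment above sets maxSessionTimeout to a value far below tickTime (the value may simply be truncated in this archive). As a minimal sketch of how such a file could be sanity-checked, the Python below parses key=value lines and compares the session timeout bounds. The helper names `parse_cfg` and `check_session_timeouts` are hypothetical and not part of ZooKeeper; the only ZooKeeper behavior assumed is the documented defaults of 2 x tickTime for minSessionTimeout and 20 x tickTime for maxSessionTimeout. Nothing here establishes that this setting caused the reported crash.

```python
# Sketch only: parse_cfg and check_session_timeouts are hypothetical helpers,
# not ZooKeeper APIs.

def parse_cfg(text):
    """Parse zoo.cfg-style key=value lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_session_timeouts(settings):
    """Return a list of warnings about the session timeout bounds."""
    warnings = []
    tick = int(settings["tickTime"])  # tickTime must be present in this sketch
    # Documented defaults: min = 2 * tickTime, max = 20 * tickTime.
    min_timeout = int(settings.get("minSessionTimeout", 2 * tick))
    max_timeout = int(settings.get("maxSessionTimeout", 20 * tick))
    if max_timeout < min_timeout:
        warnings.append(
            "maxSessionTimeout (%d ms) is below minSessionTimeout (%d ms)"
            % (max_timeout, min_timeout)
        )
    return warnings


if __name__ == "__main__":
    sample = "tickTime=2000\n#minSessionTimeout=4000\nmaxSessionTimeout=4\n"
    for warning in check_session_timeouts(parse_cfg(sample)):
        print(warning)
```

With tickTime=2000 the effective minimum is 4000 ms whether or not the minSessionTimeout line is commented out, so the check flags the posted maximum in either reading.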
[jira] [Commented] (ZOOKEEPER-3460) Zookeeper 3.4.13: keeps crashing after a repave in cloudnative environment.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882269#comment-16882269 ]

Chandrasekhar commented on ZOOKEEPER-3460:

Hmm. I need help with the RCA to identify whether this is an issue that was fixed in a later release. That way I can justify that 3.5.5 will fix it and we can get around it.
[jira] [Commented] (ZOOKEEPER-3460) Zookeeper 3.4.13: keeps crashing after a repave in cloudnative environment.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882267#comment-16882267 ]

Enrico Olivelli commented on ZOOKEEPER-3460:

Your error seems related to a corrupted file; it is not related to an NPE.
How many servers do you have in your cluster?
If you have 3 servers, you can drop the corrupted one and then spawn a new server.
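Editorial note: the advice above to drop one server out of three relies on majority-quorum arithmetic. The sketch below, with hypothetical helper names `quorum_size` and `tolerated_failures`, shows why a 3-server ensemble keeps working with one server removed. It assumes the default majority quorums and does not model weighted or hierarchical quorums.

```python
# Sketch only: hypothetical helpers illustrating majority-quorum arithmetic.

def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be down while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

For 3 servers the quorum is 2, so one server can be dropped and replaced. A 4-server ensemble needs 3 for a quorum and therefore also tolerates only one failure, which is why odd sizes are usually preferred.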
[jira] [Commented] (ZOOKEEPER-3460) Zookeeper 3.4.13: keeps crashing after a repave in cloudnative environment.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882264#comment-16882264 ]

Enrico Olivelli commented on ZOOKEEPER-3460:

I guess that an upgrade to 3.5.5 (stable) is not an option for you.
[jira] [Created] (ZOOKEEPER-3460) Zookeeper 3.4.13: keeps crashing after a repave in cloudnative environment.
Chandrasekhar created ZOOKEEPER-3460:

    Summary: Zookeeper 3.4.13: keeps crashing after a repave in cloudnative environment.
    Key: ZOOKEEPER-3460
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3460
    Project: ZooKeeper
    Issue Type: Bug
    Components: other
    Affects Versions: 3.4.13
    Environment: Kubernetes Cloud Native Environment.
    Reporter: Chandrasekhar
    Attachments: ZooKeeperDeploymentDescription.txt, ZookeeperCrashLog.txt

We have used the minimal binary installation for Zookeeper and every time after repave the zookeeper keeps crashing with following logs...
I have attached the zookeeper crash logs and deployment information.
Is this related to one of the NULL Pointer Issues mentioned in https://issues.apache.org/jira/browse/ZOOKEEPER-3009 ?
We are trying to find the exact issue here so our cloud native platform guys can help us further. Kindly let us know how to turn on debugging further.
[jira] [Resolved] (ZOOKEEPER-3458) ZK 3.5.5 : Dynamic SecureClientPort and Server Specs
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andor Molnar resolved ZOOKEEPER-3458.
    Resolution: Duplicate

> ZK 3.5.5 : Dynamic SecureClientPort and Server Specs
>
> Key: ZOOKEEPER-3458
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3458
> Project: ZooKeeper
> Issue Type: Improvement
> Components: java client
> Affects Versions: 3.5.5
> Reporter: Fredrick Eisele
> Priority: Major
>
> ZK 3.5.5 : Dynamic configuration of SecureClientPort and Server Specs
> The server specification is ...
> {{server.<positive id> = <address1>:<port1>:<port2>[:role];[<client port address>:]<client port>}}
>
> The clientPort and clientPortAddress are accommodated but I do not see a provision for secureClientPort.
>
> secureClientPort and secureClientPortAddress were not made part of the dynamic configuration introduced in ZK 3.5.5
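Editorial note: the server specification discussed in this issue follows the form documented in the ZooKeeper dynamic reconfiguration guide, `server.<positive id> = <address1>:<port1>:<port2>[:role];[<client port address>:]<client port>`. The sketch below parses one such line to make the point of the issue concrete: the format has slots for a client port and client port address, but none for a secure client port. The names `ServerSpec` and `parse_server_spec` are hypothetical. The regex assumes hostnames or IPv4 addresses only (no IPv6), treats the part after `;` as optional, and fills in `participant` when the role is omitted.

```python
# Sketch only: ServerSpec and parse_server_spec are hypothetical names,
# not ZooKeeper APIs. IPv6 addresses are not handled.
import re
from typing import NamedTuple, Optional


class ServerSpec(NamedTuple):
    server_id: int
    address: str
    port1: int                      # first port in the spec
    port2: int                      # second port in the spec
    role: str                       # "participant" or "observer"
    client_address: Optional[str]   # optional client port address
    client_port: Optional[int]      # note: no field for a secure client port


_SPEC = re.compile(
    r"^server\.(?P<id>\d+)\s*=\s*"
    r"(?P<addr>[^:;\s]+):(?P<p1>\d+):(?P<p2>\d+)"
    r"(?::(?P<role>participant|observer))?"
    r"(?:;(?:(?P<caddr>[^:;\s]+):)?(?P<cport>\d+))?\s*$"
)


def parse_server_spec(line):
    """Parse a single dynamic-config server line into a ServerSpec."""
    match = _SPEC.match(line.strip())
    if match is None:
        raise ValueError("not a recognized server spec: %r" % line)
    return ServerSpec(
        server_id=int(match["id"]),
        address=match["addr"],
        port1=int(match["p1"]),
        port2=int(match["p2"]),
        role=match["role"] or "participant",
        client_address=match["caddr"],
        client_port=int(match["cport"]) if match["cport"] else None,
    )


if __name__ == "__main__":
    # Example addresses and ports are made up for illustration.
    print(parse_server_spec("server.1=10.0.0.1:2888:3888:participant;0.0.0.0:2181"))
    print(parse_server_spec("server.2 = zk2:2888:3888;2181"))
```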
[jira] [Reopened] (ZOOKEEPER-3458) ZK 3.5.5 : Dynamic SecureClientPort and Server Specs
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andor Molnar reopened ZOOKEEPER-3458:
[jira] [Commented] (ZOOKEEPER-3456) Service temporarily unavailable due to an ongoing leader election. Please refresh
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881945#comment-16881945 ]

maoling commented on ZOOKEEPER-3456:

step1: *ping 10.32.0.3*
step2: *telnet 10.32.0.3 5888*
step3: check whether port 5888 on 10.32.0.3 is occupied by another process with *lsof -i:5888*

> Service temporarily unavailable due to an ongoing leader election. Please refresh
> ---
>
> Key: ZOOKEEPER-3456
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3456
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Environment: docker container with Ubuntu 16.04
> Reporter: Marzieh
> Priority: Major
> Fix For: 3.4.14
>
> Hi
> I configured Zookeeper with four nodes for my Mesos cluster with Marathon. When I ran the Flink JSON file on Marathon, it ran without problems. But when I entered the IPs of my two slaves, only one slave showed the Flink UI and the other slave showed this error:
>
> Service temporarily unavailable due to an ongoing leader election. Please refresh
>
> I checked the "zookeeper.out" file and it said:
>
> 019-07-07 11:48:43,412 [myid:] - INFO [main:QuorumPeerConfig@136] - Reading configuration from: /home/zookeeper-3.4.14/bin/../conf/zoo.cfg
> 2019-07-07 11:48:43,421 [myid:] - INFO [main:QuorumPeer$QuorumServer@185] - Resolved hostname: 0.0.0.0 to address: /0.0.0.0
> 2019-07-07 11:48:43,421 [myid:] - INFO [main:QuorumPeer$QuorumServer@185] - Resolved hostname: 10.32.0.3 to address: /10.32.0.3
> 2019-07-07 11:48:43,422 [myid:] - INFO [main:QuorumPeer$QuorumServer@185] - Resolved hostname: 10.32.0.2 to address: /10.32.0.2
> 2019-07-07 11:48:43,422 [myid:] - INFO [main:QuorumPeer$QuorumServer@185] - Resolved hostname: 10.32.0.5 to address: /10.32.0.5
> 2019-07-07 11:48:43,422 [myid:] - WARN [main:QuorumPeerConfig@354] - Non-optimial configuration, consider an odd number of servers.
> 2019-07-07 11:48:43,422 [myid:] - INFO [main:QuorumPeerConfig@398] - Defaulting to majority quorums
> 2019-07-07 11:48:43,425 [myid:3] - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
> 2019-07-07 11:48:43,425 [myid:3] - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
> 2019-07-07 11:48:43,425 [myid:3] - INFO [main:DatadirCleanupManager@101] - Purge task is not scheduled.
> 2019-07-07 11:48:43,432 [myid:3] - INFO [main:QuorumPeerMain@130] - Starting quorum peer
> 2019-07-07 11:48:43,437 [myid:3] - INFO [main:ServerCnxnFactory@117] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connect$
> 2019-07-07 11:48:43,439 [myid:3] - INFO [main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:2181
> 2019-07-07 11:48:43,440 [myid:3] - ERROR [main:QuorumPeerMain@92] - Unexpected exception, exiting abnormally
> java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind0(Native Method)
>     at sun.nio.ch.Net.bind(Net.java:433)
>     at sun.nio.ch.Net.bind(Net.java:425)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
>     at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:90)
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:133)
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
>
> I searched a lot and could not find the solution.
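Editorial note: the telnet and lsof steps suggested in the comment above, and the "Address already in use" error on port 2181 in the quoted log, both come down to two questions: can a remote port be reached, and is a local port already taken. A rough Python sketch of both checks follows. The function names `can_connect` and `port_is_free` are hypothetical, and the bind test is only an approximation of what the server does at startup (it does not identify which process holds the port, as lsof would).

```python
# Sketch only: can_connect and port_is_free are hypothetical helpers.
import socket


def can_connect(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds (roughly the telnet test)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def port_is_free(port, host="0.0.0.0"):
    """True if this process can bind host:port right now.

    False usually corresponds to 'Address already in use'.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
            return True
        except OSError:
            return False


if __name__ == "__main__":
    # Addresses and ports taken from the comment and log above.
    print("peer 10.32.0.3:5888 reachable:", can_connect("10.32.0.3", 5888))
    print("local port 2181 free:", port_is_free(2181))
```

If `port_is_free(2181)` returns False on the failing node before ZooKeeper starts, another process (possibly an earlier ZooKeeper instance) is likely still holding the client port.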