[ 
https://issues.apache.org/jira/browse/FLINK-17443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-17443:
-------------------------------------
    Component/s: Runtime / Coordination

> Flink's ZK in HA mode setup is unable to start up if any of the zk hosts are 
> unreachable
> ----------------------------------------------------------------------------------------
>
>                 Key: FLINK-17443
>                 URL: https://issues.apache.org/jira/browse/FLINK-17443
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>            Reporter: Piyush Narang
>            Priority: Major
>              Labels: pull-request-available
>
> We occasionally hit an issue where our Flink cluster will not start up if any 
> of the ZooKeeper hosts passed in the "high-availability.zookeeper.quorum" 
> config setting is unreachable. A sample error is shown below.
> The error seems to stem from our use of an older ZooKeeper dependency 
> version (3.4.10). It has been fixed in the 3.4.x branch as part of 
> https://issues.apache.org/jira/browse/ZOOKEEPER-1576 
> ([https://github.com/apache/zookeeper/commit/be1409cc9a14ac2e28693e0e02a0ba6d9713565e]).
>  
> {code:java}
> java.net.UnknownHostException: zk01-pa4.hpc.criteo.prod: Name or service not known
>   at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>   at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
>   at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
>   at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
>   at java.net.InetAddress.getAllByName(InetAddress.java:1193)
>   at java.net.InetAddress.getAllByName(InetAddress.java:1127)
>   at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
>   at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
>   at org.apache.flink.shaded.curator.org.apache.curator.utils.DefaultZookeeperFactory.newZooKeeper(DefaultZookeeperFactory.java:29)
>   at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl$2.newZooKeeper(CuratorFrameworkImpl.java:150)
>   at org.apache.flink.shaded.curator.org.apache.curator.HandleHolder$1.getZooKeeper(HandleHolder.java:94)
>   at org.apache.flink.shaded.curator.org.apache.curator.HandleHolder.getZooKeeper(HandleHolder.java:55)
>   at org.apache.flink.shaded.curator.org.apache.curator.ConnectionState.reset(ConnectionState.java:262)
>   at org.apache.flink.shaded.curator.org.apache.curator.ConnectionState.start(ConnectionState.java:109)
>   at org.apache.flink.shaded.curator.org.apache.curator.CuratorZookeeperClient.start(CuratorZookeeperClient.java:191)
>   at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl.start(CuratorFrameworkImpl.java:259)
>   at org.apache.flink.runtime.util.ZooKeeperUtils.startCuratorFramework(ZooKeeperUtils.java:131)
>   at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:123)
>   at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:292)
>   at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:257)
> {code}
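
According to the stack trace, the failure occurs while StaticHostProvider.<init> resolves the quorum hosts through InetAddress.getAllByName, so a single unresolvable name aborts startup. The sketch below illustrates the general idea of tolerating such hosts: skip names that fail to resolve and fail only when none resolve. It is not the ZOOKEEPER-1576 patch or Flink code; the class TolerantHostResolver, the Resolver interface, and the method resolveAll are hypothetical names introduced for illustration only.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of ZooKeeper, Curator, or Flink.
final class TolerantHostResolver {

    /** Hypothetical lookup hook so the DNS call can be replaced in tests. */
    @FunctionalInterface
    public interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    /**
     * Resolves every host name, skipping those that cannot be resolved.
     * Throws only if no host at all could be resolved.
     */
    static List<InetAddress> resolveAll(List<String> hosts, Resolver resolver) {
        List<InetAddress> resolved = new ArrayList<>();
        for (String host : hosts) {
            try {
                resolved.addAll(Arrays.asList(resolver.resolve(host)));
            } catch (UnknownHostException e) {
                // One bad host should not prevent use of the remaining ones.
                System.err.println("Skipping unresolvable host: " + host);
            }
        }
        if (resolved.isEmpty()) {
            throw new IllegalArgumentException(
                    "None of the hosts could be resolved: " + hosts);
        }
        return resolved;
    }
}
```

A caller could pass InetAddress::getAllByName as the resolver to perform real DNS lookups, for example TolerantHostResolver.resolveAll(hosts, InetAddress::getAllByName).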



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
