Thank you, Christopher. Can you comment on what version of Hadoop you are using? We've been pointed by some JDK experts at Microsoft to this ZK issue: https://issues.apache.org/jira/browse/ZOOKEEPER-3779. As far as I can see, Hadoop 3.2.1 (which is what I was using) references ZK 3.4.14, which would perhaps explain the issue. I'm going to repeat my tests with a snapshot build of Hadoop branch-3.3 and report back.
Stay safe everyone!

Warm Regards,
Arvind.

-----Original Message-----
From: Christopher <ctubb...@apache.org>
Sent: Thursday, May 14, 2020 4:42 AM
To: fluo-dev <dev@fluo.apache.org>
Subject: Re: [EXTERNAL] Re: JDK14 and Fluo

Just a follow-up from this. As far as I can tell, the problem I was having was related to my system's hostname resolver, and jdk14 works fine.

On Wed, May 6, 2020, 18:10 Arvind Shyamsundar <arvin...@microsoft.com.invalid> wrote:

> FWIW, sharing my notes - here's what I see when attempting to set up a
> cluster (Accumulo-2.1.0-SNAPSHOT, Hadoop 3.2.1, ZooKeeper 3.5.7) with
> OpenJDK 14. The step where the Hadoop ZKFC (hdfs zkfc -formatZK
> -force) is configured fails with the below exception, which seemed
> vaguely similar to Christopher's report of Accumulo being unable to connect
> to ZK.
>
> 2020-05-06 22:00:47,518 INFO zookeeper.ClientCnxn: Opening socket connection to server leader-3/<unresolved>:2181. Will not attempt to authenticate using SASL (unknown error)
> 2020-05-06 22:00:47,518 WARN zookeeper.ClientCnxn: Session 0x0 for server leader-3/<unresolved>:2181, unexpected error, closing socket connection and attempting reconnect
> java.nio.channels.UnresolvedAddressException
>     at java.base/sun.nio.ch.Net.checkAddress(Net.java:139)
>     at java.base/sun.nio.ch.SocketChannelImpl.checkRemote(SocketChannelImpl.java:727)
>     at java.base/sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:741)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287)
>     at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1021)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1064)
>
> (the above is repeated for the other 2 ZK servers configured in this
> cluster as well).
> The JDK item I found (again, this is all
> circumstantial, and I have not spent any time to dig deep) was
> https://bugs.openjdk.java.net/browse/JDK-8225499.
> The associated note in JDK 14 release notes
> (https://jdk.java.net/14/release-notes)
> has the below text, which seemed interesting as it kind of matches up with
> the symptoms above:
>
> Additionally, the string format for unresolved addresses has been changed.
> The method now represents the literal IP address with the token
> <unresolved>, for example: foo/<unresolved>:80 instead of foo:80. This
> is based on InetAddress::toString, which returns a string of the form
> "hostname / literal IP address". To retrieve a string representation
> of the hostname, or the string form of the address if it doesn't have
> a hostname, use InetSocketAddress::getHostString, rather than parsing
> the string representation.
>
> - Arvind
>
> From: Arvind Shyamsundar <arvin...@microsoft.com>
> Sent: Wednesday, May 6, 2020 1:09 AM
> To: dev@fluo.apache.org
> Subject: Re: [EXTERNAL] Re: JDK14 and Fluo
>
> Incidentally I did a brief test yesterday with Muchos and jdk 14. I
> saw similar issues with zk client (within Hadoop zkfc) being unable to
> connect to zk, complaining about an unresolved address. I did find a
> vaguely related documented change in jdk14 for ipv6 addresses but gave
> up after a while. I can send details, on Wednesday.
>
> ________________________________
> From: Christopher <ctubb...@apache.org>
> Sent: Wednesday, May 6, 2020 12:50:39 AM
> To: fluo-dev <dev@fluo.apache.org>
> Subject: [EXTERNAL] Re: JDK14 and Fluo
>
> So, a quick follow-up to this:
>
> The CMS flags seem to be ignored, so that's not really a problem:
>
> OpenJDK 64-Bit Server VM warning: Ignoring option UseConcMarkSweepGC; support was removed in 14.0
> OpenJDK 64-Bit Server VM warning: Ignoring option CMSInitiatingOccupancyFraction; support was removed in 14.0
>
> These are just warnings about the flags being ignored.
>
> I may be having other problems on my machine that prevent the tests
> from running. I'm not sure, but it seems that Accumulo's Initialize is
> failing to connect to the ZooKeeper server within the 30s timeout
> period. There do not appear to be any problems with ZooKeeper
> itself, and I can use telnet to send `ruok`. My machine might just be
> slow... I'll have to troubleshoot later.
>
> On Wed, May 6, 2020 at 2:52 AM Christopher <ctubb...@apache.org> wrote:
> >
> > I have switched my development environment to primarily run with
> > JDK14 and noticed some incompatibilities with Fluo.
> >
> > Some of these are now fixed in a PR:
> > https://github.com/apache/fluo/pull/1093
> >
> > However, this doesn't completely fix JDK14 builds. Specifically,
> > JDK14 removed the Concurrent Mark Sweep garbage collector. This
> > garbage collector is hard-coded in MiniAccumuloCluster 2.0.0 and earlier.
> > Since accumulo2-maven-plugin uses MiniAccumuloCluster, the tests
> > cannot run in Fluo on JDK14.
> >
> > This issue has already been addressed in upstream Accumulo
> > (https://github.com/apache/accumulo/pull/1427)
> > for Accumulo 2.1.0.
> >
> > My proposed resolution is that Fluo should use an updated version of
> > accumulo2-maven-plugin, after Accumulo 2.1.0 is released with the
> > MiniAccumuloCluster improvements. Of course, this requires Accumulo
> > 2.1.0 to be released first.
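
For anyone wanting to check the JDK 14 release note quoted in this thread on their own machine, here is a minimal sketch. It only shows how InetSocketAddress behaves for an unresolved address; it is not taken from ZooKeeper or Hadoop and makes no claim about how their clients handle addresses. The class name UnresolvedAddressDemo and the helper hostOf are made up for illustration, and "leader-3" and port 2181 are simply the values from the log above.

```java
import java.net.InetSocketAddress;

public class UnresolvedAddressDemo {

    // Hypothetical helper: returns the host part of an unresolved address
    // via getHostString(), as the release note recommends, instead of
    // parsing the toString() output.
    static String hostOf(String host, int port) {
        // createUnresolved performs no DNS lookup, so this runs offline.
        InetSocketAddress addr = InetSocketAddress.createUnresolved(host, port);
        return addr.getHostString();
    }

    public static void main(String[] args) {
        InetSocketAddress addr = InetSocketAddress.createUnresolved("leader-3", 2181);

        // The toString() format depends on the JDK version; per the release
        // note it includes the <unresolved> token starting with JDK 14.
        System.out.println("toString():      " + addr);

        // getHostString() does not depend on the toString() format.
        System.out.println("getHostString(): " + hostOf("leader-3", 2181));
        System.out.println("isUnresolved():  " + addr.isUnresolved());
    }
}
```

Running this under JDK 11 and then JDK 14 should show the toString() line differing while getHostString() stays the same. Note that the <unresolved> token in a log line only reflects the new string format; whether the address actually failed to resolve is a separate question, as the hostname-resolver follow-up in this thread suggests.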