I am trying to set up a Drill cluster on three VM nodes running CentOS 7. I have successfully installed a small ZooKeeper ensemble on the three nodes and then installed Drill as per the documentation. I am having trouble connecting to the other drillbits when running drill-conf. The issue is intermittent: I can eventually connect if I run drill-conf repeatedly. This is the error:
Error: Failure in connecting to Drill: org.apache.drill.exec.rpc.RpcException: CONNECTION : java.nio.channels.UnresolvedAddressException (state=,code=0)
java.sql.SQLException: Failure in connecting to Drill: org.apache.drill.exec.rpc.RpcException: CONNECTION : java.nio.channels.UnresolvedAddressException
    at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:159)
    at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:64)
    at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:69)
    at net.hydromatic.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:126)
    at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
    at sqlline.DatabaseConnection.connect(DatabaseConnection.java:167)
    at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:213)
    at sqlline.Commands.connect(Commands.java:1083)
    at sqlline.Commands.connect(Commands.java:1015)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
    at sqlline.SqlLine.dispatch(SqlLine.java:742)
    at sqlline.SqlLine.initArgs(SqlLine.java:528)
    at sqlline.SqlLine.begin(SqlLine.java:596)
    at sqlline.SqlLine.start(SqlLine.java:375)
    at sqlline.SqlLine.main(SqlLine.java:268)
Caused by: org.apache.drill.exec.rpc.RpcException: CONNECTION : java.nio.channels.UnresolvedAddressException
    at org.apache.drill.exec.client.DrillClient$FutureHandler.connectionFailed(DrillClient.java:448)
    at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:237)
    at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:200)
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567)
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
    at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.connect(AbstractEpollStreamChannel.java:482)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1089)
    at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:543)
    at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:528)
    at io.netty.channel.ChannelOutboundHandlerAdapter.connect(ChannelOutboundHandlerAdapter.java:47)
    at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:543)
    at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:528)
    at io.netty.channel.ChannelDuplexHandler.connect(ChannelDuplexHandler.java:50)
    at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:543)
    at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:528)
    at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:510)
    at io.netty.channel.DefaultChannelPipeline.connect(DefaultChannelPipeline.java:909)
    at io.netty.channel.AbstractChannel.connect(AbstractChannel.java:203)
    at io.netty.bootstrap.Bootstrap$2.run(Bootstrap.java:165)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.nio.channels.UnresolvedAddressException
    at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:47)
    at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:213)
    ... 22 more
Caused by: java.nio.channels.UnresolvedAddressException
    at io.netty.channel.epoll.AbstractEpollChannel.checkResolvable(AbstractEpollChannel.java:221)
    at io.netty.channel.epoll.EpollSocketChannel.doConnect(EpollSocketChannel.java:183)
    at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.connect(AbstractEpollStreamChannel.java:445)
    ... 17 more
apache drill 1.4.0
"the only truly happy people are children, the creative minority and drill users"
0: jdbc:drill:>

As I said, I can eventually connect, but only after running drill-conf a few times.

In addition to this failure, I am unable to run broadcast joins when I point my Drill cluster to an external HDH datasource. This failure happens 100% of the time:

Error: CONNECTION ERROR: Error setting up remote intermediate fragment execution
Nodes with failures dev-drill-node3, dev-drill-node2
[Error Id: 13a566d2-b4c0-4be9-8f04-da77c4dfef23 on dev-drill-node1:31010] (state=,code=0)

Grepping the logs for that error ID shows this message:

2016-01-20 15:17:48,048 [BitServer-1] ERROR o.a.d.exec.work.foreman.QueryManager - Failure while attempting to CANCEL fragment query_id { part1: 2981405579761470863 part2: 8997799343876536749 } major_fragment_id: 1 minor_fragment_id: 1 on endpoint address: "dev-drill-node3" user_port: 31010 control_port: 31011 data_port: 31012 with org.apache.drill.exec.rpc.RpcException: Command failed while establishing connection. Failure type CONNECTION..
2016-01-20 15:17:48,048 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.fragment.FragmentExecutor - 29601494-210c-cd8f-7cde-9ad4b400f9ad:0:0: State change requested AWAITING_ALLOCATION --> CANCELLATION_REQUESTED
2016-01-20 15:17:48,048 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.f.FragmentStatusReporter - 29601494-210c-cd8f-7cde-9ad4b400f9ad:0:0: State to report: CANCELLATION_REQUESTED
2016-01-20 15:17:48,048 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.fragment.FragmentExecutor - 29601494-210c-cd8f-7cde-9ad4b400f9ad:0:0: State change requested CANCELLATION_REQUESTED --> FINISHED
2016-01-20 15:17:48,048 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.f.FragmentStatusReporter - 29601494-210c-cd8f-7cde-9ad4b400f9ad:0:0: State to report: CANCELLED
2016-01-20 15:17:48,049 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.fragment.FragmentExecutor - 29601494-210c-cd8f-7cde-9ad4b400f9ad:2:0: State change requested AWAITING_ALLOCATION --> CANCELLATION_REQUESTED
2016-01-20 15:17:48,049 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.f.FragmentStatusReporter - 29601494-210c-cd8f-7cde-9ad4b400f9ad:2:0: State to report: CANCELLATION_REQUESTED
2016-01-20 15:17:48,049 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.fragment.FragmentExecutor - 29601494-210c-cd8f-7cde-9ad4b400f9ad:2:0: State change requested CANCELLATION_REQUESTED --> FINISHED
2016-01-20 15:17:48,049 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.f.FragmentStatusReporter - 29601494-210c-cd8f-7cde-9ad4b400f9ad:2:0: State to report: CANCELLED
2016-01-20 15:17:48,050 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.fragment.FragmentExecutor - 29601494-210c-cd8f-7cde-9ad4b400f9ad:1:0: State change requested AWAITING_ALLOCATION --> CANCELLATION_REQUESTED
2016-01-20 15:17:48,050 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.f.FragmentStatusReporter - 29601494-210c-cd8f-7cde-9ad4b400f9ad:1:0: State to report: CANCELLATION_REQUESTED
2016-01-20 15:17:48,050 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.fragment.FragmentExecutor - 29601494-210c-cd8f-7cde-9ad4b400f9ad:1:0: State change requested CANCELLATION_REQUESTED --> FINISHED
2016-01-20 15:17:48,050 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.f.FragmentStatusReporter - 29601494-210c-cd8f-7cde-9ad4b400f9ad:1:0: State to report: CANCELLED
2016-01-20 15:17:48,054 [BitServer-1] ERROR o.apache.drill.exec.rpc.BasicClient - Failed to establish connection
java.util.concurrent.ExecutionException: java.nio.channels.UnresolvedAddressException

I have spent the last week and a half pulling my hair out. I've combed through the documentation but cannot find any information about what may be causing this. At this point I am in desperate need of help with this problem. Any insight into what may be causing it would be met with extreme gratitude. If I haven't provided enough information, please contact me as soon as possible.

Thank you.
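
For reference, java.nio.channels.UnresolvedAddressException is the exception Java raises when a network operation is attempted on a socket address whose hostname was never resolved to an IP. One thing that can be checked is whether each node (and the client machine) can resolve all three drillbit hostnames. Below is a minimal sketch of such a check. The node names come from the error output above (dev-drill-node1 is assumed to be the first node); the function name and the injectable resolver parameter are only illustrative, and this uses nothing from Drill itself.

```python
import socket

# Hostnames as they appear in the error output above; adjust to your cluster.
DRILLBIT_HOSTS = ["dev-drill-node1", "dev-drill-node2", "dev-drill-node3"]


def resolve_hosts(hosts, resolver=socket.gethostbyname):
    """Return {hostname: ip or None}; None marks a name that failed to resolve.

    `resolver` is a hypothetical hook so the lookup can be swapped out in tests;
    by default it uses the system resolver (/etc/hosts, DNS).
    """
    results = {}
    for host in hosts:
        try:
            results[host] = resolver(host)
        except OSError:  # socket.gaierror and socket.herror are OSError subclasses
            results[host] = None
    return results
```

Calling resolve_hosts(DRILLBIT_HOSTS) on every node and on the machine running drill-conf would show which names come back as None from where. If only the local node's name resolves, that might explain why the connection succeeds only some of the time, but this is a guess and not a confirmed diagnosis.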