[ https://issues.apache.org/jira/browse/DRILL-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368399#comment-16368399 ]
ASF GitHub Bot commented on DRILL-5902: --------------------------------------- Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1113#discussion_r168935308 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryStateProcessor.java --- @@ -125,20 +125,17 @@ public void cancel() { case PREPARING: case PLANNING: case ENQUEUED: - moveToState(QueryState.CANCELLATION_REQUESTED, null); - return; - case STARTING: case RUNNING: - addToEventQueue(QueryState.CANCELLATION_REQUESTED, null); - return; + moveToState(QueryState.CANCELLATION_REQUESTED, null); --- End diff -- @arina-ielchiieva Please update JIRA title. > Regression: Queries encounter random failure due to RPC connection timed out > ---------------------------------------------------------------------------- > > Key: DRILL-5902 > URL: https://issues.apache.org/jira/browse/DRILL-5902 > Project: Apache Drill > Issue Type: Bug > Components: Execution - RPC > Affects Versions: 1.11.0 > Reporter: Robert Hou > Assignee: Vlad Rozov > Priority: Critical > Attachments: 261230f7-e3b9-0cee-22d8-921cb56e3e12.sys.drill, > node196.drillbit.log > > > Multiple random failures (25) occurred with the latest > Functional-Baseline-88.193 run. Here is a sample query: > {noformat} > /root/drillAutomation/prasadns14/framework/resources/Functional/window_functions/multiple_partitions/q27.sql > -- Kitchen sink > -- Use all supported functions > select > rank() over W, > dense_rank() over W, > percent_rank() over W, > cume_dist() over W, > avg(c_integer + c_integer) over W, > sum(c_integer/100) over W, > count(*) over W, > min(c_integer) over W, > max(c_integer) over W, > row_number() over W > from > j7 > where > c_boolean is not null > window W as (partition by c_bigint, c_date, c_time, c_boolean order by > c_integer) > {noformat} > From the logs: > {noformat} > 2017-10-23 04:14:36,536 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler > - Dropping request for early fragment termination for path > 261230e8-d03e-9ca9-91bf-c1039deecde2:1:1 -> > 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. > 2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler > - Dropping request for early fragment termination for path > 261230e8-d03e-9ca9-91bf-c1039deecde2:1:5 -> > 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. > 2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler > - Dropping request for early fragment termination for path > 261230e8-d03e-9ca9-91bf-c1039deecde2:1:9 -> > 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. > 2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler > - Dropping request for early fragment termination for path > 261230e8-d03e-9ca9-91bf-c1039deecde2:1:13 -> > 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. > 2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler > - Dropping request for early fragment termination for path > 261230e8-d03e-9ca9-91bf-c1039deecde2:1:17 -> > 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. > 2017-10-23 04:14:36,538 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler > - Dropping request for early fragment termination for path > 261230e8-d03e-9ca9-91bf-c1039deecde2:1:21 -> > 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. > 2017-10-23 04:14:36,538 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler > - Dropping request for early fragment termination for path > 261230e8-d03e-9ca9-91bf-c1039deecde2:1:25 -> > 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. > {noformat} > {noformat} > 2017-10-23 04:14:53,941 [UserServer-1] INFO > o.a.drill.exec.rpc.user.UserServer - RPC connection /10.10.88.196:31010 <--> > /10.10.88.193:38281 (user server) timed out. Timeout was set to 30 seconds. > Closing connection. > 2017-10-23 04:14:53,952 [UserServer-1] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested RUNNING --> > FAILED > 2017-10-23 04:14:53,952 [261230f8-2698-15b2-952f-d4ade8d6b180:frag:0:0] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested FAILED --> > FINISHED > 2017-10-23 04:14:53,956 [UserServer-1] WARN > o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc > response. > java.lang.IllegalArgumentException: Self-suppression not permitted > at java.lang.Throwable.addSuppressed(Throwable.java:1043) > ~[na:1.7.0_45] > at > org.apache.drill.common.DeferredException.addException(DeferredException.java:88) > ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:97) > ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:413) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.access$700(FragmentExecutor.java:55) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor$ExecutorStateImpl.fail(FragmentExecutor.java:427) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.ops.FragmentContext.fail(FragmentContext.java:213) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.ops.FragmentContext$1.accept(FragmentContext.java:95) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.ops.FragmentContext$1.accept(FragmentContext.java:92) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.ops.StatusHandler.failed(StatusHandler.java:42) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RequestIdMap$RpcListener.setException(RequestIdMap.java:139) > ~[drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:76) > [drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RequestIdMap$SetExceptionProcedure.apply(RequestIdMap.java:66) > [drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > com.carrotsearch.hppc.IntObjectHashMap.forEach(IntObjectHashMap.java:692) > [hppc-0.7.1.jar:na] > at > org.apache.drill.exec.rpc.RequestIdMap.channelClosed(RequestIdMap.java:62) > [drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.AbstractRemoteConnection.channelClosed(AbstractRemoteConnection.java:192) > [drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.AbstractServerConnection.channelClosed(AbstractServerConnection.java:165) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:167) > [drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:144) > [drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507) > [netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500) > [netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479) > [netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420) > [netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104) > [netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:1003) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:632) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:610) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:553) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.DefaultChannelPipeline$HeadContext.close(DefaultChannelPipeline.java:1236) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:618) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:602) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.ChannelOutboundHandlerAdapter.close(ChannelOutboundHandlerAdapter.java:71) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:618) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:602) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:459) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.timeout.ReadTimeoutHandler.readTimedOut(ReadTimeoutHandler.java:99) > [netty-handler-4.0.48.Final.jar:4.0.48.Final] > at > org.apache.drill.exec.rpc.BasicServer$LoggingReadTimeoutHandler.readTimedOut(BasicServer.java:147) > [drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > io.netty.handler.timeout.ReadTimeoutHandler.channelIdle(ReadTimeoutHandler.java:90) > [netty-handler-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.timeout.IdleStateHandler$ReaderIdleTimeoutTask.run(IdleStateHandler.java:494) > [netty-handler-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.timeout.IdleStateHandler$AbstractIdleTask.run(IdleStateHandler.java:466) > [netty-handler-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38) > [netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120) > [netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399) > [netty-common-4.0.48.Final.jar:4.0.48.Final] > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463) > [netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131) > [netty-common-4.0.48.Final.jar:4.0.48.Final] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45] > Caused by: org.apache.drill.exec.rpc.ChannelClosedException: Channel closed > /10.10.88.196:31010 <--> /10.10.88.193:38343. > at > org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:164) > [drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > ... 29 common frames omitted > 2017-10-23 04:14:53,963 [UserServer-1] INFO > o.a.drill.exec.rpc.user.UserSession - Deleted session temporary location > [/tmp/bec73426-40b7-44b6-8819-7958634614e9] from file system [maprfs:///] > 2017-10-23 04:14:53,966 [261230f8-2698-15b2-952f-d4ade8d6b180:frag:0:0] ERROR > o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: ChannelClosedException: > Channel closed /10.10.88.196:31010 <--> /10.10.88.193:38343. > Fragment 0:0 > [Error Id: d3c59456-284f-4c35-988f-f052b7f6c4b0 on atsqa4-196.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > ChannelClosedException: Channel closed /10.10.88.196:31010 <--> > /10.10.88.193:38343. > Fragment 0:0 > [Error Id: d3c59456-284f-4c35-988f-f052b7f6c4b0 on atsqa4-196.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) > ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_45] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_45] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45] > Caused by: org.apache.drill.exec.rpc.ChannelClosedException: Channel closed > /10.10.88.196:31010 <--> /10.10.88.193:38343. > at > org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:164) > ~[drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:144) > ~[drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507) > ~[netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500) > ~[netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479) > ~[netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420) > ~[netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104) > ~[netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:1003) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:632) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:610) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:553) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.DefaultChannelPipeline$HeadContext.close(DefaultChannelPipeline.java:1236) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:618) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:602) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.ChannelOutboundHandlerAdapter.close(ChannelOutboundHandlerAdapter.java:71) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:618) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:602) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:459) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.timeout.ReadTimeoutHandler.readTimedOut(ReadTimeoutHandler.java:99) > ~[netty-handler-4.0.48.Final.jar:4.0.48.Final] > at > org.apache.drill.exec.rpc.BasicServer$LoggingReadTimeoutHandler.readTimedOut(BasicServer.java:147) > ~[drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > io.netty.handler.timeout.ReadTimeoutHandler.channelIdle(ReadTimeoutHandler.java:90) > ~[netty-handler-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.timeout.IdleStateHandler$ReaderIdleTimeoutTask.run(IdleStateHandler.java:494) > ~[netty-handler-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.handler.timeout.IdleStateHandler$AbstractIdleTask.run(IdleStateHandler.java:466) > ~[netty-handler-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38) > ~[netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120) > ~[netty-common-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399) > ~[netty-common-4.0.48.Final.jar:4.0.48.Final] > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463) > ~[netty-transport-4.0.48.Final.jar:4.0.48.Final] > at > io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131) > ~[netty-common-4.0.48.Final.jar:4.0.48.Final] > ... 1 common frames omitted > 2017-10-23 04:14:54,089 [UserServer-1] INFO > o.a.drill.exec.work.foreman.Foreman - Failure while trying communicate query > result to initiating client. This would happen if a client is disconnected > before response notice can be sent. > org.apache.drill.exec.rpc.RpcException: Failure sending message. > {noformat} > sys.version is: > 1.12.0-SNAPSHOT f1d1945b3772bb782039fd6811e34a7de66441c8 > DRILL-5582: C++ Client: [Threat Modeling] Drillbit may be spoofed by an > attacker and this may lead to data being written to the attacker's target > instead of Drillbit 19.10.2017 @ 17:13:05 PDT Unknown 19.10.2017 @ > 18:37:13 PDT -- This message was sent by Atlassian JIRA (v7.6.3#76005)