[ https://issues.apache.org/jira/browse/DRILL-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15223037#comment-15223037 ]
Deneche A. Hakim commented on DRILL-3763: ----------------------------------------- A tentative fix was submitted along the fix for DRILL-3714 > Cancel (Ctrl-C) one of concurrent queries results in ChannelClosedException > --------------------------------------------------------------------------- > > Key: DRILL-3763 > URL: https://issues.apache.org/jira/browse/DRILL-3763 > Project: Apache Drill > Issue Type: Bug > Components: Execution - RPC > Affects Versions: 1.2.0 > Environment: 4 node cluster CentOS > Reporter: Khurram Faraaz > Assignee: Deneche A. Hakim > Fix For: 1.7.0 > > > Canceling a query from a set of concurrent queries executing on the same > table, results in ChannelClosedException when one of the queries in Canceled > using Ctrl-C from sqlline prompt. > Steps to reproduce the problem, > 1. Start, service map-warden start on 4 nodes. > 2. Start eight sqlline sessions from eight different terminals, ./sqlline -u > "jdbc:drill:schema=dfs.tmp" > 3. Run the below query from the eight sqlline sessions > select * from `twoKeyJsn.json`; > 4. While the queries are being executed, Cancel one of the queries on sqlline > prompt, issue Ctrl-C on sqlline. > 5. you will note that after a few seconds/minutes, on one of the other > sqlline sessions there is a ChannelClosedException that is reported and that > query is reported as FAILED on the query profile on the Web UI. > {code} > 0: jdbc:drill:schema=dfs.tmp> select * from `twoKeyJsn.json`; > ... > | 1.31643767542E9 | h | > | 9.02780441562E8 | b | > | 6.46524413864E8 | l | > java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: > ChannelClosedException: Channel closed /10.10.100.201:31010 <--> > /10.10.100.202:58705. > Fragment 0:0 > [Error Id: bd765bd5-3921-4b58-89a7-cd87482f9088 on centos-01.qa.lab:31010] > at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73) > at > sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87) > at sqlline.TableOutputFormat.print(TableOutputFormat.java:118) > at sqlline.SqlLine.print(SqlLine.java:1583) > at sqlline.Commands.execute(Commands.java:852) > at sqlline.Commands.sql(Commands.java:751) > at sqlline.SqlLine.dispatch(SqlLine.java:738) > at sqlline.SqlLine.begin(SqlLine.java:612) > at sqlline.SqlLine.start(SqlLine.java:366) > at sqlline.SqlLine.main(SqlLine.java:259) > 0: jdbc:drill:schema=dfs.tmp> > {code} > Stack trace from drillbit.log > {code} > 2015-09-10 23:23:55,019 [UserServer-1] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 2a0df049-697d-f47b-86b5-1e2697946237:0:0: State change requested RUNNING --> > FAILED > 2015-09-10 23:24:05,319 [2a0df049-697d-f47b-86b5-1e2697946237:frag:0:0] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 2a0df049-697d-f47b-86b5-1e2697946237:0:0: State change requested FAILED --> > FINISHED > 2015-09-10 23:24:05,331 [2a0df049-697d-f47b-86b5-1e2697946237:frag:0:0] ERROR > o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: ChannelClosedException: > Channel closed /10.10.100.201:31010 <--> /10.10.100.202:58705. > Fragment 0:0 > [Error Id: bd765bd5-3921-4b58-89a7-cd87482f9088 on centos-01.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > ChannelClosedException: Channel closed /10.10.100.201:31010 <--> > /10.10.100.202:58705. > Fragment 0:0 > [Error Id: bd765bd5-3921-4b58-89a7-cd87482f9088 on centos-01.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524) > ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_45] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_45] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45] > Caused by: org.apache.drill.exec.rpc.ChannelClosedException: Channel closed > /10.10.100.201:31010 <--> /10.10.100.202:58705. > at > org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:167) > ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$ChannelClosedHandler.operationComplete(RpcBus.java:143) > ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) > ~[netty-common-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) > ~[netty-common-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) > ~[netty-common-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406) > ~[netty-common-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82) > ~[netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943) > ~[netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592) > ~[netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584) > ~[netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.closeOnRead(AbstractEpollStreamChannel.java:409) > ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at > io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:647) > ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at > io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollRdHupReady(AbstractEpollStreamChannel.java:573) > ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at > io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:315) > ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) > ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at > io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) > ~[netty-common-4.0.27.Final.jar:4.0.27.Final] > ... 1 common frames omitted > 2015-09-10 23:24:05,345 [BitServer-4] INFO > o.a.drill.exec.work.foreman.Foreman - State change requested. RUNNING --> > FAILED > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > ChannelClosedException: Channel closed /10.10.100.201:31010 <--> > /10.10.100.202:58705. > Fragment 0:0 > Error Id: bd765bd5-3921-4b58-89a7-cd87482f9088 on centos-01.qa.lab:31010] > at > org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:473) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.control.WorkEventBus.statusUpdate(WorkEventBus.java:71) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.work.batch.ControlMessageHandler.handle(ControlMessageHandler.java:79) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.control.ControlServer.handle(ControlServer.java:61) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.control.ControlServer.handle(ControlServer.java:38) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:233) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:205) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89) > [netty-codec-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.handler.timeout.ReadTimeoutHandler.channelRead(ReadTimeoutHandler.java:150) > [netty-handler-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) > [netty-codec-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242) > [netty-codec-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618) > [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at > io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) > [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) > [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at > io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) > [netty-common-4.0.27.Final.jar:4.0.27.Final] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45] > 2015-09-10 23:24:05,356 [BitServer-4] INFO > o.a.drill.exec.work.foreman.Foreman - foreman cleaning up. > 2015-09-10 23:24:05,364 [BitServer-4] INFO > o.a.drill.exec.work.foreman.Foreman - State change requested. FAILED --> > COMPLETED > 2015-09-10 23:24:05,364 [BitServer-4] WARN > o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED > state as query is already at FAILED state (which is terminal). > 2015-09-10 23:24:05,365 [BitServer-4] WARN o.a.d.e.w.b.ControlMessageHandler > - Dropping request to cancel fragment. > 2a0df049-697d-f47b-86b5-1e2697946237:0:0 does not exist. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)