Chun Chang created DRILL-4113: --------------------------------- Summary: memory leak reported while handling query or shutting down Key: DRILL-4113 URL: https://issues.apache.org/jira/browse/DRILL-4113 Project: Apache Drill Issue Type: Bug Affects Versions: 1.3.0, 1.4.0 Reporter: Chun Chang Priority: Critical
With impersonation enabled, I've seen two memory leaks. One reported at query time, one at shutdown. At query time: {noformat} 2015-11-17 19:11:03,595 [29b413b7-958e-c1f3-9d37-c34f96e7bf6a:foreman] INFO o.a.drill.exec.work.foreman.Foreman - Query text for query id 29b413b7-958e-c1f3-9d37-c34f96e7bf6a: use `dfs.window_functions` 2015-11-17 19:11:03,666 [29b413b7-edbc-9722-120d-66ab3611f250:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 29b413b7-edbc-9722-120d-66ab3611f250:0:0: State change requested AWAITING_ALLOCATION --> RUNNING 2015-11-17 19:11:03,666 [29b413b7-edbc-9722-120d-66ab3611f250:frag:0:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 29b413b7-edbc-9722-120d-66ab3611f250:0:0: State to report: RUNNING 2015-11-17 19:11:03,669 [29b413b7-edbc-9722-120d-66ab3611f250:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 29b413b7-edbc-9722-120d-66ab3611f250:0:0: State change requested RUNNING --> FAILED 2015-11-17 19:11:03,669 [29b413b7-edbc-9722-120d-66ab3611f250:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 29b413b7-edbc-9722-120d-66ab3611f250:0:0: State change requested FAILED --> FAILED 2015-11-17 19:11:03,669 [29b413b7-edbc-9722-120d-66ab3611f250:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 29b413b7-edbc-9722-120d-66ab3611f250:0:0: State change requested FAILED --> FINISHED 2015-11-17 19:11:03,674 [29b413b7-edbc-9722-120d-66ab3611f250:frag:0:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: Failure while closing accountor. Expected private and shared pools to be set to initial values. However, one or more were not. Stats are zone init allocated delta private 1000000 738112 261888 shared 9999000000 9999261888 -261888. Fragment 0:0 [Error Id: 6df67be9-69d4-4a3b-9eae-43ab2404c6d3 on drillats1.qa.lab:31010] org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Failure while closing accountor. Expected private and shared pools to be set to initial values. However, one or more were not. Stats are zone init allocated delta private 1000000 738112 261888 shared 9999000000 9999261888 -261888. Fragment 0:0 [Error Id: 6df67be9-69d4-4a3b-9eae-43ab2404c6d3 on drillats1.qa.lab:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) ~[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:321) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:184) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:290) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45] Caused by: java.lang.IllegalStateException: Failure while closing accountor. Expected private and shared pools to be set to initial values. However, one or more were not. Stats are zone init allocated delta private 1000000 738112 261888 shared 9999000000 9999261888 -261888. at org.apache.drill.exec.memory.AtomicRemainder.close(AtomicRemainder.java:199) ~[drill-memory-impl-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.memory.AccountorImpl.close(AccountorImpl.java:365) ~[drill-memory-impl-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.close(TopLevelAllocator.java:326) ~[drill-memory-impl-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:123) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:437) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.ops.FragmentContext.close(FragmentContext.java:426) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:346) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:179) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] ... 5 common frames omitted 2015-11-17 19:11:03,723 [CONTROL-rpc-event-queue] WARN o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED state as query is already at FAILED state (which is terminal). 2015-11-17 19:11:03,724 [CONTROL-rpc-event-queue] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel fragment. 29b413b7-edbc-9722-120d-66ab3611f250:0:0 does not exist. {noformat} At shutdown: {noformat} Tue Nov 17 22:01:14 PST 2015 Terminating drillbit pid 32653 2015-11-17 22:01:14,569 [Drillbit-ShutdownHook#0] INFO o.apache.drill.exec.server.Drillbit - Received shutdown request. 2015-11-17 22:01:15,559 [BitClient-2] INFO o.a.drill.exec.rpc.data.DataClient - Channel closed /10.10.100.171:56037 <--> /10.10.100.172:31012. 2015-11-17 22:01:15,559 [BitServer-6] INFO o.a.d.exec.rpc.control.ControlClient - Channel closed /10.10.100.171:47521 <--> /10.10.100.172:31011. 2015-11-17 22:01:15,574 [BitClient-4] INFO o.a.drill.exec.rpc.data.DataClient - Channel closed /10.10.100.171:59422 <--> /10.10.100.174:31012. 2015-11-17 22:01:15,575 [BitServer-2] INFO o.a.d.exec.rpc.control.ControlClient - Channel closed /10.10.100.171:55874 <--> /10.10.100.174:31011. 2015-11-17 22:01:15,582 [BitClient-3] INFO o.a.drill.exec.rpc.data.DataClient - Channel closed /10.10.100.171:49728 <--> /10.10.100.173:31012. 2015-11-17 22:01:15,582 [BitServer-5] INFO o.a.d.exec.rpc.control.ControlClient - Channel closed /10.10.100.171:43599 <--> /10.10.100.173:31011. 2015-11-17 22:01:15,603 [BitClient-1] INFO o.a.drill.exec.rpc.data.DataClient - Channel closed /10.10.100.171:56089 <--> /10.10.100.171:31012. 2015-11-17 22:01:15,603 [BitServer-3] INFO o.a.d.exec.rpc.control.ControlClient - Channel closed /10.10.100.171:52294 <--> /10.10.100.171:31011. 2015-11-17 22:01:16,610 [pool-1274-thread-1] INFO o.a.drill.exec.rpc.user.UserServer - closed eventLoopGroup io.netty.channel.epoll.EpollEventLoopGroup@44d08482 in 1005 ms 2015-11-17 22:01:16,610 [pool-1274-thread-2] INFO o.a.drill.exec.rpc.data.DataServer - closed eventLoopGroup io.netty.channel.epoll.EpollEventLoopGroup@1a10cc3c in 1007 ms 2015-11-17 22:01:16,610 [pool-1274-thread-1] INFO o.a.drill.exec.service.ServiceEngine - closed userServer in 1009 ms 2015-11-17 22:01:16,610 [pool-1274-thread-2] INFO o.a.drill.exec.service.ServiceEngine - closed dataPool in 1009 ms 2015-11-17 22:01:16,629 [Drillbit-ShutdownHook#0] WARN o.apache.drill.exec.server.Drillbit - Failure on close(): {} java.lang.RuntimeException: Exception while closing at org.apache.drill.common.DrillAutoCloseables.closeNoChecked(DrillAutoCloseables.java:46) ~[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:127) ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:44) ~[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:254) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.server.Drillbit$ShutdownThread.run(Drillbit.java:297) [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] Caused by: java.lang.IllegalStateException: Failure while closing accountor. Expected private and shared pools to be set to initial values. However, one or more were not. Stats are zone init allocated delta private 0 0 0 shared 34359738368 34358000256 1738112. at org.apache.drill.exec.memory.AtomicRemainder.close(AtomicRemainder.java:199) ~[drill-memory-impl-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.memory.AccountorImpl.close(AccountorImpl.java:365) ~[drill-memory-impl-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:179) ~[drill-memory-impl-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] at org.apache.drill.common.DrillAutoCloseables.closeNoChecked(DrillAutoCloseables.java:44) ~[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] ... 4 common frames omitted 2015-11-17 22:01:16,629 [Drillbit-ShutdownHook#0] INFO o.apache.drill.exec.server.Drillbit - Shutdown completed (2059 ms). {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)