[ 
https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987880#comment-15987880
 ] 

Xuefu Zhang commented on HIVE-15997:
------------------------------------

Hi [~ychena]/[~ctang.ma], Thanks for looking into this. I'm trying to 
understand the patch, as I'm reviewing HIVE-16552, which refers to this JIRA. 
Specifically, I don't quite understand the following code addition:
{code}
       rj = jc.submitJob(job);
+
+      if (driverContext.isShutdown()) {
+        LOG.warn("Task was cancelled");
+        if (rj != null) {
+          rj.killJob();
+          rj = null;
+        }
+        return 5;
+      }
+
       this.jobID = rj.getJobID();
{code}
I understand we are checking whether the query has been cancelled right after 
submission. However, my question is whether this check is necessary or 
complete: the query can be cancelled immediately after this check runs, and 
that cancellation will not be captured here. If such a cancellation is handled 
later in another code path, then the check here seems unnecessary; otherwise, 
the check does not cover all scenarios.
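To make the race concrete, here is a minimal sketch (with hypothetical names; `shutdown`, `submitAndCheck`, and `jobKilled` stand in for the real `DriverContext`/`RunningJob` machinery) of why a single post-submit check can miss a cancellation that arrives just after it:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CancelRace {
    // Stands in for driverContext.isShutdown(); flipped by the cancelling thread.
    static final AtomicBoolean shutdown = new AtomicBoolean(false);
    // Stands in for rj.killJob() having been called.
    static boolean jobKilled = false;

    static int submitAndCheck() {
        // rj = jc.submitJob(job);  -- the job is now running on the cluster
        if (shutdown.get()) {       // the patch's one-time check
            jobKilled = true;       // rj.killJob()
            return 5;
        }
        // If shutdown flips to true at this point, this method never kills
        // the job; some later code path must observe the flag and kill it.
        return 0;
    }

    public static void main(String[] args) {
        int rc = submitAndCheck();   // check passes: not yet cancelled
        shutdown.set(true);          // cancellation arrives just after the check
        // rc == 0 and jobKilled == false: the submitted job is left running
        System.out.println(rc + " " + jobKilled);
    }
}
```

This is exactly the window the comment asks about: the check narrows the race but cannot close it on its own, so it only helps if a later path (e.g. the monitoring loop) also reacts to cancellation.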

Did I miss anything? Thanks.

> Resource leaks when query is cancelled 
> ---------------------------------------
>
>                 Key: HIVE-15997
>                 URL: https://issues.apache.org/jira/browse/HIVE-15997
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Yongzhi Chen
>            Assignee: Yongzhi Chen
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15997.1.patch
>
>
> There may be some resource leaks when a query is cancelled.
> We see the following stacks in the log:
> Possible file and folder leak:
> {noformat}
> 2017-02-02 06:23:25,410 WARN hive.ql.Context: [HiveServer2-Background-Pool: Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: "ychencdh511t-1.vpc.cloudera.com":8020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> at org.apache.hadoop.ipc.Client.call(Client.java:1476)
> at org.apache.hadoop.ipc.Client.call(Client.java:1409)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy25.delete(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy26.delete(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059)
> at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
> at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671)
> at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405)
> at org.apache.hadoop.hive.ql.Context.clear(Context.java:541)
> at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109)
> at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
> at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
> at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.nio.channels.ClosedByInterruptException
> at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681)
> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
> at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:714)
> at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
> at org.apache.hadoop.ipc.Client.call(Client.java:1448)
> ... 35 more
> 2017-02-02 12:26:52,706 INFO org.apache.hive.service.cli.operation.OperationManager: [HiveServer2-Background-Pool: Thread-23]: Operation is timed out,operation=OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=2af82100-94cf-4f26-abaa-c4b57c57b23c],state=CANCELED
> {noformat}
> Possible lock leak:
> {noformat}
> 2017-02-02 06:21:05,054 ERROR ZooKeeperHiveLockManager: [HiveServer2-Background-Pool: Thread-61]: Failed to release ZooKeeper lock: java.lang.InterruptedException
>       at java.lang.Object.wait(Native Method)
>       at java.lang.Object.wait(Object.java:503)
>       at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
>       at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:871)
>       at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:238)
>       at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:233)
>       at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
>       at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:230)
>       at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:214)
>       at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:41)
>       at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlockPrimitive(ZooKeeperHiveLockManager.java:488)
>       at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlockWithRetry(ZooKeeperHiveLockManager.java:466)
>       at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlock(ZooKeeperHiveLockManager.java:454)
>       at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.releaseLocks(ZooKeeperHiveLockManager.java:236)
>       at org.apache.hadoop.hive.ql.Driver.releaseLocksAndCommitOrRollback(Driver.java:1175)
>       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1432)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
>       at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
>       at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
>       at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
>       at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> {noformat}
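Both traces show cleanup code (scratch-dir removal over HDFS, ZooKeeper lock release) being aborted because the thread's interrupt status was still set when cancellation kicked in. A common defensive pattern for this situation, shown below as a sketch with hypothetical names (this is not the actual HIVE-15997 patch), is to save and clear the interrupt flag before blocking cleanup calls and restore it afterwards:

```java
public class CleanupSketch {
    static boolean released = false;

    // Simulated blocking release that fails on an interrupted thread,
    // the way ZooKeeper.delete() throws InterruptedException.
    static void releaseLock() throws InterruptedException {
        if (Thread.interrupted()) {   // note: this call also clears the flag
            throw new InterruptedException();
        }
        released = true;
    }

    static void cleanupPreservingInterrupt() {
        boolean wasInterrupted = Thread.interrupted(); // save + clear the flag
        try {
            releaseLock();            // now runs without being aborted
        } catch (InterruptedException e) {
            // here the lock would leak; we'd only be able to log it
        } finally {
            if (wasInterrupted) {
                Thread.currentThread().interrupt();    // restore the flag
            }
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();  // the query was just cancelled
        cleanupPreservingInterrupt();
        // prints "true true": the lock is released and the interrupt
        // status is preserved for callers further up the stack
        System.out.println(released + " "
                + Thread.currentThread().isInterrupted());
    }
}
```

Without the save/clear/restore dance, `releaseLock()` would throw immediately and the resource would leak, which is exactly what the ERROR trace above records.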



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
