[ https://issues.apache.org/jira/browse/HIVE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15860967#comment-15860967 ]
Xuefu Zhang commented on HIVE-15860:
------------------------------------

Hi [~lirui], thanks for working on this. Just to clarify, does the monitor loop forever in this case? It seems that it does, even though the broken connection is already detected at the RPC layer. As a result, the user session will hang forever without making any progress.

> RemoteSparkJobMonitor may hang when RemoteDriver exits abnormally
> -----------------------------------------------------------------
>
>                 Key: HIVE-15860
>                 URL: https://issues.apache.org/jira/browse/HIVE-15860
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-15860.1.patch
>
>
> It happens when RemoteDriver crashes between {{JobStarted}} and
> {{JobSubmitted}}, e.g. killed by {{kill -9}}. RemoteSparkJobMonitor will
> consider the job has started, however it can't get the job info because it
> hasn't received the JobId. Then the monitor will loop forever.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
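
The hang described above can be illustrated with a small Java sketch. This is not the code from HIVE-15860.1.patch and not Hive's actual RemoteSparkJobMonitor; the names MonitorSketch, monitor, jobId, rpcAlive and jobDone are hypothetical stand-ins. It assumes the monitor can ask the RPC layer whether the connection to the driver is still open, and it shows a polling loop that gives up when the JobId has not arrived and the connection is gone, instead of polling forever.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical sketch only; none of these names come from Hive.
final class MonitorSketch {

    public enum Result { SUCCEEDED, DRIVER_GONE, TIMED_OUT }

    /**
     * Polls a job that is known to have started.
     *
     * @param jobId    returns the job id, or null while it has not been received
     * @param rpcAlive returns false once the RPC layer has seen the connection break
     * @param jobDone  returns true once the job has finished
     * @param maxPolls upper bound on iterations, so the loop can never spin forever
     */
    public static Result monitor(Supplier<Integer> jobId,
                                 BooleanSupplier rpcAlive,
                                 BooleanSupplier jobDone,
                                 int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            Integer id = jobId.get();
            if (id != null && jobDone.getAsBoolean()) {
                return Result.SUCCEEDED;
            }
            // Without this check, a driver killed before it sent the job id
            // would leave the loop waiting for an id that never arrives.
            if (!rpcAlive.getAsBoolean()) {
                return Result.DRIVER_GONE;
            }
            // A real monitor would sleep between polls; omitted in this sketch.
        }
        return Result.TIMED_OUT;
    }
}
```

In this sketch, a driver that dies before sending the job id produces DRIVER_GONE on the first poll, and the maxPolls bound acts as a second safety net if the disconnect is never reported.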