[
https://issues.apache.org/jira/browse/HIVE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283310#comment-14283310
]
Chengxiang Li commented on HIVE-9395:
-------------------------------------
That's a good question. Hive submits the Spark job asynchronously and monitors the
job status with SparkJobMonitor. All kinds of errors may happen before the job gets
executed on the Spark cluster, so we need to add a timeout in SparkJobMonitor to
make sure it does not hang when it can never get the job state. This is quite
important for our unit tests, as once SparkJobMonitor hangs, it blocks all the
following tests.
When should we time out if we cannot get the job state: after 30s, or 60s? Should
it be configurable by the user?
My opinion is to make it configurable by the user, as the user may know more about
the real cluster, which helps them decide whether it's normal for SparkJobMonitor
to be unable to get the job state within a certain time.
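For illustration, here is a minimal sketch of the kind of timeout check being discussed: a monitor loop that polls the job state and gives up after a user-configurable number of seconds instead of waiting forever. The class, helper types, and return codes below are assumptions for this sketch, not the actual SparkJobMonitor code.
{code:java}
// Sketch only: shows a monitor loop that aborts after a configurable timeout
// when the job state cannot be determined. Names and types are illustrative.
public class JobStateMonitorSketch {

  /** Possible states reported by the asynchronously submitted Spark job. */
  enum JobState { UNKNOWN, QUEUED, RUNNING, SUCCEEDED, FAILED }

  /** Abstraction over the remote Spark job handle (assumed for this sketch). */
  interface JobHandle {
    JobState getState();
  }

  private final JobHandle handle;
  private final long monitorTimeoutSeconds; // would come from user configuration

  JobStateMonitorSketch(JobHandle handle, long monitorTimeoutSeconds) {
    this.handle = handle;
    this.monitorTimeoutSeconds = monitorTimeoutSeconds;
  }

  /** Returns 0 on success, non-zero on failure or timeout. */
  int startMonitor() throws InterruptedException {
    long start = System.currentTimeMillis();
    while (true) {
      JobState state = handle.getState();
      if (state == JobState.UNKNOWN || state == JobState.QUEUED) {
        // The job has not started executing yet; give up after the configured
        // timeout so the monitor (and any test relying on it) cannot hang.
        long elapsedSeconds = (System.currentTimeMillis() - start) / 1000;
        if (elapsedSeconds > monitorTimeoutSeconds) {
          System.err.println("Could not get job state after " + elapsedSeconds
              + "s. Aborting monitor.");
          return 2;
        }
      } else if (state == JobState.SUCCEEDED) {
        return 0;
      } else if (state == JobState.FAILED) {
        return 3;
      }
      Thread.sleep(1000); // poll interval
    }
  }
}
{code}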
> Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout at the SparkJobMonitor
> level. [Spark Branch]
> --------------------------------------------------------------------------------------------------
>
> Key: HIVE-9395
> URL: https://issues.apache.org/jira/browse/HIVE-9395
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Chengxiang Li
> Assignee: Chengxiang Li
> Labels: Spark-M5
> Attachments: HIVE-9395.1-spark.patch
>
>
> SparkJobMonitor may hang if the job state returns null indefinitely; we should
> move the timeout check here to avoid that.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)