[
https://issues.apache.org/jira/browse/HIVE-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-8972:
-------------------------
Attachment: HIVE-8972.4-spark.patch
The latest patch consists only of minor fixes and cleanup.
I talked about this with [~chengxiang li]. Here's our thought on this task:
Currently we set a timeout for the {{JobSubmitted}} event and assume the job is
always submitted via the async API and will send back its Spark job ID (i.e. by
calling monitorJob). If we add, say, {{JobStarted}} and set a timeout for that,
we'd be assuming all failures after that point can be properly captured and sent
back to the client. So one way or another, we have to make assumptions. Since
the timeout for {{JobSubmitted}} serves us well at the moment, maybe we should
leave it as is.
A possible improvement may be to differentiate the two kinds of jobs we have:
Hive query jobs and other jobs (e.g. addFile, getJobInfo). The former are
guaranteed to send back a Spark job ID for monitoring, so we can set a timeout
for that, while the latter should finish within constant time, so we can set a
timeout when calling Future.get.
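The "constant time" case can be sketched with plain java.util.concurrent primitives. This is only an illustration of the proposed timeout-on-Future.get pattern; the names here ({{runWithTimeout}}, the sample callables) are hypothetical and not taken from the Hive code base:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ClientTimeoutSketch {

  // Run a short-lived job with a hard timeout on Future.get.
  // Returns the result, or null if the job did not finish in time.
  static String runWithTimeout(Callable<String> job, long seconds) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<String> future = executor.submit(job);
    try {
      return future.get(seconds, TimeUnit.SECONDS);
    } catch (TimeoutException e) {
      future.cancel(true); // treat the stalled job as failed
      return null;
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    // A quick addFile-style job completes well within the timeout.
    System.out.println(runWithTimeout(() -> "done", 5));
  }
}
```

With this split, query jobs would keep the existing {{JobSubmitted}} timeout, while short administrative calls fail fast when the client blocks too long on their futures.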
cc [~xuefuz] [~vanzin]
> Implement more fine-grained remote client-level events [Spark Branch]
> ---------------------------------------------------------------------
>
> Key: HIVE-8972
> URL: https://issues.apache.org/jira/browse/HIVE-8972
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Rui Li
> Assignee: Rui Li
> Attachments: HIVE-8972.1-spark.patch, HIVE-8972.2-spark.patch,
> HIVE-8972.3-spark.patch, HIVE-8972.3-spark.patch, HIVE-8972.4-spark.patch
>
>
> Follow up task of HIVE-8956.
> Fine-grained events are useful for better job monitoring and failure handling.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)