[ https://issues.apache.org/jira/browse/SPARK-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087102#comment-14087102 ]
Chengxiang Li commented on SPARK-2636:
--------------------------------------

{quote}
There are two ways I think. One is for DAGScheduler.runJob to return an integer (or long) id for the job. An alternative, which I think is better and relates to SPARK-2321, is for runJob to return some Job object that has information about the id and can be queried about progress.
{quote}

DAGScheduler is a Spark-internal class, so users can hardly use it directly. I like your second idea: return a job info object when submitting a Spark job at the SparkContext (JavaSparkContext in this case) or RDD level. Actually, AsyncRDDActions has already done part of this work, so it may be a good place to fix this issue; a sketch of what such an API could look like to a caller appears after the issue description below.

> nowhere to get a job identifier when submitting a Spark job through the Spark API
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-2636
>                 URL: https://issues.apache.org/jira/browse/SPARK-2636
>             Project: Spark
>          Issue Type: New Feature
>            Reporter: Chengxiang Li
>
> In Hive on Spark, we want to track Spark job status through the Spark API. The basic idea is as follows (a sketch of this listener appears below):
> # Create a Hive-specific Spark listener and register it on the Spark listener bus.
> # The Hive-specific Spark listener generates job status from Spark listener events.
> # The Hive driver tracks job status through the Hive-specific Spark listener.
> The current problem is that the Hive driver needs a job identifier to track the status of a specific job through the Spark listener, but there is no Spark API that returns a job identifier (like a job id) when submitting a Spark job. I think any other project that tries to track job status through the Spark API would suffer from this as well.
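To make the listener-based tracking from the issue description concrete, here is a minimal Scala sketch of steps 1-3. JobStatusListener and its status strings are illustrative names, not actual Hive code; the callbacks and event classes are the existing org.apache.spark.scheduler listener API.

{code:scala}
import scala.collection.concurrent.TrieMap

import org.apache.spark.scheduler._

// Illustrative listener: folds listener-bus events into a
// job-id -> status map that a driver (e.g. the Hive driver) can poll.
class JobStatusListener extends SparkListener {
  private val jobStatus = TrieMap.empty[Int, String]

  override def onJobStart(jobStart: SparkListenerJobStart): Unit =
    jobStatus.put(jobStart.jobId, "RUNNING")

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit =
    jobStatus.put(jobEnd.jobId, jobEnd.jobResult match {
      case JobSucceeded => "SUCCEEDED"
      case _            => "FAILED"
    })

  // Polled by the driver -- but only useful if the driver knows which
  // job id its submission produced, which is exactly the identifier this
  // issue says the submit API does not return.
  def statusOf(jobId: Int): Option[String] = jobStatus.get(jobId)
}
{code}

The listener would be registered with sc.addSparkListener(new JobStatusListener()); the gap is that nothing ties a submitted action back to the jobId keys in this map.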
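And for the comment above, a minimal sketch, assuming a local master, of how a job info object returned from AsyncRDDActions could look to a caller. countAsync() and FutureAction are existing Spark API; the jobIds accessor is the proposed addition and does not exist yet.

{code:scala}
import scala.concurrent.Await
import scala.concurrent.duration.Duration

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._  // rddToAsyncRDDActions implicit

object AsyncJobIdSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("job-id-sketch").setMaster("local[2]"))
    val rdd = sc.parallelize(1 to 1000, 4)

    // AsyncRDDActions submits the job and returns a FutureAction
    // immediately instead of blocking on the result.
    val action = rdd.countAsync()

    // Proposed (hypothetical) accessor from this issue: the id(s) of the
    // job(s) submitted for this action, to correlate with listener events.
    // val ids: Seq[Int] = action.jobIds

    val count = Await.result(action, Duration.Inf)
    println(s"count = $count")
    sc.stop()
  }
}
{code}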