[jira] [Commented] (SPARK-2636) no where to get job identifier while submit spark job through spark API

2014-08-28 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113400#comment-14113400
 ] 

Apache Spark commented on SPARK-2636:
-------------------------------------

User 'lirui-intel' has created a pull request for this issue:
https://github.com/apache/spark/pull/2176

 no where to get job identifier while submit spark job through spark API
 -------------------------------------------------------------------------

                 Key: SPARK-2636
                 URL: https://issues.apache.org/jira/browse/SPARK-2636
             Project: Spark
          Issue Type: New Feature
          Components: Java API
            Reporter: Chengxiang Li
              Labels: hive

 In Hive on Spark, we want to track Spark job status through the Spark API. The basic idea is as follows:
 # Create a Hive-specific Spark listener and register it on the Spark listener bus.
 # The Hive-specific listener derives job status from Spark listener events.
 # The Hive driver tracks job status through the Hive-specific listener (a sketch follows below).
 The current problem is that the Hive driver needs a job identifier to track a specific job's status through the listener, but there is no Spark API that returns a job identifier (such as a job id) when a Spark job is submitted.
 I think any other project that tries to track job status through the Spark API would suffer from this as well.
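
For illustration, a minimal sketch of the listener side of this design against Spark 1.x's {{SparkListener}} API ({{HiveJobStateListener}} and its status map are hypothetical names, not actual Hive code):

{code:scala}
import scala.collection.concurrent.TrieMap

import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd, SparkListenerJobStart}

// Hypothetical Hive-side listener: registered via sc.addSparkListener(...),
// it keeps per-job status keyed by Spark's integer job id.
class HiveJobStateListener extends SparkListener {
  private val jobStates = TrieMap.empty[Int, String]

  override def onJobStart(jobStart: SparkListenerJobStart): Unit =
    jobStates(jobStart.jobId) = "RUNNING"

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit =
    jobStates(jobEnd.jobId) = jobEnd.jobResult.toString

  // The Hive driver would poll this -- but it needs the job id back from
  // job submission to know which entry to read, which is the missing piece.
  def status(jobId: Int): Option[String] = jobStates.get(jobId)
}
{code}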




[jira] [Commented] (SPARK-2636) no where to get job identifier while submit spark job through spark API

2014-08-25 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110172#comment-14110172
 ] 

Rui Li commented on SPARK-2636:
-------------------------------------

Just want to make sure I understand everything correctly:

I think the user submits a job via an RDD action, which in turn calls 
{{SparkContext.runJob -> DAGScheduler.runJob -> DAGScheduler.submitJob -> 
DAGScheduler.handleJobSubmitted}}. The requirement is that we should return 
some job ID to the user, so I think putting that in a {{DAGScheduler}} method 
doesn't help? BTW, {{DAGScheduler.submitJob}} returns a {{JobWaiter}} which 
contains the job ID.

Also, by job ID, do we mean {{org.apache.spark.streaming.scheduler.Job.id}} 
or {{org.apache.spark.scheduler.ActiveJob.jobId}}?

Please let me know if I misunderstand anything.
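
To make the chain concrete, here is a self-contained toy model of the shape described above (class and method names mirror Spark's internals, but this is an illustration, not Spark source):

{code:scala}
import java.util.concurrent.atomic.AtomicInteger

// submitJob mints a job id and returns a JobWaiter that carries it, while
// runJob blocks on the waiter and never surfaces the id to the caller.
class JobWaiter(val jobId: Int) {
  def awaitResult(): Unit = () // stand-in for blocking until the job finishes
}

class DagSchedulerModel {
  private val nextJobId = new AtomicInteger(0)

  def submitJob(): JobWaiter =
    new JobWaiter(nextJobId.getAndIncrement()) // the id is minted here...

  def runJob(): Unit =
    submitJob().awaitResult() // ...but the blocking path swallows it
}
{code}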


[jira] [Commented] (SPARK-2636) no where to get job identifier while submit spark job through spark API

2014-08-05 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086494#comment-14086494
 ] 

Marcelo Vanzin commented on SPARK-2636:
-------------------------------------

(BTW, I just checked SPARK-2321, so if you really mean the {{Job}} id, ignore 
my comments; yes, it's kind of a pain to know the ID of a job you're 
submitting to the context.)


[jira] [Commented] (SPARK-2636) no where to get job identifier while submit spark job through spark API

2014-08-05 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087102#comment-14087102
 ] 

Chengxiang Li commented on SPARK-2636:
-------------------------------------

{quote}
There are two ways I think. One is for DAGScheduler.runJob to return an integer 
(or long) id for the job. An alternative, which I think is better and relates 
to SPARK-2321, is for runJob to return some Job object that has information 
about the id and can be queried about progress.
{quote}
{{DAGScheduler}} is a Spark-internal class, so users can hardly use it 
directly. I like your second idea: return a job info object when submitting a 
Spark job at the {{SparkContext}} ({{JavaSparkContext}} in this case) or RDD 
level. Actually, {{AsyncRDDActions}} has already done part of this work; I 
think it may be a good place to fix this issue.
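
A sketch of how the {{AsyncRDDActions}} route could look, assuming Spark 1.x: async actions already return a {{FutureAction}} instead of blocking, which makes them a natural place to also expose the submitted job's id(s). The {{jobIds}} accessor shown in the comment below is the proposed shape, not an existing API at the time of writing:

{code:scala}
import scala.concurrent.Await
import scala.concurrent.duration.Duration

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // brings in rddToAsyncRDDActions for countAsync()

object AsyncJobIdExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("job-id-demo").setMaster("local[2]"))

    // countAsync() submits the job and returns a FutureAction right away,
    // so the caller already holds a handle at submission time.
    val future = sc.parallelize(1 to 1000, 4).countAsync()

    // Proposed shape (hypothetical at the time of this comment):
    // val ids: Seq[Int] = future.jobIds

    println("count = " + Await.result(future, Duration.Inf))
    sc.stop()
  }
}
{code}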


[jira] [Commented] (SPARK-2636) no where to get job identifier while submit spark job through spark API

2014-07-22 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071311#comment-14071311
 ] 

Chengxiang Li commented on SPARK-2636:
-------------------------------------

cc [~rxin] [~xuefuz]

--
This message was sent by Atlassian JIRA
(v6.2#6252)