I’ve been looking at how this is implemented in the UI:
https://github.com/apache/spark/blob/branch-0.9/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala

1/ It’s easy to get the RDD name at the stage-event level.
2/ The tricky part is that at the task level, we cannot link tasks back to 
their corresponding stage the way the UI does, because TaskInfo is private 
(private[spark], in fact): 

val stageIdToTaskInfos =
    HashMap[Int, HashSet[(TaskInfo, Option[TaskMetrics], Option[ExceptionFailure])]]()

Tell me if I’m wrong, but I guess that’s the end of the story: there’s no way 
to do that without a custom build of Spark.
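
For what it’s worth, here is a minimal sketch of the bookkeeping I have in 
mind, assuming each task event carried its stage id (which is exactly what is 
not publicly accessible in 0.9). StageSubmitted, TaskEnded and ProgressTracker 
are hypothetical stand-ins written for illustration; they are not Spark classes.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for listener events; these are NOT Spark classes.
sealed trait ProgressEvent
case class StageSubmitted(stageId: Int, rddName: String, numTasks: Int) extends ProgressEvent
case class TaskEnded(stageId: Int, succeeded: Boolean) extends ProgressEvent

// Aggregates task completion per RDD name, using the stage id as the link.
class ProgressTracker {
  private val stageToName = mutable.HashMap[Int, String]()
  private val totalTasks = mutable.HashMap[String, Int]().withDefaultValue(0)
  private val doneTasks = mutable.HashMap[String, Int]().withDefaultValue(0)

  def handle(event: ProgressEvent): Unit = event match {
    case StageSubmitted(id, name, n) =>
      stageToName(id) = name
      totalTasks(name) += n
    case TaskEnded(id, ok) =>
      // Tasks from unknown stages and failed tasks do not count as progress.
      for (name <- stageToName.get(id) if ok) doneTasks(name) += 1
  }

  // Fraction of completed tasks for the given RDD name, 0.0 if unknown.
  def progress(name: String): Double = {
    val total = totalTasks(name)
    if (total == 0) 0.0 else doneTasks(name).toDouble / total
  }
}

object Demo extends App {
  val tracker = new ProgressTracker
  tracker.handle(StageSubmitted(0, "wordCounts", 4))
  tracker.handle(TaskEnded(0, succeeded = true))
  tracker.handle(TaskEnded(0, succeeded = true))
  println(tracker.progress("wordCounts")) // prints 0.5
}
```

The RDD name acts as the key tying a stage back to the action in the app code, 
so several stages sharing a name are summed into one progress figure.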

HTH




Pierre Borckmans
Software team

RealImpact Analytics | Brussels Office
www.realimpactanalytics.com | pierre.borckm...@realimpactanalytics.com

FR +32 485 91 87 31 | Skype pierre.borckmans






On 23 May 2014, at 16:40, Otávio Carvalho [via Apache Spark User List] 
<ml-node+s1001560n6321...@n3.nabble.com> wrote:

> Mayur,
> 
> I'm interested in it as well. Can you send it to me?
> 
> Cheers,
> 
> 
> Otávio Carvalho.
> Undergrad. Student at Federal University of Rio Grande do Sul
> Porto Alegre, Brazil.
> 
> 
> 2014-05-23 11:00 GMT-03:00 Pierre Borckmans <[hidden email]>:
> That would be great, Mayur, thanks!
> 
> Anyhow, to be more specific, my question really was the following:
> 
> Is there any way to link events in the SparkListener to an action triggered 
> in your code?
> 
> Cheers
> 
> On 23 May 2014, at 10:17, Mayur Rustagi <[hidden email]> wrote:
> 
>> We have an internal patched version of the Spark web UI which exports 
>> application-related data as JSON. We use monitoring systems as well as an 
>> alternate UI on top of that JSON data for our specific application. We 
>> found it much cleaner. I can provide a 0.9.1 version.
>> I will submit it as a pull request soon. 
>> 
>> 
>> Mayur Rustagi
>> Ph: +1 (760) 203 3257
>> http://www.sigmoidanalytics.com
>> @mayur_rustagi
>> 
>> 
>> 
>> On Fri, May 23, 2014 at 10:57 AM, Chester <[hidden email]> wrote:
>> This is something we are interested in as well. We are planning to 
>> investigate this further. If anyone has suggestions, we would love to hear 
>> them.
>> 
>> Chester
>> 
>> Sent from my iPad
>> 
>> On May 22, 2014, at 8:02 AM, Pierre B <[hidden email]> wrote:
>> 
>>> Hi Andy!
>>> 
>>> Yes, the Spark UI provides a lot of interesting information for debugging 
>>> purposes.
>>> 
>>> Here I’m trying to integrate simple progress monitoring into my app UI.
>>> 
>>> I’m typically running a few “jobs” (or rather actions), and I’d like to be 
>>> able to display the progress of each of those in my UI.
>>> 
>>> I don’t really see how I could do that using SparkListener for the moment…
>>> 
>>> Thanks for your help!
>>> 
>>> Cheers!
>>> 
>>> On 22 May 2014, at 16:58, andy petrella [via Apache Spark User List] 
>>> <[hidden email]> wrote:
>>> 
>>>> SparkListener offers good stuff.
>>>> But I also complemented it with some metrics code of my own that uses 
>>>> Akka to aggregate metrics from anywhere I'd like to collect them (without 
>>>> any deps on Ganglia, yet on Codahale).
>>>> However, this was useful for gathering some custom metrics (from within 
>>>> the tasks), not really for collecting overall monitoring information 
>>>> about Spark itself.
>>>> For that, the Spark UI already offers pretty good insight, no?
>>>> 
>>>> Cheers,
>>>> 
>>>> aℕdy ℙetrella
>>>> about.me/noootsab
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Thu, May 22, 2014 at 4:51 PM, Pierre B <[hidden email]> wrote:
>>>> Is there a simple way to monitor the overall progress of an action using
>>>> SparkListener or anything else?
>>>> 
>>>> I see that one can name an RDD... Could that be used to determine which
>>>> action triggered a stage, ... ?
>>>> 
>>>> 
>>>> Thanks
>>>> 
>>>> Pierre
>>>> 
>>>> 
>>>> 
>>>> --
>>>> View this message in context: 
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Use-SparkListener-to-get-overall-progress-of-an-action-tp6256.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>> 
>> 
> 





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Use-SparkListener-to-get-overall-progress-of-an-action-tp6256p6324.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
