Re: Use SparkListener to get overall progress of an action

2014-05-23 Thread Mayur Rustagi
We have an internal patched version of the Spark web UI that exports
application-related data as JSON. We feed that JSON to our monitoring systems
and to an alternate UI for our specific application, and have found it much
cleaner. I can provide a version for 0.9.1.
I will submit it as a pull request soon.


Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi https://twitter.com/mayur_rustagi



 On Thu, May 22, 2014 at 4:51 PM, Pierre B [hidden email] wrote:

 Is there a simple way to monitor the overall progress of an action using
 SparkListener or anything else?

 I see that one can name an RDD... Could that be used to determine which
 action triggered a stage, ... ?


 Thanks

 Pierre



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/Use-SparkListener-to-get-overall-progress-of-an-action-tp6256.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.








Re: Use SparkListener to get overall progress of an action

2014-05-23 Thread Pierre Borckmans
That would be great, Mayur, thanks!

Anyhow, to be more specific, my question really was the following:

Is there any way to link events in the SparkListener to an action triggered in 
your code?

Cheers




Pierre Borckmans
Software team

RealImpact Analytics | Brussels Office
www.realimpactanalytics.com | pierre.borckm...@realimpactanalytics.com

FR +32 485 91 87 31 | Skype pierre.borckmans









Re: Use SparkListener to get overall progress of an action

2014-05-23 Thread Chester At Yahoo
Sounds like just what we need. For Hadoop we have a progress bar to show the 
current status of the job. We would like to do the same for Spark. The YARN 
client only shows the percentage progress; it does not show any text info. 

Does your PR work in YARN mode?


Chester

Sent from my iPhone



Re: Use SparkListener to get overall progress of an action

2014-05-23 Thread Otávio Carvalho
Mayur,

I'm interested in it as well. Can you send it to me?

Cheers,


Otávio Carvalho.
Undergrad. Student at Federal University of Rio Grande do Sul
Porto Alegre, Brazil.



Re: Use SparkListener to get overall progress of an action

2014-05-23 Thread Pierre B
I’ve been looking at how this is implemented in the UI:
https://github.com/apache/spark/blob/branch-0.9/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala

1/ It’s easy to get the RDD name at the stage-event level.
2/ The tricky part is at the task level: we cannot link tasks back to their 
corresponding stage the way the UI does, because TaskInfo is private (in fact 
private[spark]):

val stageIdToTaskInfos =
  HashMap[Int, HashSet[(TaskInfo, Option[TaskMetrics], Option[ExceptionFailure])]]()

Tell me if I’m wrong, but I guess that’s the end of the story: there’s no way 
to do that without a custom build of Spark…
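
The bookkeeping itself is small once a stage id is available for each task 
event. Below is a minimal sketch, not Spark code: StageProgress and its method 
names are hypothetical, and it assumes that something (for example a listener 
in a Spark version that exposes the stage id on task events) calls 
stageSubmitted and taskEnded. It keeps only counts rather than the full 
TaskInfo tuples stored by the UI's map.

```scala
import scala.collection.mutable

// Hypothetical bookkeeping, loosely modelled on the stageIdToTaskInfos map
// quoted above. It stores counts only, so all it needs per task event is a
// stage id. Nothing in this class calls Spark.
final class StageProgress {
  private val totalTasks = mutable.HashMap[Int, Int]()
  private val finishedTasks = mutable.HashMap[Int, Int]()

  // Record a submitted stage and the number of tasks it will run.
  def stageSubmitted(stageId: Int, numTasks: Int): Unit = this.synchronized {
    totalTasks(stageId) = numTasks
    finishedTasks(stageId) = 0
  }

  // Record one finished task; events for unknown stages are ignored.
  def taskEnded(stageId: Int): Unit = this.synchronized {
    if (totalTasks.contains(stageId)) {
      finishedTasks(stageId) = finishedTasks.getOrElse(stageId, 0) + 1
    }
  }

  // Fraction of the stage's tasks that have finished, or None if the stage
  // was never submitted. An empty stage counts as complete.
  def fraction(stageId: Int): Option[Double] = this.synchronized {
    totalTasks.get(stageId).map { total =>
      if (total == 0) 1.0
      else finishedTasks.getOrElse(stageId, 0).toDouble / total
    }
  }
}
```

For a stage submitted with 4 tasks, two taskEnded calls give a fraction of 
Some(0.5); a stage id that was never submitted gives None.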

HTH





Re: Use SparkListener to get overall progress of an action

2014-05-23 Thread Philip Ogren

Hi Pierre,

I asked a similar question on this list about 6 weeks ago.  Here is one 
answer I got that is of particular note:
http://mail-archives.apache.org/mod_mbox/spark-user/201404.mbox/%3ccamjob8n3foaxd-dc5j57-n1oocwxefcg5chljwnut7qnreq...@mail.gmail.com%3E 


In the upcoming release of Spark 1.0 there will be a feature that 
provides for exactly what you describe: capturing the information 
displayed on the UI in JSON. More details will be provided in the 
documentation, but for now, anything before 0.9.1 can only go through 
JobLogger.scala, which outputs information in a somewhat arbitrary 
format and will be deprecated soon. If you find this feature useful, you 
can test it out by building the master branch of Spark yourself, 
following the instructions in https://github.com/apache/spark/pull/42.








Re: Use SparkListener to get overall progress of an action

2014-05-23 Thread Pierre B
Thanks Philip,

I don’t want to go the JobLogger way (too hacky ;) )

In version 1.0, if I’m not mistaken, you can even do what I’m asking for, since 
they removed the “private” modifier on TaskInfo and similar classes and 
replaced it with the “@DeveloperApi” annotation.
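
If the stage id is available on task events, progress can be rolled up per 
action by tagging each action with a label before it runs. This is a minimal 
sketch under stated assumptions: ActionProgress and its methods are 
hypothetical names, and the wiring is assumed to be a Spark 1.0 listener that 
reads the job group set through SparkContext.setJobGroup from the job-start 
event's properties and forwards stage ids and task counts. The Spark side is 
deliberately left out, since the exact event fields should be checked against 
the version in use.

```scala
import scala.collection.mutable

// Hypothetical tracker: ties stages to a caller-chosen action label and
// reports task-level progress per label. Nothing here calls Spark itself;
// a listener is assumed to forward the three kinds of events.
final class ActionProgress {
  private val stageToAction = mutable.HashMap[Int, String]()
  private val total = mutable.HashMap[String, Int]().withDefaultValue(0)
  private val done = mutable.HashMap[String, Int]().withDefaultValue(0)

  // A job started on behalf of `action`; remember which stages belong to it.
  def jobStarted(action: String, stageIds: Seq[Int]): Unit = this.synchronized {
    stageIds.foreach(id => stageToAction(id) = action)
  }

  // A stage was submitted with `numTasks` tasks; add them to its action's total.
  def stageSubmitted(stageId: Int, numTasks: Int): Unit = this.synchronized {
    stageToAction.get(stageId).foreach(a => total(a) = total(a) + numTasks)
  }

  // One task of `stageId` finished; credit it to the owning action.
  // Tasks of stages that were never tied to an action are ignored.
  def taskEnded(stageId: Int): Unit = this.synchronized {
    stageToAction.get(stageId).foreach(a => done(a) = done(a) + 1)
  }

  // (finished tasks, tasks known so far) for an action label.
  def progress(action: String): (Int, Int) = this.synchronized {
    (done(action), total(action))
  }
}
```

Totals only grow as stages are submitted, so the second number is a lower 
bound until the last stage of the action has started.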

I was looking for a simple way to do this in 0.9.1, but thanks anyway!

Pierre



Re: Use SparkListener to get overall progress of an action

2014-05-22 Thread andy petrella
SparkListener offers good stuff.
But I also complemented it with some metrics code of my own that uses Akka
to aggregate metrics from anywhere I'd like to collect them (without any
dependency on Ganglia, only on Codahale).
However, this was useful for gathering some custom metrics (from within the
tasks), not really for collecting overall monitoring information about
Spark itself.
For that, the Spark UI already offers pretty good insight, no?

Cheers,

 aℕdy ℙetrella
about.me/noootsab



Re: Use SparkListener to get overall progress of an action

2014-05-22 Thread Pierre B
Hi Andy!

Yes, the Spark UI provides a lot of interesting information for debugging purposes.

Here I’m trying to integrate simple progress monitoring into my app UI.

I’m typically running a few “jobs” (or rather actions), and I’d like to be able 
to display the progress of each of those in my UI.

I don’t really see how I could do that using SparkListener for the moment …

Thanks for your help!

Cheers!





Re: Use SparkListener to get overall progress of an action

2014-05-22 Thread andy petrella
Yeah, actually for that I used Codahale directly, with my own code running on
the Akka system from within Spark itself.

So the workers send messages back to a bunch of actors on the driver, which
use Codahale metrics.
This way I can collect what an executor does or did, and I can also
aggregate the metrics of all executors at once (via Codahale metrics
dedicated to aggregation).
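
The driver-side aggregation described above can be sketched without the
messaging layer. This is an illustration only: MetricUpdate and
MetricsAggregator are hypothetical names, and synchronized maps stand in for
the Akka actors and Codahale metrics of the actual setup, whose code is not
shown in this thread.

```scala
import scala.collection.mutable

// Hypothetical message an executor would send to the driver.
final case class MetricUpdate(executorId: String, name: String, delta: Long)

// Driver-side aggregator: keeps a per-executor view and a cluster-wide total.
// Plain maps replace the actors and metric registry described in the message.
final class MetricsAggregator {
  private val perExecutor =
    mutable.HashMap[(String, String), Long]().withDefaultValue(0L)
  private val overall = mutable.HashMap[String, Long]().withDefaultValue(0L)

  // Apply one update to both the per-executor and the overall counters.
  def receive(u: MetricUpdate): Unit = this.synchronized {
    val key = (u.executorId, u.name)
    perExecutor(key) = perExecutor(key) + u.delta
    overall(u.name) = overall(u.name) + u.delta
  }

  // Current value of a metric for a single executor (0 if never reported).
  def forExecutor(executorId: String, name: String): Long = this.synchronized {
    perExecutor((executorId, name))
  }

  // Current value of a metric summed over all executors (0 if never reported).
  def total(name: String): Long = this.synchronized { overall(name) }
}
```

In an actor-based version, receive would be the body of the actor's message
handler and the synchronization would no longer be needed, since an actor
processes one message at a time.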

However, I haven't had time to dig far enough into Spark to see whether I could
reuse the SparkListener system itself, which is kind of doing the same thing,
but without Akka AFAICT: I can see that TaskMetrics are collected per task
within the context/granularity of a stage. The aggregation then looks like
it is done in a built-in (queued) bus. So I'll let someone else report how
this could be extended, but my gut feeling is that it won't be straightforward.

HTH (given my limited knowledge of these internals ^^)

cheers,


  aℕdy ℙetrella
about.me/noootsab



Re: Use SparkListener to get overall progress of an action

2014-05-22 Thread Chester
This is something we are interested in as well. We are planning to investigate 
it further. If anyone has suggestions, we would love to hear them.

Chester

Sent from my iPad
