Re: Use SparkListener to get overall progress of an action
We have an internally patched version of the Spark web UI that exports application-related data as JSON. We feed that JSON to our monitoring systems and to an alternate UI built for our specific application, and have found this much cleaner. I can provide the 0.9.1 version, and will submit it as a pull request soon.

Mayur Rustagi
http://www.sigmoidanalytics.com
@mayur_rustagi https://twitter.com/mayur_rustagi

On Fri, May 23, 2014 at 10:57 AM, Chester wrote:
> This is something we are interested as well. We are planning to investigate more on this. If someone has suggestions, we would love to hear.
Re: Use SparkListener to get overall progress of an action
That would be great, Mayur, thanks!

Anyhow, to be more specific, my question really was the following: is there any way to link events in the SparkListener to an action triggered in your code?

Cheers,
Pierre Borckmans
Software team, RealImpact Analytics | Brussels Office
www.realimpactanalytics.com

On 23 May 2014, at 10:17, Mayur Rustagi wrote:
> We have an internal patched version of Spark webUI which exports application related data as Json. We use monitoring systems as well as alternate UI for that json data for our specific application. Found it much cleaner. Can provide 0.9.1 version. Would submit as a pull request soon.
Re: Use SparkListener to get overall progress of an action
Sounds like just what we need. For Hadoop we have a progress bar that shows the current status of the job, and we would like to do the same for Spark. The YARN client only shows the percentage progress; it doesn't show any text info. Does your PR work in YARN mode?

Chester

On May 23, 2014, at 1:17 AM, Mayur Rustagi wrote:
> We have an internal patched version of Spark webUI which exports application related data as Json. We use monitoring systems as well as alternate UI for that json data for our specific application. Found it much cleaner. Can provide 0.9.1 version. Would submit as a pull request soon.
Re: Use SparkListener to get overall progress of an action
Mayur, I'm interested in it as well. Can you send it to me?

Cheers,
Otávio Carvalho
Undergraduate student at Federal University of Rio Grande do Sul
Porto Alegre, Brazil

2014-05-23 11:00 GMT-03:00 Pierre Borckmans:
> That would be great, Mayur, thanks! Anyhow, to be more specific, my question really was the following: Is there any way to link events in the SparkListener to an action triggered in your code?
Re: Use SparkListener to get overall progress of an action
I've been looking at how this is implemented in the UI:
https://github.com/apache/spark/blob/branch-0.9/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala

1/ It's easy to get the RDD name at the stage-event level.
2/ The tricky part is at the task level: we cannot link the tasks back to their corresponding stage the way the UI does, because TaskInfo is private (in fact private[spark]):

    val stageIdToTaskInfos = HashMap[Int, HashSet[(TaskInfo, Option[TaskMetrics], Option[ExceptionFailure])]]()

Tell me if I'm wrong, but I guess that's the end of the story: there's no way to do that without a custom build of Spark...

HTH
Pierre Borckmans
Software team, RealImpact Analytics | Brussels Office
www.realimpactanalytics.com

On 23 May 2014, at 16:40, Otávio Carvalho [via Apache Spark User List] wrote:
> Mayur, I'm interested on it as well. Can you send me?
Re: Use SparkListener to get overall progress of an action
Hi Pierre,

I asked a similar question on this list about 6 weeks ago. Here is one answer I got that is of particular note (http://mail-archives.apache.org/mod_mbox/spark-user/201404.mbox/%3ccamjob8n3foaxd-dc5j57-n1oocwxefcg5chljwnut7qnreq...@mail.gmail.com%3E):

> In the upcoming release of Spark 1.0 there will be a feature that provides for exactly what you describe: capturing the information displayed on the UI in JSON. More details will be provided in the documentation, but for now, anything before 0.9.1 can only go through JobLogger.scala, which outputs information in a somewhat arbitrary format and will be deprecated soon. If you find this feature useful, you can test it out by building the master branch of Spark yourself, following the instructions in https://github.com/apache/spark/pull/42.

On 05/22/2014 08:51 AM, Pierre B wrote:
> Is there a simple way to monitor the overall progress of an action using SparkListener or anything else? I see that one can name an RDD... Could that be used to determine which action triggered a stage, ... ? Thanks Pierre
Re: Use SparkListener to get overall progress of an action
Thanks Philip,

I don't want to go the JobLogger way (too hacky ;) ).

In version 1.0, if I'm not mistaken, you can even do what I'm asking for, since they removed the "private" for TaskInfo and such and replaced it with the "@DeveloperApi" annotation.

I was looking for a simple way to do this in 0.9.1, but thanks anyway!

Pierre

On 23 May 2014, at 17:41, Philip Ogren [via Apache Spark User List] wrote:
> Hi Pierre, I asked a similar question on this list about 6 weeks ago. Here is one answer I got that is of particular note:
>> In the upcoming release of Spark 1.0 there will be a feature that provides for exactly what you describe: capturing the information displayed on the UI in JSON. More details will be provided in the documentation, but for now, anything before 0.9.1 can only go through JobLogger.scala, which outputs information in a somewhat arbitrary format and will be deprecated soon. If you find this feature useful, you can test it out by building the master branch of Spark yourself, following the instructions in https://github.com/apache/spark/pull/42.
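
For readers following the thread, here is a minimal sketch of the bookkeeping being asked for, written in plain Scala with no Spark dependency. Everything in it is an assumption made for illustration: the event classes `JobStart`, `StageSubmitted` and `TaskEnd`, the `label` field and the `ProgressTracker` class are hypothetical stand-ins, not Spark's API. It assumes a listener can observe which stage IDs belong to a job, how many tasks each stage has, and which stage a finished task belonged to.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for listener events; these are NOT Spark classes.
sealed trait Event
final case class JobStart(jobId: Int, stageIds: Seq[Int], label: String) extends Event
final case class StageSubmitted(stageId: Int, numTasks: Int) extends Event
final case class TaskEnd(stageId: Int) extends Event

// Tracks finished vs. known tasks for each labelled action (job).
final class ProgressTracker {
  private val stageToJob = mutable.Map.empty[Int, Int]
  private val jobLabel = mutable.Map.empty[Int, String]
  private val total = mutable.Map.empty[Int, Int].withDefaultValue(0)
  private val done = mutable.Map.empty[Int, Int].withDefaultValue(0)

  def handle(event: Event): Unit = synchronized {
    event match {
      case JobStart(jobId, stageIds, label) =>
        jobLabel(jobId) = label
        stageIds.foreach(id => stageToJob(id) = jobId)
      case StageSubmitted(stageId, numTasks) =>
        // Task counts only become known when a stage is submitted.
        stageToJob.get(stageId).foreach(job => total(job) += numTasks)
      case TaskEnd(stageId) =>
        // Tasks from stages not tied to a known job are ignored.
        stageToJob.get(stageId).foreach(job => done(job) += 1)
    }
  }

  // Fraction of the tasks known so far that have finished, keyed by label.
  def progress: Map[String, Double] = synchronized {
    jobLabel.map { case (job, label) =>
      val t = total(job)
      label -> (if (t == 0) 0.0 else done(job).toDouble / t)
    }.toMap
  }
}

object ProgressDemo {
  def main(args: Array[String]): Unit = {
    val tracker = new ProgressTracker
    tracker.handle(JobStart(0, Seq(0, 1), "count users"))
    tracker.handle(StageSubmitted(0, 4))
    tracker.handle(TaskEnd(0))
    tracker.handle(TaskEnd(0))
    println(tracker.progress("count users")) // 2 of 4 known tasks: 0.5
  }
}
```

Because a stage's task count is only added once that stage is submitted, the reported fraction can drop when a later stage starts. A real progress bar would probably want to smooth that out or weight by stage.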
Re: Use SparkListener to get overall progress of an action
SparkListener offers good stuff. But I also complemented it with some metrics code of my own that uses Akka to aggregate metrics from anywhere I'd like to collect them (with no dependency on Ganglia, only on Codahale). However, this was useful for gathering custom metrics (from within the tasks), not really for collecting overall monitoring information about the Spark internals themselves. For that, the Spark UI already offers pretty good insight, no?

Cheers,
aℕdy ℙetrella
about.me/noootsab

On Thu, May 22, 2014 at 4:51 PM, Pierre B wrote:
> Is there a simple way to monitor the overall progress of an action using SparkListener or anything else? I see that one can name an RDD... Could that be used to determine which action triggered a stage, ... ? Thanks Pierre
Re: Use SparkListener to get overall progress of an action
Hi Andy!

Yes, the Spark UI provides a lot of interesting information for debugging purposes. Here I'm trying to integrate simple progress monitoring into my app UI. I'm typically running a few "jobs" (or rather actions), and I'd like to be able to display the progress of each of those in my UI. I don't really see how I could do that using SparkListener for the moment...

Thanks for your help! Cheers!

Pierre Borckmans
Software team, RealImpact Analytics | Brussels Office
www.realimpactanalytics.com

On 22 May 2014, at 16:58, andy petrella [via Apache Spark User List] wrote:
> SparkListener offers good stuffs. But I also completed it with another metrics stuffs on my own that use Akka to aggregate metrics from anywhere I'd like to collect them (without any deps on ganglia yet on Codahale). However, this was useful to gather some custom metrics (from within the tasks then) not really to collect overall monitoring information about the spark thingies themselves. For that Spark UI offers already a pretty good insight no?
Re: Use SparkListener to get overall progress of an action
Yeah, actually for that I used Codahale directly with my own code, using the Akka system from within Spark itself. So the workers send messages back to a bunch of actors on the driver, which use Codahale metrics. This way I can collect what an executor does (or did) and how, and I can also aggregate all executors' metrics at once (via dedicated aggregation-purposed Codahale metrics).

However, I didn't have time to dig deep enough into Spark to see whether I could reuse the SparkListener system itself, which is kind of doing the same thing but without Akka, AFAICT. There I can see that TaskMetrics are collected per task within the context/granularity of a Stage. Then the aggregation looks like it is done in a built-in (queued) bus.

So I'll let someone else report how this could be extended, but my gut feeling is that it won't be straightforward.

HTH (with respect to my limited knowledge of these internals ^^)

cheers,
aℕdy ℙetrella
about.me/noootsab

On Thu, May 22, 2014 at 5:02 PM, Pierre B wrote:
> Hi Andy! Yes Spark UI provides a lot of interesting informations for debugging purposes. Here I'm trying to integrate a simple progress monitoring in my app ui. I'm typically running a few "jobs" (or rather actions), and I'd like to be able to display the progress of each of those in my ui. I don't really see how i could do that using SparkListener for the moment ... Thanks for your help! Cheers!
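
The pattern described above (executors report metric updates, the driver keeps per-executor values plus an aggregate over all executors) can be sketched roughly as follows. This is not andy's code, and it deliberately swaps in plain JDK concurrency classes for both Akka actors and Codahale metrics so that it runs standalone; `MetricUpdate` and `MetricAggregator` are hypothetical names, and the lambda syntax assumes Scala 2.12 or later.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.LongAdder

// Hypothetical message an executor might send to the driver.
// Not a Spark, Akka or Codahale type.
final case class MetricUpdate(executorId: String, name: String, delta: Long)

// Driver-side aggregator: one counter per (executor, metric) pair,
// plus one counter per metric summed over all executors.
final class MetricAggregator {
  private val perExecutor = new ConcurrentHashMap[(String, String), LongAdder]()
  private val overall = new ConcurrentHashMap[String, LongAdder]()

  // Safe to call from several threads, e.g. one per incoming connection.
  def record(update: MetricUpdate): Unit = {
    perExecutor
      .computeIfAbsent((update.executorId, update.name), _ => new LongAdder)
      .add(update.delta)
    overall
      .computeIfAbsent(update.name, _ => new LongAdder)
      .add(update.delta)
  }

  def forExecutor(executorId: String, name: String): Long =
    Option(perExecutor.get((executorId, name))).map(_.sum()).getOrElse(0L)

  def total(name: String): Long =
    Option(overall.get(name)).map(_.sum()).getOrElse(0L)
}
```

In an actor-based version, `record` would be the body of the receiving actor's message handler, and the counters would be replaced by whatever metric types the metrics library provides.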
Re: Use SparkListener to get overall progress of an action
This is something we are interested in as well. We are planning to investigate this further. If someone has suggestions, we would love to hear them.

Chester

On May 22, 2014, at 8:02 AM, Pierre B wrote:
> Hi Andy! Yes Spark UI provides a lot of interesting informations for debugging purposes. Here I'm trying to integrate a simple progress monitoring in my app ui. I'm typically running a few "jobs" (or rather actions), and I'd like to be able to display the progress of each of those in my ui. I don't really see how i could do that using SparkListener for the moment ... Thanks for your help! Cheers!