Any pointers on this?

Regards
Sumit Chawla
On Mon, Dec 5, 2016 at 8:30 PM, Chawla,Sumit <sumitkcha...@gmail.com> wrote:

> An example implementation I found is:
> https://github.com/groupon/spark-metrics
>
> Does anyone have experience using this? I am more interested in
> something for PySpark specifically.
>
> The above link pointed to
> https://github.com/apache/spark/blob/master/conf/metrics.properties.template.
> I need to spend some time reading it, but any quick pointers will be
> appreciated.
>
> Regards
> Sumit Chawla
>
> On Mon, Dec 5, 2016 at 8:17 PM, Chawla,Sumit <sumitkcha...@gmail.com> wrote:
>
>> Hi Manish
>>
>> I am specifically looking for something similar to the following:
>>
>> https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/common/index.html#accumulators--counters
>>
>> Flink has this concept of accumulators, where a user can keep custom
>> counters, etc. While the application is executing, these counters are
>> queryable through the REST API provided by the Flink monitoring
>> backend. This way you don't have to wait for the program to complete.
>>
>> Regards
>> Sumit Chawla
>>
>> On Mon, Dec 5, 2016 at 5:53 PM, manish ranjan <cse1.man...@gmail.com> wrote:
>>
>>> http://spark.apache.org/docs/latest/monitoring.html
>>>
>>> You can even install tools like dstat
>>> <http://dag.wieers.com/home-made/dstat/>, iostat
>>> <http://linux.die.net/man/1/iostat>, and iotop
>>> <http://linux.die.net/man/1/iotop>; *collectd* can provide
>>> fine-grained profiling on individual nodes.
>>>
>>> If you are using Mesos as the resource manager, Mesos exposes
>>> metrics for the running job as well.
>>>
>>> ~Manish
>>>
>>> On Mon, Dec 5, 2016 at 4:17 PM, Chawla,Sumit <sumitkcha...@gmail.com> wrote:
>>>
>>>> Hi All
>>>>
>>>> I have a long-running job which takes hours and hours to process
>>>> data. How can I monitor the operational efficiency of this job? I am
>>>> interested in something like Storm/Flink-style user
>>>> metrics/aggregators, which I can monitor while my job is running.
>>>> Using these metrics I want to monitor per-partition performance in
>>>> processing items. As of now, the only way for me to get these
>>>> metrics is when the job finishes.
>>>>
>>>> One possibility is for Spark to flush the metrics to an external
>>>> system every few seconds, and thus use that external system to
>>>> monitor these metrics. However, I wanted to see if Spark supports
>>>> any such use case out of the box.
>>>>
>>>> Regards
>>>> Sumit Chawla
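For reference, Spark's closest out-of-the-box analogue to Flink's
accumulators/counters is its own accumulator API, which PySpark exposes.
A minimal sketch follows; the app name and input path are hypothetical.
Note the caveat: unlike Flink's, these values are only readable on the
driver, updated as tasks complete, and are not queryable over a REST API
out of the box.

    # Minimal PySpark sketch of Flink-style counters via accumulators.
    # The app name and input path below are placeholders.
    from pyspark import SparkContext

    sc = SparkContext(appName="accumulator-demo")
    bad_records = sc.accumulator(0)      # counter defined on the driver

    def parse(line):
        try:
            return [int(line)]
        except ValueError:
            bad_records.add(1)           # tasks may only add to it
            return []

    total = sc.textFile("hdfs:///data/input.txt").flatMap(parse).count()
    # Accumulator values can only be read back on the driver.
    print("parsed=%d bad=%d" % (total, bad_records.value))

One more caveat for the per-partition question: an accumulator
aggregates across all tasks, so per-partition numbers would need one
accumulator per partition or an external sink.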
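On the "flush the metrics to an external system every few seconds"
idea: Spark's metrics system supports exactly this out of the box,
configured through the conf/metrics.properties file linked above. A
sketch that reports all components to Graphite every 10 seconds,
assuming a hypothetical endpoint graphite-host:2003:

    # Sketch of conf/metrics.properties; GraphiteSink ships with Spark,
    # but the host, port, and prefix below are placeholders.
    *.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
    *.sink.graphite.host=graphite-host
    *.sink.graphite.port=2003
    *.sink.graphite.period=10
    *.sink.graphite.unit=seconds
    *.sink.graphite.prefix=myjob

    # Optionally report JVM metrics from the driver and executors too.
    driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
    executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource

CsvSink and ConsoleSink are configured the same way if no Graphite
endpoint is available.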
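Separately, per the monitoring page Manish linked, the driver serves a
REST API (since Spark 1.4, on port 4040 by default) that can be polled
while the job is still running for per-stage progress. A hedged sketch,
assuming the third-party requests library is installed and with
driver-host as a placeholder:

    # Poll the driver's monitoring REST API for live per-stage progress.
    # "driver-host" is a placeholder for the actual driver address.
    import requests

    base = "http://driver-host:4040/api/v1"
    app_id = requests.get(base + "/applications").json()[0]["id"]
    stages = requests.get(base + "/applications/" + app_id + "/stages").json()
    for stage in stages:
        print("stage %s: %s, %d tasks complete" %
              (stage["stageId"], stage["status"], stage["numCompleteTasks"]))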