Hi Manish,

I am specifically looking for something similar to the following:
https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/common/index.html#accumulators--counters

Flink has this concept of Accumulators, where users can keep their own custom counters. While the application is executing, these counters are queryable through the REST API provided by the Flink monitoring backend. This way you don't have to wait for the program to complete.

Regards
Sumit Chawla

On Mon, Dec 5, 2016 at 5:53 PM, manish ranjan <cse1.man...@gmail.com> wrote:

> http://spark.apache.org/docs/latest/monitoring.html
>
> You can even install tools like dstat
> <http://dag.wieers.com/home-made/dstat/>, iostat
> <http://linux.die.net/man/1/iostat>, and iotop
> <http://linux.die.net/man/1/iotop>; *collectd* can provide fine-grained
> profiling on individual nodes.
>
> If you are using Mesos as the resource manager, Mesos exposes metrics for
> the running job as well.
>
> ~Manish
>
> On Mon, Dec 5, 2016 at 4:17 PM, Chawla,Sumit <sumitkcha...@gmail.com>
> wrote:
>
>> Hi All,
>>
>> I have a long-running job which takes hours to process data. How can I
>> monitor the operational efficiency of this job? I am interested in
>> something like Storm/Flink-style user metrics/aggregators, which I can
>> monitor while my job is running. Using these metrics I want to track
>> per-partition performance in processing items. As of now, the only way
>> for me to get these metrics is when the job finishes.
>>
>> One possibility is that Spark could flush the metrics to an external
>> system every few seconds, and thus I could use an external system to
>> monitor these metrics. However, I wanted to see if Spark supports any
>> such use case out of the box.
>>
>> Regards
>> Sumit Chawla
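
The "flush the metrics to an external system every few seconds" workaround mentioned in the thread can be sketched in plain Python. This is a hypothetical illustration, not a Spark or Flink API: `MetricFlusher`, its counter names, and the `sink` callback are all made-up names; in practice the sink could be a StatsD/Graphite client or an HTTP POST to a dashboard, and the counters would be updated from your processing code.

```python
import threading
import time
from collections import Counter

class MetricFlusher:
    """Periodically snapshots user counters and hands them to a sink callback.

    Hypothetical sketch: neither Spark nor Flink ships this class; it only
    illustrates the periodic-flush idea discussed in the thread.
    """

    def __init__(self, sink, interval_seconds=5.0):
        self.counters = Counter()          # e.g. {"partition-3.items": 1200}
        self._sink = sink                  # callable taking a dict snapshot
        self._interval = interval_seconds
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def incr(self, name, amount=1):
        """Increment a named counter from the processing code (thread-safe)."""
        with self._lock:
            self.counters[name] += amount

    def _run(self):
        # wait() returns False on timeout (flush again) and True on stop().
        while not self._stop.wait(self._interval):
            self.flush()

    def flush(self):
        """Push a consistent snapshot of all counters to the external sink."""
        with self._lock:
            snapshot = dict(self.counters)
        self._sink(snapshot)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
        self.flush()                       # final flush on shutdown

# Usage: record per-partition progress; here the "external system" is just
# a list collecting snapshots, flushed every 0.1 s for the demo.
snapshots = []
flusher = MetricFlusher(sink=snapshots.append, interval_seconds=0.1)
flusher.start()
for partition in range(4):
    flusher.incr(f"partition-{partition}.items", 100)
time.sleep(0.25)
flusher.stop()
print(snapshots[-1])  # final counter values seen by the sink
```

With a real job, each task would call `incr` as it processes items, and an operator would watch the external system instead of waiting for the job to finish.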