Hi Manish,

I am specifically looking for something similar to the following:
https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/common/index.html#accumulators--counters

Flink has this concept of Accumulators, where users can keep their own custom counters. While the application is executing, these counters are queryable through the REST API provided by the Flink monitoring backend. This way you don't have to wait for the program to complete.

Regards
Sumit Chawla

On Mon, Dec 5, 2016 at 5:53 PM, manish ranjan <cse1.man...@gmail.com> wrote:

> http://spark.apache.org/docs/latest/monitoring.html
>
> You can even install tools like dstat
> <http://dag.wieers.com/home-made/dstat/>, iostat
> <http://linux.die.net/man/1/iostat>, and iotop
> <http://linux.die.net/man/1/iotop>; *collectd* can provide fine-grained
> profiling on individual nodes.
>
> If you are using Mesos as the resource manager, Mesos exposes metrics for
> the running job as well.
>
> ~Manish
>
> On Mon, Dec 5, 2016 at 4:17 PM, Chawla,Sumit <sumitkcha...@gmail.com>
> wrote:
>
>> Hi All,
>>
>> I have a long-running job which takes hours to process data. How can I
>> monitor the operational efficiency of this job? I am interested in
>> something like Storm/Flink-style user metrics/aggregators, which I can
>> monitor while my job is running. Using these metrics I want to track
>> per-partition performance in processing items. As of now, the only way
>> for me to get these metrics is when the job finishes.
>>
>> One possibility is that Spark could flush the metrics to an external
>> system every few seconds, and thus I could use an external system to
>> monitor these metrics. However, I wanted to see if Spark supports any
>> such use case out of the box.
>>
>> Regards
>> Sumit Chawla
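
The "flush the metrics to an external system every few seconds" workaround mentioned in the thread can be sketched in plain Python. This is a hypothetical illustration, not a Spark or Flink API: `MetricFlusher`, its counter names, and the `sink` callback are all made-up names; in practice the sink could be a StatsD/Graphite client or an HTTP POST to a dashboard, and the counters would be updated from your processing code.

```python
import threading
import time
from collections import Counter

class MetricFlusher:
    """Periodically snapshots user counters and hands them to a sink callback.

    Hypothetical sketch: neither Spark nor Flink ships this class; it only
    illustrates the periodic-flush idea discussed in the thread.
    """

    def __init__(self, sink, interval_seconds=5.0):
        self.counters = Counter()          # e.g. {"partition-3.items": 1200}
        self._sink = sink                  # callable taking a dict snapshot
        self._interval = interval_seconds
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def incr(self, name, amount=1):
        """Increment a named counter from the processing code (thread-safe)."""
        with self._lock:
            self.counters[name] += amount

    def _run(self):
        # wait() returns False on timeout (flush again) and True on stop().
        while not self._stop.wait(self._interval):
            self.flush()

    def flush(self):
        """Push a consistent snapshot of all counters to the external sink."""
        with self._lock:
            snapshot = dict(self.counters)
        self._sink(snapshot)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
        self.flush()                       # final flush on shutdown

# Usage: record per-partition progress; here the "external system" is just
# a list collecting snapshots, flushed every 0.1 s for the demo.
snapshots = []
flusher = MetricFlusher(sink=snapshots.append, interval_seconds=0.1)
flusher.start()
for partition in range(4):
    flusher.incr(f"partition-{partition}.items", 100)
time.sleep(0.25)
flusher.stop()
print(snapshots[-1])  # final counter values seen by the sink
```

With a real job, each task would call `incr` as it processes items, and an operator would watch the external system instead of waiting for the job to finish.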