You can try updating metrics.properties for the sink of your choice. In our
case, we add the following to get application metrics in JSON format over
HTTP:

    *.sink.reifier.class=org.apache.spark.metrics.sink.MetricsServlet

Here, we have defined a sink named reifier whose class is the MetricsServlet
class. You can then poll <master ui>/metrics/applications/json.
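For example, once the servlet sink is enabled, you can watch the metrics
with a small polling script. A minimal Python sketch (the master host/port
and the 10-second interval are placeholders to adjust; it assumes the
Dropwizard-style JSON layout the servlet emits, with metrics grouped under
"gauges", "counters", and so on):

    import json
    import time
    import urllib.request

    # Placeholder URL -- substitute your master UI host and port
    METRICS_URL = "http://localhost:8080/metrics/applications/json"

    while True:
        # Fetch the current metrics snapshot from the servlet sink
        with urllib.request.urlopen(METRICS_URL) as resp:
            snapshot = json.load(resp)

        # Print counter names and values; filter for whatever you care about
        for name, metric in snapshot.get("counters", {}).items():
            print(name, metric.get("count"))

        time.sleep(10)  # placeholder poll interval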
Take a look at https://github.com/hammerlab/spark-json-relay if it serves
your needs.

Thanks,
Sonal
Nube Technologies <http://www.nubetech.co>

<http://in.linkedin.com/in/sonalgoyal>

On Wed, Dec 7, 2016 at 1:10 AM, Chawla,Sumit <sumitkcha...@gmail.com> wrote:

> Any pointers on this?
>
> Regards
> Sumit Chawla
>
> On Mon, Dec 5, 2016 at 8:30 PM, Chawla,Sumit <sumitkcha...@gmail.com>
> wrote:
>
>> An example implementation I found is:
>> https://github.com/groupon/spark-metrics
>>
>> Does anyone have any experience using this? I am more interested in
>> something for PySpark specifically.
>>
>> The above link pointed to
>> https://github.com/apache/spark/blob/master/conf/metrics.properties.template.
>> I need to spend some time reading it, but any quick pointers will be
>> appreciated.
>>
>> Regards
>> Sumit Chawla
>>
>> On Mon, Dec 5, 2016 at 8:17 PM, Chawla,Sumit <sumitkcha...@gmail.com>
>> wrote:
>>
>>> Hi Manish
>>>
>>> I am specifically looking for something similar to the following:
>>>
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/common/index.html#accumulators--counters
>>>
>>> Flink has this concept of Accumulators, where a user can keep custom
>>> counters and the like. While the application is executing, these
>>> counters are queryable through the REST API provided by the Flink
>>> monitoring backend. This way you don't have to wait for the program to
>>> complete.
>>>
>>> Regards
>>> Sumit Chawla
>>>
>>> On Mon, Dec 5, 2016 at 5:53 PM, manish ranjan <cse1.man...@gmail.com>
>>> wrote:
>>>
>>>> http://spark.apache.org/docs/latest/monitoring.html
>>>>
>>>> You can even install tools like dstat
>>>> <http://dag.wieers.com/home-made/dstat/>, iostat
>>>> <http://linux.die.net/man/1/iostat>, and iotop
>>>> <http://linux.die.net/man/1/iotop>; *collectd* can provide
>>>> fine-grained profiling on individual nodes.
>>>>
>>>> If you are using Mesos as the resource manager, Mesos exposes metrics
>>>> for the running job as well.
>>>>
>>>> ~Manish
>>>>
>>>> On Mon, Dec 5, 2016 at 4:17 PM, Chawla,Sumit <sumitkcha...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi All
>>>>>
>>>>> I have a long-running job which takes hours and hours to process
>>>>> data. How can I monitor the operational efficiency of this job? I am
>>>>> interested in something like Storm/Flink-style user
>>>>> metrics/aggregators, which I can monitor while my job is running.
>>>>> Using these metrics I want to monitor per-partition performance in
>>>>> processing items. As of now, the only way for me to get these metrics
>>>>> is when the job finishes.
>>>>>
>>>>> One possibility is that Spark could flush the metrics to an external
>>>>> system every few seconds, and we could thus use an external system to
>>>>> monitor these metrics. However, I wanted to see if Spark supports any
>>>>> such use case out of the box.
>>>>>
>>>>> Regards
>>>>> Sumit Chawla
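On the Flink-style accumulators asked about in the quoted thread: PySpark
does have an accumulator API, but unlike Flink's, accumulator values are
only readable on the driver while the job runs; there is no built-in REST
endpoint for them. A minimal PySpark sketch (the app name, record count,
and partition count are illustrative):

    from pyspark import SparkContext

    sc = SparkContext(appName="accumulator-demo")  # illustrative app name

    # Driver-side accumulator; tasks on executors can only add to it
    processed = sc.accumulator(0)

    def handle(record):
        # ... real per-record work would go here ...
        processed.add(1)

    sc.parallelize(range(1000), 8).foreach(handle)

    # .value is readable only on the driver, so to watch progress mid-job
    # you would still need to publish it to an external system yourself
    print("records processed:", processed.value)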
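And on the question of flushing metrics to an external system every few
seconds: that is what the non-servlet sinks in metrics.properties do out of
the box. A sketch using the Graphite sink (host, port, and prefix are
placeholders for your own setup):

    *.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
    *.sink.graphite.host=graphite.example.com
    *.sink.graphite.port=2003
    *.sink.graphite.period=10
    *.sink.graphite.unit=seconds
    *.sink.graphite.prefix=myjob

With period set to 10 seconds, Spark pushes its metric registry to Graphite
on that interval, so an external dashboard can track the job while it runs.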