Spark REST API in YARN client mode returns incomplete information?

2016-10-06 Thread Vladimir Tretyakov
Hi,

When I start Spark v1.6 (cdh5.8.0) in YARN client mode I see that port 4040
is available, but the UI shows nothing and the API returns incomplete information.

I started the Spark application like this:

spark-submit --master yarn-client --class
org.apache.spark.examples.SparkPi
/usr/lib/spark/examples/lib/spark-examples-1.6.0-cdh5.8.0-hadoop2.6.0-cdh5.8.0.jar
1

The API returns:

http://localhost:4040/api/v1/applications

[ {
  "name" : "Spark Pi",
  "attempts" : [ {
"startTime" : "2016-10-05T11:27:54.558GMT",
"endTime" : "1969-12-31T23:59:59.999GMT",
"sparkUser" : "",
"completed" : false
  } ]
} ]

Where is the application id? How can I get more detailed information about the
application without this id (I mean the /applications/[app-id]/jobs,
/applications/[app-id]/stages, etc. URLs from
http://spark.apache.org/docs/1.6.0/monitoring.html)?

The UI also shows empty pages.

Without the appId we cannot use the other REST API calls. Is there any other way
to get the ids of RUNNING applications?

Please help me understand what's going on.
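
(A possible workaround, anticipating Saisai Shao's suggestion further down in this
digest: ask the YARN ResourceManager for running applications instead of the driver.
A hedged sketch, assuming the default RM web port 8088; <rm-host> and <app-id> are
placeholders, and whether the per-application endpoints then work against a driver
that reports no id is a separate question.)

# List RUNNING Spark applications via the YARN RM REST API.
curl -s "http://<rm-host>:8088/ws/v1/cluster/apps?states=RUNNING&applicationTypes=SPARK"

# With an application id in hand, the per-application endpoints from the
# monitoring docs may become usable, e.g.:
curl -s "http://localhost:4040/api/v1/applications/<app-id>/jobs"
curl -s "http://localhost:4040/api/v1/applications/<app-id>/stages"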


Re: Spark metrics when running with YARN?

2016-09-18 Thread Vladimir Tretyakov
Hello Saisai Shao.

Thanks for the reminder; I know which components Spark has in each mode.

But Mich Talebzadeh wrote above that port 4040 will work regardless of the
mode the user chooses, which is why I hoped the same would be true for the
metrics URLs (since they are on the same port).

I think you are right; it's better to start a Spark application in all possible
modes and check what is available and where. This is time consuming, but then I
will be 100% sure. Thanks, guys.

Best regards, Vladimir.

On Sun, Sep 18, 2016 at 4:35 AM, Saisai Shao <sai.sai.s...@gmail.com> wrote:

> Hi Vladimir,
>
> I think you mixed up the cluster manager and the Spark application running on
> it. The master and workers are two components of the Standalone cluster
> manager; the YARN counterparts are the RM and NM. The URLs you listed above
> only work for the Standalone master and workers.
>
> It would be clearer if you could try running simple Spark applications
> on Standalone and YARN.
>
> On Fri, Sep 16, 2016 at 10:32 PM, Vladimir Tretyakov <
> vladimir.tretya...@sematext.com> wrote:
>
>> Hello.
>>
>> I found that there is also a Spark metrics sink, MetricsServlet,
>> which is enabled by default:
>>
>> https://apache.googlesource.com/spark/+/refs/heads/master/core/src/main/scala/org/apache/spark/metrics/MetricsConfig.scala#40
>>
>> Tried urls:
>>
>> On master:
>> http://localhost:8080/metrics/master/json/
>> http://localhost:8080/metrics/applications/json
>>
>> On slaves (with workers):
>> http://localhost:4040/metrics/json/
>>
>> got information I need.
>>
>> Questions:
>> 1. Will the master URLs work in YARN (client/cluster mode) and Mesos
>> modes? Or is this only for Standalone mode?
>> 2. Will the slave URL also work for modes other than Standalone?
>>
>> Why are there 2 ways to get information, REST API and this Sink?
>>
>>
>> Best regards, Vladimir.
>>
>>
>>
>>
>>
>>
>> On Mon, Sep 12, 2016 at 3:53 PM, Vladimir Tretyakov <
>> vladimir.tretya...@sematext.com> wrote:
>>
>>> Hello Saisai Shao, Jacek Laskowski, thanks for the information.
>>>
>>> We are working on a Spark monitoring tool and our users have different
>>> setup modes (Standalone, Mesos, YARN).
>>>
>>> I looked at the code and found:
>>>
>>> /**
>>>  * Attempt to start a Jetty server bound to the supplied hostName:port using
>>>  * the given context handlers.
>>>  *
>>>  * If the desired port number is contended, continues incrementing ports
>>>  * until a free port is found. Return the jetty Server object, the chosen
>>>  * port, and a mutable collection of handlers.
>>>  */
>>>
>>> It seems the most generic way (which will work for most users) will be to
>>> start probing ports:
>>>
>>> spark.ui.port (4040 by default)
>>> spark.ui.port + 1
>>> spark.ui.port + 2
>>> spark.ui.port + 3
>>> ...
>>>
>>> We keep probing until we get responses from Spark.
>>>
>>> PS: yes, there may be some collisions with other applications in some
>>> setups; in that case we may ask users about these exceptions and work
>>> around them.
>>>
>>> Best regards, Vladimir.
>>>
>>> On Mon, Sep 12, 2016 at 12:07 PM, Saisai Shao <sai.sai.s...@gmail.com>
>>> wrote:
>>>
>>>> Here is the YARN RM REST API for you to refer to
>>>> (http://hadoop.apache.org/docs/r2.7.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html).
>>>> You can use these APIs to query applications running on YARN.
>>>>
>>>> On Sun, Sep 11, 2016 at 11:25 PM, Jacek Laskowski <ja...@japila.pl>
>>>> wrote:
>>>>
>>>>> Hi Vladimir,
>>>>>
>>>>> You'd have to talk to your cluster manager to query for all the
>>>>> running Spark applications. I'm pretty sure YARN and Mesos can do that
>>>>> but unsure about Spark Standalone. This is certainly not something a
>>>>> Spark application's web UI could do for you since it is designed to
>>>>> handle the single Spark application.
>>>>>
>>>>> Pozdrawiam,
>>>>> Jacek Laskowski
>>>>> ----
>>>>> https://medium.com/@jaceklaskowski/
>>>>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>>>>> Follow me at https://twitter.com/jaceklaskowski
>>>>>
>>>>>
>>>>> On Sun, S

Re: Spark metrics when running with YARN?

2016-09-16 Thread Vladimir Tretyakov
Hello.

I found that there is also a Spark metrics sink, MetricsServlet,
which is enabled by default:

https://apache.googlesource.com/spark/+/refs/heads/master/core/src/main/scala/org/apache/spark/metrics/MetricsConfig.scala#40

Tried urls:

On master:
http://localhost:8080/metrics/master/json/
http://localhost:8080/metrics/applications/json

On slaves (with workers):
http://localhost:4040/metrics/json/

got information I need.

Questions:
1. Will the master URLs work in YARN (client/cluster mode) and Mesos modes?
Or is this only for Standalone mode?
2. Will the slave URL also work for modes other than Standalone?

Why are there 2 ways to get information, REST API and this Sink?


Best regards, Vladimir.






On Mon, Sep 12, 2016 at 3:53 PM, Vladimir Tretyakov <
vladimir.tretya...@sematext.com> wrote:

> Hello Saisai Shao, Jacek Laskowski, thanks for the information.
>
> We are working on a Spark monitoring tool and our users have different setup
> modes (Standalone, Mesos, YARN).
>
> I looked at the code and found:
>
> /**
>  * Attempt to start a Jetty server bound to the supplied hostName:port using
>  * the given context handlers.
>  *
>  * If the desired port number is contended, continues incrementing ports
>  * until a free port is found. Return the jetty Server object, the chosen
>  * port, and a mutable collection of handlers.
>  */
>
> It seems the most generic way (which will work for most users) will be to
> start probing ports:
>
> spark.ui.port (4040 by default)
> spark.ui.port + 1
> spark.ui.port + 2
> spark.ui.port + 3
> ...
>
> We keep probing until we get responses from Spark.
>
> PS: yes, there may be some collisions with other applications in some
> setups; in that case we may ask users about these exceptions and work
> around them.
>
> Best regards, Vladimir.
>
> On Mon, Sep 12, 2016 at 12:07 PM, Saisai Shao <sai.sai.s...@gmail.com>
> wrote:
>
>> Here is the YARN RM REST API for you to refer to
>> (http://hadoop.apache.org/docs/r2.7.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html).
>> You can use these APIs to query applications running on YARN.
>>
>> On Sun, Sep 11, 2016 at 11:25 PM, Jacek Laskowski <ja...@japila.pl>
>> wrote:
>>
>>> Hi Vladimir,
>>>
>>> You'd have to talk to your cluster manager to query for all the
>>> running Spark applications. I'm pretty sure YARN and Mesos can do that
>>> but unsure about Spark Standalone. This is certainly not something a
>>> Spark application's web UI could do for you since it is designed to
>>> handle the single Spark application.
>>>
>>> Pozdrawiam,
>>> Jacek Laskowski
>>> 
>>> https://medium.com/@jaceklaskowski/
>>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>>> Follow me at https://twitter.com/jaceklaskowski
>>>
>>>
>>> On Sun, Sep 11, 2016 at 11:18 AM, Vladimir Tretyakov
>>> <vladimir.tretya...@sematext.com> wrote:
>>> > Hello Jacek, thx a lot, it works.
>>> >
>>> > Is there a way to get the list of running applications from the REST API?
>>> > Or do I have to try connecting to ports 4040, 4041... 40xx and check whether
>>> > they answer something?
>>> >
>>> > Best regards, Vladimir.
>>> >
>>> > On Sat, Sep 10, 2016 at 6:00 AM, Jacek Laskowski <ja...@japila.pl>
>>> wrote:
>>> >>
>>> >> Hi,
>>> >>
>>> >> That's correct. One app one web UI. Open 4041 and you'll see the other
>>> >> app.
>>> >>
>>> >> Jacek
>>> >>
>>> >>
>>> >> On 9 Sep 2016 11:53 a.m., "Vladimir Tretyakov"
>>> >> <vladimir.tretya...@sematext.com> wrote:
>>> >>>
>>> >>> Hello again.
>>> >>>
>>> >>> I am trying to play with Spark version "2.11-2.0.0".
>>> >>>
>>> >>> The problem is that the REST API and the UI show me different things.
>>> >>>
>>> >>> I've started 2 applications from the examples set: opened 2 consoles
>>> >>> and ran the following command in each:
>>> >>>
>>> >>> ./bin/spark-submit   --class org.apache.spark.examples.SparkPi
>>>  --master
>>> >>> spark://wawanawna:7077  --executor-memory 2G  --total-executor-cores
>>> 30
>>> >>> examples/jars/spark-examples_2.11-2.0.0.jar  1
>>> >>>
>>> >>> Request to

Re: Spark metrics when running with YARN?

2016-09-12 Thread Vladimir Tretyakov
Hello Saisai Shao, Jacek Laskowski, thanks for the information.

We are working on a Spark monitoring tool and our users have different setup
modes (Standalone, Mesos, YARN).

I looked at the code and found:

/**
 * Attempt to start a Jetty server bound to the supplied hostName:port using
 * the given context handlers.
 *
 * If the desired port number is contended, continues incrementing ports
 * until a free port is found. Return the jetty Server object, the chosen
 * port, and a mutable collection of handlers.
 */

It seems the most generic way (which will work for most users) will be to
start probing ports:

spark.ui.port (4040 by default)
spark.ui.port + 1
spark.ui.port + 2
spark.ui.port + 3
...

We keep probing until we get responses from Spark.
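
A minimal sketch of that probing loop (shell; the port range and localhost are
assumptions, adjust them to the configured spark.ui.port and driver host):

# Probe a small range of UI ports and print the ones that answer the REST API.
for port in $(seq 4040 4050); do
  if curl -sf "http://localhost:${port}/api/v1/applications" > /dev/null; then
    echo "Spark application UI found on port ${port}"
    curl -s "http://localhost:${port}/api/v1/applications"
    echo
  fi
done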

PS: yes, there may be some collisions with other applications in some
setups; in that case we may ask users about these exceptions and work
around them.

Best regards, Vladimir.

On Mon, Sep 12, 2016 at 12:07 PM, Saisai Shao <sai.sai.s...@gmail.com>
wrote:

> Here is the YARN RM REST API for you to refer to
> (http://hadoop.apache.org/docs/r2.7.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html).
> You can use these APIs to query applications running on YARN.
>
> On Sun, Sep 11, 2016 at 11:25 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>
>> Hi Vladimir,
>>
>> You'd have to talk to your cluster manager to query for all the
>> running Spark applications. I'm pretty sure YARN and Mesos can do that
>> but unsure about Spark Standalone. This is certainly not something a
>> Spark application's web UI could do for you since it is designed to
>> handle the single Spark application.
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> 
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>>
>>
>> On Sun, Sep 11, 2016 at 11:18 AM, Vladimir Tretyakov
>> <vladimir.tretya...@sematext.com> wrote:
>> > Hello Jacek, thx a lot, it works.
>> >
>> > Is there a way to get the list of running applications from the REST API?
>> > Or do I have to try connecting to ports 4040, 4041... 40xx and check whether
>> > they answer something?
>> >
>> > Best regards, Vladimir.
>> >
>> > On Sat, Sep 10, 2016 at 6:00 AM, Jacek Laskowski <ja...@japila.pl>
>> wrote:
>> >>
>> >> Hi,
>> >>
>> >> That's correct. One app one web UI. Open 4041 and you'll see the other
>> >> app.
>> >>
>> >> Jacek
>> >>
>> >>
>> >> On 9 Sep 2016 11:53 a.m., "Vladimir Tretyakov"
>> >> <vladimir.tretya...@sematext.com> wrote:
>> >>>
>> >>> Hello again.
>> >>>
>> >>> I am trying to play with Spark version "2.11-2.0.0".
>> >>>
>> >>> The problem is that the REST API and the UI show me different things.
>> >>>
>> >>> I've started 2 applications from the examples set: opened 2 consoles and
>> >>> ran the following command in each:
>> >>>
>> >>> ./bin/spark-submit   --class org.apache.spark.examples.SparkPi
>>  --master
>> >>> spark://wawanawna:7077  --executor-memory 2G  --total-executor-cores
>> 30
>> >>> examples/jars/spark-examples_2.11-2.0.0.jar  1
>> >>>
>> >>> Request to API endpoint:
>> >>>
>> >>> http://localhost:4040/api/v1/applications
>> >>>
>> >>> returned the following JSON:
>> >>>
>> >>> [ {
>> >>>   "id" : "app-20160909184529-0016",
>> >>>   "name" : "Spark Pi",
>> >>>   "attempts" : [ {
>> >>> "startTime" : "2016-09-09T15:45:25.047GMT",
>> >>> "endTime" : "1969-12-31T23:59:59.999GMT",
>> >>> "lastUpdated" : "2016-09-09T15:45:25.047GMT",
>> >>> "duration" : 0,
>> >>> "sparkUser" : "",
>> >>> "completed" : false,
>> >>> "startTimeEpoch" : 1473435925047,
>> >>> "endTimeEpoch" : -1,
>> >>> "lastUpdatedEpoch" : 1473435925047
>> >>>   } ]
>> >>> } ]
>> >>>
>> >>> so the response contains information about only 1 application.
>> >>>
>> >>> But in r

Re: Spark metrics when running with YARN?

2016-09-11 Thread Vladimir Tretyakov
Hello Jacek, thx a lot, it works.

Is there a way to get the list of running applications from the REST API? Or do I
have to try connecting to ports 4040, 4041... 40xx and check whether they answer
something?

Best regards, Vladimir.

On Sat, Sep 10, 2016 at 6:00 AM, Jacek Laskowski <ja...@japila.pl> wrote:

> Hi,
>
> That's correct. One app one web UI. Open 4041 and you'll see the other
> app.
>
> Jacek
>
> On 9 Sep 2016 11:53 a.m., "Vladimir Tretyakov" <
> vladimir.tretya...@sematext.com> wrote:
>
>> Hello again.
>>
>> I am trying to play with Spark version "2.11-2.0.0".
>>
>> The problem is that the REST API and the UI show me different things.
>>
>> I've started 2 applications from the examples set: opened 2 consoles and ran
>> the following command in each:
>>
>> ./bin/spark-submit --class org.apache.spark.examples.SparkPi
>> --master spark://wawanawna:7077 --executor-memory 2G
>> --total-executor-cores 30 examples/jars/spark-examples_2.11-2.0.0.jar 1
>>
>> Request to API endpoint:
>>
>> http://localhost:4040/api/v1/applications
>>
>> returned the following JSON:
>>
>> [ {
>>   "id" : "app-20160909184529-0016",
>>   "name" : "Spark Pi",
>>   "attempts" : [ {
>> "startTime" : "2016-09-09T15:45:25.047GMT",
>> "endTime" : "1969-12-31T23:59:59.999GMT",
>> "lastUpdated" : "2016-09-09T15:45:25.047GMT",
>> "duration" : 0,
>> "sparkUser" : "",
>> "completed" : false,
>> "startTimeEpoch" : 1473435925047,
>> "endTimeEpoch" : -1,
>> "lastUpdatedEpoch" : 1473435925047
>>   } ]
>> } ]
>>
>> so the response contains information about only 1 application. But in reality
>> I've started 2 applications and the Spark UI shows me 2 RUNNING applications
>> (please see the screenshot). Does anybody know why the API and the UI
>> show different things?
>>
>>
>> Best regards, Vladimir.
>>
>>
>> On Tue, Aug 30, 2016 at 3:52 PM, Vijay Kiran <m...@vijaykiran.com> wrote:
>>
>>> Hi Otis,
>>>
>>> Did you check the REST API as documented in
>>> http://spark.apache.org/docs/latest/monitoring.html
>>>
>>> Regards,
>>> Vijay
>>>
>>> > On 30 Aug 2016, at 14:43, Otis Gospodnetić <otis.gospodne...@gmail.com>
>>> wrote:
>>> >
>>> > Hi Mich and Vijay,
>>> >
>>> > Thanks!  I forgot to include an important bit - I'm looking for a
>>> programmatic way to get Spark metrics when running Spark under YARN - so
>>> JMX or API of some kind.
>>> >
>>> > Thanks,
>>> > Otis
>>> > --
>>> > Monitoring - Log Management - Alerting - Anomaly Detection
>>> > Solr & Elasticsearch Consulting Support Training -
>>> http://sematext.com/
>>> >
>>> >
>>> > On Tue, Aug 30, 2016 at 6:59 AM, Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>> > The Spark UI, regardless of deployment mode (Standalone, YARN, etc.), runs
>>> > on port 4040 by default and can be accessed directly.
>>> >
>>> > Otherwise one can specify a specific port with --conf
>>> > "spark.ui.port=5", for example 5
>>> >
>>> > HTH
>>> >
>>> > Dr Mich Talebzadeh
>>> >
>>> > LinkedIn  https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJ
>>> d6zP6AcPCCdOABUrV8Pw
>>> >
>>> > http://talebzadehmich.wordpress.com
>>> >
>>> > Disclaimer: Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>> >
>>> >
>>> > On 30 August 2016 at 11:48, Vijay Kiran <m...@vijaykiran.com> wrote:
>>> >
>>> > From the YARN RM UI, find the Spark application Id, and in the application
>>> > details, you can click on the “Tracking URL” which should give you the
>>> > Spark UI.
>>> >
>>> > ./Vijay
>>> >
>>> > > On 30 Aug 2016, at 07:53, Otis Gospodnetić <
>>> otis.gospodne...@gmail.com> wrote:
>>> > >
>>> > > Hi,
>>> > >
>>> > > When Spark is run on top of YARN, where/how can one get Spark
>>> metrics?
>>> > >
>>> > > Thanks,
>>> > > Otis
>>> > > --
>>> > > Monitoring - Log Management - Alerting - Anomaly Detection
>>> > > Solr & Elasticsearch Consulting Support Training -
>>> http://sematext.com/
>>> > >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>>
>>
>>
>>
>


Is there a way to provide an individual property to each Spark executor?

2014-10-02 Thread Vladimir Tretyakov
Hi, here at Sematext we are almost done with Spark monitoring:
http://www.sematext.com/spm/index.html

But we need one thing from Spark, something like
https://groups.google.com/forum/#!topic/storm-user/2fNCF341yqU in Storm.

Something like a 'placeholder' in the Java opts which Spark will fill in for each
executor with the executorId (0, 1, 2, 3...).

For example, I would write in spark-defaults.conf:

spark.executor.extraJavaOptions -Dcom.sun.management.jmxremote
-javaagent:/opt/spm/spm-monitor/lib/spm-monitor-spark.jar=myValue-%executorId:spark-executor:default

and would get in the executor processes:
-Dcom.sun.management.jmxremote
-javaagent:/opt/spm/spm-monitor/lib/spm-monitor-spark.jar=myValue-0:spark-executor:default
-Dcom.sun.management.jmxremote
-javaagent:/opt/spm/spm-monitor/lib/spm-monitor-spark.jar=myValue-1:spark-executor:default
-Dcom.sun.management.jmxremote
-javaagent:/opt/spm/spm-monitor/lib/spm-monitor-spark.jar=myValue-2:spark-executor:default
...
...
...



Can I do something like that in Spark for executors? If not, maybe it could be
added in the future? It would be useful.

Thanks, best regards, Vladimir Tretyakov.


JMXSink for YARN deployment

2014-09-11 Thread Vladimir Tretyakov
Hello, we at Sematext (https://apps.sematext.com/) are writing a
monitoring tool for Spark and we came across one question:

How to enable JMX metrics for YARN deployment?

We put *.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
in the file $SPARK_HOME/conf/metrics.properties, but it doesn't work.

Everything works in Standalone mode, but not in YARN mode.

Can somebody help?

Thx!

PS: I've also found
https://stackoverflow.com/questions/23529404/spark-on-yarn-how-to-send-metrics-to-graphite-sink/25786112
which has no answer.
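
For reference, a slightly fuller metrics.properties sketch of the setup described
above; the JmxSink line is the one from this mail, and the JvmSource lines are
optional additions taken from Spark's bundled metrics.properties.template:

# $SPARK_HOME/conf/metrics.properties
# Enable the JMX sink for all instances (master, worker, driver, executor).
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink

# Optionally also expose JVM metrics (heap, GC, threads) as a source.
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource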


Re: JMXSink for YARN deployment

2014-09-11 Thread Vladimir Tretyakov
Hi Shao, thanks for the explanation. Any ideas how to fix it? Where should I put
the metrics.properties file?

On Thu, Sep 11, 2014 at 4:18 PM, Shao, Saisai saisai.s...@intel.com wrote:

  Hi,



  I’m guessing the problem is that the driver or executor cannot get the
  metrics.properties configuration file in the YARN container, so the metrics
  system cannot load the right sinks.



 Thanks

 Jerry



 From: Vladimir Tretyakov [mailto:vladimir.tretya...@sematext.com]
 Sent: Thursday, September 11, 2014 7:30 PM
 To: user@spark.apache.org
 Subject: JMXSink for YARN deployment



  Hello, we at Sematext (https://apps.sematext.com/) are writing a
  monitoring tool for Spark and we came across one question:



 How to enable JMX metrics for YARN deployment?



 We put *.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink

 to file $SPARK_HOME/conf/metrics.properties but it doesn't work.



 Everything works in Standalone mode, but not in YARN mode.



 Can somebody help?



 Thx!



 PS: I've found also
 https://stackoverflow.com/questions/23529404/spark-on-yarn-how-to-send-metrics-to-graphite-sink/25786112
 without answer.



Re: JMXSink for YARN deployment

2014-09-11 Thread Vladimir Tretyakov
Hi again, yeah, I had tried to use "spark.metrics.conf" before my question
on the ML, with no luck :(
Any other ideas from anybody?
It seems nobody uses metrics in YARN deployment mode.
How about Mesos? I didn't try but maybe Spark has the same difficulties on
Mesos?

PS: Spark is a great thing in general; it would be nice to see metrics in
YARN/Mesos mode, not only in Standalone :)


On Thu, Sep 11, 2014 at 5:25 PM, Shao, Saisai saisai.s...@intel.com wrote:

  I think you can try to use "spark.metrics.conf" to manually specify the
 path of metrics.properties, but the prerequisite is that each container
 should find this file in its local FS because this file is loaded locally.



 Besides, I think this might be a kind of workaround; a better solution would be
 to fix this in some other way.



 Thanks

 Jerry



 From: Vladimir Tretyakov [mailto:vladimir.tretya...@sematext.com]
 Sent: Thursday, September 11, 2014 10:08 PM
 Cc: user@spark.apache.org
 Subject: Re: JMXSink for YARN deployment



  Hi Shao, thanks for the explanation. Any ideas how to fix it? Where should I put
  the metrics.properties file?



 On Thu, Sep 11, 2014 at 4:18 PM, Shao, Saisai saisai.s...@intel.com
 wrote:

 Hi,



  I’m guessing the problem is that the driver or executor cannot get the
  metrics.properties configuration file in the YARN container, so the metrics
  system cannot load the right sinks.



 Thanks

 Jerry



 From: Vladimir Tretyakov [mailto:vladimir.tretya...@sematext.com]
 Sent: Thursday, September 11, 2014 7:30 PM
 To: user@spark.apache.org
 Subject: JMXSink for YARN deployment



  Hello, we at Sematext (https://apps.sematext.com/) are writing a
  monitoring tool for Spark and we came across one question:



 How to enable JMX metrics for YARN deployment?



 We put *.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink

 to file $SPARK_HOME/conf/metrics.properties but it doesn't work.



 Everything works in Standalone mode, but not in YARN mode.



 Can somebody help?



 Thx!



 PS: I've found also
 https://stackoverflow.com/questions/23529404/spark-on-yarn-how-to-send-metrics-to-graphite-sink/25786112
 without answer.





Re: JMXSink for YARN deployment

2014-09-11 Thread Vladimir Tretyakov
Hi, Kousuke,

Can you please explain in a bit more detail what you mean? I am new to Spark;
I looked at https://spark.apache.org/docs/latest/submitting-applications.html
and it seems there is no '--files' option there.

Do I just have to add '--files /path-to-metrics.properties'? An undocumented
ability?

Thx for answer.
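
For anyone reading this later, a hedged sketch of how the two suggestions in this
thread could be combined on YARN: ship metrics.properties into each container with
--files and point spark.metrics.conf at the shipped copy. The paths and the example
class/jar below are placeholders, not from the original mails:

# --files copies metrics.properties into every YARN container's working directory;
# spark.metrics.conf then points at that local copy (relative to the working dir).
spark-submit \
  --master yarn-cluster \
  --files /path/to/metrics.properties \
  --conf spark.metrics.conf=metrics.properties \
  --class org.apache.spark.examples.SparkPi \
  /path/to/spark-examples.jar 10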



On Thu, Sep 11, 2014 at 5:55 PM, Kousuke Saruta saru...@oss.nttdata.co.jp
wrote:

  Hi Vladimir

 How about using the --files option with spark-submit?

 - Kousuke


 (2014/09/11 23:43), Vladimir Tretyakov wrote:

  Hi again, yeah, I had tried to use "spark.metrics.conf" before my
 question on the ML, with no luck :(
 Any other ideas from anybody?
 It seems nobody uses metrics in YARN deployment mode.
 How about Mesos? I didn't try but maybe Spark has the same difficulties on
 Mesos?

  PS: Spark is a great thing in general; it would be nice to see metrics in
 YARN/Mesos mode, not only in Standalone :)


 On Thu, Sep 11, 2014 at 5:25 PM, Shao, Saisai saisai.s...@intel.com
 wrote:

  I think you can try to use "spark.metrics.conf" to manually specify
  the path of metrics.properties, but the prerequisite is that each container
  should find this file in its local FS because this file is loaded locally.



  Besides, I think this might be a kind of workaround; a better solution would be
  to fix this in some other way.



 Thanks

 Jerry



 From: Vladimir Tretyakov [mailto:vladimir.tretya...@sematext.com]
 Sent: Thursday, September 11, 2014 10:08 PM
 Cc: user@spark.apache.org
 Subject: Re: JMXSink for YARN deployment



  Hi Shao, thanks for the explanation. Any ideas how to fix it? Where should I put
  the metrics.properties file?



 On Thu, Sep 11, 2014 at 4:18 PM, Shao, Saisai saisai.s...@intel.com
 wrote:

 Hi,



  I’m guessing the problem is that the driver or executor cannot get the
  metrics.properties configuration file in the YARN container, so the metrics
  system cannot load the right sinks.



 Thanks

 Jerry



 From: Vladimir Tretyakov [mailto:vladimir.tretya...@sematext.com]
 Sent: Thursday, September 11, 2014 7:30 PM
 To: user@spark.apache.org
 Subject: JMXSink for YARN deployment



  Hello, we at Sematext (https://apps.sematext.com/) are writing a
  monitoring tool for Spark and we came across one question:



 How to enable JMX metrics for YARN deployment?



 We put *.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink

 to file $SPARK_HOME/conf/metrics.properties but it doesn't work.



 Everything works in Standalone mode, but not in YARN mode.



 Can somebody help?



 Thx!



 PS: I've found also
 https://stackoverflow.com/questions/23529404/spark-on-yarn-how-to-send-metrics-to-graphite-sink/25786112
 without answer.