Hello Saisai Shao. Thanks for the reminder, I know which components Spark has in each mode.

But Mich Talebzadeh wrote above that the UI on port 4040 works regardless of the mode the user runs in, which is why I hoped the same would be true for the metrics URL (since they are served on the same port). I think you are right: it is better to start a Spark application in all possible modes and check what is available and where. This is time consuming, but then I will be 100% sure. Thanks, guys.

Best regards, Vladimir.
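For reference, a minimal sketch of the check Vladimir describes above: probe a list of candidate metrics endpoints and report which ones respond. The hosts and ports are assumptions for illustration (Standalone master web UI on 8080, driver UI on 4040); adjust them for each deployment mode being tested.

```scala
import scala.io.Source
import scala.util.{Failure, Success, Try}

object MetricsProbe {
  // Candidate endpoints to test; hosts and ports are assumptions, adjust per deployment mode.
  val candidates = Seq(
    "http://localhost:8080/metrics/master/json/",       // Standalone master (assumed)
    "http://localhost:8080/metrics/applications/json/", // Standalone master, per-application (assumed)
    "http://localhost:4040/metrics/json/"               // driver UI (assumed)
  )

  def main(args: Array[String]): Unit = {
    candidates.foreach { url =>
      Try(Source.fromURL(url).mkString) match {
        case Success(body) => println(s"$url -> OK (${body.length} bytes)")
        case Failure(e)    => println(s"$url -> no response (${e.getMessage})")
      }
    }
  }
}
```

Running it once per deployment mode (Standalone, YARN client/cluster, Mesos) would answer questions 1 and 2 below empirically.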
On Sun, Sep 18, 2016 at 4:35 AM, Saisai Shao <sai.sai.s...@gmail.com> wrote:

> Hi Vladimir,
>
> I think you mixed up the cluster manager and the Spark application running on it. The
> master and workers are components of the Standalone cluster manager; the YARN
> counterparts are the RM and NM. The URLs you listed above only work for the
> Standalone master and workers.
>
> It would be clearer if you tried running simple Spark applications on Standalone
> and YARN.
>
> On Fri, Sep 16, 2016 at 10:32 PM, Vladimir Tretyakov <vladimir.tretya...@sematext.com> wrote:
>
>> Hello.
>>
>> I found that there is also a Spark metrics sink, MetricsServlet, which is enabled
>> by default:
>>
>> https://apache.googlesource.com/spark/+/refs/heads/master/core/src/main/scala/org/apache/spark/metrics/MetricsConfig.scala#40
>>
>> I tried these URLs:
>>
>> On the master:
>> http://localhost:8080/metrics/master/json/
>> http://localhost:8080/metrics/applications/json
>>
>> On the slaves (with workers):
>> http://localhost:4040/metrics/json/
>>
>> and got the information I need.
>>
>> Questions:
>> 1. Will the URLs for the master work in YARN (client/cluster mode) and Mesos modes,
>> or only in Standalone mode?
>> 2. Will the URL for the slaves also work in modes other than Standalone?
>>
>> Why are there 2 ways to get this information, the REST API and this sink?
>>
>> Best regards, Vladimir.
>>
>> On Mon, Sep 12, 2016 at 3:53 PM, Vladimir Tretyakov <vladimir.tretya...@sematext.com> wrote:
>>
>>> Hello Saisai Shao, Jacek Laskowski, thanks for the information.
>>>
>>> We are working on a Spark monitoring tool and our users have different
>>> setup modes (Standalone, Mesos, YARN).
>>>
>>> I looked at the code and found:
>>>
>>> /**
>>>  * Attempt to start a Jetty server bound to the supplied hostName:port using the given
>>>  * context handlers.
>>>  *
>>>  * If the desired port number is contended, continues incrementing ports until a free
>>>  * port is found. Return the jetty Server object, the chosen port, and a mutable
>>>  * collection of handlers.
>>>  */
>>>
>>> It seems the most generic way (which will work for most users) will be to start
>>> looking at ports:
>>>
>>> spark.ui.port (4040 by default)
>>> spark.ui.port + 1
>>> spark.ui.port + 2
>>> spark.ui.port + 3
>>> ...
>>>
>>> until we get responses from Spark.
>>>
>>> PS: yes, there may be collisions with other applications in some setups; in that
>>> case we can ask users about these exceptions and work around them.
>>>
>>> Best regards, Vladimir.
>>>
>>> On Mon, Sep 12, 2016 at 12:07 PM, Saisai Shao <sai.sai.s...@gmail.com> wrote:
>>>
>>>> Here is the YARN RM REST API for your reference (
>>>> http://hadoop.apache.org/docs/r2.7.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html).
>>>> You can use these APIs to query applications running on YARN.
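A minimal sketch of querying the ResourceManager REST API linked above to list running Spark applications. The RM address (localhost:8088) is an assumption, and the `states` / `applicationTypes` query parameters should be checked against the linked documentation for your Hadoop version.

```scala
import scala.io.Source
import scala.util.{Failure, Success, Try}

object YarnAppList {
  def main(args: Array[String]): Unit = {
    // ResourceManager web address is an assumption; pass your own as the first argument.
    val rm = args.headOption.getOrElse("http://localhost:8088")
    // Filter to running Spark applications (parameter names per the RM REST API docs).
    val url = s"$rm/ws/v1/cluster/apps?states=RUNNING&applicationTypes=SPARK"
    Try(Source.fromURL(url).mkString) match {
      case Success(json) => println(json) // raw JSON; parse it with your preferred JSON library
      case Failure(e)    => System.err.println(s"Request to $url failed: ${e.getMessage}")
    }
  }
}
```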
>>>> On Sun, Sep 11, 2016 at 11:25 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>>>
>>>>> Hi Vladimir,
>>>>>
>>>>> You'd have to talk to your cluster manager to query for all the running Spark
>>>>> applications. I'm pretty sure YARN and Mesos can do that, but I'm unsure about
>>>>> Spark Standalone. This is certainly not something a Spark application's web UI
>>>>> could do for you, since it is designed to handle a single Spark application.
>>>>>
>>>>> Regards,
>>>>> Jacek Laskowski
>>>>> ----
>>>>> https://medium.com/@jaceklaskowski/
>>>>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>>>>> Follow me at https://twitter.com/jaceklaskowski
>>>>>
>>>>> On Sun, Sep 11, 2016 at 11:18 AM, Vladimir Tretyakov
>>>>> <vladimir.tretya...@sematext.com> wrote:
>>>>> > Hello Jacek, thanks a lot, it works.
>>>>> >
>>>>> > Is there a way to get the list of running applications from the REST API? Or do
>>>>> > I have to try connecting to ports 4040, 4041, ... 40xx and check whether each
>>>>> > port answers something?
>>>>> >
>>>>> > Best regards, Vladimir.
>>>>> >
>>>>> > On Sat, Sep 10, 2016 at 6:00 AM, Jacek Laskowski <ja...@japila.pl> wrote:
>>>>> >>
>>>>> >> Hi,
>>>>> >>
>>>>> >> That's correct. One app, one web UI. Open 4041 and you'll see the other app.
>>>>> >>
>>>>> >> Jacek
>>>>> >>
>>>>> >> On 9 Sep 2016 11:53 a.m., "Vladimir Tretyakov"
>>>>> >> <vladimir.tretya...@sematext.com> wrote:
>>>>> >>>
>>>>> >>> Hello again.
>>>>> >>>
>>>>> >>> I am trying to play with Spark version "2.11-2.0.0".
>>>>> >>>
>>>>> >>> The problem is that the REST API and the UI show me different things.
>>>>> >>>
>>>>> >>> I've started 2 applications from the examples set: opened 2 consoles and ran
>>>>> >>> the following command in each:
>>>>> >>>
>>>>> >>> ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master
>>>>> >>> spark://wawanawna:7077 --executor-memory 2G --total-executor-cores 30
>>>>> >>> examples/jars/spark-examples_2.11-2.0.0.jar 10000
>>>>> >>>
>>>>> >>> A request to the API endpoint:
>>>>> >>>
>>>>> >>> http://localhost:4040/api/v1/applications
>>>>> >>>
>>>>> >>> returned the following JSON:
>>>>> >>>
>>>>> >>> [ {
>>>>> >>>   "id" : "app-20160909184529-0016",
>>>>> >>>   "name" : "Spark Pi",
>>>>> >>>   "attempts" : [ {
>>>>> >>>     "startTime" : "2016-09-09T15:45:25.047GMT",
>>>>> >>>     "endTime" : "1969-12-31T23:59:59.999GMT",
>>>>> >>>     "lastUpdated" : "2016-09-09T15:45:25.047GMT",
>>>>> >>>     "duration" : 0,
>>>>> >>>     "sparkUser" : "",
>>>>> >>>     "completed" : false,
>>>>> >>>     "startTimeEpoch" : 1473435925047,
>>>>> >>>     "endTimeEpoch" : -1,
>>>>> >>>     "lastUpdatedEpoch" : 1473435925047
>>>>> >>>   } ]
>>>>> >>> } ]
>>>>> >>>
>>>>> >>> So the response contains information about only 1 application.
>>>>> >>>
>>>>> >>> But in reality I've started 2 applications, and the Spark UI shows me 2
>>>>> >>> RUNNING applications (please see the screenshot).
>>>>> >>>
>>>>> >>> Does anybody know why the API and the UI show different things?
>>>>> >>>
>>>>> >>> Best regards, Vladimir.
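A minimal sketch of the port-scanning idea discussed in this part of the thread: each driver binds its own UI starting at spark.ui.port and incrementing on conflicts, so one can probe a small range of ports and ask each responding UI for its application via /api/v1/applications (which explains why a single port reports only one application). The host, base port, and range size below are assumptions.

```scala
import scala.io.Source
import scala.util.Try

object SparkUiScan {
  def main(args: Array[String]): Unit = {
    val host = "localhost" // assumption: drivers run locally; adjust per setup
    val basePort = 4040    // spark.ui.port default
    val range = 10         // how many consecutive ports to probe (assumption)

    for (port <- basePort until basePort + range) {
      val url = s"http://$host:$port/api/v1/applications"
      Try(Source.fromURL(url).mkString).toOption match {
        case Some(json) => println(s"port $port -> $json") // each UI reports its own application
        case None       => () // nothing listening on this port, or not a Spark UI
      }
    }
  }
}
```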
>>>>> >>>
>>>>> >>> On Tue, Aug 30, 2016 at 3:52 PM, Vijay Kiran <m...@vijaykiran.com> wrote:
>>>>> >>>>
>>>>> >>>> Hi Otis,
>>>>> >>>>
>>>>> >>>> Did you check the REST API as documented in
>>>>> >>>> http://spark.apache.org/docs/latest/monitoring.html ?
>>>>> >>>>
>>>>> >>>> Regards,
>>>>> >>>> Vijay
>>>>> >>>>
>>>>> >>>> > On 30 Aug 2016, at 14:43, Otis Gospodnetić <otis.gospodne...@gmail.com> wrote:
>>>>> >>>> >
>>>>> >>>> > Hi Mich and Vijay,
>>>>> >>>> >
>>>>> >>>> > Thanks! I forgot to include an important bit - I'm looking for a
>>>>> >>>> > programmatic way to get Spark metrics when running Spark under YARN - so
>>>>> >>>> > JMX or an API of some kind.
>>>>> >>>> >
>>>>> >>>> > Thanks,
>>>>> >>>> > Otis
>>>>> >>>> > --
>>>>> >>>> > Monitoring - Log Management - Alerting - Anomaly Detection
>>>>> >>>> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>>> >>>> >
>>>>> >>>> > On Tue, Aug 30, 2016 at 6:59 AM, Mich Talebzadeh
>>>>> >>>> > <mich.talebza...@gmail.com> wrote:
>>>>> >>>> > The Spark UI, regardless of deployment mode (Standalone, YARN, etc.), runs
>>>>> >>>> > on port 4040 by default and can be accessed directly.
>>>>> >>>> >
>>>>> >>>> > Otherwise one can specify a particular port with, for example,
>>>>> >>>> > --conf "spark.ui.port=55555".
>>>>> >>>> >
>>>>> >>>> > HTH
>>>>> >>>> >
>>>>> >>>> > Dr Mich Talebzadeh
>>>>> >>>> >
>>>>> >>>> > LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>> >>>> >
>>>>> >>>> > http://talebzadehmich.wordpress.com
>>>>> >>>> >
>>>>> >>>> > Disclaimer: Use it at your own risk. Any and all responsibility for any loss,
>>>>> >>>> > damage or destruction of data or any other property which may arise from
>>>>> >>>> > relying on this email's technical content is explicitly disclaimed. The
>>>>> >>>> > author will in no case be liable for any monetary damages arising from such
>>>>> >>>> > loss, damage or destruction.
>>>>> >>>> >
>>>>> >>>> > On 30 August 2016 at 11:48, Vijay Kiran <m...@vijaykiran.com> wrote:
>>>>> >>>> >
>>>>> >>>> > From the YARN RM UI, find the Spark application ID, and in the application
>>>>> >>>> > details you can click on the "Tracking URL", which should give you the
>>>>> >>>> > Spark UI.
>>>>> >>>> >
>>>>> >>>> > ./Vijay
>>>>> >>>> >
>>>>> >>>> > > On 30 Aug 2016, at 07:53, Otis Gospodnetić <otis.gospodne...@gmail.com> wrote:
>>>>> >>>> > >
>>>>> >>>> > > Hi,
>>>>> >>>> > >
>>>>> >>>> > > When Spark is run on top of YARN, where/how can one get Spark metrics?
>>>>> >>>> > >
>>>>> >>>> > > Thanks,
>>>>> >>>> > > Otis
>>>>> >>>> > > --
>>>>> >>>> > > Monitoring - Log Management - Alerting - Anomaly Detection
>>>>> >>>> > > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
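Tying the YARN case together, a minimal sketch of one possible approach to Otis's original question: ask the ResourceManager for each running Spark application's tracking URL, then hit the application's REST API through it. The RM address, the crude regex-based JSON extraction, and the assumption that the proxied tracking URL serves the /api/v1/ endpoints are all illustrative and should be verified against your cluster.

```scala
import scala.io.Source
import scala.util.Try

object YarnSparkMetrics {
  def main(args: Array[String]): Unit = {
    val rm = "http://localhost:8088" // assumption: ResourceManager web address
    val apps = Source.fromURL(s"$rm/ws/v1/cluster/apps?states=RUNNING&applicationTypes=SPARK").mkString

    // Crude extraction of trackingUrl values; use a real JSON library in practice.
    val trackingUrlPattern = "\"trackingUrl\"\\s*:\\s*\"([^\"]+)\"".r
    val trackingUrls = trackingUrlPattern.findAllMatchIn(apps).map(_.group(1)).toList

    trackingUrls.foreach { base =>
      // Assumption: the RM proxy behind the tracking URL also exposes the application's REST API.
      val url = base.stripSuffix("/") + "/api/v1/applications"
      Try(Source.fromURL(url).mkString).foreach(json => println(s"$url -> $json"))
    }
  }
}
```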