[ 
https://issues.apache.org/jira/browse/SPARK-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15336685#comment-15336685
 ] 

Faisal commented on SPARK-15999:
--------------------------------

- Does the spark rest service exposed on *spark.yarn.am.port*? 

- Under spark configuration(CDH-5.5.2), i don't see this property set to 
anything. 
Can we specify this property while submitting job in using spark-submit?

I have to integrate internal monitoring application with Spark REST service 
such that we can store and measure the performance of Spark streaming like 
Batch Time, Input Size, Scheduling Delay, Processing Time, Total Delay and some 
other metrices. 
But unfortunately this all is not well documented.



> Wrong/Missing information for Spark UI/REST port
> ------------------------------------------------
>
>                 Key: SPARK-15999
>                 URL: https://issues.apache.org/jira/browse/SPARK-15999
>             Project: Spark
>          Issue Type: Bug
>          Components: Documentation, Streaming
>    Affects Versions: 1.5.0
>         Environment: CDH5.5.2, Spark 1.5.0
>            Reporter: Faisal
>            Priority: Minor
>
> *Spark Monitoring documentation*
> https://spark.apache.org/docs/1.5.0/monitoring.html
> {quote}
> You can access this interface by simply opening http://<driver-node>:4040 in 
> a web browser. If multiple SparkContexts are running on the same host, they 
> will bind to successive ports beginning with 4040 (4041, 4042, etc).
> {quote}
> This statement is very confusing and doesn't apply at all in spark streaming 
> jobs(unless i am missing something)
> Same is the case with REST API calls.
> {quote}
> REST API
> In addition to viewing the metrics in the UI, they are also available as 
> JSON. This gives developers an easy way to create new visualizations and 
> monitoring tools for Spark. The JSON is available for both running 
> applications, and in the history server. The endpoints are mounted at 
> /api/v1. Eg., for the history server, they would typically be accessible at 
> http://<server-url>:18080/api/v1, and for a running application, at 
> http://localhost:4040/api/v1.
> {quote}
> I am running spark streaming job in CDH-5.5.2 Spark version 1.5.0
> and nowhere on driver node, executor node for running/live application i am 
> able to call rest service.
> My spark streaming jobs running in yarn cluster mode
> --master yarn-cluster
> However for historyServer
> i am able to call REST service and can pull up json messages
> using the URL
> http://historyServer:18088/api/v1/applications
> {code}
> [ {
>   "id" : "application_1463099418950_11465",
>   "name" : "PySparkShell",
>   "attempts" : [ {
>     "startTime" : "2016-06-15T15:28:32.460GMT",
>     "endTime" : "2016-06-15T19:01:39.100GMT",
>     "sparkUser" : "abc",
>     "completed" : true
>   } ]
> }, {
>   "id" : "application_1463099418950_11635",
>   "name" : "DataProcessor-ETL.ETIME",
>   "attempts" : [ {
>     "attemptId" : "1",
>     "startTime" : "2016-06-15T18:56:04.413GMT",
>     "endTime" : "2016-06-15T18:58:00.022GMT",
>     "sparkUser" : "abc",
>     "completed" : true
>   } ]
> }, 
> {code}
> Besides following description pointing to a broken link to 
> http://metrics.codahale.com/
> {quote}Spark has a configurable metrics system based on the Coda Hale Metrics 
> Library. {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to