Submitting a Spark application using the YARN REST API
Hi All, I am trying to submit a Spark application using the YARN REST API. I am able to submit the application, but the final status shows as 'UNDEFINED'. A couple of other observations: the user shows as dr.who, and the application type is empty even though I specify it as SPARK. Has anyone had this problem before?

I am creating the app id using:

http://{cluster host}/ws/v1/cluster/apps/new-application

POST request URL:

http://{cluster host}/ws/v1/cluster/apps

Here is the request body:

{
  "application-id": "application_1426273041023_0055",
  "application-name": "test",
  "am-container-spec": {
    "credentials": {
      "secrets": {
        "entry": [
          { "key": "user.name", "value": "x" }
        ]
      }
    },
    "commands": {
      "command": "%SPARK_HOME%/bin/spark-submit.cmd --class org.apache.spark.examples.SparkPi --conf spark.yarn.jar=hdfs://xxx/apps/spark/spark-1.2.1-hadoop2.6/spark-assembly-1.2.1-hadoop2.6.0.jar --master yarn-cluster hdfs://xxx/apps/spark/spark-1.2.1-hadoop2.6/spark-examples-1.2.1-hadoop2.6.0.jar"
    },
    "application-type": "SPARK"
  }
}
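Two things worth checking here, based on the YARN ResourceManager REST API docs rather than anything resolved later in this thread: in the submit-application schema, application-type is a top-level field of the request body (a sibling of am-container-spec, not a child of it), which would explain the empty application type; and with simple HTTP authentication the RM attributes anonymous requests to the static user dr.who unless the caller identifies itself, typically via a user.name query parameter. A minimal sketch of the two-step flow ({cluster host} and submit.json are placeholders):

curl -X POST "http://{cluster host}/ws/v1/cluster/apps/new-application"
# the response contains the application-id to put in the request body

curl -X POST -H "Content-Type: application/json" --data @submit.json "http://{cluster host}/ws/v1/cluster/apps?user.name=x"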
Re: Spark History server default conf values
Thank you Charles and Meethu.

On Tue, Mar 10, 2015 at 12:47 AM, Charles Feduke charles.fed...@gmail.com wrote:

What I found from a quick search of the Spark source code (from my local snapshot on January 25, 2015):

// Interval between each check for event log updates
private val UPDATE_INTERVAL_MS = conf.getInt("spark.history.fs.updateInterval",
  conf.getInt("spark.history.updateInterval", 10)) * 1000

private val retainedApplications = conf.getInt("spark.history.retainedApplications", 50)

So the defaults are 10 seconds for the update interval and 50 retained applications.

On Tue, Mar 10, 2015 at 12:37 AM, Srini Karri skarri@gmail.com wrote:

Hi All, What are the default values for the following conf properties if we don't set them in the conf file?

# spark.history.fs.updateInterval 10
# spark.history.retainedApplications 500

Regards, Srini.
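For anyone who wants non-default values, a minimal sketch of overriding them (using the mechanism from the Spark monitoring docs; the numbers here are illustrative, not recommendations) is to pass them to the history server as system properties in conf/spark-env.sh before starting it:

export SPARK_HISTORY_OPTS="-Dspark.history.fs.updateInterval=30 -Dspark.history.retainedApplications=100"
./sbin/start-history-server.sh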
Spark History server default conf values
Hi All, What are the default values for the following conf properties if we don't set them in the conf file?

# spark.history.fs.updateInterval 10
# spark.history.retainedApplications 500

Regards, Srini.
Re: Spark Monitoring UI for Hadoop Yarn Cluster
Hi Todd and Marcelo, Thanks for helping me. I was able to launch the history server on Windows without any issues. One problem I am running into right now: I always get the message "No completed applications found" in the history server UI, but I was able to browse through these applications from the Spark Master. Do you have any thoughts on what the problem could be? Following are my settings in the Spark conf file:

spark.executor.extraClassPath D:\\Apache\\spark-1.2.1-bin-hadoop2\\spark-1.2.1-bin-hadoop2.4\\bin\\classes
spark.eventLog.dir D:/Apache/spark-1.2.1-bin-hadoop2/spark-1.2.1-bin-hadoop2.4/bin/tmp/spark-events
spark.history.fs.logDirectory D:/Apache/spark-1.2.1-bin-hadoop2/spark-1.2.1-bin-hadoop2.4/bin/tmp/spark-events

Also I have attached Spark Master and Spark history server UI screenshots for convenience. All the logs are available, and I granted directory permissions to Everyone with full control. Following is the console output from the history server:

D:\Apache\spark-1.2.1-bin-hadoop2\spark-1.2.1-bin-hadoop2.4\bin> spark-class.cmd org.apache.spark.deploy.history.HistoryServer
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/03/04 08:59:42 INFO SecurityManager: Changing view acls to: skarri
15/03/04 08:59:42 INFO SecurityManager: Changing modify acls to: skarri
15/03/04 08:59:42 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(skarri); users with modify permissions: Set(skarri)
15/03/04 08:59:49 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/03/04 08:59:56 INFO Utils: Successfully started service on port 18080.
15/03/04 08:59:56 INFO HistoryServer: Started HistoryServer at http://skarri-lt05.redmond.corp.microsoft.com:18080

Regards, Srini.

On Tue, Mar 3, 2015 at 11:41 AM, Marcelo Vanzin van...@cloudera.com wrote:

Spark applications shown in the RM's UI should have an Application Master link while they're running. That takes you to the Spark UI for that application, where you can see all the information you're looking for. If you're running a history server and add spark.yarn.historyServer.address to your config, that link will become a History link after the application is finished and will take you to the history server to view the app's UI.

On Tue, Mar 3, 2015 at 9:47 AM, Srini Karri skarri@gmail.com wrote:

Hi All, I am having trouble finding data related to my requirement. Here is the context: I have tried a standalone Spark installation on Windows; I am able to submit jobs and see the history of events. My question is: is it possible to achieve the same monitoring UI experience with a YARN cluster, like viewing workers and running/completed job stages in the web UI? Currently, if we go to our YARN Resource Manager UI, we are able to see the Spark jobs and their logs, but it is not as rich as the Spark standalone master UI. Is this a limitation of the Hadoop YARN cluster, or is there any way we can hook the Spark standalone master UI up to a YARN cluster? Any help is highly appreciated. Regards, Srini.

-- Marcelo
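A note on the History link Marcelo describes: it depends on a few settings working together. A minimal sketch (the host and paths below are placeholders, not values from this thread): spark.eventLog.enabled and spark.eventLog.dir on the application side, spark.history.fs.logDirectory pointing the history server at the same location, and spark.yarn.historyServer.address for the link itself, e.g. in conf/spark-defaults.conf:

spark.eventLog.enabled true
spark.eventLog.dir hdfs:///spark-events
spark.history.fs.logDirectory hdfs:///spark-events
spark.yarn.historyServer.address rmhost:18080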
Re: Spark Monitoring UI for Hadoop Yarn Cluster
Yes, I do see files. Actually, I missed copying the other settings:

spark.master spark://skarri-lt05.redmond.corp.microsoft.com:7077
spark.eventLog.enabled true
spark.rdd.compress true
spark.storage.memoryFraction 1
spark.core.connection.ack.wait.timeout 6000
spark.akka.frameSize 50
spark.executor.extraClassPath D:\\Apache\\spark-1.2.1-bin-hadoop2\\spark-1.2.1-bin-hadoop2.4\\bin\\classes
spark.eventLog.dir D:/Apache/spark-1.2.1-bin-hadoop2/spark-1.2.1-bin-hadoop2.4/bin/tmp/spark-events
spark.history.fs.logDirectory D:/Apache/spark-1.2.1-bin-hadoop2/spark-1.2.1-bin-hadoop2.4/bin/tmp/spark-events

On Wed, Mar 4, 2015 at 10:15 AM, Marcelo Vanzin van...@cloudera.com wrote:

On Wed, Mar 4, 2015 at 10:08 AM, Srini Karri skarri@gmail.com wrote:

spark.executor.extraClassPath D:\\Apache\\spark-1.2.1-bin-hadoop2\\spark-1.2.1-bin-hadoop2.4\\bin\\classes
spark.eventLog.dir D:/Apache/spark-1.2.1-bin-hadoop2/spark-1.2.1-bin-hadoop2.4/bin/tmp/spark-events
spark.history.fs.logDirectory D:/Apache/spark-1.2.1-bin-hadoop2/spark-1.2.1-bin-hadoop2.4/bin/tmp/spark-events

Do you see any files in that directory? spark.eventLog.dir won't do anything unless you also have spark.eventLog.enabled=true somewhere. And these are application configs, so make sure they're set when running your application (not when starting the history server).

-- Marcelo
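Since Marcelo points out these are application configs, one way to be sure they apply to the application run is to pass them on the spark-submit command line. A sketch (the class name and jar are placeholders):

spark-submit.cmd --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=D:/tmp/spark-events --class com.example.MyApp myapp.jar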
Re: Spark Monitoring UI for Hadoop Yarn Cluster
Hi Marcelo, I found the problem from this link: http://mail-archives.apache.org/mod_mbox/spark-user/201409.mbox/%3cCAL+LEBfzzjugOoB2iFFdz_=9TQsH=DaiKY=cvydfydg3ac5...@mail.gmail.com%3e. The problem is that the application I am running is not generating the APPLICATION_COMPLETE file; if I add this file manually, the application shows up in the UI. So the problem is with the application, which is not calling the stop method on the Spark context. Thank you, and thanks Todd as well, for helping. Hopefully I will be able to apply these on the actual cluster. Regards, Srini.

On Wed, Mar 4, 2015 at 10:20 AM, Srini Karri skarri@gmail.com wrote:

Yes, I do see files. Actually, I missed copying the other settings:

spark.master spark://skarri-lt05.redmond.corp.microsoft.com:7077
spark.eventLog.enabled true
spark.rdd.compress true
spark.storage.memoryFraction 1
spark.core.connection.ack.wait.timeout 6000
spark.akka.frameSize 50
spark.executor.extraClassPath D:\\Apache\\spark-1.2.1-bin-hadoop2\\spark-1.2.1-bin-hadoop2.4\\bin\\classes
spark.eventLog.dir D:/Apache/spark-1.2.1-bin-hadoop2/spark-1.2.1-bin-hadoop2.4/bin/tmp/spark-events
spark.history.fs.logDirectory D:/Apache/spark-1.2.1-bin-hadoop2/spark-1.2.1-bin-hadoop2.4/bin/tmp/spark-events

On Wed, Mar 4, 2015 at 10:15 AM, Marcelo Vanzin van...@cloudera.com wrote:

On Wed, Mar 4, 2015 at 10:08 AM, Srini Karri skarri@gmail.com wrote:

spark.executor.extraClassPath D:\\Apache\\spark-1.2.1-bin-hadoop2\\spark-1.2.1-bin-hadoop2.4\\bin\\classes
spark.eventLog.dir D:/Apache/spark-1.2.1-bin-hadoop2/spark-1.2.1-bin-hadoop2.4/bin/tmp/spark-events
spark.history.fs.logDirectory D:/Apache/spark-1.2.1-bin-hadoop2/spark-1.2.1-bin-hadoop2.4/bin/tmp/spark-events

Do you see any files in that directory? spark.eventLog.dir won't do anything unless you also have spark.eventLog.enabled=true somewhere. And these are application configs, so make sure they're set when running your application (not when starting the history server).

-- Marcelo
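For anyone hitting the same thing: a minimal sketch of the fix on the application side (the object name and log path are illustrative). In Spark 1.x the event log's APPLICATION_COMPLETE marker is written when the SparkContext is stopped, so the driver should call stop() explicitly:

import org.apache.spark.{SparkConf, SparkContext}

object EventLogDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("EventLogDemo")
      .set("spark.eventLog.enabled", "true")
      .set("spark.eventLog.dir", "D:/tmp/spark-events") // placeholder path
    val sc = new SparkContext(conf)
    try {
      // any job will do; just something for the event log to record
      println(sc.parallelize(1 to 100).reduce(_ + _))
    } finally {
      sc.stop() // without this, the APPLICATION_COMPLETE marker is never written
    }
  }
}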
Spark Monitoring UI for Hadoop Yarn Cluster
Hi All, I am having trouble finding data related to my requirement. Here is the context: I have tried a standalone Spark installation on Windows; I am able to submit jobs and see the history of events. My question is: is it possible to achieve the same monitoring UI experience with a YARN cluster, like viewing workers and running/completed job stages in the web UI? Currently, if we go to our YARN Resource Manager UI, we are able to see the Spark jobs and their logs, but it is not as rich as the Spark standalone master UI. Is this a limitation of the Hadoop YARN cluster, or is there any way we can hook the Spark standalone master UI up to a YARN cluster? Any help is highly appreciated. Regards, Srini.