Robert Kanter created OOZIE-2170:
------------------------------------
Summary: Oozie should automatically set configs to make Spark
jobs show up in the Spark History Server
Key: OOZIE-2170
URL: https://issues.apache.org/jira/browse/OOZIE-2170
Project: Oozie
Issue Type: Improvement
Components: action
Affects Versions: trunk
Reporter: Robert Kanter
Assignee: Robert Kanter
If you use "yarn-cluster" for the Spark action's master, the Spark jobs don't
show up in the Spark History Server or properly link to it from the Spark AM.
As a workaround, the user has to set the following in the Spark action in
their workflow.xml:
{code:xml}
<spark-opts>--conf spark.yarn.historyServer.address=http://SPH:18088 --conf
spark.eventLog.dir=hdfs://NN:8020/user/spark/applicationHistory --conf
spark.eventLog.enabled=true</spark-opts>
{code}
It would be nice if Oozie did this automatically via some oozie-site.xml
config(s). We could do something similar to how the Hadoop configs are set up,
where Oozie would load a Spark .conf file from a directory chosen based on the
RM specified in the <job-tracker>.
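As a rough sketch of what such a setting might look like, the fragment below
mirrors the existing oozie.service.HadoopAccessorService.hadoop.configurations
property, which maps an RM authority to a config directory. The Spark property
name, the host name and the directory paths are assumptions made up for
illustration; nothing here is an agreed-upon design:
{code:xml}
<!-- oozie-site.xml: proposed property; name and values are assumptions -->
<property>
    <name>oozie.service.SparkConfigurationService.spark.configurations</name>
    <!-- AUTHORITY=SPARK_CONF_DIR pairs; "*" is the fallback for any other RM -->
    <value>rm1.example.com:8032=/etc/spark/conf-cluster1,*=/etc/spark/conf</value>
</property>
{code}
The Spark .conf file in the matching directory would then carry
spark.yarn.historyServer.address, spark.eventLog.dir and
spark.eventLog.enabled, so workflows would not need to repeat them in
<spark-opts>.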
While we're at it, it might be good to document how to use Spark on YARN:
# Include the spark-assembly jar with your workflow (this is unfortunately not
published in Maven)
# Specify "yarn-cluster" as the master
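For the documentation, the two steps above might be illustrated with a minimal
workflow like the one below. This is only a sketch: the workflow name, the
application class, the jar names and the lib/ location of the spark-assembly
jar are assumptions, and ${jobTracker} and ${nameNode} are expected to come
from job.properties:
{code:xml}
<workflow-app name="spark-on-yarn-example" xmlns="uri:oozie:workflow:0.5">
    <start to="spark-node"/>
    <action name="spark-node">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- Step 2: run the Spark job on YARN in cluster mode -->
            <master>yarn-cluster</master>
            <name>ExampleSparkJob</name>
            <!-- Hypothetical application class and jar -->
            <class>com.example.ExampleSparkJob</class>
            <!-- Step 1: the spark-assembly jar is assumed to sit in the
                 workflow's lib/ directory next to the application jar -->
            <jar>${nameNode}/user/${wf:user()}/apps/spark-example/lib/example-spark-job.jar</jar>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Spark action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
{code}
Until Oozie sets the History Server configs itself, the <spark-opts> element
shown earlier would go directly after <jar>.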
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)