[jira] [Updated] (OOZIE-2170) Oozie should automatically set configs to make Spark jobs show up in the Spark History Server

2015-05-18 Thread Shwetha G S (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shwetha G S updated OOZIE-2170:
---
Fix Version/s: (was: trunk)
   4.2

 Oozie should automatically set configs to make Spark jobs show up in the 
 Spark History Server
 -

 Key: OOZIE-2170
 URL: https://issues.apache.org/jira/browse/OOZIE-2170
 Project: Oozie
  Issue Type: Improvement
  Components: action
Affects Versions: trunk
Reporter: Robert Kanter
Assignee: Robert Kanter
 Fix For: 4.2

 Attachments: OOZIE-2170.patch, OOZIE-2170.patch, OOZIE-2170.patch


 If you use yarn-cluster for the Spark action's master, the Spark jobs don't 
 show up in the Spark History Server or properly link to it from the Spark AM.
 The user needs to set this in their Spark action in the workflow.xml:
 {code:xml}
 <spark-opts>--conf spark.yarn.historyServer.address=http://SPH18088 --conf 
 spark.eventLog.dir=hdfs://NN:8020/user/spark/applicationHistory --conf 
 spark.eventLog.enabled=true</spark-opts>
 {code}
 It would be nice if Oozie did this automatically via some oozie-site.xml 
 config(s).  We could do something similar to how the Hadoop configs are set 
 up, where it will load a Spark .conf file from a directory based on the RM 
 specified in the job-tracker.
 While we're at it, it might be good to document how to use Spark on YARN:
 # Include the spark-assembly jar with your workflow (this is unfortunately 
 not published in Maven)
 # Specify yarn-cluster as the master
 Also, the Spark example should delete the output dir in {{prepare}}.
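
 The points above could be combined into a workflow.xml along these lines. 
 This is only a sketch: the action name, application class, jar path, output 
 path, and the {{sparkHistoryServer}} and {{nameNode}} workflow parameters are 
 hypothetical placeholders, and the schema versions are assumptions that may 
 need adjusting for your Oozie release.
 {code:xml}
 <workflow-app xmlns="uri:oozie:workflow:0.5" name="spark-on-yarn-wf">
     <start to="spark-node"/>
     <action name="spark-node">
         <spark xmlns="uri:oozie:spark-action:0.1">
             <job-tracker>${jobTracker}</job-tracker>
             <name-node>${nameNode}</name-node>
             <!-- Delete the output dir first so the job can be re-run -->
             <prepare>
                 <delete path="${nameNode}/user/${wf:user()}/output-data/spark"/>
             </prepare>
             <!-- Run the driver inside YARN -->
             <master>yarn-cluster</master>
             <name>Spark-Example</name>
             <!-- Hypothetical application class and jar -->
             <class>com.example.SparkApp</class>
             <jar>${nameNode}/user/${wf:user()}/apps/spark/lib/spark-app.jar</jar>
             <!-- Settings that make the job show up in the Spark History Server -->
             <spark-opts>--conf spark.yarn.historyServer.address=${sparkHistoryServer} --conf spark.eventLog.dir=${nameNode}/user/spark/applicationHistory --conf spark.eventLog.enabled=true</spark-opts>
             <arg>${nameNode}/user/${wf:user()}/input-data</arg>
             <arg>${nameNode}/user/${wf:user()}/output-data/spark</arg>
         </spark>
         <ok to="end"/>
         <error to="fail"/>
     </action>
     <kill name="fail">
         <message>Spark action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
     </kill>
     <end name="end"/>
 </workflow-app>
 {code}
 In this sketch the spark-assembly jar is assumed to sit in the workflow's 
 lib/ directory next to the application jar.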



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OOZIE-2170) Oozie should automatically set configs to make Spark jobs show up in the Spark History Server

2015-04-10 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated OOZIE-2170:
-
Attachment: OOZIE-2170.patch

Attaching the final version of the patch for reference.

 Oozie should automatically set configs to make Spark jobs show up in the 
 Spark History Server
 -

 Key: OOZIE-2170
 URL: https://issues.apache.org/jira/browse/OOZIE-2170
 Project: Oozie
  Issue Type: Improvement
  Components: action
Affects Versions: trunk
Reporter: Robert Kanter
Assignee: Robert Kanter
 Fix For: trunk

 Attachments: OOZIE-2170.patch, OOZIE-2170.patch, OOZIE-2170.patch




[jira] [Updated] (OOZIE-2170) Oozie should automatically set configs to make Spark jobs show up in the Spark History Server

2015-03-18 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated OOZIE-2170:
-
Attachment: OOZIE-2170.patch

Uploading a rebased patch in case that's the problem.  Though it seems odd that 
HEAD didn't compile...

 Oozie should automatically set configs to make Spark jobs show up in the 
 Spark History Server
 -

 Key: OOZIE-2170
 URL: https://issues.apache.org/jira/browse/OOZIE-2170
 Project: Oozie
  Issue Type: Improvement
  Components: action
Affects Versions: trunk
Reporter: Robert Kanter
Assignee: Robert Kanter
 Attachments: OOZIE-2170.patch, OOZIE-2170.patch




[jira] [Updated] (OOZIE-2170) Oozie should automatically set configs to make Spark jobs show up in the Spark History Server

2015-03-17 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated OOZIE-2170:
-
Attachment: OOZIE-2170.patch

The patch adds a new {{SparkConfigurationService}} which will load the 
spark-defaults.conf files defined by the new 
{{oozie.service.SparkConfigurationService.spark.configurations}} oozie-site 
property.  It operates similarly to how we load the Hadoop confs and default 
action confs.

When the Spark action's {{master}} starts with {{yarn}} and the 
{{job-tracker}} matches an entry in 
{{oozie.service.SparkConfigurationService.spark.configurations}}, Oozie will 
inject the properties from the matching spark-defaults.conf as {{--conf}} 
parameters in the {{spark-opts}} field.
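
As a rough illustration, the oozie-site.xml entry might look like the 
following.  The value syntax shown here (comma-separated 
{{AUTHORITY=SPARK_CONF_DIR}} pairs, with {{*}} as a wildcard) is an assumption 
modeled on how the Hadoop configurations property works, and the 
{{spark-conf}} directory name is a hypothetical example; check the patch for 
the exact format.
{code:xml}
<property>
    <name>oozie.service.SparkConfigurationService.spark.configurations</name>
    <!-- Assumed format: AUTHORITY=SPARK_CONF_DIR[,AUTHORITY=SPARK_CONF_DIR...] -->
    <!-- "*" is assumed to match any job-tracker without a more specific entry -->
    <!-- "spark-conf" is a hypothetical directory holding spark-defaults.conf -->
    <value>*=spark-conf</value>
</property>
{code}
With an entry like this in place, a Spark action whose master starts with 
{{yarn}} should no longer need the history server {{--conf}} settings written 
out by hand in its {{spark-opts}}.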

In addition to the unit tests, I also tested a variety of scenarios 
manually in a cluster (configs defined, invalid configs, {{*}}, duplicate 
properties, etc.).


 Oozie should automatically set configs to make Spark jobs show up in the 
 Spark History Server
 -

 Key: OOZIE-2170
 URL: https://issues.apache.org/jira/browse/OOZIE-2170
 Project: Oozie
  Issue Type: Improvement
  Components: action
Affects Versions: trunk
Reporter: Robert Kanter
Assignee: Robert Kanter
 Attachments: OOZIE-2170.patch




[jira] [Updated] (OOZIE-2170) Oozie should automatically set configs to make Spark jobs show up in the Spark History Server

2015-03-13 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated OOZIE-2170:
-
Summary: Oozie should automatically set configs to make Spark jobs show up 
in the Spark History Server  (was: Oozie should automatically sets configs to 
make Spark jobs show up in the Spark History Server)

 Oozie should automatically set configs to make Spark jobs show up in the 
 Spark History Server
 -

 Key: OOZIE-2170
 URL: https://issues.apache.org/jira/browse/OOZIE-2170
 Project: Oozie
  Issue Type: Improvement
  Components: action
Affects Versions: trunk
Reporter: Robert Kanter
Assignee: Robert Kanter
