Simon Whitelaw created OOZIE-3122:
-------------------------------------

             Summary: Oozie spark action unable to handle some options in 
spark-defaults.conf
                 Key: OOZIE-3122
                 URL: https://issues.apache.org/jira/browse/OOZIE-3122
             Project: Oozie
          Issue Type: Bug
          Components: action
    Affects Versions: 4.3.0
         Environment: Operating on EMR cluster (5.4.0) which includes Oozie 
4.3.0
            Reporter: Simon Whitelaw


When the setting 
oozie.service.SparkConfigurationService.spark.configurations *=etc/spark/conf 
is used to specify a spark-defaults.conf file for Spark, the Oozie Spark action 
does not handle a few options properly, and the job fails. These options 
include the following:
* spark.driver.extraClassPath - causes a blank --conf argument to be passed to 
spark-submit, and spark-submit fails
* spark.executor.extraClassPath - causes a blank --conf argument to be passed 
to spark-submit, and spark-submit fails
* spark.executor.extraJavaOptions with multiple options set (e.g. 
spark.executor.extraJavaOptions  -verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC 
-XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 
-XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p') - causes 
every option except the first to be passed as a separate standalone argument 
rather than as part of the extraJavaOptions value
* spark.driver.extraJavaOptions with multiple options set - causes every option 
except the first to be passed as a separate standalone argument rather than as 
part of the extraJavaOptions value
* Within spark.driver.extraJavaOptions and spark.executor.extraJavaOptions, 
the option -XX:OnOutOfMemoryError='kill -9 %p' is not parsed correctly even 
when passed through the spark-opts tag; it requires additional double quotes 
inside the single quotes: '"kill -9 %p"'
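
The extraJavaOptions symptom above is consistent with a property value being 
re-split on whitespace before it reaches spark-submit. The Python sketch below 
only illustrates that effect; it is not Oozie's implementation, and the function 
names (parse_defaults_line, naive_args, single_conf_args) are hypothetical. It 
also assumes the simplified "key, whitespace, value" line layout shown in the 
example above.

```python
# Illustration only: NOT Oozie's code. All names here are hypothetical.

SAMPLE_LINE = (
    "spark.executor.extraJavaOptions  -verbose:gc -XX:+PrintGCDetails "
    "-XX:OnOutOfMemoryError='kill -9 %p'"
)


def parse_defaults_line(line):
    """Split one spark-defaults.conf line into (key, value).

    Assumes the simplified layout "key<whitespace>value"; blank lines and
    comment lines return None.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    parts = stripped.split(None, 1)  # split on the first whitespace run only
    key = parts[0]
    value = parts[1].strip() if len(parts) > 1 else ""
    return key, value


def naive_args(key, value):
    """Build one string and re-split it on whitespace (reproduces the symptom)."""
    return ("--conf " + key + "=" + value).split()


def single_conf_args(key, value):
    """Keep the whole property as a single argument following --conf."""
    return ["--conf", key + "=" + value]


if __name__ == "__main__":
    key, value = parse_defaults_line(SAMPLE_LINE)
    print(naive_args(key, value))        # value is scattered across arguments
    print(single_conf_args(key, value))  # value stays in one argument
```

With naive_args, only -verbose:gc stays attached to the property key, and the 
quoted 'kill -9 %p' is broken into three pieces; with single_conf_args the full 
value travels as one argument.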

These issues were observed while running the Oozie Spark action on an Amazon 
EMR cluster. A workaround is to pass the options through the spark-opts tag, 
but this is not ideal because on EMR the spark-defaults.conf file is 
autogenerated to match the specifications of the cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
