[ https://issues.apache.org/jira/browse/OOZIE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16326347#comment-16326347 ]
Attila Sasvari commented on OOZIE-3159:
---------------------------------------

[~satishsaley], [~andras.piros], [~gezapeti] can you take a look at the attached patch?

Testing done:
- built with {{mvn clean install assembly:single -DskipTests -Denforcer.skip=true -Dcheckstyle.skip=true -Dfindbugs.skip=true -DtargetJavaVersion=1.8 -DjavaVersion=1.8 -Dhadoop.version=2.6.0 -Puber}}
- configured Oozie so that it talks to a pseudo-distributed Hadoop 2.6 cluster
- modified {{examples/apps/spark/workflow.xml}} so that it also includes {{<mode>}}:
{code:xml}
<workflow-app xmlns='uri:oozie:workflow:0.5' name='SparkFileCopy'>
    <start to='spark-node' />
    <action name='spark-node'>
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/spark"/>
            </prepare>
            <master>${master}</master>
            <mode>${mode}</mode>
            <name>Spark-FileCopy</name>
            <class>org.apache.oozie.example.SparkFileCopy</class>
            <jar>${nameNode}/user/${wf:user()}/${examplesRoot}/apps/spark/lib/oozie-examples.jar</jar>
            <arg>${nameNode}/user/${wf:user()}/${examplesRoot}/input-data/text/data.txt</arg>
            <arg>${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/spark</arg>
        </spark>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name='end' />
</workflow-app>
{code}
- CLUSTER mode with YARN: {{bin/oozie job -oozie http://localhost:11000/oozie -config examples/apps/pyspark/job.properties -run -DnameNode=hdfs://localhost:9000 -DjobTracker=localhost:8032 -Dmaster=yarn -Dmode=cluster}}; workflow succeeded
- CLIENT mode with YARN: {{bin/oozie job -oozie http://localhost:11000/oozie -config examples/apps/pyspark/job.properties -run -DnameNode=hdfs://localhost:9000 -DjobTracker=localhost:8032 -Dmaster=yarn -Dmode=client}}; workflow succeeded
- executed all example workflows except the Hive-related ones;
pyspark failed at first because I had not uploaded the required dependencies to the Spark sharelib. After I set it up correctly, the workflow succeeded.

Note: if you do not specify {{mode}} in the Spark workflow when {{master}} is set to {{yarn}}, the executor will fail.


Spark Action fails because of absence of hadoop mapreduce jar(s)
----------------------------------------------------------------

                 Key: OOZIE-3159
                 URL: https://issues.apache.org/jira/browse/OOZIE-3159
             Project: Oozie
          Issue Type: Bug
            Reporter: Satish Subhashrao Saley
            Assignee: Attila Sasvari
            Priority: Blocker
             Fix For: 5.0.0b1
         Attachments: OOZIE-3159-001.patch

OOZIE-2869 stopped the map-reduce dependencies from being added to the Spark action. The Spark action uses org.apache.hadoop.filecache.DistributedCache, which is no longer on the Spark action's classpath, causing it to fail.
{code}
java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:412)
	at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:56)
	at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:225)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:219)
	at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:155)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:142)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/filecache/DistributedCache
	at org.apache.oozie.action.hadoop.SparkArgsExtractor.extract(SparkArgsExtractor.java:309)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:74)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:101)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
	... 16 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.filecache.DistributedCache
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 20 more
Failing Oozie Launcher, org/apache/hadoop/filecache/DistributedCache
java.lang.NoClassDefFoundError: org/apache/hadoop/filecache/DistributedCache
	at org.apache.oozie.action.hadoop.SparkArgsExtractor.extract(SparkArgsExtractor.java:309)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:74)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:101)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:412)
	at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:56)
	at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:225)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:219)
	at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:155)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:142)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.filecache.DistributedCache
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 20 more
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://localhost:8020/user/saley/oozie-sale/0000009-180112124633268-oozie-sale-W/spark-node--spark/action-data.seq
{code}
I enabled adding the map-reduce jars by setting {{oozie.launcher.oozie.action.mapreduce.needed.for}} to {{true}}. The launcher job was then able to kick off the child job.
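The {{NoClassDefFoundError}} in the trace above simply means {{org.apache.hadoop.filecache.DistributedCache}} is not on the launcher's classpath. A minimal, generic sketch of how such a dependency can be probed reflectively before use (this is not Oozie code; the class names in {{main}} are illustrative):
{code:java}
// Sketch: check whether a class is loadable from the current classpath
// before touching it, instead of failing later with NoClassDefFoundError.
public class ClasspathProbe {

    /** Returns true if the named class can be loaded, false otherwise. */
    public static boolean isClassAvailable(String className) {
        try {
            // false = do not run static initializers, just resolve the class
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | NoClassDefFoundError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // java.util.ArrayList is present in every JVM; the Hadoop class is
        // the one missing from the Spark action classpath in this issue.
        System.out.println(isClassAvailable("java.util.ArrayList"));
        System.out.println(isClassAvailable("org.apache.hadoop.filecache.DistributedCache"));
    }
}
{code}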
But the child job failed with:
{code}
2018-01-12 15:00:13,301 [Driver] ERROR org.apache.spark.deploy.yarn.ApplicationMaster - User class threw exception: java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s signer information does not match signer information of other classes in the same package
java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s signer information does not match signer information of other classes in the same package
	at java.lang.ClassLoader.checkCerts(ClassLoader.java:898)
	at java.lang.ClassLoader.preDefineClass(ClassLoader.java:668)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:761)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.spark-project.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:136)
	at org.spark-project.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:129)
	at org.spark-project.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:98)
	at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:126)
	at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:113)
	at org.apache.spark.ui.WebUI.attachPage(WebUI.scala:78)
	at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:62)
	at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:62)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.ui.WebUI.attachTab(WebUI.scala:62)
	at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:63)
	at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:76)
	at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:195)
	at org.apache.spark.ui.SparkUI$.createLiveUI(SparkUI.scala:146)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:473)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
	at org.apache.oozie.example.SparkFileCopy.main(SparkFileCopy.java:35)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
2018-01-12 15:00:13,303 [Driver] INFO org.apache.spark.deploy.yarn.ApplicationMaster - Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s signer information does not match signer information of other classes in the same package)
{code}
I looked into this exception; it is caused by servlet-api-2.5.jar, which gets pulled in by hadoop-common in the spark sharelib. We need to revisit the reason for adding hadoop-common as a dependency.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
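A signer-mismatch {{SecurityException}} like the one quoted above typically means classes of the same package are being loaded from two differently-signed jars (here servlet-api-2.5.jar alongside another servlet jar). A hedged, generic diagnostic sketch (not part of Oozie; the resource path is illustrative) that lists every classpath location providing a given class file:
{code:java}
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch: enumerate every classpath entry that supplies a given resource.
// Two or more hits from differently-signed jars would explain the
// "signer information does not match" SecurityException.
public class DuplicateClassFinder {

    /** Returns all URLs from which the given resource can be loaded. */
    public static List<URL> copiesOf(String resourcePath) {
        try {
            return Collections.list(
                    DuplicateClassFinder.class.getClassLoader().getResources(resourcePath));
        } catch (IOException e) {
            return new ArrayList<>();
        }
    }

    public static void main(String[] args) {
        // On a real Spark sharelib classpath one would pass
        // "javax/servlet/FilterRegistration.class" here.
        for (URL url : copiesOf("javax/servlet/FilterRegistration.class")) {
            System.out.println(url);
        }
    }
}
{code}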