Daniel Becker created OOZIE-3321:
------------------------------------

             Summary: PySpark example fails
                 Key: OOZIE-3321
                 URL: https://issues.apache.org/jira/browse/OOZIE-3321
             Project: Oozie
          Issue Type: Bug
          Components: examples
    Affects Versions: 5.0.0, 5.1.0
            Reporter: Daniel Becker
            Assignee: Daniel Becker


The PySpark example fails with the following exception:

{noformat}
ACTION[0000001-180806145017687-oozie-oozi-W@spark-node] Launcher exception: 
Missing py4j and/or pyspark zip files. Please add them to the lib folder or to 
the Spark sharelib.
org.apache.oozie.action.hadoop.OozieActionConfiguratorException: Missing py4j 
and/or pyspark zip files. Please add them to the lib folder or to the Spark 
sharelib.
        at 
org.apache.oozie.action.hadoop.SparkMain.getMatchingPyFile(SparkMain.java:151)
        at 
org.apache.oozie.action.hadoop.SparkMain.createPySparkLibFolder(SparkMain.java:132)
        at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:77)
        at 
org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:101)
        at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:410)
        at 
org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:55)
        at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:223)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
        at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:217)
        at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
        at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
{noformat}

This is because py4j-0.9-src.zip and pyspark.zip are not available. They are 
needed for the example to run, so we should either add them to the sharelib or 
if they are not important enough to be in the sharelib, we should add them to 
the example's lib directory. Currently these two files can be found in the 
codebase under sharelib/spark/src/test/resources. We should also decide what to 
do with these files there, to avoid duplication of resource files.

If I add the files manually to either the sharelib or the lib directory of the 
example, the example runs successfilly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to