[ https://issues.apache.org/jira/browse/SPARK-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337840#comment-14337840 ]
Tathagata Das commented on SPARK-5185:
--------------------------------------

I also encountered this for KafkaUtils in Python. I am doing the said workaround. But we should fix this for the general case.

> pyspark --jars does not add classes to driver class path
> --------------------------------------------------------
>
>                 Key: SPARK-5185
>                 URL: https://issues.apache.org/jira/browse/SPARK-5185
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.2.0
>            Reporter: Uri Laserson
>            Assignee: Andrew Or
>
> I have some random class I want access to from a Spark shell, say
> {{com.cloudera.science.throwaway.ThrowAway}}. You can find the specific
> example I used here:
> https://gist.github.com/laserson/e9e3bd265e1c7a896652
> I packaged it as {{throwaway.jar}}.
> If I then run {{bin/spark-shell}} like so:
> {code}
> bin/spark-shell --master local[1] --jars throwaway.jar
> {code}
> I can execute
> {code}
> val a = new com.cloudera.science.throwaway.ThrowAway()
> {code}
> Successfully.
> I now run PySpark like so:
> {code}
> PYSPARK_DRIVER_PYTHON=ipython bin/pyspark --master local[1] --jars throwaway.jar
> {code}
> which gives me an error when I try to instantiate the class through Py4J:
> {code}
> In [1]: sc._jvm.com.cloudera.science.throwaway.ThrowAway()
> ---------------------------------------------------------------------------
> Py4JError                                 Traceback (most recent call last)
> <ipython-input-1-4eedbe023c29> in <module>()
> ----> 1 sc._jvm.com.cloudera.science.throwaway.ThrowAway()
> /Users/laserson/repos/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py in __getattr__(self, name)
>     724     def __getattr__(self, name):
>     725         if name == '__call__':
> --> 726             raise Py4JError('Trying to call a package.')
>     727         new_fqn = self._fqn + '.' + name
>     728         command = REFLECTION_COMMAND_NAME +\
> Py4JError: Trying to call a package.
> {code}
> However, if I explicitly add the {{--driver-class-path}} to add the same jar
> {code}
> PYSPARK_DRIVER_PYTHON=ipython bin/pyspark --master local[1] --jars throwaway.jar --driver-class-path throwaway.jar
> {code}
> it works
> {code}
> In [1]: sc._jvm.com.cloudera.science.throwaway.ThrowAway()
> Out[1]: JavaObject id=o18
> {code}
> However, the docs state that {{--jars}} should also set the driver class path.
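
Until this is fixed in general, the workaround from the issue description (passing the same jars to both {{--jars}} and {{--driver-class-path}}) can be scripted. The following is only a minimal sketch: the helper name pyspark_command is hypothetical and not part of Spark, and joining with os.pathsep assumes the driver JVM runs on the same platform as the script. It relies on {{--jars}} taking a comma-separated list while {{--driver-class-path}} takes an ordinary JVM classpath string.

```python
import os
import shlex


def pyspark_command(jars, master="local[1]", pyspark="bin/pyspark"):
    """Build a pyspark argv that hands the same jars to both flags.

    Hypothetical helper for illustration only; not part of Spark.
    """
    if not jars:
        raise ValueError("expected at least one jar path")
    return [
        pyspark,
        "--master", master,
        # --jars expects a comma-separated list of jar paths.
        "--jars", ",".join(jars),
        # --driver-class-path is a JVM classpath, so it uses the
        # platform path separator (":" on Unix, ";" on Windows).
        "--driver-class-path", os.pathsep.join(jars),
    ]


if __name__ == "__main__":
    argv = pyspark_command(["throwaway.jar"])
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

With a single jar, both flags receive the same value, which matches the invocation in the issue that worked. With several jars, only the separator differs between the two flags.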