[ https://issues.apache.org/jira/browse/SPARK-19627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-19627.
-------------------------------
          Resolution: Invalid
       Fix Version/s:     (was: 1.6.1)
    Target Version/s:     (was: 1.6.1)

Please read http://spark.apache.org/contributing.html first.

> pyspark call jvm function defined by ourselves
> ----------------------------------------------
>
>                 Key: SPARK-19627
>                 URL: https://issues.apache.org/jira/browse/SPARK-19627
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 1.6.1
>            Reporter: kehao
>
> Hi, I have a problem: PySpark fails when calling a JVM function that I defined myself. Please see the code below:
>
> from pyspark import SparkConf, SparkContext
> from py4j.java_gateway import java_import
>
> if __name__ == "__main__":
>     # conf = SparkConf().setAppName("testing")
>     # sc = SparkContext(conf=conf)
>     sc = SparkContext(appName="Py4jTesting")
>
>     def foo(x):
>         java_import(sc._jvm, "Calculate")
>         func = sc._jvm.Calculate()
>         return func.sqAdd(x)
>
>     rdd = sc.parallelize([1, 2, 3])
>     result = rdd.map(foo).collect()
>     print("$$$$$$$$$$$$$$$$$$$$$$")
>     print(result)
>
> The output is shown below. Can anyone help?
>
> Traceback (most recent call last):
>   File "/home/manager/data/software/mytest/kehao/driver.py", line 19, in <module>
>     result = rdd.map(foo).collect()
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 771, in collect
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2379, in _jrdd
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2299, in _prepare_for_python_RDD
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/serializers.py", line 428, in dumps
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 646, in dumps
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 107, in dump
>   File "/usr/lib/python3.4/pickle.py", line 412, in dump
>     self.save(obj)
>   File "/usr/lib/python3.4/pickle.py", line 479, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "/usr/lib/python3.4/pickle.py", line 744, in save_tuple
>     save(element)
>   File "/usr/lib/python3.4/pickle.py", line 479, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 199, in save_function
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 236, in save_function_tuple
>   File "/usr/lib/python3.4/pickle.py", line 479, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "/usr/lib/python3.4/pickle.py", line 729, in save_tuple
>     save(element)
>   File "/usr/lib/python3.4/pickle.py", line 479, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "/usr/lib/python3.4/pickle.py", line 774, in save_list
>     self._batch_appends(obj)
>   File "/usr/lib/python3.4/pickle.py", line 801, in _batch_appends
>     save(tmp[0])
>   File "/usr/lib/python3.4/pickle.py", line 479, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 193, in save_function
"/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", > line 241, in save_function_tuple > File "/usr/lib/python3.4/pickle.py", line 479, in save > f(self, obj) # Call unbound method with explicit self > File "/usr/lib/python3.4/pickle.py", line 814, in save_dict > self._batch_setitems(obj.items()) > File "/usr/lib/python3.4/pickle.py", line 840, in _batch_setitems > save(v) > File "/usr/lib/python3.4/pickle.py", line 499, in save > rv = reduce(self.proto) > File > "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/context.py", > line 268, in __getnewargs__ > Exception: It appears that you are attempting to reference SparkContext from > a broadcast variable, action, or transformation. SparkContext can only be > used on the driver, not in code that it run on workers. For more information, > see SPARK-5063 -- This message was sent by Atlassian JIRA (v6.3.15#6346) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org