I am having a heck of a time setting up my development environment. I used
pip to install pyspark, and I also downloaded Spark from Apache.

My Eclipse PyDev interpreter is configured to use a Python 3 virtualenv.

I have a simple unit test that loads a small DataFrame. Calling df.show()
generates the following error:


2018-04-04 17:13:56 ERROR Executor:91 - Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.spark.SparkException:
Error from python worker:
  Traceback (most recent call last):
    File "/Users/a/workSpace/pythonEnv/spark-2.3.0/lib/python3.6/site.py", line 67, in <module>
      import os
    File "/Users/a/workSpace/pythonEnv/spark-2.3.0/lib/python3.6/os.py", line 409
      yield from walk(new_path, topdown, onerror, followlinks)
               ^
  SyntaxError: invalid syntax
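
The caret points at a yield from expression, which is Python-3-only syntax,
so it looks as if the worker process is compiling my virtualenv's Python 3
standard library with a Python 2 interpreter. If that reading is right, would
pinning the worker interpreter be the fix? A minimal sketch of what I have in
mind, assuming PYSPARK_PYTHON is the variable the workers honor and that it
must be set before the context is created:

import os
import sys

from pyspark import SparkConf, SparkContext

# Force the workers to launch with the same interpreter as the driver.
# Assumption: PYSPARK_PYTHON is read when the worker processes start.
os.environ["PYSPARK_PYTHON"] = sys.executable

conf = SparkConf().setMaster("local[2]").setAppName("smokeTest")
sc = SparkContext(conf=conf)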





My unittest class is derived from the following:



import unittest

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

# module-level registry of the SparkContexts created by the test classes
sc_values = {}


class PySparkTestCase(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        conf = SparkConf().setMaster("local[2]") \
            .setAppName(cls.__name__) #\
#             .set("spark.authenticate.secret", "111111")
        cls.sparkContext = SparkContext(conf=conf)
        sc_values[cls.__name__] = cls.sparkContext
        cls.sqlContext = SQLContext(cls.sparkContext)
        print("aedwip:", SparkContext)

    @classmethod
    def tearDownClass(cls):
        print("....calling stop tearDownClass, the content of sc_values=", sc_values)
        sc_values.clear()
        cls.sparkContext.stop()
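
For reference, the failing test derives from it roughly like this (the names
are illustrative; the real test just builds a tiny DataFrame and calls
show()):

class SimpleDataFrameTest(PySparkTestCase):

    def test_show(self):
        # a tiny two-row DataFrame; show() is the call that fails
        df = self.sqlContext.createDataFrame(
            [(1, "a"), (2, "b")], ["id", "value"])
        df.show()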



The base class looks similar to the PySparkTestCase class in
https://github.com/apache/spark/blob/master/python/pyspark/tests.py.



Any suggestions would be greatly appreciated.



Andy



The version I downloaded is spark-2.3.0-bin-hadoop2.7.



The pyspark package in my virtualenv:

(spark-2.3.0) $ pip show pySpark
Name: pyspark
Version: 2.3.0
Summary: Apache Spark Python API
Home-page: https://github.com/apache/spark/tree/master/python
Author: Spark Developers
Author-email: d...@spark.apache.org
License: http://www.apache.org/licenses/LICENSE-2.0
Location: /Users/a/workSpace/pythonEnv/spark-2.3.0/lib/python3.6/site-packages
Requires: py4j
(spark-2.3.0) $



(spark-2.3.0) $ python --version
Python 3.6.1
(spark-2.3.0) $
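
In case it matters, here is how I checked which interpreter-related variables
Spark could be picking up from my shell (a quick sketch; output omitted):

(spark-2.3.0) $ python - <<'EOF'
import os
# the variables that control which Python the driver and workers launch
for name in ("PYSPARK_PYTHON", "PYSPARK_DRIVER_PYTHON", "SPARK_HOME"):
    print(name, "=", os.environ.get(name))
EOF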



