On 9/2/16 3:47 AM, Felix Cheung wrote:
There is an Anaconda parcel one could readily install on CDH
https://docs.continuum.io/anaconda/cloudera
As Sean says, it is Python 2.7.x.
Spark should work for both 2.7 and 3.5.
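For what it's worth, the interpreter PySpark uses is normally picked through the PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON environment variables, so switching between 2.7 and 3.5 can be tried without reinstalling anything. A minimal sketch, assuming a hypothetical interpreter path (/opt/anaconda3/bin/python3.5 is made up for illustration, as is the helper name pyspark_env); adjust it to wherever your Python 3.5 actually lives:

```python
import os
import sys


def pyspark_env(python_path, base_env=None):
    """Return a copy of the environment with PySpark's interpreter variables set.

    pyspark_env is a hypothetical helper for illustration, not part of Spark.
    """
    env = dict(os.environ if base_env is None else base_env)
    # PYSPARK_PYTHON: Python executable PySpark uses for the driver and the workers.
    env["PYSPARK_PYTHON"] = python_path
    # PYSPARK_DRIVER_PYTHON: overrides the driver's executable only; set to the
    # same path here so driver and workers run the same Python version.
    env["PYSPARK_DRIVER_PYTHON"] = python_path
    return env


# Hypothetical interpreter path; an assumption, not something CDH ships by default.
env = pyspark_env("/opt/anaconda3/bin/python3.5", base_env={})
print(env["PYSPARK_PYTHON"])
print("this interpreter is Python %d.%d" % sys.version_info[:2])
```

The resulting dict could be passed as the env argument of subprocess when launching spark-submit or pyspark. The point is to keep the driver and the workers on the same Python version.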
Yes, I'm actually an engineer at Continuum, so I know the Anaconda
parcel
Sent: Friday, September 2, 2016 12:41 AM
Subject: Re: PySpark: preference for Python 2.7 or Python 3.5?
To: Ian Stokes Rees <ijsto...@continuum.io>
Cc: user @spark <user@spark.apache.org>
Spark should work fine with Python 3. I'm not a Python person, but all else
equal I'd use 3.5 too. I assume the issue could be libraries you want that
don't support Python 3. I don't think that changes with CDH. It includes a
version of Anaconda from Continuum, but that lays down Python 2.7.11. I
I have the option of running PySpark with Python 2.7 or Python 3.5. I am
fairly expert with Python and know the Python-side history of the
differences. All else being the same, I have a preference for Python
3.5. I'm using CDH 5.8 and I'm wondering if that biases whether I
should proceed