If users are able to install Spark 2.0 on their RHEL clusters, then I
imagine that they're also capable of installing a standalone Python
alongside that Spark version (without changing Python system-wide). For
instance, Anaconda/Miniconda make it really easy to install Python
2.7.x/3.x without impacting or changing the system Python, and they
don't require any special permissions to install (no root / sudo access
needed). Does this address the Python versioning concerns for RHEL
users?

On Tue, Jan 5, 2016 at 2:33 PM, Koert Kuipers <ko...@tresata.com> wrote:

> Yeah, the practical concern is that we have no control over the Java or
> Python version on large company clusters. Our current reality for the
> vast majority of them is Java 7 and Python 2.6, no matter how outdated
> that is.
>
> I don't like it either, but I cannot change it.
>
> We currently don't use PySpark, so I have no stake in this, but if we
> did I can assure you we would not upgrade to Spark 2.x if Python 2.6
> were dropped. There is no point in developing something that doesn't
> run for the majority of our customers.
>
> On Tue, Jan 5, 2016 at 5:19 PM, Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>
>> As I pointed out in my earlier email, RHEL will support Python 2.6 until
>> 2020. So I'm assuming these large companies will have the option of riding
>> out Python 2.6 until then.
>>
>> Are we seriously saying that Spark should likewise support Python 2.6 for
>> the next several years? Even though the core Python devs stopped supporting
>> it in 2013?
>>
>> If that's not what we're suggesting, then when, roughly, can we drop
>> support? What are the criteria?
>>
>> I understand the practical concern here. If companies are stuck using
>> 2.6, it doesn't matter to them that it is deprecated. But balancing that
>> concern against the maintenance burden on this project, I would say that
>> "upgrade to Python 2.7 or stay on Spark 1.6.x" is a reasonable position to
>> take. There are many tiny annoyances one has to put up with to support 2.6.
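>>
>> To give a concrete flavor, here is a small sketch (not from any actual
>> Spark source file) of constructs that work on 2.7 but break on 2.6:
>>
>>     from collections import OrderedDict      # OrderedDict was added in 2.7
>>     import argparse                          # in the stdlib only since 2.7
>>
>>     squares = {n: n * n for n in range(5)}        # dict comprehensions: 2.7+
>>     evens = {n for n in range(10) if n % 2 == 0}  # set comprehensions: 2.7+
>>     msg = "{} vs {}".format("2.6", "2.7")         # auto-numbered {} fields:
>>                                                   # 2.7+ (2.6 needs "{0} vs {1}")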
>>
>> I suppose if our main PySpark contributors are fine putting up with those
>> annoyances, then maybe we don't need to drop support just yet...
>>
>> Nick
>> On Tue, Jan 5, 2016 at 2:27 PM, Julio Antonio Soto de Vicente <
>> ju...@esbet.es> wrote:
>>
>>> Unfortunately, Koert is right.
>>>
>>> I've worked on a couple of projects using Spark (banking industry) where
>>> CentOS + Python 2.6 was the only toolbox available.
>>>
>>> That said, I believe it should not be a concern for Spark. Python 2.6 is
>>> old and busted, which is totally opposite to the Spark philosophy IMO.
>>>
>>>
>>> On Jan 5, 2016, at 8:07 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>>
>>> RHEL/CentOS 6 ships with Python 2.6, doesn't it?
>>>
>>> If so, I still know plenty of large companies where Python 2.6 is the
>>> only option. Asking them for Python 2.7 is not going to work.
>>>
>>> So I think it's a bad idea.
>>>
>>> On Tue, Jan 5, 2016 at 1:52 PM, Juliet Hougland <
>>> juliet.hougl...@gmail.com> wrote:
>>>
>>>> I don't see a reason Spark 2.0 would need to support Python 2.6. At
>>>> this point, Python 3 should be the default that is encouraged.
>>>> Most organizations acknowledge that 2.7 is common, but lagging behind
>>>> the version they should theoretically use. Dropping Python 2.6
>>>> support sounds very reasonable to me.
>>>>
>>>> On Tue, Jan 5, 2016 at 5:45 AM, Nicholas Chammas <
>>>> nicholas.cham...@gmail.com> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> Red Hat supports Python 2.6 on RHEL 6 until 2020
>>>>> <https://alexgaynor.net/2015/mar/30/red-hat-open-source-community/>,
>>>>> but otherwise yes, Python 2.6 is ancient history and the core Python
>>>>> developers stopped supporting it in 2013. RHEL 6 is not a good enough
>>>>> reason to continue support for Python 2.6, IMO.
>>>>>
>>>>> We should aim to support Python 2.7 and Python 3.3+ (which I believe
>>>>> we currently do).
>>>>>
>>>>> Nick
>>>>>
>>>>> On Tue, Jan 5, 2016 at 8:01 AM Allen Zhang <allenzhang...@126.com>
>>>>> wrote:
>>>>>
>>>>>> +1.
>>>>>>
>>>>>> We are currently using Python 2.7.2 in our production environment.
>>>>>>
>>>>>> On 2016-01-05 18:11:45, "Meethu Mathew" <meethu.mat...@flytxt.com> wrote:
>>>>>>
>>>>>> +1
>>>>>> We use Python 2.7
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Meethu Mathew
>>>>>>
>>>>>> On Tue, Jan 5, 2016 at 12:47 PM, Reynold Xin <r...@databricks.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Does anybody here care about us dropping support for Python 2.6 in
>>>>>>> Spark 2.0?
>>>>>>>
>>>>>>> Python 2.6 is ancient, and is pretty slow in many aspects (e.g.
>>>>>>> JSON parsing) when compared with Python 2.7. Some libraries that
>>>>>>> Spark depends on have stopped supporting 2.6. We can still convince
>>>>>>> the library maintainers to support 2.6, but it will be extra work.
>>>>>>> I'm curious whether anybody still uses Python 2.6 to run Spark.
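>>>>>>>
>>>>>>> For anyone curious, here is a rough micro-benchmark sketch of the
>>>>>>> json gap (run it under both interpreters; absolute numbers will
>>>>>>> vary by machine):
>>>>>>>
>>>>>>>     import json
>>>>>>>     import timeit
>>>>>>>
>>>>>>>     # Build a modest JSON document (2.6-compatible syntax on purpose).
>>>>>>>     doc = json.dumps(dict(("key%d" % i, range(10))
>>>>>>>                           for i in range(100)))
>>>>>>>
>>>>>>>     # Python 2.7 ships a much faster C-accelerated json module.
>>>>>>>     print(timeit.timeit(lambda: json.loads(doc), number=10000))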
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>
