As Reynold pointed out, we don't have to drop Python 2 support right off
the bat. We can deprecate it in Spark 3.0, which would allow us to
actually drop it in a later 3.x release.
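
For reference, here's a minimal sketch of what such a deprecation could
look like at PySpark startup. The warning text and placement here are
hypothetical, not an actual Spark patch:

    import sys
    import warnings

    # Hypothetical startup check: warn, but don't fail, when PySpark is
    # launched under Python 2, so legacy jobs keep running through 3.x.
    if sys.version_info[0] < 3:
        warnings.warn(
            "Python 2 support is deprecated as of Spark 3.0 and will be "
            "removed in a later 3.x release. Please migrate to Python 3.",
            DeprecationWarning)

Since this is only a warning, existing Python 2 jobs would keep working
until we decide to remove support entirely.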

On Sat, Sep 15, 2018 at 2:09 PM Erik Erlandson <eerla...@redhat.com> wrote:

> On a separate dev@spark thread, I raised the question of whether to
> continue supporting Python 2 in Apache Spark, going forward into Spark 3.0.
>
> Python 2 reaches EOL <https://github.com/python/devguide/pull/344> at
> the end of 2019. The upcoming release of Spark 3.0 is an opportunity to
> make breaking changes to Spark's APIs, so it is a good time to
> reconsider Python 2 support in PySpark.
>
> The key advantages of dropping Python 2 are:
>
>    - Maintaining PySpark becomes significantly easier, since we no longer
>    need to keep the code base compatible with both Python 2 and 3.
>    - We avoid committing to Python 2 support until Spark 4.0, which would
>    likely mean supporting Python 2 well after it goes EOL.
>
> (Note that supporting Python 2 after its EOL means, among other things,
> that PySpark would be supporting a version of Python that no longer
> receives security patches.)
>
> The main disadvantage is that PySpark users with legacy Python 2 code
> would have to migrate their code to Python 3 to take advantage of Spark
> 3.0.
>
> This decision obviously has large implications for the Apache Spark
> community and we want to solicit community feedback.
