Deprecated -- certainly, and sooner rather than later.
I don't have a good sense of the overhead of continuing to support
Python 2; is it large enough to consider dropping it in Spark 3.0?

On Wed, May 29, 2019 at 11:47 PM Xiangrui Meng <men...@gmail.com> wrote:
>
> Hi all,
>
> I want to revive this old thread since no action was taken so far. If we plan 
> to mark Python 2 as deprecated in Spark 3.0, we should do it as early as 
> possible and let users know ahead of time. PySpark depends on Python, numpy, pandas, 
> and pyarrow, all of which are sunsetting Python 2 support by 2020/01/01 per 
> https://python3statement.org/. After that date we cannot realistically support 
> Python 2, because those libraries do not plan to make new releases, even for 
> security fixes. So I suggest the following:
>
> 1. Update Spark website and state that Python 2 is deprecated in Spark 3.0 
> and its support will be removed in a release after 2020/01/01.
> 2. Make a formal announcement to dev@ and users@.
> 3. Add Apache Spark project to https://python3statement.org/ timeline.
> 4. Update PySpark to check the Python version and print a deprecation warning 
> if the version is < 3.
>
> Any thoughts and suggestions?
>
> Best,
> Xiangrui
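Item 4 of the quoted proposal could be sketched roughly as below. This is a minimal sketch, not the actual PySpark change: the function name `warn_if_python2` and the exact warning text are assumptions for illustration.

```python
import sys
import warnings


def warn_if_python2(version_info=sys.version_info):
    """Emit a deprecation warning when running under Python 2.

    Returns True if a warning was emitted, False otherwise.
    """
    if version_info[0] < 3:
        warnings.warn(
            "Support for Python 2 is deprecated as of Spark 3.0 and will be "
            "removed in a release after 2020/01/01. Please migrate to Python 3.",
            DeprecationWarning)
        return True
    return False


# Called once at import/startup time, e.g. when the SparkContext is created.
warn_if_python2()
```

A check like this would run once at interpreter startup, so Python 2 users see the notice without it flooding their logs.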

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
