Hi,
I am aware there was a previous discussion about dropping support for
different platforms
(http://apache-spark-developers-list.1001551.n3.nabble.com/Straw-poll-dropping-support-for-things-like-Scala-2-10-td19553.html)
but it was dominated by Scala and the JVM and never touched on the
subject of Python 2.
Some facts:
* Python 2 End Of Life is scheduled for 2020
(http://legacy.python.org/dev/peps/pep-0373/) with "no
guarantee that bugfix releases will be made on a regular basis"
until then.
* Almost all commonly used libraries already support Python 3
(https://python3wos.appspot.com/). The one exception that may matter
for Spark is Thrift (Python 3 support is already present on the
master branch) and, transitively, PyHive and Blaze.
* Supporting both Python 2 and Python 3 introduces significant
technical debt. In practice Python 3 is a different language with
backward-incompatible syntax and a growing number of features which
won't be backported to 2.x.
Suggestions:
* We need a public discussion about a possible date for dropping
Python 2 support.
* Early 2018 should give enough time for a graceful transition.
--
Best,
Maciej