interesting i didnt know that!

On Tue, Jan 5, 2016 at 5:57 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
> even if python 2.7 was needed only on this one machine that launches the
> app we can not ship it with our software because its gpl licensed

Not to nitpick, but maybe this is important. The Python license is GPL-compatible but not GPL <https://docs.python.org/3/license.html>:

> Note: GPL-compatible doesn’t mean that we’re distributing Python under the
> GPL. All Python licenses, unlike the GPL, let you distribute a modified
> version without making your changes open source. The GPL-compatible
> licenses make it possible to combine Python with other software that is
> released under the GPL; the others don’t.

Nick

On Tue, Jan 5, 2016 at 5:49 PM Koert Kuipers <ko...@tresata.com> wrote:

i do not think so.

does the python 2.7 need to be installed on all slaves? if so, we do not have direct access to those.

also, spark is easy for us to ship with our software since its apache 2 licensed, and it only needs to be present on the machine that launches the app (thanks to yarn). even if python 2.7 was needed only on this one machine that launches the app we can not ship it with our software because its gpl licensed, so the client would have to download it and install it themselves, and this would mean its an independent install which has to be audited and approved and now you are in for a lot of fun. basically it will never happen.

On Tue, Jan 5, 2016 at 5:35 PM, Josh Rosen <joshro...@databricks.com> wrote:

If users are able to install Spark 2.0 on their RHEL clusters, then I imagine that they're also capable of installing a standalone Python alongside that Spark version (without changing Python systemwide). For instance, Anaconda/Miniconda make it really easy to install Python 2.7.x/3.x without impacting / changing the system Python and doesn't require any special permissions to install (you don't need root / sudo access).
Does this address the Python versioning concerns for RHEL users?

On Tue, Jan 5, 2016 at 2:33 PM, Koert Kuipers <ko...@tresata.com> wrote:

yeah, the practical concern is that we have no control over java or python version on large company clusters. our current reality for the vast majority of them is java 7 and python 2.6, no matter how outdated that is.

i dont like it either, but i cannot change it.

we currently don't use pyspark so i have no stake in this, but if we did i can assure you we would not upgrade to spark 2.x if python 2.6 was dropped. no point in developing something that doesnt run for majority of customers.

On Tue, Jan 5, 2016 at 5:19 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

As I pointed out in my earlier email, RHEL will support Python 2.6 until 2020. So I'm assuming these large companies will have the option of riding out Python 2.6 until then.

Are we seriously saying that Spark should likewise support Python 2.6 for the next several years? Even though the core Python devs stopped supporting it in 2013?

If that's not what we're suggesting, then when, roughly, can we drop support? What are the criteria?

I understand the practical concern here. If companies are stuck using 2.6, it doesn't matter to them that it is deprecated. But balancing that concern against the maintenance burden on this project, I would say that "upgrade to Python 2.7 or stay on Spark 1.6.x" is a reasonable position to take. There are many tiny annoyances one has to put up with to support 2.6.

I suppose if our main PySpark contributors are fine putting up with those annoyances, then maybe we don't need to drop support just yet...
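The "tiny annoyances" of supporting 2.6 are concrete: several constructs contributors naturally reach for were only added in Python 2.7. A minimal illustration (the snippet runs on 2.7+; the comments note how each line fails on 2.6):

```python
# Constructs available on Python 2.7+ but not on 2.6 -- examples of the
# small incompatibilities maintainers must code around to keep 2.6 support.

# Auto-numbered format fields: 2.6 requires explicit "{0} on {1}",
# so this line raises ValueError there.
msg = "{} on {}".format("Python 2.6", "RHEL")

# Dict and set comprehensions: SyntaxError on 2.6.
squares = {n: n * n for n in range(4)}
evens = {n for n in range(8) if n % 2 == 0}

# Set literals: also a SyntaxError on 2.6 (sets must be built with set()).
colors = {"red", "green"}

print(msg)
print(sorted(squares.items()))
print(sorted(evens))
```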
Nick

On Tue, Jan 5, 2016 at 2:27 PM, Julio Antonio Soto de Vicente <ju...@esbet.es> wrote:

Unfortunately, Koert is right.

I've been in a couple of projects using Spark (banking industry) where CentOS + Python 2.6 is the toolbox available.

That said, I believe it should not be a concern for Spark. Python 2.6 is old and busted, which is totally opposite to the Spark philosophy IMO.

On Jan 5, 2016, at 8:07 PM, Koert Kuipers <ko...@tresata.com> wrote:

rhel/centos 6 ships with python 2.6, doesnt it?

if so, i still know plenty of large companies where python 2.6 is the only option. asking them for python 2.7 is not going to work

so i think its a bad idea

On Tue, Jan 5, 2016 at 1:52 PM, Juliet Hougland <juliet.hougl...@gmail.com> wrote:

I don't see a reason Spark 2.0 would need to support Python 2.6. At this point, Python 3 should be the default that is encouraged. Most organizations acknowledge that 2.7 is common, but lagging behind the version they should theoretically use. Dropping python 2.6 support sounds very reasonable to me.

On Tue, Jan 5, 2016 at 5:45 AM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

+1

Red Hat supports Python 2.6 on RHEL 5 until 2020 <https://alexgaynor.net/2015/mar/30/red-hat-open-source-community/>, but otherwise yes, Python 2.6 is ancient history and the core Python developers stopped supporting it in 2013. RHEL 5 is not a good enough reason to continue support for Python 2.6 IMO.

We should aim to support Python 2.7 and Python 3.3+ (which I believe we currently do).
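If the project settled on "Python 2.7 and 3.3+" as the supported floor, one way to enforce it is a fail-fast guard at startup, so users on 2.6 get a clear error instead of an obscure SyntaxError deep in the code. A hypothetical sketch (this helper and its name are illustrative, not actual PySpark code):

```python
import sys

def require_supported_python():
    """Raise a clear error if the interpreter is below 2.7 (2.x line)
    or below 3.3 (3.x line). Hypothetical guard, not real PySpark code."""
    major, minor = sys.version_info[:2]
    if major == 2:
        ok = (major, minor) >= (2, 7)
    else:
        ok = (major, minor) >= (3, 3)
    if not ok:
        raise RuntimeError(
            "Python %d.%d is not supported; need 2.7+ or 3.3+" % (major, minor))
    return (major, minor)

print("running on Python %d.%d" % require_supported_python())
```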
Nick

On Tue, Jan 5, 2016 at 8:01 AM Allen Zhang <allenzhang...@126.com> wrote:

plus 1,

we are currently using python 2.7.2 in production environment.

On 2016-01-05 18:11:45, "Meethu Mathew" <meethu.mat...@flytxt.com> wrote:

+1
We use Python 2.7

Regards,

Meethu Mathew

On Tue, Jan 5, 2016 at 12:47 PM, Reynold Xin <r...@databricks.com> wrote:

Does anybody here care about us dropping support for Python 2.6 in Spark 2.0?

Python 2.6 is ancient, and is pretty slow in many aspects (e.g. json parsing) when compared with Python 2.7. Some libraries that Spark depends on stopped supporting 2.6. We can still convince the library maintainers to support 2.6, but it will be extra work. I'm curious if anybody still uses Python 2.6 to run Spark.

Thanks.
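Reynold's json-parsing point above is easy to measure: the stdlib `json` module's C accelerator was substantially improved in 2.7, so the same parse loop ran noticeably slower on 2.6's mostly pure-Python path. A small benchmark sketch anyone can run on their own interpreter (the payload size and repeat count are arbitrary choices; absolute numbers depend on the machine):

```python
import json
import timeit

# Build a moderately sized JSON document, then time repeated parses.
payload = json.dumps(
    {"records": [{"id": i, "name": "row-%d" % i} for i in range(1000)]})

elapsed = timeit.timeit(lambda: json.loads(payload), number=200)
print("200 parses of %d bytes: %.3f s" % (len(payload), elapsed))
```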