That’s not necessarily bad. I don’t know if we have plans to ever release any 
new 2.2.x or 2.3.x at this point, and we can message this change in the 
“supported version” of Python for any new 2.4 release.

Besides, we could still support Python 3.4 - it’s just more complicated to test 
manually without Jenkins coverage.


________________________________
From: shane knapp <skn...@berkeley.edu>
Sent: Tuesday, March 26, 2019 12:11 PM
To: Bryan Cutler
Cc: dev
Subject: Re: Upgrading minimal PyArrow version to 0.12.x [SPARK-27276]

i'm pretty certain that i've got a solid python 3.5 conda environment ready to 
be deployed, but this isn't a minor change to the build system and there might 
be some bugs to iron out.

another problem is that the current python 3.4 environment is hard-coded into 
both the build scripts on jenkins (all over the place) and in the codebase 
(thankfully in only one spot):  export PATH=/home/anaconda/envs/py3k/bin:$PATH

this means that every branch (master, 2.x, etc) will test against whatever 
version of python lives in that conda environment.  if we upgrade to 3.5, all 
branches will test against this version.  changing the build and test infra to 
support testing against 2.7, 3.4 or 3.5 based on branch is definitely 
non-trivial...
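(As a hedged sketch of what branch-aware env selection could look like: the env 
names and the use of Jenkins's GIT_BRANCH variable below are assumptions for 
illustration, not actual Jenkins config.)

```shell
# Hypothetical sketch: pick a conda env per branch instead of the single
# hard-coded py3k path. Env names (py34, py35) are illustrative assumptions.
select_python_env() {
  case "$1" in
    branch-2.*) echo "/home/anaconda/envs/py34/bin" ;;  # 2.x branches stay on 3.4
    *)          echo "/home/anaconda/envs/py35/bin" ;;  # master moves to 3.5
  esac
}

# GIT_BRANCH is assumed to be provided by Jenkins; default to master locally.
export PATH="$(select_python_env "${GIT_BRANCH:-master}"):$PATH"
```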

thoughts?




On Tue, Mar 26, 2019 at 11:39 AM Bryan Cutler 
<cutl...@gmail.com> wrote:
Thanks Hyukjin.  The plan is to get this done for 3.0 only.  Here is a link to 
the JIRA https://issues.apache.org/jira/browse/SPARK-27276.  Shane is also 
correct in that newer versions of pyarrow have stopped support for Python 3.4, 
so we should probably have Jenkins test against 2.7 and 3.5.

On Mon, Mar 25, 2019 at 9:44 PM Reynold Xin 
<r...@databricks.com> wrote:

+1 on doing this in 3.0.


On Mon, Mar 25, 2019 at 9:31 PM, Felix Cheung 
<felixcheun...@hotmail.com> wrote:
I’m +1 if 3.0


________________________________
From: Sean Owen <sro...@gmail.com>
Sent: Monday, March 25, 2019 6:48 PM
To: Hyukjin Kwon
Cc: dev; Bryan Cutler; Takuya UESHIN; shane knapp
Subject: Re: Upgrading minimal PyArrow version to 0.12.x [SPARK-27276]

I don't know a lot about Arrow here, but seems reasonable. Is this for
Spark 3.0 or for 2.x? Certainly, requiring the latest for Spark 3
seems right.

On Mon, Mar 25, 2019 at 8:17 PM Hyukjin Kwon 
<gurwls...@gmail.com> wrote:
>
> Hi all,
>
> We really need to upgrade the minimal version soon. It's actually slowing 
> down PySpark development, for instance through the overhead of currently 
> having to test against the full matrix of Arrow and Pandas versions. It also 
> currently requires some weird hacks and ugly code. Some bugs exist in lower 
> versions, and some features are not supported in older PyArrow, for instance.
>
> Per Bryan's recommendation (he's an Apache Arrow + Spark committer, FWIW), 
> and my opinion as well, we should increase the minimal version to 0.12.x. 
> (Also, note that Pandas <> Arrow interoperability is an experimental feature).
>
> So Bryan and I will proceed with this in roughly a few days if there are no 
> objections, assuming we're fine with increasing it to 0.12.x. Please let me 
> know if there are any concerns.
>
> For clarification, this requires upgrading the minimal version of PyArrow in 
> some Jenkins jobs (I cc'ed Shane as well).
>
> PS: I heard that Shane's busy with some work stuff .. but this is kind of 
> important from my perspective.
>
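(A minimal sketch of the kind of minimum-version gate the proposal above 
implies. Only the 0.12.x threshold comes from the thread; the function names 
and error message here are hypothetical, not Spark's actual implementation.)

```python
# Hypothetical sketch of a minimal-version gate for pyarrow.
# The 0.12.0 threshold comes from the proposal above; everything else
# (names, message wording) is illustrative.

def parse_version(version):
    """Turn a version string like '0.12.1' into a tuple (0, 12, 1)
    so versions can be compared numerically rather than lexically."""
    return tuple(int(part) for part in version.split(".")[:3])

def require_minimum_pyarrow_version(installed, minimum="0.12.0"):
    """Raise ImportError if the installed pyarrow is older than the minimum."""
    if parse_version(installed) < parse_version(minimum):
        raise ImportError(
            "PyArrow >= %s must be installed; found %s" % (minimum, installed))
```

In practice the installed version would come from `pyarrow.__version__` at 
import time, so users on an old PyArrow get one clear error instead of 
scattered feature-level failures.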

---------------------------------------------------------------------
To unsubscribe e-mail: 
dev-unsubscr...@spark.apache.org



--
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu
