[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/21939 got it. Thank you! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-17 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 0.8: ``` -bash-4.1$ hostname amp-jenkins-worker-03 -bash-4.1$ export PATH=/home/anaconda/envs/py3k/bin/:$PATH -bash-4.1$ python3.4 -c "import pyarrow;

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-17 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 0.8 On Fri, Aug 17, 2018 at 11:30 AM, Yin Huai wrote: > @shaneknapp what was the version of > pyarrow in that build? 0.8 or 0.10? > >

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/21939 @shaneknapp what was the version of pyarrow in that build? 0.8 or 0.10?

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-17 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 @yhuai see: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94748/

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/21939 @BryanCutler So, for this upgrade, even though the JVM-side dependency is 0.10, pyspark can work with any pyarrow version from 0.8 to 0.10 without problems?

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 merged to master, thanks for your efforts on this @shaneknapp, and thanks @cloud-fan @HyukjinKwon and @dongjoon-hyun for reviewing!

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94748/ Test PASSed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Merged build finished. Test PASSed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21939 **[Test build #94748 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94748/testReport)** for PR 21939 at commit

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Merged build finished. Test PASSed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/21939 Retest this please.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21939 retest this please

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94736/ Test FAILed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Merged build finished. Test FAILed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21939 **[Test build #94736 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94736/testReport)** for PR 21939 at commit

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21939 **[Test build #94736 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94736/testReport)** for PR 21939 at commit

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Merged build finished. Test PASSed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21939 retest this please

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94729/ Test FAILed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Merged build finished. Test FAILed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21939 **[Test build #94729 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94729/testReport)** for PR 21939 at commit

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 @BryanCutler @shaneknapp Thanks for your work!

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Merged build finished. Test PASSed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 > great! looking forward to seeing arrow 0.10.0 come out. @cloud-fan Arrow has already been released and the artifacts are available - sorry I should have made a post to indicate that.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21939 LGTM

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21939 **[Test build #94729 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94729/testReport)** for PR 21939 at commit

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-13 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21939 great! looking forward to seeing arrow 0.10.0 come out.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-13 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 >Do you mean we will upgrade arrow to 0.10.0 at java side, but leave the python side as it is? So people can still use PySpark with pyarrow 0.8.0 and python 3.4? If they go with arrow 0.10.0

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-13 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 @cloud-fan the 0.10.0 tests are passing both on the new, temporary testing box i set up (python3.5 + arrow 0.10.0), as well as the standard 3.4/0.8.0 deployments (both ubuntu and centos).

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-13 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21939 Sorry I didn't follow all the discussions here. @BryanCutler Do you mean we will upgrade arrow to 0.10.0 at java side, but leave the python side as it is? So people can still use PySpark with

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-13 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 > @BryanCutler, not a big deal but why don't we link Arrow JIRA for "Allow for adding BinaryType support" too? @HyukjinKwon I added the link, must have forgotten that from before

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21939 @BryanCutler, not a big deal but why don't we link Arrow JIRA for "Allow for adding BinaryType support" too?

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-12 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 Nice! Thanks for getting that running @shaneknapp. So what are people's thoughts about merging this for 2.4 since it passes normal tests with pyarrow 0.8.0 and we've also shown it passes with

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-10 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 green! https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.6-python-3.5-arrow-0.10.0-ubuntu-testing/8/

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-10 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 alright, builds are now passing. it failed the last one on the junit publish, and since we're not running java/scala unittests, i have since removed that block. should be green in

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-10 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 `./build/mvn -DskipTests -Phadoop2.6 -Pyarn -Phive -Phive-thriftserver clean package` FTW. (i do know that the -Phadoop2.6 is superfluous, but at this point...)

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-10 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 ok think i got it... :)

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-10 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 making build config changes... forgot to add `-Pyarn` to the build target, as well as making sure the correct python env is selected before running python tests.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-10 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 there's a bunch of stuff in the unittest logs that i could use some extra eyes on:

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-10 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 hmm first build failed but i don't quite know what's up: `skipped 'Pandas >= 0.19.2 must be installed; however, it was not found.'` and: ``` (py3k)

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-10 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-master-test-sbt-hadoop-2.6-python-3.5-arrow-0.10.0-ubuntu-testing/

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-10 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 master branch, i assume

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-10 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 Yup, that would run all the pyarrow tests

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-10 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 @BryanCutler responding to the wall 'o text: 1) we are not moving to py3.5 until after 2.4 is cut 2) my 'testing' server has 3.5 and 0.10.0 installed, so i can create a job that

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-10 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 Wow, thanks @shaneknapp for helping to get this worked out! I think your plan to move to Python 3.5 sounds great, but it does make me a bit nervous making a change like this at a critical

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-09 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 SO. MANY. MOVING. PARTS.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-09 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 my PRB fork passed w/python3.5: https://amplab.cs.berkeley.edu/jenkins/view/RISELab%20Infra/job/ubuntuSparkPRB/68 this increases my confidence for a python3.5 upgrade on the centos

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 Really thank you for your help! @shaneknapp

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 ok, i'll need to hack `python/run-tests.py` to support 3.5. doing that now on my personal fork and relaunching a test.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 fack. this will be harder than just upgrading python on the workers:

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 i'm going to wait for my PRB to finish on the ubuntu node (it passed/skipped that test that's hanging), so we can see how the python tests go. after that i'll merge in any upstream

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 @BryanCutler still seeing the error popping up on `org.apache.spark.unsafe.map.BytesToBytesMapOnHeapSuite.randomizedStressTest`.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 btw: yes. i am staring at build logs today. does a watched build build faster? :\

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 @BryanCutler BULLET DODGED. my new build made it past that weird mesos bug! non-deterministic crap ftw! ``` the master-test-sbt-hadoop build got wedged in a concerning (but

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 @dongjoon-hyun i would hold off until we are certain that spark is buildable in our environment w/python 3.5.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/21939 @BryanCutler. Shall we update the doc together in this PR? - https://github.com/apache/spark/blame/master/docs/index.md#L29

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 the master-test-sbt-hadoop build got wedged in a concerning (but non-python spot). exact same issue as seen here: https://issues.apache.org/jira/browse/SPARK-20128 i kicked off

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 Thanks @shaneknapp! Sorry, I didn't realize that requirement changed. I'm keeping my fingers crossed those builds run smoothly!!

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Merged build finished. Test PASSed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94429/ Test PASSed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21939 **[Test build #94429 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94429/testReport)** for PR 21939 at commit

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 test builds started (on ubuntu). if these pass, i will feel comfortable performing the same installation/upgrade steps on the centos workers. upgrade/installation commands: conda

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 my message to dev@: ``` well... i've been running in to problems (aka dependency hell), and just hit a show-stopper: UnsatisfiableError: The following specifications were

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 oh god. we're running python 3.4 on the centos workers... pyarrow 0.10.0 needs 3.5 or greater. this is... problematic.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 weird. i'm finding the pyarrow libs here: ``` /home/anaconda/envs/py3k/lib/python3.4/site-packages/pyarrow/ ``` but when i add this to LD_LIBRARY_PATH it still fails to find

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 ok this is getting to be a PITA: ``` pip install pyarrow==0.10.0 Collecting pyarrow==0.10.0 Using cached

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 centos workers taken offline, email sent to dev@, prepping for upgrade.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 after pondering the situation, we can get away w/putting the centos workers offline as the most important spark builds (pull request builder, etc) run on them. this can leave the ubuntu workers

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 @shaneknapp That is great! I think putting Jenkins into quiet mode looks fine, as long as we send out a note to the dev list.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 yep, i will email dev@ saying that i will be making this change today. i could put jenkins in to quiet mode before the update, but i am thinking that there will be very few, if any,

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 https://lists.apache.org/thread.html/5b0836e44f9386fae2f99deed0a01441c699040c991d833faf520357@%3Cdev.arrow.apache.org%3E Arrow 0.10.0 release is officially announced. @shaneknapp Could

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 We need to upgrade it to 0.10.0 in this release, if possible. It resolves some bugs, e.g., https://issues.apache.org/jira/browse/ARROW-1973.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 Ok, so we will up the pyarrow version in Jenkins to 0.10.0, but keep the minimum version in python/setup.py as 0.8.0 for now, correct?
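
The distinction between the version Jenkins tests with (0.10.0) and the minimum version users may install (0.8.0) can be illustrated with a small version check. This is only a sketch, not the code in `python/setup.py`: the names `MINIMUM_PYARROW_VERSION`, `parse_version`, and `meets_minimum` are hypothetical, and it assumes plain numeric version strings such as `0.8.0` (no `rc` or `dev` suffixes).

```python
# Hypothetical constant; the value 0.8.0 is the minimum quoted in the comment above.
MINIMUM_PYARROW_VERSION = "0.8.0"


def parse_version(text):
    """Turn '0.10.0' into (0, 10, 0) so that versions compare numerically."""
    numbers = [int(part) for part in text.split(".")[:3]]
    # Pad short versions such as '0.8' to three components.
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def meets_minimum(installed, minimum=MINIMUM_PYARROW_VERSION):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    for candidate in ("0.7.1", "0.8.0", "0.10.0"):
        print(candidate, meets_minimum(candidate))
```

Comparing parsed tuples rather than raw strings matters here because, as strings, `"0.10.0"` sorts before `"0.8.0"`.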

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21939 **[Test build #94429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94429/testReport)** for PR 21939 at commit

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Merged build finished. Test PASSed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 retest this please

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Merged build finished. Test FAILed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21939 **[Test build #94407 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94407/testReport)** for PR 21939 at commit

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94407/ Test FAILed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21939 > We are already not testing all the combinations and at least I manually test other combinations locally. For the minimum PyArrow upgrade for Spark itself in the code base, wouldn't we better

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21939 Upping PyArrow to 0.10.0 within the Jenkins environment sounds fine to me, considering 2.4.0 is close. We are already not testing all the combinations and at least I manually test

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21939 SGTM

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-07 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 @shaneknapp I think we would be better off just upping the minimum version of arrow to 0.10.0 here since it's pretty involved to get a test matrix up and running and the project is still in a

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21939 **[Test build #94407 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94407/testReport)** for PR 21939 at commit

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Merged build finished. Test PASSed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-06 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 @BryanCutler we currently test spark against only one version of pyarrow (and against py27 and py34).. setting things up to test against a matrix of python/pyarrow versions will have to

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 It sounds like the vote can pass soon. https://lists.apache.org/thread.html/9900da1540be5aafce27691fd40395bb53f465302db29979c154d99a@%3Cdev.arrow.apache.org%3E

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 To get this in, we might need to delay the code freeze. Can you reply to the dev list email

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 After the code freeze, dependency changes are not allowed. Hopefully, we can make it before that.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-05 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 > i'm ready to pull the trigger on the update to arrow... i'd much prefer a pip dist, but would be ok w/a conda package. :) Thanks @shaneknapp ! So for those suggesting we keep the

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-05 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 @gatorsmile, there is an RC1 vote up now, so it should be very soon

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-05 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 @cloud-fan , we have BinaryType support in Java already, but it has not been added to Python due to an issue - the related jiras that @HyukjinKwon mentioned. So Arrow 0.10.0 has a bug fix that

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-01 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/21939 i'm ready to pull the trigger on the update to arrow... i'd much prefer a pip dist, but would be ok w/a conda package. :)

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-07-31 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 @BryanCutler Thanks! What is the expected target release date of Apache Arrow 0.10.0?
