Neal Richardson created ARROW-5222:
--------------------------------------

             Summary: [Python] Issues with installing pyarrow for development on MacOS
                 Key: ARROW-5222
                 URL: https://issues.apache.org/jira/browse/ARROW-5222
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Documentation, Python
            Reporter: Neal Richardson
             Fix For: 0.14.0


I tried following the 
[instructions|https://github.com/apache/arrow/blob/master/docs/source/developers/python.rst]
 for installing pyarrow for development on macOS and ran into quite a bit of 
difficulty. I'm hoping we can improve our documentation and/or tooling to make 
this a smoother process. 

I know we can't anticipate every quirk of everyone's dev environment, but in my 
case I was setting up a new machine, so this was from a clean slate. I'm also 
new to contributing to the project, so I'm a "clean slate" in that regard too, 
and my ignorance may be exposing other assumptions in the docs.
 # The instructions recommend using conda, but as this [Stack Overflow 
question|https://stackoverflow.com/questions/55798166/cmake-fails-with-when-attempting-to-compile-simple-test-program]
 notes, cmake fails when it tries to compile a simple test program. Uwe 
helpfully suggested installing an older macOS SDK from 
[here|https://github.com/phracker/MacOSX-SDKs/releases]. That may work, but I'm 
personally wary of installing binaries from an unofficial GitHub account, let 
alone recording that in our docs as an official recommendation. Either way, we 
should update the docs, either to note this requirement or to recommend against 
using conda on macOS.
 # After that, I tried the Homebrew path. Ultimately this did succeed, but it 
was rough. I had to `brew install` a lot of packages that aren't included in 
arrow/python/Brewfile (i.e. run `cmake`, see which missing dependency it failed 
on, `brew install` it, rerun `cmake`, and repeat). Among the libs I installed 
this way were double-conversion, snappy, brotli, protobuf, gtest, rapidjson, 
flatbuffers, lz4, zstd, c-ares, and boost. It's not clear how many of these 
extra installs were needed because I'd only installed the Xcode command-line 
tools and not the full Xcode from the App Store; regardless, the Brewfile 
should be complete if we want people to use it.
 # In searching Jira for the double-conversion issue (the first one I hit), I 
found [this issue/PR|https://github.com/apache/arrow/pull/4132/files], which 
added double-conversion to a different Brewfile, in c_glib. So I tried 
installing that Brewfile with `brew bundle`. It would probably be good to have 
a common Brewfile for the C++ setup, which the Python and GLib ones could load 
before adding any extra dependencies of their own. That way, there's one place 
to add common dependencies.
 # I got close here but still had issues with `BOOST_HOME` not being found, 
although I had brew-installed Boost. From the console output, it appeared that 
even though I was not using conda and did not have an active conda environment 
(I'd even done `conda env remove --name pyarrow-dev`), the cmake configuration 
script detected that conda existed and decided to use it to resolve 
dependencies. I tried setting lots of different environment variables to tell 
cmake not to use conda, but ultimately I was only able to get past this by 
deleting conda from my system entirely.
 # That got me to the point where I could `import pyarrow`, but running the 
tests then failed because the `hypothesis` package was not installed. I see 
that it is listed in requirements-test.txt and in setup.py under tests_require, 
but I followed the installation instructions and the package did not end up in 
my virtualenv. `pip install hypothesis` resolved it.
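
As a rough illustration of the kind of preflight check that would have saved me 
time on the last two points, here is a minimal Python sketch. It is not part of 
Arrow: the function names `conda_traces` and `missing_modules` are made up for 
this example, and the environment variables it inspects (`CONDA_PREFIX`, 
`CONDA_DEFAULT_ENV`) are only my assumption about what the cmake scripts might 
react to, which I have not confirmed.

```python
import importlib.util
import os
import shutil


def conda_traces(environ=None, which=shutil.which):
    """Return human-readable hints that conda is visible to a build.

    Hypothetical helper: `environ` and `which` are injectable so the check
    can be exercised without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    traces = []
    # Variables that conda sets when an environment is activated. Whether
    # the Arrow cmake scripts key on these is an assumption, not a fact.
    for var in ("CONDA_PREFIX", "CONDA_DEFAULT_ENV"):
        if environ.get(var):
            traces.append(f"{var}={environ[var]}")
    # A conda executable on PATH is another way a build could notice it.
    exe = which("conda")
    if exe:
        traces.append(f"conda executable on PATH: {exe}")
    return traces


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for trace in conda_traces():
        print("conda detected:", trace)
    # `hypothesis` is the package that was missing from my virtualenv.
    for name in missing_modules(["pyarrow", "hypothesis"]):
        print("missing Python package:", name)
```

Run inside the development virtualenv, this prints one line per conda trace 
and one line per package that cannot be found; no output means neither problem 
was detected by these (admittedly narrow) checks.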



