[ https://issues.apache.org/jira/browse/ARROW-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999609#comment-15999609 ]
Wes McKinney commented on ARROW-955:
------------------------------------

If you're going to work with both development builds and released binary artifacts, it's good practice to work in conda environments, so you would do development in a different environment from the one where you installed the pyarrow package from conda-forge.

You can see what is imported in the Python shell:

{code}
In [1]: import pyarrow

In [2]: pyarrow
Out[2]: <module 'pyarrow' from '/home/wesm/code/arrow/python/pyarrow/__init__.py'>
{code}

> [Docs] Guide for building Python from source on Ubuntu 14.04 LTS without conda
> ------------------------------------------------------------------------------
>
>                 Key: ARROW-955
>                 URL: https://issues.apache.org/jira/browse/ARROW-955
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>         Environment: Ubuntu - 3.19.0-80-generic #88~14.04.1-Ubuntu
>                      Python 2.7.6
>            Reporter: Devang Shah
>
> I built pyarrow, arrow, and parquet-cpp from source - so that I could use the
> new read_row_group() interface and in general, have access to the latest
> versions. I ran into many issues during the build but was ultimately
> successful (notes below). However, I am not able to import pyarrow.parquet
> due to the following issue:
> >>> import pyarrow.parquet
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "pyarrow/__init__.py", line 28, in <module>
>     import pyarrow._config
> ImportError: No module named _config
> This is similar to an issue reported in github/conda-forge/pyarrow-feedstock,
> where also I posted this...but I think this forum is more direct and
> appropriate - so re-posting here.
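The check suggested in the comment (seeing which copy of pyarrow Python picks up) can also be done without importing the package, which helps when the import itself fails as in the traceback above. The following is a minimal sketch, not part of the original thread: the helper names `module_origin` and `is_under` are hypothetical, and it assumes Python 3.4+ for `importlib.util.find_spec` (on Python 2.7, as used by the reporter, `imp.find_module` plays a similar role).

```python
import importlib.util
import os


def module_origin(name):
    """Return the file a top-level module would be loaded from, without importing it.

    Returns None if the module cannot be found on sys.path.
    (Hypothetical helper for illustration.)
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def is_under(path, directory):
    """Return True if `path` is located inside `directory`."""
    # Append a trailing separator so "/a/b" does not match "/a/bc/...".
    directory = os.path.join(os.path.abspath(directory), "")
    return os.path.abspath(path).startswith(directory)


if __name__ == "__main__":
    origin = module_origin("pyarrow")
    if origin is None:
        print("pyarrow was not found on sys.path")
    else:
        print("pyarrow would be loaded from: %s" % origin)
        # In an interactive shell the current directory comes first on
        # sys.path, so a local pyarrow/ source directory shadows an
        # installed copy.
        if is_under(origin, os.getcwd()):
            print("note: this copy lives under the current directory")
```

If the reported path points into a source checkout rather than the install location (or the other way round), running Python from a different directory or in a separate environment is one way to tell the two copies apart.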
> I used instructions at https://arrow.apache.org/docs/python/install.html to
> build arrow/cpp, parquet-cpp, and then pyarrow, with the following deviations
> (I view them as possibly bugs in the instructions):
>
> arrow/cpp build:
> export ARROW_HOME=$HOME/local
> I had to specify -DARROW_PYTHON=on and -DPARQUET_ARROW=ON to the cmake
> command (besides the -DCMAKE_INSTALL_PREFIX=$ARROW_HOME)
>
> parquet-cpp build:
> export ARROW_HOME=$HOME/local
> cmake -DARROW_HOME=$HOME/local -DPARQUET_ARROW_LINKAGE=static
> -DPARQUET_ARROW=ON .
> make
> sudo make install ----> this installs parquet libs in the std systems
> location (/usr/local/lib) so that the pyarrow build (see below) can find the
> parquet libs
>
> pyarrow build:
> export ARROW_HOME=$HOME/local (not a deviation; just repeating here)
> export LD_LIBRARY_PATH=$HOME/local/lib:$HOME/parquet4/parquet-cpp/build/latest
> sudo python setup.py build_ext --with-parquet --with-jemalloc
> --build-type=release install
> sudo python setup.py install
> (sudo is needed to install in /usr/local/lib/python2.7/dist-packages )
>
> These are the steps and modifications to the instructions needed for me to
> build the pyarrow.parquet package. However, when I now try to import the
> package I get the error specified above.
> Maybe I did something wrong in my steps which I kind of put together by
> searching for these issues...but really can't tell what. It took me almost a
> whole day to get to the point where I can build pyarrow and parquet, and now
> I can't use what I built.
> Any comments, help appreciated! Thanks in advance.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)