[ https://issues.apache.org/jira/browse/ARROW-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeffrey Wong updated ARROW-4316:
--------------------------------
    Description: 
My team uses both pyarrow and R arrow, and we'd like both libraries to link to the same arrow.so file for consistency. pyarrow ships both arrow.so and parquet.so; if I can reuse those .so's to link R, that would guarantee consistency.

Under arrow v0.11.1 I was able to link R against libarrow.so found under pyarrow by passing LIB_DIR to the R [configure file|https://github.com/apache/arrow/blob/master/r/configure]. However, in v0.12.0 I am no longer able to do that. Here is a reproducible example on Ubuntu 16.04 that produces the error:

{code:java}
# get the parquet headers which are not shipped with pyarrow
tee /etc/apt/sources.list.d/apache-arrow.list <<APT_LINE
deb [arch=amd64] https://dl.bintray.com/apache/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/ $(lsb_release --codename --short) main
deb-src [] https://dl.bintray.com/apache/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/ $(lsb_release --codename --short) main
APT_LINE
apt-get update
mkdir /tmp/arrow_headers; cd /tmp/arrow_headers
apt-get download --allow-unauthenticated libparquet-dev
ar -x libparquet-dev_0.12.0-1_amd64.deb
tar -xJvf data.tar.xz

# get pyarrow v0.12
pip3 install pyarrow --upgrade

# figure out where pyarrow is
PY_ARROW_PATH=$(python3 -c "import pyarrow, os; print(os.path.dirname(pyarrow.__file__))")
PY_ARROW_VERSION=$(python3 -c "import pyarrow; print(pyarrow.__version__)")
PYTHON_LIBDIR=$(python3 -c "import sysconfig; print(sysconfig.get_config_var('LIBDIR'))")

# pyarrow doesn't ship parquet headers. Copy the ones from apt into the pyarrow dir
mkdir $PY_ARROW_PATH/include/parquet
cp -r /tmp/arrow_headers/usr/include/parquet/* $PY_ARROW_PATH/include/parquet/

# install R arrow
echo "export LD_LIBRARY_PATH=\"\${LD_LIBRARY_PATH}:${PYTHON_LIBDIR}:${PY_ARROW_PATH}\"" | tee -a /usr/lib/R/etc/ldpaths
git clone https://github.com/apache/arrow.git /tmp/arrow
cd /tmp/arrow/r
git checkout "apache-arrow-${PY_ARROW_VERSION}"
sed -i "/Depends: R/c\Depends: R (>= 3.4)" DESCRIPTION
sed -i "s/PKG_CXXFLAGS=/PKG_CXXFLAGS=-D_GLIBCXX_USE_CXX11_ABI=0 /g" src/Makevars.in
R CMD INSTALL ./ --configure-vars="INCLUDE_DIR=$PY_ARROW_PATH/include LIB_DIR=$PY_ARROW_PATH"
{code}

  was:
My team uses both pyarrow and R arrow, and we'd like both libraries to link to the same arrow.so file for consistency. pyarrow ships both arrow.so and parquet.so; if I can reuse those .so's to link R, that would guarantee consistency.

Under arrow v0.11.1 I was able to link R against libarrow.so found under pyarrow by passing LIB_DIR to the R [configure file|https://github.com/apache/arrow/blob/master/r/configure]. However, in v0.12.0 I am no longer able to do that.
Here is a reproducible example on Ubuntu 16.04 that produces the error:

# get the parquet headers which are not shipped with pyarrow
tee /etc/apt/sources.list.d/apache-arrow.list <<APT_LINE
deb [arch=amd64] https://dl.bintray.com/apache/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/ $(lsb_release --codename --short) main
deb-src [] https://dl.bintray.com/apache/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/ $(lsb_release --codename --short) main
APT_LINE
apt-get update
mkdir /tmp/arrow_headers; cd /tmp/arrow_headers
apt-get download --allow-unauthenticated libparquet-dev
ar -x libparquet-dev_0.12.0-1_amd64.deb
tar -xJvf data.tar.xz

# get pyarrow v0.12
pip3 install pyarrow --upgrade

# figure out where pyarrow is
PY_ARROW_PATH=$(python3 -c "import pyarrow, os; print(os.path.dirname(pyarrow.__file__))")
PY_ARROW_VERSION=$(python3 -c "import pyarrow; print(pyarrow.__version__)")
PYTHON_LIBDIR=$(python3 -c "import sysconfig; print(sysconfig.get_config_var('LIBDIR'))")

# pyarrow doesn't ship parquet headers. Copy the ones from apt into the pyarrow dir
mkdir $PY_ARROW_PATH/include/parquet
cp -r /tmp/arrow_headers/usr/include/parquet/* $PY_ARROW_PATH/include/parquet/

# install R arrow
echo "export LD_LIBRARY_PATH=\"\${LD_LIBRARY_PATH}:${PYTHON_LIBDIR}:${PY_ARROW_PATH}\"" | tee -a /usr/lib/R/etc/ldpaths
git clone https://github.com/apache/arrow.git /tmp/arrow
cd /tmp/arrow/r
git checkout "apache-arrow-${PY_ARROW_VERSION}"
sed -i "/Depends: R/c\Depends: R (>= 3.4)" DESCRIPTION
sed -i "s/PKG_CXXFLAGS=/PKG_CXXFLAGS=-D_GLIBCXX_USE_CXX11_ABI=0 /g" src/Makevars.in
R CMD INSTALL ./ --configure-vars="INCLUDE_DIR=$PY_ARROW_PATH/include LIB_DIR=$PY_ARROW_PATH"


> Reusing arrow.so for both Python and R
> --------------------------------------
>
>                 Key: ARROW-4316
>                 URL: https://issues.apache.org/jira/browse/ARROW-4316
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python, R
>    Affects Versions: 0.12.0
>         Environment: Ubuntu 16.04, R 3.4.4, pyarrow 0.12, cmake 3.12
>            Reporter: Jeffrey Wong
>            Priority: Major
>
> My team uses both pyarrow and R arrow, and we'd like both libraries to link to
> the same arrow.so file for consistency. pyarrow ships both arrow.so and
> parquet.so; if I can reuse those .so's to link R, that would guarantee
> consistency.
> Under arrow v0.11.1 I was able to link R against libarrow.so found under
> pyarrow by passing LIB_DIR to the R [configure
> file|https://github.com/apache/arrow/blob/master/r/configure]. However, in
> v0.12.0 I am no longer able to do that.
> Here is a reproducible example on Ubuntu 16.04 that produces the error:
> {code:java}
> # get the parquet headers which are not shipped with pyarrow
> tee /etc/apt/sources.list.d/apache-arrow.list <<APT_LINE
> deb [arch=amd64] https://dl.bintray.com/apache/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/ $(lsb_release --codename --short) main
> deb-src [] https://dl.bintray.com/apache/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/ $(lsb_release --codename --short) main
> APT_LINE
> apt-get update
> mkdir /tmp/arrow_headers; cd /tmp/arrow_headers
> apt-get download --allow-unauthenticated libparquet-dev
> ar -x libparquet-dev_0.12.0-1_amd64.deb
> tar -xJvf data.tar.xz
>
> # get pyarrow v0.12
> pip3 install pyarrow --upgrade
>
> # figure out where pyarrow is
> PY_ARROW_PATH=$(python3 -c "import pyarrow, os; print(os.path.dirname(pyarrow.__file__))")
> PY_ARROW_VERSION=$(python3 -c "import pyarrow; print(pyarrow.__version__)")
> PYTHON_LIBDIR=$(python3 -c "import sysconfig; print(sysconfig.get_config_var('LIBDIR'))")
>
> # pyarrow doesn't ship parquet headers. Copy the ones from apt into the pyarrow dir
> mkdir $PY_ARROW_PATH/include/parquet
> cp -r /tmp/arrow_headers/usr/include/parquet/* $PY_ARROW_PATH/include/parquet/
>
> # install R arrow
> echo "export LD_LIBRARY_PATH=\"\${LD_LIBRARY_PATH}:${PYTHON_LIBDIR}:${PY_ARROW_PATH}\"" | tee -a /usr/lib/R/etc/ldpaths
> git clone https://github.com/apache/arrow.git /tmp/arrow
> cd /tmp/arrow/r
> git checkout "apache-arrow-${PY_ARROW_VERSION}"
> sed -i "/Depends: R/c\Depends: R (>= 3.4)" DESCRIPTION
> sed -i "s/PKG_CXXFLAGS=/PKG_CXXFLAGS=-D_GLIBCXX_USE_CXX11_ABI=0 /g" src/Makevars.in
> R CMD INSTALL ./ --configure-vars="INCLUDE_DIR=$PY_ARROW_PATH/include LIB_DIR=$PY_ARROW_PATH"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
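
Before running R CMD INSTALL with LIB_DIR pointed at the pyarrow directory, it may help to check what that directory actually contains. The following is a minimal sketch, not part of Arrow or pyarrow: check_arrow_dir is a hypothetical helper, and it assumes the R configure script links with -larrow -lparquet. Link flags of that form need an unversioned libarrow.so and libparquet.so at link time, so the helper reports separately when only a versioned file such as libarrow.so.N is present.

```shell
#!/bin/sh
# Hypothetical helper (not an Arrow or pyarrow tool): report whether a
# directory holds the shared libraries and headers that a build using
# "-L<dir> -larrow -lparquet" and "-I<dir>/include" would look for.
# The -larrow/-lparquet flags are an assumption about the R configure script.
check_arrow_dir() {
    dir=$1
    status=0
    for lib in arrow parquet; do
        if [ -e "$dir/lib$lib.so" ]; then
            # Unversioned name: what the linker resolves -l<lib> against.
            echo "ok: lib$lib.so"
        elif ls "$dir"/lib$lib.so.* >/dev/null 2>&1; then
            # Only a versioned file exists; -l<lib> will not find it.
            echo "versioned only: lib$lib.so"
            status=1
        else
            echo "missing: lib$lib.so"
            status=1
        fi
    done
    for hdr in arrow parquet; do
        if [ -d "$dir/include/$hdr" ]; then
            echo "ok: include/$hdr"
        else
            echo "missing: include/$hdr"
            status=1
        fi
    done
    return $status
}
```

With the variables from the reproducible example, it would be called as `check_arrow_dir "$PY_ARROW_PATH"`; a non-zero return means at least one library or header directory was not found under the expected name.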