nealrichardson commented on a change in pull request #9579:
URL: https://github.com/apache/arrow/pull/9579#discussion_r591773618



##########
File path: r/configure
##########
@@ -96,7 +96,7 @@ else
       echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
     fi
   else
-    if [ "$UNAME" = "Darwin" ]; then
+    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_TOOLS_LIBS_SCRIPT" = "" ]; then

Review comment:
       ```suggestion
       if [ "$UNAME" = "Darwin" ] && [ "$FORCE_TOOLS_LIBS_SCRIPT" != "true" ]; then
   ```
   
   Also see L38 above and do the same thing with this var to make it case-insensitive.
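
A minimal sketch of the case-insensitive check, assuming the earlier line in `configure` normalizes its variable by lowercasing it with `tr`. The `to_lower` helper is hypothetical and only for illustration; the variable names come from the diff above.

```shell
#!/bin/sh
# Hypothetical helper (not from r/configure): lowercase a value so that
# TRUE, True, and true all compare equal to "true".
to_lower() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Fall back to the real uname if UNAME was not set by the caller.
UNAME="${UNAME:-$(uname)}"
FORCE_TOOLS_LIBS_SCRIPT=$(to_lower "${FORCE_TOOLS_LIBS_SCRIPT:-}")

if [ "$UNAME" = "Darwin" ] && [ "$FORCE_TOOLS_LIBS_SCRIPT" != "true" ]; then
  echo "macOS without the override: take the existing Darwin branch"
else
  echo "override set or not macOS: fall through to the tools libs script"
fi
```

Comparing against `"true"` rather than `""` means a value such as `false` no longer forces the script by accident.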

##########
File path: dev/tasks/r/github.macos-linux.local.yml
##########
@@ -0,0 +1,93 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# NOTE: must set "Crossbow" as name to have the badge links working in the
+# github comment reports!
+name: Crossbow
+
+on:
+  push:
+    branches:
+      - "*-github-*"
+
+jobs:
+  autobrew:
+    name: "install from local source"
+    runs-on: {{ '${{ matrix.os }}' }}
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [macOS-latest, ubuntu-20.04]
+
+   
+
+    steps:
+      - name: Checkout Arrow
+        run: |
+          git clone --no-checkout {{ arrow.remote }} arrow
+          git -C arrow fetch -t {{ arrow.remote }} {{ arrow.branch }}
+          git -C arrow checkout FETCH_HEAD
+          git -C arrow submodule update --init --recursive
+      - name: Configure non-autobrew dependencies (macos)
+        run: |
+          cd arrow/r
+          brew install openssl
+        if: contains(matrix.os, 'macOS')
+      - name: Configure non-autobrew dependencies (linux)
+        run: |
+          cd arrow/r
+          sudo apt install libcurl4-openssl-dev libssl-dev
+        if: contains(matrix.os, 'ubuntu')
+      - uses: r-lib/actions/setup-r@v1
+      - name: Install dependencies
+        run: |
+          install.packages("remotes")
+          remotes::install_deps("arrow/r", dependencies = TRUE)
+          remotes::install_cran(c("rcmdcheck", "sys", "sessioninfo"))
+        shell: Rscript {0}
+      - name: Session info
+        run: |
+          options(width = 100)
+          pkgs <- installed.packages()[, "Package"]
+          sessioninfo::session_info(pkgs, include_base = TRUE)
+        shell: Rscript {0}
+      - name: Install
+        env:
+          _R_CHECK_CRAN_INCOMING_: false
+          ARROW_USE_PKG_CONFIG: false
+          FORCE_TOOLS_LIBS_SCRIPT: true
+          LIBARROW_MINIMAL: false 
+          TEST_R_WITH_ARROW: TRUE
+          ARROW_R_DEV: TRUE
+        run: |
+          cd arrow/r
+          # Setting the SDK root is necesary on modern macOSes/SDKs to find standard libraries when building 

Review comment:
       Why have I never had to do this?

##########
File path: r/vignettes/install.Rmd
##########
@@ -166,10 +166,30 @@ run `install.packages("arrow")` or `R CMD INSTALL` but 
not when running `R CMD c
 unless you've set the `NOT_CRAN=true` environment variable.
 
 For the mechanics of how all this works, see the R package `configure` script,
-which calls `tools/linuxlibs.R`.
+which calls `tools/nixlibs.R`.
 If the C++ library is built from source, `inst/build_arrow_static.sh` is 
executed.
 This build script is also what is used to generate the prebuilt binaries.
 
+
+# Using `remotes::install_github(...)` 
+
+If you need an Arrow installation from a specific repository or at a specific 
ref, `remotes::install_github()` should work on most platforms (with the 
notable exception of windows). This method is helpful if you need a full 
install of arrow that is separate from another install (e.g. we use this in 
[arrowbench](https://github.com/ursacomputing/arrowbench) to install 
development versions of arrow isolated from the system install). However there 
are some caveats to be aware of:
+
+* Setting the environment variable `FORCE_TOOLS_LIBS_SCRIPT` to `true` will 
avoid linking to any arrow libraries already installed and attempt to build 
arrow from the same source at the repository+ref given.
+* If you are using the `FORCE_TOOLS_LIBS_SCRIPT` you must also set `build = 
FALSE` in the `remotes::install_github()` call. This is similar to checking out 
the repository and calling `R CMD INSTALL .` in the `arrow/r` directory (as 
opposed to first calling `R CMD BUILD .` and then installing the tar.gz file 
that produces, which is the default for `remotes::install_github()`). If you 
have arrow installed already, you may want to change your Makevars `CPPFLAGS` 
and `LDFLAGS` to `""` in order to prevent the installation process from 
attempting to link to already installed system versions of arrow. One way to do 
this temporarily is wrapping your `remotes::install_github()` call like so: 
`withr::with_makevars(list(CPPFLAGS = "", LDFLAGS = ""), 
remotes::install_github(...))`. 
+* Specify `subdir = "r"` to get the R package (or use `/r` after 
`username/repo` e.g. `apache/arrow/r`).
+* On macOS you may need to also specify the environment variable `SDKROOT` to 
an appropriate location (typically something like 
`/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk`).
 This is most easily and reliably done using `xcrun --show-sdk-path` (to set 
the environment variable inside of R you can `Sys.setenv(SDKROOT=system("xcrun 
--show-sdk-path", intern = TRUE))`). Setting the `SDKROOT` variable is 
necessary on modern (at least >= 10.15) macOS SDKs. This allows the build 
system to find the appropriate standard libraries and headers when it is 
compiling them.

Review comment:
       If it's necessary, I still don't understand why I've never had to do 
this before locally. I don't want to give the impression that macOS users 
always have to set some env var before installing arrow; if it is truly always 
required then it should get set in `configure` (but I don't see how it is 
always required).
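
For illustration only, defaulting the variable in `configure` might look like the sketch below. It is not taken from the PR: `maybe_set_sdkroot` is a hypothetical helper, and it assumes `xcrun --show-sdk-path` (the command the vignette text already recommends) is the right source for the value. It leaves any `SDKROOT` the user has already set untouched.

```shell
#!/bin/sh
# Hypothetical helper (not from r/configure): set SDKROOT only when the OS is
# macOS, the caller has not already set it, and xcrun is available.
maybe_set_sdkroot() {
  os="$1"
  if [ "$os" = "Darwin" ] && [ -z "${SDKROOT:-}" ] && command -v xcrun >/dev/null 2>&1; then
    sdk_path="$(xcrun --show-sdk-path 2>/dev/null)"
    # Only export when xcrun actually returned a path.
    if [ -n "$sdk_path" ]; then
      SDKROOT="$sdk_path"
      export SDKROOT
    fi
  fi
}

maybe_set_sdkroot "$(uname)"
echo "SDKROOT=${SDKROOT:-<unset>}"
```

With something like this, users would not need to set anything by hand, and the vignette could drop the `SDKROOT` caveat if the default turns out to be sufficient.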

##########
File path: r/configure
##########
@@ -133,22 +133,44 @@ else
       if [ "${LIBARROW_MINIMAL}" = "" ] && [ "${NOT_CRAN}" = "true" ]; then
         LIBARROW_MINIMAL=false; export LIBARROW_MINIMAL
       fi
-      ${R_HOME}/bin/Rscript tools/linuxlibs.R $VERSION
+
+      # find openssl on macos. macOS ships with libressl. openssl is installable
+      # with brew, but it is generally not linked. We can over-ride this and find
+      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+      # the installation process). FWIW, arrow's cmake process uses this
+      # same process to find openssl, but doing it now allows us to catch it in
+      # nixlibs.R and throw a nicer error.
+      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
+        brew --prefix openssl >/dev/null 2>&1
+        if [ $? -eq 0 ]; then
+          export OPENSSL_ROOT_DIR="$(brew --prefix openssl)"
+        fi
+      fi
+
+      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
       PKG_CFLAGS="-I$(pwd)/libarrow/arrow-${VERSION}/include $PKG_CFLAGS"
 
       LIB_DIR="libarrow/arrow-${VERSION}/lib"
       if [ -d "$LIB_DIR" ]; then
         # Enumerate the static libs, put their -l flags in BUNDLED_LIBS,
         # and put their -L location in PKG_DIRS
         #
-        # If tools/linuxlibs.R fails to produce libs, this dir won't exist
+        # If tools/nixlibs.R fails to produce libs, this dir won't exist
         # so don't try (the error message from `ls` would be misleading)
-        # Assume linuxlibs.R has handled and messaged about its failure already
+        # Assume nixlibs.R has handled and messaged about its failure already
         #
         # TODO: what about non-bundled deps?
         BUNDLED_LIBS=`cd $LIB_DIR && ls *.a`
         BUNDLED_LIBS=`echo $BUNDLED_LIBS | sed -E "s/lib(.*)\.a/-l\1/" | sed -e "s/\\.a lib/ -l/g"`
         PKG_DIRS="-L$(pwd)/$LIB_DIR"
+
+        # When using brew's openssl it is not bundled and it is not on the system
+        # search path  and so we must add the lib path to PKG_LIBS if we are
+        # using it. Note the order is important, this must be after the arrow
+        # lib path + the pkg and bundled libs above

Review comment:
       "... so this is why we're appending to BUNDLED_LIBS and not PKG_DIRS"

##########
File path: r/vignettes/install.Rmd
##########
@@ -166,10 +166,30 @@ run `install.packages("arrow")` or `R CMD INSTALL` but 
not when running `R CMD c
 unless you've set the `NOT_CRAN=true` environment variable.
 
 For the mechanics of how all this works, see the R package `configure` script,
-which calls `tools/linuxlibs.R`.
+which calls `tools/nixlibs.R`.
 If the C++ library is built from source, `inst/build_arrow_static.sh` is 
executed.
 This build script is also what is used to generate the prebuilt binaries.
 
+
+# Using `remotes::install_github(...)` 
+
+If you need an Arrow installation from a specific repository or at a specific 
ref, `remotes::install_github()` should work on most platforms (with the 
notable exception of windows). This method is helpful if you need a full 
install of arrow that is separate from another install (e.g. we use this in 
[arrowbench](https://github.com/ursacomputing/arrowbench) to install 
development versions of arrow isolated from the system install). However there 
are some caveats to be aware of:

Review comment:
       ```suggestion
   If you need an Arrow installation from a specific repository or at a 
specific ref, `remotes::install_github("apache/arrow/r", build = FALSE)` should 
work on most platforms (with the notable exception of windows). This method is 
helpful if you need a full install of arrow that is separate from another 
install (e.g. we use this in 
[arrowbench](https://github.com/ursacomputing/arrowbench) to install 
development versions of arrow isolated from the system install). However there 
are some caveats to be aware of:
   ```

##########
File path: r/vignettes/install.Rmd
##########
@@ -166,10 +166,30 @@ run `install.packages("arrow")` or `R CMD INSTALL` but 
not when running `R CMD c
 unless you've set the `NOT_CRAN=true` environment variable.
 
 For the mechanics of how all this works, see the R package `configure` script,
-which calls `tools/linuxlibs.R`.
+which calls `tools/nixlibs.R`.
 If the C++ library is built from source, `inst/build_arrow_static.sh` is 
executed.
 This build script is also what is used to generate the prebuilt binaries.
 
+
+# Using `remotes::install_github(...)` 
+
+If you need an Arrow installation from a specific repository or at a specific 
ref, `remotes::install_github()` should work on most platforms (with the 
notable exception of windows). This method is helpful if you need a full 
install of arrow that is separate from another install (e.g. we use this in 
[arrowbench](https://github.com/ursacomputing/arrowbench) to install 
development versions of arrow isolated from the system install). However there 
are some caveats to be aware of:
+
+* Setting the environment variable `FORCE_TOOLS_LIBS_SCRIPT` to `true` will 
avoid linking to any arrow libraries already installed and attempt to build 
arrow from the same source at the repository+ref given.
+* If you are using the `FORCE_TOOLS_LIBS_SCRIPT` you must also set `build = 
FALSE` in the `remotes::install_github()` call. This is similar to checking out 
the repository and calling `R CMD INSTALL .` in the `arrow/r` directory (as 
opposed to first calling `R CMD BUILD .` and then installing the tar.gz file 
that produces, which is the default for `remotes::install_github()`). If you 
have arrow installed already, you may want to change your Makevars `CPPFLAGS` 
and `LDFLAGS` to `""` in order to prevent the installation process from 
attempting to link to already installed system versions of arrow. One way to do 
this temporarily is wrapping your `remotes::install_github()` call like so: 
`withr::with_makevars(list(CPPFLAGS = "", LDFLAGS = ""), 
remotes::install_github(...))`. 

Review comment:
       Why not always recommend `build = FALSE`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

