nealrichardson commented on a change in pull request #9579:
URL: https://github.com/apache/arrow/pull/9579#discussion_r588794433



##########
File path: r/configure
##########
@@ -130,22 +130,43 @@ else
       if [ "${LIBARROW_MINIMAL}" = "" ] && [ "${NOT_CRAN}" = "true" ]; then
         LIBARROW_MINIMAL=false; export LIBARROW_MINIMAL
       fi
-      ${R_HOME}/bin/Rscript tools/linuxlibs.R $VERSION
+
+      # find openssl on macos. macOS ships with libressl. openssl is insallable

Review comment:
       ```suggestion
         # find openssl on macos. macOS ships with libressl. openssl is 
installable
   ```

##########
File path: r/README.md
##########
@@ -178,6 +178,28 @@ For any other build/configuration challenges, see the [C++ 
developer
 guide](https://arrow.apache.org/docs/developers/cpp/building.html) and
 `vignette("install", package = "arrow")`.
 
+### Using `remotes::install_github(...)` 
+
+If you need an Arrow installation from a specific repository or at a specific 
ref, `remotes::install_github()` should work on most platforms. This method is 
helpful if you need a full install of arrow that is separate from another 
install (e.g. we use this in 
[arrowbench](https://github.com/ursacomputing/arrowbench) to install 
development versions of arrow isolated from the system install). However there 
are some caveats to be aware of:
+
+* Setting the environment variable `FORCE_TOOLS_LIBS_SCRIPT` to `true` will 
avoid linking to any arrow libraries already installed and attempt to build 
arrow from the same source at the repository+ref given.
+* If you are using the `FORCE_TOOLS_LIBS_SCRIPT` you must also set `build = 
FALSE` in the `remotes::install_github()` call. This is similar to checking out 
the repository and calling `R CMD INSTALL .` in the `arrow/r` directory (as 
opposed to first calling `R CMD BUILD .` and then installing the tar.gz file 
that produces, which is the default for `remotes::install_github()`).
+* Specify `subdir = "r"` to get the R package.
+* On macOS you may need to also specify the environment variable `SDKROOT` to 
an appropriate location (typically something like 
`/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk`).
 This is most easily and reliably done using `xcrun --show-sdk-path` (to set 
the environment variable inside of R you can `Sys.setenv(SDKROOT=system("xcrun 
--show-sdk-path", intern = TRUE))`).
+* If you have arrow installed already, you may want to change your Makevars 
`CPPFLAGS` and `LDFLAGS` to `""` in order to prevent the installation process 
from attempting to link to already installed system versions of arrow. One way 
to do this temporarily is wrapping your `remotes::install_github()` call like 
so: `withr::with_makevars(list(CPPFLAGS = "", LDFLAGS = ""), 
remotes::install_github(...))`. 

Review comment:
       I would group this with the FORCE_TOOLS_LIBS_SCRIPT since what you're 
saying is "If you already have Arrow C++ installed on your system, you need to 
set this env var and these makevars so that you don't use them"
   
   Also I'm resisting the urge to 🚲 🏠  FORCE_TOOLS_LIBS_SCRIPT further

##########
File path: r/README.md
##########
@@ -178,6 +178,28 @@ For any other build/configuration challenges, see the [C++ 
developer
 guide](https://arrow.apache.org/docs/developers/cpp/building.html) and
 `vignette("install", package = "arrow")`.
 
+### Using `remotes::install_github(...)` 
+
+If you need an Arrow installation from a specific repository or at a specific 
ref, `remotes::install_github()` should work on most platforms. This method is 
helpful if you need a full install of arrow that is separate from another 
install (e.g. we use this in 
[arrowbench](https://github.com/ursacomputing/arrowbench) to install 
development versions of arrow isolated from the system install). However there 
are some caveats to be aware of:
+
+* Setting the environment variable `FORCE_TOOLS_LIBS_SCRIPT` to `true` will 
avoid linking to any arrow libraries already installed and attempt to build 
arrow from the same source at the repository+ref given.
+* If you are using the `FORCE_TOOLS_LIBS_SCRIPT` you must also set `build = 
FALSE` in the `remotes::install_github()` call. This is similar to checking out 
the repository and calling `R CMD INSTALL .` in the `arrow/r` directory (as 
opposed to first calling `R CMD BUILD .` and then installing the tar.gz file 
that produces, which is the default for `remotes::install_github()`).
+* Specify `subdir = "r"` to get the R package.

Review comment:
       I believe it is preferred now that you specify the subdir as part of 
`repo` now, i.e. `remotes::install_github("apache/arrow/r", build = FALSE)`  
cf. https://remotes.r-lib.org/reference/install_github.html

##########
File path: r/configure
##########
@@ -130,22 +130,43 @@ else
       if [ "${LIBARROW_MINIMAL}" = "" ] && [ "${NOT_CRAN}" = "true" ]; then
         LIBARROW_MINIMAL=false; export LIBARROW_MINIMAL
       fi
-      ${R_HOME}/bin/Rscript tools/linuxlibs.R $VERSION
+
+      # find openssl on macos. macOS ships with libressl. openssl is insallable
+      # with brew, but it is generally not linked. We can over-ride this and 
find
+      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+      # the installation process). FWIW, arrow's cmake process will use this
+      # same process to find openssl, but doing it now allows us to catch it in
+      # nixlibs.R and throw a nicer error.
+      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
+        brew --prefix openssl >/dev/null 2>&1
+        if [ $? -eq 0 ]; then
+          export OPENSSL_ROOT_DIR="$(brew --prefix openssl)"
+        fi
+      fi
+
+      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
       PKG_CFLAGS="-I$(pwd)/libarrow/arrow-${VERSION}/include $PKG_CFLAGS"
 
       LIB_DIR="libarrow/arrow-${VERSION}/lib"
       if [ -d "$LIB_DIR" ]; then
         # Enumerate the static libs and add to PKG_LIBS
         # (technically repeating arrow libs so they're in the right order)
         #
-        # If tools/linuxlibs.R fails to produce libs, this dir won't exist
+        # If tools/nixlibs.R fails to produce libs, this dir won't exist
         # so don't try (the error message from `ls` would be misleading)
-        # Assume linuxlibs.R has handled and messaged about its failure already
+        # Assume nixlibs.R has handled and messaged about its failure already
         #
         # TODO: what about non-bundled deps?
         BUNDLED_LIBS=`cd $LIB_DIR && ls *.a`
         BUNDLED_LIBS=`echo $BUNDLED_LIBS | sed -E "s/lib(.*)\.a/-l\1/" | sed 
-e "s/\\.a lib/ -l/g"`
         PKG_LIBS="-L$(pwd)/$LIB_DIR $PKG_LIBS $BUNDLED_LIBS"
+
+        # openssl is not bundled, and so we must add the lib path to PKG_LIBS 
if

Review comment:
       Note that this is only for the `brew` case you set above

##########
File path: r/README.md
##########
@@ -178,6 +178,28 @@ For any other build/configuration challenges, see the [C++ 
developer
 guide](https://arrow.apache.org/docs/developers/cpp/building.html) and
 `vignette("install", package = "arrow")`.
 
+### Using `remotes::install_github(...)` 
+
+If you need an Arrow installation from a specific repository or at a specific 
ref, `remotes::install_github()` should work on most platforms. This method is 
helpful if you need a full install of arrow that is separate from another 
install (e.g. we use this in 
[arrowbench](https://github.com/ursacomputing/arrowbench) to install 
development versions of arrow isolated from the system install). However there 
are some caveats to be aware of:
+
+* Setting the environment variable `FORCE_TOOLS_LIBS_SCRIPT` to `true` will 
avoid linking to any arrow libraries already installed and attempt to build 
arrow from the same source at the repository+ref given.
+* If you are using the `FORCE_TOOLS_LIBS_SCRIPT` you must also set `build = 
FALSE` in the `remotes::install_github()` call. This is similar to checking out 
the repository and calling `R CMD INSTALL .` in the `arrow/r` directory (as 
opposed to first calling `R CMD BUILD .` and then installing the tar.gz file 
that produces, which is the default for `remotes::install_github()`).
+* Specify `subdir = "r"` to get the R package.
+* On macOS you may need to also specify the environment variable `SDKROOT` to 
an appropriate location (typically something like 
`/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk`).
 This is most easily and reliably done using `xcrun --show-sdk-path` (to set 
the environment variable inside of R you can `Sys.setenv(SDKROOT=system("xcrun 
--show-sdk-path", intern = TRUE))`).
+* If you have arrow installed already, you may want to change your Makevars 
`CPPFLAGS` and `LDFLAGS` to `""` in order to prevent the installation process 
from attempting to link to already installed system versions of arrow. One way 
to do this temporarily is wrapping your `remotes::install_github()` call like 
so: `withr::with_makevars(list(CPPFLAGS = "", LDFLAGS = ""), 
remotes::install_github(...))`. 
+
+As with other installation methods setting the environment variables 
`LIBARROW_MINIMAL=false` and `ARROW_R_DEV=true` will provide a more 
full-featured version of Arrow and provide more verbose output respectively.
+
+For example, to install from the (fictional) branch `bugfix` from 
`apache/arrow` one could:
+
+```r
+Sys.setenv(FORCE_TOOLS_LIBS_SCRIPT="true")
+Sys.setenv(LIBARROW_MINIMAL="false")
+remotes::install_github("apache/arrow", ref = "bugfix", subdir = "r", build = 
FALSE)
+```
+
+This feature is experimental and might not work on all platforms.

Review comment:
       This undercuts the statement in the first sentence

##########
File path: r/README.md
##########
@@ -178,6 +178,28 @@ For any other build/configuration challenges, see the [C++ 
developer
 guide](https://arrow.apache.org/docs/developers/cpp/building.html) and
 `vignette("install", package = "arrow")`.
 
+### Using `remotes::install_github(...)` 
+
+If you need an Arrow installation from a specific repository or at a specific 
ref, `remotes::install_github()` should work on most platforms. This method is 
helpful if you need a full install of arrow that is separate from another 
install (e.g. we use this in 
[arrowbench](https://github.com/ursacomputing/arrowbench) to install 
development versions of arrow isolated from the system install). However there 
are some caveats to be aware of:
+
+* Setting the environment variable `FORCE_TOOLS_LIBS_SCRIPT` to `true` will 
avoid linking to any arrow libraries already installed and attempt to build 
arrow from the same source at the repository+ref given.
+* If you are using the `FORCE_TOOLS_LIBS_SCRIPT` you must also set `build = 
FALSE` in the `remotes::install_github()` call. This is similar to checking out 
the repository and calling `R CMD INSTALL .` in the `arrow/r` directory (as 
opposed to first calling `R CMD BUILD .` and then installing the tar.gz file 
that produces, which is the default for `remotes::install_github()`).
+* Specify `subdir = "r"` to get the R package.
+* On macOS you may need to also specify the environment variable `SDKROOT` to 
an appropriate location (typically something like 
`/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk`).
 This is most easily and reliably done using `xcrun --show-sdk-path` (to set 
the environment variable inside of R you can `Sys.setenv(SDKROOT=system("xcrun 
--show-sdk-path", intern = TRUE))`).

Review comment:
       "may"? When/why?

##########
File path: r/README.md
##########
@@ -178,6 +178,28 @@ For any other build/configuration challenges, see the [C++ 
developer
 guide](https://arrow.apache.org/docs/developers/cpp/building.html) and
 `vignette("install", package = "arrow")`.
 
+### Using `remotes::install_github(...)` 
+
+If you need an Arrow installation from a specific repository or at a specific 
ref, `remotes::install_github()` should work on most platforms. This method is 
helpful if you need a full install of arrow that is separate from another 
install (e.g. we use this in 
[arrowbench](https://github.com/ursacomputing/arrowbench) to install 
development versions of arrow isolated from the system install). However there 
are some caveats to be aware of:
+
+* Setting the environment variable `FORCE_TOOLS_LIBS_SCRIPT` to `true` will 
avoid linking to any arrow libraries already installed and attempt to build 
arrow from the same source at the repository+ref given.
+* If you are using the `FORCE_TOOLS_LIBS_SCRIPT` you must also set `build = 
FALSE` in the `remotes::install_github()` call. This is similar to checking out 
the repository and calling `R CMD INSTALL .` in the `arrow/r` directory (as 
opposed to first calling `R CMD BUILD .` and then installing the tar.gz file 
that produces, which is the default for `remotes::install_github()`).
+* Specify `subdir = "r"` to get the R package.
+* On macOS you may need to also specify the environment variable `SDKROOT` to 
an appropriate location (typically something like 
`/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk`).
 This is most easily and reliably done using `xcrun --show-sdk-path` (to set 
the environment variable inside of R you can `Sys.setenv(SDKROOT=system("xcrun 
--show-sdk-path", intern = TRUE))`).
+* If you have arrow installed already, you may want to change your Makevars 
`CPPFLAGS` and `LDFLAGS` to `""` in order to prevent the installation process 
from attempting to link to already installed system versions of arrow. One way 
to do this temporarily is wrapping your `remotes::install_github()` call like 
so: `withr::with_makevars(list(CPPFLAGS = "", LDFLAGS = ""), 
remotes::install_github(...))`. 
+
+As with other installation methods setting the environment variables 
`LIBARROW_MINIMAL=false` and `ARROW_R_DEV=true` will provide a more 
full-featured version of Arrow and provide more verbose output respectively.
+
+For example, to install from the (fictional) branch `bugfix` from 
`apache/arrow` one could:
+
+```r
+Sys.setenv(FORCE_TOOLS_LIBS_SCRIPT="true")
+Sys.setenv(LIBARROW_MINIMAL="false")
+remotes::install_github("apache/arrow", ref = "bugfix", subdir = "r", build = 
FALSE)

Review comment:
       Or, `remotes::install_github("apache/arrow/r@bugfix", build = FALSE)`
   

##########
File path: dev/tasks/r/github.macos.local.yml
##########
@@ -0,0 +1,78 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# NOTE: must set "Crossbow" as name to have the badge links working in the
+# github comment reports!
+name: Crossbow
+
+on:
+  push:
+    branches:
+      - "*-github-*"
+
+jobs:
+  autobrew:
+    name: "macOS install from local source"
+    runs-on: macOS-latest
+    steps:
+      - name: Checkout Arrow
+        run: |
+          git clone --no-checkout {{ arrow.remote }} arrow
+          git -C arrow fetch -t {{ arrow.remote }} {{ arrow.branch }}
+          git -C arrow checkout FETCH_HEAD
+          git -C arrow submodule update --init --recursive
+      - name: Configure non-autobrew dependencies
+        run: |
+          cd arrow/r
+          brew install openssl
+      - uses: r-lib/actions/setup-r@v1
+      - name: Install dependencies
+        run: |
+          install.packages("remotes")
+          remotes::install_deps("arrow/r", dependencies = TRUE)
+          remotes::install_cran(c("rcmdcheck", "sys", "sessioninfo"))
+        shell: Rscript {0}
+      - name: Session info
+        run: |
+          options(width = 100)
+          pkgs <- installed.packages()[, "Package"]
+          sessioninfo::session_info(pkgs, include_base = TRUE)
+        shell: Rscript {0}
+      - name: Install
+        env:
+          _R_CHECK_CRAN_INCOMING_: false
+          ARROW_USE_PKG_CONFIG: false
+          FORCE_TOOLS_LIBS_SCRIPT: true
+          LIBARROW_MINIMAL: false 
+          TEST_R_WITH_ARROW: TRUE
+          ARROW_R_DEV: TRUE
+        run: |
+          cd arrow/r
+          export SDKROOT="$(xcrun --show-sdk-path)"

Review comment:
       Leave a comment as to why this is needed.

##########
File path: r/configure
##########
@@ -130,22 +130,43 @@ else
       if [ "${LIBARROW_MINIMAL}" = "" ] && [ "${NOT_CRAN}" = "true" ]; then
         LIBARROW_MINIMAL=false; export LIBARROW_MINIMAL
       fi
-      ${R_HOME}/bin/Rscript tools/linuxlibs.R $VERSION
+
+      # find openssl on macos. macOS ships with libressl. openssl is insallable
+      # with brew, but it is generally not linked. We can over-ride this and 
find
+      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+      # the installation process). FWIW, arrow's cmake process will use this

Review comment:
       ```suggestion
         # the installation process). FWIW, arrow's cmake process uses this
   ```

##########
File path: r/README.md
##########
@@ -178,6 +178,28 @@ For any other build/configuration challenges, see the [C++ 
developer
 guide](https://arrow.apache.org/docs/developers/cpp/building.html) and
 `vignette("install", package = "arrow")`.
 
+### Using `remotes::install_github(...)` 
+
+If you need an Arrow installation from a specific repository or at a specific 
ref, `remotes::install_github()` should work on most platforms. This method is 
helpful if you need a full install of arrow that is separate from another 
install (e.g. we use this in 
[arrowbench](https://github.com/ursacomputing/arrowbench) to install 
development versions of arrow isolated from the system install). However there 
are some caveats to be aware of:
+
+* Setting the environment variable `FORCE_TOOLS_LIBS_SCRIPT` to `true` will 
avoid linking to any arrow libraries already installed and attempt to build 
arrow from the same source at the repository+ref given.
+* If you are using the `FORCE_TOOLS_LIBS_SCRIPT` you must also set `build = 
FALSE` in the `remotes::install_github()` call. This is similar to checking out 
the repository and calling `R CMD INSTALL .` in the `arrow/r` directory (as 
opposed to first calling `R CMD BUILD .` and then installing the tar.gz file 
that produces, which is the default for `remotes::install_github()`).

Review comment:
       I believe you always need build = FALSE, right? Otherwise it won't find 
the cpp dir to build.

##########
File path: r/README.md
##########
@@ -178,6 +178,28 @@ For any other build/configuration challenges, see the [C++ 
developer
 guide](https://arrow.apache.org/docs/developers/cpp/building.html) and
 `vignette("install", package = "arrow")`.
 
+### Using `remotes::install_github(...)` 
+
+If you need an Arrow installation from a specific repository or at a specific 
ref, `remotes::install_github()` should work on most platforms. This method is 
helpful if you need a full install of arrow that is separate from another 
install (e.g. we use this in 
[arrowbench](https://github.com/ursacomputing/arrowbench) to install 
development versions of arrow isolated from the system install). However there 
are some caveats to be aware of:

Review comment:
       > `remotes::install_github()` should work on most platforms
   
   I'm not sure how true that is. It's not true on Windows. 
   
   Either way, we definitely don't want to recommend it to users--we'd rather 
they install nightly packages if they want to test the latest. Since we want to 
deemphasize this, and since the concerns you address below (where you already 
have a C++ library built separately and installed on your system) are things 
that only developers would face, I think we should move this, perhaps to the 
install.Rmd vignette for now. We should revise the readme and vignettes more 
holistically before the release as well; Ian made a JIRA about that a while 
back.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to