nealrichardson commented on a change in pull request #9898:
URL: https://github.com/apache/arrow/pull/9898#discussion_r613577211



##########
File path: r/vignettes/developing.Rmd
##########
@@ -0,0 +1,510 @@
+---
+title: "Arrow R Developer Guide"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Arrow R Developer Guide}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r setup options, include=FALSE}
+knitr::opts_chunk$set(error = TRUE, eval = FALSE)
+
+# Get environment variables describing what to evaluate
+run <- tolower(Sys.getenv("RUN_DEVDOCS", "false")) == "true"
+macos <- tolower(Sys.getenv("DEVDOCS_MACOS", "false")) == "true"
+ubuntu <- tolower(Sys.getenv("DEVDOCS_UBUNTU", "false")) == "true"
+sys_install <- tolower(Sys.getenv("DEVDOCS_SYSTEM_INSTALL", "false")) == "true"
+
+# Update the source knit_hook to save the chunk (if it is marked to be saved)
+knit_hooks_source <- knitr::knit_hooks$get("source")
+knitr::knit_hooks$set(source = function(x, options) {
+  # Extra paranoia about when this will write the chunks to the script, we will
+  # only save when:
+  #   * CI is true
+  #   * RUN_DEVDOCS is true
+  #   * options$save is TRUE (and a check that not NULL won't crash it)
+  if (as.logical(Sys.getenv("CI", FALSE)) && run && !is.null(options$save) && options$save)
+    cat(x, file = "script.sh", append = TRUE, sep = "\n")
+  # but hide the blocks we want hidden:
+  if (!is.null(options$hide) && options$hide) {
+    return(NULL)
+  }
+  knit_hooks_source(x, options)
+})
+```
+
+```{bash, save=run, hide=TRUE}
+# Stop on failure, echo input as we go
+set -e
+set -x
+```
+
+If you're looking to contribute to `arrow`, this document can help you set up 
a development environment that will enable you to write code and run tests 
locally. It outlines how to build the various components that make up the Arrow 
project and R package, as well as some common troubleshooting and workflows 
developers use. Many contributions can be accomplished with the instructions in 
[R-only development](#r-only-development). But if you're working on both the 
C++ library and the R package, the [Developer environment 
setup](#developer-environment-setup) section will guide you through setting up 
a developer environment.
+
+This document is intended only for developers of Apache Arrow or the Arrow R 
package. Users of the package in R do not need to do any of this setup. If 
you're looking for how to install Arrow, see [the instructions in the 
readme](https://arrow.apache.org/docs/r/#installation); Linux users can find 
more details on building from source at `vignette("install", package = 
"arrow")`.
+
+This document is a work in progress and will grow and change as the Apache Arrow 
project grows and changes. We have tried to make these steps as robust as 
possible (in fact, we even test exactly these instructions on our nightly CI to 
ensure they don't become stale!), but certain custom configurations might 
conflict with these instructions, and developers differ on whether there is one 
true way to set up a development environment like this, and on what it would be. 
We also welcome feedback about anything that is confusing or any additions you 
would like to see here. Please [report an 
issue](https://issues.apache.org/jira/projects/ARROW/issues) if you see 
anything that is confusing, odd, or just plain wrong.
+
+## R-only development
+
+Windows and macOS users who wish to contribute to the R package and
+don’t need to alter the Arrow C++ library may be able to obtain a
+recent version of the library without building from source. On macOS,
+you may install the C++ library using [Homebrew](https://brew.sh/):
+
+``` shell
+# For the released version:
+brew install apache-arrow
+# Or for a development version, you can try:
+brew install apache-arrow --HEAD
+```
+
+On Windows and Linux, you can download a .zip file with the arrow dependencies 
from the
+nightly repository,
+and then set the `RWINLIB_LOCAL` environment variable to point to that
+zip file before installing the `arrow` R package. Version numbers in that
+repository correspond to dates, and you will likely want the most recent.
+
+To see what nightlies are available, you can use Arrow's (or any other S3 
client's) S3 listing functionality to see what is in the bucket 
`s3://arrow-r-nightly/libarrow/bin`:
+
+```
+nightly <- s3_bucket("arrow-r-nightly")
+nightly$ls("libarrow/bin")
+```
+
+## Developer environment setup
+
+If you need to alter both the Arrow C++ library and the R package code, or if 
you can’t get a binary version of the latest C++ library elsewhere, you’ll need 
to build it from source too. This section discusses how to set up a C++ build 
configured to work with the R package. For more general resources, see the 
[Arrow C++ developer
+guide](https://arrow.apache.org/docs/developers/cpp/building.html).
+
+### Install dependencies {.tabset}
+
+The Arrow C++ library will by default use system dependencies if suitable 
versions are found; if they are not present, it will build them during its own 
build process. The only dependencies that one needs to install outside of the 
build process are `cmake` (for configuring the build) and `openssl` if you are 
building with S3 support.
+
+For a faster build, you may choose to install on the system more C++ library 
dependencies (such as `lz4`, `zstd`, etc.) so that they don't need to be built 
from source in the Arrow build. This is optional.
+
+#### macOS
+```{bash, save=run & macos}
+brew install cmake openssl
+```
+
+#### Ubuntu
+```{bash, save=run & ubuntu}
+sudo apt install -y cmake libcurl4-openssl-dev libssl-dev
+```
+
+### Configure the Arrow build {.tabset}
+
+You can choose to build and then install the Arrow library into a user-defined 
directory or into a system-level directory. You only need to do one of these 
two options.

Review comment:
       Let's make clear in general, regardless of approach, that there are 
effectively three steps:
   
   * configure (`cmake`), which populates a build directory with scripts etc.
   * build, which compiles arrow (including potentially downloading and 
building dependencies) and produces the resulting libraries and headers in that 
build directory
   * install, which organizes just the libraries and headers into a clean 
location, without any of the build detritus.
   
   It's this last step, where to install to, that is the key difference in the 
two approaches described here. And it requires a change or two in the 
configuring step.
   
   These steps involve three directories:
   
   * source (the cpp/ dir in arrow)
   * build (which you create)
   * install (which in one version you create, and in the other already exists 
on the system and is known to the system as the place where libraries and 
headers go)
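
To illustrate the three steps and three directories listed above, here is a minimal sketch rather than the vignette's actual instructions. The variable names (`SRC_DIR`, `BUILD_DIR`, `INSTALL_DIR`), the default paths, and the `run` helper are hypothetical and chosen only for this example. It assumes a reasonably recent `cmake` (3.13 or later for `-S`/`-B`) and omits any Arrow-specific options. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical locations for illustration; adjust to your own checkout.
SRC_DIR="${SRC_DIR:-$PWD/arrow/cpp}"          # source: the cpp/ dir in arrow
BUILD_DIR="${BUILD_DIR:-$PWD/arrow-build}"    # build: a directory you create
INSTALL_DIR="${INSTALL_DIR:-$PWD/arrow-dist}" # install: a user-defined prefix

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# 1. Configure: populate the build directory and record where to install.
run cmake -S "$SRC_DIR" -B "$BUILD_DIR" -DCMAKE_INSTALL_PREFIX="$INSTALL_DIR"

# 2. Build: compile, leaving libraries and headers in the build directory.
run cmake --build "$BUILD_DIR"

# 3. Install: copy just the libraries and headers into the install prefix.
run cmake --build "$BUILD_DIR" --target install
```

For the system-level variant, the only change in this sketch would be the install prefix passed at configure time (or leaving it at the CMake default), and the install step may then need elevated permissions.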



