nealrichardson commented on a change in pull request #9898: URL: https://github.com/apache/arrow/pull/9898#discussion_r610767751
########## File path: r/vignettes/dev-docs.Rmd ########## @@ -0,0 +1,349 @@ +--- +title: "Arrow R Package Developer Documentation" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Arrow R Package Developer Documentation} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r setup options, include=FALSE} +knitr::opts_chunk$set(error = TRUE, eval = FALSE) + +# Get environment variables describing what to evaluate +run <- tolower(Sys.getenv("RUN_DEVDOCS", "false")) == "true" +macos <- tolower(Sys.getenv("DEVDOCS_MACOS", "false")) == "true" +ubuntu <- tolower(Sys.getenv("DEVDOCS_UBUNTU", "false")) == "true" +sys_install <- tolower(Sys.getenv("DEVDOCS_SYSTEM_INSTALL", "false")) == "true" + +# Update the source knit_hook to save the chunk (if it is marked to be saved) +knit_hooks_source <- knitr::knit_hooks$get("source") +knitr::knit_hooks$set(source = function(lines, options) { + # Extra paranoia about when this will write the chunks to the script, we will + # only save when: + # * CI is true + # * RUN_DEVDOCS is true + # * options$save is TRUE (and a check that not NULL won't crash it) + if (as.logical(Sys.getenv("CI", FALSE)) && run && !is.null(options$save) && options$save) + cat(lines, file = "script.sh", append = TRUE, sep = "\n") + NULL +}) +``` + +```{bash, save=run} +# Stop on failure, echo input as we go +set -e +set -x +``` + +## R-only development + +Windows and macOS users who wish to contribute to the R package and +don’t need to alter the Arrow C++ library may be able to obtain a +recent version of the library without building from source. 
On macOS, +you may install the C++ library using [Homebrew](https://brew.sh/): + +``` shell +# For the released version: +brew install apache-arrow +# Or for a development version, you can try: +brew install apache-arrow --HEAD +``` + +On Windows, you can download a .zip file with the arrow dependencies from the +[nightly repository](https://arrow-r-nightly.s3.amazonaws.com/libarrow/bin/windows/), +and then set the `RWINLIB_LOCAL` environment variable to point to that +zip file before installing the `arrow` R package. Version numbers in that +repository correspond to dates, and you will likely want the most recent. + +## Developer envorinment setup + +If you need to alter both the Arrow C++ library and the R package code, or if you can’t get a binary version of the latest C++ library elsewhere, you’ll need to build it from source too. + +First, install the C++ library. See the [developer +guide](https://arrow.apache.org/docs/developers/cpp/building.html) for details. + +### Install dependencies {.tabset} Review comment: what is `{.tabset}`? 
########## File path: r/vignettes/dev-docs.Rmd ########## @@ -0,0 +1,349 @@ +--- +title: "Arrow R Package Developer Documentation" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Arrow R Package Developer Documentation} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r setup options, include=FALSE} +knitr::opts_chunk$set(error = TRUE, eval = FALSE) + +# Get environment variables describing what to evaluate +run <- tolower(Sys.getenv("RUN_DEVDOCS", "false")) == "true" +macos <- tolower(Sys.getenv("DEVDOCS_MACOS", "false")) == "true" +ubuntu <- tolower(Sys.getenv("DEVDOCS_UBUNTU", "false")) == "true" +sys_install <- tolower(Sys.getenv("DEVDOCS_SYSTEM_INSTALL", "false")) == "true" + +# Update the source knit_hook to save the chunk (if it is marked to be saved) +knit_hooks_source <- knitr::knit_hooks$get("source") +knitr::knit_hooks$set(source = function(lines, options) { + # Extra paranoia about when this will write the chunks to the script, we will + # only save when: + # * CI is true + # * RUN_DEVDOCS is true + # * options$save is TRUE (and a check that not NULL won't crash it) + if (as.logical(Sys.getenv("CI", FALSE)) && run && !is.null(options$save) && options$save) + cat(lines, file = "script.sh", append = TRUE, sep = "\n") + NULL +}) +``` + +```{bash, save=run} +# Stop on failure, echo input as we go +set -e +set -x +``` + +## R-only development + +Windows and macOS users who wish to contribute to the R package and +don’t need to alter the Arrow C++ library may be able to obtain a +recent version of the library without building from source. 
On macOS, +you may install the C++ library using [Homebrew](https://brew.sh/): + +``` shell +# For the released version: +brew install apache-arrow +# Or for a development version, you can try: +brew install apache-arrow --HEAD +``` + +On Windows, you can download a .zip file with the arrow dependencies from the +[nightly repository](https://arrow-r-nightly.s3.amazonaws.com/libarrow/bin/windows/), +and then set the `RWINLIB_LOCAL` environment variable to point to that +zip file before installing the `arrow` R package. Version numbers in that +repository correspond to dates, and you will likely want the most recent. + +## Developer envorinment setup + +If you need to alter both the Arrow C++ library and the R package code, or if you can’t get a binary version of the latest C++ library elsewhere, you’ll need to build it from source too. + +First, install the C++ library. See the [developer +guide](https://arrow.apache.org/docs/developers/cpp/building.html) for details. + +### Install dependencies {.tabset} + +These dependencies are technically only needed for S3 support(?) Review comment: Yes, only S3 support. Otherwise Arrow doesn't require any other system dependencies (beyond what R itself requires)--however, if you're providing a build script that you want a dev to run separately, probably should have them install `cmake`. 
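If the build script is meant to be run on its own, one option is to check for `cmake` up front and fail with a readable message. This is only a sketch: `require_tool` is a hypothetical helper name, not something that exists in the repository.

```shell
# Hypothetical helper: fail early if a required build tool is missing.
# `command -v` is the POSIX way to ask whether a command is on PATH.
require_tool() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "error: '$1' not found on PATH; install it before building" >&2
    return 1
  fi
}
```

A script would then call `require_tool cmake` before the configure step, and under `set -e` it would stop there if `cmake` is absent.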
########## File path: r/vignettes/dev-docs.Rmd ########## @@ -0,0 +1,349 @@ +--- +title: "Arrow R Package Developer Documentation" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Arrow R Package Developer Documentation} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r setup options, include=FALSE} +knitr::opts_chunk$set(error = TRUE, eval = FALSE) + +# Get environment variables describing what to evaluate +run <- tolower(Sys.getenv("RUN_DEVDOCS", "false")) == "true" +macos <- tolower(Sys.getenv("DEVDOCS_MACOS", "false")) == "true" +ubuntu <- tolower(Sys.getenv("DEVDOCS_UBUNTU", "false")) == "true" +sys_install <- tolower(Sys.getenv("DEVDOCS_SYSTEM_INSTALL", "false")) == "true" + +# Update the source knit_hook to save the chunk (if it is marked to be saved) +knit_hooks_source <- knitr::knit_hooks$get("source") +knitr::knit_hooks$set(source = function(lines, options) { + # Extra paranoia about when this will write the chunks to the script, we will + # only save when: + # * CI is true + # * RUN_DEVDOCS is true + # * options$save is TRUE (and a check that not NULL won't crash it) + if (as.logical(Sys.getenv("CI", FALSE)) && run && !is.null(options$save) && options$save) + cat(lines, file = "script.sh", append = TRUE, sep = "\n") + NULL +}) +``` + +```{bash, save=run} +# Stop on failure, echo input as we go +set -e +set -x +``` + +## R-only development + +Windows and macOS users who wish to contribute to the R package and +don’t need to alter the Arrow C++ library may be able to obtain a +recent version of the library without building from source. 
On macOS, +you may install the C++ library using [Homebrew](https://brew.sh/): + +``` shell +# For the released version: +brew install apache-arrow +# Or for a development version, you can try: +brew install apache-arrow --HEAD +``` + +On Windows, you can download a .zip file with the arrow dependencies from the +[nightly repository](https://arrow-r-nightly.s3.amazonaws.com/libarrow/bin/windows/), +and then set the `RWINLIB_LOCAL` environment variable to point to that +zip file before installing the `arrow` R package. Version numbers in that +repository correspond to dates, and you will likely want the most recent. + +## Developer envorinment setup + +If you need to alter both the Arrow C++ library and the R package code, or if you can’t get a binary version of the latest C++ library elsewhere, you’ll need to build it from source too. + +First, install the C++ library. See the [developer +guide](https://arrow.apache.org/docs/developers/cpp/building.html) for details. + +### Install dependencies {.tabset} + +These dependencies are technically only needed for S3 support(?) + +#### macOS +```{bash, save=run & macos} +brew install openssl +``` + +#### ubuntu +```{bash, save=run & ubuntu} +sudo apt install -y libcurl4-openssl-dev libssl-dev +``` + + +### Building Arrow {.tabset} + +It’s recommended to make a build directory inside of the cpp directory of the Arrow git repository (it is git-ignored). Assuming you are inside cpp/build, you’ll first call cmake to configure the build and then make install. 
For the R package, you’ll need to enable several features in the C++ library using -D flags: + +#### Installing to another directory + +If you would like to use arrow from an alternative directory + +```{bash, save=run & !sys_install} +export ARROW_HOME=$(pwd)/dist +export LD_LIBRARY_PATH=$(pwd)/dist/lib +export INCLUDE_DIR=$ARROW_HOME/include +export LIB_DIR=$ARROW_HOME/lib +``` + +TODO: how to add these vars separated by colons if they have other things in them already (eg LD_LIBRARY_PATH=$(pwd)/dist/lib:$LD_LIBRARY_PATH) + +On linux, you will need to set `LD_LIBRARY_PATH` to this same directory before launching R and using Arrow. One way to do this is to add it to your profile. # TODO: do we want to recommend this? Is there any other alternative? + +```{bash, save=run & ubuntu & !sys_install} +echo "export LD_LIBRARY_PATH=$(pwd)/dist/lib" >> ~/.bash_profile +``` + +```{bash, save=run} +cd arrow/cpp +mkdir build +cd build +``` + +```{bash, save=run & !sys_install} +mkdir $ARROW_HOME + +cmake \ + -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \ + -DCMAKE_INSTALL_LIBDIR=lib \ + -DARROW_COMPUTE=ON \ + -DARROW_CSV=ON \ + -DARROW_DATASET=ON \ + -DARROW_FILESYSTEM=ON \ + -DARROW_JEMALLOC=ON \ + -DARROW_JSON=ON \ + -DARROW_PARQUET=ON \ + -DCMAKE_BUILD_TYPE=release \ + -DARROW_WITH_SNAPPY=ON \ + -DARROW_WITH_ZLIB=ON \ + -DARROW_INSTALL_NAME_RPATH=OFF \ + -DThrift_SOURCE=BUNDLED \ Review comment: I would leave this off and suggest it later in a troubleshooting section ########## File path: r/vignettes/dev-docs.Rmd ########## @@ -0,0 +1,349 @@ +--- +title: "Arrow R Package Developer Documentation" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Arrow R Package Developer Documentation} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r setup options, include=FALSE} +knitr::opts_chunk$set(error = TRUE, eval = FALSE) + +# Get environment variables describing what to evaluate +run <- tolower(Sys.getenv("RUN_DEVDOCS", "false")) == "true" 
+macos <- tolower(Sys.getenv("DEVDOCS_MACOS", "false")) == "true" +ubuntu <- tolower(Sys.getenv("DEVDOCS_UBUNTU", "false")) == "true" +sys_install <- tolower(Sys.getenv("DEVDOCS_SYSTEM_INSTALL", "false")) == "true" + +# Update the source knit_hook to save the chunk (if it is marked to be saved) +knit_hooks_source <- knitr::knit_hooks$get("source") +knitr::knit_hooks$set(source = function(lines, options) { + # Extra paranoia about when this will write the chunks to the script, we will + # only save when: + # * CI is true + # * RUN_DEVDOCS is true + # * options$save is TRUE (and a check that not NULL won't crash it) + if (as.logical(Sys.getenv("CI", FALSE)) && run && !is.null(options$save) && options$save) + cat(lines, file = "script.sh", append = TRUE, sep = "\n") + NULL +}) +``` + +```{bash, save=run} +# Stop on failure, echo input as we go +set -e +set -x +``` + +## R-only development + +Windows and macOS users who wish to contribute to the R package and +don’t need to alter the Arrow C++ library may be able to obtain a +recent version of the library without building from source. On macOS, +you may install the C++ library using [Homebrew](https://brew.sh/): + +``` shell +# For the released version: +brew install apache-arrow +# Or for a development version, you can try: +brew install apache-arrow --HEAD +``` + +On Windows, you can download a .zip file with the arrow dependencies from the +[nightly repository](https://arrow-r-nightly.s3.amazonaws.com/libarrow/bin/windows/), +and then set the `RWINLIB_LOCAL` environment variable to point to that +zip file before installing the `arrow` R package. Version numbers in that +repository correspond to dates, and you will likely want the most recent. + +## Developer envorinment setup + +If you need to alter both the Arrow C++ library and the R package code, or if you can’t get a binary version of the latest C++ library elsewhere, you’ll need to build it from source too. + +First, install the C++ library. 
See the [developer +guide](https://arrow.apache.org/docs/developers/cpp/building.html) for details. + +### Install dependencies {.tabset} + +These dependencies are technically only needed for S3 support(?) + +#### macOS +```{bash, save=run & macos} +brew install openssl +``` + +#### ubuntu +```{bash, save=run & ubuntu} +sudo apt install -y libcurl4-openssl-dev libssl-dev +``` + + +### Building Arrow {.tabset} + +It’s recommended to make a build directory inside of the cpp directory of the Arrow git repository (it is git-ignored). Assuming you are inside cpp/build, you’ll first call cmake to configure the build and then make install. For the R package, you’ll need to enable several features in the C++ library using -D flags: + +#### Installing to another directory + +If you would like to use arrow from an alternative directory + +```{bash, save=run & !sys_install} +export ARROW_HOME=$(pwd)/dist +export LD_LIBRARY_PATH=$(pwd)/dist/lib +export INCLUDE_DIR=$ARROW_HOME/include +export LIB_DIR=$ARROW_HOME/lib Review comment: We could add handling to `r/configure` that looks for `$ARROW_HOME` and sets these accordingly. One downside to setting INCLUDE_DIR and LIB_DIR to the arrow install dir is that some other packages' configure scripts use the same env vars, and (e.g.) curl's headers/libs are not in those directories. So it would be better to have an arrow-specific var IME. 
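A minimal sketch of what that handling might look like, assuming arrow-specific variables are introduced. `ARROW_INCLUDE_DIR`, `ARROW_LIB_DIR`, and the function name are made up for illustration; they are not variables `r/configure` is known to read today.

```shell
# Hypothetical sketch: derive arrow-specific paths from ARROW_HOME so that
# the generic INCLUDE_DIR / LIB_DIR (which other packages' configure scripts
# may also read) are left alone.
arrow_paths_from_home() {
  if [ -n "${ARROW_HOME:-}" ]; then
    # Illustrative variable names, not an existing convention.
    ARROW_INCLUDE_DIR="${ARROW_HOME}/include"
    ARROW_LIB_DIR="${ARROW_HOME}/lib"
    export ARROW_INCLUDE_DIR ARROW_LIB_DIR
  fi
}
```

When `ARROW_HOME` is unset the function does nothing, so the configure script could fall through to its existing detection logic.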
########## File path: r/vignettes/dev-docs.Rmd ########## @@ -0,0 +1,349 @@ +--- +title: "Arrow R Package Developer Documentation" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Arrow R Package Developer Documentation} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r setup options, include=FALSE} +knitr::opts_chunk$set(error = TRUE, eval = FALSE) + +# Get environment variables describing what to evaluate +run <- tolower(Sys.getenv("RUN_DEVDOCS", "false")) == "true" +macos <- tolower(Sys.getenv("DEVDOCS_MACOS", "false")) == "true" +ubuntu <- tolower(Sys.getenv("DEVDOCS_UBUNTU", "false")) == "true" +sys_install <- tolower(Sys.getenv("DEVDOCS_SYSTEM_INSTALL", "false")) == "true" + +# Update the source knit_hook to save the chunk (if it is marked to be saved) +knit_hooks_source <- knitr::knit_hooks$get("source") +knitr::knit_hooks$set(source = function(lines, options) { + # Extra paranoia about when this will write the chunks to the script, we will + # only save when: + # * CI is true + # * RUN_DEVDOCS is true + # * options$save is TRUE (and a check that not NULL won't crash it) + if (as.logical(Sys.getenv("CI", FALSE)) && run && !is.null(options$save) && options$save) + cat(lines, file = "script.sh", append = TRUE, sep = "\n") + NULL +}) +``` + +```{bash, save=run} +# Stop on failure, echo input as we go +set -e +set -x +``` + +## R-only development + +Windows and macOS users who wish to contribute to the R package and +don’t need to alter the Arrow C++ library may be able to obtain a +recent version of the library without building from source. 
On macOS, +you may install the C++ library using [Homebrew](https://brew.sh/): + +``` shell +# For the released version: +brew install apache-arrow +# Or for a development version, you can try: +brew install apache-arrow --HEAD +``` + +On Windows, you can download a .zip file with the arrow dependencies from the +[nightly repository](https://arrow-r-nightly.s3.amazonaws.com/libarrow/bin/windows/), +and then set the `RWINLIB_LOCAL` environment variable to point to that +zip file before installing the `arrow` R package. Version numbers in that +repository correspond to dates, and you will likely want the most recent. + +## Developer envorinment setup Review comment: ```suggestion ## Developer environment setup ``` ########## File path: r/vignettes/dev-docs.Rmd ########## @@ -0,0 +1,349 @@ +--- +title: "Arrow R Package Developer Documentation" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Arrow R Package Developer Documentation} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r setup options, include=FALSE} +knitr::opts_chunk$set(error = TRUE, eval = FALSE) + +# Get environment variables describing what to evaluate +run <- tolower(Sys.getenv("RUN_DEVDOCS", "false")) == "true" +macos <- tolower(Sys.getenv("DEVDOCS_MACOS", "false")) == "true" +ubuntu <- tolower(Sys.getenv("DEVDOCS_UBUNTU", "false")) == "true" +sys_install <- tolower(Sys.getenv("DEVDOCS_SYSTEM_INSTALL", "false")) == "true" + +# Update the source knit_hook to save the chunk (if it is marked to be saved) +knit_hooks_source <- knitr::knit_hooks$get("source") +knitr::knit_hooks$set(source = function(lines, options) { + # Extra paranoia about when this will write the chunks to the script, we will + # only save when: + # * CI is true + # * RUN_DEVDOCS is true + # * options$save is TRUE (and a check that not NULL won't crash it) + if (as.logical(Sys.getenv("CI", FALSE)) && run && !is.null(options$save) && options$save) + cat(lines, file = "script.sh", append 
= TRUE, sep = "\n") + NULL +}) +``` + +```{bash, save=run} +# Stop on failure, echo input as we go +set -e +set -x +``` + +## R-only development + +Windows and macOS users who wish to contribute to the R package and +don’t need to alter the Arrow C++ library may be able to obtain a +recent version of the library without building from source. On macOS, +you may install the C++ library using [Homebrew](https://brew.sh/): + +``` shell +# For the released version: +brew install apache-arrow +# Or for a development version, you can try: +brew install apache-arrow --HEAD +``` + +On Windows, you can download a .zip file with the arrow dependencies from the +[nightly repository](https://arrow-r-nightly.s3.amazonaws.com/libarrow/bin/windows/), +and then set the `RWINLIB_LOCAL` environment variable to point to that +zip file before installing the `arrow` R package. Version numbers in that +repository correspond to dates, and you will likely want the most recent. + +## Developer envorinment setup + +If you need to alter both the Arrow C++ library and the R package code, or if you can’t get a binary version of the latest C++ library elsewhere, you’ll need to build it from source too. + +First, install the C++ library. See the [developer +guide](https://arrow.apache.org/docs/developers/cpp/building.html) for details. + +### Install dependencies {.tabset} + +These dependencies are technically only needed for S3 support(?) + +#### macOS +```{bash, save=run & macos} +brew install openssl +``` + +#### ubuntu +```{bash, save=run & ubuntu} +sudo apt install -y libcurl4-openssl-dev libssl-dev +``` + + +### Building Arrow {.tabset} + +It’s recommended to make a build directory inside of the cpp directory of the Arrow git repository (it is git-ignored). Assuming you are inside cpp/build, you’ll first call cmake to configure the build and then make install. 
For the R package, you’ll need to enable several features in the C++ library using -D flags: Review comment: ```suggestion It’s recommended to make a `build` directory inside of the `cpp` directory of the Arrow git repository (it is git-ignored). Assuming you are inside `cpp/build`, you’ll first call `cmake` to configure the build and then `make install`. For the R package, you’ll need to enable several features in the C++ library using `-D` flags: ``` `make install` is correct because the default generator for cmake is `Unix Makefiles`. You aren't (presumably) going to get into using alternative generators here (ninja etc.), but if you wanted to have the command that will work regardless of whether you use ninja, msvc, etc., it would be `cmake --build . --target install` ########## File path: r/vignettes/dev-docs.Rmd ########## @@ -0,0 +1,349 @@ +--- +title: "Arrow R Package Developer Documentation" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Arrow R Package Developer Documentation} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r setup options, include=FALSE} +knitr::opts_chunk$set(error = TRUE, eval = FALSE) + +# Get environment variables describing what to evaluate +run <- tolower(Sys.getenv("RUN_DEVDOCS", "false")) == "true" +macos <- tolower(Sys.getenv("DEVDOCS_MACOS", "false")) == "true" +ubuntu <- tolower(Sys.getenv("DEVDOCS_UBUNTU", "false")) == "true" +sys_install <- tolower(Sys.getenv("DEVDOCS_SYSTEM_INSTALL", "false")) == "true" + +# Update the source knit_hook to save the chunk (if it is marked to be saved) +knit_hooks_source <- knitr::knit_hooks$get("source") +knitr::knit_hooks$set(source = function(lines, options) { + # Extra paranoia about when this will write the chunks to the script, we will + # only save when: + # * CI is true + # * RUN_DEVDOCS is true + # * options$save is TRUE (and a check that not NULL won't crash it) + if (as.logical(Sys.getenv("CI", FALSE)) && run && 
!is.null(options$save) && options$save) + cat(lines, file = "script.sh", append = TRUE, sep = "\n") + NULL +}) +``` + +```{bash, save=run} +# Stop on failure, echo input as we go +set -e +set -x +``` + +## R-only development + +Windows and macOS users who wish to contribute to the R package and +don’t need to alter the Arrow C++ library may be able to obtain a +recent version of the library without building from source. On macOS, +you may install the C++ library using [Homebrew](https://brew.sh/): + +``` shell +# For the released version: +brew install apache-arrow +# Or for a development version, you can try: +brew install apache-arrow --HEAD +``` + +On Windows, you can download a .zip file with the arrow dependencies from the +[nightly repository](https://arrow-r-nightly.s3.amazonaws.com/libarrow/bin/windows/), +and then set the `RWINLIB_LOCAL` environment variable to point to that +zip file before installing the `arrow` R package. Version numbers in that +repository correspond to dates, and you will likely want the most recent. + +## Developer envorinment setup + +If you need to alter both the Arrow C++ library and the R package code, or if you can’t get a binary version of the latest C++ library elsewhere, you’ll need to build it from source too. + +First, install the C++ library. See the [developer +guide](https://arrow.apache.org/docs/developers/cpp/building.html) for details. + +### Install dependencies {.tabset} + +These dependencies are technically only needed for S3 support(?) + +#### macOS +```{bash, save=run & macos} +brew install openssl +``` + +#### ubuntu +```{bash, save=run & ubuntu} +sudo apt install -y libcurl4-openssl-dev libssl-dev +``` + + +### Building Arrow {.tabset} + +It’s recommended to make a build directory inside of the cpp directory of the Arrow git repository (it is git-ignored). Assuming you are inside cpp/build, you’ll first call cmake to configure the build and then make install. 
For the R package, you’ll need to enable several features in the C++ library using -D flags: + +#### Installing to another directory + +If you would like to use arrow from an alternative directory + +```{bash, save=run & !sys_install} +export ARROW_HOME=$(pwd)/dist Review comment: I think you need to explain where "dist" comes from (and note that it is arbitrary, you can call it whatever you want) ########## File path: r/vignettes/dev-docs.Rmd ########## @@ -0,0 +1,349 @@ +--- +title: "Arrow R Package Developer Documentation" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Arrow R Package Developer Documentation} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r setup options, include=FALSE} +knitr::opts_chunk$set(error = TRUE, eval = FALSE) + +# Get environment variables describing what to evaluate +run <- tolower(Sys.getenv("RUN_DEVDOCS", "false")) == "true" +macos <- tolower(Sys.getenv("DEVDOCS_MACOS", "false")) == "true" +ubuntu <- tolower(Sys.getenv("DEVDOCS_UBUNTU", "false")) == "true" +sys_install <- tolower(Sys.getenv("DEVDOCS_SYSTEM_INSTALL", "false")) == "true" + +# Update the source knit_hook to save the chunk (if it is marked to be saved) +knit_hooks_source <- knitr::knit_hooks$get("source") +knitr::knit_hooks$set(source = function(lines, options) { + # Extra paranoia about when this will write the chunks to the script, we will + # only save when: + # * CI is true + # * RUN_DEVDOCS is true + # * options$save is TRUE (and a check that not NULL won't crash it) + if (as.logical(Sys.getenv("CI", FALSE)) && run && !is.null(options$save) && options$save) + cat(lines, file = "script.sh", append = TRUE, sep = "\n") + NULL +}) +``` + +```{bash, save=run} +# Stop on failure, echo input as we go +set -e +set -x +``` + +## R-only development + +Windows and macOS users who wish to contribute to the R package and +don’t need to alter the Arrow C++ library may be able to obtain a +recent version of the library without 
building from source. On macOS, +you may install the C++ library using [Homebrew](https://brew.sh/): + +``` shell +# For the released version: +brew install apache-arrow +# Or for a development version, you can try: +brew install apache-arrow --HEAD +``` + +On Windows, you can download a .zip file with the arrow dependencies from the +[nightly repository](https://arrow-r-nightly.s3.amazonaws.com/libarrow/bin/windows/), +and then set the `RWINLIB_LOCAL` environment variable to point to that +zip file before installing the `arrow` R package. Version numbers in that +repository correspond to dates, and you will likely want the most recent. + +## Developer envorinment setup + +If you need to alter both the Arrow C++ library and the R package code, or if you can’t get a binary version of the latest C++ library elsewhere, you’ll need to build it from source too. + +First, install the C++ library. See the [developer +guide](https://arrow.apache.org/docs/developers/cpp/building.html) for details. + +### Install dependencies {.tabset} + +These dependencies are technically only needed for S3 support(?) + +#### macOS +```{bash, save=run & macos} +brew install openssl +``` + +#### ubuntu +```{bash, save=run & ubuntu} +sudo apt install -y libcurl4-openssl-dev libssl-dev +``` + + +### Building Arrow {.tabset} + +It’s recommended to make a build directory inside of the cpp directory of the Arrow git repository (it is git-ignored). Assuming you are inside cpp/build, you’ll first call cmake to configure the build and then make install. 
For the R package, you’ll need to enable several features in the C++ library using -D flags: + +#### Installing to another directory + +If you would like to use arrow from an alternative directory + +```{bash, save=run & !sys_install} +export ARROW_HOME=$(pwd)/dist +export LD_LIBRARY_PATH=$(pwd)/dist/lib +export INCLUDE_DIR=$ARROW_HOME/include +export LIB_DIR=$ARROW_HOME/lib +``` + +TODO: how to add these vars separated by colons if they have other things in them already (eg LD_LIBRARY_PATH=$(pwd)/dist/lib:$LD_LIBRARY_PATH) + +On linux, you will need to set `LD_LIBRARY_PATH` to this same directory before launching R and using Arrow. One way to do this is to add it to your profile. # TODO: do we want to recommend this? Is there any other alternative? Review comment: I think you also want to explain why you wouldn't need to do this on macOS (something something rpath). Alternatives: probably setting something in R's Makevars ########## File path: r/vignettes/dev-docs.Rmd ########## @@ -0,0 +1,349 @@ +--- +title: "Arrow R Package Developer Documentation" +output: rmarkdown::html_vignette +vignette: > + %\VignetteIndexEntry{Arrow R Package Developer Documentation} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r setup options, include=FALSE} +knitr::opts_chunk$set(error = TRUE, eval = FALSE) + +# Get environment variables describing what to evaluate +run <- tolower(Sys.getenv("RUN_DEVDOCS", "false")) == "true" +macos <- tolower(Sys.getenv("DEVDOCS_MACOS", "false")) == "true" +ubuntu <- tolower(Sys.getenv("DEVDOCS_UBUNTU", "false")) == "true" +sys_install <- tolower(Sys.getenv("DEVDOCS_SYSTEM_INSTALL", "false")) == "true" + +# Update the source knit_hook to save the chunk (if it is marked to be saved) +knit_hooks_source <- knitr::knit_hooks$get("source") +knitr::knit_hooks$set(source = function(lines, options) { + # Extra paranoia about when this will write the chunks to the script, we will + # only save when: + # * CI is true + # * 
RUN_DEVDOCS is true + # * options$save is TRUE (and a check that not NULL won't crash it) + if (as.logical(Sys.getenv("CI", FALSE)) && run && !is.null(options$save) && options$save) + cat(lines, file = "script.sh", append = TRUE, sep = "\n") + NULL +}) +``` + +```{bash, save=run} +# Stop on failure, echo input as we go +set -e +set -x +``` + +## R-only development + +Windows and macOS users who wish to contribute to the R package and +don’t need to alter the Arrow C++ library may be able to obtain a +recent version of the library without building from source. On macOS, +you may install the C++ library using [Homebrew](https://brew.sh/): + +``` shell +# For the released version: +brew install apache-arrow +# Or for a development version, you can try: +brew install apache-arrow --HEAD +``` + +On Windows, you can download a .zip file with the arrow dependencies from the +[nightly repository](https://arrow-r-nightly.s3.amazonaws.com/libarrow/bin/windows/), +and then set the `RWINLIB_LOCAL` environment variable to point to that +zip file before installing the `arrow` R package. Version numbers in that +repository correspond to dates, and you will likely want the most recent. + +## Developer envorinment setup + +If you need to alter both the Arrow C++ library and the R package code, or if you can’t get a binary version of the latest C++ library elsewhere, you’ll need to build it from source too. + +First, install the C++ library. See the [developer +guide](https://arrow.apache.org/docs/developers/cpp/building.html) for details. + +### Install dependencies {.tabset} + +These dependencies are technically only needed for S3 support(?) + +#### macOS +```{bash, save=run & macos} +brew install openssl +``` + +#### ubuntu +```{bash, save=run & ubuntu} +sudo apt install -y libcurl4-openssl-dev libssl-dev +``` + + +### Building Arrow {.tabset} + +It’s recommended to make a build directory inside of the cpp directory of the Arrow git repository (it is git-ignored). 
Assuming you are inside cpp/build, you’ll first call cmake to configure the build and then make install. For the R package, you’ll need to enable several features in the C++ library using -D flags:
+
+#### Installing to another directory
+
+If you would like to use arrow from an alternative directory
+
+```{bash, save=run & !sys_install}
+export ARROW_HOME=$(pwd)/dist
+export LD_LIBRARY_PATH=$(pwd)/dist/lib
+export INCLUDE_DIR=$ARROW_HOME/include
+export LIB_DIR=$ARROW_HOME/lib
+```
+
+TODO: how to add these vars separated by colons if they have other things in them already (eg LD_LIBRARY_PATH=$(pwd)/dist/lib:$LD_LIBRARY_PATH)

Review comment:
Yes, you definitely don't want to overwrite LD_LIBRARY_PATH like your code currently does, I think that can cause unexpected breakage.

##########
File path: r/vignettes/dev-docs.Rmd
##########
@@ -0,0 +1,349 @@
[...]
+On linux, you will need to set `LD_LIBRARY_PATH` to this same directory before launching R and using Arrow. One way to do this is to add it to your profile. # TODO: do we want to recommend this? Is there any other alternative?
+
+```{bash, save=run & ubuntu & !sys_install}
+echo "export LD_LIBRARY_PATH=$(pwd)/dist/lib" >> ~/.bash_profile
+```
+
+```{bash, save=run}
+cd arrow/cpp
+mkdir build
+cd build
+```
+
+```{bash, save=run & !sys_install}
+mkdir $ARROW_HOME
+
+cmake \
+  -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
+  -DCMAKE_INSTALL_LIBDIR=lib \
+  -DARROW_COMPUTE=ON \
+  -DARROW_CSV=ON \
+  -DARROW_DATASET=ON \
+  -DARROW_FILESYSTEM=ON \
+  -DARROW_JEMALLOC=ON \
+  -DARROW_JSON=ON \
+  -DARROW_PARQUET=ON \
+  -DCMAKE_BUILD_TYPE=release \
+  -DARROW_WITH_SNAPPY=ON \
+  -DARROW_WITH_ZLIB=ON \
+  -DARROW_INSTALL_NAME_RPATH=OFF \
+  -DThrift_SOURCE=BUNDLED \
+  ..
+```
+
+#### Installing to the system

Review comment:
I would lead with this one and then what you're doing is adding `-DCMAKE_INSTALL_PREFIX=$ARROW_HOME`

##########
File path: r/vignettes/dev-docs.Rmd
##########
@@ -0,0 +1,349 @@
[...]
+#### Installing to another directory
+
+If you would like to use arrow from an alternative directory
+
+```{bash, save=run & !sys_install}
+export ARROW_HOME=$(pwd)/dist
+export LD_LIBRARY_PATH=$(pwd)/dist/lib

Review comment:
```suggestion
export LD_LIBRARY_PATH="${ARROW_HOME}/lib:$LD_LIBRARY_PATH"
```

##########
File path: r/vignettes/dev-docs.Rmd
##########
@@ -0,0 +1,349 @@
[...]
+#### Installing to the system
+
+If you would like to install Arrow as a system library you can
+
+```{bash, save=run & sys_install}
+cmake \
+  -DARROW_COMPUTE=ON \
+  -DARROW_CSV=ON \
+  -DARROW_DATASET=ON \
+  -DARROW_FILESYSTEM=ON \
+  -DARROW_JEMALLOC=ON \
+  -DARROW_JSON=ON \
+  -DARROW_PARQUET=ON \
+  -DCMAKE_BUILD_TYPE=release \

Review comment:
I'd leave this one off of the main list too; release is default.
You can suggest "debug" or "relwithdebinfo" as options for later (you don't want them normally because they're much slower, but sometimes you need them if you're debugging).
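
On the `LD_LIBRARY_PATH` TODO raised earlier in this review: prepending with a plain `dir:$LD_LIBRARY_PATH` leaves a trailing colon when the variable was empty, and the dynamic loader reads an empty entry as the current directory. A minimal sketch of one way to guard against that, assuming a POSIX shell; `prepend_dir` is a hypothetical helper name used only for illustration, not something in the Arrow repository, and the `ARROW_HOME` default simply mirrors the vignette:

```shell
# Hypothetical helper (illustration only): print DIR prepended to a
# colon-separated LIST, adding the colon only when LIST is non-empty.
prepend_dir() {
  # $1 = directory to add, $2 = existing list (may be empty)
  printf '%s%s\n' "$1" "${2:+:$2}"
}

# ARROW_HOME follows the vignette; the fallback value is an assumption
# made so that this sketch runs on its own.
ARROW_HOME="${ARROW_HOME:-$(pwd)/dist}"

# Prepend instead of overwriting, keeping whatever was already set.
LD_LIBRARY_PATH="$(prepend_dir "${ARROW_HOME}/lib" "${LD_LIBRARY_PATH:-}")"
export LD_LIBRARY_PATH
```

`${var:+word}` is standard POSIX parameter expansion, so this should behave the same in `sh` and `bash`; the same expression could be used inline in the `export` line if a helper feels like too much for the vignette.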