jonkeane commented on a change in pull request #9743:
URL: https://github.com/apache/arrow/pull/9743#discussion_r606244216



##########
File path: r/R/util.R
##########
@@ -45,3 +45,16 @@ is_list_of <- function(object, class) {
 }
 
 empty_named_list <- function() structure(list(), .Names = character(0))
+
+read_compressed_error <- function(e) {
+  e <- as.character(e)
+  alg <- regmatches(e, gregexpr("(?<=\')(.*?)(?=\')", e, perl = TRUE))[[1]]
+  msg <- paste("Unsupported compressed format", alg,
+    "\nPlease visit https://arrow.apache.org/docs/r/articles/install.html";,
+    "\nfor an explanation about optional features such as compression 
libraries enabled.",

Review comment:
       ```suggestion
       "\nfor information about troubleshooting installation issues.",
   ```

##########
File path: r/R/util.R
##########
@@ -45,3 +45,16 @@ is_list_of <- function(object, class) {
 }
 
 empty_named_list <- function() structure(list(), .Names = character(0))
+
+read_compressed_error <- function(e) {
+  e <- as.character(e)
+  alg <- regmatches(e, gregexpr("(?<=\')(.*?)(?=\')", e, perl = TRUE))[[1]]
+  msg <- paste("Unsupported compressed format", alg,
+    "\nPlease visit https://arrow.apache.org/docs/r/articles/install.html";,
+    "\nfor an explanation about optional features such as compression 
libraries enabled.",
+    "\nSetting LIBARROW_MINIMAL=false and then building the package from 
source fixes this,",
+    sprintf("\nor building libarrow with -DARROW_WITH_%s=ON and reinstalling 
the package.",
+            toupper(alg))
+    )

Review comment:
       `stop` will `paste` these together, so you don't need to wrap this whole 
thing in `paste`.

##########
File path: r/R/feather.R
##########
@@ -154,7 +154,15 @@ read_feather <- function(file, col_select = NULL, 
as_data_frame = TRUE, ...) {
     vars_select(names(reader), !!col_select)
   }
 
-  out <- reader$Read(columns)
+  out <- tryCatch(

Review comment:
       This tryCatch + function is an improvement, but is there a reason you 
didn't include the `tryCatch` in your function instead of repeating it? 
Something like
   
   ```
   ensure_compression_available <- function(expr) {
     tryCatch(
       expr,
       error = ...
     )
   }
   
   ensure_compression_available(reader$Read(columns))
   ```
   
   It would also be nice to have a slightly more clear name for what the 
function is doing (like `ensure_compression_available`) instead of 
`ensure_compression_available`.

##########
File path: r/R/util.R
##########
@@ -45,3 +45,16 @@ is_list_of <- function(object, class) {
 }
 
 empty_named_list <- function() structure(list(), .Names = character(0))
+
+read_compressed_error <- function(e) {
+  e <- as.character(e)
+  alg <- regmatches(e, gregexpr("(?<=\')(.*?)(?=\')", e, perl = TRUE))[[1]]
+  msg <- paste("Unsupported compressed format", alg,
+    "\nPlease visit https://arrow.apache.org/docs/r/articles/install.html";,
+    "\nfor an explanation about optional features such as compression 
libraries enabled.",
+    "\nSetting LIBARROW_MINIMAL=false and then building the package from 
source fixes this,",
+    sprintf("\nor building libarrow with -DARROW_WITH_%s=ON and reinstalling 
the package.",

Review comment:
       These two lines should be:
   
   "Try setting the environment variable LIBARROW_MINIMAL=false and 
reinstalling for a more complete installation (including {compression}) or 
setting ARROW_WITH_{compression}=ON and reinstalling to enable support for this 
codec."
   
   Note that for everyone who is not a developer in this situation, they will 
be end up triggering `inst/build_arrow_static.sh` which includes a line that 
[checks that environment variable specifically to enable the correct cmake 
flag](https://github.com/apache/arrow/blob/master/r/inst/build_arrow_static.sh#L63).
 We want to make this as easy as possible for someone so we want to recommend 
they do that instead of asking them to use cmake themselves, learn about all 
the flags, etc. 
   
   You should try this out yourself by setting `ARROW_WITH_LZ4=ON` and then 
using `install.packages("arrow")` and then checking that lz4 is enabled (or any 
of the other flags that we turn on/off for our minimal builds. 

##########
File path: r/R/util.R
##########
@@ -45,3 +45,16 @@ is_list_of <- function(object, class) {
 }
 
 empty_named_list <- function() structure(list(), .Names = character(0))
+
+read_compressed_error <- function(e) {
+  e <- as.character(e)
+  alg <- regmatches(e, gregexpr("(?<=\')(.*?)(?=\')", e, perl = TRUE))[[1]]
+  msg <- paste("Unsupported compressed format", alg,
+    "\nPlease visit https://arrow.apache.org/docs/r/articles/install.html";,
+    "\nfor an explanation about optional features such as compression 
libraries enabled.",
+    "\nSetting LIBARROW_MINIMAL=false and then building the package from 
source fixes this,",
+    sprintf("\nor building libarrow with -DARROW_WITH_%s=ON and reinstalling 
the package.",
+            toupper(alg))
+    )
+  stop(msg)

Review comment:
       Could you add `call. = FALSE` here so the call isn't printed — I don't 
think it will be particularly useful here. 

##########
File path: r/tests/testthat/test-parquet.R
##########
@@ -234,3 +234,16 @@ test_that("ParquetFileReader $ReadRowGroup(s) methods", {
   expect_true(reader$ReadRowGroups(c(0, 1), 0) == Table$create(x = 1:20))
   expect_error(reader$ReadRowGroups(c(0, 1), 1))
 })
+
+test_that("Error messages are shown when the compression algorithm lz4/snappy
+          is not found", {
+  if (codec_is_available("snappy")) {
+    d <- read_parquet(pq_file)
+    expect_is(d, "data.frame")
+  } else {
+    expect_error(
+      read_parquet(pq_file),
+      "Unsupported compressed format"
+    )

Review comment:
       I don't think that the `else` branch of this test will ever be 
exercised. On our minimal build test we disable (among other things) both 
[snappy](https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=2724&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=6c939d89-0d1a-51f2-8b30-091a7a82e98c&l=309)
 and 
[parquet](https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=2724&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=6c939d89-0d1a-51f2-8b30-091a7a82e98c&l=248)
 on that build. And then also note that [this entire file is skipped if parquet 
is not 
available.](https://github.com/apache/arrow/blob/master/r/tests/testthat/test-parquet.R#L18).
   
   Though, if this test did run in an environment where both parquet and snappy 
are disabled, it shouldn't be caught and talk about compression codecs, it 
should mention that parquet support was not built instead (see [my comment 
above](https://github.com/apache/arrow/pull/9743/files#r605862157) about that).
   

##########
File path: r/tests/testthat/test-feather.R
##########
@@ -196,3 +196,18 @@ test_that("Character vectors > 2GB can write to feather", {
 })
 
 unlink(feather_file)
+
+ft_file <- system.file("v0.7.1.feather", package = "arrow")

Review comment:
       It looks like you added this feather file? That's shouldn't actually be 
necessary. We have a number of feather files with different compression codecs 
at https://github.com/apache/arrow/tree/master/r/tests/testthat/golden-files. 
Please use one those if possible.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to