Thanks for the explanation. I'll keep my eye out for a new binary soon.

On Mon, Nov 1, 2021 at 1:23 PM Neal Richardson <[email protected]>
wrote:

> Thanks for the details. I see you're using RStudio Package Manager. There
> was an issue with the binaries that RSPM built for 6.0.0.2. We've been
> discussing it with them and they should be fixing it on their side, so this
> should resolve itself soon (if it isn't already resolved).
>
> Neal
>
>
> On Mon, Nov 1, 2021 at 1:36 PM Chris Berthiaume <[email protected]> wrote:
>
>> Hi Neal,
>>
>> Here's a reproducible example using a fresh Docker container for
>> bioconductor/bioconductor_docker:RELEASE_3_13. I start the container, start
>> R, install arrow, attach arrow, then try to read a simple parquet file I
>> just now created separately in RStudio on macOS with arrow 5.0.0. This
>> fails. I stop/start R again, install arrow 5.0.0.2 with
>> devtools::install_version(), attach, then verify that I can successfully
>> read the same parquet file.
>>
>> I've changed the R prompt character below from ">" to "$" to prevent any
>> text from being interpreted as an email reply.
>>
>> # Creating the parquet file in RStudio on macOS
>> $ x <- data.frame(A=seq(0, 2), B=seq(10,12))
>> $ x
>>   A  B
>> 1 0 10
>> 2 1 11
>> 3 2 12
>> $ arrow::write_parquet(x, "~/Desktop/arrowtest/x.parquet")
>>
>> # Run the test in a docker container
>> docker run -it --rm -v ~/Desktop/arrowtest:/data bioconductor/bioconductor_docker:RELEASE_3_13 bash
>> root@5fa84c3f4a41:/# cd /data
>> root@5fa84c3f4a41:/data# R
>>
>> R version 4.1.1 (2021-08-10) -- "Kick Things"
>> Copyright (C) 2021 The R Foundation for Statistical Computing
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>> You are welcome to redistribute it under certain conditions.
>> Type 'license()' or 'licence()' for distribution details.
>>
>> R is a collaborative project with many contributors.
>> Type 'contributors()' for more information and
>> 'citation()' on how to cite R or R packages in publications.
>>
>> Type 'demo()' for some demos, 'help()' for on-line help, or
>> 'help.start()' for an HTML browser interface to help.
>> Type 'q()' to quit R.
>>
>> $ install.packages('arrow')
>> Installing package into ‘/usr/local/lib/R/site-library’
>> (as ‘lib’ is unspecified)
>> also installing the dependencies ‘bit’, ‘assertthat’, ‘bit64’
>>
>> trying URL 'https://packagemanager.rstudio.com/all/__linux__/focal/latest/src/contrib/bit_4.0.4.tar.gz'
>> Content type 'binary/octet-stream' length 691644 bytes (675 KB)
>> ==================================================
>> downloaded 675 KB
>>
>> trying URL 'https://packagemanager.rstudio.com/all/__linux__/focal/latest/src/contrib/assertthat_0.2.1.tar.gz'
>> Content type 'binary/octet-stream' length 52329 bytes (51 KB)
>> ==================================================
>> downloaded 51 KB
>>
>> trying URL 'https://packagemanager.rstudio.com/all/__linux__/focal/latest/src/contrib/bit64_4.0.5.tar.gz'
>> Content type 'binary/octet-stream' length 573106 bytes (559 KB)
>> ==================================================
>> downloaded 559 KB
>>
>> trying URL 'https://packagemanager.rstudio.com/all/__linux__/focal/latest/src/contrib/arrow_6.0.0.2.tar.gz'
>> Content type 'binary/octet-stream' length 23646684 bytes (22.6 MB)
>> ==================================================
>> downloaded 22.6 MB
>>
>> * installing *binary* package ‘bit’ ...
>> * DONE (bit)
>> * installing *binary* package ‘assertthat’ ...
>> * DONE (assertthat)
>> * installing *binary* package ‘bit64’ ...
>> * DONE (bit64)
>> * installing *binary* package ‘arrow’ ...
>> * DONE (arrow)
>>
>> The downloaded source packages are in
>> ‘/tmp/Rtmp8HkDvX/downloaded_packages’
>> $ library(arrow)
>> See arrow_info() for available features
>>
>> Attaching package: ‘arrow’
>>
>> The following object is masked from ‘package:utils’:
>>
>>     timestamp
>>
>> $ sessionInfo()
>> R version 4.1.1 (2021-08-10)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> Running under: Ubuntu 20.04.3 LTS
>>
>> Matrix products: default
>> BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so
>>
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C
>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] arrow_6.0.0.2
>>
>> loaded via a namespace (and not attached):
>>  [1] tidyselect_1.1.1 bit_4.0.4        compiler_4.1.1   magrittr_2.0.1
>>  [5] assertthat_0.2.1 R6_2.5.1         tools_4.1.1      glue_1.4.2
>>  [9] bit64_4.0.5      vctrs_0.3.8      rlang_0.4.11     purrr_0.3.4
>> $ read_parquet("x.parquet")
>> Error: NotImplemented: Support for codec 'snappy' not built
>> In order to read this file, you will need to reinstall arrow with
>> additional features enabled.
>> Set one of these environment variables before installing:
>>
>>  * LIBARROW_MINIMAL=false (for all optional features, including 'snappy')
>>  * ARROW_WITH_SNAPPY=ON (for just 'snappy')
>>
>> See https://arrow.apache.org/docs/r/articles/install.html for details
>>
>> root@5fa84c3f4a41:/data# R
>>
>> R version 4.1.1 (2021-08-10) -- "Kick Things"
>> Copyright (C) 2021 The R Foundation for Statistical Computing
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>> You are welcome to redistribute it under certain conditions.
>> Type 'license()' or 'licence()' for distribution details.
>>
>> R is a collaborative project with many contributors.
>> Type 'contributors()' for more information and
>> 'citation()' on how to cite R or R packages in publications.
>>
>> Type 'demo()' for some demos, 'help()' for on-line help, or
>> 'help.start()' for an HTML browser interface to help.
>> Type 'q()' to quit R.
>>
>> $ devtools::install_version("arrow", "5.0.0.2")
>> Downloading package from url:
>> https://packagemanager.rstudio.com/all/__linux__/focal/latest/src/contrib/Archive/arrow/arrow_5.0.0.2.tar.gz
>> These packages have more recent versions available.
>> It is recommended to update all of them.
>> Which would you like to update?
>>
>> 1: All
>> 2: CRAN packages only
>> 3: None
>> 4: rlang (0.4.11 -> 0.4.12) [CRAN]
>>
>> Enter one or more numbers, or an empty line to skip updates:
>> Installing package into ‘/usr/local/lib/R/site-library’
>> (as ‘lib’ is unspecified)
>> * installing *binary* package ‘arrow’ ...
>> * DONE (arrow)
>> $ library(arrow)
>>
>> Attaching package: ‘arrow’
>>
>> The following object is masked from ‘package:utils’:
>>
>>     timestamp
>>
>> $ sessionInfo()
>> R version 4.1.1 (2021-08-10)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> Running under: Ubuntu 20.04.3 LTS
>>
>> Matrix products: default
>> BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so
>>
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C
>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] arrow_5.0.0.2
>>
>> loaded via a namespace (and not attached):
>>  [1] magrittr_2.0.1    usethis_2.0.1     devtools_2.4.2    tidyselect_1.1.1
>>  [5] bit_4.0.4         pkgload_1.2.2     R6_2.5.1          rlang_0.4.11
>>  [9] fastmap_1.1.0     tools_4.1.1       pkgbuild_1.2.0    sessioninfo_1.1.1
>> [13] cli_3.0.1         withr_2.4.2       ellipsis_0.3.2    remotes_2.4.0
>> [17] bit64_4.0.5       rprojroot_2.0.2   assertthat_0.2.1  lifecycle_1.0.1
>> [21] crayon_1.4.1      processx_3.5.2    purrr_0.3.4       callr_3.7.0
>> [25] vctrs_0.3.8       fs_1.5.0          ps_1.6.0          testthat_3.0.4
>> [29] memoise_2.0.0     glue_1.4.2        cachem_1.0.6      compiler_4.1.1
>> [33] desc_1.3.0        prettyunits_1.1.1
>> $ read_parquet("x.parquet")
>>   A  B
>> 1 0 10
>> 2 1 11
>> 3 2 12
>>
>> On Mon, Nov 1, 2021 at 7:05 AM Neal Richardson <[email protected]> wrote:
>>
>>> Hi Chris,
>>> Could you share the output from when you installed the package? Snappy
>>> and the other compression libraries should be on in the binaries (see
>>> https://github.com/ursa-labs/arrow-r-nightly/runs/4052316735?check_suite_focus=true#step:4:625
>>> for example), so I'm curious if there's anything in the install logs that
>>> would help us understand what's up.
>>>
>>> Neal
>>>
>>> On Sun, Oct 31, 2021 at 7:06 PM Chris Berthiaume <[email protected]>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> After upgrading Arrow 5.0.0.2 to 6.0.0.2 in a Bioconductor 3.13 Docker
>>>> container, I started to see some new errors when reading Parquet files that
>>>> use snappy compression. I'm using the prebuilt Linux binary by setting
>>>> LIBARROW_BINARY=true during installation. Building arrow using the latest
>>>> nightly source fixes the issue. Is it possible the 6.0.0.2 prebuilt Linux
>>>> binary does not have snappy compression support enabled? The error is
>>>> copied below.
>>>>
>>>> Error: NotImplemented: Support for codec 'snappy' not built
>>>> In order to read this file, you will need to reinstall arrow with
>>>> additional features enabled.
>>>> Set one of these environment variables before installing:
>>>>
>>>>  * LIBARROW_MINIMAL=false (for all optional features, including 'snappy')
>>>>  * ARROW_WITH_SNAPPY=ON (for just 'snappy')
>>>>
>>>> See https://arrow.apache.org/docs/r/articles/install.html for details
>>>> Backtrace:
>>>>  1. popcycle::get.vct.by.file(db, vct_dir, "2018_176/2018-06-25T20-03-48+00-00") test_files.R:210:2
>>>>  4. arrow::read_parquet(...)
>>>>  5. base::tryCatch(reader$ReadTable(), error = read_compressed_error)
>>>>  6. base:::tryCatchList(expr, classes, parentenv, handlers)
>>>>  7. base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
>>>>  8. value[[3L]](cond)
>>>>
>>>> Thanks,
>>>> Chris Berthiaume
>>>>
>>>
