[jira] [Resolved] (ARROW-14206) [Go] Fix Build for ARM and s390x
[ https://issues.apache.org/jira/browse/ARROW-14206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Topol resolved ARROW-14206. --- Resolution: Fixed Issue resolved by pull request 11299 [https://github.com/apache/arrow/pull/11299] > [Go] Fix Build for ARM and s390x > > > Key: ARROW-14206 > URL: https://issues.apache.org/jira/browse/ARROW-14206 > Project: Apache Arrow > Issue Type: Bug > Components: Continuous Integration, Go >Affects Versions: 6.0.0 >Reporter: Matthew Topol >Assignee: Matthew Topol >Priority: Major > Labels: pull-request-available > Fix For: 6.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-14210) [C++] CMAKE_AR is not passed to bzip2 thirdparty dependency
Karl Dunkle Werner created ARROW-14210: -- Summary: [C++] CMAKE_AR is not passed to bzip2 thirdparty dependency Key: ARROW-14210 URL: https://issues.apache.org/jira/browse/ARROW-14210 Project: Apache Arrow Issue Type: Bug Components: C++ Affects Versions: 5.0.0 Reporter: Karl Dunkle Werner It seems like the {{AR}} or {{CMAKE_AR}} variables aren't getting passed for the bzip2 build, which causes it to fail if we're doing a {{BUNDLED}} build and {{ar}} isn't available in the {{$PATH}} (e.g. in a conda environment). To replicate: 1. Download Arrow and start an interactive shell in a container (docker should be fine if you prefer it to podman)
{code:sh}
git clone --depth 1 g...@github.com:apache/arrow.git
podman run -it --rm -v ./arrow:/arrow:Z docker://ursalab/amd64-ubuntu-18.04-conda-python-3.6:worker bash
{code}
2. Build Arrow by running this in the container:
{code:sh}
export ARROW_BUILD_TOOLCHAIN=$CONDA_PREFIX
export ARROW_HOME=$CONDA_PREFIX
export PARQUET_HOME=$CONDA_PREFIX
cd /arrow
mkdir -p cpp/build
pushd cpp/build
cmake \
  -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
  -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
  -DCMAKE_AR=${AR} \
  -DCMAKE_RANLIB=${RANLIB} \
  -DARROW_WITH_BZ2=ON \
  -DARROW_VERBOSE_THIRDPARTY_BUILD=ON \
  -DARROW_JEMALLOC=OFF \
  -DARROW_SIMD_LEVEL=NONE -DARROW_RUNTIME_SIMD_LEVEL=NONE \
  -DARROW_DEPENDENCY_SOURCE=BUNDLED \
  ..
make
# make[3]: ar: No such file or directory
# make[3]: *** [Makefile:48: libbz2.a] Error 127
# make[2]: *** [CMakeFiles/bzip2_ep.dir/build.make:135: bzip2_ep-prefix/src/bzip2_ep-stamp/bzip2_ep-build] Error 2
# make[1]: *** [CMakeFiles/Makefile2:726: CMakeFiles/bzip2_ep.dir/all] Error 2
{code}
In the cmake call above, {{ARROW_JEMALLOC}} and the SIMD flags are just to skip compiling irrelevant things. I think this line in {{ThirdpartyToolchain.cmake}} needs to be changed to pass {{CMAKE_AR}}:
[https://github.com/apache/arrow/blob/bad8824d5cda0fd8337c7167729c49af868f93a5/cpp/cmake_modules/ThirdpartyToolchain.cmake#L2211] Other related issues have also needed to pass {{CMAKE_RANLIB}}, in addition to {{CMAKE_AR}}. I'm not sure if that applies here. Related: ARROW-4471, ARROW-4831 -- This message was sent by Atlassian Jira (v8.3.4#803005)
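For illustration, a minimal sketch of the kind of change being suggested, forwarding the configured archiver and ranlib into bzip2's plain Makefile build so it no longer depends on a bare {{ar}} being on the PATH. This is a hypothetical fragment, not the actual code in {{ThirdpartyToolchain.cmake}}; the variable names ({{BZIP2_EXTRA_ARGS}}, {{ARROW_BZIP2_SOURCE_URL}}, {{MAKE_BUILD_ARGS}}, {{EP_LOG_OPTIONS}}) are assumptions for the sketch.

```cmake
# Hypothetical sketch only: pass the toolchain's archiver and ranlib through
# to bzip2's Makefile-based build instead of letting make fall back to `ar`.
set(BZIP2_EXTRA_ARGS
    CC=${CMAKE_C_COMPILER}
    AR=${CMAKE_AR}
    RANLIB=${CMAKE_RANLIB})

externalproject_add(bzip2_ep
  ${EP_LOG_OPTIONS}
  BUILD_IN_SOURCE 1
  # The extra args ride along on both the build and install commands,
  # since both invoke bzip2's Makefile.
  BUILD_COMMAND ${MAKE} libbz2.a ${MAKE_BUILD_ARGS} ${BZIP2_EXTRA_ARGS}
  INSTALL_COMMAND ${MAKE} install PREFIX=<INSTALL_DIR> ${BZIP2_EXTRA_ARGS}
  URL ${ARROW_BZIP2_SOURCE_URL})
```

With something like this in place, a conda toolchain's {{${AR}}} would reach the bzip2 build even when no plain {{ar}} is on {{$PATH}}.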
[jira] [Commented] (ARROW-14188) link error on ubuntu
[ https://issues.apache.org/jira/browse/ARROW-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17423725#comment-17423725 ] Kouhei Sutou commented on ARROW-14188: -- [~icook] Could you confirm this vcpkg related problem? It seems that {{libarrow_bundled_dependencies.a}} isn't linked automatically. > link error on ubuntu > > > Key: ARROW-14188 > URL: https://issues.apache.org/jira/browse/ARROW-14188 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 4.0.0, 5.0.0 > Environment: Ubuntu 18.04, gcc-9, and vcpkg installation of arrow >Reporter: Amir Ghamarian >Priority: Major > Attachments: completerr.txt, linkerror.txt > > > I used vcpkg to install arrow versions 4 and 5, trying to build my code that > uses parquet fails by giving link errors of undefined reference. > The same code works on OSX but fails on ubuntu. > My cmake snippet is as follows: > > {code:java} > find_package(Arrow CONFIG REQUIRED) > get_filename_component(MY_SEARCH_DIR ${Arrow_CONFIG} DIRECTORY) > find_package(Parquet CONFIG REQUIRED PATHS ${MY_SEARCH_DIR}) > find_package(Thrift CONFIG REQUIRED) > {code} > and the linking: > > {code:java} > target_link_libraries(vision_obj PUBLIC thrift::thrift re2::re2 > arrow_static parquet_static ) > {code} > > I get a lot of errors > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (ARROW-14188) link error on ubuntu
[ https://issues.apache.org/jira/browse/ARROW-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou updated ARROW-14188: - Comment: was deleted (was: [~icook] Could you confirm this vcpkg related problem? It seems that {{libarrow_bundled_dependencies.a}} isn't linked automatically.)
[jira] [Comment Edited] (ARROW-14188) link error on ubuntu
[ https://issues.apache.org/jira/browse/ARROW-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17423722#comment-17423722 ] Kouhei Sutou edited comment on ARROW-14188 at 10/3/21, 8:53 PM: [~icook] Could you confirm this vcpkg related problem? It seems that {{libarrow_bundled_dependencies.a}} isn't linked automatically. was (Author: kou): [~ianmcook] Could you confirm this vcpkg related problem? It seems that {{libarrow_bundled_dependencies.a}} isn't linked automatically.
[jira] [Commented] (ARROW-14188) link error on ubuntu
[ https://issues.apache.org/jira/browse/ARROW-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17423722#comment-17423722 ] Kouhei Sutou commented on ARROW-14188: -- [~ianmcook] Could you confirm this vcpkg related problem? It seems that {{libarrow_bundled_dependencies.a}} isn't linked automatically.
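If {{libarrow_bundled_dependencies.a}} really is being dropped, one way for a user to check from the consuming project is to link it explicitly after the static libraries. This is a hedged sketch only: the {{find_library}} search paths are guesses and would need adjusting for a vcpkg install layout.

```cmake
find_package(Arrow CONFIG REQUIRED)
get_filename_component(MY_SEARCH_DIR ${Arrow_CONFIG} DIRECTORY)
find_package(Parquet CONFIG REQUIRED PATHS ${MY_SEARCH_DIR})

# libarrow_bundled_dependencies.a is normally installed next to libarrow.a.
# If it is not pulled in automatically, locate it and name it explicitly
# after the static libraries (link order matters for static archives).
find_library(ARROW_BUNDLED_DEPS arrow_bundled_dependencies
             PATHS ${MY_SEARCH_DIR} PATH_SUFFIXES ../.. ../../lib)
target_link_libraries(vision_obj PUBLIC
    parquet_static arrow_static ${ARROW_BUNDLED_DEPS})
```

If the undefined references disappear with this, the bug is that the installed CMake config fails to propagate the bundled archive on Linux.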
[jira] [Updated] (ARROW-14200) [R] strftime on a date should not use or be confused by timezones
[ https://issues.apache.org/jira/browse/ARROW-14200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-14200: --- Labels: pull-request-available (was: ) > [R] strftime on a date should not use or be confused by timezones > - > > Key: ARROW-14200 > URL: https://issues.apache.org/jira/browse/ARROW-14200 > Project: Apache Arrow > Issue Type: Bug > Components: R >Reporter: Jonathan Keane >Assignee: Jonathan Keane >Priority: Major > Labels: pull-request-available > Fix For: 7.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > When the input to {{strftime}} is a date, timezones shouldn't be necessary or > assumed. > What I think is going on below is the date 1992-01-01 is being interpreted as > 1992-01-01 00:00:00 in UTC, and then when {{strftime()}} is being called it's > displaying that timestamp as 1991-12-31 ... (since my system is set to an > after UTC timezone), and then taking the year out of it. If I specify {{tz = > "utc"}} in the {{strftime()}}, I get the expected result (though that > shouldn't be necessary). > Run in the US central timezone: > {code} > library(arrow, warn.conflicts = FALSE) > library(dplyr, warn.conflicts = FALSE) > library(lubridate, warn.conflicts = FALSE) > Table$create( > data.frame( > x = as.Date("1992-01-01") > ) > ) %>% > mutate( > as_int_strftime = as.integer(strftime(x, "%Y")), > strftime = strftime(x, "%Y"), > as_int_strftime_utc = as.integer(strftime(x, "%Y", tz = "UTC")), > strftime_utc = strftime(x, "%Y", tz = "UTC"), > year = year(x) > ) %>% > collect() > #>x as_int_strftime strftime as_int_strftime_utc strftime_utc year > #> 1 1992-01-011991 19911992 1992 1992 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-13588) [R] Empty character attributes not stored
[ https://issues.apache.org/jira/browse/ARROW-13588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-13588: --- Assignee: Neal Richardson > [R] Empty character attributes not stored > - > > Key: ARROW-13588 > URL: https://issues.apache.org/jira/browse/ARROW-13588 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 5.0.0 > Environment: Ubuntu 20.04 R 4.1 release >Reporter: Charlie Gao >Assignee: Neal Richardson >Priority: Critical > Labels: attributes, feather > Fix For: 6.0.0 > > > Date-times in the POSIXct format have a 'tzone' attribute that by default is > set to "", an empty character vector (not NULL) when created. > This however is not stored in the Arrow feather file. When the file is read > back, the original and restored dataframes are not identical as per the below > reprex. > I am thinking that this should not be the intention? My workaround at the > moment is making a check when reading back to write the empty string if the > tzone attribute does not exist. > Just to confirm, the attribute is stored correctly when it is not empty. > Thanks. > {code:java} > ``` r > dates <- as.POSIXct(c("2020-01-01", "2020-01-02", "2020-01-02")) > attributes(dates) > #> $class > #> [1] "POSIXct" "POSIXt" > #> > #> $tzone > #> [1] "" > values <- c(1:3) > original <- data.frame(dates, values) > original > #> dates values > #> 1 2020-01-01 1 > #> 2 2020-01-02 2 > #> 3 2020-01-02 3 > tempfile <- tempfile() > arrow::write_feather(original, tempfile) > restored <- arrow::read_feather(tempfile) > identical(original, restored) > #> [1] FALSE > waldo::compare(original, restored) > #> `attr(old$dates, 'tzone')` is a character vector ('') > #> `attr(new$dates, 'tzone')` is absent > unlink(tempfile) > ``` > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-14085) [R] Expose null placement option through sort bindings
[ https://issues.apache.org/jira/browse/ARROW-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-14085: Fix Version/s: (was: 6.0.0) 7.0.0 > [R] Expose null placement option through sort bindings > -- > > Key: ARROW-14085 > URL: https://issues.apache.org/jira/browse/ARROW-14085 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Ian Cook >Assignee: Ian Cook >Priority: Major > Labels: kernel > Fix For: 7.0.0 > > > ARROW-12063 added a null placement option to the sort kernels and to > {{OrderBySinkNode}} in the C++ library. Expose this through the R bindings. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-14071) [R] Try to arrow_eval user-defined functions
[ https://issues.apache.org/jira/browse/ARROW-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-14071: Fix Version/s: (was: 6.0.0) 7.0.0 > [R] Try to arrow_eval user-defined functions > > > Key: ARROW-14071 > URL: https://issues.apache.org/jira/browse/ARROW-14071 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > Fix For: 7.0.0 > > > The first test passes but the second one fails, even though they're > equivalent. The user's function isn't being evaluated in the nse_funcs > environment. > {code} > expect_dplyr_equal( > input %>% > select(-fct) %>% > filter(nchar(padded_strings) < 10) %>% > collect(), > tbl > ) > isShortString <- function(x) nchar(x) < 10 > expect_dplyr_equal( > input %>% > select(-fct) %>% > filter(isShortString(padded_strings)) %>% > collect(), > tbl > ) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (ARROW-13779) [R] Disallow expressions that depend on order
[ https://issues.apache.org/jira/browse/ARROW-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson closed ARROW-13779. --- Resolution: Won't Fix Nothing to do right now. > [R] Disallow expressions that depend on order > - > > Key: ARROW-13779 > URL: https://issues.apache.org/jira/browse/ARROW-13779 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > > Because in the current ExecPlan, sorting is only done at the end (in a sink > node), we can't sort the data and then do an operation that depends on > sorting (like cumsum) without first calling compute(). (This is probably not > yet a concern because we don't seem to have any kernels that require order.)
[jira] [Updated] (ARROW-13865) [C++][R] Writing moderate-size parquet files of nested dataframes from R slows down/process hangs
[ https://issues.apache.org/jira/browse/ARROW-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-13865: Fix Version/s: 7.0.0 > [C++][R] Writing moderate-size parquet files of nested dataframes from R > slows down/process hangs > - > > Key: ARROW-13865 > URL: https://issues.apache.org/jira/browse/ARROW-13865 > Project: Apache Arrow > Issue Type: Bug > Components: C++, R >Affects Versions: 5.0.0 >Reporter: John Sheffield >Priority: Major > Fix For: 7.0.0 > > Attachments: Screen Shot 2021-09-02 at 11.21.37 AM.png > > > I observed a significant slowdown in parquet writes (and ultimately the > process just hangs for minutes without completion) while writing > moderate-size nested dataframes from R. I have replicated the issue on MacOS > and Ubuntu so far. > > An example:
> ```
> testdf <- dplyr::tibble(
>   id = uuid::UUIDgenerate(n = 5000),
>   l1 = as.list(lapply(1:5000, function(x) runif(1000))),
>   l2 = as.list(lapply(1:5000, function(x) rnorm(1000)))
> )
> testdf_long <- tidyr::unnest(testdf, cols = c(l1, l2))
>
> # This works
> arrow::write_parquet(testdf_long, "testdf_long.parquet")
> # This write does not complete within a few minutes in my testing but throws no errors
> arrow::write_parquet(testdf, "testdf.parquet")
> ```
> I can't guess at why this is true, but the slowdown is closely tied to row counts:
> ```
> # screenshot attached; 12ms, 56ms, and 680ms respectively.
> microbenchmark::microbenchmark(
>   arrow::write_parquet(testdf[1, ], "testdf.parquet"),
>   arrow::write_parquet(testdf[1:10, ], "testdf.parquet"),
>   arrow::write_parquet(testdf[1:100, ], "testdf.parquet"),
>   times = 5
> )
> ```
> I'm using the CRAN version 5.0.0 in both cases. 
The sessionInfo() for Ubuntu > is > R version 4.0.5 (2021-03-31) > Platform: x86_64-pc-linux-gnu (64-bit) > Running under: Ubuntu 20.04.3 LTS > Matrix products: default > BLAS/LAPACK: > /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 > LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C > LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > attached base packages: > [1] stats graphics grDevices utils datasets methods base > other attached packages: > [1] arrow_5.0.0 > And sessionInfo for MacOS is: > R version 4.0.1 (2020-06-06) Platform: x86_64-apple-darwin17.0 (64-bit) > Running under: macOS Catalina 10.15.7 Matrix products: default BLAS: > /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib > LAPACK: > /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib > locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 > attached base packages: [1] stats graphics grDevices utils datasets methods > base other attached packages: [1] arrow_5.0.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-14020) [R] Writing dataframes with list columns is slow and scales poorly with nesting level
[ https://issues.apache.org/jira/browse/ARROW-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-14020: Fix Version/s: 6.0.0 > [R] Writing datafames with list columns is slow and scales poorly with > nesting level > > > Key: ARROW-14020 > URL: https://issues.apache.org/jira/browse/ARROW-14020 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 5.0.0 > Environment: Windows 10 x64 >Reporter: Miles McBain >Assignee: Jonathan Keane >Priority: Major > Labels: pull-request-available > Fix For: 6.0.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > Writing data frames that contain list columns seems much slower than expected: > ``` r > library(tidyverse) > #> Warning: package 'tidyverse' was built under R version 4.1.1 > #> Warning: package 'tibble' was built under R version 4.1.1 > #> Warning: package 'readr' was built under R version 4.1.1 > library(arrow) > #> Warning: package 'arrow' was built under R version 4.1.1 > #> > #> Attaching package: 'arrow' > #> The following object is masked from 'package:utils': > #> > #> timestamp > dummy <- tibble( > points = rep(list(seq(6)), 2e6), > index = seq(2e6) > ) > # very slooow > system.time(write_parquet(dummy, "dummy.parquet")) > #> user system elapsed > #> 55.64 0.11 55.98 > dummy_txt <- mutate(dummy, points = map_chr(points, deparse)) > # orders of magnitude faster > system.time(write_parquet(dummy_txt, "dummytext.parquet")) > #> user system elapsed > #> 0.24 0.02 0.25 > ``` > Created on 2021-09-17 by the [reprex > package]([https://reprex.tidyverse.org|https://reprex.tidyverse.org/]) > (v2.0.0) > > Session info > ``` r > sessioninfo::session_info() > #> - Session info > --- > #> setting value > #> version R version 4.1.0 (2021-05-18) > #> os Windows 10 x64 > #> system x86_64, mingw32 > #> ui RTerm > #> language (EN) > #> collate English_Australia.1252 > #> ctype English_Australia.1252 > #> tz Australia/Brisbane > #> date 2021-09-17 > #> > #> 
- Packages > --- > #> package * version date lib source > #> arrow * 5.0.0.2 2021-09-05 [1] CRAN (R 4.1.1) > #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0) > #> backports 1.2.1 2020-12-09 [1] CRAN (R 4.1.0) > #> bit 4.0.4 2020-08-04 [1] CRAN (R 4.1.0) > #> bit64 4.0.5 2020-08-30 [1] CRAN (R 4.1.0) > #> broom 0.7.7 2021-06-13 [1] CRAN (R 4.1.0) > #> cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.1.0) > #> cli 3.0.1 2021-07-17 [1] CRAN (R 4.1.0) > #> colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0) > #> crayon 1.4.1 2021-02-08 [1] CRAN (R 4.1.0) > #> DBI 1.1.1 2021-01-15 [1] CRAN (R 4.1.0) > #> dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.0) > #> digest 0.6.27 2020-10-24 [1] CRAN (R 4.1.0) > #> dplyr * 1.0.7 2021-06-18 [1] CRAN (R 4.1.0) > #> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0) > #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0) > #> fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0) > #> forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.1.0) > #> fs 1.5.0 2020-07-31 [1] CRAN (R 4.1.0) > #> generics 0.1.0 2020-10-31 [1] CRAN (R 4.1.0) > #> ggplot2 * 3.3.5 2021-06-25 [1] CRAN (R 4.1.0) > #> glue 1.4.2 2020-08-27 [1] CRAN (R 4.1.0) > #> gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.0) > #> haven 2.4.1 2021-04-23 [1] CRAN (R 4.1.0) > #> highr 0.9 2021-04-16 [1] CRAN (R 4.1.0) > #> hms 1.1.0 2021-05-17 [1] CRAN (R 4.1.0) > #> htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 4.1.0) > #> httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0) > #> jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.1.0) > #> knitr 1.33 2021-04-24 [1] CRAN (R 4.1.0) > #> lifecycle 1.0.0 2021-02-15 [1] CRAN (R 4.1.0) > #> lubridate 1.7.10 2021-02-26 [1] CRAN (R 4.1.0) > #> magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0) > #> modelr 0.1.8 2020-05-19 [1] CRAN (R 4.1.0) > #> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0) > #> pillar 1.6.2 2021-07-29 [1] CRAN (R 4.1.0) > #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0) > #> purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.1.0) > #> R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.1) > #> Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0) > #> 
readr * 2.0.1 2021-08-10 [1] CRAN (R 4.1.1) > #> readxl 1.3.1 2019-03-13 [1] CRAN (R 4.1.0) > #> reprex 2.0.0 2021-04-02 [1] CRAN (R 4.1.0) > #> rlang 0.4.11 2021-04-30 [1] CRAN (R 4.1.0) > #> rmarkdown 2.9 2021-06-15 [1] CRAN (R 4.1.0) > #> rvest 1.0.1 2021-07-26 [1] CRAN (R 4.1.0) > #> scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0) > #> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.1.0) > #> stringi 1.7.4 2021-08-25 [1] CRAN (R 4.1.1) > #> stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.1.0) > #>
[jira] [Updated] (ARROW-14025) [R][C++] PreBuffer is not enabled when scanning parquet via exec nodes
[ https://issues.apache.org/jira/browse/ARROW-14025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-14025: Fix Version/s: 6.0.0 > [R][C++] PreBuffer is not enabled when scanning parquet via exec nodes > -- > > Key: ARROW-14025 > URL: https://issues.apache.org/jira/browse/ARROW-14025 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, R >Reporter: Weston Pace >Priority: Major > Fix For: 6.0.0 > > > In ExecNode_Scan a ScanOptions object is built up. If we are reading parquet > we should enable pre-buffering. This is done by creating a > ParquetFragmentScanOptions object and enabling pre-buffering. > Alternatively, we could just default pre-buffering to true for asynchronous > scans of parquet data. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13866) [R] Implement Options for all compute kernels available via list_compute_functions
[ https://issues.apache.org/jira/browse/ARROW-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-13866: Fix Version/s: 6.0.0 > [R] Implement Options for all compute kernels available via > list_compute_functions > -- > > Key: ARROW-13866 > URL: https://issues.apache.org/jira/browse/ARROW-13866 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Nicola Crane >Assignee: Nicola Crane >Priority: Major > Fix For: 6.0.0 > > > Not all of the compute kernels available via {{list_compute_functions()}} are > actually available to use in R, as they haven't been hooked up to the > relevant Options class in {{r/src/compute.cpp}}. > We should: > # Implement all remaining options classes > # Go through all the kernels listed by {{list_compute_functions()}} and > check that they have either no options classes to implement or that they have > been hooked up to the appropriate options class > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13901) [R] Implement IndexOptions
[ https://issues.apache.org/jira/browse/ARROW-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-13901: Fix Version/s: 6.0.0 > [R] Implement IndexOptions > -- > > Key: ARROW-13901 > URL: https://issues.apache.org/jira/browse/ARROW-13901 > Project: Apache Arrow > Issue Type: Sub-task > Components: R >Reporter: Nicola Crane >Assignee: Nicola Crane >Priority: Major > Fix For: 6.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-14028) [R] Cast of NaN to integer should return NA_integer_
[ https://issues.apache.org/jira/browse/ARROW-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-14028: Fix Version/s: 7.0.0 > [R] Cast of NaN to integer should return NA_integer_ > > > Key: ARROW-14028 > URL: https://issues.apache.org/jira/browse/ARROW-14028 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Ian Cook >Priority: Major > Fix For: 7.0.0 > > > Casting double {{NaN}} to integer returns a sentinel value: > {code:r} > call_function("cast", Scalar$create(NaN), options = list(to_type = int32(), > allow_float_truncate = TRUE)) > #> Scalar > #> -2147483648 > call_function("cast", Scalar$create(NaN), options = list(to_type = int64(), > allow_float_truncate = TRUE)) > #> Scalar > #> -9223372036854775808{code} > It would be nice if this would instead return {{NA_integer_}}. > N.B. for some reason this doesn't reproduce in dplyr unless you round-trip it > back to double: > {code:r} > > Table$create(x = NaN) %>% transmute(as.double(as.integer(x))) %>% pull(1) > #> [1] -2147483648{code}
[jira] [Updated] (ARROW-14138) [R] update metadata when casting a record batch column
[ https://issues.apache.org/jira/browse/ARROW-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-14138: Fix Version/s: 7 > [R] update metadata when casting a record batch column > -- > > Key: ARROW-14138 > URL: https://issues.apache.org/jira/browse/ARROW-14138 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Romain Francois >Assignee: Romain Francois >Priority: Minor > Fix For: 7 > > > library(arrow, warn.conflicts = FALSE) > #> See arrow_info() for available features > raws <- structure(list( > as.raw(c(0x70, 0x65, 0x72, 0x73, 0x6f, 0x6e)) > ), class = c("arrow_binary", "vctrs_vctr", "list")) > batch <- record_batch(b = raws) > batch$metadata$r > #> 'arrow_r_metadata' chr > "A\n3\n262147\n197888\n5\nUTF-8\n531\n1\n531\n1\n531\n2\n531\n1\n16\n3\n262153\n12\narrow_binary\n262153\n10\nvc"| > __truncated__ > #> List of 1 > #> $ columns:List of 1 > #> ..$ b:List of 2 > #> .. ..$ attributes:List of 1 > #> .. .. ..$ class: chr [1:3] "arrow_binary" "vctrs_vctr" "list" > #> .. ..$ columns : NULL > # when casting `b` to a string column, the metadata is kept > batch$b <- batch$b$cast(utf8()) > batch$metadata$r > #> 'arrow_r_metadata' chr > "A\n3\n262147\n197888\n5\nUTF-8\n531\n1\n531\n1\n531\n2\n531\n1\n16\n3\n262153\n12\narrow_binary\n262153\n10\nvc"| > __truncated__ > #> List of 1 > #> $ columns:List of 1 > #> ..$ b:List of 2 > #> .. ..$ attributes:List of 1 > #> .. .. ..$ class: chr [1:3] "arrow_binary" "vctrs_vctr" "list" > #> .. ..$ columns : NULL > # but it should not have > batch2 <- record_batch(b = "string") > batch2$metadata$r > #> NULL -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-14200) [R] strftime on a date should not use or be confused by timezones
[ https://issues.apache.org/jira/browse/ARROW-14200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Keane reassigned ARROW-14200: -- Assignee: Jonathan Keane
[jira] [Updated] (ARROW-14138) [R] update metadata when casting a record batch column
[ https://issues.apache.org/jira/browse/ARROW-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-14138: Fix Version/s: (was: 7) 7.0.0
[jira] [Updated] (ARROW-14169) [R] altrep for factors
[ https://issues.apache.org/jira/browse/ARROW-14169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-14169: Fix Version/s: 7.0.0 > [R] altrep for factors > -- > > Key: ARROW-14169 > URL: https://issues.apache.org/jira/browse/ARROW-14169 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Romain Francois >Assignee: Romain Francois >Priority: Major > Fix For: 7.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-14200) [R] strftime on a date should not use or be confused by timezones
[ https://issues.apache.org/jira/browse/ARROW-14200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-14200: Fix Version/s: 7.0.0 > [R] strftime on a date should not use or be confused by timezones > - > > Key: ARROW-14200 > URL: https://issues.apache.org/jira/browse/ARROW-14200 > Project: Apache Arrow > Issue Type: Bug > Components: R >Reporter: Jonathan Keane >Priority: Major > Fix For: 7.0.0 > > > When the input to {{strftime}} is a date, timezones shouldn't be necessary or > assumed. > What I think is going on below is the date 1992-01-01 is being interpreted as > 1992-01-01 00:00:00 in UTC, and then when {{strftime()}} is being called it's > displaying that timestamp as 1991-12-31 ... (since my system is set to an > after UTC timezone), and then taking the year out of it. If I specify {{tz = > "utc"}} in the {{strftime()}}, I get the expected result (though that > shouldn't be necessary). > Run in the US central timezone: > {code} > library(arrow, warn.conflicts = FALSE) > library(dplyr, warn.conflicts = FALSE) > library(lubridate, warn.conflicts = FALSE) > Table$create( > data.frame( > x = as.Date("1992-01-01") > ) > ) %>% > mutate( > as_int_strftime = as.integer(strftime(x, "%Y")), > strftime = strftime(x, "%Y"), > as_int_strftime_utc = as.integer(strftime(x, "%Y", tz = "UTC")), > strftime_utc = strftime(x, "%Y", tz = "UTC"), > year = year(x) > ) %>% > collect() > #>x as_int_strftime strftime as_int_strftime_utc strftime_utc year > #> 1 1992-01-011991 19911992 1992 1992 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
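The mechanism described in the report (a bare date interpreted as midnight UTC, then formatted in a local timezone behind UTC, yielding the previous year) can be reproduced outside Arrow. A minimal Python sketch of the same effect, for illustration only; this is not Arrow's implementation, and UTC-6 stands in for US Central in winter:

```python
from datetime import datetime, timedelta, timezone

# A date with no time component, interpreted as midnight UTC
# (mirroring how the report says the date column is treated).
d = datetime(1992, 1, 1, tzinfo=timezone.utc)

# Formatting in a timezone behind UTC shifts the instant to the
# previous day, and therefore the previous year.
central = timezone(timedelta(hours=-6))
print(d.astimezone(central).strftime("%Y"))       # 1991

# Formatting in UTC gives the expected year.
print(d.astimezone(timezone.utc).strftime("%Y"))  # 1992
```

This is why passing {{tz = "UTC"}} to {{strftime()}} produces the expected result in the report: it skips the offset conversion entirely.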
[jira] [Updated] (ARROW-14199) [R] bindings for format where possible
[ https://issues.apache.org/jira/browse/ARROW-14199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-14199: Fix Version/s: 7.0.0 > [R] bindings for format where possible > -- > > Key: ARROW-14199 > URL: https://issues.apache.org/jira/browse/ARROW-14199 > Project: Apache Arrow > Issue Type: New Feature > Components: R >Reporter: Jonathan Keane >Priority: Major > Fix For: 7.0.0 > > > Now that we have {{strftime}}, we should also be able to make bindings for > {{format()}} as well. This might be complicated / we might need to punt on a > bunch of types that {{format()}} can take but arrow doesn't (yet) support > formatting of them, that's ok. > Though some of those might be wrappable with a handful of kernels stacked > together: {{format(float)}} might be round + cast to character -- This message was sent by Atlassian Jira (v8.3.4#803005)
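The "handful of kernels stacked together" idea from the issue above ({{format(float)}} as round followed by cast to character) can be sketched in plain Python. This is only an illustration of the composition, not the Arrow kernels; `format_float` is a hypothetical helper name:

```python
def format_float(values, digits=2):
    # Emulate format(float) as round + cast-to-character,
    # the kernel composition suggested in the issue.
    return [str(round(v, digits)) for v in values]

print(format_float([3.14159, 2.71828]))  # ['3.14', '2.72']
```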
[jira] [Updated] (ARROW-14209) [R] Allow multiple arguments to n_distinct()
[ https://issues.apache.org/jira/browse/ARROW-14209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Cook updated ARROW-14209: - Description: ARROW-13620 and ARROW-14036 added support for the {{n_distinct()}} function in the dplyr verb {{summarise()}} but only with a single argument. Add support for multiple arguments to {{n_distinct()}}. This should return the number of unique combinations of values in the specified columns/expressions. See the comment about this here: [https://github.com/apache/arrow/pull/11257#discussion_r720873549] was: ARROW-13620 and ARROW-14036 added support for the {{n_distinct()}} in function in the dplyr verb {{summarise()}} but only with a single argument. Add support for multiple arguments to {{n_distinct()}}. This should return the number of unique combinations of values in the specified columns/expressions. See the comment about this here: https://github.com/apache/arrow/pull/11257#discussion_r720873549 > [R] Allow multiple arguments to n_distinct() > > > Key: ARROW-14209 > URL: https://issues.apache.org/jira/browse/ARROW-14209 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Ian Cook >Priority: Major > Fix For: 7.0.0 > > > ARROW-13620 and ARROW-14036 added support for the {{n_distinct()}} function > in the dplyr verb {{summarise()}} but only with a single argument. Add > support for multiple arguments to {{n_distinct()}}. This should return the > number of unique combinations of values in the specified columns/expressions. > See the comment about this here: > [https://github.com/apache/arrow/pull/11257#discussion_r720873549] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-14209) [R] Allow multiple arguments to n_distinct()
[ https://issues.apache.org/jira/browse/ARROW-14209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Cook updated ARROW-14209: - Description: ARROW-13620 and ARROW-14036 added support for the {{n_distinct()}} in function in the dplyr verb {{summarise()}} but only with a single argument. Add support for multiple arguments to {{n_distinct()}}. This should return the number of unique combinations of values in the specified columns/expressions. See the comment about this here: https://github.com/apache/arrow/pull/11257#discussion_r720873549 was:ARROW-13620 and ARROW-14036 added support for the {{n_distinct()}} in function in the dplyr verb {{summarise()}} but only with a single argument. Add support for multiple arguments to {{n_distinct()}}. This should return the number of unique combinations of values in the specified columns/expressions. > [R] Allow multiple arguments to n_distinct() > > > Key: ARROW-14209 > URL: https://issues.apache.org/jira/browse/ARROW-14209 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Ian Cook >Priority: Major > Fix For: 7.0.0 > > > ARROW-13620 and ARROW-14036 added support for the {{n_distinct()}} in > function in the dplyr verb {{summarise()}} but only with a single argument. > Add support for multiple arguments to {{n_distinct()}}. This should return > the number of unique combinations of values in the specified > columns/expressions. > See the comment about this here: > https://github.com/apache/arrow/pull/11257#discussion_r720873549 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-14209) [R] Allow multiple arguments to n_distinct()
Ian Cook created ARROW-14209: Summary: [R] Allow multiple arguments to n_distinct() Key: ARROW-14209 URL: https://issues.apache.org/jira/browse/ARROW-14209 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Ian Cook Fix For: 7.0.0 ARROW-13620 and ARROW-14036 added support for the {{n_distinct()}} function in the dplyr verb {{summarise()}} but only with a single argument. Add support for multiple arguments to {{n_distinct()}}. This should return the number of unique combinations of values in the specified columns/expressions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
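The requested semantics ("number of unique combinations of values in the specified columns/expressions") can be sketched in plain Python. This illustrates the expected behavior only; it is not the dplyr or Arrow implementation, and it ignores details such as {{na.rm}} handling:

```python
def n_distinct(*columns):
    """Count unique combinations of values across equal-length
    columns, mirroring dplyr's n_distinct(x, y, ...) semantics."""
    return len(set(zip(*columns)))

x = [1, 1, 2, 2]
y = ["a", "b", "a", "b"]
print(n_distinct(x))     # 2 distinct values in x
print(n_distinct(x, y))  # 4 distinct (x, y) combinations
```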
[jira] [Commented] (ARROW-14208) [C++] Build errors with Visual Studio 2019
[ https://issues.apache.org/jira/browse/ARROW-14208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17423699#comment-17423699 ] Ian Cook commented on ARROW-14208: -- [~apitrou] could you take a look at this? Thank you > [C++] Build errors with Visual Studio 2019 > -- > > Key: ARROW-14208 > URL: https://issues.apache.org/jira/browse/ARROW-14208 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Ian Cook >Priority: Major > > On September 10 the *test-build-vcpkg-win* nightly Crossbow job began to fail. > This job uses the current {{windows-2019}} GHA runner image, so it often > catches build errors associated with Visual Studio/MSVC updates: > The logs show these error messages (simplified for readability): > {code:java} > compute/util_internal.h(26,20): warning C4003: not enough arguments for > function-like macro invocation 'RtlZeroMemory' > compute/util_internal.h(26,20): error C2146: syntax error: missing ')' before > identifier 'buffer' > compute/util_internal.h(26,20): error C2065: 'buffer': undeclared identifier > compute/util_internal.h(26,20): error C2182: 'memset': illegal use of type > 'void' > compute/util_internal.h(26,20): error C7525: inline variables require at > least '/std:c++17' > compute/util_internal.h(26,20): error C2059: syntax error: 'constant' > compute/util_internal.h(26,20): error C2059: syntax error: ')' > compute/util_internal.h(26,47): error C2143: syntax error: missing ';' before > '{' > compute/util_internal.h(26,47): error C2447: '{': missing function header > (old-style formal list?){code} > Here is a link to the logs when they first began to fail on September 10: > [https://github.com/ursacomputing/crossbow/runs/3564248552#step:4:2985] > The error messages have remained the same since then. 
> Here is a link to the logs from the previous day (September 9) before they > began to fail: > [https://github.com/ursacomputing/crossbow/runs/3552742330] > > Possible causes include: > Updates to MSVC that were applied to the {{windows-2019}} GHA runner image > on September 9: > [https://github.com/actions/virtual-environments/pull/3452] > One of these commits on September 9: > > [https://github.com/apache/arrow/search?o=desc=1=committer-date%3A2021-09-09=author-date=commits] > Changes to one of the vcpkg-installed Arrow dependencies on September 9 (but > I don't see any such changes in the {{microsoft/vcpkg}} repo commit history). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-14208) [C++] Build errors with Visual Studio 2019
[ https://issues.apache.org/jira/browse/ARROW-14208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Cook updated ARROW-14208: - Description: On September 10 the *test-build-vcpkg-win* nightly Crossbow job began to fail. This job uses the current {{windows-2019}} GHA runner image, so it often catches build errors associated with Visual Studio/MSVC updates. The logs show these error messages (simplified for readability): {code:java} compute/util_internal.h(26,20): warning C4003: not enough arguments for function-like macro invocation 'RtlZeroMemory' compute/util_internal.h(26,20): error C2146: syntax error: missing ')' before identifier 'buffer' compute/util_internal.h(26,20): error C2065: 'buffer': undeclared identifier compute/util_internal.h(26,20): error C2182: 'memset': illegal use of type 'void' compute/util_internal.h(26,20): error C7525: inline variables require at least '/std:c++17' compute/util_internal.h(26,20): error C2059: syntax error: 'constant' compute/util_internal.h(26,20): error C2059: syntax error: ')' compute/util_internal.h(26,47): error C2143: syntax error: missing ';' before '{' compute/util_internal.h(26,47): error C2447: '{': missing function header (old-style formal list?){code} Here is a link to the logs when they first began to fail on September 10: [https://github.com/ursacomputing/crossbow/runs/3564248552#step:4:2985] The error messages have remained the same since then. 
Here is a link to the logs from the previous day (September 9) before they began to fail: [https://github.com/ursacomputing/crossbow/runs/3552742330] Possible causes include: Updates to MSVC that were applied to the {{windows-2019}} GHA runner image on September 9: [https://github.com/actions/virtual-environments/pull/3452] One of these commits on September 9: [https://github.com/apache/arrow/search?o=desc=1=committer-date%3A2021-09-09=author-date=commits] Changes to one of the vcpkg-installed Arrow dependencies on September 9 (but I don't see any such changes in the {{microsoft/vcpkg}} repo commit history). was: On September 10 the *test-build-vcpkg-win* nightly Crossbow job began to fail. This job uses the current {{windows-2019}} GHA runner image, so it often catches build errors associated with Visual Studio/MSVC updates: The logs show these error messages (simplified for readability): {code:java} compute/util_internal.h(26,20): warning C4003: not enough arguments for function-like macro invocation 'RtlZeroMemory' compute/util_internal.h(26,20): error C2146: syntax error: missing ')' before identifier 'buffer' compute/util_internal.h(26,20): error C2065: 'buffer': undeclared identifier compute/util_internal.h(26,20): error C2182: 'memset': illegal use of type 'void' compute/util_internal.h(26,20): error C7525: inline variables require at least '/std:c++17' compute/util_internal.h(26,20): error C2059: syntax error: 'constant' compute/util_internal.h(26,20): error C2059: syntax error: ')' compute/util_internal.h(26,47): error C2143: syntax error: missing ';' before '{' compute/util_internal.h(26,47): error C2447: '{': missing function header (old-style formal list?){code} Here is a link to the logs when they first began to fail on September 10: [https://github.com/ursacomputing/crossbow/runs/3564248552#step:4:2985] The error messages have remained the same since then. 
Here is a link to the logs from the previous day (September 9) before they began to fail: [https://github.com/ursacomputing/crossbow/runs/3552742330] Possible causes include: Updates to MSVC that were applied to the {{windows-2019}} GHA runner image on September 9: [https://github.com/actions/virtual-environments/pull/3452] One of these commits on September 9: [https://github.com/apache/arrow/search?o=desc=1=committer-date%3A2021-09-09=author-date=commits] Changes to one of the vcpkg-installed Arrow dependencies on September 9 (but I don't see any such changes in the {{microsoft/vcpkg}} repo commit history). > [C++] Build errors with Visual Studio 2019 > -- > > Key: ARROW-14208 > URL: https://issues.apache.org/jira/browse/ARROW-14208 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Ian Cook >Priority: Major > > On September 10 the *test-build-vcpkg-win* nightly Crossbow job began to fail. > This job uses the current {{windows-2019}} GHA runner image, so it often > catches build errors associated with Visual Studio/MSVC updates. > The logs show these error messages (simplified for readability): > {code:java} > compute/util_internal.h(26,20): warning C4003: not enough arguments for > function-like macro invocation 'RtlZeroMemory' > compute/util_internal.h(26,20): error C2146: syntax error: missing ')' before > identifier 'buffer' > compute/util_internal.h(26,20): error
[jira] [Created] (ARROW-14208) [C++] Build errors with Visual Studio 2019
Ian Cook created ARROW-14208: Summary: [C++] Build errors with Visual Studio 2019 Key: ARROW-14208 URL: https://issues.apache.org/jira/browse/ARROW-14208 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Ian Cook On September 10 the *test-build-vcpkg-win* nightly Crossbow job began to fail. This job uses the current {{windows-2019}} GHA runner image, so it often catches build errors associated with Visual Studio/MSVC updates: The logs show these error messages (simplified for readability): {code:java} compute/util_internal.h(26,20): warning C4003: not enough arguments for function-like macro invocation 'RtlZeroMemory' compute/util_internal.h(26,20): error C2146: syntax error: missing ')' before identifier 'buffer' compute/util_internal.h(26,20): error C2065: 'buffer': undeclared identifier compute/util_internal.h(26,20): error C2182: 'memset': illegal use of type 'void' compute/util_internal.h(26,20): error C7525: inline variables require at least '/std:c++17' compute/util_internal.h(26,20): error C2059: syntax error: 'constant' compute/util_internal.h(26,20): error C2059: syntax error: ')' compute/util_internal.h(26,47): error C2143: syntax error: missing ';' before '{' compute/util_internal.h(26,47): error C2447: '{': missing function header (old-style formal list?){code} Here is a link to the logs when they first began to fail on September 10: [https://github.com/ursacomputing/crossbow/runs/3564248552#step:4:2985] The error messages have remained the same since then. 
Here is a link to the logs from the previous day (September 9) before they began to fail: [https://github.com/ursacomputing/crossbow/runs/3552742330] Possible causes include: Updates to MSVC that were applied to the {{windows-2019}} GHA runner image on September 9: [https://github.com/actions/virtual-environments/pull/3452] One of these commits on September 9: [https://github.com/apache/arrow/search?o=desc=1=committer-date%3A2021-09-09=author-date=commits] Changes to one of the vcpkg-installed Arrow dependencies on September 9 (but I don't see any such changes in the {{microsoft/vcpkg}} repo commit history). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-14188) link error on ubuntu
[ https://issues.apache.org/jira/browse/ARROW-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17423649#comment-17423649 ] Amir Ghamarian commented on ARROW-14188: Thanks, same error. It works fine on OSX. {code:java} ../../vcpkg_installed/x64-linux/debug/lib/libarrow.a(compression_brotli.cc.o): In function `arrow::util::internal::(anonymous namespace)::BrotliDecompressor::~BrotliDecompressor()': /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:43: undefined reference to `BrotliDecoderDestroyInstance' ../../vcpkg_installed/x64-linux/debug/lib/libarrow.a(compression_brotli.cc.o): In function `arrow::util::internal::(anonymous namespace)::BrotliDecompressor::Init()': /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:48: undefined reference to `BrotliDecoderCreateInstance' ../../vcpkg_installed/x64-linux/debug/lib/libarrow.a(compression_brotli.cc.o): In function `arrow::util::internal::(anonymous namespace)::BrotliDecompressor::Reset()': /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:57: undefined reference to `BrotliDecoderDestroyInstance' ../../vcpkg_installed/x64-linux/debug/lib/libarrow.a(compression_brotli.cc.o): In function `arrow::util::internal::(anonymous namespace)::BrotliDecompressor::Decompress(long, unsigned char const*, long, unsigned char*)': /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:68: undefined reference to `BrotliDecoderDecompressStream' /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:71: undefined reference to `BrotliDecoderGetErrorCode' ../../vcpkg_installed/x64-linux/debug/lib/libarrow.a(compression_brotli.cc.o): In function `arrow::util::internal::(anonymous namespace)::BrotliDecompressor::IsFinished()': 
/projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:78: undefined reference to `BrotliDecoderIsFinished' ../../vcpkg_installed/x64-linux/debug/lib/libarrow.a(compression_brotli.cc.o): In function `arrow::util::internal::(anonymous namespace)::BrotliDecompressor::BrotliError(BrotliDecoderErrorCode, char const*)': /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:84: undefined reference to `BrotliDecoderErrorString' ../../vcpkg_installed/x64-linux/debug/lib/libarrow.a(compression_brotli.cc.o): In function `arrow::util::internal::(anonymous namespace)::BrotliCompressor::~BrotliCompressor()': /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:100: undefined reference to `BrotliEncoderDestroyInstance' ../../vcpkg_installed/x64-linux/debug/lib/libarrow.a(compression_brotli.cc.o): In function `arrow::util::internal::(anonymous namespace)::BrotliCompressor::Init()': /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:105: undefined reference to `BrotliEncoderCreateInstance' /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:109: undefined reference to `BrotliEncoderSetParameter' ../../vcpkg_installed/x64-linux/debug/lib/libarrow.a(compression_brotli.cc.o): In function `arrow::util::internal::(anonymous namespace)::BrotliCompressor::Compress(long, unsigned char const*, long, unsigned char*)': /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:121: undefined reference to `BrotliEncoderCompressStream' ../../vcpkg_installed/x64-linux/debug/lib/libarrow.a(compression_brotli.cc.o): In function `arrow::util::internal::(anonymous namespace)::BrotliCompressor::Flush(long, unsigned char*)': 
/projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:136: undefined reference to `BrotliEncoderCompressStream' /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:142: undefined reference to `BrotliEncoderHasMoreOutput' ../../vcpkg_installed/x64-linux/debug/lib/libarrow.a(compression_brotli.cc.o): In function `arrow::util::internal::(anonymous namespace)::BrotliCompressor::End(long, unsigned char*)': /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:152: undefined reference to `BrotliEncoderCompressStream' /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:157: undefined reference to `BrotliEncoderHasMoreOutput' /projects/vcpkg/buildtrees/arrow/src/rrow-5.0.0-fc32d5e3bc.clean/cpp/src/arrow/util/compression_brotli.cc:158: undefined reference to