[jira] [Assigned] (ARROW-8586) [R] installation failure on CentOS 7
[ https://issues.apache.org/jira/browse/ARROW-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson reassigned ARROW-8586:
Assignee: Neal Richardson

> [R] installation failure on CentOS 7
>
> Key: ARROW-8586
> URL: https://issues.apache.org/jira/browse/ARROW-8586
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Affects Versions: 0.17.0
> Environment: CentOS 7
> Reporter: Hei
> Assignee: Neal Richardson
> Priority: Major
>
> Hi,
> I am trying to install arrow via RStudio, but it seems like it is not working: after I installed the package, it kept asking me to run arrow::install_arrow(), even after I did:
> {code}
> > install.packages("arrow")
> Installing package into ‘/home/hc/R/x86_64-redhat-linux-gnu-library/3.6’
> (as ‘lib’ is unspecified)
> trying URL 'https://cran.rstudio.com/src/contrib/arrow_0.17.0.tar.gz'
> Content type 'application/x-gzip' length 242534 bytes (236 KB)
> ==
> downloaded 236 KB
> * installing *source* package ‘arrow’ ...
> ** package ‘arrow’ successfully unpacked and MD5 sums checked
> ** using staged installation
> *** Successfully retrieved C++ source
> *** Building C++ libraries
> cmake
> arrow
> ./configure: line 132: cd: libarrow/arrow-0.17.0/lib: Not a directory
> ---- NOTE ----
> After installation, please run arrow::install_arrow()
> for help installing required runtime libraries
> --------------
> ** libs
> g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -c array.cpp -o array.o
> [identical g++ invocations follow for array_from_vector.cpp, array_to_vector.cpp, arraydata.cpp, arrowExports.cpp, buffer.cpp, chunkedarray.cpp, compression.cpp, and compute.cpp; log truncated]
> {code}
[jira] [Assigned] (ARROW-8734) [R] autobrew script always builds from master
[ https://issues.apache.org/jira/browse/ARROW-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson reassigned ARROW-8734:
Assignee: Neal Richardson

> [R] autobrew script always builds from master
>
> Key: ARROW-8734
> URL: https://issues.apache.org/jira/browse/ARROW-8734
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Reporter: Jonathan Keane
> Assignee: Neal Richardson
> Priority: Major
>
> I've tried to install / build from source (both from a git checkout and using the built-in `install_arrow()`), and when compiling I reliably get the following error during the autobrew process:
> {code:bash}
> x System command 'R' failed, exit status: 1, stdout + stderr:
> E> * checking for file ‘/Users/jkeane/Dropbox/arrow/r/DESCRIPTION’ ... OK
> E> * preparing ‘arrow’:
> E> * checking DESCRIPTION meta-information ... OK
> E> * cleaning src
> E> * running ‘cleanup’
> E> * installing the package to build vignettes
> E> ---
> E> * installing *source* package ‘arrow’ ...
> E> ** using staged installation
> E> *** Generating code with data-raw/codegen.R
> E> There were 27 warnings (use warnings() to see them)
> E> *** > 375 functions decorated with [[arrow|s3::export]]
> E> *** > generated file `src/arrowExports.cpp`
> E> *** > generated file `R/arrowExports.R`
> E> *** Downloading apache-arrow
> E> Using local manifest for apache-arrow
> E> Thu May 7 13:13:42 CDT 2020: Auto-brewing apache-arrow in /var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T//build-apache-arrow...
> E> ==> Tapping autobrew/core from https://github.com/autobrew/homebrew-core
> E> Tapped 2 commands and 4639 formulae (4,888 files, 12.7MB).
> E> lz4
> E> openssl
> E> thrift
> E> snappy
> [bottles for lz4 1.8.3, openssl 1.0.2p, thrift 0.11.0, and snappy 1.1.7_1 are downloaded and poured, each with "Skipping post_install step for autobrew..." and the usual keg-only caveats for openssl; log truncated]
> {code}
[jira] [Created] (ARROW-8804) [R][CI] Followup to Rtools40 upgrade
Neal Richardson created ARROW-8804:

Summary: [R][CI] Followup to Rtools40 upgrade
Key: ARROW-8804
URL: https://issues.apache.org/jira/browse/ARROW-8804
Project: Apache Arrow
Issue Type: Improvement
Components: Continuous Integration, R
Reporter: Neal Richardson
Assignee: Neal Richardson
[jira] [Closed] (ARROW-8787) [R] read_parquet() don't end
[ https://issues.apache.org/jira/browse/ARROW-8787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson closed ARROW-8787.
Assignee: Neal Richardson
Resolution: Duplicate

> [R] read_parquet() don't end
>
> Key: ARROW-8787
> URL: https://issues.apache.org/jira/browse/ARROW-8787
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Affects Versions: 0.17.0
> Environment: Windows 10, R 3.6.3
> Reporter: Masaru
> Assignee: Neal Richardson
> Priority: Major
>
> I have tried to use the read_parquet() function as follows:
> {code:java}
> library(arrow)
> library(data.table)
> write_parquet(data.table(matrix(1, 1)), "test.parquet")
> read_parquet("test.parquet"){code}
> The data set is very small, yet the process never finishes at read_parquet().
> Could you please show how to fix the settings or code?
> I have installed the package from CRAN.
[jira] [Reopened] (ARROW-8787) [R] read_parquet() don't end
[ https://issues.apache.org/jira/browse/ARROW-8787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson reopened ARROW-8787.
[jira] [Updated] (ARROW-8787) [R] read_parquet() don't end
[ https://issues.apache.org/jira/browse/ARROW-8787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson updated ARROW-8787:
Fix Version/s: (was: 0.17.0)
[jira] [Closed] (ARROW-8787) [R] read_parquet() don't end
[ https://issues.apache.org/jira/browse/ARROW-8787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson closed ARROW-8787.
Resolution: Duplicate
[jira] [Commented] (ARROW-8787) [R] read_parquet() don't end
[ https://issues.apache.org/jira/browse/ARROW-8787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106876#comment-17106876 ]

Neal Richardson commented on ARROW-8787:

Glad to hear that works around the issue for you, though obviously it's not ideal. Hopefully someone will be able to fix this properly.
[jira] [Commented] (ARROW-8787) [R] read_parquet() don't end
[ https://issues.apache.org/jira/browse/ARROW-8787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106820#comment-17106820 ]

Neal Richardson commented on ARROW-8787:

Hmm, this could be a duplicate of ARROW-7288. Could you try setting the locale like that issue mentions and see if that works?
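For anyone trying this, a minimal sketch of what "setting the locale" could look like in an R session; the exact locale category and value are assumptions drawn from the ARROW-7288 discussion, not a confirmed fix:

{code:r}
# Assumed workaround sketch: switch to the C locale before reading,
# then restore the previous setting afterwards.
old_locale <- Sys.getlocale("LC_CTYPE")
Sys.setlocale("LC_CTYPE", "C")
arrow::read_parquet("test.parquet")
Sys.setlocale("LC_CTYPE", old_locale)
{code}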
[jira] [Commented] (ARROW-8787) [R] read_parquet() don't end
[ https://issues.apache.org/jira/browse/ARROW-8787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106791#comment-17106791 ]

Neal Richardson commented on ARROW-8787:

It should be instant. It's instantaneous when we run the tests in our CI, and also when CRAN runs it too. I'm not sure what's different about your system to even begin to give you recommendations on what to do. Providing {{sessionInfo()}} might be a start, as well as anything unique about how your system is configured.
[jira] [Resolved] (ARROW-8717) [CI][Packaging] Add build dependency on boost to homebrew
[ https://issues.apache.org/jira/browse/ARROW-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson resolved ARROW-8717.
Resolution: Fixed

Issue resolved by pull request 7173
https://github.com/apache/arrow/pull/7173

> [CI][Packaging] Add build dependency on boost to homebrew
>
> Key: ARROW-8717
> URL: https://issues.apache.org/jira/browse/ARROW-8717
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Continuous Integration, Packaging
> Reporter: Neal Richardson
> Assignee: Neal Richardson
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> cf. https://github.com/Homebrew/homebrew-core/pull/54287
> and revise the Travis jobs to uninstall boost and thrift before checking the formula
[jira] [Resolved] (ARROW-8604) [R][CI] Update CI to use R 4.0
[ https://issues.apache.org/jira/browse/ARROW-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson resolved ARROW-8604.
Resolution: Fixed

Issue resolved by pull request 7107
https://github.com/apache/arrow/pull/7107

> [R][CI] Update CI to use R 4.0
>
> Key: ARROW-8604
> URL: https://issues.apache.org/jira/browse/ARROW-8604
> Project: Apache Arrow
> Issue Type: Bug
> Components: Continuous Integration, R
> Reporter: Francois Saint-Jacques
> Assignee: Neal Richardson
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> [Master|https://github.com/apache/arrow/runs/622393526] fails to compile. The C++ cmake build is not using the same [compiler|https://github.com/apache/arrow/runs/622393526#step:8:807] as the R extension [compiler|https://github.com/apache/arrow/runs/622393526#step:11:141].
> {code:java}
> // Files installed here
> adding: arrow-0.17.0.9000/lib-4.9.3/i386/libarrow.a (deflated 85%)
> adding: arrow-0.17.0.9000/lib-4.9.3/i386/libarrow_dataset.a (deflated 82%)
> adding: arrow-0.17.0.9000/lib-4.9.3/i386/libparquet.a (deflated 84%)
> adding: arrow-0.17.0.9000/lib-4.9.3/i386/libsnappy.a (deflated 61%)
> adding: arrow-0.17.0.9000/lib-4.9.3/i386/libthrift.a (deflated 81%)
> // Linker is using `-L`
> C:/Rtools/mingw_32/bin/g++ -shared -s -static-libgcc -o arrow.dll tmp.def array.o array_from_vector.o array_to_vector.o arraydata.o arrowExports.o buffer.o chunkedarray.o compression.o compute.o csv.o dataset.o datatype.o expression.o feather.o field.o filesystem.o io.o json.o memorypool.o message.o parquet.o py-to-r.o recordbatch.o recordbatchreader.o recordbatchwriter.o schema.o symbols.o table.o threadpool.o -L../windows/arrow-0.17.0.9000/lib-8.3.0/i386 -L../windows/arrow-0.17.0.9000/lib/i386 -lparquet -larrow_dataset -larrow -lthrift -lsnappy -lz -lzstd -llz4 -lcrypto -lcrypt32 -lws2_32 -LC:/R/bin/i386 -lR
> C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: cannot find -lparquet
> [the same "cannot find" error repeats for -larrow_dataset, -larrow, -lthrift, and -lsnappy]
> {code}
> C++ developers, rejoice: this is almost the end of gcc-4.9.
[jira] [Resolved] (ARROW-8768) [R][CI] Fix nightly as-cran spurious failure
[ https://issues.apache.org/jira/browse/ARROW-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson resolved ARROW-8768.
Resolution: Fixed

Issue resolved by pull request 7151
https://github.com/apache/arrow/pull/7151

> [R][CI] Fix nightly as-cran spurious failure
>
> Key: ARROW-8768
> URL: https://issues.apache.org/jira/browse/ARROW-8768
> Project: Apache Arrow
> Issue Type: Bug
> Components: Continuous Integration, R
> Reporter: Neal Richardson
> Assignee: Neal Richardson
> Priority: Major
> Fix For: 1.0.0
>
> An extra check we added to ensure that the package doesn't write anything to the user's home directory started failing on one of the 5 as-cran checks. It appears that a new feature of texlive2020, which is apparently invoked on checking that the pdf manual can be built, adds some caching junk to the home dir. It is unlikely that this is a real failure, probably just an artifact of the test environment.
[jira] [Commented] (ARROW-8782) [Rust] [DataFusion] Add benchmarks based on NYC Taxi data set
[ https://issues.apache.org/jira/browse/ARROW-8782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106398#comment-17106398 ]

Neal Richardson commented on ARROW-8782:

[~fsaintjacques] has a Python script somewhere for downloading taxi CSVs and turning them into Parquet.

> [Rust] [DataFusion] Add benchmarks based on NYC Taxi data set
>
> Key: ARROW-8782
> URL: https://issues.apache.org/jira/browse/ARROW-8782
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust, Rust - DataFusion
> Reporter: Andy Grove
> Assignee: Andy Grove
> Priority: Major
> Fix For: 1.0.0
>
> I plan on adding a new benchmarks folder beneath the datafusion crate, containing benchmarks based on the NYC Taxi data set. The benchmark will be a CLI and will support running a number of different queries against CSV and Parquet.
> The README will contain instructions for downloading the data set.
> The benchmark will produce CSV files containing results.
> These benchmarks will allow us to manually verify performance before major releases and on an ongoing basis as we make changes to Arrow/Parquet/DataFusion.
> I will be basing this on existing benchmarks I recently built in Ballista [1] (I am the only contributor to these benchmarks so far).
> A Dockerfile will be provided, making it easy to restrict CPU and RAM when running these benchmarks.
>
> [1] https://github.com/ballista-compute/ballista/tree/master/rust/benchmarks
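That script isn't linked here, but as a rough sketch of the same CSV-to-Parquet conversion using the arrow R package instead (the file names below are assumptions, not the actual taxi data layout):

{code:r}
library(arrow)

# Hypothetical file names; the real data set is one CSV per month per taxi type.
csv_files <- c("yellow_tripdata_2019-01.csv", "yellow_tripdata_2019-02.csv")
for (f in csv_files) {
  # Read each CSV as an Arrow Table (not a data.frame) and write it back out as Parquet.
  tab <- read_csv_arrow(f, as_data_frame = FALSE)
  write_parquet(tab, sub("\\.csv$", ".parquet", f))
}
{code}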
[jira] [Commented] (ARROW-8787) [R] read_parquet() don't end
[ https://issues.apache.org/jira/browse/ARROW-8787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106377#comment-17106377 ]

Neal Richardson commented on ARROW-8787:

Your example works on my machine. Are you able to run the examples from the docs? {{example(read_parquet)}}
[jira] [Updated] (ARROW-8779) [R] Implement conversion to List
[ https://issues.apache.org/jira/browse/ARROW-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson updated ARROW-8779:
Summary: [R] Implement conversion to List (was: [R] Unable to write Struct Layout to file (.arrow, .parquet))

> Key: ARROW-8779
> URL: https://issues.apache.org/jira/browse/ARROW-8779
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Affects Versions: 0.16.0, 0.17.0
> Reporter: Dominic Dennenmoser
> Priority: Major
>
> It seems there is no method implemented to write a StructArray (within a Table) to file. A common case would be list columns in a dataframe. If I have understood the documentation correctly, this should be realisable within the current C++ library framework.
> I tested this with the following df structure:
> {code:none}
> df
> |-- id
> |-- data
> |   |-- a
> |   |-- b
> |   |-- c
> |   |-- d
> {code}
> I got the following error message:
> {code:none}
> Error in Table__from_dots(dots, schema) : NotImplemented: Converting vector to arrow type struct indices=int8, ordered=0>, d: double> not implemented
> {code}
> I have tried it with {{arrow}} 0.17.0 under {{R}} 3.6.1.
[jira] [Updated] (ARROW-8779) [R] Unable to write Struct Layout to file (.arrow, .parquet)
[ https://issues.apache.org/jira/browse/ARROW-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson updated ARROW-8779:
Labels: (was: features patch)
[jira] [Commented] (ARROW-8779) [R] Unable to write Struct Layout to file (.arrow, .parquet)
[ https://issues.apache.org/jira/browse/ARROW-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105751#comment-17105751 ]

Neal Richardson commented on ARROW-8779:

Here's a more minimal reproducer:
{code:r}
Array$create(list(data.frame(a = 1)))
Error in Array__from_vector(x, type) :
  NotImplemented: Converting vector to arrow type struct not implemented
{code}
It seems that we support creating ListArrays and StructArrays from R, but not a List of Structs:
{code}
> Array$create(list(1))
ListArray
[ [ 1 ] ]
> Array$create(data.frame(a = 1))
StructArray
-- is_valid: all not null
-- child 0 type: double
[ 1 ]
{code}
[jira] [Commented] (ARROW-8779) [R] Unable to write Struct Layout to file (.arrow, .parquet)
[ https://issues.apache.org/jira/browse/ARROW-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105733#comment-17105733 ]

Neal Richardson commented on ARROW-8779:

Could you please provide a minimal reproducible example?
[jira] [Created] (ARROW-8768) [R][CI] Fix nightly as-cran spurious failure
Neal Richardson created ARROW-8768:

Summary: [R][CI] Fix nightly as-cran spurious failure
Key: ARROW-8768
URL: https://issues.apache.org/jira/browse/ARROW-8768
Project: Apache Arrow
Issue Type: Bug
Components: Continuous Integration, R
Reporter: Neal Richardson
Assignee: Neal Richardson
Fix For: 1.0.0

An extra check we added to ensure that the package doesn't write anything to the user's home directory started failing on one of the 5 as-cran checks. It appears that a new feature of texlive2020, which is apparently invoked on checking that the pdf manual can be built, adds some caching junk to the home dir. It is unlikely that this is a real failure, probably just an artifact of the test environment.
[jira] [Commented] (ARROW-8748) [R] Implementing methodes for combining arrow tabels using dplyr::bind_rows and dplyr::bind_cols
[ https://issues.apache.org/jira/browse/ARROW-8748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104730#comment-17104730 ]

Neal Richardson commented on ARROW-8748:

We could add methods to concatenate Tables in Arrow memory (the function probably exists in the C++ library). But I'm not sure that's the best solution to your problem. If you have several Tables and you dump them to a file, you don't need to concatenate them in memory first. You can use the lower-level {{RecordBatchStreamWriter}} that {{write_ipc_stream}} wraps. Something like:
{code:r}
file_obj <- FileOutputStream$create(file_name)
writer <- RecordBatchStreamWriter$create(file_obj, batches[[1]]$schema)
for (batch in batches) {
  writer$write(batch)
}
writer$close()
file_obj$close()
{code}
See {{?RecordBatchWriter}}.

> [R] Implementing methodes for combining arrow tabels using dplyr::bind_rows and dplyr::bind_cols
>
> Key: ARROW-8748
> URL: https://issues.apache.org/jira/browse/ARROW-8748
> Project: Apache Arrow
> Issue Type: New Feature
> Components: R
> Reporter: Dominic Dennenmoser
> Priority: Major
> Labels: features, performance, pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> First of all, many thanks for your hard work! I was quite excited when you implemented some basic functions of the {{dplyr}} package. Is there a way to combine two or more arrow tables into one by rows or columns? At the moment my workaround looks like this:
> {code:r}
> dplyr::bind_rows(
>   "a" = arrow.table.1 %>% dplyr::collect(),
>   "b" = arrow.table.2 %>% dplyr::collect(),
>   "c" = arrow.table.3 %>% dplyr::collect(),
>   "d" = arrow.table.4 %>% dplyr::collect(),
>   .id = "ID"
> ) %>%
>   arrow::write_ipc_stream(sink = "file_name_combined_tables.arrow")
> {code}
> But this is not really a satisfying approach, because it pulls the data back into the R environment as dataframes/tibbles, which might exhaust RAM. Perhaps you have a better workaround on hand. It would be great if you could implement the {{bind_rows}} and {{bind_cols}} methods provided by {{dplyr}}:
> {code:r}
> dplyr::bind_rows(
>   "a" = arrow.table.1,
>   "b" = arrow.table.2,
>   "c" = arrow.table.3,
>   "d" = arrow.table.4,
>   .id = "ID"
> ) %>%
>   arrow::write_ipc_stream(sink = "file_name_combined_tables.arrow")
> {code}
[jira] [Updated] (ARROW-8748) [R] Implementing methodes for combining arrow tabels using dplyr::bind_rows and dplyr::bind_cols
[ https://issues.apache.org/jira/browse/ARROW-8748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson updated ARROW-8748:
Labels: features performance (was: features performance pull-request-available)
[jira] [Updated] (ARROW-8549) [R] Assorted post-0.17 release cleanups
[ https://issues.apache.org/jira/browse/ARROW-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson updated ARROW-8549:
Fix Version/s: 0.17.1

> [R] Assorted post-0.17 release cleanups
>
> Key: ARROW-8549
> URL: https://issues.apache.org/jira/browse/ARROW-8549
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Reporter: Neal Richardson
> Assignee: Neal Richardson
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0, 0.17.1
> Time Spent: 20m
> Remaining Estimate: 0h
[jira] [Updated] (ARROW-8699) [R] Fix automatic r_to_py conversion
[ https://issues.apache.org/jira/browse/ARROW-8699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson updated ARROW-8699:
Fix Version/s: 0.17.1

> [R] Fix automatic r_to_py conversion
>
> Key: ARROW-8699
> URL: https://issues.apache.org/jira/browse/ARROW-8699
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Reporter: Neal Richardson
> Assignee: Neal Richardson
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0, 0.17.1
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> See https://github.com/rstudio/reticulate/issues/748
[jira] [Resolved] (ARROW-8758) [R] Updates for compatibility with dplyr 1.0
[ https://issues.apache.org/jira/browse/ARROW-8758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson resolved ARROW-8758.
Resolution: Fixed

Issue resolved by pull request 7147
https://github.com/apache/arrow/pull/7147

> [R] Updates for compatibility with dplyr 1.0
>
> Key: ARROW-8758
> URL: https://issues.apache.org/jira/browse/ARROW-8758
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Reporter: Neal Richardson
> Assignee: Neal Richardson
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.17.1, 1.0.0
> Time Spent: 20m
> Remaining Estimate: 0h
[jira] [Updated] (ARROW-8726) [R][Dataset] segfault with a mis-specified partition
[ https://issues.apache.org/jira/browse/ARROW-8726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson updated ARROW-8726:
Fix Version/s: (was: 0.17.1)

> [R][Dataset] segfault with a mis-specified partition
>
> Key: ARROW-8726
> URL: https://issues.apache.org/jira/browse/ARROW-8726
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Reporter: Jonathan Keane
> Assignee: Francois Saint-Jacques
> Priority: Major
> Fix For: 1.0.0
>
> Calling filter + collect on a dataset with a mis-specified partitioning causes a segfault. Though this is clearly an input error, it would be nice if there were some guidance that something is wrong with the partitioning.
> {code:r}
> library(arrow)
> library(dplyr)
> dir.create("multi_mtcars/one", recursive = TRUE)
> dir.create("multi_mtcars/two", recursive = TRUE)
> write_parquet(mtcars, "multi_mtcars/one/mtcars.parquet")
> write_parquet(mtcars, "multi_mtcars/two/mtcars.parquet")
> ds <- open_dataset("multi_mtcars", partitioning = c("level", "nothing"))
> # the following will segfault
> ds %>%
>   filter(cyl > 8) %>%
>   collect()
> {code}
[jira] [Updated] (ARROW-8741) [Python][Packaging] Keep VS2015 with bundled dependencies for the windows wheels
[ https://issues.apache.org/jira/browse/ARROW-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson updated ARROW-8741:
Fix Version/s: 1.0.0

> [Python][Packaging] Keep VS2015 with bundled dependencies for the windows wheels
>
> Key: ARROW-8741
> URL: https://issues.apache.org/jira/browse/ARROW-8741
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Packaging
> Reporter: Krisztian Szucs
> Assignee: Krisztian Szucs
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0, 0.17.1
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> The Windows wheels need to be fixed for the release.
[jira] [Created] (ARROW-8758) [R] Updates for compatibility with dplyr 1.0
Neal Richardson created ARROW-8758:

Summary: [R] Updates for compatibility with dplyr 1.0
Key: ARROW-8758
URL: https://issues.apache.org/jira/browse/ARROW-8758
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Neal Richardson
Assignee: Neal Richardson
Fix For: 1.0.0, 0.17.1
[jira] [Updated] (ARROW-8734) [R] autobrew script always builds from master
[ https://issues.apache.org/jira/browse/ARROW-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson updated ARROW-8734:
Summary: [R] autobrew script always builds from master (was: [R] Compilation error on macOS)
[jira] [Commented] (ARROW-8734) [R] Compilation error on macOS
[ https://issues.apache.org/jira/browse/ARROW-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17101968#comment-17101968 ]

Neal Richardson commented on ARROW-8734:

I think the load error is because you had the package already loaded, and I think it is fixed if you restart R. You're right on the other point: we aren't building nightly binaries for 4.0 yet, it seems.
[jira] [Commented] (ARROW-8734) [R] Compilation error on macOS
[ https://issues.apache.org/jira/browse/ARROW-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17101958#comment-17101958 ]

Neal Richardson commented on ARROW-8734:

Aside: we don't yet have simple tooling for setting up a full development build (C++ and R from source), particularly on macOS because of the funky autobrew build system. It's on my wishlist, but understandably it's been farther down the list given the effort to get release packaging smooth. I think the binary should work for your current need, but if you want to build from source the autobrew way, see what https://github.com/ursa-labs/arrow-r-nightly/blob/master/.travis.yml does; that's how the nightly binaries are made from a git checkout.
[jira] [Comment Edited] (ARROW-8734) [R] Compilation error on macOS
[ https://issues.apache.org/jira/browse/ARROW-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17101953#comment-17101953 ]

Neal Richardson edited comment on ARROW-8734 at 5/7/20, 6:34 PM:

-You need to be on the exact same version of C++ library and R package. Are you installing from a git checkout?- Ah yes, you said that. But if you're installing from a checkout, don't use {{install_arrow}}. If you want a development version on macOS, why not use our nightly binaries and avoid the hassle of a source build? {{arrow::install_arrow(nightly=TRUE)}} should do it, or you could set the {{repos}} arg to install.packages yourself.

was (Author: npr): You need to be on the exact same version of C++ library and R package. Are you installing from a git checkout? If you want a development version on macOS, why not use our nightly binaries and avoid the hassle of a source build? {{arrow::install_arrow(nightly=TRUE)}} should do it, or you could set the {{repos}} arg to install.packages yourself.
[jira] [Commented] (ARROW-8734) [R] Compilation error on macOS
[ https://issues.apache.org/jira/browse/ARROW-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17101953#comment-17101953 ] Neal Richardson commented on ARROW-8734: You need to be on the exact same version of C++ library and R package. Are you installing from a git checkout? If you want a development version on macOS, why not use our nightly binaries and avoid the hassle of a source build? {{arrow::install_arrow(nightly=TRUE)}} should do it, or you could set the {{repos}} arg to install.packages yourself. > [R] Compilation error on macOS > -- > > Key: ARROW-8734 > URL: https://issues.apache.org/jira/browse/ARROW-8734 > Project: Apache Arrow > Issue Type: Bug > Components: R >Reporter: Jonathan Keane >Priority: Major > > I've tried to install / build from source (with from a git checkout and using > the built-in `install_arrow()`) and when compiling I'm getting the following > error reliably during the auto brew process: > {code:bash} > x System command 'R' failed, exit status: 1, stdout + stderr: > E> * checking for file ‘/Users/jkeane/Dropbox/arrow/r/DESCRIPTION’ ... OK > E> * preparing ‘arrow’: > E> * checking DESCRIPTION meta-information ... OK > E> * cleaning src > E> * running ‘cleanup’ > E> * installing the package to build vignettes > E> --- > E> * installing *source* package ‘arrow’ ... > E> ** using staged installation > E> *** Generating code with data-raw/codegen.R > E> There were 27 warnings (use warnings() to see them) > E> *** > 375 functions decorated with [[arrow|s3::export]] > E> *** > generated file `src/arrowExports.cpp` > E> *** > generated file `R/arrowExports.R` > E> *** Downloading apache-arrow > E> Using local manifest for apache-arrow > E> Thu May 7 13:13:42 CDT 2020: Auto-brewing apache-arrow in > /var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T//build-apache-arrow... > E> ==> Tapping autobrew/core from https://github.com/autobrew/homebrew-core > E> Tapped 2 commands and 4639 formulae (4,888 files, 12.7MB). > E> lz4 > E> openssl > E> thrift > E> snappy > E> ==> Downloading > https://homebrew.bintray.com/bottles/lz4-1.8.3.mojave.bottle.tar.gz > E> Already downloaded: > /var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T/downloads/b4158ef68d619dbf78935df6a42a70b8339a65bc8876cbb4446355ccd40fa5de--lz4-1.8.3.mojave.bottle.tar.gz > E> ==> Pouring lz4-1.8.3.mojave.bottle.tar.gz > E> ==> Skipping post_install step for autobrew... > E> 🍺 > /private/var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T/build-apache-arrow/Cellar/lz4/1.8.3: > 22 files, 512.7KB > E> ==> Downloading > https://homebrew.bintray.com/bottles/openssl-1.0.2p.mojave.bottle.tar.gz > E> Already downloaded: > /var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T/downloads/fbb493745981c8b26c0fab115c76c2a70142bfde9e776c450277e9dfbbba0bb2--openssl-1.0.2p.mojave.bottle.tar.gz > E> ==> Pouring openssl-1.0.2p.mojave.bottle.tar.gz > E> ==> Skipping post_install step for autobrew... > E> ==> Caveats > E> openssl is keg-only, which means it was not symlinked into > /private/var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T/build-apache-arrow, > E> because Apple has deprecated use of OpenSSL in favor of its own TLS and > crypto libraries. 
> E> > E> If you need to have openssl first in your PATH run: > E> echo 'export > PATH="/private/var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T/build-apache-arrow/opt/openssl/bin:$PATH"' > >> ~/.zshrc > E> > E> For compilers to find openssl you may need to set: > E> export > LDFLAGS="-L/private/var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T/build-apache-arrow/opt/openssl/lib" > E> export > CPPFLAGS="-I/private/var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T/build-apache-arrow/opt/openssl/include" > E> > E> For pkg-config to find openssl you may need to set: > E> export > PKG_CONFIG_PATH="/private/var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T/build-apache-arrow/opt/openssl/lib/pkgconfig" > E> > E> ==> Summary > E> 🍺 > /private/var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T/build-apache-arrow/Cellar/openssl/1.0.2p: > 1,793 files, 12MB > E> ==> Downloading > https://homebrew.bintray.com/bottles/thrift-0.11.0.mojave.bottle.tar.gz > E> Already downloaded: > /var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T/downloads/7e05ea11a9f7f924dd7f8f36252ec73a24958b7f214f71e3752a355e75e589bd--thrift-0.11.0.mojave.bottle.tar.gz > E> ==> Pouring thrift-0.11.0.mojave.bottle.tar.gz > E> ==> Skipping post_install step for autobrew... > E> ==> Caveats > E> To install Ruby binding: > E> gem install thrift > E> ==> Summary > E> 🍺 > /private/var/folders/45/n5gfjjtn05j877spnpbnhqqwgn/T/build-apache-arrow/Cellar/thrift/0.11.0: > 102 files, 7MB > E> ==> Downloading > https://homebrew.bintray.com/bottles/snappy-1.1.7_1.mojave.bottle.tar.gz
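The suggestion in the messages above amounts to a one-liner; a minimal sketch in R (the explicit repos URL in the second variant is an illustrative assumption, not an authoritative address):
{code}
# easiest: let the helper pick the nightly build for you
arrow::install_arrow(nightly = TRUE)

# or point install.packages() at a nightly repository yourself
# (URL below is an assumption, for illustration only):
install.packages("arrow", repos = "https://arrow-r-nightly.s3.amazonaws.com")
{code}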
[jira] [Created] (ARROW-8718) [R] Add str() methods to objects
Neal Richardson created ARROW-8718: -- Summary: [R] Add str() methods to objects Key: ARROW-8718 URL: https://issues.apache.org/jira/browse/ARROW-8718 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 Apparently this will make the RStudio IDE show useful things in the environment panel. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8717) [CI][Packaging] Add build dependency on boost to homebrew
Neal Richardson created ARROW-8717: -- Summary: [CI][Packaging] Add build dependency on boost to homebrew Key: ARROW-8717 URL: https://issues.apache.org/jira/browse/ARROW-8717 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, Packaging Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 cf. https://github.com/Homebrew/homebrew-core/pull/54287 and revise the Travis jobs to uninstall boost and thrift before checking the formula -- This message was sent by Atlassian Jira (v8.3.4#803005)
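The Travis revision described above would look roughly like the following, assuming a plain Homebrew check of the formula (formula name and flags are assumptions, not the actual CI script):
{code}
# remove preinstalled copies so the formula's declared dependencies are exercised:
brew uninstall --ignore-dependencies boost thrift || true
# then build the formula from source to check it:
brew install --build-from-source apache-arrow
{code}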
[jira] [Updated] (ARROW-8604) [R][CI] Update CI to use R 4.0
[ https://issues.apache.org/jira/browse/ARROW-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8604: --- Component/s: Continuous Integration > [R][CI] Update CI to use R 4.0 > -- > > Key: ARROW-8604 > URL: https://issues.apache.org/jira/browse/ARROW-8604 > Project: Apache Arrow > Issue Type: Bug > Components: Continuous Integration, R >Reporter: Francois Saint-Jacques >Assignee: Neal Richardson >Priority: Major > Fix For: 1.0.0 > > > [Master|[https://github.com/apache/arrow/runs/622393526]] fails to compile. > The C++ cmake build is not using the same > [compiler|[https://github.com/apache/arrow/runs/622393526#step:8:807]] than > the R extension > [compiler|[https://github.com/apache/arrow/runs/622393526#step:11:141]]. > {code:java} > // Files installed here > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libarrow.a (deflated 85%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libarrow_dataset.a (deflated 82%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libparquet.a (deflated 84%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libsnappy.a (deflated 61%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libthrift.a (deflated 81%) > // Linker is using `-L` > C:/Rtools/mingw_32/bin/g++ -shared -s -static-libgcc -o arrow.dll tmp.def > array.o array_from_vector.o array_to_vector.o arraydata.o arrowExports.o > buffer.o chunkedarray.o compression.o compute.o csv.o dataset.o datatype.o > expression.o feather.o field.o filesystem.o io.o json.o memorypool.o > message.o parquet.o py-to-r.o recordbatch.o recordbatchreader.o > recordbatchwriter.o schema.o symbols.o table.o threadpool.o > -L../windows/arrow-0.17.0.9000/lib-8.3.0/i386 > -L../windows/arrow-0.17.0.9000/lib/i386 -lparquet -larrow_dataset -larrow > -lthrift -lsnappy -lz -lzstd -llz4 -lcrypto -lcrypt32 -lws2_32 > -LC:/R/bin/i386 -lR > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -lparquet > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -larrow_dataset > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -larrow > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -lthrift > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -lsnappy > {code} > > C++ developers, rejoice, this is almost the end of gcc-4.9. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-8604) [R][CI] Update CI to use R 4.0
[ https://issues.apache.org/jira/browse/ARROW-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8604: --- Summary: [R][CI] Update CI to use R 4.0 (was: [R] Update CI to use R 4.0) > [R][CI] Update CI to use R 4.0 > -- > > Key: ARROW-8604 > URL: https://issues.apache.org/jira/browse/ARROW-8604 > Project: Apache Arrow > Issue Type: Bug > Components: R >Reporter: Francois Saint-Jacques >Assignee: Neal Richardson >Priority: Major > Fix For: 1.0.0 > > > [Master|[https://github.com/apache/arrow/runs/622393526]] fails to compile. > The C++ cmake build is not using the same > [compiler|[https://github.com/apache/arrow/runs/622393526#step:8:807]] than > the R extension > [compiler|[https://github.com/apache/arrow/runs/622393526#step:11:141]]. > {code:java} > // Files installed here > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libarrow.a (deflated 85%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libarrow_dataset.a (deflated 82%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libparquet.a (deflated 84%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libsnappy.a (deflated 61%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libthrift.a (deflated 81%) > // Linker is using `-L` > C:/Rtools/mingw_32/bin/g++ -shared -s -static-libgcc -o arrow.dll tmp.def > array.o array_from_vector.o array_to_vector.o arraydata.o arrowExports.o > buffer.o chunkedarray.o compression.o compute.o csv.o dataset.o datatype.o > expression.o feather.o field.o filesystem.o io.o json.o memorypool.o > message.o parquet.o py-to-r.o recordbatch.o recordbatchreader.o > recordbatchwriter.o schema.o symbols.o table.o threadpool.o > -L../windows/arrow-0.17.0.9000/lib-8.3.0/i386 > -L../windows/arrow-0.17.0.9000/lib/i386 -lparquet -larrow_dataset -larrow > -lthrift -lsnappy -lz -lzstd -llz4 -lcrypto -lcrypt32 -lws2_32 > -LC:/R/bin/i386 -lR > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -lparquet > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -larrow_dataset > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -larrow > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -lthrift > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -lsnappy > {code} > > C++ developers, rejoice, this is almost the end of gcc-4.9. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-8703) [R] schema$metadata should be properly typed
[ https://issues.apache.org/jira/browse/ARROW-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8703: --- Priority: Major (was: Critical)
> [R] schema$metadata should be properly typed
>
> Key: ARROW-8703
> URL: https://issues.apache.org/jira/browse/ARROW-8703
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Affects Versions: 0.17.0
> Reporter: René Rex
> Priority: Major
>
> Currently, I am trying to export numeric data plus some metadata from Python into a parquet file and read it in R. However, the metadata seems to be a dict in Python but a string in R. I would have expected a list (which is roughly a dict in Python). Am I missing something? Here is the code to demonstrate the issue:
> {code}
> import sys
> import numpy as np
> import pyarrow as pa
> import pyarrow.parquet as pq
> print(sys.version)
> print(pa.__version__)
> x = np.random.randint(0, 10, (10, 3))
> arrays = [pa.array(x[:, i]) for i in range(x.shape[1])]
> table = pa.Table.from_arrays(arrays=arrays, names=['A', 'B', 'C'],
>                              metadata={'foo': '42'})
> pq.write_table(table, 'array.parquet', compression='snappy')
> table = pq.read_table('array.parquet')
> metadata = table.schema.metadata
> print(metadata)
> print(type(metadata))
> {code}
> And in R:
> {code}
> library(arrow)
> print(R.version)
> print(packageVersion("arrow"))
> table <- read_parquet("array.parquet", as_data_frame = FALSE)
> metadata <- table$schema$metadata
> print(metadata)
> print(is(metadata))
> print(metadata["foo"])
> {code}
> Output Python:
> {code}
> 3.6.8 (default, Aug 7 2019, 17:28:10)
> [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
> 0.13.0
> OrderedDict([(b'foo', b'42')])
> {code}
> Output R:
> {code}
> [1] ‘0.17.0’
> [1] "\n-- metadata --\nfoo: 42"
> [1] "character" "vector" "data.frameRowLabels"
> [4] "SuperClassMethod"
> [1] NA
> {code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-8699) [R] Fix automatic r_to_py conversion
[ https://issues.apache.org/jira/browse/ARROW-8699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-8699. Resolution: Fixed Issue resolved by pull request 7102 [https://github.com/apache/arrow/pull/7102] > [R] Fix automatic r_to_py conversion > > > Key: ARROW-8699 > URL: https://issues.apache.org/jira/browse/ARROW-8699 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > See https://github.com/rstudio/reticulate/issues/748 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8699) [R] Fix automatic r_to_py conversion
Neal Richardson created ARROW-8699: -- Summary: [R] Fix automatic r_to_py conversion Key: ARROW-8699 URL: https://issues.apache.org/jira/browse/ARROW-8699 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 See https://github.com/rstudio/reticulate/issues/748 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8635) [R] test-filesystem.R takes ~40 seconds to run?
[ https://issues.apache.org/jira/browse/ARROW-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095975#comment-17095975 ] Neal Richardson commented on ARROW-8635: Have you set this aws-sdk environment variable? https://github.com/apache/arrow/blob/master/ci/scripts/r_test.sh#L44-L46 François found it, and it seems to help. > [R] test-filesystem.R takes ~40 seconds to run? > --- > > Key: ARROW-8635 > URL: https://issues.apache.org/jira/browse/ARROW-8635 > Project: Apache Arrow > Issue Type: Bug > Components: R >Reporter: Wes McKinney >Priority: Major > Fix For: 1.0.0 > > > {code} > ✔ | 22 | Expressions > ✔ | 107 | Feather [0.2 s] > ✔ | 7 | Field > ✔ | 40 | File system [38.1 s] > ✔ | 6 | install_arrow() > ✔ | 26 | JsonTableReader [0.1 s] > ✔ | 24 | MessageReader > ✔ | 12 | Message > ✔ | 31 | Parquet file reading/writing [0.2 s] > ⠏ | 0 | To/from Pythonvirtualenv: arrow-test > {code} > Is this expected? I assume it's related to S3 but that seems like a long > time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
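For reference, the linked script disables an AWS SDK lookup before running the tests; a sketch, with the variable name copied from the script at the time of writing (treat it as an assumption if the script has since moved):
{code}
# keep the S3 filesystem tests from waiting on the EC2 metadata endpoint:
export AWS_EC2_METADATA_DISABLED=TRUE
{code}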
[jira] [Updated] (ARROW-8624) [Website] Install page should mention arrow-dataset packages
[ https://issues.apache.org/jira/browse/ARROW-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8624: --- Description: I've seen a few reports like [https://github.com/apache/arrow/issues/7055], where the user reports that they've installed the arrow system packages, we can see that they exist, but {{pkg-config}} reports that it doesn't have them. I think this is because {{-larrow_dataset}} isn't found. As the output on that issue shows, while arrow core headers and libraries are there, arrow_dataset is not. -Searching through the packaging scripts (such as [https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in]), while there is some metadata about a dataset package, I see that ARROW_DATASET=ON is not set anywhere, so I don't think we're building it.- So apparently we are building it, but we aren't documenting how to get it. was: I've seen a few reports like https://github.com/apache/arrow/issues/7055, where the user reports that they've installed the arrow system packages, we can see that they exist, but {{pkg-config}} reports that it doesn't have them. I think this is because {{-larrow_dataset}} isn't found. As the output on that issue shows, while arrow core headers and libraries are there, arrow_dataset is not. ~~Searching through the packaging scripts (such as https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in), while there is some metadata about a dataset package, I see that ARROW_DATASET=ON is not set anywhere, so I don't think we're building it.~~ So apparently we are building it, but we aren't documenting how to get it. > [Website] Install page should mention arrow-dataset packages > > > Key: ARROW-8624 > URL: https://issues.apache.org/jira/browse/ARROW-8624 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Affects Versions: 0.17.0 >Reporter: Neal Richardson >Priority: Critical > > I've seen a few reports like [https://github.com/apache/arrow/issues/7055], > where the user reports that they've installed the arrow system packages, we > can see that they exist, but {{pkg-config}} reports that it doesn't have > them. I think this is because {{-larrow_dataset}} isn't found. As the output > on that issue shows, while arrow core headers and libraries are there, > arrow_dataset is not. > -Searching through the packaging scripts (such as > [https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in]), > while there is some metadata about a dataset package, I see that > ARROW_DATASET=ON is not set anywhere, so I don't think we're building it.- So > apparently we are building it, but we aren't documenting how to get it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
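By way of illustration, the install-page addition would point users at the separate dataset packages; the package names below are assumptions based on the Apache Arrow repositories, not quoted from the page:
{code}
# Debian/Ubuntu:
sudo apt install libarrow-dev libarrow-dataset-dev
# CentOS/RHEL:
sudo yum install arrow-devel arrow-dataset-devel
{code}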
[jira] [Reopened] (ARROW-8624) [Packaging] Linux system packages aren't building with ARROW_DATASET=ON
[ https://issues.apache.org/jira/browse/ARROW-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reopened ARROW-8624: > [Packaging] Linux system packages aren't building with ARROW_DATASET=ON > --- > > Key: ARROW-8624 > URL: https://issues.apache.org/jira/browse/ARROW-8624 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Affects Versions: 0.17.0 >Reporter: Neal Richardson >Priority: Critical > > I've seen a few reports like https://github.com/apache/arrow/issues/7055, > where the user reports that they've installed the arrow system packages, we > can see that they exist, but {{pkg-config}} reports that it doesn't have > them. I think this is because {{-larrow_dataset}} isn't found. As the output > on that issue shows, while arrow core headers and libraries are there, > arrow_dataset is not. > Searching through the packaging scripts (such as > https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in), > while there is some metadata about a dataset package, I see that > ARROW_DATASET=ON is not set anywhere, so I don't think we're building it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-8624) [Website] Install page should mention arrow-dataset packages
[ https://issues.apache.org/jira/browse/ARROW-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8624: --- Summary: [Website] Install page should mention arrow-dataset packages (was: [Packaging] Linux system packages aren't building with ARROW_DATASET=ON) > [Website] Install page should mention arrow-dataset packages > > > Key: ARROW-8624 > URL: https://issues.apache.org/jira/browse/ARROW-8624 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Affects Versions: 0.17.0 >Reporter: Neal Richardson >Priority: Critical > > I've seen a few reports like https://github.com/apache/arrow/issues/7055, > where the user reports that they've installed the arrow system packages, we > can see that they exist, but {{pkg-config}} reports that it doesn't have > them. I think this is because {{-larrow_dataset}} isn't found. As the output > on that issue shows, while arrow core headers and libraries are there, > arrow_dataset is not. > Searching through the packaging scripts (such as > https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in), > while there is some metadata about a dataset package, I see that > ARROW_DATASET=ON is not set anywhere, so I don't think we're building it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-8624) [Website] Install page should mention arrow-dataset packages
[ https://issues.apache.org/jira/browse/ARROW-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8624: --- Description: I've seen a few reports like https://github.com/apache/arrow/issues/7055, where the user reports that they've installed the arrow system packages, we can see that they exist, but {{pkg-config}} reports that it doesn't have them. I think this is because {{-larrow_dataset}} isn't found. As the output on that issue shows, while arrow core headers and libraries are there, arrow_dataset is not. ~~Searching through the packaging scripts (such as https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in), while there is some metadata about a dataset package, I see that ARROW_DATASET=ON is not set anywhere, so I don't think we're building it.~~ So apparently we are building it, but we aren't documenting how to get it. was: I've seen a few reports like https://github.com/apache/arrow/issues/7055, where the user reports that they've installed the arrow system packages, we can see that they exist, but {{pkg-config}} reports that it doesn't have them. I think this is because {{-larrow_dataset}} isn't found. As the output on that issue shows, while arrow core headers and libraries are there, arrow_dataset is not. Searching through the packaging scripts (such as https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in), while there is some metadata about a dataset package, I see that ARROW_DATASET=ON is not set anywhere, so I don't think we're building it. > [Website] Install page should mention arrow-dataset packages > > > Key: ARROW-8624 > URL: https://issues.apache.org/jira/browse/ARROW-8624 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Affects Versions: 0.17.0 >Reporter: Neal Richardson >Priority: Critical > > I've seen a few reports like https://github.com/apache/arrow/issues/7055, > where the user reports that they've installed the arrow system packages, we > can see that they exist, but {{pkg-config}} reports that it doesn't have > them. I think this is because {{-larrow_dataset}} isn't found. As the output > on that issue shows, while arrow core headers and libraries are there, > arrow_dataset is not. > ~~Searching through the packaging scripts (such as > https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in), > while there is some metadata about a dataset package, I see that > ARROW_DATASET=ON is not set anywhere, so I don't think we're building it.~~ > So apparently we are building it, but we aren't documenting how to get it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8624) [Packaging] Linux system packages aren't building with ARROW_DATASET=ON
Neal Richardson created ARROW-8624: -- Summary: [Packaging] Linux system packages aren't building with ARROW_DATASET=ON Key: ARROW-8624 URL: https://issues.apache.org/jira/browse/ARROW-8624 Project: Apache Arrow Issue Type: Improvement Components: Packaging Affects Versions: 0.17.0 Reporter: Neal Richardson I've seen a few reports like https://github.com/apache/arrow/issues/7055, where the user reports that they've installed the arrow system packages, we can see that they exist, but {{pkg-config}} reports that it doesn't have them. I think this is because {{-larrow_dataset}} isn't found. As the output on that issue shows, while arrow core headers and libraries are there, arrow_dataset is not. Searching through the packaging scripts (such as https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in), while there is some metadata about a dataset package, I see that ARROW_DATASET=ON is not set anywhere, so I don't think we're building it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-8611) [R] Can't install arrow 0.17 on Ubuntu 18.04 R 3.6.3
[ https://issues.apache.org/jira/browse/ARROW-8611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-8611. Fix Version/s: 1.0.0 Assignee: Neal Richardson Resolution: Information Provided > [R] Can't install arrow 0.17 on Ubuntu 18.04 R 3.6.3 > > > Key: ARROW-8611 > URL: https://issues.apache.org/jira/browse/ARROW-8611 > Project: Apache Arrow > Issue Type: Improvement >Reporter: Zhuo Jia Dai >Assignee: Neal Richardson >Priority: Major > Fix For: 1.0.0 > > > This is the error I get when I try to install it. How do I provide more info > to help you diagnose? Seems to be an issue with Thrift which I have built on > my machine. > > How do I remove thrift and install it? > > "Unable to locate package libthrift-dev " when I try `sudo apt install > libthrift-dev` > > {quote} > ** testing if installed package can be loaded from temporary location > Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath > = DLLpath, ...): > unable to load shared object > '/home/xiaodai/R/x86_64-pc-linux-gnu-library/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so': > > /home/xiaodai/R/x86_64-pc-linux-gnu-library/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so: > undefined symbol: _ZTIN6apache6thrift8protocol9TProtocolE > Error: loading failed > Execution halted > ERROR: loading failed > * removing ‘/home/xiaodai/R/x86_64-pc-linux-gnu-library/3.6/arrow’ > * restoring previous ‘/home/xiaodai/R/x86_64-pc-linux-gnu-library/3.6/arrow’ > The downloaded source packages are in > ‘/tmp/RtmpUF6P1q/downloaded_packages’ > Warning message: > In install.packages("arrow") : > installation of package ‘arrow’ had non-zero exit status > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8611) [R] Can't install arrow 0.17 on Ubuntu 18.04 R 3.6.3
[ https://issues.apache.org/jira/browse/ARROW-8611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094910#comment-17094910 ] Neal Richardson commented on ARROW-8611: Glad the binary worked for you. For future reference, if you don't want to use whatever version of thrift you have on your system when you install arrow, you can set {{EXTRA_CMAKE_FLAGS="-DThrift_SOURCE=BUNDLED"}} (sadly, case sensitive) and the Arrow C++ build will build thrift itself. > [R] Can't install arrow 0.17 on Ubuntu 18.04 R 3.6.3 > > > Key: ARROW-8611 > URL: https://issues.apache.org/jira/browse/ARROW-8611 > Project: Apache Arrow > Issue Type: Improvement >Reporter: Zhuo Jia Dai >Priority: Major > > This is the error I get when I try to install it. How do I provide more info > to help you diagnose? Seems to be an issue with Thrift which I have built on > my machine. > > How do I remove thrift and install it? > > "Unable to locate package libthrift-dev " when I try `sudo apt install > libthrift-dev` > > {quote} > ** testing if installed package can be loaded from temporary location > Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath > = DLLpath, ...): > unable to load shared object > '/home/xiaodai/R/x86_64-pc-linux-gnu-library/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so': > > /home/xiaodai/R/x86_64-pc-linux-gnu-library/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so: > undefined symbol: _ZTIN6apache6thrift8protocol9TProtocolE > Error: loading failed > Execution halted > ERROR: loading failed > * removing ‘/home/xiaodai/R/x86_64-pc-linux-gnu-library/3.6/arrow’ > * restoring previous ‘/home/xiaodai/R/x86_64-pc-linux-gnu-library/3.6/arrow’ > The downloaded source packages are in > ‘/tmp/RtmpUF6P1q/downloaded_packages’ > Warning message: > In install.packages("arrow") : > installation of package ‘arrow’ had non-zero exit status > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
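A minimal sketch of that suggestion from within R (same effect as exporting the variable in the shell before R starts):
{code}
Sys.setenv(EXTRA_CMAKE_FLAGS = "-DThrift_SOURCE=BUNDLED")  # case sensitive
install.packages("arrow")
{code}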
[jira] [Resolved] (ARROW-8513) [Python] Expose Take with Table input in Python
[ https://issues.apache.org/jira/browse/ARROW-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-8513. Resolution: Fixed Issue resolved by pull request 7039 [https://github.com/apache/arrow/pull/7039] > [Python] Expose Take with Table input in Python > --- > > Key: ARROW-8513 > URL: https://issues.apache.org/jira/browse/ARROW-8513 > Project: Apache Arrow > Issue Type: New Feature > Components: Python >Reporter: Wes McKinney >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > This is implemented in C++ but not exposed in the bindings -- This message was sent by Atlassian Jira (v8.3.4#803005)
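A short usage sketch of the binding this exposes, assuming a pyarrow build that includes the linked PR:
{code}
import pyarrow as pa

table = pa.table({"a": [1, 2, 3], "b": ["x", "y", "z"]})
subset = table.take([0, 2])   # rows 0 and 2, in that order
print(subset.to_pydict())     # {'a': [1, 3], 'b': ['x', 'z']}
{code}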
[jira] [Resolved] (ARROW-8572) [Python] Expose UnionArray.array and other fields
[ https://issues.apache.org/jira/browse/ARROW-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-8572. Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 7027 [https://github.com/apache/arrow/pull/7027] > [Python] Expose UnionArray.array and other fields > - > > Key: ARROW-8572 > URL: https://issues.apache.org/jira/browse/ARROW-8572 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.17.0 >Reporter: David Li >Assignee: David Li >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Currently in Python, you can construct a UnionArray easily, but getting the > data back out (without copying) is near-impossible. We should expose the > getter for UnionArray.array so we can pull out the constituent arrays. We > should also expose fields like mode while we're at it. > The use case is: in Flight, we'd like to write multiple distinct datasets > (with distinct schemas) in a single logical call; using UnionArrays lets us > combine these datasets into a single logical dataset. -- This message was sent by Atlassian Jira (v8.3.4#803005)
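For context, the construction side was already available; a sketch of that half (the new getters are deliberately not shown, since their exact names are defined in the linked PR):
{code}
import pyarrow as pa

# building a sparse union was already easy:
types = pa.array([0, 1, 0], type=pa.int8())
children = [pa.array([1, 2, 3]), pa.array(["a", "b", "c"])]
union = pa.UnionArray.from_sparse(types, children)
# the linked PR adds accessors to recover `children` and fields such as
# the union mode from `union` without copying.
{code}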
[jira] [Commented] (ARROW-8611) [R] Can't install arrow 0.17 on Ubuntu 18.04 R 3.6.3
[ https://issues.apache.org/jira/browse/ARROW-8611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094602#comment-17094602 ] Neal Richardson commented on ARROW-8611: You can also set {{NOT_CRAN=true}} so that you'll install with a prebuilt binary. See http://arrow.apache.org/docs/r/articles/install.html for more details. > [R] Can't install arrow 0.17 on Ubuntu 18.04 R 3.6.3 > > > Key: ARROW-8611 > URL: https://issues.apache.org/jira/browse/ARROW-8611 > Project: Apache Arrow > Issue Type: Improvement >Reporter: Zhuo Jia Dai >Priority: Major > > This is the error I get when I try to install it. How do I provide more info > to help you diagnose? Seems to be an issue with Thrift which I have built on > my machine. > > How do I remove thrift and install it? > > "Unable to locate package libthrift-dev " when I try `sudo apt install > libthrift-dev` > > {quote} > ** testing if installed package can be loaded from temporary location > Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath > = DLLpath, ...): > unable to load shared object > '/home/xiaodai/R/x86_64-pc-linux-gnu-library/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so': > > /home/xiaodai/R/x86_64-pc-linux-gnu-library/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so: > undefined symbol: _ZTIN6apache6thrift8protocol9TProtocolE > Error: loading failed > Execution halted > ERROR: loading failed > * removing ‘/home/xiaodai/R/x86_64-pc-linux-gnu-library/3.6/arrow’ > * restoring previous ‘/home/xiaodai/R/x86_64-pc-linux-gnu-library/3.6/arrow’ > The downloaded source packages are in > ‘/tmp/RtmpUF6P1q/downloaded_packages’ > Warning message: > In install.packages("arrow") : > installation of package ‘arrow’ had non-zero exit status > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
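A minimal sketch of that suggestion from within R:
{code}
Sys.setenv(NOT_CRAN = "true")
install.packages("arrow")  # fetches a prebuilt libarrow binary where available
{code}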
[jira] [Commented] (ARROW-8605) [R] Add support for brotli to Windows build
[ https://issues.apache.org/jira/browse/ARROW-8605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094599#comment-17094599 ] Neal Richardson commented on ARROW-8605: The more important part would be adding brotli to the rtools-packages project so that it could be included. See my comment on ARROW-6960. > [R] Add support for brotli to Windows build > --- > > Key: ARROW-8605 > URL: https://issues.apache.org/jira/browse/ARROW-8605 > Project: Apache Arrow > Issue Type: New Feature >Affects Versions: 0.17.0 >Reporter: Hei >Priority: Major > > Hi, > My friend installed arrow and tried to open a parquet file with brotli codec. > But then, he got an error when calling read_parquet("my.parquet") on Windows: > {code} > Error in parquet__arrow__FileReader__ReadTable(self) : >IOError: NotImplemented: Brotli codec support not built > {code} > It sounds similar to ARROW-6960. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8586) [R] installation failure on CentOS 7
[ https://issues.apache.org/jira/browse/ARROW-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094591#comment-17094591 ] Neal Richardson commented on ARROW-8586: Sorry, I should have checked before commenting. Apparently it's {{EXTRA_CMAKE_FLAGS}}, not {{EXTRA_CMAKE_ARGS}}. That's why there wasn't more output. If you're willing to try again with that correction, I'd be curious to see why thrift is failing to install. FTR, my conclusions at this point are: 1. Binary version detection on centos when lsb_release is installed isn't behaving correctly (so you have to specify LIBARROW_BINARY=centos-7 instead of that being determined automatically). Will fix that. 2. The centos-7 binary is being built with {{CC=/usr/bin/gcc CXX=/usr/bin/g++}}, which appears to mean gcc 4.8, and that doesn't play well with newer compiler versions if you have them. I'll have to explore why I set those in the build Dockerfile and think about ways to ensure that compatibility when you install. 3. We still don't know why thrift is having problems building. > [R] installation failure on CentOS 7 > > > Key: ARROW-8586 > URL: https://issues.apache.org/jira/browse/ARROW-8586 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: CentOS 7 >Reporter: Hei >Priority: Major > > Hi, > I am trying to install arrow via RStudio, but it seems like it is not working > that after I installed the package, it kept asking me to run > arrow::install_arrow() even after I did: > {code} > > install.packages("arrow") > Installing package into ‘/home/hc/R/x86_64-redhat-linux-gnu-library/3.6’ > (as ‘lib’ is unspecified) > trying URL 'https://cran.rstudio.com/src/contrib/arrow_0.17.0.tar.gz' > Content type 'application/x-gzip' length 242534 bytes (236 KB) > == > downloaded 236 KB > * installing *source* package ‘arrow’ ... 
> ** package ‘arrow’ successfully unpacked and MD5 sums checked > ** using staged installation > *** Successfully retrieved C++ source > *** Building C++ libraries > cmake > arrow > ./configure: line 132: cd: libarrow/arrow-0.17.0/lib: Not a directory > - NOTE --- > After installation, please run arrow::install_arrow() > for help installing required runtime libraries > - > ** libs > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c array.cpp -o array.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c array_from_vector.cpp -o > array_from_vector.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c array_to_vector.cpp -o > array_to_vector.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c arraydata.cpp -o arraydata.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c arrowExports.cpp -o > arrowExports.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c buffer.cpp -o buffer.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-s
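Putting the corrected variable name together with the rest of the comment, the suggested retry looks roughly like this (shell, before starting R):
{code}
# note EXTRA_CMAKE_FLAGS, not EXTRA_CMAKE_ARGS:
export ARROW_R_DEV=true
export EXTRA_CMAKE_FLAGS="-DARROW_VERBOSE_THIRDPARTY_BUILD=ON"
unset LIBARROW_BINARY   # force the C++ library to build from source
Rscript -e 'install.packages("arrow")'
{code}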
[jira] [Updated] (ARROW-8556) [R] zstd symbol not found if there are multiple installations of zstd
[ https://issues.apache.org/jira/browse/ARROW-8556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8556: --- Summary: [R] zstd symbol not found if there are multiple installations of zstd (was: [R] zstd symbol not found on Ubuntu 19.10) > [R] zstd symbol not found if there are multiple installations of zstd > - > > Key: ARROW-8556 > URL: https://issues.apache.org/jira/browse/ARROW-8556 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: Ubuntu 19.10 > R 3.6.1 >Reporter: Karl Dunkle Werner >Priority: Major > > I would like to install the `arrow` R package on my Ubuntu 19.10 system. > Prebuilt binaries are unavailable, and I want to enable compression, so I set > the {{LIBARROW_MINIMAL=false}} environment variable. When I do so, it looks > like the package is able to compile, but can't be loaded. I'm able to install > correctly if I don't set the {{LIBARROW_MINIMAL}} variable. > Here's the error I get: > {code:java} > ** testing if installed package can be loaded from temporary location > Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath > = DLLpath, ...): > unable to load shared object > '~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so': > ~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so: undefined symbol: > ZSTD_initCStream > Error: loading failed > Execution halted > ERROR: loading failed > * removing ‘~/.R/3.6/arrow’ > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
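The reporter's setup translates to roughly the following shell session (a sketch):
{code}
export LIBARROW_MINIMAL=false   # build libarrow with compression support
Rscript -e 'install.packages("arrow")'
{code}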
[jira] [Commented] (ARROW-8556) [R] zstd symbol not found on Ubuntu 19.10
[ https://issues.apache.org/jira/browse/ARROW-8556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094017#comment-17094017 ] Neal Richardson commented on ARROW-8556: Thanks, that makes some sense. Googling the original undefined symbol error message, all I found were issues caused by having multiple versions of zstd installed (e.g. https://github.com/facebook/wangle/issues/73), but since you said you didn't have it installed before, I didn't think it was relevant. I wish there were a good way to make it not fail in that case, to make sure that if you build from source in the R build, that that version gets picked up. Maybe someone else will have an idea on how to achieve that. > [R] zstd symbol not found on Ubuntu 19.10 > - > > Key: ARROW-8556 > URL: https://issues.apache.org/jira/browse/ARROW-8556 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: Ubuntu 19.10 > R 3.6.1 >Reporter: Karl Dunkle Werner >Priority: Major > > I would like to install the `arrow` R package on my Ubuntu 19.10 system. > Prebuilt binaries are unavailable, and I want to enable compression, so I set > the {{LIBARROW_MINIMAL=false}} environment variable. When I do so, it looks > like the package is able to compile, but can't be loaded. I'm able to install > correctly if I don't set the {{LIBARROW_MINIMAL}} variable. > Here's the error I get: > {code:java} > ** testing if installed package can be loaded from temporary location > Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath > = DLLpath, ...): > unable to load shared object > '~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so': > ~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so: undefined symbol: > ZSTD_initCStream > Error: loading failed > Execution halted > ERROR: loading failed > * removing ‘~/.R/3.6/arrow’ > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
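One way to check for the multiple-installation scenario described above, sketched for a Linux system with ldconfig:
{code}
# every libzstd the dynamic linker knows about:
ldconfig -p | grep -i zstd
# plus locally built copies that might shadow or conflict:
ls /usr/local/lib 2>/dev/null | grep -i zstd
{code}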
[jira] [Commented] (ARROW-8586) [R] installation failure on CentOS 7
[ https://issues.apache.org/jira/browse/ARROW-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094013#comment-17094013 ] Neal Richardson commented on ARROW-8586: Thanks. A few thoughts. Apologies if this is confusing; we're going deep in some different directions: * {{ARROW_R_DEV=true}} is for installation verbosity only, not for crash reporting, and from the install logs you shared, I can see that apparently thrift failed to build/install. I haven't seen it fail in that specific way before, I don't think. If you want to go deeper into the Matrix with me, try reinstalling with {{ARROW_R_DEV=true}} and {{EXTRA_CMAKE_ARGS="-DARROW_VERBOSE_THIRDPARTY_BUILD=ON"}} (but unset {{LIBARROW_BINARY}} so that we build from source) and maybe we'll see what's going on there. * Alternatively, you could try installing {{thrift}} from {{yum}}, though I'm not sure that they have a new enough version (0.11 is the minimum). * Odd that you got a segfault when reading a parquet file. Is there anything special about how your system is configured (compilers, toolchains, etc.) beyond a vanilla CentOS 7 environment? The centos-7 binary is built on a base centos image with this Dockerfile: https://github.com/ursa-labs/arrow-r-nightly/blob/master/linux/yum.Dockerfile So maybe see if setting {{CC=/usr/bin/gcc CXX=/usr/bin/g++}} before installing the R package (with {{LIBARROW_BINARY=centos-7}}). * If that makes a difference, I wonder if https://github.com/ursa-labs/arrow-r-nightly/blob/master/linux/yum.Dockerfile#L18-L20 is what is needed to get the thrift compilation when building everything from source to work. * Thanks for the {{lsb_release}} output. That confirms my suspicion about why it did not try to download the centos-7 binary to begin with (though obviously that's not desirable unless we get it not to segfault for you). > [R] installation failure on CentOS 7 > > > Key: ARROW-8586 > URL: https://issues.apache.org/jira/browse/ARROW-8586 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: CentOS 7 >Reporter: Hei >Priority: Major > > Hi, > I am trying to install arrow via RStudio, but it seems like it is not working > that after I installed the package, it kept asking me to run > arrow::install_arrow() even after I did: > {code} > > install.packages("arrow") > Installing package into ‘/home/hc/R/x86_64-redhat-linux-gnu-library/3.6’ > (as ‘lib’ is unspecified) > trying URL 'https://cran.rstudio.com/src/contrib/arrow_0.17.0.tar.gz' > Content type 'application/x-gzip' length 242534 bytes (236 KB) > == > downloaded 236 KB > * installing *source* package ‘arrow’ ... 
> ** package ‘arrow’ successfully unpacked and MD5 sums checked > ** using staged installation > *** Successfully retrieved C++ source > *** Building C++ libraries > cmake > arrow > ./configure: line 132: cd: libarrow/arrow-0.17.0/lib: Not a directory > - NOTE --- > After installation, please run arrow::install_arrow() > for help installing required runtime libraries > - > ** libs > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c array.cpp -o array.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c array_from_vector.cpp -o > array_from_vector.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c array_to_vector.cpp -o > array_to_vector.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c arraydata.cpp -o arraydata.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wal
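The two suggestions in the comment above combine into something like this (shell sketch):
{code}
# use the system gcc and the prebuilt centos-7 binary:
export CC=/usr/bin/gcc CXX=/usr/bin/g++
export LIBARROW_BINARY=centos-7
Rscript -e 'install.packages("arrow")'
{code}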
[jira] [Commented] (ARROW-8556) [R] zstd symbol not found on Ubuntu 19.10
[ https://issues.apache.org/jira/browse/ARROW-8556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094002#comment-17094002 ] Neal Richardson commented on ARROW-8556: Any ideas [~fsaintjacques] [~bkietz]? > [R] zstd symbol not found on Ubuntu 19.10 > - > > Key: ARROW-8556 > URL: https://issues.apache.org/jira/browse/ARROW-8556 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: Ubuntu 19.10 > R 3.6.1 >Reporter: Karl Dunkle Werner >Priority: Major > > I would like to install the `arrow` R package on my Ubuntu 19.10 system. > Prebuilt binaries are unavailable, and I want to enable compression, so I set > the {{LIBARROW_MINIMAL=false}} environment variable. When I do so, it looks > like the package is able to compile, but can't be loaded. I'm able to install > correctly if I don't set the {{LIBARROW_MINIMAL}} variable. > Here's the error I get: > {code:java} > ** testing if installed package can be loaded from temporary location > Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath > = DLLpath, ...): > unable to load shared object > '~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so': > ~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so: undefined symbol: > ZSTD_initCStream > Error: loading failed > Execution halted > ERROR: loading failed > * removing ‘~/.R/3.6/arrow’ > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-8607) [R][CI] Unbreak builds following R 4.0 release
[ https://issues.apache.org/jira/browse/ARROW-8607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-8607. Resolution: Fixed Issue resolved by pull request 7047 [https://github.com/apache/arrow/pull/7047] > [R][CI] Unbreak builds following R 4.0 release > -- > > Key: ARROW-8607 > URL: https://issues.apache.org/jira/browse/ARROW-8607 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration, R >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Just a tourniquet to get master passing again while I work on ARROW-8604. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8607) [R][CI] Unbreak builds following R 4.0 release
Neal Richardson created ARROW-8607: -- Summary: [R][CI] Unbreak builds following R 4.0 release Key: ARROW-8607 URL: https://issues.apache.org/jira/browse/ARROW-8607 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 Just a tourniquet to get master passing again while I work on ARROW-8604. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8606) [CI] Don't trigger all builds on a change to any file in ci/
Neal Richardson created ARROW-8606: -- Summary: [CI] Don't trigger all builds on a change to any file in ci/ Key: ARROW-8606 URL: https://issues.apache.org/jira/browse/ARROW-8606 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration Reporter: Neal Richardson Assignee: Neal Richardson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8605) [R] Add support for brotli to Windows build
[ https://issues.apache.org/jira/browse/ARROW-8605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093766#comment-17093766 ] Neal Richardson commented on ARROW-8605: You are correct. We do not build the windows package with brotli. Here is what we do build with: https://github.com/apache/arrow/blob/master/ci/scripts/PKGBUILD#L28-L31 If you were interested in adding it, ARROW-6960 is the right model to follow. > [R] Add support for brotli to Windows build > --- > > Key: ARROW-8605 > URL: https://issues.apache.org/jira/browse/ARROW-8605 > Project: Apache Arrow > Issue Type: New Feature >Affects Versions: 0.17.0 >Reporter: Hei >Priority: Major > > Hi, > My friend installed arrow and tried to open a parquet file with brotli codec. > But then, he got an error when calling read_parquet("my.parquet") on Windows: > {code} > Error in parquet__arrow__FileReader__ReadTable(self) : >IOError: NotImplemented: Brotli codec support not built > {code} > It sounds similar to ARROW-6960. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-8605) [R] Add support for brotli to Windows build
[ https://issues.apache.org/jira/browse/ARROW-8605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8605: --- Summary: [R] Add support for brotli to Windows build (was: Missing brotli Support in R Package?) > [R] Add support for brotli to Windows build > --- > > Key: ARROW-8605 > URL: https://issues.apache.org/jira/browse/ARROW-8605 > Project: Apache Arrow > Issue Type: Bug >Affects Versions: 0.17.0 >Reporter: Hei >Priority: Major > > Hi, > My friend installed arrow and tried to open a parquet file with brotli codec. > But then, he got an error when calling read_parquet("my.parquet") on Windows: > {code} > Error in parquet__arrow__FileReader__ReadTable(self) : >IOError: NotImplemented: Brotli codec support not built > {code} > It sounds similar to ARROW-6960. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-8605) [R] Add support for brotli to Windows build
[ https://issues.apache.org/jira/browse/ARROW-8605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8605: --- Issue Type: New Feature (was: Bug) > [R] Add support for brotli to Windows build > --- > > Key: ARROW-8605 > URL: https://issues.apache.org/jira/browse/ARROW-8605 > Project: Apache Arrow > Issue Type: New Feature >Affects Versions: 0.17.0 >Reporter: Hei >Priority: Major > > Hi, > My friend installed arrow and tried to open a parquet file with brotli codec. > But then, he got an error when calling read_parquet("my.parquet") on Windows: > {code} > Error in parquet__arrow__FileReader__ReadTable(self) : >IOError: NotImplemented: Brotli codec support not built > {code} > It sounds similar to ARROW-6960. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-8604) [R] Windows compilation failure
[ https://issues.apache.org/jira/browse/ARROW-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-8604: -- Assignee: Neal Richardson > [R] Windows compilation failure > --- > > Key: ARROW-8604 > URL: https://issues.apache.org/jira/browse/ARROW-8604 > Project: Apache Arrow > Issue Type: Bug > Components: R >Reporter: Francois Saint-Jacques >Assignee: Neal Richardson >Priority: Major > Fix For: 1.0.0 > > > [Master|[https://github.com/apache/arrow/runs/622393526]] fails to compile. > The C++ cmake build is not using the same > [compiler|[https://github.com/apache/arrow/runs/622393526#step:8:807]] than > the R extension > [compiler|[https://github.com/apache/arrow/runs/622393526#step:11:141]]. > {code:java} > // Files installed here > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libarrow.a (deflated 85%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libarrow_dataset.a (deflated 82%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libparquet.a (deflated 84%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libsnappy.a (deflated 61%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libthrift.a (deflated 81%) > // Linker is using `-L` > C:/Rtools/mingw_32/bin/g++ -shared -s -static-libgcc -o arrow.dll tmp.def > array.o array_from_vector.o array_to_vector.o arraydata.o arrowExports.o > buffer.o chunkedarray.o compression.o compute.o csv.o dataset.o datatype.o > expression.o feather.o field.o filesystem.o io.o json.o memorypool.o > message.o parquet.o py-to-r.o recordbatch.o recordbatchreader.o > recordbatchwriter.o schema.o symbols.o table.o threadpool.o > -L../windows/arrow-0.17.0.9000/lib-8.3.0/i386 > -L../windows/arrow-0.17.0.9000/lib/i386 -lparquet -larrow_dataset -larrow > -lthrift -lsnappy -lz -lzstd -llz4 -lcrypto -lcrypt32 -lws2_32 > -LC:/R/bin/i386 -lR > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -lparquet > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -larrow_dataset > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -larrow > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -lthrift > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -lsnappy > {code} > > C++ developers, rejoice, this is almost the end of gcc-4.9. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-8604) [R] Update CI to use R 4.0
[ https://issues.apache.org/jira/browse/ARROW-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8604: --- Summary: [R] Update CI to use R 4.0 (was: [R] Windows compilation failure) > [R] Update CI to use R 4.0 > -- > > Key: ARROW-8604 > URL: https://issues.apache.org/jira/browse/ARROW-8604 > Project: Apache Arrow > Issue Type: Bug > Components: R >Reporter: Francois Saint-Jacques >Assignee: Neal Richardson >Priority: Major > Fix For: 1.0.0 > > > [Master|[https://github.com/apache/arrow/runs/622393526]] fails to compile. > The C++ cmake build is not using the same > [compiler|[https://github.com/apache/arrow/runs/622393526#step:8:807]] than > the R extension > [compiler|[https://github.com/apache/arrow/runs/622393526#step:11:141]]. > {code:java} > // Files installed here > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libarrow.a (deflated 85%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libarrow_dataset.a (deflated 82%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libparquet.a (deflated 84%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libsnappy.a (deflated 61%) > adding: arrow-0.17.0.9000/lib-4.9.3/i386/libthrift.a (deflated 81%) > // Linker is using `-L` > C:/Rtools/mingw_32/bin/g++ -shared -s -static-libgcc -o arrow.dll tmp.def > array.o array_from_vector.o array_to_vector.o arraydata.o arrowExports.o > buffer.o chunkedarray.o compression.o compute.o csv.o dataset.o datatype.o > expression.o feather.o field.o filesystem.o io.o json.o memorypool.o > message.o parquet.o py-to-r.o recordbatch.o recordbatchreader.o > recordbatchwriter.o schema.o symbols.o table.o threadpool.o > -L../windows/arrow-0.17.0.9000/lib-8.3.0/i386 > -L../windows/arrow-0.17.0.9000/lib/i386 -lparquet -larrow_dataset -larrow > -lthrift -lsnappy -lz -lzstd -llz4 -lcrypto -lcrypt32 -lws2_32 > -LC:/R/bin/i386 -lR > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -lparquet > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -larrow_dataset > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -larrow > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -lthrift > C:/Rtools/mingw_32/bin/../lib/gcc/i686-w64-mingw32/4.9.3/../../../../i686-w64-mingw32/bin/ld.exe: > cannot find -lsnappy > {code} > > C++ developers, rejoice, this is almost the end of gcc-4.9. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
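The failure mode above comes down to two version strings that must agree: the g++ version baked into the unpacked lib-* directory names and the version of the toolchain doing the linking. A hypothetical illustration in R (the real check happens in the package's Windows configure step, not in code like this):
{code}
# Hypothetical illustration of the mismatch: the static libraries were
# unpacked under lib-4.9.3/, but the -L flag points at lib-8.3.0/, so
# ld reports "cannot find -larrow" and friends.
unpacked_with <- "4.9.3"  # version in the zip's lib-* paths above
linking_with <- system2("g++", "-dumpversion", stdout = TRUE)  # e.g. "8.3.0"
if (!identical(unpacked_with, linking_with)) {
  message("libraries live in lib-", unpacked_with,
          "/ but the linker searches lib-", linking_with, "/")
}
{code}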
[jira] [Updated] (ARROW-8593) [C++] Parquet file_serialize_test.cc fails to build with musl libc
[ https://issues.apache.org/jira/browse/ARROW-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8593: --- Summary: [C++] Parquet file_serialize_test.cc fails to build with musl libc (was: Parquet file_serialize_test.cc fails to build with musl libc) > [C++] Parquet file_serialize_test.cc fails to build with musl libc > -- > > Key: ARROW-8593 > URL: https://issues.apache.org/jira/browse/ARROW-8593 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 0.17.0 >Reporter: Tobias Mayer >Assignee: Tobias Mayer >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {{TestBufferedRowGroupWriter}} declares a variable named {{PAGE_SIZE}}. This > clashes with a macro constant by the same name defined in musl's {{limits.h}}. > I don't think using ALLCAPS for a local name adds value here, so I'm going to > change it to {{page_size}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
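The clash is easy to see in miniature; a hypothetical file, not the actual test code:
{code}
// Minimal illustration of the clash: musl's limits.h can define
// PAGE_SIZE as a macro, so a local variable of that name expands to an
// integer literal and fails to compile there.
#include <limits.h>
#include <cstdint>

int64_t BufferedRowGroupDemo() {
  // const int64_t PAGE_SIZE = 1024;  // breaks under musl: macro expansion
  const int64_t page_size = 1024;     // the rename applied in the fix
  return page_size;
}
{code}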
[jira] [Commented] (ARROW-8556) [R] zstd symbol not found on Ubuntu 19.10
[ https://issues.apache.org/jira/browse/ARROW-8556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091937#comment-17091937 ] Neal Richardson commented on ARROW-8556: Thanks, that's helpful. So what I see is that when the C++ library builds, `cmake` finds the system `zstd`, so it opts to use that instead of building it from source too. But then when the R package shared library tries to load, it can't find it. This is beyond my level of C++ competence to debug further, so I'll solicit help from someone else. > [R] zstd symbol not found on Ubuntu 19.10 > - > > Key: ARROW-8556 > URL: https://issues.apache.org/jira/browse/ARROW-8556 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: Ubuntu 19.10 > R 3.6.1 >Reporter: Karl Dunkle Werner >Priority: Major > > I would like to install the `arrow` R package on my Ubuntu 19.10 system. > Prebuilt binaries are unavailable, and I want to enable compression, so I set > the {{LIBARROW_MINIMAL=false}} environment variable. When I do so, it looks > like the package is able to compile, but can't be loaded. I'm able to install > correctly if I don't set the {{LIBARROW_MINIMAL}} variable. > Here's the error I get: > {code:java} > ** testing if installed package can be loaded from temporary location > Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath > = DLLpath, ...): > unable to load shared object > '~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so': > ~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so: undefined symbol: > ZSTD_initCStream > Error: loading failed > Execution halted > ERROR: loading failed > * removing ‘~/.R/3.6/arrow’ > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
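One way to confirm that diagnosis, sketched via R's system() wrappers (the .so path is hypothetical, since the failed install is removed; nm and ldd are assumed to be on the PATH):
{code}
# Hypothetical check of the diagnosis above: is libzstd linked at all,
# and is ZSTD_initCStream genuinely unresolved ("U") in arrow.so?
so <- path.expand("~/.R/3.6/arrow/libs/arrow.so")
system(paste("ldd", so))                               # list linked libraries
system(paste("nm -D", so, "| grep ZSTD_initCStream"))  # "U" = undefined
{code}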
[jira] [Commented] (ARROW-8556) [R] zstd symbol not found on Ubuntu 19.10
[ https://issues.apache.org/jira/browse/ARROW-8556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091890#comment-17091890 ] Neal Richardson commented on ARROW-8556: Maybe it's something about 19.10, maybe it's something about your particular setup, or maybe it's a more general issue. To debug, I'd recommend setting `ARROW_R_DEV=true` (for verbosity), `LIBARROW_BINARY=false` (to ensure that we build from source), and `LIBARROW_MINIMAL=false` (so that it turns on zstd) and reinstalling. Then attach here the full installation logs, and I can try to sift through them. Then I may have some other ideas of things to try. Thanks for your help! > [R] zstd symbol not found on Ubuntu 19.10 > - > > Key: ARROW-8556 > URL: https://issues.apache.org/jira/browse/ARROW-8556 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: Ubuntu 19.10 > R 3.6.1 >Reporter: Karl Dunkle Werner >Priority: Major > > I would like to install the `arrow` R package on my Ubuntu 19.10 system. > Prebuilt binaries are unavailable, and I want to enable compression, so I set > the {{LIBARROW_MINIMAL=false}} environment variable. When I do so, it looks > like the package is able to compile, but can't be loaded. I'm able to install > correctly if I don't set the {{LIBARROW_MINIMAL}} variable. > Here's the error I get: > {code:java} > ** testing if installed package can be loaded from temporary location > Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath > = DLLpath, ...): > unable to load shared object > '~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so': > ~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so: undefined symbol: > ZSTD_initCStream > Error: loading failed > Execution halted > ERROR: loading failed > * removing ‘~/.R/3.6/arrow’ > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
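In R, the suggested settings look like this (variable names as given above; run in the same session before reinstalling):
{code}
# The debugging setup suggested above:
Sys.setenv(
  ARROW_R_DEV = "true",       # verbose build logs
  LIBARROW_BINARY = "false",  # force building libarrow from source
  LIBARROW_MINIMAL = "false"  # turn on compression support, incl. zstd
)
install.packages("arrow")
{code}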
[jira] [Resolved] (ARROW-8575) [Developer] Add issue_comment workflow to rebase a PR
[ https://issues.apache.org/jira/browse/ARROW-8575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-8575. Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 7028 [https://github.com/apache/arrow/pull/7028] > [Developer] Add issue_comment workflow to rebase a PR > - > > Key: ARROW-8575 > URL: https://issues.apache.org/jira/browse/ARROW-8575 > Project: Apache Arrow > Issue Type: Improvement > Components: Developer Tools >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8556) [R] zstd symbol not found on Ubuntu 19.10
[ https://issues.apache.org/jira/browse/ARROW-8556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091678#comment-17091678 ] Neal Richardson commented on ARROW-8556: Thanks. I've mapped ubuntu 19.10 to ubuntu-18.04 [here|https://github.com/ursa-labs/arrow-r-nightly/blob/master/linux/distro-map.csv#L13] so installation with a binary should Just Work now. I'm curious why zstd wasn't included correctly before (see that there is no {{-lzstd}} in the {{PKG_LIBS}} line), but if you want to let it lie and move on, that's fine with me, we can wait and see if anyone else experiences that. > [R] zstd symbol not found on Ubuntu 19.10 > - > > Key: ARROW-8556 > URL: https://issues.apache.org/jira/browse/ARROW-8556 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: Ubuntu 19.10 > R 3.6.1 >Reporter: Karl Dunkle Werner >Priority: Major > > I would like to install the `arrow` R package on my Ubuntu 19.10 system. > Prebuilt binaries are unavailable, and I want to enable compression, so I set > the {{LIBARROW_MINIMAL=false}} environment variable. When I do so, it looks > like the package is able to compile, but can't be loaded. I'm able to install > correctly if I don't set the {{LIBARROW_MINIMAL}} variable. > Here's the error I get: > {code:java} > ** testing if installed package can be loaded from temporary location > Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath > = DLLpath, ...): > unable to load shared object > '~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so': > ~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so: undefined symbol: > ZSTD_initCStream > Error: loading failed > Execution halted > ERROR: loading failed > * removing ‘~/.R/3.6/arrow’ > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-8586) [R] installation failure on CentOS 7
[ https://issues.apache.org/jira/browse/ARROW-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8586: --- Summary: [R] installation failure on CentOS 7 (was: Failed to Install arrow From CRAN) > [R] installation failure on CentOS 7 > > > Key: ARROW-8586 > URL: https://issues.apache.org/jira/browse/ARROW-8586 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: CentOS 7 >Reporter: Hei >Priority: Major > > Hi, > I am trying to install arrow via RStudio, but it seems like it is not working > that after I installed the package, it kept asking me to run > arrow::install_arrow() even after I did: > {code} > > install.packages("arrow") > Installing package into ‘/home/hc/R/x86_64-redhat-linux-gnu-library/3.6’ > (as ‘lib’ is unspecified) > trying URL 'https://cran.rstudio.com/src/contrib/arrow_0.17.0.tar.gz' > Content type 'application/x-gzip' length 242534 bytes (236 KB) > == > downloaded 236 KB > * installing *source* package ‘arrow’ ... > ** package ‘arrow’ successfully unpacked and MD5 sums checked > ** using staged installation > *** Successfully retrieved C++ source > *** Building C++ libraries > cmake > arrow > ./configure: line 132: cd: libarrow/arrow-0.17.0/lib: Not a directory > - NOTE --- > After installation, please run arrow::install_arrow() > for help installing required runtime libraries > - > ** libs > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c array.cpp -o array.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c array_from_vector.cpp -o > array_from_vector.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c array_to_vector.cpp -o > array_to_vector.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c arraydata.cpp -o arraydata.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c arrowExports.cpp -o > arrowExports.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c buffer.cpp -o buffer.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > 
-I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c chunkedarray.cpp -o > chunkedarray.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c compression.cpp -o > compression.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c compute.cpp -o compute.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library
[jira] [Commented] (ARROW-8586) Failed to Install arrow From CRAN
[ https://issues.apache.org/jira/browse/ARROW-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091670#comment-17091670 ] Neal Richardson commented on ARROW-8586: Thanks for the report. There seem to be two issues: (1) C++ build from source is failing, and (2) when {{install_arrow}} tries to download a prebuilt binary, it's not correctly identifying your OS version. To debug the first issue, could you please set the environment variable {{ARROW_R_DEV=true}} and retry, and share with me the (much more verbose) installation logs? To debug the second, could you please tell me what {{lsb_release -rs}} says at the command line? A workaround will be to set {{LIBARROW_BINARY=centos-7}} and reinstall (or, equivalently, call {{arrow::install_arrow(binary="centos-7")}} from R, since you have that installed). But I'd appreciate your help in debugging the issue so that we can make it work correctly going forward. > Failed to Install arrow From CRAN > - > > Key: ARROW-8586 > URL: https://issues.apache.org/jira/browse/ARROW-8586 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: CentOS 7 >Reporter: Hei >Priority: Major > > Hi, > I am trying to install arrow via RStudio, but it seems like it is not working > that after I installed the package, it kept asking me to run > arrow::install_arrow() even after I did: > {code} > > install.packages("arrow") > Installing package into ‘/home/hc/R/x86_64-redhat-linux-gnu-library/3.6’ > (as ‘lib’ is unspecified) > trying URL 'https://cran.rstudio.com/src/contrib/arrow_0.17.0.tar.gz' > Content type 'application/x-gzip' length 242534 bytes (236 KB) > == > downloaded 236 KB > * installing *source* package ‘arrow’ ... > ** package ‘arrow’ successfully unpacked and MD5 sums checked > ** using staged installation > *** Successfully retrieved C++ source > *** Building C++ libraries > cmake > arrow > ./configure: line 132: cd: libarrow/arrow-0.17.0/lib: Not a directory > - NOTE --- > After installation, please run arrow::install_arrow() > for help installing required runtime libraries > - > ** libs > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c array.cpp -o array.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c array_from_vector.cpp -o > array_from_vector.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c array_to_vector.cpp -o > array_to_vector.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c arraydata.cpp -o arraydata.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > 
-I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c arrowExports.cpp -o > arrowExports.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c buffer.cpp -o buffer.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/hc/R/x86_64-redhat-linux-gnu-library/3.6/Rcpp/include" > -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 > -grecord-gcc-switches -m64 -mtune=generic -c chunkedarray.cpp -o > chunkedarray.o > g++ -m64 -std=gnu++11 -I"/usr/include/R" -DNDEBUG > -I"/home/
[jira] [Created] (ARROW-8575) [Developer] Add issue_comment workflow to rebase a PR
Neal Richardson created ARROW-8575: -- Summary: [Developer] Add issue_comment workflow to rebase a PR Key: ARROW-8575 URL: https://issues.apache.org/jira/browse/ARROW-8575 Project: Apache Arrow Issue Type: Improvement Components: Developer Tools Reporter: Neal Richardson Assignee: Neal Richardson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8566) [R] error when writing POSIXct to spark
[ https://issues.apache.org/jira/browse/ARROW-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090916#comment-17090916 ] Neal Richardson commented on ARROW-8566: Great, thanks for debugging with me. I created https://github.com/sparklyr/sparklyr/issues/2439 because I think the current {{arrow}} behavior is correct (certainly the 0.16 behavior was not correct, unless you happen to live in UTC) so this might need to be worked around in {{sparklyr}}. > [R] error when writing POSIXct to spark > --- > > Key: ARROW-8566 > URL: https://issues.apache.org/jira/browse/ARROW-8566 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: #> R version 3.6.3 (2020-02-29) > #> Platform: x86_64-apple-darwin15.6.0 (64-bit) > #> Running under: macOS Mojave 10.14.6 > sparklyr::spark_version(sc) > #> [1] '2.4.5' >Reporter: Curt Bergmann >Priority: Major > > monospaced text}}``` r > library(DBI) > library(sparklyr) > library(arrow) > #> > #> Attaching package: 'arrow' > #> The following object is masked from 'package:utils': > #> > #> timestamp > sc <- spark_connect(master = "local") > sparklyr::spark_version(sc) > #> [1] '2.4.5' > x <- data.frame(y = Sys.time()) > dbWriteTable(sc, "test_posixct", x) > #> Error: org.apache.spark.SparkException: Job aborted. > #> at > org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198) > #> at > org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159) > #> at > org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:503) > #> at > org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:217) > #> at > org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:176) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > #> at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > #> at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) > #> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) > #> at > org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83) > #> at > org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81) > #> at > org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80) > #> at > org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127) > #> at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75) > #> at > 
org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:474) > #> at > org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:453) > #> at > org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:409) > #> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > #> at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > #> at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > #> at java.lang.reflect.Method.invoke(Method.java:498) > #> at sparklyr.Invoke.invoke(invoke.scala:147) > #> at sparklyr.StreamHandler.handleMethodCall(stream.scala:136) > #> at sparklyr.StreamHandler.read(stream.scala:61) > #> at > sparklyr.BackendHandler$$anonfun$channelRead0$1.apply$mcV$sp(handler.scala:58) > #> at scala.util.control.Breaks.breakable(Breaks.scala:38) > #> at sparklyr.BackendHandler.channelRead0(handler.scala:38) > #> at sparklyr.BackendHandler.channelRead0(handler.scala:14) > #> at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > #> at > io.netty.channel.AbstractChannelHandl
[jira] [Updated] (ARROW-8566) [R] error when writing POSIXct to spark
[ https://issues.apache.org/jira/browse/ARROW-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8566: --- Summary: [R] error when writing POSIXct to spark (was: Upgraded from r package arrow 16 to r package arrow 17 and now get an error when writing posixct to spark) > [R] error when writing POSIXct to spark > --- > > Key: ARROW-8566 > URL: https://issues.apache.org/jira/browse/ARROW-8566 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: #> R version 3.6.3 (2020-02-29) > #> Platform: x86_64-apple-darwin15.6.0 (64-bit) > #> Running under: macOS Mojave 10.14.6 > sparklyr::spark_version(sc) > #> [1] '2.4.5' >Reporter: Curt Bergmann >Priority: Major > > monospaced text}}``` r > library(DBI) > library(sparklyr) > library(arrow) > #> > #> Attaching package: 'arrow' > #> The following object is masked from 'package:utils': > #> > #> timestamp > sc <- spark_connect(master = "local") > sparklyr::spark_version(sc) > #> [1] '2.4.5' > x <- data.frame(y = Sys.time()) > dbWriteTable(sc, "test_posixct", x) > #> Error: org.apache.spark.SparkException: Job aborted. > #> at > org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198) > #> at > org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159) > #> at > org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:503) > #> at > org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:217) > #> at > org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:176) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > #> at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > #> at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) > #> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) > #> at > org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83) > #> at > org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81) > #> at > org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80) > #> at > org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127) > #> at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75) > #> at > org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:474) > #> at > 
org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:453) > #> at > org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:409) > #> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > #> at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > #> at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > #> at java.lang.reflect.Method.invoke(Method.java:498) > #> at sparklyr.Invoke.invoke(invoke.scala:147) > #> at sparklyr.StreamHandler.handleMethodCall(stream.scala:136) > #> at sparklyr.StreamHandler.read(stream.scala:61) > #> at > sparklyr.BackendHandler$$anonfun$channelRead0$1.apply$mcV$sp(handler.scala:58) > #> at scala.util.control.Breaks.breakable(Breaks.scala:38) > #> at sparklyr.BackendHandler.channelRead0(handler.scala:38) > #> at sparklyr.BackendHandler.channelRead0(handler.scala:14) > #> at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > #> at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) > #> at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
[jira] [Updated] (ARROW-8566) Upgraded from r package arrow 16 to r package arrow 17 and now get an error when writing posixct to spark
[ https://issues.apache.org/jira/browse/ARROW-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8566: --- Priority: Major (was: Blocker) > Upgraded from r package arrow 16 to r package arrow 17 and now get an error > when writing posixct to spark > - > > Key: ARROW-8566 > URL: https://issues.apache.org/jira/browse/ARROW-8566 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: #> R version 3.6.3 (2020-02-29) > #> Platform: x86_64-apple-darwin15.6.0 (64-bit) > #> Running under: macOS Mojave 10.14.6 > sparklyr::spark_version(sc) > #> [1] '2.4.5' >Reporter: Curt Bergmann >Priority: Major > > monospaced text}}``` r > library(DBI) > library(sparklyr) > library(arrow) > #> > #> Attaching package: 'arrow' > #> The following object is masked from 'package:utils': > #> > #> timestamp > sc <- spark_connect(master = "local") > sparklyr::spark_version(sc) > #> [1] '2.4.5' > x <- data.frame(y = Sys.time()) > dbWriteTable(sc, "test_posixct", x) > #> Error: org.apache.spark.SparkException: Job aborted. > #> at > org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198) > #> at > org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159) > #> at > org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:503) > #> at > org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:217) > #> at > org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:176) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > #> at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > #> at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) > #> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) > #> at > org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83) > #> at > org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81) > #> at > org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80) > #> at > org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127) > #> at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75) > #> at > org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:474) > #> at > org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:453) > #> at > 
org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:409) > #> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > #> at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > #> at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > #> at java.lang.reflect.Method.invoke(Method.java:498) > #> at sparklyr.Invoke.invoke(invoke.scala:147) > #> at sparklyr.StreamHandler.handleMethodCall(stream.scala:136) > #> at sparklyr.StreamHandler.read(stream.scala:61) > #> at > sparklyr.BackendHandler$$anonfun$channelRead0$1.apply$mcV$sp(handler.scala:58) > #> at scala.util.control.Breaks.breakable(Breaks.scala:38) > #> at sparklyr.BackendHandler.channelRead0(handler.scala:38) > #> at sparklyr.BackendHandler.channelRead0(handler.scala:14) > #> at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > #> at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) > #> at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360
[jira] [Resolved] (ARROW-8569) [CI] Upgrade xcode version for testing homebrew formulae
[ https://issues.apache.org/jira/browse/ARROW-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-8569. Resolution: Fixed Issue resolved by pull request 7019 [https://github.com/apache/arrow/pull/7019] > [CI] Upgrade xcode version for testing homebrew formulae > > > Key: ARROW-8569 > URL: https://issues.apache.org/jira/browse/ARROW-8569 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration, Packaging >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Minor > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > To prevent as many bottles from being built from source. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8566) Upgraded from r package arrow 16 to r package arrow 17 and now get an error when writing posixct to spark
[ https://issues.apache.org/jira/browse/ARROW-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090772#comment-17090772 ] Neal Richardson commented on ARROW-8566: Hmm. Unfortunately, {{java.lang.UnsupportedOperationException}} doesn't tell me anything about what is unsupported. The only thing about posixt types that changed in the last {{arrow}} release was a fix for ARROW-3543, specifically https://github.com/apache/arrow/commit/507762fa51d17e61f08d36d3626ab8b8df716198. I wonder, does it work if you explicitly set {{tz="GMT"}} on a POSIXct and send that? > Upgraded from r package arrow 16 to r package arrow 17 and now get an error > when writing posixct to spark > - > > Key: ARROW-8566 > URL: https://issues.apache.org/jira/browse/ARROW-8566 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: #> R version 3.6.3 (2020-02-29) > #> Platform: x86_64-apple-darwin15.6.0 (64-bit) > #> Running under: macOS Mojave 10.14.6 > sparklyr::spark_version(sc) > #> [1] '2.4.5' >Reporter: Curt Bergmann >Priority: Blocker > > monospaced text}}``` r > library(DBI) > library(sparklyr) > library(arrow) > #> > #> Attaching package: 'arrow' > #> The following object is masked from 'package:utils': > #> > #> timestamp > sc <- spark_connect(master = "local") > sparklyr::spark_version(sc) > #> [1] '2.4.5' > x <- data.frame(y = Sys.time()) > dbWriteTable(sc, "test_posixct", x) > #> Error: org.apache.spark.SparkException: Job aborted. > #> at > org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198) > #> at > org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159) > #> at > org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:503) > #> at > org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:217) > #> at > org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:176) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > #> at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > #> at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) > #> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) > #> at > org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83) > #> at > org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81) > #> at > org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80) > #> at > 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127) > #> at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75) > #> at > org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:474) > #> at > org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:453) > #> at > org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:409) > #> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > #> at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > #> at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > #> at java.lang.reflect.Method.invoke(Method.java:498) > #> at sparklyr.Invoke.invoke(invoke.scala:147) > #> at sparklyr.StreamHandler.handleMethodCall(stream.scala:136) > #> at sparklyr.StreamHandler.read(stream.scala:61) > #> at > sparklyr.BackendHandler$$anonfun$channelRead0$1.apply$mcV$sp(handler.scala:58) > #> at scala.util.control.Breaks.breakable(Breaks.scala:38) > #> at sparklyr.BackendHandler
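The experiment proposed in the comment above, spelled out; the connection sc and the DBI call are taken from the reprex, and setting the "tzone" attribute is one way to mark a POSIXct as GMT:
{code}
# The tz="GMT" experiment suggested above, assuming the sc connection
# from the reprex is still open:
x <- data.frame(y = Sys.time())
attr(x$y, "tzone") <- "GMT"   # explicitly mark the POSIXct as GMT
DBI::dbWriteTable(sc, "test_posixct_gmt", x)
{code}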
[jira] [Created] (ARROW-8569) [CI] Upgrade xcode version for testing homebrew formulae
Neal Richardson created ARROW-8569: -- Summary: [CI] Upgrade xcode version for testing homebrew formulae Key: ARROW-8569 URL: https://issues.apache.org/jira/browse/ARROW-8569 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, Packaging Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 To prevent as many bottles from being built from source. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8566) Upgraded from r package arrow 16 to r package arrow 17 and now get an error when writing posixct to spark
[ https://issues.apache.org/jira/browse/ARROW-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090673#comment-17090673 ] Neal Richardson commented on ARROW-8566: Is this consistently reproducible? Do any other data types cause issues? I can't tell from the spark traceback what is failing exactly. > Upgraded from r package arrow 16 to r package arrow 17 and now get an error > when writing posixct to spark > - > > Key: ARROW-8566 > URL: https://issues.apache.org/jira/browse/ARROW-8566 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: #> R version 3.6.3 (2020-02-29) > #> Platform: x86_64-apple-darwin15.6.0 (64-bit) > #> Running under: macOS Mojave 10.14.6 > sparklyr::spark_version(sc) > #> [1] '2.4.5' >Reporter: Curt Bergmann >Priority: Blocker > > monospaced text}}``` r > library(DBI) > library(sparklyr) > library(arrow) > #> > #> Attaching package: 'arrow' > #> The following object is masked from 'package:utils': > #> > #> timestamp > sc <- spark_connect(master = "local") > sparklyr::spark_version(sc) > #> [1] '2.4.5' > x <- data.frame(y = Sys.time()) > dbWriteTable(sc, "test_posixct", x) > #> Error: org.apache.spark.SparkException: Job aborted. > #> at > org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198) > #> at > org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159) > #> at > org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:503) > #> at > org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:217) > #> at > org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:176) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102) > #> at > org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > #> at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > #> at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > #> at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) > #> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) > #> at > org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83) > #> at > org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81) > #> at > org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) > #> at > org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80) > #> at > org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127) > #> at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75) > #> at > org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676) > #> at > 
org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:474) > #> at > org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:453) > #> at > org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:409) > #> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > #> at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > #> at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > #> at java.lang.reflect.Method.invoke(Method.java:498) > #> at sparklyr.Invoke.invoke(invoke.scala:147) > #> at sparklyr.StreamHandler.handleMethodCall(stream.scala:136) > #> at sparklyr.StreamHandler.read(stream.scala:61) > #> at > sparklyr.BackendHandler$$anonfun$channelRead0$1.apply$mcV$sp(handler.scala:58) > #> at scala.util.control.Breaks.breakable(Breaks.scala:38) > #> at sparklyr.BackendHandler.channelRead0(handler.scala:38) > #> at sparklyr.BackendHandler.channelRead0(handler.scala:14) > #> at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > #> at > io.netty.channel.AbstractChannelHandlerContext.invokeChann
[jira] [Resolved] (ARROW-8549) [R] Assorted post-0.17 release cleanups
[ https://issues.apache.org/jira/browse/ARROW-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-8549. Resolution: Fixed Issue resolved by pull request 6995 [https://github.com/apache/arrow/pull/6995] > [R] Assorted post-0.17 release cleanups > --- > > Key: ARROW-8549 > URL: https://issues.apache.org/jira/browse/ARROW-8549 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-8556) [R] zstd symbol not found on Ubuntu 19.10
[ https://issues.apache.org/jira/browse/ARROW-8556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8556: --- Summary: [R] zstd symbol not found on Ubuntu 19.10 (was: [R] Installation fails with `LIBARROW_MINIMAL=false`) > [R] zstd symbol not found on Ubuntu 19.10 > - > > Key: ARROW-8556 > URL: https://issues.apache.org/jira/browse/ARROW-8556 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: Ubuntu 19.10 > R 3.6.1 >Reporter: Karl Dunkle Werner >Priority: Major > > I would like to install the `arrow` R package on my Ubuntu 19.10 system. > Prebuilt binaries are unavailable, and I want to enable compression, so I set > the {{LIBARROW_MINIMAL=false}} environment variable. When I do so, it looks > like the package is able to compile, but can't be loaded. I'm able to install > correctly if I don't set the {{LIBARROW_MINIMAL}} variable. > Here's the error I get: > {code:java} > ** testing if installed package can be loaded from temporary location > Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath > = DLLpath, ...): > unable to load shared object > '~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so': > ~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so: undefined symbol: > ZSTD_initCStream > Error: loading failed > Execution halted > ERROR: loading failed > * removing ‘~/.R/3.6/arrow’ > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8556) [R] Installation fails with `LIBARROW_MINIMAL=false`
[ https://issues.apache.org/jira/browse/ARROW-8556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089913#comment-17089913 ] Neal Richardson commented on ARROW-8556: Thanks for the report. Several ideas: * Could you please share the install logs from above that error, where it's compiling? * You could retry with {{LIBARROW_BINARY=ubuntu-18.04}} and see if that works * Do you have zstd installed on the system already? If so, what version? (Maybe there's a minimum version we require and that's not set right) * If not, you could {{apt-get install zstd}} and then retry * You could retry with {{ARROW_R_DEV=true}} for more verbosity in the build step (but let's go through the other options first) > [R] Installation fails with `LIBARROW_MINIMAL=false` > > > Key: ARROW-8556 > URL: https://issues.apache.org/jira/browse/ARROW-8556 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 0.17.0 > Environment: Ubuntu 19.10 > R 3.6.1 >Reporter: Karl Dunkle Werner >Priority: Major > > I would like to install the `arrow` R package on my Ubuntu 19.10 system. > Prebuilt binaries are unavailable, and I want to enable compression, so I set > the {{LIBARROW_MINIMAL=false}} environment variable. When I do so, it looks > like the package is able to compile, but can't be loaded. I'm able to install > correctly if I don't set the {{LIBARROW_MINIMAL}} variable. > Here's the error I get: > {code:java} > ** testing if installed package can be loaded from temporary location > Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath > = DLLpath, ...): > unable to load shared object > '~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so': > ~/.R/3.6/00LOCK-arrow/00new/arrow/libs/arrow.so: undefined symbol: > ZSTD_initCStream > Error: loading failed > Execution halted > ERROR: loading failed > * removing ‘~/.R/3.6/arrow’ > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)

[jira] [Created] (ARROW-8550) [CI] Don't run cron GHA jobs on forks
Neal Richardson created ARROW-8550: -- Summary: [CI] Don't run cron GHA jobs on forks Key: ARROW-8550 URL: https://issues.apache.org/jira/browse/ARROW-8550 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration Reporter: Neal Richardson Assignee: Neal Richardson It's wasteful, and I'm tired of seeing them clogging up my Actions tab and notifications. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8549) [R] Assorted post-0.17 release cleanups
Neal Richardson created ARROW-8549: -- Summary: [R] Assorted post-0.17 release cleanups Key: ARROW-8549 URL: https://issues.apache.org/jira/browse/ARROW-8549 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8548) [Website] 0.17 release post
Neal Richardson created ARROW-8548: -- Summary: [Website] 0.17 release post Key: ARROW-8548 URL: https://issues.apache.org/jira/browse/ARROW-8548 Project: Apache Arrow Issue Type: Improvement Components: Website Reporter: Neal Richardson Assignee: Neal Richardson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-8545) [Python] Allow fast writing of Decimal column to parquet
[ https://issues.apache.org/jira/browse/ARROW-8545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8545: --- Summary: [Python] Allow fast writing of Decimal column to parquet (was: Allow fast writing of Decimal column to parquet) > [Python] Allow fast writing of Decimal column to parquet > > > Key: ARROW-8545 > URL: https://issues.apache.org/jira/browse/ARROW-8545 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.17.0 >Reporter: Fons de Leeuw >Priority: Minor > > Currently, when one wants to use a decimal datatype in Pandas, the only > possibility is to use the `decimal.Decimal` standard-libary type. This is > then an "object" column in the DataFrame. > Arrow can write a column of decimal type to Parquet, which is quite > impressive given that [fastparquet does not write decimals|#data-types]] at > all. However, the writing is *very* slow, in the code snippet below a factor > of 4. > *Improvements* > Of course the best outcome would be if the conversion of a decimal column can > be made faster, but I am not familiar enough with pandas internals to know if > that's possible. (This same behavior also applies to `.to_pickle` etc.) > It would be nice, if a warning is shown that object-typed columns are being > converted which is very slow. That would at least make this behavior more > explicit. > Now, if fast parsing of a decimal.Decimal object column is not possible, it > would be nice if a workaround is possible. For example, pass an int and then > shift the dot "x" places to the left. (It is already possible to pass an int > column and specify "decimal" dtype in the Arrow schema during > `pa.Table.from_pandas()` but then it simply becomes a decimal without > decimals.) Also, it might be nice if it can be encoded as a 128-bit byte > string in the pandas column and then directly interpreted by Arrow. > *Usecase* > I need to save large dataframes (~10GB) of geospatial data with > latitude/longitude. I can't use float as comparisons need to be exact, and > the BigQuery "clustering" feature needs either an integer or a decimal but > not a float. In the meantime, I have to do a workaround where I use only ints > (the original number multiplied by 1000.) > *Snippet* > {code:java} > import decimal > from time import time > import numpy as np > import pandas as pd > d = dict() > for col in "abcdefghijklmnopqrstuvwxyz": > d[col] = np.random.rand(int(1E7)) * 100 > df = pd.DataFrame(d) > t0 = time() > df.to_parquet("/tmp/testabc.pq", engine="pyarrow") > t1 = time() > df["a"] = df["a"].round(decimals=3).astype(str).map(decimal.Decimal) > t2 = time() > df.to_parquet("/tmp/testabc_dec.pq", engine="pyarrow") > t3 = time() > print(f"Saving the normal dataframe took {t1-t0:.3f}s, with one decimal > column {t3-t2:.3f}s") > # Saving the normal dataframe took 4.430s, with one decimal column > 17.673s{code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
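A sketch of the integer workaround the reporter describes, in the same style as the snippet above (the scale factor of 1000 matches the use case; column names are illustrative):
{code}
import numpy as np
import pandas as pd

# Illustrative workaround from the description above: store the value
# multiplied by 1000 as int64 instead of decimal.Decimal; consumers
# must shift the decimal point back themselves.
df = pd.DataFrame({"lat": np.random.rand(10) * 100})
df["lat_milli"] = (df["lat"] * 1000).round().astype("int64")
df[["lat_milli"]].to_parquet("/tmp/test_int.pq", engine="pyarrow")
{code}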
[jira] [Created] (ARROW-8538) [Packaging] Remove boost from homebrew formula
Neal Richardson created ARROW-8538: -- Summary: [Packaging] Remove boost from homebrew formula Key: ARROW-8538 URL: https://issues.apache.org/jira/browse/ARROW-8538 Project: Apache Arrow Issue Type: Improvement Components: C++, Packaging Reporter: Neal Richardson Assignee: Neal Richardson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-8488) [R] Replace VALUE_OR_STOP with ValueOrStop
[ https://issues.apache.org/jira/browse/ARROW-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-8488: -- Assignee: Francois Saint-Jacques (was: Neal Richardson) > [R] Replace VALUE_OR_STOP with ValueOrStop > -- > > Key: ARROW-8488 > URL: https://issues.apache.org/jira/browse/ARROW-8488 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > We should avoid macro as much as possible as per style guide. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-8488) [R] Replace VALUE_OR_STOP with ValueOrStop
[ https://issues.apache.org/jira/browse/ARROW-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-8488: -- Assignee: Neal Richardson > [R] Replace VALUE_OR_STOP with ValueOrStop > -- > > Key: ARROW-8488 > URL: https://issues.apache.org/jira/browse/ARROW-8488 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Francois Saint-Jacques >Assignee: Neal Richardson >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > We should avoid macro as much as possible as per style guide. -- This message was sent by Atlassian Jira (v8.3.4#803005)
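The style-guide rationale is easy to see in miniature; a hypothetical before/after with simplified types, not the actual arrow/R glue code:
{code}
// Hypothetical contrast for the rename discussed above: a macro that
// stops on error versus an inline function doing the same.
#include <stdexcept>
#include <string>

struct Result {
  int value;
  bool ok;
  std::string message;
};

// Macro style, discouraged by the style guide:
// #define VALUE_OR_STOP(res) \
//   ((res).ok ? (res).value : throw std::runtime_error((res).message))

// Function style: type-checked, debuggable, no surprise re-evaluation
// of the argument expression.
inline int ValueOrStop(const Result& res) {
  if (!res.ok) throw std::runtime_error(res.message);
  return res.value;
}
{code}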
[jira] [Updated] (ARROW-8498) [Python] Schema.from_pandas fails on extension type, while Table.from_pandas works
[ https://issues.apache.org/jira/browse/ARROW-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8498: --- Summary: [Python] Schema.from_pandas fails on extension type, while Table.from_pandas works (was: Schema.from_pandas fails on extension type, while Table.from_pandas works) > [Python] Schema.from_pandas fails on extension type, while Table.from_pandas > works > -- > > Key: ARROW-8498 > URL: https://issues.apache.org/jira/browse/ARROW-8498 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Affects Versions: 0.16.0 >Reporter: Thomas Buhrmann >Priority: Major > > While Table.from_pandas() seems to work as expected with extension types, > Schema.from_pandas() raises an ArrowTypeError: > {code:python} > df = pd.DataFrame({ >"x": pd.Series([1, 2, None], dtype="Int8"), >"y": pd.Series(["a", "b", None], dtype="category"), >"z": pd.Series(["ab", "bc", None], dtype="string"), > }) > print(pa.Table.from_pandas(df).schema) > print(pa.Schema.from_pandas(df)) > {code} > > Results in: > {noformat} > x: int8 > y: dictionary > z: string > metadata > > {b'pandas': b'{"index_columns": [{"kind": "range", "name": null, "start": 0, > "' > b'stop": 3, "step": 1}], "column_indexes": [{"name": null, > "field_' > b'name": null, "pandas_type": "unicode", "numpy_type": "object", > "' > b'metadata": {"encoding": "UTF-8"}}], "columns": [{"name": "x", > "f' > b'ield_name": "x", "pandas_type": "int8", "numpy_type": "Int8", > "m' > b'etadata": null}, {"name": "y", "field_name": "y", > "pandas_type":' > b' "categorical", "numpy_type": "int8", "metadata": > {"num_categori' > b'es": 2, "ordered": false}}, {"name": "z", "field_name": "z", > "pa' > b'ndas_type": "unicode", "numpy_type": "string", "metadata": > null}' > b'], "creator": {"library": "pyarrow", "version": "0.16.0"}, > "pand' > b'as_version": "1.0.3"}'} > --- > ArrowTypeErrorTraceback (most recent call last) > ... > ArrowTypeError: Did not pass numpy.dtype object > {noformat} > I'd imagine Table.from_pandas(df).schema and Schema.from_pandas(df) should > result in the exact same object? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-8473) [Rust] "Statistics support" in rust/parquet readme is incorrect
[ https://issues.apache.org/jira/browse/ARROW-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8473: --- Summary: [Rust] "Statistics support" in rust/parquet readme is incorrect (was: "Statistics support" in rust/parquet readme is incorrect) > [Rust] "Statistics support" in rust/parquet readme is incorrect > --- > > Key: ARROW-8473 > URL: https://issues.apache.org/jira/browse/ARROW-8473 > Project: Apache Arrow > Issue Type: Bug > Components: Rust >Reporter: Krzysztof Stanisławek >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Statistics are not actually supported in the Rust implementation of Parquet. See > [https://github.com/apache/arrow/blob/3e3712a14a3242d70145fb9d3d6f0f4b8c374e68/rust/parquet/src/column/writer.rs#L522] > or similar lines elsewhere in writer.rs. > https://github.com/apache/arrow/pull/6951 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-7801) [Developer] Add issue_comment workflow to fix lint/style/codegen
[ https://issues.apache.org/jira/browse/ARROW-7801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-7801. Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 6932 [https://github.com/apache/arrow/pull/6932] > [Developer] Add issue_comment workflow to fix lint/style/codegen > > > Key: ARROW-7801 > URL: https://issues.apache.org/jira/browse/ARROW-7801 > Project: Apache Arrow > Issue Type: Improvement > Components: Developer Tools >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Like https://github.com/r-lib/actions/tree/master/examples#render-readme. > * If changes to r/README.Rmd, render readme > * If changes to r/R, render docs > * If changes to r/src, lint.sh --fix -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-7801) [Developer] Add issue_comment workflow to fix lint/style/codegen
[ https://issues.apache.org/jira/browse/ARROW-7801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-7801: --- Component/s: (was: R) (was: Continuous Integration) Developer Tools > [Developer] Add issue_comment workflow to fix lint/style/codegen > > > Key: ARROW-7801 > URL: https://issues.apache.org/jira/browse/ARROW-7801 > Project: Apache Arrow > Issue Type: Improvement > Components: Developer Tools >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > Labels: pull-request-available > Time Spent: 3h 10m > Remaining Estimate: 0h > > Like https://github.com/r-lib/actions/tree/master/examples#render-readme. > * If changes to r/README.Rmd, render readme > * If changes to r/R, render docs > * If changes to r/src, lint.sh --fix -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-7801) [Developer] Add issue_comment workflow to fix lint/style/codegen
[ https://issues.apache.org/jira/browse/ARROW-7801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-7801: --- Summary: [Developer] Add issue_comment workflow to fix lint/style/codegen (was: [R][CI] Add lint and doc GitHub Action workflows) > [Developer] Add issue_comment workflow to fix lint/style/codegen > > > Key: ARROW-7801 > URL: https://issues.apache.org/jira/browse/ARROW-7801 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration, R >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > Labels: pull-request-available > Time Spent: 3h 10m > Remaining Estimate: 0h > > Like https://github.com/r-lib/actions/tree/master/examples#render-readme. > * If changes to r/README.Rmd, render readme > * If changes to r/R, render docs > * If changes to r/src, lint.sh --fix -- This message was sent by Atlassian Jira (v8.3.4#803005)
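For readers skimming the thread, the dispatch described in the issue's bullet list boils down to mapping changed-path prefixes to fixup commands. The sketch below is purely illustrative: the function name, the rule table, and the concrete commands are assumptions, not the actual GitHub Actions workflow merged in pull request 6932.
{code:python}
import subprocess

# Hypothetical path-prefix -> command table mirroring the issue
# description; the commands shown are placeholders.
RULES = [
    ("r/README.Rmd", ["R", "-e", "rmarkdown::render('r/README.Rmd')"]),
    ("r/R", ["R", "-e", "devtools::document('r')"]),
    ("r/src", ["./r/lint.sh", "--fix"]),
]

def autotune(changed_files):
    # Run each fixup whose prefix matches at least one changed file.
    for prefix, cmd in RULES:
        if any(path.startswith(prefix) for path in changed_files):
            subprocess.run(cmd, check=True)

# autotune(["r/README.Rmd"])  # would re-render the R README
{code}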
[jira] [Resolved] (ARROW-6439) [R] Implement S3 file-system interface in R
[ https://issues.apache.org/jira/browse/ARROW-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-6439. Resolution: Fixed Issue resolved by pull request 6901 [https://github.com/apache/arrow/pull/6901] > [R] Implement S3 file-system interface in R > --- > > Key: ARROW-6439 > URL: https://issues.apache.org/jira/browse/ARROW-6439 > Project: Apache Arrow > Issue Type: New Feature > Components: R >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8489) [Developer] Autotune more things
Neal Richardson created ARROW-8489: -- Summary: [Developer] Autotune more things Key: ARROW-8489 URL: https://issues.apache.org/jira/browse/ARROW-8489 Project: Apache Arrow Issue Type: Improvement Components: Developer Tools, Python Reporter: Neal Richardson ARROW-7801 added the "autotune" comment bot to fix linting errors and rebuild some generated files. cmake-format was left off because of Python problems (see description on https://github.com/apache/arrow/pull/6932). And there are probably other things we want to add (autopep8 for Python, and similar tools for other languages?) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8482) [Python][R][Parquet] Possible time zone handling inconsistencies
[ https://issues.apache.org/jira/browse/ARROW-8482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085119#comment-17085119 ] Neal Richardson commented on ARROW-8482: read_parquet() doesn't alter types. It reads what is in the file. > [Python][R][Parquet] Possible time zone handling inconsistencies > - > > Key: ARROW-8482 > URL: https://issues.apache.org/jira/browse/ARROW-8482 > Project: Apache Arrow > Issue Type: Bug > Components: Python, R >Reporter: Olaf >Priority: Critical > > Hello there! > > First of all, thanks for making parquet files a reality in *R* and *Python*. > This is really great. > I found a very nasty bug when exchanging parquet files between the two > platforms. Consider this. > > > {code:python} > import pandas as pd > import pyarrow.parquet as pq > import numpy as np > df = pd.DataFrame({'string_time_utc' : [pd.to_datetime('2018-02-01 14:00:00.531'), > pd.to_datetime('2018-02-01 14:01:00.456'), > pd.to_datetime('2018-03-05 14:01:02.200')]}) > df['timestamp_est'] = > pd.to_datetime(df.string_time_utc).dt.tz_localize('UTC').dt.tz_convert('US/Eastern').dt.tz_localize(None) > df > Out[5]: > string_time_utc timestamp_est > 0 2018-02-01 14:00:00.531 2018-02-01 09:00:00.531 > 1 2018-02-01 14:01:00.456 2018-02-01 09:01:00.456 > 2 2018-03-05 14:01:02.200 2018-03-05 09:01:02.200 > {code} > > Now I simply write to disk > > {code:python} > df.to_parquet('myparquet.pq') > {code} > > And then use *R* to load it. > > {code} > test <- read_parquet('myparquet.pq') > > test > # A tibble: 3 x 2 > string_time_utc timestamp_est > > 1 2018-02-01 09:00:00.530999 2018-02-01 04:00:00.530999 > 2 2018-02-01 09:01:00.456000 2018-02-01 04:01:00.456000 > 3 2018-03-05 09:01:02.20 2018-03-05 04:01:02.20 > {code} > > > As you can see, the timestamps have been converted in the process. I first > referenced this bug in Feather, but it is still there. This is a very > dangerous, silent bug. > > What do you think? > Thanks -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8482) [Python][R][Parquet] Possible time zone handling inconsistencies
[ https://issues.apache.org/jira/browse/ARROW-8482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085008#comment-17085008 ] Neal Richardson commented on ARROW-8482: This shows the fix wrt 0.16 https://github.com/apache/arrow/commit/507762fa51d17e61f08d36d3626ab8b8df716198 But that doesn't affect how R prints datetime data with no timezone specified. > [Python][R][Parquet] Possible time zone handling inconsistencies > - > > Key: ARROW-8482 > URL: https://issues.apache.org/jira/browse/ARROW-8482 > Project: Apache Arrow > Issue Type: Bug > Components: Python, R >Reporter: Olaf >Assignee: Wes McKinney >Priority: Critical -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8482) [Python][R][Parquet] Possible time zone handling inconsistencies
[ https://issues.apache.org/jira/browse/ARROW-8482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085002#comment-17085002 ] Neal Richardson commented on ARROW-8482: > R apparently treats naive timestamps as localtime Yes, in the print method, but as you say, it doesn't alter the data itself. See https://issues.apache.org/jira/browse/ARROW-3543?focusedCommentId=16929592&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16929592 > [Python][R][Parquet] Possible time zone handling inconsistencies > - > > Key: ARROW-8482 > URL: https://issues.apache.org/jira/browse/ARROW-8482 > Project: Apache Arrow > Issue Type: Bug > Components: Python, R >Reporter: Olaf >Assignee: Wes McKinney >Priority: Critical -- This message was sent by Atlassian Jira (v8.3.4#803005)
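To make the print-method point concrete, here is a small pandas-only sketch using the first timestamp from the report. It assumes nothing beyond standard pandas behavior: tz_localize(None) leaves the stored value unchanged but removes the zone metadata, so a reader such as R is free to display the naive value in the session's local time zone.
{code:python}
import pandas as pd

s = pd.Series([pd.to_datetime("2018-02-01 14:00:00.531")])

# A tz-aware UTC value is unambiguous: writer and reader agree on the
# instant, and the zone survives a Parquet round trip.
aware = s.dt.tz_localize("UTC")
print(aware.iloc[0])  # 2018-02-01 14:00:00.531000+00:00

# Reproducing the report's 'timestamp_est' column: converting to
# US/Eastern and then dropping the zone bakes the offset into the
# digits and leaves no zone metadata for the reader.
naive = aware.dt.tz_convert("US/Eastern").dt.tz_localize(None)
print(naive.iloc[0])  # 2018-02-01 09:00:00.531000 (naive)
{code}
Keeping the column tz-aware end to end avoids the apparent shift entirely; the stored values are never altered by read_parquet(), only their display.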