[jira] [Created] (ARROW-11993) [C++] Don't download xsimd if ARROW_SIMD_LEVEL=NONE

2021-03-16 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-11993:
---

 Summary: [C++] Don't download xsimd if ARROW_SIMD_LEVEL=NONE
 Key: ARROW-11993
 URL: https://issues.apache.org/jira/browse/ARROW-11993
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Neal Richardson


xsimd doesn't get used if the SIMD level is NONE, so we shouldn't bother 
downloading it.

cc [~apitrou]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-11994) [R] Build fails if dataset enabled but parquet is not

2021-03-16 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-11994:
---

 Summary: [R] Build fails if dataset enabled but parquet is not
 Key: ARROW-11994
 URL: https://issues.apache.org/jira/browse/ARROW-11994
 Project: Apache Arrow
  Issue Type: Bug
  Components: R
Reporter: Neal Richardson


Following ARROW-11735; discovered while working on ARROW-10734. The 
arrow::dataset::ParquetFileFormat and related classes require both dataset and 
parquet. The {{#if defined}} logic in r/src/dataset.cpp is right in requiring 
both, but the wrapping generated for arrowExports.cpp is based only on the 
annotation on the functions, {{[[dataset::export]]}}. So the ParquetFileFormat 
methods in arrowExports.cpp are guarded only by {{#if defined}} on 
ARROW_R_WITH_DATASET and fail if parquet is not available.

Not a priority to fix (for Solaris I can turn off ARROW_DATASET and avoid 
this), just wanted to note it in case we need to revisit this wrapping logic 
later anyway. cc [~icook]
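
To illustrate the guard mismatch described above, here is a minimal, self-contained C++ sketch. It is not the actual content of r/src/dataset.cpp or arrowExports.cpp: the function names are hypothetical stand-ins, and the macro ARROW_R_WITH_PARQUET is assumed by analogy with ARROW_R_WITH_DATASET. The configuration modelled is "dataset enabled, parquet disabled".

```cpp
#include <cassert>
#include <string>

// Hypothetical build configuration: dataset is enabled, parquet is not.
#define ARROW_R_WITH_DATASET
// #define ARROW_R_WITH_PARQUET   // assumed macro name, left undefined here

// Stand-in for the hand-written source: the Parquet-specific function needs
// both features, so it is compiled only when both macros are defined.
#if defined(ARROW_R_WITH_DATASET) && defined(ARROW_R_WITH_PARQUET)
std::string parquet_file_format_make() { return "parquet file format"; }
#endif

// Stand-in for a generated wrapper. If this were guarded by the dataset macro
// alone, it would call parquet_file_format_make() even though that function
// was never compiled, and the build would fail. Checking both macros and
// providing a fallback branch keeps the build working in this configuration.
std::string wrapped_parquet_file_format_make() {
#if defined(ARROW_R_WITH_DATASET) && defined(ARROW_R_WITH_PARQUET)
  return parquet_file_format_make();
#else
  return "unavailable: needs both dataset and parquet";
#endif
}
```

With only the dataset macro defined, calling wrapped_parquet_file_format_make() takes the fallback branch. Changing the wrapper's guard to test ARROW_R_WITH_DATASET alone should reproduce a compile error of the kind reported here.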





[jira] [Updated] (ARROW-11988) [C++][Gandiva]Implements the last_day function in Gandiva

2021-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-11988:
---
Labels: pull-request-available pull_request_available  (was: 
pull_request_available)

> [C++][Gandiva]Implements the last_day function in Gandiva
> -
>
> Key: ARROW-11988
> URL: https://issues.apache.org/jira/browse/ARROW-11988
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++ - Gandiva
>Reporter: Anthony Louis Gotlib Ferreira
>Priority: Trivial
>  Labels: pull-request-available, pull_request_available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Adds support for the `last_day` function in Gandiva, similar to the 
> Apache Impala implementation: 
> https://docs.datafabric.hpe.com/62/Impala/new_features_impala_2100.html.
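
As a rough illustration of what a `last_day` function computes (the last calendar day of the month containing a given date), here is a minimal C++ sketch. It is not the Gandiva implementation: the SimpleDate struct and helper names are hypothetical, it uses the proleptic Gregorian leap-year rule, and it assumes the month is in the range 1 to 12.

```cpp
#include <cassert>

// Hypothetical plain date type used only for this sketch.
struct SimpleDate {
  int year;
  int month;  // 1..12 (assumed valid; no range checking in this sketch)
  int day;
};

// Gregorian leap-year rule: divisible by 4, except centuries not divisible by 400.
bool is_leap_year(int year) {
  return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

// Number of days in the given month of the given year.
int days_in_month(int year, int month) {
  switch (month) {
    case 2:
      return is_leap_year(year) ? 29 : 28;
    case 4:
    case 6:
    case 9:
    case 11:
      return 30;
    default:
      return 31;
  }
}

// Returns the date of the last day of the month that contains `d`.
SimpleDate last_day(SimpleDate d) {
  return SimpleDate{d.year, d.month, days_in_month(d.year, d.month)};
}
```

For example, last_day(SimpleDate{2021, 3, 16}) yields 2021-03-31, and a date in February 2020 maps to 2020-02-29.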





[jira] [Created] (ARROW-11995) [C++][Gandiva] Add support for IN expressions for date and timestamp

2021-03-16 Thread Sagnik Chakraborty (Jira)
Sagnik Chakraborty created ARROW-11995:
--

 Summary: [C++][Gandiva] Add support for IN expressions for date 
and timestamp
 Key: ARROW-11995
 URL: https://issues.apache.org/jira/browse/ARROW-11995
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Sagnik Chakraborty
Assignee: Sagnik Chakraborty








[jira] [Updated] (ARROW-11995) [C++][Gandiva] Add support for IN expressions for date and timestamp

2021-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-11995:
---
Labels: pull-request-available  (was: )

> [C++][Gandiva] Add support for IN expressions for date and timestamp
> 
>
> Key: ARROW-11995
> URL: https://issues.apache.org/jira/browse/ARROW-11995
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Sagnik Chakraborty
>Assignee: Sagnik Chakraborty
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302971#comment-17302971
 ] 

Ahmed Riza commented on ARROW-11963:


[~npr], here's how I built the C++ libraries after downloading the release 
tarball from 
[https://www.mirrorservice.org/sites/ftp.apache.org/arrow/arrow-3.0.0/apache-arrow-3.0.0.tar.gz]
 and extracting it:
{code:java}
cd /work/apache-arrow-3.0.0/cpp
mkdir build
cd build

cmake -DARROW_COMPUTE=ON -DARROW_CSV=ON -DARROW_DATASET=ON \
  -DARROW_FILESYSTEM=ON -DARROW_JEMALLOC=ON -DARROW_JSON=ON -DARROW_PARQUET=ON \
  -DCMAKE_BUILD_TYPE=release -DARROW_INSTALL_NAME_RPATH=OFF -DARROW_HDFS=ON \
  -DARROW_S3=ON -DARROW_PYTHON=OFF -DARROW_WITH_SNAPPY=ON -DARROW_WITH_ZLIB=ON \
  -DARROW_EXTRA_ERROR_CONTEXT=ON ..

sudo make install{code}
This installs the headers and libraries to "/usr/local/include/arrow" and 
"/usr/local/lib64".

 

The reason I compiled the C++ libraries myself is that the attempt to 
install "arrow" did not succeed entirely.  `install.packages("arrow")` 
seemed to do the right thing, but when I tried to use any of the "arrow" 
functions from R, I ran into the error shown below:
{code:java}
> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> df <- read_parquet("/tmp/small.parquet")  
>    
Error in io___MemoryMappedFile__Open(path, mode) :  
  Cannot call io___MemoryMappedFile__Open(). See 
https://arrow.apache.org/docs/r/articles/install.html for help installing Arrow 
C++ libraries.  
> 
{code}
 

 


[jira] [Updated] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Riza updated ARROW-11963:
---
Attachment: cmake.log

> Arrow installation issue with R 4.0.4 on Fedora 33
> --
>
> Key: ARROW-11963
> URL: https://issues.apache.org/jira/browse/ARROW-11963
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Affects Versions: 3.0.0
> Environment: Linux, Fedora 33
>Reporter: Ahmed Riza
>Priority: Major
> Attachments: cmake.log
>
>
> I have been trying to install "arrow" package, using R 4.0.4 on Linux (Fedora 
> 33).  I have built and installed the C++ arrow libraries (using release 
> version 3.0.0) following the instructions at 
> [https://arrow.apache.org/docs/r/.|https://arrow.apache.org/docs/r/]
> Then, from R, I tried to install "arrow":
> {code:java}
>  install.packages("arrow"){code}
> This fails during the verification stage:
> {code:java}
> ** testing if installed package can be loaded from temporary location 
>  sh: line 1:  8386 Segmentation fault  (core dumped) R_TESTS= 
> '/usr/lib64/R/bin/R' --no-save --no-restore --no-echo 2>&1 < 
> '/tmp/RtmpWtq6vV/file1f4b570a7335'
> caught segfault ***
>  address (nil), cause 'memory not mapped'
> Traceback:
>  1: dyn.load(file, DLLpath = DLLpath, ...)
>  2: library.dynam(lib, package, package.lib)
>  3: loadNamespace(package, lib.loc)
>  4: doTryCatch(return(expr), name, parentenv, handler)
>  5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
>  6: tryCatchList(expr, classes, parentenv, handlers)
>  7: tryCatch({    attr(package, "LibPath") <- which.lib.loc    ns <- 
> loadNamespace(package, lib.loc)    env <- attachNamespace(ns, pos = pos, 
> deps, exclude, include.only)}, error = function(e) {    P <- if (!is.null(cc 
> <- conditionCall(
> e))) paste(" in", deparse(cc)[1L])    else ""    msg <- 
> gettextf("package or namespace load failed for %s%s:\n %s", 
> sQuote(package), P, conditionMessage(e))    if (logical.return) 
> message(paste("Error:", msg), domain = NA)    else stop(msg, call. = FALSE, 
> domain = NA)})
>  8: library(pkg_name, lib.loc = lib, character.only = TRUE, logical.return = 
> TRUE)
>  9: withCallingHandlers(expr, packageStartupMessage = function(c) 
> tryInvokeRestart("muffleMessage"))
> 10: suppressPackageStartupMessages(library(pkg_name, lib.loc = lib, 
> character.only = TRUE, logical.return = TRUE))
> 11: doTryCatch(return(expr), name, parentenv, handler)
> 12: tryCatchOne(expr, names, parentenv, handlers[[1L]])
> 13: tryCatchList(expr, classes, parentenv, handlers)
> 14: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if 
> (!is.null(call)) {    if (identical(call[[1L]], quote(doTryCatch)))   
>   call <- sys.call(-4L)    dcall <- deparse(call)[1L]    prefix 
> <- paste("Error in", dcall, ": ")    LONG <- 75L    sm <- 
> strsplit(conditionMessage(e), "\n")[[1L]]    w <- 14L + nchar(dcall, type 
> = "w") + nchar(sm[1L], type = "w")    if (is.na(w)) w <- 14L 
> + nchar(dcall, type = 
> "b") + nchar(sm[1L], type = "b")    if (w > LONG) 
> prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg 
> <- paste0(prefix, conditionMessage(e), "\n")    
> .Internal(seterrmessage(msg[1L])
> )    if (!silent && isTRUE(getOption("show.error.messages"))) {    
> cat(msg, file = outFile)    .Internal(printDeferredWarnings())    }    
> invisible(structure(msg, class = "try-error", condition = e))})
> 15: try(suppressPackageStartupMessages(library(pkg_name, lib.loc = lib, 
> character.only = TRUE, logical.return = TRUE)))
> 16: tools:::.test_load_package("arrow", 
> "/work/R/x86_64-redhat-linux-gnu-library/4.0/00LOCK-arrow/00new")
> An irrecoverable exception occurred. R is aborting now ...
> ERROR: loading failed
> {code}
> R version info:
> {code:java}
> R version 4.0.4 (2021-02-15) -- "Lost Library Book"
> Copyright (C) 2021 The R Foundation for Statistical Computing
> Platform: x86_64-redhat-linux-gnu (64-bit)
> {code}
> Any thoughts on where to look? (I can only get arrow to work with the latest 
> development version of R and not the release version of 4.0.4).  Thanks.





[jira] [Comment Edited] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302971#comment-17302971
 ] 

Ahmed Riza edited comment on ARROW-11963 at 3/16/21, 11:30 PM:
---

[~npr], here's how I built the C++ libraries after downloading the release 
tarball from 
[https://www.mirrorservice.org/sites/ftp.apache.org/arrow/arrow-3.0.0/apache-arrow-3.0.0.tar.gz]
 and extracting it:
{code:java}
cd /work/apache-arrow-3.0.0/cpp
mkdir build
cd build

cmake -DARROW_COMPUTE=ON -DARROW_CSV=ON -DARROW_DATASET=ON \
  -DARROW_FILESYSTEM=ON -DARROW_JEMALLOC=ON -DARROW_JSON=ON -DARROW_PARQUET=ON \
  -DCMAKE_BUILD_TYPE=release -DARROW_INSTALL_NAME_RPATH=OFF -DARROW_HDFS=ON \
  -DARROW_S3=ON -DARROW_PYTHON=OFF -DARROW_WITH_SNAPPY=ON -DARROW_WITH_ZLIB=ON \
  -DARROW_EXTRA_ERROR_CONTEXT=ON ..

sudo make install{code}
This installs the headers and libraries to "/usr/local/include/arrow" and 
"/usr/local/lib64".

 

The reason I compiled the C++ libraries myself is that the attempt to 
install "arrow" did not succeed entirely.  `install.packages("arrow")` 
seemed to do the right thing, but when I tried to use any of the "arrow" 
functions from R, I ran into the error shown below:
{code:java}
> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> df <- read_parquet("/tmp/small.parquet")  
>    
Error in io___MemoryMappedFile__Open(path, mode) :  
  Cannot call io___MemoryMappedFile__Open(). See 
https://arrow.apache.org/docs/r/articles/install.html for help installing Arrow 
C++ libraries.  
> 
{code}
Attached the output from `cmake` and `make`.

 




[jira] [Comment Edited] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302971#comment-17302971
 ] 

Ahmed Riza edited comment on ARROW-11963 at 3/16/21, 11:33 PM:
---

[~npr], here's how I built the C++ libraries after downloading the release 
tarball from 
[https://www.mirrorservice.org/sites/ftp.apache.org/arrow/arrow-3.0.0/apache-arrow-3.0.0.tar.gz]
 and extracting it:
{code:java}
cd /work/apache-arrow-3.0.0/cpp
mkdir build
cd build

cmake -DARROW_COMPUTE=ON -DARROW_CSV=ON -DARROW_DATASET=ON \
  -DARROW_FILESYSTEM=ON -DARROW_JEMALLOC=ON -DARROW_JSON=ON -DARROW_PARQUET=ON \
  -DCMAKE_BUILD_TYPE=release -DARROW_INSTALL_NAME_RPATH=OFF -DARROW_HDFS=ON \
  -DARROW_S3=ON -DARROW_PYTHON=OFF -DARROW_WITH_SNAPPY=ON -DARROW_WITH_ZLIB=ON \
  -DARROW_EXTRA_ERROR_CONTEXT=ON ..

sudo make install{code}
This installs the headers and libraries to "/usr/local/include/arrow" and 
"/usr/local/lib64".

I've done some basic sanity checks with the compiled C++ libraries by running 
some test C++ code against them, e.g. reading a Parquet file, and that works 
fine.

The reason I compiled the C++ libraries myself is that the attempt to 
install "arrow" did not succeed entirely.  `install.packages("arrow")` 
seemed to do the right thing, but when I tried to use any of the "arrow" 
functions from R, I ran into the error shown below:
{code:java}
> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> df <- read_parquet("/tmp/small.parquet")  
>    
Error in io___MemoryMappedFile__Open(path, mode) :  
  Cannot call io___MemoryMappedFile__Open(). See 
https://arrow.apache.org/docs/r/articles/install.html for help installing Arrow 
C++ libraries.  
> 
{code}
Attached the output from `cmake` and `make`.

 




[jira] [Updated] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Riza updated ARROW-11963:
---
Attachment: R_arrow_install.log.gz

> Arrow installation issue with R 4.0.4 on Fedora 33
> --
>
> Key: ARROW-11963
> URL: https://issues.apache.org/jira/browse/ARROW-11963
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Affects Versions: 3.0.0
> Environment: Linux, Fedora 33
>Reporter: Ahmed Riza
>Priority: Major
> Attachments: R_arrow_install.log.gz, cmake.log
>
>





[jira] [Comment Edited] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302971#comment-17302971
 ] 

Ahmed Riza edited comment on ARROW-11963 at 3/16/21, 11:39 PM:
---

[~npr], here's how I built the C++ libraries after downloading the release 
tarball from 
[https://www.mirrorservice.org/sites/ftp.apache.org/arrow/arrow-3.0.0/apache-arrow-3.0.0.tar.gz]
 and extracting it:
{code:java}
cd /work/apache-arrow-3.0.0/cpp
mkdir build
cd build

cmake -DARROW_COMPUTE=ON -DARROW_CSV=ON -DARROW_DATASET=ON \
  -DARROW_FILESYSTEM=ON -DARROW_JEMALLOC=ON -DARROW_JSON=ON -DARROW_PARQUET=ON \
  -DCMAKE_BUILD_TYPE=release -DARROW_INSTALL_NAME_RPATH=OFF -DARROW_HDFS=ON \
  -DARROW_S3=ON -DARROW_PYTHON=OFF -DARROW_WITH_SNAPPY=ON -DARROW_WITH_ZLIB=ON \
  -DARROW_EXTRA_ERROR_CONTEXT=ON ..

sudo make install{code}
This installs the headers and libraries to "/usr/local/include/arrow" and 
"/usr/local/lib64".

I've done some basic sanity checks with the compiled C++ libraries by running 
some test C++ code against them, e.g. reading a Parquet file, and that works 
fine.

The reason I compiled the C++ libraries myself is that the attempt to 
install "arrow" did not succeed entirely.  `install.packages("arrow")` 
seemed to do the right thing, but when I tried to use any of the "arrow" 
functions from R, I ran into the error shown below:
{code:java}
> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> df <- read_parquet("/tmp/small.parquet")  
>    
Error in io___MemoryMappedFile__Open(path, mode) :  
  Cannot call io___MemoryMappedFile__Open(). See 
https://arrow.apache.org/docs/r/articles/install.html for help installing Arrow 
C++ libraries.  
> 
{code}
Attached the output from `cmake`, `make`, and the installation of the "arrow" 
package from R.

 




[jira] [Updated] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Riza updated ARROW-11963:
---
Attachment: make.log.gz

> Arrow installation issue with R 4.0.4 on Fedora 33
> --
>
> Key: ARROW-11963
> URL: https://issues.apache.org/jira/browse/ARROW-11963
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Affects Versions: 3.0.0
> Environment: Linux, Fedora 33
>Reporter: Ahmed Riza
>Priority: Major
> Attachments: R_arrow_install.log.gz, cmake.log, make.log.gz
>
>
> I have been trying to install the "arrow" package, using R 4.0.4 on Linux 
> (Fedora 33).  I have built and installed the C++ arrow libraries (using 
> release version 3.0.0) following the instructions at 
> [https://arrow.apache.org/docs/r/.|https://arrow.apache.org/docs/r/]
> Then, from R, I tried to install "arrow":
> {code:java}
>  install.packages("arrow"){code}
> This fails during the verification stage:
> {code:java}
> ** testing if installed package can be loaded from temporary location 
>  sh: line 1:  8386 Segmentation fault  (core dumped) R_TESTS= 
> '/usr/lib64/R/bin/R' --no-save --no-restore --no-echo 2>&1 < 
> '/tmp/RtmpWtq6vV/file1f4b570a7335'
> caught segfault ***
>  address (nil), cause 'memory not mapped'
> Traceback:
>  1: dyn.load(file, DLLpath = DLLpath, ...)
>  2: library.dynam(lib, package, package.lib)
>  3: loadNamespace(package, lib.loc)
>  4: doTryCatch(return(expr), name, parentenv, handler)
>  5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
>  6: tryCatchList(expr, classes, parentenv, handlers)
>  7: tryCatch({    attr(package, "LibPath") <- which.lib.loc    ns <- 
> loadNamespace(package, lib.loc)    env <- attachNamespace(ns, pos = pos, 
> deps, exclude, include.only)}, error = function(e) {    P <- if (!is.null(cc 
> <- conditionCall(e))) paste(" in", deparse(cc)[1L])    else ""    msg <- 
> gettextf("package or namespace load failed for %s%s:\n %s", 
> sQuote(package), P, conditionMessage(e))    if (logical.return) 
> message(paste("Error:", msg), domain = NA)    else stop(msg, call. = FALSE, 
> domain = NA)})
>  8: library(pkg_name, lib.loc = lib, character.only = TRUE, logical.return = 
> TRUE)
>  9: withCallingHandlers(expr, packageStartupMessage = function(c) 
> tryInvokeRestart("muffleMessage"))
> 10: suppressPackageStartupMessages(library(pkg_name, lib.loc = lib, 
> character.only = TRUE, logical.return = TRUE))
> 11: doTryCatch(return(expr), name, parentenv, handler)
> 12: tryCatchOne(expr, names, parentenv, handlers[[1L]])
> 13: tryCatchList(expr, classes, parentenv, handlers)
> 14: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if 
> (!is.null(call)) {    if (identical(call[[1L]], quote(doTryCatch))) 
> call <- sys.call(-4L)    dcall <- deparse(call)[1L]    prefix <- 
> paste("Error in", dcall, ": ")    LONG <- 75L    sm <- 
> strsplit(conditionMessage(e), "\n")[[1L]]    w <- 14L + nchar(dcall, type 
> = "w") + nchar(sm[1L], type = "w")    if (is.na(w)) w <- 14L + 
> nchar(dcall, type = "b") + nchar(sm[1L], type = "b")    if (w > LONG) 
> prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg 
> <- paste0(prefix, conditionMessage(e), "\n")    
> .Internal(seterrmessage(msg[1L]))    if (!silent && 
> isTRUE(getOption("show.error.messages"))) {    cat(msg, file = outFile)    
> .Internal(printDeferredWarnings())    }    
> invisible(structure(msg, class = "try-error", condition = e))})
> 15: try(suppressPackageStartupMessages(library(pkg_name, lib.loc = lib, 
> character.only = TRUE, logical.return = TRUE)))
> 16: tools:::.test_load_package("arrow", 
> "/work/R/x86_64-redhat-linux-gnu-library/4.0/00LOCK-arrow/00new")
> An irrecoverable exception occurred. R is aborting now ...
> ERROR: loading failed
> {code}
> R version info:
> {code:java}
> R version 4.0.4 (2021-02-15) -- "Lost Library Book"
> Copyright (C) 2021 The R Foundation for Statistical Computing
> Platform: x86_64-redhat-linux-gnu (64-bit)
> {code}
> Any thoughts on where to look? (I can only get arrow to work with the latest 
> development version of R and not the release version of 4.0.4).  Thanks.





[jira] [Comment Edited] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302971#comment-17302971
 ] 

Ahmed Riza edited comment on ARROW-11963 at 3/16/21, 11:40 PM:
---

[~npr], here's how I built the C++ libraries after downloading the release 
tarball from 
[https://www.mirrorservice.org/sites/ftp.apache.org/arrow/arrow-3.0.0/apache-arrow-3.0.0.tar.gz]
 and extracting it:
{code:java}
cd /work/apache-arrow-3.0.0/cpp
mkdir build
cd build

cmake -DARROW_COMPUTE=ON -DARROW_CSV=ON -DARROW_DATASET=ON \
  -DARROW_FILESYSTEM=ON -DARROW_JEMALLOC=ON -DARROW_JSON=ON -DARROW_PARQUET=ON \
  -DCMAKE_BUILD_TYPE=release -DARROW_INSTALL_NAME_RPATH=OFF -DARROW_HDFS=ON \
  -DARROW_S3=ON -DARROW_PYTHON=OFF -DARROW_WITH_SNAPPY=ON -DARROW_WITH_ZLIB=ON \
  -DARROW_EXTRA_ERROR_CONTEXT=ON ..

sudo make install{code}
This installs the headers and libraries to "/usr/local/include/arrow" and 
"/usr/local/lib64".
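
One way to double-check that the options passed to `cmake` above actually took effect is to look them up in the build directory's CMakeCache.txt, where CMake records each cached option as NAME:TYPE=VALUE. The sketch below is only illustrative: the helper name `cmake_option_is_on` is made up for this example and is not part of Arrow or CMake.

```shell
# Hypothetical helper: succeed if the given option is recorded as ON in a
# CMakeCache.txt file. Cache entries have the form NAME:TYPE=VALUE.
cmake_option_is_on() {
  cache_file="$1"
  option="$2"
  grep -Eq "^${option}:[A-Z]+=ON\$" "$cache_file"
}
```

For the build described here it could be called as `cmake_option_is_on /work/apache-arrow-3.0.0/cpp/build/CMakeCache.txt ARROW_DATASET`, and likewise for ARROW_PARQUET.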

I've done some basic sanity checks with the compiled C++ libraries by running 
some test C++ code with them, e.g. reading a Parquet file, and that works 
fine.

The reason I compiled the C++ libraries myself is that the attempt to 
install "arrow" did not succeed entirely.  The `install.packages("arrow")` 
call seemed to do the right thing, but when I tried to use any of the "arrow" 
functions from R I ran into the error shown below:
{code:java}
> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> df <- read_parquet("/tmp/small.parquet")  
>    
Error in io___MemoryMappedFile__Open(path, mode) :  
  Cannot call io___MemoryMappedFile__Open(). See 
https://arrow.apache.org/docs/r/articles/install.html for help installing Arrow 
C++ libraries.  
> 
{code}
I have attached the output from `cmake`, `make`, and the installation of the 
"arrow" package from R.

Please let me know if there's anything else I can provide.



[jira] [Comment Edited] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302971#comment-17302971
 ] 

Ahmed Riza edited comment on ARROW-11963 at 3/16/21, 11:49 PM:
---

[~npr], here's how I built the C++ libraries after downloading the release 
tarball from 
[https://www.mirrorservice.org/sites/ftp.apache.org/arrow/arrow-3.0.0/apache-arrow-3.0.0.tar.gz]
 and extracting it:
{code:java}
cd /work/apache-arrow-3.0.0/cpp
mkdir build
cd build

cmake -DARROW_COMPUTE=ON -DARROW_CSV=ON -DARROW_DATASET=ON \
  -DARROW_FILESYSTEM=ON -DARROW_JEMALLOC=ON -DARROW_JSON=ON -DARROW_PARQUET=ON \
  -DCMAKE_BUILD_TYPE=release -DARROW_INSTALL_NAME_RPATH=OFF -DARROW_HDFS=ON \
  -DARROW_S3=ON -DARROW_PYTHON=OFF -DARROW_WITH_SNAPPY=ON -DARROW_WITH_ZLIB=ON \
  -DARROW_EXTRA_ERROR_CONTEXT=ON ..

sudo make install{code}
This installs the headers and libraries to "/usr/local/include/arrow" and 
"/usr/local/lib64".

I've done some basic sanity checks with the compiled C++ libraries by running 
some test C++ code with them, e.g. reading a Parquet file, and that works 
fine.

The reason I compiled the C++ libraries myself is that the attempt to 
install "arrow" did not succeed entirely.  The `install.packages("arrow")` 
call seemed to do the right thing, but when I tried to use any of the "arrow" 
functions from R I ran into the error shown below:
{code:java}
> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> df <- read_parquet("/tmp/small.parquet")  
>    
Error in io___MemoryMappedFile__Open(path, mode) :  
  Cannot call io___MemoryMappedFile__Open(). See 
https://arrow.apache.org/docs/r/articles/install.html for help installing Arrow 
C++ libraries.  
> 
{code}
I have attached the output from `cmake`, `make`, and the installation of the 
"arrow" package from R.

I also removed my locally built C++ arrow libraries and tried installing the 
"arrow" package from R afresh.  The logs are attached as 
{color:#ff5454}R_arrow_install_clean.log.gz{color}.

Please let me know if there's anything else I can provide.



[jira] [Updated] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Riza updated ARROW-11963:
---
Attachment: R_arrow_install_clean.log.gz

> Arrow installation issue with R 4.0.4 on Fedora 33
> --
>
> Key: ARROW-11963
> URL: https://issues.apache.org/jira/browse/ARROW-11963
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Affects Versions: 3.0.0
> Environment: Linux, Fedora 33
>Reporter: Ahmed Riza
>Priority: Major
> Attachments: R_arrow_install.log.gz, R_arrow_install_clean.log.gz, 
> cmake.log, make.log.gz
>
>
> I have been trying to install the "arrow" package, using R 4.0.4 on Linux 
> (Fedora 33).  I have built and installed the C++ arrow libraries (using 
> release version 3.0.0) following the instructions at 
> [https://arrow.apache.org/docs/r/.|https://arrow.apache.org/docs/r/]
> Then, from R, I tried to install "arrow":
> {code:java}
>  install.packages("arrow"){code}
> This fails during the verification stage:
> {code:java}
> ** testing if installed package can be loaded from temporary location 
>  sh: line 1:  8386 Segmentation fault  (core dumped) R_TESTS= 
> '/usr/lib64/R/bin/R' --no-save --no-restore --no-echo 2>&1 < 
> '/tmp/RtmpWtq6vV/file1f4b570a7335'
> caught segfault ***
>  address (nil), cause 'memory not mapped'
> Traceback:
>  1: dyn.load(file, DLLpath = DLLpath, ...)
>  2: library.dynam(lib, package, package.lib)
>  3: loadNamespace(package, lib.loc)
>  4: doTryCatch(return(expr), name, parentenv, handler)
>  5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
>  6: tryCatchList(expr, classes, parentenv, handlers)
>  7: tryCatch({    attr(package, "LibPath") <- which.lib.loc    ns <- 
> loadNamespace(package, lib.loc)    env <- attachNamespace(ns, pos = pos, 
> deps, exclude, include.only)}, error = function(e) {    P <- if (!is.null(cc 
> <- conditionCall(e))) paste(" in", deparse(cc)[1L])    else ""    msg <- 
> gettextf("package or namespace load failed for %s%s:\n %s", 
> sQuote(package), P, conditionMessage(e))    if (logical.return) 
> message(paste("Error:", msg), domain = NA)    else stop(msg, call. = FALSE, 
> domain = NA)})
>  8: library(pkg_name, lib.loc = lib, character.only = TRUE, logical.return = 
> TRUE)
>  9: withCallingHandlers(expr, packageStartupMessage = function(c) 
> tryInvokeRestart("muffleMessage"))
> 10: suppressPackageStartupMessages(library(pkg_name, lib.loc = lib, 
> character.only = TRUE, logical.return = TRUE))
> 11: doTryCatch(return(expr), name, parentenv, handler)
> 12: tryCatchOne(expr, names, parentenv, handlers[[1L]])
> 13: tryCatchList(expr, classes, parentenv, handlers)
> 14: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if 
> (!is.null(call)) {    if (identical(call[[1L]], quote(doTryCatch))) 
> call <- sys.call(-4L)    dcall <- deparse(call)[1L]    prefix <- 
> paste("Error in", dcall, ": ")    LONG <- 75L    sm <- 
> strsplit(conditionMessage(e), "\n")[[1L]]    w <- 14L + nchar(dcall, type 
> = "w") + nchar(sm[1L], type = "w")    if (is.na(w)) w <- 14L + 
> nchar(dcall, type = "b") + nchar(sm[1L], type = "b")    if (w > LONG) 
> prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg 
> <- paste0(prefix, conditionMessage(e), "\n")    
> .Internal(seterrmessage(msg[1L]))    if (!silent && 
> isTRUE(getOption("show.error.messages"))) {    cat(msg, file = outFile)    
> .Internal(printDeferredWarnings())    }    
> invisible(structure(msg, class = "try-error", condition = e))})
> 15: try(suppressPackageStartupMessages(library(pkg_name, lib.loc = lib, 
> character.only = TRUE, logical.return = TRUE)))
> 16: tools:::.test_load_package("arrow", 
> "/work/R/x86_64-redhat-linux-gnu-library/4.0/00LOCK-arrow/00new")
> An irrecoverable exception occurred. R is aborting now ...
> ERROR: loading failed
> {code}
> R version info:
> {code:java}
> R version 4.0.4 (2021-02-15) -- "Lost Library Book"
> Copyright (C) 2021 The R Foundation for Statistical Computing
> Platform: x86_64-redhat-linux-gnu (64-bit)
> {code}
> Any thoughts on where to look? (I can only get arrow to work with the latest 
> development version of R and not the release version of 4.0.4).  Thanks.





[jira] [Comment Edited] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302971#comment-17302971
 ] 

Ahmed Riza edited comment on ARROW-11963 at 3/16/21, 11:53 PM:
---

[~npr], here's how I built the C++ libraries after downloading the release 
tarball from 
[https://www.mirrorservice.org/sites/ftp.apache.org/arrow/arrow-3.0.0/apache-arrow-3.0.0.tar.gz]
 and extracting it:
{code:java}
cd /work/apache-arrow-3.0.0/cpp
mkdir build
cd build

cmake -DARROW_COMPUTE=ON -DARROW_CSV=ON -DARROW_DATASET=ON \
  -DARROW_FILESYSTEM=ON -DARROW_JEMALLOC=ON -DARROW_JSON=ON -DARROW_PARQUET=ON \
  -DCMAKE_BUILD_TYPE=release -DARROW_INSTALL_NAME_RPATH=OFF -DARROW_HDFS=ON \
  -DARROW_S3=ON -DARROW_PYTHON=OFF -DARROW_WITH_SNAPPY=ON -DARROW_WITH_ZLIB=ON \
  -DARROW_EXTRA_ERROR_CONTEXT=ON ..

sudo make install{code}
This installs the headers and libraries to "/usr/local/include/arrow" and 
"/usr/local/lib64".

I've done some basic sanity checks with the compiled C++ libraries by running 
some test C++ code with them, e.g. reading a Parquet file, and that works 
fine.

The reason I compiled the C++ libraries myself is that the attempt to 
install "arrow" did not succeed entirely.  The `install.packages("arrow")` 
call seemed to do the right thing, but when I tried to use any of the "arrow" 
functions from R I ran into the error shown below:
{code:java}
> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> df <- read_parquet("/tmp/small.parquet")  
>    
Error in io___MemoryMappedFile__Open(path, mode) :  
  Cannot call io___MemoryMappedFile__Open(). See 
https://arrow.apache.org/docs/r/articles/install.html for help installing Arrow 
C++ libraries.  
> 
{code}
I have attached the output from `cmake`, `make`, and the installation of the 
"arrow" package from R.

I also removed my locally built C++ arrow libraries and tried installing the 
"arrow" package from R afresh.  The logs are attached as 
{color:#ff5454}R_arrow_install_clean.log.gz{color}.

As can be seen from the latter log, the "arrow" package install from R looks 
OK, but I then run into the same library issue as before, with the same error 
from R as above.

Please let me know if there's anything else I can provide.



[jira] [Comment Edited] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302971#comment-17302971
 ] 

Ahmed Riza edited comment on ARROW-11963 at 3/16/21, 11:57 PM:
---

[~npr], here's how I built the C++ libraries after downloading the release 
tarball from 
[https://www.mirrorservice.org/sites/ftp.apache.org/arrow/arrow-3.0.0/apache-arrow-3.0.0.tar.gz]
 and extracting it:
{code:java}
cd /work/apache-arrow-3.0.0/cpp
mkdir build
cd build

cmake -DARROW_COMPUTE=ON -DARROW_CSV=ON -DARROW_DATASET=ON \
  -DARROW_FILESYSTEM=ON -DARROW_JEMALLOC=ON -DARROW_JSON=ON -DARROW_PARQUET=ON \
  -DCMAKE_BUILD_TYPE=release -DARROW_INSTALL_NAME_RPATH=OFF -DARROW_HDFS=ON \
  -DARROW_S3=ON -DARROW_PYTHON=OFF -DARROW_WITH_SNAPPY=ON -DARROW_WITH_ZLIB=ON \
  -DARROW_EXTRA_ERROR_CONTEXT=ON ..

sudo make install{code}
This installs the headers and libraries to "/usr/local/include/arrow" and 
"/usr/local/lib64".

I've done some basic sanity checks with the compiled C++ libraries by running 
some test C++ code with them, e.g. reading a Parquet file, and that works 
fine.

The reason I compiled the C++ libraries myself is that the attempt to 
install "arrow" did not succeed entirely.  The `install.packages("arrow")` 
call seemed to do the right thing, but when I tried to use any of the "arrow" 
functions from R I ran into the error shown below:
{code:java}
> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> df <- read_parquet("/tmp/small.parquet")  
>    
Error in io___MemoryMappedFile__Open(path, mode) :  
  Cannot call io___MemoryMappedFile__Open(). See 
https://arrow.apache.org/docs/r/articles/install.html for help installing Arrow 
C++ libraries.  
> 
{code}
I have set LD_LIBRARY_PATH to the location of the C++ libs:
{code:java}
$ echo $LD_LIBRARY_PATH  
/usr/local/lib64

$ ls -latr /usr/local/lib64/

total 108992
drwxr-xr-x.  3 root root   27 Sep 21  2019 python3.7
drwxr-xr-x   2 root root    6 Jul 27  2020 bpf
drwxr-xr-x. 16 root root  265 Jul 27  2020 ..
-rw-r--r--   1 root root   529554 Aug 28  2020 libavro.a
-rwxr-xr-x   1 root root   385984 Aug 28  2020 libavro.so.23.0.0
lrwxrwxrwx   1 root root   17 Aug 28  2020 libavro.so.23 -> libavro.so.23.0.0
lrwxrwxrwx   1 root root   13 Aug 28  2020 libavro.so -> libavro.so.23
drwxr-xr-x   3 root root   27 Jan 30 22:17 python3.9
drwxr-xr-x   3 root root   19 Mar 11 16:33 cmake
-rwxr-xr-x   1 root root 26147944 Mar 16 23:37 libarrow.so.300.0.0
-rw-r--r--   1 root root 38207524 Mar 16 23:37 libarrow_bundled_dependencies.a
-rw-r--r--   1 root root 30107900 Mar 16 23:37 libarrow.a
-rw-r--r--   1 root root  8223390 Mar 16 23:38 libparquet.a
-rwxr-xr-x   1 root root  4243704 Mar 16 23:39 libparquet.so.300.0.0
-rwxr-xr-x   1 root root  1273496 Mar 16 23:39 libarrow_dataset.so.300.0.0
-rw-r--r--   1 root root  2465090 Mar 16 23:39 libarrow_dataset.a
lrwxrwxrwx   1 root root   19 Mar 16 23:51 libarrow.so.300 -> libarrow.so.300.0.0
lrwxrwxrwx   1 root root   15 Mar 16 23:51 libarrow.so -> libarrow.so.300
lrwxrwxrwx   1 root root   27 Mar 16 23:51 libarrow_dataset.so.300 -> libarrow_dataset.so.300.0.0
lrwxrwxrwx   1 root root   23 Mar 16 23:51 libarrow_dataset.so -> libarrow_dataset.so.300
lrwxrwxrwx   1 root root   21 Mar 16 23:51 libparquet.so.300 -> libparquet.so.300.0.0
lrwxrwxrwx   1 root root   17 Mar 16 23:51 libparquet.so -> libparquet.so.300
drwxr-xr-x.  7 root root 4096 Mar 16 23:51 .
drwxr-xr-x   2 root root  173 Mar 16 23:51 pkgconfig
{code}
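
To rule out a simple lookup problem, the listing above can be checked mechanically. The following is a minimal sketch, not a diagnosis: both function names are hypothetical, and the library names are the unversioned symlinks shown in the listing. Passing these checks only shows that the files are present and that the directory is on LD_LIBRARY_PATH; it does not explain why R reports the C++ libraries as unavailable.

```shell
# Hypothetical helper: print each expected Arrow shared library that is
# missing from the given directory (prints nothing if all are present).
missing_arrow_libs() {
  dir="$1"
  for lib in libarrow.so libparquet.so libarrow_dataset.so; do
    if [ ! -e "$dir/$lib" ]; then
      echo "$lib"
    fi
  done
}

# Hypothetical helper: succeed only if the given directory is one of the
# colon-separated entries of LD_LIBRARY_PATH.
on_ld_library_path() {
  case ":${LD_LIBRARY_PATH:-}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For the setup in this comment, the calls would be `missing_arrow_libs /usr/local/lib64` and `on_ld_library_path /usr/local/lib64`.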
 

I have attached the output from `cmake`, `make`, and the installation of the 
"arrow" package from R.

I also removed my locally built C++ arrow libraries and tried installing the 
"arrow" package from R afresh.  The logs are attached as 
{color:#ff5454}R_arrow_install_clean.log.gz{color}.

As can be seen from the latter log, the "arrow" package install from R looks 
OK, but I then run into the same library issue as before, with the same error 
from R as above.

Please let me know if there's anything else I can provide.



[jira] [Commented] (ARROW-11994) [R] Build fails if dataset enabled but parquet is not

2021-03-16 Thread Ian Cook (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302979#comment-17302979
 ] 

Ian Cook commented on ARROW-11994:
--

There was a brief discussion of this dependency of Dataset on Parquet at 
https://issues.apache.org/jira/browse/ARROW-11735?focusedCommentId=17291960&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17291960
 but the outcome of that was simply that we allowed both Dataset and Parquet to 
be toggled off in the R package build. (The original scope of ARROW-11735 was 
solely Dataset.)

One simple solution would be to detect that Dataset is enabled and Parquet is 
not and fail the build with a helpful message indicating you must enable both 
or neither.
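
A minimal sketch of such a check, assuming the configure step has the two settings available as ON/OFF strings. The function name `check_dataset_parquet` and its arguments are hypothetical and are not taken from the R package's actual build scripts:

```shell
# Hypothetical guard: fail with a helpful message when Dataset is enabled
# but Parquet is not. Arguments are the two settings as ON/OFF strings.
check_dataset_parquet() {
  dataset="$1"
  parquet="$2"
  if [ "$dataset" = "ON" ] && [ "$parquet" != "ON" ]; then
    echo "Error: Dataset is enabled but Parquet is not; enable both or neither." >&2
    return 1
  fi
  return 0
}
```

A build script could call `check_dataset_parquet "$dataset_setting" "$parquet_setting" || exit 1` before compiling, where the two variable names are again placeholders.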

> [R] Build fails if dataset enabled but parquet is not
> -
>
> Key: ARROW-11994
> URL: https://issues.apache.org/jira/browse/ARROW-11994
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Reporter: Neal Richardson
>Priority: Minor
>
> Following ARROW-11735; discovered while working on ARROW-10734. The 
> arrow::dataset::ParquetFileFormat and related classes require both dataset 
> and parquet. The {{#if defined}} logic in r/src/dataset.cpp is right and 
> requires both, but the wrapping that is generated for arrowExports.cpp 
> uses only the annotation on the functions, {{[[dataset::export]]}}. So 
> the ParquetFileFormat methods in arrowExports.cpp are guarded only by 
> ARROW_R_WITH_DATASET, and the build fails if parquet is not available.
> Not a priority to fix (for Solaris I can turn off ARROW_DATASET and avoid 
> this), just wanted to note it in case we need to revisit this wrapping logic 
> later anyway. cc [~icook]





[jira] [Updated] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Riza updated ARROW-11963:
---
Attachment: (was: R_arrow_install.log.gz)

> Arrow installation issue with R 4.0.4 on Fedora 33
> --
>
> Key: ARROW-11963
> URL: https://issues.apache.org/jira/browse/ARROW-11963
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Affects Versions: 3.0.0
> Environment: Linux, Fedora 33
>Reporter: Ahmed Riza
>Priority: Major
> Attachments: R_arrow_install.log.gz, R_arrow_install_clean.log.gz, 
> cmake.log, make.log.gz
>
>
> I have been trying to install the "arrow" package, using R 4.0.4 on Linux 
> (Fedora 33).  I have built and installed the C++ arrow libraries (using 
> release version 3.0.0) following the instructions at 
> [https://arrow.apache.org/docs/r/.|https://arrow.apache.org/docs/r/]
> Then, from R, I tried to install "arrow":
> {code:java}
>  install.packages("arrow"){code}
> This fails during the verification stage:
> {code:java}
> ** testing if installed package can be loaded from temporary location 
>  sh: line 1:  8386 Segmentation fault  (core dumped) R_TESTS= 
> '/usr/lib64/R/bin/R' --no-save --no-restore --no-echo 2>&1 < 
> '/tmp/RtmpWtq6vV/file1f4b570a7335'
> caught segfault ***
>  address (nil), cause 'memory not mapped'
> Traceback:
>  1: dyn.load(file, DLLpath = DLLpath, ...)
>  2: library.dynam(lib, package, package.lib)
>  3: loadNamespace(package, lib.loc)
>  4: doTryCatch(return(expr), name, parentenv, handler)
>  5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
>  6: tryCatchList(expr, classes, parentenv, handlers)
>  7: tryCatch({    attr(package, "LibPath") <- which.lib.loc    ns <- 
> loadNamespace(package, lib.loc)    env <- attachNamespace(ns, pos = pos, 
> deps, exclude, include.only)}, error = function(e) {    P <- if (!is.null(cc 
> <- conditionCall(e))) paste(" in", deparse(cc)[1L])    else ""    msg <- 
> gettextf("package or namespace load failed for %s%s:\n %s", 
> sQuote(package), P, conditionMessage(e))    if (logical.return) 
> message(paste("Error:", msg), domain = NA)    else stop(msg, call. = FALSE, 
> domain = NA)})
>  8: library(pkg_name, lib.loc = lib, character.only = TRUE, logical.return = 
> TRUE)
>  9: withCallingHandlers(expr, packageStartupMessage = function(c) 
> tryInvokeRestart("muffleMessage"))
> 10: suppressPackageStartupMessages(library(pkg_name, lib.loc = lib, 
> character.only = TRUE, logical.return = TRUE))
> 11: doTryCatch(return(expr), name, parentenv, handler)
> 12: tryCatchOne(expr, names, parentenv, handlers[[1L]])
> 13: tryCatchList(expr, classes, parentenv, handlers)
> 14: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if 
> (!is.null(call)) {    if (identical(call[[1L]], quote(doTryCatch))) 
> call <- sys.call(-4L)    dcall <- deparse(call)[1L]    prefix <- 
> paste("Error in", dcall, ": ")    LONG <- 75L    sm <- 
> strsplit(conditionMessage(e), "\n")[[1L]]    w <- 14L + nchar(dcall, type 
> = "w") + nchar(sm[1L], type = "w")    if (is.na(w)) w <- 14L + 
> nchar(dcall, type = "b") + nchar(sm[1L], type = "b")    if (w > LONG) 
> prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg 
> <- paste0(prefix, conditionMessage(e), "\n")    
> .Internal(seterrmessage(msg[1L]))    if (!silent && 
> isTRUE(getOption("show.error.messages"))) {    
> cat(msg, file = outFile)    .Internal(printDeferredWarnings())    }    
> invisible(structure(msg, class = "try-error", condition = e))})
> 15: try(suppressPackageStartupMessages(library(pkg_name, lib.loc = lib, 
> character.only = TRUE, logical.return = TRUE)))
> 16: tools:::.test_load_package("arrow", 
> "/work/R/x86_64-redhat-linux-gnu-library/4.0/00LOCK-arrow/00new")
> An irrecoverable exception occurred. R is aborting now ...
> ERROR: loading failed
> {code}
> R version info:
> {code:java}
> R version 4.0.4 (2021-02-15) -- "Lost Library Book"
> Copyright (C) 2021 The R Foundation for Statistical Computing
> Platform: x86_64-redhat-linux-gnu (64-bit)
> {code}
> Any thoughts on where to look? (I can only get arrow to work with the latest 
> development version of R and not the release version of 4.0.4).  Thanks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Ahmed Riza (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Riza updated ARROW-11963:
---
Attachment: R_arrow_install.log.gz



[jira] [Commented] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Neal Richardson (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302991#comment-17302991
 ] 

Neal Richardson commented on ARROW-11963:
-

Thanks. This kind of failure often means that the compiler that built the Arrow 
C++ library is not the same as the one that R used to compile the package. When 
you just do `install.packages("arrow")` without installing the C++ library 
separately, it will build with the matching compiler and flags. I'm not sure 
why that installation failed for you, but if you look at the docs link that the 
error message suggested 
(https://arrow.apache.org/docs/r/articles/install.html#troubleshooting), there 
are a few possibilities. 

Among the options now:

1. Check out what {{R CMD config CXX11}} says, and check that it matches the 
compiler from your cmake output (gcc 10 I believe I saw, it's at the top). If 
that doesn't match, you can set CC and CXX to match what R has set and retry 
the C++ library build. (See 
https://github.com/apache/arrow/blob/master/r/tools/linuxlibs.R#L317-L318 for 
reference.)
2. Uninstall the Arrow C++ library you built, set the env var 
{{ARROW_R_DEV=true}} for more verbosity on the build, retry the bundled build 
with install.packages("arrow"), and let's see why that failed. Even if #1 works 
for you, I would be interested to see this if you don't mind--we work hard so 
that this doesn't happen, and I'd like to know more so we can fix it.
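
One way to compare the two toolchains mentioned in option 1 is to compile the same small file with each compiler and compare what it reports. The following is only a sketch and not part of the Arrow build. The function names `CompilerId` and `LibstdcxxCxx11Abi` are made up for illustration, and checking the libstdc++ `_GLIBCXX_USE_CXX11_ABI` macro is an extra assumption about what might differ between builds, not something diagnosed in this issue.

```cpp
#include <string>

// Describes the compiler that built this translation unit, using the
// predefined version macros of clang and gcc. Other compilers fall
// through to "unknown".
std::string CompilerId() {
#if defined(__clang__)
  return "clang " + std::to_string(__clang_major__) + "." +
         std::to_string(__clang_minor__);
#elif defined(__GNUC__)
  return "gcc " + std::to_string(__GNUC__) + "." +
         std::to_string(__GNUC_MINOR__);
#else
  return "unknown";
#endif
}

// Reports the libstdc++ dual-ABI setting (0 or 1), or -1 when the macro
// is not defined, for example with a different standard library.
int LibstdcxxCxx11Abi() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
  return _GLIBCXX_USE_CXX11_ABI;
#else
  return -1;
#endif
}
```

Building this once with the compiler reported by `R CMD config CXX11` and once with the compiler shown in the cmake output, then printing both values from a small `main`, shows whether the two toolchains agree.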


[jira] [Commented] (ARROW-11963) Arrow installation issue with R 4.0.4 on Fedora 33

2021-03-16 Thread Neal Richardson (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302995#comment-17302995
 ] 

Neal Richardson commented on ARROW-11963:
-

Sorry, I started writing that before I saw that you had added more logs. I 
think those two recommendations are still what we need to do: check compiler 
versions, and retry with ARROW_R_DEV=true for more output. (We should make the 
build failure suggest that more directly.)



[jira] [Commented] (ARROW-11994) [R] Build fails if dataset enabled but parquet is not

2021-03-16 Thread Neal Richardson (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302994#comment-17302994
 ] 

Neal Richardson commented on ARROW-11994:
-

Right, my recollection of that discussion was different from this though. We 
are able to build libarrow_dataset without libparquet, that's all fine. This is 
just in our cpp codegen, it can't handle that as currently written (and it's 
probably not worth trying to make it work). 

Your simple solution is probably fine (and not worth the trouble right now TBH, 
I'll just make both of them off in the solaris build).

> [R] Build fails if dataset enabled but parquet is not
> -
>
> Key: ARROW-11994
> URL: https://issues.apache.org/jira/browse/ARROW-11994
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Reporter: Neal Richardson
>Priority: Minor
>
> Following ARROW-11735; discovered while working on ARROW-10734. The 
> arrow::dataset::ParquetFileFormat and related classes require both dataset 
> and parquet. The {{#if defined}} logic in r/src/dataset.cpp is right and both 
> are required, but in the wrapping that is generated for arrowExports.cpp, we 
> only use the annotation on the functions, {{[[dataset::export]]}} to wrap. So 
> the ParquetFileFormat methods in arrowExports.cpp are if defined 
> ARROW_R_WITH_DATASET and fail if parquet is not available.
> Not a priority to fix (for Solaris I can turn off ARROW_DATASET and avoid 
> this), just wanted to note it in case we need to revisit this wrapping logic 
> later anyway. cc [~icook]
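
The guard mismatch described above can be illustrated in isolation. This is a minimal sketch, not the generated arrowExports.cpp: `ARROW_R_WITH_DATASET` is the macro named in the issue, while `ARROW_R_WITH_PARQUET` and the two constants are assumed names used only for illustration.

```cpp
// Stand-in for the build configuration in question: dataset is enabled,
// parquet is not. ARROW_R_WITH_PARQUET is deliberately left undefined.
#define ARROW_R_WITH_DATASET

// Guarding on dataset alone: the guarded code is compiled in this
// configuration even though it would need parquet.
#if defined(ARROW_R_WITH_DATASET)
constexpr bool kGuardedOnDatasetOnly = true;
#else
constexpr bool kGuardedOnDatasetOnly = false;
#endif

// Guarding on both features: the guarded code is skipped when either
// one is missing.
#if defined(ARROW_R_WITH_DATASET) && defined(ARROW_R_WITH_PARQUET)
constexpr bool kGuardedOnBoth = true;
#else
constexpr bool kGuardedOnBoth = false;
#endif
```

With only the dataset macro defined, the first guard lets the code through and the second does not, which is the difference between the generated wrapper and the hand-written `#if defined` logic described in the issue.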





[jira] [Assigned] (ARROW-11954) [C++] arrow/util/io_util.cc does not compile on Solaris

2021-03-16 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson reassigned ARROW-11954:
---

Assignee: Neal Richardson

> [C++] arrow/util/io_util.cc does not compile on Solaris
> ---
>
> Key: ARROW-11954
> URL: https://issues.apache.org/jira/browse/ARROW-11954
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: C++
>Reporter: Neal Richardson
>Assignee: Neal Richardson
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Looks similar to ARROW-11740
> {code}
> /export/home/XI4sjNd/Rtemp/RtmpvN4Lx2/fileef105d2909/cpp/src/arrow/util/io_util.cc:
>  In function ‘arrow::Status arrow::internal::MemoryMapRemap(void*, 
> std::size_t, std::size_t, int, void**)’:
> /export/home/XI4sjNd/Rtemp/RtmpvN4Lx2/fileef105d2909/cpp/src/arrow/util/io_util.cc:1089:48:
>  error: ‘MREMAP_MAYMOVE’ was not declared in this scope
> *new_addr = mremap(addr, old_size, new_size, MREMAP_MAYMOVE);
>  ^
> /export/home/XI4sjNd/Rtemp/RtmpvN4Lx2/fileef105d2909/cpp/src/arrow/util/io_util.cc:1089:62:
>  error: ‘mremap’ was not declared in this scope
> *new_addr = mremap(addr, old_size, new_size, MREMAP_MAYMOVE);
>  ^
> /export/home/XI4sjNd/Rtemp/RtmpvN4Lx2/fileef105d2909/cpp/src/arrow/util/io_util.cc:
>  In function ‘arrow::Status arrow::internal::MemoryAdviseWillNeed(const 
> std::vector&)’:
> /export/home/XI4sjNd/Rtemp/RtmpvN4Lx2/fileef105d2909/cpp/src/arrow/util/io_util.cc:1144:59:
>  error: ‘POSIX_MADV_WILLNEED’ was not declared in this scope
> int err = posix_madvise(aligned.addr, aligned.size, POSIX_MADV_WILLNEED);
>  ^
> /export/home/XI4sjNd/Rtemp/RtmpvN4Lx2/fileef105d2909/cpp/src/arrow/util/io_util.cc:1144:78:
>  error: ‘posix_madvise’ was not declared in this scope
> int err = posix_madvise(aligned.addr, aligned.size, POSIX_MADV_WILLNEED);
>  ^
> {code}
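
The errors above come from names that the platform's headers do not declare. One common way to handle this is to test for the corresponding macros before using them. The sketch below assumes a POSIX system that provides `<sys/mman.h>`; the helper names are hypothetical, and this is not necessarily how the issue was fixed in Arrow.

```cpp
#include <cstddef>
#include <sys/mman.h>

// True when <sys/mman.h> exposes the Linux-style mremap() flag.
constexpr bool HasMremap() {
#if defined(MREMAP_MAYMOVE)
  return true;
#else
  return false;
#endif
}

// True when <sys/mman.h> exposes the posix_madvise() advice constant.
constexpr bool HasPosixMadvise() {
#if defined(POSIX_MADV_WILLNEED)
  return true;
#else
  return false;
#endif
}

// Hints that a memory range will be needed soon. Where posix_madvise()
// is not declared, the hint is skipped and success (0) is reported.
int AdviseWillNeed(void* addr, std::size_t size) {
#if defined(POSIX_MADV_WILLNEED)
  return posix_madvise(addr, size, POSIX_MADV_WILLNEED);
#else
  (void)addr;
  (void)size;
  return 0;
#endif
}
```

Because the advice is only a hint, treating it as a no-op on platforms without `posix_madvise` keeps the behaviour correct. A fallback for `mremap` would need a real replacement (for example unmapping and mapping again), which is not shown here.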





[jira] [Assigned] (ARROW-11954) [C++] arrow/util/io_util.cc does not compile on Solaris

2021-03-16 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson reassigned ARROW-11954:
---

Assignee: Antoine Pitrou  (was: Neal Richardson)



[jira] [Assigned] (ARROW-11736) [R] Allow string compute functions to be optional

2021-03-16 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson reassigned ARROW-11736:
---

Assignee: Neal Richardson

> [R] Allow string compute functions to be optional
> -
>
> Key: ARROW-11736
> URL: https://issues.apache.org/jira/browse/ARROW-11736
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: R
>Reporter: Neal Richardson
>Assignee: Neal Richardson
>Priority: Major
> Fix For: 4.0.0
>
>
> The Solaris build fails to build {{libarrow_bundled_dependencies.a}} because 
> of some mismatch of arguments to the {{ar}} command: 
> {code}
> [ 19%] Bundling 
> /export/home/XnknpBn/Rtemp/RtmpBOhxfH/file66df7a592ae4/release/libarrow_bundled_dependencies.a
> gmake[2]: Entering directory 
> '/export/home/XnknpBn/Rtemp/RtmpBOhxfH/file66df7a592ae4'
> usage: ar -d[-SvV] archive file ...
>ar -m[-abiSvV] [posname] archive file ...
>ar -p[-vV][-sS] archive [file ...]
>ar -q[-cuvSV] [-abi] [posname] [file ...]
>ar -r[-cuvSV] [-abi] [posname] [file ...]
>ar -t[-vV][-sS] archive [file ...]
>ar -x[-vV][-sSCT] archive [file ...]
> gmake[2]: *** 
> [src/arrow/CMakeFiles/arrow_bundled_dependencies.dir/build.make:61: 
> release/libarrow_bundled_dependencies.a] Error 1
> {code}
> If ARROW_PARQUET=OFF (ARROW-11735), the only dependencies to bundle are re2 
> and utf8proc. So we could either fix the {{ar}} invocation, or we could make 
> re2 and utf8proc optional. Build-wise, they are optional, but we have some 
> tests that call the string kernels, and we'd need to know that they should be 
> skipped (i.e. another option in {{skip_if_not_available()}}).





[jira] [Created] (ARROW-11996) [R] Make r/configure run successfully on Solaris

2021-03-16 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-11996:
---

 Summary: [R] Make r/configure run successfully on Solaris
 Key: ARROW-11996
 URL: https://issues.apache.org/jira/browse/ARROW-11996
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: R
Reporter: Neal Richardson
Assignee: Neal Richardson


Replace some {{$()}} with backticks and use {{sed}} in a safe way





[jira] [Commented] (ARROW-11994) [R] Build fails if dataset enabled but parquet is not

2021-03-16 Thread Ian Cook (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303003#comment-17303003
 ] 

Ian Cook commented on ARROW-11994:
--

Yep, ARROW-11735 didn't change anything in {{cpp/}}. IIRC it was and is 
possible to build libarrow with {{ARROW_DATASET=ON}} and {{ARROW_PARQUET=OFF}} 
but the build might fail if both are not actually installed and that building 
that way might cause runtime errors.



[jira] [Comment Edited] (ARROW-11994) [R] Build fails if dataset enabled but parquet is not

2021-03-16 Thread Ian Cook (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-11994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303003#comment-17303003
 ] 

Ian Cook edited comment on ARROW-11994 at 3/17/21, 12:56 AM:
-

Yep, ARROW-11735 didn't change anything in {{cpp/}}. IIRC it was and is 
possible to build libarrow with {{ARROW_DATASET=ON}} and {{ARROW_PARQUET=OFF}} 
but the build might fail if both are not actually installed and building that 
way might cause runtime errors.


was (Author: icook):
Yep, ARROW-11735 didn't change anything in {{cpp/}}. IIRC it was and is 
possible to build libarrow with {{ARROW_DATASET=ON}} and {{ARROW_PARQUET=OFF}} 
but the build might fail if both are not actually installed and that building 
that way might cause runtime errors.



[jira] [Assigned] (ARROW-11986) Implement IN expressions for doubles and floats

2021-03-16 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou reassigned ARROW-11986:


Assignee: João Victor Huguenin

> Implement IN expressions for doubles and floats
> ---
>
> Key: ARROW-11986
> URL: https://issues.apache.org/jira/browse/ARROW-11986
> Project: Apache Arrow
>  Issue Type: Task
>  Components: C++ - Gandiva
>Reporter: João Victor Huguenin
>Assignee: João Victor Huguenin
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Add functions to process IN expressions for Arrow fields with double and 
> float types.
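
As a rough illustration of what an IN expression over floating-point values computes, the sketch below checks each column value for membership in a set of literals. It uses only the C++ standard library; `EvaluateIn` is a hypothetical name and this is not Gandiva's API or implementation.

```cpp
#include <unordered_set>
#include <vector>

// For each value in `column`, reports whether it appears in `values`.
// T is expected to be float or double.
template <typename T>
std::vector<bool> EvaluateIn(const std::vector<T>& column,
                             const std::unordered_set<T>& values) {
  std::vector<bool> result;
  result.reserve(column.size());
  for (const T& value : column) {
    // Membership uses exact equality on the floating-point value.
    result.push_back(values.count(value) > 0);
  }
  return result;
}
```

For example, `EvaluateIn<double>({1.5, 2.0, 3.25}, {2.0, 3.25})` marks the second and third entries as matches. Since the lookup relies on exact equality, a NaN input never matches in this sketch, and null handling is left out entirely.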





[jira] [Assigned] (ARROW-11987) [C++][Gandiva] Implement trigonometric functions on Gandiva

2021-03-16 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou reassigned ARROW-11987:


Assignee: João Pedro Antunes Ferreira

> [C++][Gandiva] Implement trigonometric functions on Gandiva
> ---
>
> Key: ARROW-11987
> URL: https://issues.apache.org/jira/browse/ARROW-11987
> Project: Apache Arrow
>  Issue Type: Wish
>  Components: C++ - Gandiva
>Reporter: João Pedro Antunes Ferreira
>Assignee: João Pedro Antunes Ferreira
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 7h 50m
>
> Implement base trigonometric functions:
>  * sin;
>  * cos;
>  * asin;
>  * acos;
>  * tan;
>  * atan;
>  * sinh;
>  * cosh;
>  * tanh;
>  * cotg;
>  * radians;
>  * degrees;
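
Most functions in the list map directly onto `<cmath>`. The three that do not are sketched below with hypothetical names, assuming that `cotg` means the cotangent and that `radians` and `degrees` convert between the two angle units. This is an illustration only, not the Gandiva code.

```cpp
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Converts an angle from degrees to radians.
double Radians(double degrees) { return degrees * kPi / 180.0; }

// Converts an angle from radians to degrees.
double Degrees(double radians) { return radians * 180.0 / kPi; }

// Cotangent, computed as the reciprocal of the tangent.
double Cot(double x) { return 1.0 / std::tan(x); }
```

`Cot(0.0)` divides by zero and yields infinity under IEEE 754 arithmetic; whether such an input should instead produce an error is a design decision this sketch does not make.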





[jira] [Updated] (ARROW-11992) [Rust][Parquet] Add upgrade notes on 4.0 rename of LogicalType #9731

2021-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-11992:
---
Labels: pull-request-available  (was: )

> [Rust][Parquet] Add upgrade notes on 4.0 rename of LogicalType #9731
> 
>
> Key: ARROW-11992
> URL: https://issues.apache.org/jira/browse/ARROW-11992
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust
>Reporter: Andrew Lamb
>Assignee: Andrew Lamb
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (ARROW-11988) [C++][Gandiva]Implements the last_day function in Gandiva

2021-03-16 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou reassigned ARROW-11988:


Assignee: Anthony Louis Gotlib Ferreira

> [C++][Gandiva]Implements the last_day function in Gandiva
> -
>
> Key: ARROW-11988
> URL: https://issues.apache.org/jira/browse/ARROW-11988
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++ - Gandiva
>Reporter: Anthony Louis Gotlib Ferreira
>Assignee: Anthony Louis Gotlib Ferreira
>Priority: Trivial
>  Labels: pull-request-available, pull_request_available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Adds support for the `last_day` function in Gandiva, similar to the 
> Apache Impala implementation: 
> https://docs.datafabric.hpe.com/62/Impala/new_features_impala_2100.html.
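
The calendar rule behind a `last_day` function can be sketched as follows, assuming it returns the last day of the month that contains the input date. The helper names are hypothetical and work on a plain year and month in the Gregorian calendar; a real Gandiva function would operate on date or timestamp values, which is not shown.

```cpp
// Gregorian leap-year rule: divisible by 4, except centuries that are
// not divisible by 400.
bool IsLeapYear(int year) {
  return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

// Returns the day number of the last day of the given month (1 to 12),
// or -1 when the month is out of range.
int LastDayOfMonth(int year, int month) {
  static const int kDaysInMonth[12] = {31, 28, 31, 30, 31, 30,
                                       31, 31, 30, 31, 30, 31};
  if (month < 1 || month > 12) {
    return -1;
  }
  if (month == 2 && IsLeapYear(year)) {
    return 29;
  }
  return kDaysInMonth[month - 1];
}
```

Under this rule, a date in February 2021 maps to day 28 and a date in February 2020 maps to day 29.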





[jira] [Updated] (ARROW-11988) [C++][Gandiva] Implements the last_day function

2021-03-16 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou updated ARROW-11988:
-
Summary: [C++][Gandiva] Implements the last_day function  (was: 
[C++][Gandiva]Implements the last_day function in Gandiva)

> [C++][Gandiva] Implements the last_day function
> ---
>
> Key: ARROW-11988
> URL: https://issues.apache.org/jira/browse/ARROW-11988
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++ - Gandiva
>Reporter: Anthony Louis Gotlib Ferreira
>Assignee: Anthony Louis Gotlib Ferreira
>Priority: Trivial
>  Labels: pull-request-available, pull_request_available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Adds support for the `last_day` function in Gandiva, similar to the 
> Apache Impala implementation: 
> https://docs.datafabric.hpe.com/62/Impala/new_features_impala_2100.html.





[jira] [Updated] (ARROW-11995) [C++][Gandiva] Add support for IN expressions for date and timestamp

2021-03-16 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou updated ARROW-11995:
-
Issue Type: New Feature  (was: Bug)

> [C++][Gandiva] Add support for IN expressions for date and timestamp
> 
>
> Key: ARROW-11995
> URL: https://issues.apache.org/jira/browse/ARROW-11995
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Sagnik Chakraborty
>Assignee: Sagnik Chakraborty
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (ARROW-11954) [C++] arrow/util/io_util.cc does not compile on Solaris

2021-03-16 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson resolved ARROW-11954.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 9729
[https://github.com/apache/arrow/pull/9729]

> [C++] arrow/util/io_util.cc does not compile on Solaris
> ---
>
> Key: ARROW-11954
> URL: https://issues.apache.org/jira/browse/ARROW-11954
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: C++
>Reporter: Neal Richardson
>Assignee: Antoine Pitrou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Looks similar to ARROW-11740
> {code}
> /export/home/XI4sjNd/Rtemp/RtmpvN4Lx2/fileef105d2909/cpp/src/arrow/util/io_util.cc:
>  In function ‘arrow::Status arrow::internal::MemoryMapRemap(void*, 
> std::size_t, std::size_t, int, void**)’:
> /export/home/XI4sjNd/Rtemp/RtmpvN4Lx2/fileef105d2909/cpp/src/arrow/util/io_util.cc:1089:48:
>  error: ‘MREMAP_MAYMOVE’ was not declared in this scope
> *new_addr = mremap(addr, old_size, new_size, MREMAP_MAYMOVE);
>  ^
> /export/home/XI4sjNd/Rtemp/RtmpvN4Lx2/fileef105d2909/cpp/src/arrow/util/io_util.cc:1089:62:
>  error: ‘mremap’ was not declared in this scope
> *new_addr = mremap(addr, old_size, new_size, MREMAP_MAYMOVE);
>  ^
> /export/home/XI4sjNd/Rtemp/RtmpvN4Lx2/fileef105d2909/cpp/src/arrow/util/io_util.cc:
>  In function ‘arrow::Status arrow::internal::MemoryAdviseWillNeed(const 
> std::vector&)’:
> /export/home/XI4sjNd/Rtemp/RtmpvN4Lx2/fileef105d2909/cpp/src/arrow/util/io_util.cc:1144:59:
>  error: ‘POSIX_MADV_WILLNEED’ was not declared in this scope
> int err = posix_madvise(aligned.addr, aligned.size, POSIX_MADV_WILLNEED);
>  ^
> /export/home/XI4sjNd/Rtemp/RtmpvN4Lx2/fileef105d2909/cpp/src/arrow/util/io_util.cc:1144:78:
>  error: ‘posix_madvise’ was not declared in this scope
> int err = posix_madvise(aligned.addr, aligned.size, POSIX_MADV_WILLNEED);
>  ^
> {code}





[jira] [Resolved] (ARROW-11774) [R] one-line install from source on macOS

2021-03-16 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson resolved ARROW-11774.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 9579
[https://github.com/apache/arrow/pull/9579]

> [R] one-line install from source on macOS
> -
>
> Key: ARROW-11774
> URL: https://issues.apache.org/jira/browse/ARROW-11774
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: R
>Reporter: Jonathan Keane
>Assignee: Jonathan Keane
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (ARROW-11659) [R] Preserve group_by .drop argument

2021-03-16 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson resolved ARROW-11659.
-
Resolution: Fixed

Issue resolved by pull request 9716
[https://github.com/apache/arrow/pull/9716]

> [R] Preserve group_by .drop argument
> 
>
> Key: ARROW-11659
> URL: https://issues.apache.org/jira/browse/ARROW-11659
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: R
>Reporter: Neal Richardson
>Assignee: Ian Cook
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Updated] (ARROW-11996) [R] Make r/configure run successfully on Solaris

2021-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-11996:
---
Labels: pull-request-available  (was: )

> [R] Make r/configure run successfully on Solaris
> 
>
> Key: ARROW-11996
> URL: https://issues.apache.org/jira/browse/ARROW-11996
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: R
>Reporter: Neal Richardson
>Assignee: Neal Richardson
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Replace some {{$()}} with backticks and use {{sed}} in a safe way





[jira] [Updated] (ARROW-11997) [Python] concat_table crashes python interpreter

2021-03-16 Thread Young Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Young Wu updated ARROW-11997:
-
Summary: [Python] concat_table crashes python interpreter  (was: 
concat_table crashes python interpreter)

> [Python] concat_table crashes python interpreter
> 
>
> Key: ARROW-11997
> URL: https://issues.apache.org/jira/browse/ARROW-11997
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 3.0.0
> Environment: Python 3.8.6
>Reporter: Young Wu
>Priority: Major
>
> `pyarrow.concat_table([None])` crashes the Python interpreter. It would be 
> better to raise a Python exception; otherwise, when used in a Jupyter 
> environment, all unsaved data is lost when the crash happens.
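
Until such a check exists in the library, a caller-side guard is one possible workaround. The sketch below is an assumption-laden illustration: `reject_none_tables` is a hypothetical helper, not a pyarrow function, and it only validates the input list before it is handed to the concatenation call.

```python
def reject_none_tables(tables):
    """Return `tables` as a list, raising TypeError if any element is None."""
    checked = list(tables)
    for index, table in enumerate(checked):
        if table is None:
            # Fail with an ordinary Python exception instead of passing
            # None down into native code.
            raise TypeError(f"tables[{index}] is None; expected a table object")
    return checked
```

Assuming the call in the report refers to `pyarrow.concat_tables`, usage would look like `pyarrow.concat_tables(reject_none_tables(tables))`, so that a bad input surfaces as a `TypeError` in the notebook rather than a crash.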





[jira] [Created] (ARROW-11997) concat_table crashes python interpreter

2021-03-16 Thread Young Wu (Jira)
Young Wu created ARROW-11997:


 Summary: concat_table crashes python interpreter
 Key: ARROW-11997
 URL: https://issues.apache.org/jira/browse/ARROW-11997
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 3.0.0
 Environment: Python 3.8.6
Reporter: Young Wu


`pyarrow.concat_table([None])` crashes the Python interpreter. It would be 
better to raise a Python exception; otherwise, when used in a Jupyter 
environment, all unsaved data is lost when the crash happens.





[jira] [Comment Edited] (ARROW-9226) [Python] pyarrow.fs.HadoopFileSystem - retrieve options from core-site.xml or hdfs-site.xml if available

2021-03-16 Thread wondertx (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303037#comment-17303037
 ] 

wondertx edited comment on ARROW-9226 at 3/17/21, 2:33 AM:
---

If HA is not supported by `fs.HadoopFileSystem`, `pyarrow.hdfs.connect` 
cannot simply be replaced.


was (Author: wondertx):
If HA support is not supported by `fs.HadoopFileSystem`, `pyarrow.hdfs.connect` 
cannot be simply replaced

`

> [Python] pyarrow.fs.HadoopFileSystem - retrieve options from core-site.xml or 
> hdfs-site.xml if available
> 
>
> Key: ARROW-9226
> URL: https://issues.apache.org/jira/browse/ARROW-9226
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Python
>Affects Versions: 0.17.1
>Reporter: Bruno Quinart
>Priority: Minor
>  Labels: hdfs
> Fix For: 4.0.0
>
>
> The 'legacy' pyarrow.hdfs.connect was somehow able to get the namenode info 
> from the Hadoop configuration files.
> The new pyarrow.fs.HadoopFileSystem requires the host to be specified.
> Inferring this info from "the environment" makes it easier to deploy 
> pipelines.
> More importantly, for HA namenodes it is almost impossible to know for sure 
> what to specify. If a rolling restart is ongoing, the active namenode changes. 
> There is no guarantee which one will be active in an HA setup.
> I tried connecting to the standby namenode. The connection gets established, 
> but writing a file raises an error saying that writes to standby namenodes 
> are not allowed.
>  
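
As a rough sketch of what "inferring this info from the environment" could involve, the Python below reads properties out of a Hadoop-style configuration document. It does not describe how pyarrow behaves or will behave; `read_hadoop_properties` is a hypothetical helper, and the sample XML with the nameservice `mycluster` is invented for the example. `fs.defaultFS` is the standard Hadoop property that names the default filesystem.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    # Hadoop config files are a flat list of <property><name/><value/></property>.
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Invented sample content standing in for a real core-site.xml.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
</configuration>"""

props = read_hadoop_properties(SAMPLE_CORE_SITE)
default_fs = props.get("fs.defaultFS")
```

With an HA setup, the value found this way is typically a logical nameservice rather than a single namenode host, which is why a fixed host argument is hard to choose by hand.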





[jira] [Commented] (ARROW-9226) [Python] pyarrow.fs.HadoopFileSystem - retrieve options from core-site.xml or hdfs-site.xml if available

2021-03-16 Thread wondertx (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303037#comment-17303037
 ] 

wondertx commented on ARROW-9226:
-

If HA is not supported by `fs.HadoopFileSystem`, `pyarrow.hdfs.connect` 
cannot simply be replaced.

`

> [Python] pyarrow.fs.HadoopFileSystem - retrieve options from core-site.xml or 
> hdfs-site.xml if available
> 
>
> Key: ARROW-9226
> URL: https://issues.apache.org/jira/browse/ARROW-9226
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Python
>Affects Versions: 0.17.1
>Reporter: Bruno Quinart
>Priority: Minor
>  Labels: hdfs
> Fix For: 4.0.0
>
>
> The 'legacy' pyarrow.hdfs.connect was somehow able to get the namenode info 
> from the Hadoop configuration files.
> The new pyarrow.fs.HadoopFileSystem requires the host to be specified.
> Inferring this info from "the environment" makes it easier to deploy 
> pipelines.
> More importantly, for HA namenodes it is almost impossible to know for sure 
> what to specify. If a rolling restart is ongoing, the active namenode changes. 
> There is no guarantee which one will be active in an HA setup.
> I tried connecting to the standby namenode. The connection gets established, 
> but writing a file raises an error saying that writes to standby namenodes 
> are not allowed.
>  





[jira] [Created] (ARROW-11998) [C++] Make it easier to create vectors of move-only data types for tests

2021-03-16 Thread Weston Pace (Jira)
Weston Pace created ARROW-11998:
---

 Summary: [C++] Make it easier to create vectors of move-only data 
types for tests
 Key: ARROW-11998
 URL: https://issues.apache.org/jira/browse/ARROW-11998
 Project: Apache Arrow
  Issue Type: Task
  Components: C++
Reporter: Weston Pace
Assignee: Weston Pace


We take advantage of aggregate initialization in our tests in a number of 
places.


Also for consideration: 
https://stackoverflow.com/questions/8468774/can-i-list-initialize-a-vector-of-move-only-type





[jira] [Updated] (ARROW-11998) [C++] Make it easier to create vectors of move-only data types for tests

2021-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-11998:
---
Labels: pull-request-available  (was: )

> [C++] Make it easier to create vectors of move-only data types for tests
> 
>
> Key: ARROW-11998
> URL: https://issues.apache.org/jira/browse/ARROW-11998
> Project: Apache Arrow
>  Issue Type: Task
>  Components: C++
>Reporter: Weston Pace
>Assignee: Weston Pace
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We take advantage of aggregate initialization in our tests in a number of 
> places.
> Also for consideration: 
> https://stackoverflow.com/questions/8468774/can-i-list-initialize-a-vector-of-move-only-type





[jira] [Created] (ARROW-11999) [Java] Support parallel vector element search with user-specified comparator

2021-03-16 Thread Liya Fan (Jira)
Liya Fan created ARROW-11999:


 Summary: [Java] Support parallel vector element search with 
user-specified comparator
 Key: ARROW-11999
 URL: https://issues.apache.org/jira/browse/ARROW-11999
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Java
Reporter: Liya Fan
Assignee: Liya Fan


This is in response to the discussion in 
https://github.com/apache/arrow/pull/5631#discussion_r339110228

Currently, we only support parallel search with {{RangeEqualsVisitor}}, which 
does not support user-specified comparators.
This issue adds that functionality to support a wider range of use cases.
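
The general idea of a chunked parallel search driven by a caller-supplied comparator can be sketched as follows. This is a Python analogue for illustration, not the Arrow Java API; `parallel_search`, its parameters, and the convention that the comparator returns 0 for a match are all assumptions of this example.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_search(values, key, comparator, num_threads=4):
    """Return the smallest index i with comparator(values[i], key) == 0, or -1."""
    n = len(values)
    if n == 0:
        return -1
    num_threads = max(1, num_threads)
    # Ceiling division so every element falls into exactly one chunk.
    chunk = (n + num_threads - 1) // num_threads

    def search_range(start):
        # Scan one contiguous chunk and report its first match, if any.
        for i in range(start, min(start + chunk, n)):
            if comparator(values[i], key) == 0:
                return i
        return -1

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = list(pool.map(search_range, range(0, n, chunk)))

    hits = [r for r in results if r >= 0]
    return min(hits) if hits else -1
```

Because the comparator is a parameter, the same search can match on plain equality or on something looser, for example a case-insensitive string comparison.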




