[jira] [Updated] (ARROW-13125) [R] Throw error when 2+ args passed to desc() in arrange()
[ https://issues.apache.org/jira/browse/ARROW-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-13125: --- Labels: pull-request-available (was: ) > [R] Throw error when 2+ args passed to desc() in arrange() > -- > > Key: ARROW-13125 > URL: https://issues.apache.org/jira/browse/ARROW-13125 > Project: Apache Arrow > Issue Type: Bug > Components: R >Affects Versions: 4.0.1 >Reporter: Ian Cook >Assignee: Ian Cook >Priority: Minor > Labels: pull-request-available > Fix For: 5.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Currently this does not result in an error, but it should: > {code:r}Table$create(x = 1:3, y = 4:6) %>% arrange(desc(x, y)){code} > The same problem affects dplyr on R data frames. I opened > https://github.com/tidyverse/dplyr/issues/5921 for that. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13125) [R] Throw error when 2+ args passed to desc() in arrange()
Ian Cook created ARROW-13125: Summary: [R] Throw error when 2+ args passed to desc() in arrange() Key: ARROW-13125 URL: https://issues.apache.org/jira/browse/ARROW-13125 Project: Apache Arrow Issue Type: Bug Components: R Affects Versions: 4.0.1 Reporter: Ian Cook Assignee: Ian Cook Fix For: 5.0.0 Currently this does not result in an error, but it should: {code:r}Table$create(x = 1:3, y = 4:6) %>% arrange(desc(x, y)){code} The same problem affects dplyr on R data frames. I opened https://github.com/tidyverse/dplyr/issues/5921 for that. -- This message was sent by Atlassian Jira (v8.3.4#803005)
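The stricter behavior requested above can be sketched outside of R; a minimal Python analogue follows (the `desc` helper below is hypothetical, for illustration only, and is not the arrow R binding):

```python
def desc(*args):
    """Hypothetical sketch of a strict desc(): accept exactly one sort key.

    Mirrors the fix proposed in ARROW-13125: desc(x, y) should raise an
    error rather than silently ignoring the extra argument.
    """
    if len(args) != 1:
        raise TypeError(f"desc() expects exactly 1 argument, got {len(args)}")
    (key,) = args
    return ("descending", key)

print(desc("x"))   # a single key is accepted
try:
    desc("x", "y")  # extra keys raise instead of being dropped
except TypeError as exc:
    print(exc)
```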
[jira] [Updated] (ARROW-13124) [Ruby] Add support for memory view
[ https://issues.apache.org/jira/browse/ARROW-13124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-13124: --- Labels: pull-request-available (was: ) > [Ruby] Add support for memory view > -- > > Key: ARROW-13124 > URL: https://issues.apache.org/jira/browse/ARROW-13124 > Project: Apache Arrow > Issue Type: Improvement > Components: Ruby >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13124) [Ruby] Add support for memory view
Kouhei Sutou created ARROW-13124: Summary: [Ruby] Add support for memory view Key: ARROW-13124 URL: https://issues.apache.org/jira/browse/ARROW-13124 Project: Apache Arrow Issue Type: Improvement Components: Ruby Reporter: Kouhei Sutou Assignee: Kouhei Sutou -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13123) Cannot build PyArrow on python38 docker image
Alexandre Campino created ARROW-13123: - Summary: Cannot build PyArrow on python38 docker image Key: ARROW-13123 URL: https://issues.apache.org/jira/browse/ARROW-13123 Project: Apache Arrow Issue Type: Bug Reporter: Alexandre Campino Attachments: Dockerfile Hi all, I am trying to build pyarrow on a python38 docker image in order to build an aws-lambda layer. I had success doing this a couple of months back with the attached Dockerfile (Python 3.7), but now I get the error below whichever Python version I try. {code:java} FROM lambci/lambda:build-python3.7{code} {code:java} FROM lambci/lambda:build-python3.8{code} This leads me to think that something else has changed. {noformat} #8 6.648 Scanning dependencies of target boost_ep #8 6.657 [ 0%] Creating directories for 'boost_ep' #8 6.689 [ 1%] Performing download step (download, verify and extract) for 'boost_ep' #8 8.038 -- boost_ep download command succeeded. See also /arrow/cpp/build/boost_ep-prefix/src/boost_ep-stamp/boost_ep-download-*.log #8 8.049 [ 1%] No patch step for 'boost_ep' #8 8.060 [ 1%] No update step for 'boost_ep' #8 8.071 [ 2%] Performing configure step for 'boost_ep' #8 8.088 CMake Error at /arrow/cpp/build/boost_ep-prefix/src/boost_ep-stamp/boost_ep-configure-RELEASE.cmake:16 (message): #8 8.088 Command failed: 1 #8 8.088 #8 8.088 './bootstrap.sh' '--prefix=/arrow/cpp/build/boost_ep-prefix/src/boost_ep' '--with-libraries=filesystem,regex,system' #8 8.088 #8 8.088 See also #8 8.088 #8 8.088 /arrow/cpp/build/boost_ep-prefix/src/boost_ep-stamp/boost_ep-configure-*.log #8 8.088 #8 8.088 #8 8.089 make[2]: *** [boost_ep-prefix/src/boost_ep-stamp/boost_ep-configure] Error 1 #8 8.089 make[1]: *** [CMakeFiles/boost_ep.dir/all] Error 2 #8 8.090 make: *** [all] Error 2{noformat} {noformat} executor failed running [/bin/sh -c mkdir /arrow && curl -o /tmp/apache-arrow.tar.gz -SL https://github.com/apache/arrow/archive/apache-arrow-${ARROW_VERSION}.tar.gz && tar -xvf /tmp/apache-arrow.tar.gz -C /arrow 
--strip-components 1 && mkdir /arrow/dist && export LD_LIBRARY_PATH=/dist/lib:$LD_LIBRARY_PATH && mkdir -p /arrow/cpp/build && cd /arrow/cpp/build && cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE -DCMAKE_INSTALL_LIBDIR=lib -DCMAKE_INSTALL_PREFIX=$ARROW_HOME -DARROW_PARQUET=on -DARROW_PYTHON=on -DARROW_PLASMA=on -DARROW_WITH_SNAPPY=on -DARROW_BUILD_TESTS=OFF .. && make && make install]: exit code: 2 The terminal process "C:\WINDOWS\System32\cmd.exe /K C:\tools\cmder\vendor\bin\vscode_init.cmd /d /c docker build --pull --rm -f "Docker\pyarrow\Dockerfile" -t pyarrow37:lambci-lambda "Docker\pyarrow"" terminated with exit code: 1.{noformat} Note: the Dockerfile contains plenty of commented-out code. The output above corresponds to the build attempt using the uncommented code. Thank you, Alex -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13110) [C++] Deadlock can happen when using BackgroundGenerator without transferring callbacks
[ https://issues.apache.org/jira/browse/ARROW-13110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-13110: --- Labels: pull-request-available (was: ) > [C++] Deadlock can happen when using BackgroundGenerator without transferring > callbacks > --- > > Key: ARROW-13110 > URL: https://issues.apache.org/jira/browse/ARROW-13110 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Weston Pace >Assignee: Weston Pace >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13064) [C++] Add a general "if, ifelse, ..., else" kernel
[ https://issues.apache.org/jira/browse/ARROW-13064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-13064: --- Labels: pull-request-available (was: ) > [C++] Add a general "if, ifelse, ..., else" kernel > -- > > Key: ARROW-13064 > URL: https://issues.apache.org/jira/browse/ARROW-13064 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Ian Cook >Assignee: David Li >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > ARROW-10640 added a ternary {{if_else}} kernel. Add another kernel that > extends this concept to an arbitrary number of conditions and associated > results, like a vectorized {{if-ifelse-...-else}} with an arbitrary number of > {{ifelse}} and with the {{else}} optional. This is like a SQL {{CASE}} > statement. > How best to achieve this is not obvious. To enable SQL-style uses, it would > be most efficient to implement this as a variadic kernel where the > even-number arguments (0, 2, ...) are the arrays of boolean conditions, the > odd-number arguments (1, 3, ...) are the corresponding arrays of results, and > the final argument is the {{else}} result. But I'm not sure if this is > practical. Maybe instead we should implement this to operate on listarrays, > like NumPy's > {{[np.where|https://numpy.org/doc/stable/reference/generated/numpy.where.html]}} > or > {{[np.select|https://numpy.org/doc/stable/reference/generated/numpy.select.html]}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
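The variadic semantics proposed above (condition, result, condition, result, ..., optional else) can be sketched in plain Python over lists; this is only an illustration of the SQL-CASE behavior, not the eventual C++ kernel:

```python
def case_when(conditions, results, default=None):
    """Sketch of SQL CASE semantics over parallel lists.

    conditions: list of boolean lists, checked in order per row.
    results:    list of value lists, parallel to conditions.
    default:    the optional "else" value, used when no condition matches.
    """
    n = len(results[0])
    out = []
    for i in range(n):
        for cond, res in zip(conditions, results):
            if cond[i]:
                out.append(res[i])
                break
        else:  # no condition matched this row
            out.append(default)
    return out

# Like: CASE WHEN x < 0 THEN 'neg' WHEN x = 0 THEN 'zero' ELSE 'pos' END
x = [-2, 0, 3]
print(case_when(
    conditions=[[v < 0 for v in x], [v == 0 for v in x]],
    results=[["neg"] * 3, ["zero"] * 3],
    default="pos",
))  # ['neg', 'zero', 'pos']
```

Note that, like np.select, conditions are evaluated in order, so earlier branches shadow later ones.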
[jira] [Updated] (ARROW-11514) [R][C++] Bindings for paste(), paste0(), str_c()
[ https://issues.apache.org/jira/browse/ARROW-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11514: --- Labels: pull-request-available (was: ) > [R][C++] Bindings for paste(), paste0(), str_c() > > > Key: ARROW-11514 > URL: https://issues.apache.org/jira/browse/ARROW-11514 > Project: Apache Arrow > Issue Type: New Feature > Components: R >Reporter: Neal Richardson >Assignee: Ian Cook >Priority: Major > Labels: pull-request-available > Fix For: 5.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > * In {{paste()}} and {{paste0()}}, use the {{REPLACE}} null handling behavior > with replacement string {{"NA"}} (for consistency with base R) > * In {{str_c()}}, use the {{EMIT_NULL}} null handling behavior (for > consistency with stringr) -- This message was sent by Atlassian Jira (v8.3.4#803005)
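The two null-handling modes named above can be illustrated with a small pure-Python sketch (function names are ours, not the arrow bindings; None stands in for a null):

```python
def join_replace(values, sep=" ", replacement="NA"):
    """REPLACE null handling: each null becomes the string "NA",
    matching base R's paste()/paste0() behavior."""
    return sep.join(replacement if v is None else v for v in values)

def join_emit_null(values, sep=" "):
    """EMIT_NULL null handling: any null input makes the whole result
    null, matching stringr's str_c() behavior."""
    if any(v is None for v in values):
        return None
    return sep.join(values)

print(join_replace(["a", None, "c"]))    # a NA c
print(join_emit_null(["a", None, "c"]))  # None
```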
[jira] [Updated] (ARROW-13092) [C++] CreateDir should fail if the target exists and is not a directory
[ https://issues.apache.org/jira/browse/ARROW-13092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-13092: --- Labels: pull-request-available (was: ) > [C++] CreateDir should fail if the target exists and is not a directory > --- > > Key: ARROW-13092 > URL: https://issues.apache.org/jira/browse/ARROW-13092 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 5.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > As discussed in > https://github.com/apache/arrow/pull/10540#issuecomment-862284472 . -- This message was sent by Atlassian Jira (v8.3.4#803005)
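The desired contract can be demonstrated with the Python standard library (purely for illustration of the semantics, not Arrow's filesystem API): creating a directory where a plain file already exists should fail rather than silently succeed.

```python
import pathlib
import tempfile

def create_dir(path):
    """Sketch of the stricter CreateDir contract: succeed if the directory
    already exists, but fail if the target exists and is not a directory.
    pathlib already behaves this way: exist_ok=True tolerates an existing
    *directory* only; an existing plain file still raises FileExistsError."""
    pathlib.Path(path).mkdir(parents=True, exist_ok=True)

with tempfile.TemporaryDirectory() as tmp:
    collision = pathlib.Path(tmp) / "collision"
    collision.write_text("a plain file, not a directory")
    create_dir(pathlib.Path(tmp) / "subdir")  # fine: creates a directory
    try:
        create_dir(collision)  # target exists and is a file
    except FileExistsError:
        print("refused: target exists and is not a directory")
```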
[jira] [Updated] (ARROW-13034) [Python][Docs] Update outdated examples for hdfs/azure on the Parquet doc page
[ https://issues.apache.org/jira/browse/ARROW-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-13034: --- Labels: pull-request-available (was: ) > [Python][Docs] Update outdated examples for hdfs/azure on the Parquet doc page > -- > > Key: ARROW-13034 > URL: https://issues.apache.org/jira/browse/ARROW-13034 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Reporter: Joris Van den Bossche >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > From https://github.com/apache/arrow/issues/10492 > - The chapter "Writing to Partitioned Datasets" still presents a "solution" > using "hdfs.connect", but since that API is deprecated it is no longer a good > idea to mention it. > - The chapter "Reading a Parquet File from Azure Blob storage" is based on > an old version of the "azure.storage.blob" package; the current > "azure-sdk-for-python" no longer has methods like > get_blob_to_stream(). It should be possible to update this part with the new > blob storage APIs, and perhaps add another example covering the same concept > with Delta Lake (a similar principle, though with some differences). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13095) [C++] Implement trigonometric compute functions
[ https://issues.apache.org/jira/browse/ARROW-13095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-13095: --- Labels: beginner pull-request-available (was: beginner) > [C++] Implement trigonometric compute functions > --- > > Key: ARROW-13095 > URL: https://issues.apache.org/jira/browse/ARROW-13095 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: David Li >Assignee: David Li >Priority: Major > Labels: beginner, pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > sin, cos, asin, acos, tan, atan, cotan, atan2 -- This message was sent by Atlassian Jira (v8.3.4#803005)
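Of the kernels listed, all but cotan map directly onto standard math-library functions; cotan is typically computed as cos(x)/sin(x) (an assumption about the implementation, not a quote from it). A quick Python illustration of the expected identities, including the quadrant information that distinguishes atan2 from atan:

```python
import math

def cotan(x):
    # No direct counterpart in the C math library; derived from cos/sin.
    return math.cos(x) / math.sin(x)

x = math.pi / 4
assert math.isclose(math.sin(x) ** 2 + math.cos(x) ** 2, 1.0)
assert math.isclose(cotan(x), 1.0)

# atan2 keeps quadrant information that atan of a ratio loses:
assert math.isclose(math.atan2(-1.0, -1.0), -3 * math.pi / 4)
assert math.isclose(math.atan(-1.0 / -1.0), math.pi / 4)
```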
[jira] [Updated] (ARROW-13116) [R] Test for RecordBatchReader to C-interface fails on arrow-r-minimal due to missing dependencies
[ https://issues.apache.org/jira/browse/ARROW-13116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-13116: --- Labels: pull-request-available (was: ) > [R] Test for RecordBatchReader to C-interface fails on arrow-r-minimal due to > missing dependencies > -- > > Key: ARROW-13116 > URL: https://issues.apache.org/jira/browse/ARROW-13116 > Project: Apache Arrow > Issue Type: Bug > Components: R >Reporter: Nic Crane >Assignee: Nic Crane >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The test just needs updating with skip_if_not_available("dataset") -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13042) [C++] Automatic checks that kernels don't leave uninitialized data in output
[ https://issues.apache.org/jira/browse/ARROW-13042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-13042: --- Labels: pull-request-available (was: ) > [C++] Automatic checks that kernels don't leave uninitialized data in output > > > Key: ARROW-13042 > URL: https://issues.apache.org/jira/browse/ARROW-13042 > Project: Apache Arrow > Issue Type: Task > Components: C++ >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 5.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > To minimize the risk of issues such as ARROW-13041, perhaps our compute > kernel test harness should include a check that allocated data is always > initialized (using Valgrind). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13097) [C++] Provide a simple reflection utility for {{struct}}s
[ https://issues.apache.org/jira/browse/ARROW-13097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-13097: --- Labels: pull-request-available (was: ) > [C++] Provide a simple reflection utility for {{struct}}s > - > > Key: ARROW-13097 > URL: https://issues.apache.org/jira/browse/ARROW-13097 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Ben Kietzman >Assignee: Ben Kietzman >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In cases such as ARROW-13025 it's advantageous to avoid boilerplate when > dealing with objects which are basic structs of data members. A simple > reflection utility (get/set the value of a data member, print the name of a > member to string) would allow writing functionality generically in terms of a > tuple of properties, greatly reducing boilerplate. > See a sketch of one such utility here > https://gist.github.com/bkietz/7899f477e86df49f21ab17201c518d74 -- This message was sent by Atlassian Jira (v8.3.4#803005)
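The idea of driving get/set/printing from a tuple of properties can be sketched in Python (names below are illustrative, not taken from the linked gist): once a struct declares its property tuple, printing and equality need no per-field code.

```python
class Property:
    """One reflected data member: knows its name and how to get/set it."""
    def __init__(self, name):
        self.name = name
    def get(self, obj):
        return getattr(obj, self.name)
    def set(self, obj, value):
        setattr(obj, self.name, value)

class CsvOptions:
    # The tuple of properties is the single source of truth for reflection.
    properties = (Property("delimiter"), Property("skip_rows"))

    def __init__(self, delimiter=",", skip_rows=0):
        self.delimiter = delimiter
        self.skip_rows = skip_rows

    def __repr__(self):
        # Generic printing driven by the property tuple: no per-field code.
        fields = ", ".join(f"{p.name}={p.get(self)!r}" for p in self.properties)
        return f"CsvOptions({fields})"

    def __eq__(self, other):
        # Generic equality, likewise derived from the tuple.
        return all(p.get(self) == p.get(other) for p in self.properties)

a, b = CsvOptions(), CsvOptions()
print(a)       # CsvOptions(delimiter=',', skip_rows=0)
print(a == b)  # True
```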
[jira] [Resolved] (ARROW-12074) [C++][Compute] Add scalar arithmetic kernels for decimal inputs
[ https://issues.apache.org/jira/browse/ARROW-12074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Kietzman resolved ARROW-12074. -- Resolution: Fixed Issue resolved by pull request 10364 [https://github.com/apache/arrow/pull/10364] > [C++][Compute] Add scalar arithmetic kernels for decimal inputs > --- > > Key: ARROW-12074 > URL: https://issues.apache.org/jira/browse/ARROW-12074 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Antoine Pitrou >Assignee: Yibo Cai >Priority: Major > Labels: pull-request-available > Fix For: 5.0.0 > > Time Spent: 6h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13122) [C++][Compute] Dispatch* should examine options as well as input types
Ben Kietzman created ARROW-13122: Summary: [C++][Compute] Dispatch* should examine options as well as input types Key: ARROW-13122 URL: https://issues.apache.org/jira/browse/ARROW-13122 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Ben Kietzman {{Function::Dispatch*}} should have access to options as well as argument types. This will allow kernel authors to write kernels which are specific to certain configurations of options. Otherwise we may be leaving performance on the table when, for example, a function's output *could* be contiguously preallocated, but only for the default FunctionOptions. Currently the author would have no choice but to choose the lowest-common-denominator flags for the kernel. In another vein, "cast" is currently a MetaFunction instead of a ScalarFunction since it derives its output type from CastOptions. This requires a special case in Expressions since Expressions can only represent calls to scalar functions. Ideally a function which is semantically scalar like "cast" wouldn't need to resort to using a MetaFunction for dispatch. See also: https://github.com/apache/arrow/pull/10547#discussion_r654573800 -- This message was sent by Atlassian Jira (v8.3.4#803005)
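Dispatch keyed on both input types and options can be sketched in a few lines of Python (illustrative only; this is not the C++ Dispatch API). Each registered kernel carries a predicate over the options, and dispatch picks the first kernel whose predicate matches, so a specialized kernel can shadow a generic fallback:

```python
def make_dispatcher():
    """Toy function registry: kernels are keyed on input type, and each
    carries a predicate over the options used at dispatch time."""
    kernels = {}

    def register(in_type, options_pred, kernel):
        kernels.setdefault(in_type, []).append((options_pred, kernel))

    def dispatch(in_type, options):
        for pred, kernel in kernels.get(in_type, []):
            if pred(options):
                return kernel
        raise LookupError(f"no kernel for {in_type} with {options}")

    return register, dispatch

register, dispatch = make_dispatcher()
# A fast preallocating kernel is only valid for the default options...
register("int64", lambda opts: opts == {}, "preallocating-kernel")
# ...with a generic fallback for every other configuration.
register("int64", lambda opts: True, "generic-kernel")

print(dispatch("int64", {}))                 # preallocating-kernel
print(dispatch("int64", {"round": "down"}))  # generic-kernel
```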
[jira] [Commented] (ARROW-13121) [C++][Compute] Extract preallocation logic from KernelExecutor
[ https://issues.apache.org/jira/browse/ARROW-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17365633#comment-17365633 ] Antoine Pitrou commented on ARROW-13121: I'd also mention that when I worked on ARROW-13042, I spent a lot of time trying to figure out which code paths exactly got executed in {{exec.cc}} and I never fully figured it out (one particular case was implicit casting and broadcasting with a NullScalar LHS and a ChunkedArray RHS on a scalar kernel). I ended up trying to find clues in other places instead. > [C++][Compute] Extract preallocation logic from KernelExecutor > -- > > Key: ARROW-13121 > URL: https://issues.apache.org/jira/browse/ARROW-13121 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Ben Kietzman >Priority: Major > > Currently KernelExecutor handles preallocation of null bitmaps and other > buffers based on simple flags on each Kernel. This is not very flexible and > we end up leaving a lot of performance on the table in cases where we can > preallocate but the behavior can't be captured in the available flags. For > example, in the case of {{binary_string_join_element_wise}}, it would be > possible to preallocate all buffers (even the character buffer) and write > output into slices. > Having this as a public function would enable us to unit test it directly > (currently Executors are only tested indirectly through calling of > compute::Functions) and reuse it, for example to correctly preallocate a > small temporary for pipelined execution. > One way this could be added is as a new method on each Kernel: > {code} > // Output preallocated Datums sufficient for execution of the kernel on each > ExecBatch. > // The output Datums may not be identically chunked to the input batches, for > example > // kernels which support contiguous output preallocation will preallocate a > single Datum > // (and can then output into slices of that Datum). 
> Result<std::vector<Datum>> Kernel::prepare_output( > const Kernel*, > KernelContext*, > const std::vector<ExecBatch>& inputs) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13121) [C++][Compute] Extract preallocation logic from KernelExecutor
[ https://issues.apache.org/jira/browse/ARROW-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Kietzman updated ARROW-13121: - Summary: [C++][Compute] Extract preallocation logic from KernelExecutor (was: [C++][Compute] Extract preallocation logic to a public function) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13121) [C++][Compute] Extract preallocation logic to a public function
[ https://issues.apache.org/jira/browse/ARROW-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Kietzman updated ARROW-13121: - Summary: [C++][Compute] Extract preallocation logic to a public function (was: [C++][Compute] Extract preallocation logic to a method of kernels) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-13121) [C++][Compute] Extract preallocation logic to a method of kernels
[ https://issues.apache.org/jira/browse/ARROW-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17365627#comment-17365627 ] Ben Kietzman commented on ARROW-13121: -- [~wesm] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13121) [C++][Compute] Extract preallocation logic to a method of kernels
Ben Kietzman created ARROW-13121: Summary: [C++][Compute] Extract preallocation logic to a method of kernels Key: ARROW-13121 URL: https://issues.apache.org/jira/browse/ARROW-13121 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Ben Kietzman Currently KernelExecutor handles preallocation of null bitmaps and other buffers based on simple flags on each Kernel. This is not very flexible and we end up leaving a lot of performance on the table in cases where we can preallocate but the behavior can't be captured in the available flags. For example, in the case of {{binary_string_join_element_wise}}, it would be possible to preallocate all buffers (even the character buffer) and write output into slices. Having this as a public function would enable us to unit test it directly (currently Executors are only tested indirectly through calling of compute::Functions) and reuse it, for example to correctly preallocate a small temporary for pipelined execution. One way this could be added is as a new method on each Kernel: {code} // Output preallocated Datums sufficient for execution of the kernel on each ExecBatch. // The output Datums may not be identically chunked to the input batches, for example // kernels which support contiguous output preallocation will preallocate a single Datum // (and can then output into slices of that Datum). Result<std::vector<Datum>> Kernel::prepare_output( const Kernel*, KernelContext*, const std::vector<ExecBatch>& inputs) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13120) [Rust][Parquet] Cannot read multiple batches from parquet with string list column
[ https://issues.apache.org/jira/browse/ARROW-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Morgan Cassels updated ARROW-13120: --- Description: This issue only occurs when the batch size < the number of rows in the table. The attached parquet `test.parquet` has 31430 rows and a single column containing string lists. This issue does not appear to occur for parquets with integer list columns. {code:java} #[test] fn failing_test() { let parquet_file_reader = get_test_reader("test.parquet"); let mut arrow_reader = ParquetFileArrowReader::new(parquet_file_reader); let mut record_batches = Vec::new(); let record_batch_reader = arrow_reader.get_record_reader(1024).unwrap(); for batch in record_batch_reader { record_batches.push(batch); } } {code} {code:java} arrow::arrow_reader::tests::failing_test stdout thread 'arrow::arrow_reader::tests::failing_test' panicked at 'Expected infallable creation of GenericListArray from ArrayDataRef failed: InvalidArgumentError("offsets do not start at zero")', arrow/src/array/array_list.rs:195:45 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace {code} was: This issue only occurs when the batch size < the number of rows in the table. The attached parquet `test.parquet` has 31430 rows and a single column containing string lists. This issue does not appear to occur for parquets with integer list columns. 
{code:java} #[test] fn failing_test() { let parquet_file_reader = get_test_reader("test.parquet"); let mut arrow_reader = ParquetFileArrowReader::new(parquet_file_reader); let mut record_batches = Vec::new(); let record_batch_reader = arrow_reader.get_record_reader(1024).unwrap(); for batch in record_batch_reader { record_batches.push(batch); } } {code} {code:java} arrow::arrow_reader::tests::failing_test stdout thread 'arrow::arrow_reader::tests::failing_test' panicked at 'Expected infallable creation of GenericListArray from ArrayDataRef failed: InvalidArgumentError("offsets do not start at zero")', arrow/src/array/array_list.rs:195:45 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace {code} > [Rust][Parquet] Cannot read multiple batches from parquet with string list > column > - > > Key: ARROW-13120 > URL: https://issues.apache.org/jira/browse/ARROW-13120 > Project: Apache Arrow > Issue Type: Bug >Reporter: Morgan Cassels >Priority: Major > Attachments: test.parquet > > > This issue only occurs when the batch size < the number of rows in the table. > The attached parquet `test.parquet` has 31430 rows and a single column > containing string lists. This issue does not appear to occur for parquets > with integer list columns. 
> > {code:java} > #[test] > fn failing_test() { > let parquet_file_reader = get_test_reader("test.parquet"); > let mut arrow_reader = ParquetFileArrowReader::new(parquet_file_reader); > let mut record_batches = Vec::new(); > let record_batch_reader = arrow_reader.get_record_reader(1024).unwrap(); > for batch in record_batch_reader { >record_batches.push(batch); > } > } > {code} > > {code:java} > arrow::arrow_reader::tests::failing_test stdout > thread 'arrow::arrow_reader::tests::failing_test' panicked at 'Expected > infallable creation of GenericListArray from ArrayDataRef failed: > InvalidArgumentError("offsets do not start at zero")', > arrow/src/array/array_list.rs:195:45 > note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13120) [Rust][Parquet] Cannot read multiple batches from parquet with string list column
[ https://issues.apache.org/jira/browse/ARROW-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Morgan Cassels updated ARROW-13120: --- Description: This issue only occurs when the batch size < the number of rows in the table. The attached parquet `test.parquet` has 31430 rows and a single column containing string lists. This issue does not appear to occur for parquets with integer list columns. {code:java} #[test] fn failing_test() { let parquet_file_reader = get_test_reader("test.parquet"); let mut arrow_reader = ParquetFileArrowReader::new(parquet_file_reader); let mut record_batches = Vec::new(); let record_batch_reader = arrow_reader.get_record_reader(1024).unwrap(); for batch in record_batch_reader { record_batches.push(batch); } } {code} {code:java} arrow::arrow_reader::tests::failing_test stdout thread 'arrow::arrow_reader::tests::failing_test' panicked at 'Expected infallable creation of GenericListArray from ArrayDataRef failed: InvalidArgumentError("offsets do not start at zero")', arrow/src/array/array_list.rs:195:45 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace {code} was: This issue only occurs when the batch size < the number of rows in the table. The attached parquet `test.parquet` has 31430 rows and a single column containing string lists. This issue does not appear to occur for parquets with integer list columns. 
``` #[test] fn failing_test() { let parquet_file_reader = get_test_reader("test.parquet"); let mut arrow_reader = ParquetFileArrowReader::new(parquet_file_reader); let mut record_batches = Vec::new(); let record_batch_reader = arrow_reader.get_record_reader(1024).unwrap(); for batch in record_batch_reader { record_batches.push(batch); } } ``` ``` arrow::arrow_reader::tests::failing_test stdout thread 'arrow::arrow_reader::tests::failing_test' panicked at 'Expected infallable creation of GenericListArray from ArrayDataRef failed: InvalidArgumentError("offsets do not start at zero")', arrow/src/array/array_list.rs:195:45 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace ``` > [Rust][Parquet] Cannot read multiple batches from parquet with string list > column > - > > Key: ARROW-13120 > URL: https://issues.apache.org/jira/browse/ARROW-13120 > Project: Apache Arrow > Issue Type: Bug >Reporter: Morgan Cassels >Priority: Major > Attachments: test.parquet > > > This issue only occurs when the batch size < the number of rows in the table. > The attached parquet `test.parquet` has 31430 rows and a single column > containing string lists. This issue does not appear to occur for parquets > with integer list columns. 
> > > {code:java} > #[test] > fn failing_test() { > let parquet_file_reader = get_test_reader("test.parquet"); > let mut arrow_reader = ParquetFileArrowReader::new(parquet_file_reader); > let mut record_batches = Vec::new(); > let record_batch_reader = arrow_reader.get_record_reader(1024).unwrap(); > for batch in record_batch_reader { >record_batches.push(batch); > } > } > {code} > > {code:java} > arrow::arrow_reader::tests::failing_test stdout > thread 'arrow::arrow_reader::tests::failing_test' panicked at 'Expected > infallable creation of GenericListArray from ArrayDataRef failed: > InvalidArgumentError("offsets do not start at zero")', > arrow/src/array/array_list.rs:195:45 > note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13120) [Rust][Parquet] Cannot read multiple batches from parquet with string list column
Morgan Cassels created ARROW-13120: -- Summary: [Rust][Parquet] Cannot read multiple batches from parquet with string list column Key: ARROW-13120 URL: https://issues.apache.org/jira/browse/ARROW-13120 Project: Apache Arrow Issue Type: Bug Reporter: Morgan Cassels Attachments: test.parquet This issue only occurs when the batch size < the number of rows in the table. The attached parquet `test.parquet` has 31430 rows and a single column containing string lists. This issue does not appear to occur for parquets with integer list columns. ``` #[test] fn failing_test() { let parquet_file_reader = get_test_reader("test.parquet"); let mut arrow_reader = ParquetFileArrowReader::new(parquet_file_reader); let mut record_batches = Vec::new(); let record_batch_reader = arrow_reader.get_record_reader(1024).unwrap(); for batch in record_batch_reader { record_batches.push(batch); } } ``` ``` arrow::arrow_reader::tests::failing_test stdout thread 'arrow::arrow_reader::tests::failing_test' panicked at 'Expected infallable creation of GenericListArray from ArrayDataRef failed: InvalidArgumentError("offsets do not start at zero")', arrow/src/array/array_list.rs:195:45 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace ``` -- This message was sent by Atlassian Jira (v8.3.4#803005)
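The panic quoted above fires because a list array was rebuilt from a slice of an offsets buffer without re-basing it to zero. A minimal pure-Python sketch of the invariant (a toy model with a hypothetical `slice_list_array` helper, not the arrow-rs implementation):

```python
# Toy model of an Arrow list array: an offsets buffer indexing into a
# flat values buffer. Slicing a middle batch of rows without re-basing
# the offsets leaves them starting at a nonzero value -- the condition
# the "offsets do not start at zero" panic above checks for.

def slice_list_array(offsets, values, start, length):
    """Slice rows [start, start+length) and re-base offsets to zero."""
    sub = offsets[start:start + length + 1]
    base = sub[0]
    rebased = [o - base for o in sub]          # offsets must start at 0
    return rebased, values[base:sub[-1]]

offsets = [0, 2, 2, 5, 7]                      # 4 list rows
values = ["a", "b", "c", "d", "e", "f", "g"]

# A naive slice of rows 2..4 keeps the absolute offsets [2, 5, 7]:
naive = offsets[2:5]
assert naive[0] != 0                           # invalid as a new array

# Re-basing restores the invariant:
rebased, vals = slice_list_array(offsets, values, 2, 2)
assert rebased == [0, 3, 5]
assert vals == ["c", "d", "e", "f", "g"]
```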
[jira] [Updated] (ARROW-11514) [R][C++] Bindings for paste(), paste0(), str_c()
[ https://issues.apache.org/jira/browse/ARROW-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Cook updated ARROW-11514: - Summary: [R][C++] Bindings for paste(), paste0(), str_c() (was: [R] Bindings for paste(), paste0(), str_c()) > [R][C++] Bindings for paste(), paste0(), str_c() > > > Key: ARROW-11514 > URL: https://issues.apache.org/jira/browse/ARROW-11514 > Project: Apache Arrow > Issue Type: New Feature > Components: R >Reporter: Neal Richardson >Assignee: Ian Cook >Priority: Major > Fix For: 5.0.0 > > > * In {{paste()}} and {{paste0()}}, use the {{REPLACE}} null handling behavior > with replacement string {{"NA"}} (for consistency with base R) > * In {{str_c()}}, use the {{EMIT_NULL}} null handling behavior (for > consistency with stringr) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13119) [R] Set empty schema in scalar Expressions
Ian Cook created ARROW-13119: Summary: [R] Set empty schema in scalar Expressions Key: ARROW-13119 URL: https://issues.apache.org/jira/browse/ARROW-13119 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Ian Cook Assignee: Ian Cook Fix For: 5.0.0 Closely related to ARROW-13117 is the problem of {{type()}} and {{type_id()}} not working for scalar expressions. For example, currently this happens: {code:r}> Expression$scalar("foo")$type() Error: !is.null(schema) is not TRUE > Expression$scalar(42L)$type() Error: !is.null(schema) is not TRUE{code} This is what we want to happen: {code:r}> Expression$scalar("foo")$type() Utf8 string > Expression$scalar(42L)$type() Int32 int32{code} This is simple to solve; we just need to set {{schema}} to an empty schema for all scalar expressions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13119) [R] Set empty schema in scalar Expressions
[ https://issues.apache.org/jira/browse/ARROW-13119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Cook updated ARROW-13119: - Description: Closely related to ARROW-13117 is the problem of {{type()}} and {{type_id()}} not working for scalar expressions. For example, currently this happens: {code:r}> Expression$scalar("foo")$type() Error: !is.null(schema) is not TRUE > Expression$scalar(42L)$type() Error: !is.null(schema) is not TRUE{code} This is what we want to happen: {code:r}> Expression$scalar("foo")$type() Utf8 string > Expression$scalar(42L)$type() Int32 int32{code} This is simple to solve; we just need to set {{schema}} to an empty schema for all scalar expressions. was: Closely related to ARROW-13117 is the problem of {{type()}} and {{type_id()}} not working for scalar expressions. For example, currently this happens: {code:r}> Expression$scalar("foo")$type() Error: !is.null(schema) is not TRUE > Expression$scalar(42L)$type() Error: !is.null(schema) is not TRUE{code} This is what we want to happen: {code:r}> Expression$scalar("foo")$type() Utf8 string > Expression$scalar(42L)$type() Int32 int32{code} This is simple to solve; we just need to set {{schema}} to an empty schema for all scalar expressions. > [R] Set empty schema in scalar Expressions > -- > > Key: ARROW-13119 > URL: https://issues.apache.org/jira/browse/ARROW-13119 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Ian Cook >Assignee: Ian Cook >Priority: Major > Fix For: 5.0.0 > > > Closely related to ARROW-13117 is the problem of {{type()}} and {{type_id()}} > not working for scalar expressions. 
For example, currently this happens: > {code:r}> Expression$scalar("foo")$type() > Error: !is.null(schema) is not TRUE > > Expression$scalar(42L)$type() > Error: !is.null(schema) is not TRUE{code} > This is what we want to happen: > {code:r}> Expression$scalar("foo")$type() > Utf8 > string > > Expression$scalar(42L)$type() > Int32 > int32{code} > This is simple to solve; we just need to set {{schema}} to an empty schema > for all scalar expressions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-13118) [R] Improve handling of R scalars in some nse_funcs
[ https://issues.apache.org/jira/browse/ARROW-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17365552#comment-17365552 ] Ian Cook commented on ARROW-13118: -- [~npr] what do you think? Does the solution described above (creating a {{wrap_r_scalar}} function) seem reasonable? > [R] Improve handling of R scalars in some nse_funcs > --- > > Key: ARROW-13118 > URL: https://issues.apache.org/jira/browse/ARROW-13118 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Ian Cook >Assignee: Ian Cook >Priority: Major > Fix For: 5.0.0 > > > Some of the functions in {{nse_funcs}} do not behave properly when passed R > scalar input in expressions in dplyr verbs. Some examples: > {code:r} > Table$create(x = 1) %>% mutate(as.character(42)) > Table$create(x = 1) %>% mutate(is.character("foo")) > Table$create(x = 1) %>% mutate(nchar("foo")) > Table$create(x = 1) %>% mutate(is.infinite(Inf)) > {code} > This could be resolved by using {{build_expr()}} instead of > {{Expression$create()}}, but {{build_expr()}} is awfully heavy. The only part > of it we really need to make this work is this: > {code:r} > args <- lapply(args, function(x) { > if (!inherits(x, "Expression")) { > x <- Expression$scalar(x) > } > x > }){code} > Maybe we could make a function called {{wrap_r_scalar}}, like this: > {code:r} > wrap_r_scalar <- function(x) { > if (!inherits(x, "Expression")) { > assert_that( > length(x) == 1, > msg = "Literal vectors of length != 1 not supported" > ) > Expression$scalar(x) > } else { > x > } > } > {code} > and use it as needed in the various {{nse_funcs}} functions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13118) [R] Improve handling of R scalars in some nse_funcs
Ian Cook created ARROW-13118: Summary: [R] Improve handling of R scalars in some nse_funcs Key: ARROW-13118 URL: https://issues.apache.org/jira/browse/ARROW-13118 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Ian Cook Assignee: Ian Cook Fix For: 5.0.0 Some of the functions in {{nse_funcs}} do not behave properly when passed R scalar input in expressions in dplyr verbs. Some examples: {code:r} Table$create(x = 1) %>% mutate(as.character(42)) Table$create(x = 1) %>% mutate(is.character("foo")) Table$create(x = 1) %>% mutate(nchar("foo")) Table$create(x = 1) %>% mutate(is.infinite(Inf)) {code} This could be resolved by using {{build_expr()}} instead of {{Expression$create()}}, but {{build_expr()}} is awfully heavy. The only part of it we really need to make this work is this: {code:r} args <- lapply(args, function(x) { if (!inherits(x, "Expression")) { x <- Expression$scalar(x) } x }){code} Maybe we could make a function called {{wrap_r_scalar}}, like this: {code:r} wrap_r_scalar <- function(x) { if (!inherits(x, "Expression")) { assert_that( length(x) == 1, msg = "Literal vectors of length != 1 not supported" ) Expression$scalar(x) } else { x } } {code} and use it as needed in the various {{nse_funcs}} functions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13117) [R] Retain schema in new Expressions
Ian Cook created ARROW-13117: Summary: [R] Retain schema in new Expressions Key: ARROW-13117 URL: https://issues.apache.org/jira/browse/ARROW-13117 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Ian Cook Assignee: Ian Cook Fix For: 5.0.0 When a new Expression is created, {{schema}} should be retained from the expression(s) it was created from. That way, the {{type()}} and {{type_id()}} methods of the new Expression will work. For example, currently this happens: {code:r} > x <- Expression$field_ref("x") > x$schema <- Schema$create(x = int32()) > > y <- Expression$field_ref("y") > y$schema <- Schema$create(y = int32()) > > Expression$create("add_checked", x, y)$type() Error: !is.null(schema) is not TRUE {code} This is what we want to happen: {code:r} > Expression$create("add_checked", x, y)$type() Int32 int32 {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
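The mechanics of "retain the schema when composing expressions" can be sketched with a toy Python model (hypothetical `Expr`/`add_checked` names, not the arrow R internals): a derived expression keeps the merged schemas of its operands so its type can still be resolved.

```python
# Toy sketch of schema propagation in composed expressions. Without
# carrying the operands' schemas forward, the derived expression cannot
# resolve its type -- the "!is.null(schema) is not TRUE" error above.

class Expr:
    def __init__(self, name, schema=None):
        self.name = name
        self.schema = schema or {}   # field name -> type name

    def type(self):
        if not self.schema:
            raise ValueError("!is.null(schema) is not TRUE")
        return self.schema[self.name]

def add_checked(x, y):
    # Retain the union of both operands' schemas in the new expression;
    # the result type follows the first operand in this simplified model.
    merged = {**x.schema, **y.schema}
    return Expr(x.name, merged)

x = Expr("x", {"x": "int32"})
y = Expr("y", {"y": "int32"})
assert add_checked(x, y).type() == "int32"   # resolvable, no error
```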
[jira] [Assigned] (ARROW-13097) [C++] Provide a simple reflection utility for {{struct}}s
[ https://issues.apache.org/jira/browse/ARROW-13097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Kietzman reassigned ARROW-13097: Assignee: Ben Kietzman > [C++] Provide a simple reflection utility for {{struct}}s > - > > Key: ARROW-13097 > URL: https://issues.apache.org/jira/browse/ARROW-13097 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Ben Kietzman >Assignee: Ben Kietzman >Priority: Major > > In cases such as ARROW-13025 it's advantageous to avoid boilerplate when > dealing with objects which are basic structs of data members. A simple > reflection utility (get/set the value of a data member, print the name of a > member to string) would allow writing functionality generically in terms of a > tuple of properties, greatly reducing boilerplate. > See a sketch of one such utility here > https://gist.github.com/bkietz/7899f477e86df49f21ab17201c518d74 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13116) [R] Test for RecordBatchReader to C-interface fails on arrow-r-minimal due to missing dependencies
Nic Crane created ARROW-13116: - Summary: [R] Test for RecordBatchReader to C-interface fails on arrow-r-minimal due to missing dependencies Key: ARROW-13116 URL: https://issues.apache.org/jira/browse/ARROW-13116 Project: Apache Arrow Issue Type: Bug Components: R Reporter: Nic Crane Assignee: Nic Crane The test just needs updating with skip_if_not_available("dataset") -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-13115) plasma.PlasmaClient does not disconnect when user tries to delete it
[ https://issues.apache.org/jira/browse/ARROW-13115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuxian Meng updated ARROW-13115: Description: {code:python} import pyarrow.plasma as plasma for _ in range(1): c = plasma.connect("/tmp/plasma") del c {code} The above code does not call c.disconnect() automatically, and will cause a `Connection to IPC socket failed` error. was: {code:python} import pyarrow.plasma as plasma c = plasma.connect("/tmp.plasma") del c {code} > plasma.PlasmaClient does not disconnect when user tries to delete it > -- > > Key: ARROW-13115 > URL: https://issues.apache.org/jira/browse/ARROW-13115 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Affects Versions: 4.0.0 >Reporter: Yuxian Meng >Priority: Critical > > {code:python} > import pyarrow.plasma as plasma > for _ in range(1): > c = plasma.connect("/tmp/plasma") > del c > {code} > The above code does not call c.disconnect() automatically, and will > cause a `Connection to IPC socket failed` error. -- This message was sent by Atlassian Jira (v8.3.4#803005)
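Relying on `del` to trigger cleanup is fragile in Python: `__del__` may run late, or not at all, if other references or reference cycles keep the object alive. The usual fix is an explicit close call or a context manager. A minimal stdlib sketch (with a hypothetical `FakeClient` standing in for a real plasma client):

```python
# Explicit cleanup via a context manager is reliable; `del c` only drops
# one reference and does not guarantee the destructor runs promptly.

from contextlib import contextmanager

class FakeClient:
    """Stand-in for a connection object; not the pyarrow.plasma client."""
    def __init__(self):
        self.connected = True

    def disconnect(self):
        self.connected = False

@contextmanager
def connect(path):
    client = FakeClient()
    try:
        yield client
    finally:
        client.disconnect()   # always runs, unlike __del__

with connect("/tmp/plasma") as c:
    assert c.connected        # usable inside the block
assert not c.connected        # deterministically closed on exit
```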
[jira] [Created] (ARROW-13115) plasma.PlasmaClient does not disconnect when user tries to delete it
Yuxian Meng created ARROW-13115: --- Summary: plasma.PlasmaClient does not disconnect when user tries to delete it Key: ARROW-13115 URL: https://issues.apache.org/jira/browse/ARROW-13115 Project: Apache Arrow Issue Type: Bug Components: Python Affects Versions: 4.0.0 Reporter: Yuxian Meng {code:python} import pyarrow.plasma as plasma c = plasma.connect("/tmp.plasma") del c {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13114) [R] use altrep when possible for (RecordBatch, Table) -> data.frame
Romain Francois created ARROW-13114: --- Summary: [R] use altrep when possible for (RecordBatch, Table) -> data.frame Key: ARROW-13114 URL: https://issues.apache.org/jira/browse/ARROW-13114 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Romain Francois Assignee: Romain Francois -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13113) [R] use RTasks to manage parallel in converting arrow to R
Romain Francois created ARROW-13113: --- Summary: [R] use RTasks to manage parallel in converting arrow to R Key: ARROW-13113 URL: https://issues.apache.org/jira/browse/ARROW-13113 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Romain Francois Assignee: Romain Francois -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13112) [R] altrep vectors for strings
Romain Francois created ARROW-13112: --- Summary: [R] altrep vectors for strings Key: ARROW-13112 URL: https://issues.apache.org/jira/browse/ARROW-13112 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Romain Francois Assignee: Romain Francois -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-13111) [R] altrep vectors for ChunkedArray
Romain Francois created ARROW-13111: --- Summary: [R] altrep vectors for ChunkedArray Key: ARROW-13111 URL: https://issues.apache.org/jira/browse/ARROW-13111 Project: Apache Arrow Issue Type: Bug Components: R Reporter: Romain Francois Assignee: Romain Francois -- This message was sent by Atlassian Jira (v8.3.4#803005)