[GitHub] [arrow] tianchen92 commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

2020-09-07 Thread GitBox
tianchen92 commented on pull request #7887: URL: https://github.com/apache/arrow/pull/7887#issuecomment-688620320 Hi, @emkornfield , please also take a look at this patch when you have time. This patch adds AppendEmptyValue/AppendEmptyValues used in StructBuilder#AppendNull/AppendNulls, w

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-09-07 Thread GitBox
jorgecarleitao commented on a change in pull request #7967: URL: https://github.com/apache/arrow/pull/7967#discussion_r484642205 ## File path: rust/datafusion/src/execution/dataframe_impl.rs ## @@ -232,6 +241,50 @@ mod tests { Ok(()) } +#[test] +fn regis

[GitHub] [arrow] tianchen92 commented on pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

2020-09-07 Thread GitBox
tianchen92 commented on pull request #7887: URL: https://github.com/apache/arrow/pull/7887#issuecomment-688588498 > As for naming, according to various dictionaries, "empty" as a noun (and "empties") is rather informal. > It seems `AppendEmptyValue` and `AppendEmptyValues` may be more a

[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

2020-09-07 Thread GitBox
tianchen92 commented on a change in pull request #7887: URL: https://github.com/apache/arrow/pull/7887#discussion_r484622055 ## File path: cpp/src/arrow/buffer_builder.h ## @@ -292,6 +292,11 @@ class TypedBufferBuilder { return Status::OK(); } + void Forward(int64_t

[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

2020-09-07 Thread GitBox
tianchen92 commented on a change in pull request #7887: URL: https://github.com/apache/arrow/pull/7887#discussion_r484621904 ## File path: cpp/src/arrow/pretty_print.cc ## @@ -339,6 +339,17 @@ class ArrayPrinter : public PrettyPrinter { children.reserve(array.num_fields())

[GitHub] [arrow] tianchen92 commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

2020-09-07 Thread GitBox
tianchen92 commented on a change in pull request #7887: URL: https://github.com/apache/arrow/pull/7887#discussion_r484621822 ## File path: cpp/src/arrow/pretty_print.cc ## @@ -339,6 +339,17 @@ class ArrayPrinter : public PrettyPrinter { children.reserve(array.num_fields())

[GitHub] [arrow] raduteo opened a new pull request #8130: Parquet file writer snapshot API and proper ColumnChunk.file_path utilization

2020-09-07 Thread GitBox
raduteo opened a new pull request #8130: URL: https://github.com/apache/arrow/pull/8130 This is a follow up to the thread: https://mail-archives.apache.org/mod_mbox/arrow-dev/202009.mbox/%3ccdd00783-0ffc-4934-aa24-529fb2a44...@yahoo.com%3e The specific use case I am targeting is h

[GitHub] [arrow] github-actions[bot] commented on pull request #8130: Parquet file writer snapshot API and proper ColumnChunk.file_path utilization

2020-09-07 Thread GitBox
github-actions[bot] commented on pull request #8130: URL: https://github.com/apache/arrow/pull/8130#issuecomment-688572112 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then could

[GitHub] [arrow] elferherrera commented on pull request #8129: ARROW-9934 [Rust] Shape and stride check in tensor

2020-09-07 Thread GitBox
elferherrera commented on pull request #8129: URL: https://github.com/apache/arrow/pull/8129#issuecomment-688550811 Is the team interested in more features for the tensor struct? I have some ideas that could be implemented but I'm not sure if you are looking for more implementation on that

[GitHub] [arrow] elferherrera commented on a change in pull request #8129: ARROW-9934 [Rust] Shape and stride check in tensor

2020-09-07 Thread GitBox
elferherrera commented on a change in pull request #8129: URL: https://github.com/apache/arrow/pull/8129#discussion_r484591510 ## File path: rust/arrow/src/tensor.rs ## @@ -112,13 +123,44 @@ impl<'a, T: ArrowPrimitiveType> Tensor<'a, T> { )

[GitHub] [arrow] andygrove commented on a change in pull request #8129: ARROW-9934 [Rust] Shape and stride check in tensor

2020-09-07 Thread GitBox
andygrove commented on a change in pull request #8129: URL: https://github.com/apache/arrow/pull/8129#discussion_r484571618 ## File path: rust/arrow/src/tensor.rs ## @@ -112,13 +123,44 @@ impl<'a, T: ArrowPrimitiveType> Tensor<'a, T> { )

[GitHub] [arrow] jhorstmann commented on pull request #8092: ARROW-9895: [Rust] Improve sorting kernels

2020-09-07 Thread GitBox
jhorstmann commented on pull request #8092: URL: https://github.com/apache/arrow/pull/8092#issuecomment-688515093 @paddyhoran @andygrove Thanks, I'm not sure what happened in that last merge. I rebased and squashed it into one commit now.

[GitHub] [arrow] kou closed pull request #8120: ARROW-9926: [GLib] Use placement new for GArrowRecordBatchFileReader

2020-09-07 Thread GitBox
kou closed pull request #8120: URL: https://github.com/apache/arrow/pull/8120 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

[GitHub] [arrow] kou commented on pull request #8120: ARROW-9926: [GLib] Use placement new for GArrowRecordBatchFileReader

2020-09-07 Thread GitBox
kou commented on pull request #8120: URL: https://github.com/apache/arrow/pull/8120#issuecomment-688514565 +1

[GitHub] [arrow] kou closed pull request #8119: ARROW-9925: [GLib] Add low level value readers for GArrowListArray family

2020-09-07 Thread GitBox
kou closed pull request #8119: URL: https://github.com/apache/arrow/pull/8119

[GitHub] [arrow] github-actions[bot] commented on pull request #8129: ARROW-9934 [Rust] Shape and stride check in tensor

2020-09-07 Thread GitBox
github-actions[bot] commented on pull request #8129: URL: https://github.com/apache/arrow/pull/8129#issuecomment-688514484 https://issues.apache.org/jira/browse/ARROW-9934

[GitHub] [arrow] kou commented on pull request #8119: ARROW-9925: [GLib] Add low level value readers for GArrowListArray family

2020-09-07 Thread GitBox
kou commented on pull request #8119: URL: https://github.com/apache/arrow/pull/8119#issuecomment-688514092 +1

[GitHub] [arrow] elferherrera opened a new pull request #8129: ARROW-9934 [Rust] Shape and stride check in tensor

2020-09-07 Thread GitBox
elferherrera opened a new pull request #8129: URL: https://github.com/apache/arrow/pull/8129 The provided shape and stride for the tensor should be checked before creating a tensor. The shape and the stride will be used to access the elements in the buffer when working with the tensor.
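A minimal sketch of the kind of check described above, written in Python rather than the PR's Rust for brevity. The function name `shape_and_strides_fit`, the use of byte-based strides, and the restriction to non-negative strides are all assumptions made for illustration; none of them is taken from the patch itself.

```python
from typing import Sequence


def shape_and_strides_fit(
    shape: Sequence[int],
    strides: Sequence[int],
    itemsize: int,
    buffer_len: int,
) -> bool:
    """Return True if every element addressed by (shape, strides) lies in the buffer.

    Hypothetical helper: strides are in bytes and assumed to be non-negative.
    """
    # There must be exactly one stride per dimension.
    if len(shape) != len(strides):
        return False
    if any(dim < 0 for dim in shape) or any(step < 0 for step in strides):
        return False
    # A zero-length dimension means no element is ever accessed.
    if any(dim == 0 for dim in shape):
        return True
    # Byte offset of the last element along every axis.
    last_offset = sum((dim - 1) * step for dim, step in zip(shape, strides))
    # The last element must end inside the buffer.
    return last_offset + itemsize <= buffer_len


# A 2x3 row-major tensor of 4-byte values needs 24 bytes.
print(shape_and_strides_fit((2, 3), (12, 4), 4, 24))  # True
print(shape_and_strides_fit((2, 3), (12, 4), 4, 20))  # False
```

Running the check once at construction time means later element access can compute offsets from the shape and strides without re-validating them against the buffer on every lookup.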

[GitHub] [arrow] andygrove closed pull request #8032: ARROW-9836: [Rust][DataFusion] Improve API for usage of UDFs

2020-09-07 Thread GitBox
andygrove closed pull request #8032: URL: https://github.com/apache/arrow/pull/8032

[GitHub] [arrow] andygrove commented on pull request #8092: ARROW-9895: [Rust] Improve sorting kernels

2020-09-07 Thread GitBox
andygrove commented on pull request #8092: URL: https://github.com/apache/arrow/pull/8092#issuecomment-688500929 Sorry @jhorstmann but could you rebase against master rather than merge from master? I see unrelated changes in the last commit in this PR now.

[GitHub] [arrow] andygrove commented on pull request #8033: ARROW-9837: [Rust][DataFusion] Add provider for variable

2020-09-07 Thread GitBox
andygrove commented on pull request #8033: URL: https://github.com/apache/arrow/pull/8033#issuecomment-688500627 @wqc200 Thanks for making those changes. However, I still see merges in the commit history for this PR. I would suggest creating a new PR from master with these approved changes

[GitHub] [arrow] jorgecarleitao commented on pull request #8032: ARROW-9836: [Rust][DataFusion] Improve API for usage of UDFs

2020-09-07 Thread GitBox
jorgecarleitao commented on pull request #8032: URL: https://github.com/apache/arrow/pull/8032#issuecomment-688499191 I think so, @andygrove . There is probably some renaming once we have UDAFs. For now, I think it is fine.

[GitHub] [arrow] andygrove edited a comment on pull request #7873: ARROW-9608: [Rust] Leaner feature gating for arrow in parquet

2020-09-07 Thread GitBox
andygrove edited a comment on pull request #7873: URL: https://github.com/apache/arrow/pull/7873#issuecomment-688499002 @vertexclique Just checking in on status. The PR has been idle for a while now. Should we close it?

[GitHub] [arrow] andygrove commented on pull request #7873: ARROW-9608: [Rust] Leaner feature gating for arrow in parquet

2020-09-07 Thread GitBox
andygrove commented on pull request #7873: URL: https://github.com/apache/arrow/pull/7873#issuecomment-688499002 @vertexclique Just checking in on status. The PR has been idle for a while now. Should we close it?

[GitHub] [arrow] andygrove commented on pull request #8032: ARROW-9836: [Rust][DataFusion] Improve API for usage of UDFs

2020-09-07 Thread GitBox
andygrove commented on pull request #8032: URL: https://github.com/apache/arrow/pull/8032#issuecomment-688498545 @jorgecarleitao @alamb I'm catching up on the PRs today. It looks like this one is ready to merge?

[GitHub] [arrow] andygrove closed pull request #8124: ARROW-9908: [Rust] Add support for temporal types in JSON reader

2020-09-07 Thread GitBox
andygrove closed pull request #8124: URL: https://github.com/apache/arrow/pull/8124

[GitHub] [arrow] andygrove commented on a change in pull request #8097: ARROW-9821: [Rust][DataFusion] Support for User Defined ExtensionNodes in the LogicalPlan

2020-09-07 Thread GitBox
andygrove commented on a change in pull request #8097: URL: https://github.com/apache/arrow/pull/8097#discussion_r484556394 ## File path: rust/datafusion/src/logical_plan/mod.rs ## @@ -755,8 +756,72 @@ impl fmt::Debug for Expr { } } -/// The LogicalPlan represents diffe

[GitHub] [arrow] andygrove commented on a change in pull request #8097: ARROW-9821: [Rust][DataFusion] Support for User Defined ExtensionNodes in the LogicalPlan

2020-09-07 Thread GitBox
andygrove commented on a change in pull request #8097: URL: https://github.com/apache/arrow/pull/8097#discussion_r484556134 ## File path: rust/datafusion/src/logical_plan/mod.rs ## @@ -755,8 +756,72 @@ impl fmt::Debug for Expr { } } -/// The LogicalPlan represents diffe

[GitHub] [arrow] andygrove commented on a change in pull request #8097: ARROW-9821: [Rust][DataFusion] Support for User Defined ExtensionNodes in the LogicalPlan

2020-09-07 Thread GitBox
andygrove commented on a change in pull request #8097: URL: https://github.com/apache/arrow/pull/8097#discussion_r48454 ## File path: rust/datafusion/src/execution/context.rs ## @@ -375,15 +380,39 @@ impl ScalarFunctionRegistry for ExecutionContext { } } +/// Provid

[GitHub] [arrow] andygrove commented on a change in pull request #8097: ARROW-9821: [Rust][DataFusion] Support for User Defined ExtensionNodes in the LogicalPlan

2020-09-07 Thread GitBox
andygrove commented on a change in pull request #8097: URL: https://github.com/apache/arrow/pull/8097#discussion_r484555403 ## File path: rust/datafusion/src/execution/context.rs ## @@ -288,9 +288,17 @@ impl ExecutionContext { /// Optimize the logical plan by applying op

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8116: ARROW-9919: [Rust][DataFusion] Speedup math operations by 15%+

2020-09-07 Thread GitBox
jorgecarleitao commented on a change in pull request #8116: URL: https://github.com/apache/arrow/pull/8116#discussion_r484544619 ## File path: rust/datafusion/src/physical_plan/math_expressions.rs ## @@ -19,23 +19,25 @@ use crate::error::{ExecutionError, Result}; -use arro

[GitHub] [arrow] jorisvandenbossche closed pull request #7991: ARROW-9718: [Python] ParquetWriter to work with new FileSystem API

2020-09-07 Thread GitBox
jorisvandenbossche closed pull request #7991: URL: https://github.com/apache/arrow/pull/7991

[GitHub] [arrow] jorisvandenbossche commented on pull request #7991: ARROW-9718: [Python] ParquetWriter to work with new FileSystem API

2020-09-07 Thread GitBox
jorisvandenbossche commented on pull request #7991: URL: https://github.com/apache/arrow/pull/7991#issuecomment-688469658 Yep

[GitHub] [arrow] nevi-me commented on a change in pull request #8116: ARROW-9919: [Rust][DataFusion] Speedup math operations by 15%+

2020-09-07 Thread GitBox
nevi-me commented on a change in pull request #8116: URL: https://github.com/apache/arrow/pull/8116#discussion_r484534336 ## File path: rust/datafusion/src/physical_plan/math_expressions.rs ## @@ -19,23 +19,25 @@ use crate::error::{ExecutionError, Result}; -use arrow::arra

[GitHub] [arrow] wesm commented on issue #8121: Add dplyr group_by, summarise and mutate support in function open_dataset R arrow package

2020-09-07 Thread GitBox
wesm commented on issue #8121: URL: https://github.com/apache/arrow/issues/8121#issuecomment-688433438 I see you opened a JIRA

[GitHub] [arrow] wesm closed issue #8121: Add dplyr group_by, summarise and mutate support in function open_dataset R arrow package

2020-09-07 Thread GitBox
wesm closed issue #8121: URL: https://github.com/apache/arrow/issues/8121

[GitHub] [arrow] nealrichardson closed pull request #8123: ARROW-9929: [Dev] Autotune cmake-format

2020-09-07 Thread GitBox
nealrichardson closed pull request #8123: URL: https://github.com/apache/arrow/pull/8123

[GitHub] [arrow] nealrichardson commented on a change in pull request #8123: ARROW-9929: [Dev] Autotune cmake-format

2020-09-07 Thread GitBox
nealrichardson commented on a change in pull request #8123: URL: https://github.com/apache/arrow/pull/8123#discussion_r484499701 ## File path: .github/workflows/comment_bot.yml ## @@ -69,7 +69,7 @@ jobs: git remote add upstream https://github.com/apache/arrow

[GitHub] [arrow] github-actions[bot] commented on pull request #8127: WIP: ARROW-8359: [C++/Python] Enable linux-aarch64 builds

2020-09-07 Thread GitBox
github-actions[bot] commented on pull request #8127: URL: https://github.com/apache/arrow/pull/8127#issuecomment-688401397 Revision: cf894700a416f70288460ee60b7f061bc3f90637 Submitted crossbow builds: [ursa-labs/crossbow @ actions-507](https://github.com/ursa-labs/crossbow/branches/a

[GitHub] [arrow] xhochy commented on pull request #8127: WIP: ARROW-8359: [C++/Python] Enable linux-aarch64 builds

2020-09-07 Thread GitBox
xhochy commented on pull request #8127: URL: https://github.com/apache/arrow/pull/8127#issuecomment-688400164 @github-actions crossbow submit conda-linux-gcc-py36-aarch64

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8116: ARROW-9919: [Rust][DataFusion] Speedup math operations by 15%+

2020-09-07 Thread GitBox
jorgecarleitao commented on a change in pull request #8116: URL: https://github.com/apache/arrow/pull/8116#discussion_r484498652 ## File path: rust/datafusion/src/physical_plan/math_expressions.rs ## @@ -19,23 +19,25 @@ use crate::error::{ExecutionError, Result}; -use arro

[GitHub] [arrow] kszucs commented on pull request #8128: ARROW-9933: [Developer] Add drone as a CI provider

2020-09-07 Thread GitBox
kszucs commented on pull request #8128: URL: https://github.com/apache/arrow/pull/8128#issuecomment-688399715 There are no unit tests set up for crossbow yet, but you can check the rendered template by running submit with `--dry-run` option. I don't think you need a separate PR for addi

[GitHub] [arrow] nevi-me commented on pull request #8116: ARROW-9919: [Rust][DataFusion] Speedup math operations by 15%+

2020-09-07 Thread GitBox
nevi-me commented on pull request #8116: URL: https://github.com/apache/arrow/pull/8116#issuecomment-688394965 @paddyhoran @jorgecarleitao @andygrove PTAL at my suggestion before merging (https://github.com/apache/arrow/pull/8116#discussion_r484493616)

[GitHub] [arrow] nevi-me commented on a change in pull request #8116: ARROW-9919: [Rust][DataFusion] Speedup math operations by 15%+

2020-09-07 Thread GitBox
nevi-me commented on a change in pull request #8116: URL: https://github.com/apache/arrow/pull/8116#discussion_r484493616 ## File path: rust/datafusion/src/physical_plan/math_expressions.rs ## @@ -19,23 +19,25 @@ use crate::error::{ExecutionError, Result}; -use arrow::arra

[GitHub] [arrow] github-actions[bot] commented on pull request #8128: ARROW-9933: [Developer] Add drone as a CI provider

2020-09-07 Thread GitBox
github-actions[bot] commented on pull request #8128: URL: https://github.com/apache/arrow/pull/8128#issuecomment-688391044 https://issues.apache.org/jira/browse/ARROW-9933

[GitHub] [arrow] xhochy commented on pull request #8128: ARROW-9933: [Developer] Add drone as a CI provider

2020-09-07 Thread GitBox
xhochy commented on pull request #8128: URL: https://github.com/apache/arrow/pull/8128#issuecomment-688388932 @kszucs I need this for https://github.com/apache/arrow/pull/8127, is there an easy way to test this?

[GitHub] [arrow] xhochy opened a new pull request #8128: ARROW-9933: [Developer] Add drone as a CI provider

2020-09-07 Thread GitBox
xhochy opened a new pull request #8128: URL: https://github.com/apache/arrow/pull/8128

[GitHub] [arrow] pitrou commented on pull request #7991: ARROW-9718: [Python] ParquetWriter to work with new FileSystem API

2020-09-07 Thread GitBox
pitrou commented on pull request #7991: URL: https://github.com/apache/arrow/pull/7991#issuecomment-688376250 @jorisvandenbossche Do you want to merge this?

[GitHub] [arrow] nevi-me commented on a change in pull request #8116: ARROW-9919: [Rust][DataFusion] Speedup math operations by 15%+

2020-09-07 Thread GitBox
nevi-me commented on a change in pull request #8116: URL: https://github.com/apache/arrow/pull/8116#discussion_r484477474 ## File path: rust/datafusion/src/physical_plan/math_expressions.rs ## @@ -19,23 +19,25 @@ use crate::error::{ExecutionError, Result}; -use arrow::arra

[GitHub] [arrow] pitrou closed pull request #8109: ARROW-9913: [C++] Make outputs of Decimal128::FromString independent of the presence of one another.

2020-09-07 Thread GitBox
pitrou closed pull request #8109: URL: https://github.com/apache/arrow/pull/8109

[GitHub] [arrow] pitrou closed pull request #8103: ARROW-9904: [C++] Unroll the loop of CountSetBits.

2020-09-07 Thread GitBox
pitrou closed pull request #8103: URL: https://github.com/apache/arrow/pull/8103

[GitHub] [arrow] pitrou commented on a change in pull request #8114: ARROW-9588: [C++] Partially support building with clang in an MSVC setting

2020-09-07 Thread GitBox
pitrou commented on a change in pull request #8114: URL: https://github.com/apache/arrow/pull/8114#discussion_r484472463 ## File path: cpp/src/arrow/python/type_traits.h ## @@ -105,7 +105,7 @@ struct npy_traits { using TypeClass = FloatType; using BuilderClass = FloatBuil

[GitHub] [arrow] xhochy commented on pull request #8127: WIP: ARROW-8359: [C++/Python] Enable linux-aarch64 builds

2020-09-07 Thread GitBox
xhochy commented on pull request #8127: URL: https://github.com/apache/arrow/pull/8127#issuecomment-688369186 @github-actions crossbow submit conda-linux-gcc-py36-aarch64

[GitHub] [arrow] xhochy opened a new pull request #8127: WIP: ARROW-8359: [C++/Python] Enable linux-aarch64 builds

2020-09-07 Thread GitBox
xhochy opened a new pull request #8127: URL: https://github.com/apache/arrow/pull/8127

[GitHub] [arrow] andygrove commented on pull request #8097: ARROW-9821: [Rust][DataFusion] Support for User Defined ExtensionNodes in the LogicalPlan

2020-09-07 Thread GitBox
andygrove commented on pull request #8097: URL: https://github.com/apache/arrow/pull/8097#issuecomment-688368606 @alamb Apologies .. I did not get to this yet due to work commitments, but I am going to make time today to review this.

[GitHub] [arrow] pitrou commented on a change in pull request #8115: ARROW-9917: [Python][Compute] Bindings for mode kernel

2020-09-07 Thread GitBox
pitrou commented on a change in pull request #8115: URL: https://github.com/apache/arrow/pull/8115#discussion_r484471282 ## File path: python/pyarrow/tests/test_compute.py ## @@ -109,6 +109,77 @@ def test_sum_chunked_array(arrow_type): assert pc.sum(arr).as_py() is None #

[GitHub] [arrow] pitrou commented on a change in pull request #8115: ARROW-9917: [Python][Compute] Bindings for mode kernel

2020-09-07 Thread GitBox
pitrou commented on a change in pull request #8115: URL: https://github.com/apache/arrow/pull/8115#discussion_r484470809 ## File path: python/pyarrow/array.pxi ## @@ -802,6 +802,12 @@ cdef class Array(_PandasConvertible): """ return _pc().call_function('sum',

[GitHub] [arrow] github-actions[bot] commented on pull request #8126: ARROW-9931: [C++] Fix undefined behaviour on invalid IPC input

2020-09-07 Thread GitBox
github-actions[bot] commented on pull request #8126: URL: https://github.com/apache/arrow/pull/8126#issuecomment-688358423 https://issues.apache.org/jira/browse/ARROW-9931

[GitHub] [arrow] pitrou opened a new pull request #8126: ARROW-9931: [C++] Fix undefined behaviour on invalid IPC input

2020-09-07 Thread GitBox
pitrou opened a new pull request #8126: URL: https://github.com/apache/arrow/pull/8126 Should fix the following issue: - https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=25461

[GitHub] [arrow] pitrou closed pull request #8100: ARROW-9901: [C++] Add hand-crafted Parquet to Arrow reconstruction tests

2020-09-07 Thread GitBox
pitrou closed pull request #8100: URL: https://github.com/apache/arrow/pull/8100

[GitHub] [arrow] github-actions[bot] commented on pull request #8125: ARROW-9387: [R] Use new C++ table select method

2020-09-07 Thread GitBox
github-actions[bot] commented on pull request #8125: URL: https://github.com/apache/arrow/pull/8125#issuecomment-688322569 https://issues.apache.org/jira/browse/ARROW-9387

[GitHub] [arrow] romainfrancois opened a new pull request #8125: ARROW-9387: [R] Use new C++ table select method

2020-09-07 Thread GitBox
romainfrancois opened a new pull request #8125: URL: https://github.com/apache/arrow/pull/8125 R follow up from #7272 The current `$select()` uses a more familiar (though more expensive) tidyselect interface: ``` r library(arrow, warn.conflicts = FALSE) tab <- Table

[GitHub] [arrow] pitrou commented on a change in pull request #8100: ARROW-9901: [C++] Add hand-crafted Parquet to Arrow reconstruction tests

2020-09-07 Thread GitBox
pitrou commented on a change in pull request #8100: URL: https://github.com/apache/arrow/pull/8100#discussion_r484410655 ## File path: cpp/src/parquet/arrow/reconstruct_internal_test.cc ## @@ -0,0 +1,1528 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] pitrou closed pull request #8104: ARROW-9928: [C++] Speed up integer parsing slightly

2020-09-07 Thread GitBox
pitrou closed pull request #8104: URL: https://github.com/apache/arrow/pull/8104

[GitHub] [arrow] pitrou commented on pull request #8100: ARROW-9901: [C++] Add hand-crafted Parquet to Arrow reconstruction tests

2020-09-07 Thread GitBox
pitrou commented on pull request #8100: URL: https://github.com/apache/arrow/pull/8100#issuecomment-688298194 I think I've addressed your concerns. Using object notation actually helped me fix a test... which made it fail. So one more bug to fix :-)

[GitHub] [arrow] pitrou commented on a change in pull request #8100: ARROW-9901: [C++] Add hand-crafted Parquet to Arrow reconstruction tests

2020-09-07 Thread GitBox
pitrou commented on a change in pull request #8100: URL: https://github.com/apache/arrow/pull/8100#discussion_r484405009 ## File path: cpp/src/parquet/arrow/reconstruct_internal_test.cc ## @@ -0,0 +1,1528 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] github-actions[bot] commented on pull request #8124: ARROW-9908: [RUST] Add support for temporal types in JSON reader

2020-09-07 Thread GitBox
github-actions[bot] commented on pull request #8124: URL: https://github.com/apache/arrow/pull/8124#issuecomment-688293690 https://issues.apache.org/jira/browse/ARROW-9908

[GitHub] [arrow] ch-sc opened a new pull request #8124: ARROW-9908: [RUST] Add support for temporal types in JSON reader

2020-09-07 Thread GitBox
ch-sc opened a new pull request #8124: URL: https://github.com/apache/arrow/pull/8124

[GitHub] [arrow] romainfrancois commented on pull request #8122: ARROW-9557: [R] Iterating over parquet columns is slow in R

2020-09-07 Thread GitBox
romainfrancois commented on pull request #8122: URL: https://github.com/apache/arrow/pull/8122#issuecomment-688284689 also, I guess `$ReadTable()` could either be simplified to only use column indices (as in the C++ function) ```cpp virtual ::arrow::Status ReadTable(const std::vec

[GitHub] [arrow] alamb commented on pull request #8097: ARROW-9821: [Rust][DataFusion] Support for User Defined ExtensionNodes in the LogicalPlan

2020-09-07 Thread GitBox
alamb commented on pull request #8097: URL: https://github.com/apache/arrow/pull/8097#issuecomment-688254785 @andygrove -- is there any chance you have had a chance to think through the interactions with providing alternate planners and optimization rules? If not, no worries. I am just t

[GitHub] [arrow] paddyhoran commented on pull request #8116: ARROW-9919: [Rust][DataFusion] Speedup math operations by 15%+

2020-09-07 Thread GitBox
paddyhoran commented on pull request #8116: URL: https://github.com/apache/arrow/pull/8116#issuecomment-688252444 Hey @jorgecarleitao, Sorry, my info was wrong. As you said `>` does do a full copy of the underlying data which uses memory.rs [here](https://github.com/apache/arrow/bl

[GitHub] [arrow] github-actions[bot] commented on pull request #8123: ARROW-9929: [Dev] Autotune cmake-format

2020-09-07 Thread GitBox
github-actions[bot] commented on pull request #8123: URL: https://github.com/apache/arrow/pull/8123#issuecomment-688252334 https://issues.apache.org/jira/browse/ARROW-9929

[GitHub] [arrow] github-actions[bot] commented on pull request #8122: ARROW-9557: [R] Iterating over parquet columns is slow in R

2020-09-07 Thread GitBox
github-actions[bot] commented on pull request #8122: URL: https://github.com/apache/arrow/pull/8122#issuecomment-688252332 https://issues.apache.org/jira/browse/ARROW-9557

[GitHub] [arrow] alamb commented on a change in pull request #7967: ARROW-9751: [Rust] [DataFusion] Allow UDFs to accept multiple data types per argument

2020-09-07 Thread GitBox
alamb commented on a change in pull request #7967: URL: https://github.com/apache/arrow/pull/7967#discussion_r484350796 ## File path: rust/datafusion/examples/simple_udf.rs ## @@ -0,0 +1,138 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribut

[GitHub] [arrow] xhochy commented on pull request #8123: ARROW-9929: [Dev] Autotune cmake-format

2020-09-07 Thread GitBox
xhochy commented on pull request #8123: URL: https://github.com/apache/arrow/pull/8123#issuecomment-688247262 Successfully tested on my fork using https://github.com/xhochy/arrow/pull/5

[GitHub] [arrow] xhochy commented on a change in pull request #8123: ARROW-9929: [Dev] Autotune cmake-format

2020-09-07 Thread GitBox
xhochy commented on a change in pull request #8123: URL: https://github.com/apache/arrow/pull/8123#discussion_r484356615 ## File path: .github/workflows/comment_bot.yml ## @@ -69,7 +69,7 @@ jobs: git remote add upstream https://github.com/apache/arrow git

[GitHub] [arrow] xhochy opened a new pull request #8123: ARROW-9929: [Dev] Autotune cmake-format

2020-09-07 Thread GitBox
xhochy opened a new pull request #8123: URL: https://github.com/apache/arrow/pull/8123

[GitHub] [arrow] romainfrancois opened a new pull request #8122: ARROW-9557: [R] Iterating over parquet columns is slow in R

2020-09-07 Thread GitBox
romainfrancois opened a new pull request #8122: URL: https://github.com/apache/arrow/pull/8122 I don't think this is about `shared_ptr_is_null()` as indicated in the jira issue: https://issues.apache.org/jira/browse/ARROW-9557 I guess profvis (or probably the underlying profiler) struggles

[GitHub] [arrow] kszucs commented on a change in pull request #8088: [C++][Python] Refactor python to arrow conversions based on a reusable conversion API

2020-09-07 Thread GitBox
kszucs commented on a change in pull request #8088: URL: https://github.com/apache/arrow/pull/8088#discussion_r484348861 ## File path: cpp/src/arrow/python/python_test.cc ## @@ -422,17 +421,15 @@ TEST_F(DecimalTest, TestNoneAndNaN) { ASSERT_EQ(0, PyList_SetItem(list, 2, miss

[GitHub] [arrow] kszucs commented on a change in pull request #8088: [C++][Python] Refactor python to arrow conversions based on a reusable conversion API

2020-09-07 Thread GitBox
kszucs commented on a change in pull request #8088: URL: https://github.com/apache/arrow/pull/8088#discussion_r484348787 ## File path: cpp/src/arrow/util/hashing.h ## @@ -851,6 +851,11 @@ struct HashTraits::value && using MemoTableType = BinaryMemoTable; }; +template <>

[GitHub] [arrow] pitrou commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

2020-09-07 Thread GitBox
pitrou commented on a change in pull request #7887: URL: https://github.com/apache/arrow/pull/7887#discussion_r484348159 ## File path: cpp/src/arrow/pretty_print.cc ## @@ -339,6 +339,17 @@ class ArrayPrinter : public PrettyPrinter { children.reserve(array.num_fields());

[GitHub] [arrow] pitrou commented on a change in pull request #7887: ARROW-9304: [C++] Add "AppendEmpty" builder APIs for use inside StructBuilder::AppendNull

2020-09-07 Thread GitBox
pitrou commented on a change in pull request #7887: URL: https://github.com/apache/arrow/pull/7887#discussion_r484347796 ## File path: cpp/src/arrow/pretty_print.cc ## @@ -339,6 +339,17 @@ class ArrayPrinter : public PrettyPrinter { children.reserve(array.num_fields());

[GitHub] [arrow] pitrou commented on pull request #8104: ARROW-9928: [C++] Speed up integer parsing slightly

2020-09-07 Thread GitBox
pitrou commented on pull request #8104: URL: https://github.com/apache/arrow/pull/8104#issuecomment-688236639 CSV conversion micro-benchmark: * before: ``` Int64Conversion 100800 ns 100782 ns 21238 items_per_second=82.6831M/s ``` * after: ``` Int64Conve

[GitHub] [arrow] pitrou commented on pull request #8104: ARROW-9928: [C++] Speed up integer parsing slightly

2020-09-07 Thread GitBox
pitrou commented on pull request #8104: URL: https://github.com/apache/arrow/pull/8104#issuecomment-688236145 Integer parsing micro-benchmarks: * before: ``` IntegerParsing 2553 ns 2553 ns 826045 items_per_second=391.727M/s IntegerParsing 4728

[GitHub] [arrow] kszucs commented on a change in pull request #8088: [C++][Python] Refactor python to arrow conversions based on a reusable conversion API

2020-09-07 Thread GitBox
kszucs commented on a change in pull request #8088: URL: https://github.com/apache/arrow/pull/8088#discussion_r484346043 ## File path: cpp/src/arrow/python/python_to_arrow.cc ## @@ -1347,64 +841,49 @@ Status ConvertToSequenceAndInferSize(PyObject* obj, PyObject** seq, int64_t*

[GitHub] [arrow] kszucs commented on a change in pull request #8088: [C++][Python] Refactor python to arrow conversions based on a reusable conversion API

2020-09-07 Thread GitBox
kszucs commented on a change in pull request #8088: URL: https://github.com/apache/arrow/pull/8088#discussion_r484345700 ## File path: cpp/src/arrow/util/converter.h ## @@ -0,0 +1,281 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lice

[GitHub] [arrow] pitrou commented on pull request #8104: ARROW-9928: [C++] Speed up integer parsing slightly

2020-09-07 Thread GitBox
pitrou commented on pull request #8104: URL: https://github.com/apache/arrow/pull/8104#issuecomment-688234321 Thank you. The speed up is relatively minor, but I can confirm it on Ubuntu 20.04 with clang 10.

[GitHub] [arrow] github-actions[bot] commented on pull request #8104: ARROW-9928: [C++] Speed up integer parsing slightly

2020-09-07 Thread GitBox
github-actions[bot] commented on pull request #8104: URL: https://github.com/apache/arrow/pull/8104#issuecomment-688234077 https://issues.apache.org/jira/browse/ARROW-9928

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #8115: ARROW-9917: [Python][Compute] Bindings for mode kernel

2020-09-07 Thread GitBox
jorisvandenbossche commented on a change in pull request #8115: URL: https://github.com/apache/arrow/pull/8115#discussion_r484311322 ## File path: python/pyarrow/array.pxi ## @@ -802,6 +802,12 @@ cdef class Array(_PandasConvertible): """ return _pc().call_func

[GitHub] [arrow] AndrewTsao commented on pull request #8104: fast exit loop if length equals zero

2020-09-07 Thread GitBox
AndrewTsao commented on pull request #8104: URL: https://github.com/apache/arrow/pull/8104#issuecomment-688166550 Test results on my Windows machine. BM_ParseUnsignedBreak uses a break statement to exit the unrolled loop. (/O2 /Oi) ``` 2020-09-07T13:28:06+08:00 Running release\Relea

[GitHub] [arrow] PalGal2 opened a new issue #8121: Add dplyr group_by, summarise and mutate support in function open_dataset R arrow package

2020-09-07 Thread GitBox
PalGal2 opened a new issue #8121: URL: https://github.com/apache/arrow/issues/8121 Hi, The open_dataset() function in the R arrow package already supports the dplyr filter, select and rename functions. However, it would be a huge improvement if it could also include ot