[GitHub] [arrow] edponce commented on a change in pull request #12080: ARROW-15118: [C++] Avoid bitmap buffer if all inputs are all valid for Scalar Kernels

2022-01-06 Thread GitBox
edponce commented on a change in pull request #12080: URL: https://github.com/apache/arrow/pull/12080#discussion_r779356911 ## File path: cpp/src/arrow/compute/exec.cc ## @@ -370,12 +370,15 @@ class NullPropagator { public: NullPropagator(KernelContext* ctx, const ExecBatc
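The idea named in the pull request title above can be sketched in plain Python. This is only an illustration under stated assumptions, not the `NullPropagator` code from `exec.cc`: validity bitmaps are modelled as lists of booleans, `None` stands for "no bitmap allocated, all values valid", and the helper name `propagate_validity` is hypothetical.

```python
def propagate_validity(input_bitmaps, length):
    """AND the validity of all inputs; return None when no bitmap is needed.

    Each entry of input_bitmaps is either None (input has no bitmap, so all
    of its values are valid) or a list of booleans of the given length.
    """
    # Keep only inputs that actually contain at least one null.
    with_nulls = [b for b in input_bitmaps if b is not None and not all(b)]
    if not with_nulls:
        # Every input is all valid: skip allocating an output bitmap.
        return None
    out = [True] * length
    for bitmap in with_nulls:
        out = [a and b for a, b in zip(out, bitmap)]
    return out


print(propagate_validity([None, [True, True, True]], 3))
print(propagate_validity([[True, True, False], [True, False, True]], 3))
```

The first call returns `None` because no input contains a null, so no output bitmap is built; the second call falls back to the element-wise AND.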

[GitHub] [arrow] edponce commented on a change in pull request #12078: ARROW-14448: [Python] Update pyarrow.array() docstring note on timestamp (timezone) conversion

2022-01-06 Thread GitBox
edponce commented on a change in pull request #12078: URL: https://github.com/apache/arrow/pull/12078#discussion_r779367812 ## File path: python/pyarrow/array.pxi ## @@ -159,9 +159,10 @@ def array(object obj, type=None, mask=None, size=None, from_pandas=None, Notes

[GitHub] [arrow] westonpace closed pull request #10795: ARROW-13155: [C++] MapGenerator should optionally forward reentrant pressure

2022-01-06 Thread GitBox
westonpace closed pull request #10795: URL: https://github.com/apache/arrow/pull/10795 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsu

[GitHub] [arrow] 9prady9 commented on a change in pull request #12080: ARROW-15118: [C++] Avoid bitmap buffer if all inputs are all valid for Scalar Kernels

2022-01-06 Thread GitBox
9prady9 commented on a change in pull request #12080: URL: https://github.com/apache/arrow/pull/12080#discussion_r779374250 ## File path: cpp/src/arrow/compute/exec.cc ## @@ -370,12 +370,15 @@ class NullPropagator { public: NullPropagator(KernelContext* ctx, const ExecBatc

[GitHub] [arrow] 9prady9 commented on a change in pull request #12080: ARROW-15118: [C++] Avoid bitmap buffer if all inputs are all valid for Scalar Kernels

2022-01-06 Thread GitBox
9prady9 commented on a change in pull request #12080: URL: https://github.com/apache/arrow/pull/12080#discussion_r779375284 ## File path: cpp/src/arrow/compute/exec.cc ## @@ -370,12 +370,15 @@ class NullPropagator { public: NullPropagator(KernelContext* ctx, const ExecBatc

[GitHub] [arrow] edponce commented on a change in pull request #12077: ARROW-15109: [Python] Add arrow_info() to print build, component, and system info

2022-01-06 Thread GitBox
edponce commented on a change in pull request #12077: URL: https://github.com/apache/arrow/pull/12077#discussion_r779372956 ## File path: python/pyarrow/__init__.py ## @@ -75,19 +77,74 @@ def show_versions(): """ Print various version information, to help with error r

[GitHub] [arrow] romainfrancois commented on a change in pull request #11956: ARROW-10456: [R] Implement MapType and MapArray

2022-01-06 Thread GitBox
romainfrancois commented on a change in pull request #11956: URL: https://github.com/apache/arrow/pull/11956#discussion_r779376573 ## File path: r/src/symbols.cpp ## @@ -75,6 +75,10 @@ SEXP data::classes_arrow_large_list = precious(cpp11::writable::strings( SEXP data::classes

[GitHub] [arrow] edponce commented on a change in pull request #12080: ARROW-15118: [C++] Avoid bitmap buffer if all inputs are all valid for Scalar Kernels

2022-01-06 Thread GitBox
edponce commented on a change in pull request #12080: URL: https://github.com/apache/arrow/pull/12080#discussion_r779376706 ## File path: cpp/src/arrow/compute/exec.cc ## @@ -370,12 +370,15 @@ class NullPropagator { public: NullPropagator(KernelContext* ctx, const ExecBatc

[GitHub] [arrow] edponce commented on a change in pull request #12080: ARROW-15118: [C++] Avoid bitmap buffer if all inputs are all valid for Scalar Kernels

2022-01-06 Thread GitBox
edponce commented on a change in pull request #12080: URL: https://github.com/apache/arrow/pull/12080#discussion_r779366434 ## File path: cpp/src/arrow/compute/exec.cc ## @@ -370,12 +370,15 @@ class NullPropagator { public: NullPropagator(KernelContext* ctx, const ExecBatc

[GitHub] [arrow] sanjibansg commented on a change in pull request #12078: ARROW-14448: [Python] Update pyarrow.array() docstring note on timestamp (timezone) conversion

2022-01-06 Thread GitBox
sanjibansg commented on a change in pull request #12078: URL: https://github.com/apache/arrow/pull/12078#discussion_r779390745 ## File path: python/pyarrow/array.pxi ## @@ -159,9 +159,10 @@ def array(object obj, type=None, mask=None, size=None, from_pandas=None, Notes

[GitHub] [arrow] ursabot edited a comment on pull request #12016: ARROW-14603: [Doc] Tutorial - R bindings

2022-01-06 Thread GitBox
ursabot edited a comment on pull request #12016: URL: https://github.com/apache/arrow/pull/12016#issuecomment-1005679942 Benchmark runs are scheduled for baseline = 08096d4125fcbfe43ecf48614a15f1205cd4e8f3 and contender = 3bf06f2fdb7966be4e513564d2df553d09ae98b1. 3bf06f2fdb7966be4e513564d

[GitHub] [arrow] ursabot edited a comment on pull request #11514: ARROW-14430: [Go] Basic Expression, Field Reference and Datum handling

2022-01-06 Thread GitBox
ursabot edited a comment on pull request #11514: URL: https://github.com/apache/arrow/pull/11514#issuecomment-1006061615 Benchmark runs are scheduled for baseline = ec38aebb36e99e54e69089cbc6a623a616575dde and contender = d30669b7f27f60b20f02c45f1e4ca22f4abd06e1. d30669b7f27f60b20f02c45f1

[GitHub] [arrow] jvanstraten commented on a change in pull request #12084: ARROW-15029: [C++] Split compute/kernels/scalar_string.cc

2022-01-06 Thread GitBox
jvanstraten commented on a change in pull request #12084: URL: https://github.com/apache/arrow/pull/12084#discussion_r779405314 ## File path: cpp/src/arrow/CMakeLists.txt ## @@ -422,7 +422,9 @@ if(ARROW_COMPUTE) compute/kernels/scalar_nested.cc compute/kernels/s

[GitHub] [arrow] westonpace commented on pull request #11991: ARROW-13554: [C++] Remove deprecated Scanner::Scan

2022-01-06 Thread GitBox
westonpace commented on pull request #11991: URL: https://github.com/apache/arrow/pull/11991#issuecomment-1006424708 The only failure remaining is a Java failure. Unfortunately, the test failure is missing line numbers so I'm not sure exactly what assert is failing (it expects 2 but I can

[GitHub] [arrow] b41sh commented on pull request #11238: ARROW-13628: [Rust] Activate IPC month_day_nano_interval integration test for rust

2022-01-06 Thread GitBox
b41sh commented on pull request #11238: URL: https://github.com/apache/arrow/pull/11238#issuecomment-1006441731 Thanks for your help @pitrou @alamb -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [arrow] joosthooz opened a new pull request #12089: WIP: Arrow-9285: Detect unauthorized memory allocations in function kernels

2022-01-06 Thread GitBox
joosthooz opened a new pull request #12089: URL: https://github.com/apache/arrow/pull/12089 Targets https://issues.apache.org/jira/browse/ARROW-9285 If a kernel uses preallocated memory allocation (`MemAllocation::PREALLOCATE`), it should not create new buffers. This code aims to check
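The check described in this message (a kernel running with preallocated memory should not create new buffers) might look roughly like the following plain-Python sketch. It is not the code from the pull request: buffers are modelled as `bytearray` objects, buffer identity is approximated with `id()`, and the names `buffer_ids`, `kernel_reused_buffers`, `in_place_kernel` and `allocating_kernel` are all hypothetical.

```python
def buffer_ids(buffers):
    """Collect the identities of all non-null buffers."""
    return {id(b) for b in buffers if b is not None}


def kernel_reused_buffers(preallocated, kernel):
    """Return True if every output buffer was among the preallocated ones."""
    allowed = buffer_ids(preallocated)
    outputs = kernel(preallocated)
    return buffer_ids(outputs) <= allowed


def in_place_kernel(buffers):
    # Well-behaved: writes into the buffer it was handed.
    buffers[0][:] = b"\x01" * len(buffers[0])
    return buffers


def allocating_kernel(buffers):
    # Misbehaving: ignores the preallocated buffer and creates a new one.
    return [bytearray(len(buffers[0]))]


print(kernel_reused_buffers([bytearray(4)], in_place_kernel))
print(kernel_reused_buffers([bytearray(4)], allocating_kernel))
```

Recording the set of allowed buffers before execution and comparing afterwards keeps the check independent of what the kernel does internally.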

[GitHub] [arrow] github-actions[bot] commented on pull request #12089: WIP: Arrow-9285: Detect unauthorized memory allocations in function kernels

2022-01-06 Thread GitBox
github-actions[bot] commented on pull request #12089: URL: https://github.com/apache/arrow/pull/12089#issuecomment-1006465933 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you op

[GitHub] [arrow] ursabot edited a comment on pull request #12040: ARROW-15199: [Java] Update protobuf-maven-plugin to avoid 'Text file busy' failure

2022-01-06 Thread GitBox
ursabot edited a comment on pull request #12040: URL: https://github.com/apache/arrow/pull/12040#issuecomment-1006067533 Benchmark runs are scheduled for baseline = d30669b7f27f60b20f02c45f1e4ca22f4abd06e1 and contender = 4d8fe01c0ce6804f1d30e156756707e6c0daf9d2. 4d8fe01c0ce6804f1d30e1567

[GitHub] [arrow] ursabot edited a comment on pull request #12018: ARROW-14757: [Doc] Steps in making your first PR - R bindings

2022-01-06 Thread GitBox
ursabot edited a comment on pull request #12018: URL: https://github.com/apache/arrow/pull/12018#issuecomment-1005679950 Benchmark runs are scheduled for baseline = 3bf06f2fdb7966be4e513564d2df553d09ae98b1 and contender = 67a29fdffec3c2646b29aa07b49729305aac0d38. 67a29fdffec3c2646b29aa07b

[GitHub] [arrow-datafusion] alamb merged pull request #1524: Remove one copy of ballista datatype serialization code

2022-01-06 Thread GitBox
alamb merged pull request #1524: URL: https://github.com/apache/arrow-datafusion/pull/1524 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-

[GitHub] [arrow-datafusion] alamb commented on pull request #1475: Use`eq_dyn`, `neq_dyn`, `lt_dyn`, `lt_eq_dyn`, `gt_dyn`, `gt_eq_dyn` kernels from arrow

2022-01-06 Thread GitBox
alamb commented on pull request #1475: URL: https://github.com/apache/arrow-datafusion/pull/1475#issuecomment-1006533716 Rebased on top of https://github.com/apache/arrow-datafusion/pull/1523 (aka upgrade to arrow 7.0.0) which includes https://github.com/apache/arrow-rs/pull/1095 from @

[GitHub] [arrow-rs] alamb commented on issue #1120: More frequent major releases for arrow-rs

2022-01-06 Thread GitBox
alamb commented on issue #1120: URL: https://github.com/apache/arrow-rs/issues/1120#issuecomment-1006541097 > Maybe we need more active PMC for arrow in the rust ecosystem. Yeah -- I don't really know how to improve this. We do have several representatives of the Rust implementation

[GitHub] [arrow] ursabot edited a comment on pull request #11853: ARROW-1699: [C++] forward, backward fill kernel functions

2022-01-06 Thread GitBox
ursabot edited a comment on pull request #11853: URL: https://github.com/apache/arrow/pull/11853#issuecomment-1005743192 Benchmark runs are scheduled for baseline = 67a29fdffec3c2646b29aa07b49729305aac0d38 and contender = ec38aebb36e99e54e69089cbc6a623a616575dde. ec38aebb36e99e54e69089cbc

[GitHub] [arrow] ursabot edited a comment on pull request #12082: MINOR: [Docs] Add fill_null functions to docs

2022-01-06 Thread GitBox
ursabot edited a comment on pull request #12082: URL: https://github.com/apache/arrow/pull/12082#issuecomment-1006067549 Benchmark runs are scheduled for baseline = 4d8fe01c0ce6804f1d30e156756707e6c0daf9d2 and contender = 63010622e6d817321106ee392eff69962806c2b3. 63010622e6d817321106ee392

[GitHub] [arrow] edponce commented on pull request #12089: Arrow-9285: Detect unauthorized memory allocations in function kernels

2022-01-06 Thread GitBox
edponce commented on pull request #12089: URL: https://github.com/apache/arrow/pull/12089#issuecomment-1006577201 @joosthooz I have following initial comments: 1. [ScalarExecutor](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/exec.cc#L635) has two possible preallocat

[GitHub] [arrow] edponce edited a comment on pull request #12089: Arrow-9285: Detect unauthorized memory allocations in function kernels

2022-01-06 Thread GitBox
edponce edited a comment on pull request #12089: URL: https://github.com/apache/arrow/pull/12089#issuecomment-1006577201 @joosthooz I have following initial comments: 1. [ScalarExecutor](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/exec.cc#L635) has two possible pre

[GitHub] [arrow] edponce commented on pull request #12089: Arrow-9285: Detect unauthorized memory allocations in function kernels

2022-01-06 Thread GitBox
edponce commented on pull request #12089: URL: https://github.com/apache/arrow/pull/12089#issuecomment-1006578879 @joosthooz Need to change this PR title to: `ARROW-9285: [C++] Detect unauthorized memory allocations in function kernels` -- This is an automated message from the Apache Git

[GitHub] [arrow] edponce edited a comment on pull request #12089: Arrow-9285: Detect unauthorized memory allocations in function kernels

2022-01-06 Thread GitBox
edponce edited a comment on pull request #12089: URL: https://github.com/apache/arrow/pull/12089#issuecomment-1006577201 @joosthooz I have following initial comments: 1. [ScalarExecutor](https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/exec.cc#L635) has two possible pre

[GitHub] [arrow] lidavidm commented on a change in pull request #11991: ARROW-13554: [C++] Remove deprecated Scanner::Scan

2022-01-06 Thread GitBox
lidavidm commented on a change in pull request #11991: URL: https://github.com/apache/arrow/pull/11991#discussion_r779543359 ## File path: r/R/dataset-scan.R ## @@ -118,9 +113,6 @@ Scanner$create <- function(dataset, if (use_threads) { scanner_builder$UseThreads() }

[GitHub] [arrow] github-actions[bot] commented on pull request #12089: ARROW-9285: [C++] Detect unauthorized memory allocations in function kernels

2022-01-06 Thread GitBox
github-actions[bot] commented on pull request #12089: URL: https://github.com/apache/arrow/pull/12089#issuecomment-1006598279 https://issues.apache.org/jira/browse/ARROW-9285 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub an

[GitHub] [arrow] dragosmg commented on pull request #11921: ARROW-12743 [R] Add DESCRIPTION fields for dev dependencies

2022-01-06 Thread GitBox
dragosmg commented on pull request #11921: URL: https://github.com/apache/arrow/pull/11921#issuecomment-1006612212 @thisisnic & @jonkeane would you mind having a look? 🙏🏻 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

[GitHub] [arrow] paleolimbot opened a new pull request #12090: ARROW-15266: [R] [CI] Test reorganization triggering valgrind errors

2022-01-06 Thread GitBox
paleolimbot opened a new pull request #12090: URL: https://github.com/apache/arrow/pull/12090 This PR is to fix a valgrind failure that started occurring after #11904 (ARROW-15010) was merged. -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [arrow] github-actions[bot] commented on pull request #12090: ARROW-15266: [R] [CI] Test reorganization triggering valgrind errors

2022-01-06 Thread GitBox
github-actions[bot] commented on pull request #12090: URL: https://github.com/apache/arrow/pull/12090#issuecomment-1006614208 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [arrow] jonkeane commented on a change in pull request #12073: ARROW-14919: [R] write_parquet() drops attributes for grouped dataframes

2022-01-06 Thread GitBox
jonkeane commented on a change in pull request #12073: URL: https://github.com/apache/arrow/pull/12073#discussion_r779568493 ## File path: r/R/metadata.R ## @@ -133,24 +133,24 @@ remove_attributes <- function(x) { } arrow_attributes <- function(x, only_top_level = FALSE) {

[GitHub] [arrow] thisisnic commented on pull request #12083: ARROW-14744: [R] open_dataset() error when `schema` argument supplied, but `column_names` not supplied to `CSVReadOptions`

2022-01-06 Thread GitBox
thisisnic commented on pull request #12083: URL: https://github.com/apache/arrow/pull/12083#issuecomment-1006618779 Just pasting here the conversation from JIRA: > I did, however, run into trouble. Say, for example, the user has set skip_rows-option like this: `read_options=arrow:

[GitHub] [arrow] bkmgit commented on pull request #12085: ARROW-15248: [C++][Docs] Improve docs about linting/formatting

2022-01-06 Thread GitBox
bkmgit commented on pull request #12085: URL: https://github.com/apache/arrow/pull/12085#issuecomment-1006621048 This is helpful. The Dev / Lint check also gives information on what to change should the tests fail. This does not require setting up the tool chain, so maybe helpful when cont

[GitHub] [arrow] guyuqi commented on a change in pull request #12009: ARROW-15172: [Go] Add Arm64 Neon implementation for Arrow-math

2022-01-06 Thread GitBox
guyuqi commented on a change in pull request #12009: URL: https://github.com/apache/arrow/pull/12009#discussion_r779576032 ## File path: go/arrow/math/Makefile ## @@ -37,27 +41,42 @@ INTEL_SOURCES := \ int64_avx2_amd64.s int64_sse4_amd64.s \ uint64_avx2_amd64.s

[GitHub] [arrow] paleolimbot commented on pull request #12090: ARROW-15266: [R] [CI] Test reorganization triggering valgrind errors

2022-01-06 Thread GitBox
paleolimbot commented on pull request #12090: URL: https://github.com/apache/arrow/pull/12090#issuecomment-1006625224 @github-actions crossbow submit test-r-linux-valgrind -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

[GitHub] [arrow] github-actions[bot] commented on pull request #12090: ARROW-15266: [R] [CI] Test reorganization triggering valgrind errors

2022-01-06 Thread GitBox
github-actions[bot] commented on pull request #12090: URL: https://github.com/apache/arrow/pull/12090#issuecomment-1006626027 Revision: 2488ad76c416a95c75143a505a39e241c73c47dd Submitted crossbow builds: [ursacomputing/crossbow @ actions-1377](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-rs] sum12 commented on issue #38: Read temporal values from JSON

2022-01-06 Thread GitBox
sum12 commented on issue #38: URL: https://github.com/apache/arrow-rs/issues/38#issuecomment-1006626932 It seems the behavior here is not consistent. TemporalValues can get parsed to `u64` if they look like `u64` otherwise it is a `null`, but only while dealing with `build_struct_array` w

[GitHub] [arrow] vibhatha commented on a change in pull request #12085: ARROW-15248: [C++][Docs] Improve docs about linting/formatting

2022-01-06 Thread GitBox
vibhatha commented on a change in pull request #12085: URL: https://github.com/apache/arrow/pull/12085#discussion_r779589235 ## File path: docs/source/developers/cpp/development.rst ## @@ -92,32 +92,51 @@ Address Sanitizer and Undefined Behavior Sanitizer to check for various

[GitHub] [arrow] lidavidm commented on a change in pull request #12085: ARROW-15248: [C++][Docs] Improve docs about linting/formatting

2022-01-06 Thread GitBox
lidavidm commented on a change in pull request #12085: URL: https://github.com/apache/arrow/pull/12085#discussion_r779590041 ## File path: docs/source/developers/cpp/development.rst ## @@ -92,32 +92,51 @@ Address Sanitizer and Undefined Behavior Sanitizer to check for various

[GitHub] [arrow] vibhatha commented on a change in pull request #12085: ARROW-15248: [C++][Docs] Improve docs about linting/formatting

2022-01-06 Thread GitBox
vibhatha commented on a change in pull request #12085: URL: https://github.com/apache/arrow/pull/12085#discussion_r779590500 ## File path: docs/source/developers/cpp/development.rst ## @@ -92,32 +92,51 @@ Address Sanitizer and Undefined Behavior Sanitizer to check for various

[GitHub] [arrow] vibhatha commented on pull request #12085: ARROW-15248: [C++][Docs] Improve docs about linting/formatting

2022-01-06 Thread GitBox
vibhatha commented on pull request #12085: URL: https://github.com/apache/arrow/pull/12085#issuecomment-1006640159 > CC @vibhatha and @AlvinJ15 since we had some trouble with formatting earlier, are these docs clearer? This is really nice and thanks for the quick PR. I added a few c

[GitHub] [arrow] vibhatha commented on a change in pull request #12085: ARROW-15248: [C++][Docs] Improve docs about linting/formatting

2022-01-06 Thread GitBox
vibhatha commented on a change in pull request #12085: URL: https://github.com/apache/arrow/pull/12085#discussion_r779594147 ## File path: docs/source/developers/cpp/development.rst ## @@ -92,32 +92,51 @@ Address Sanitizer and Undefined Behavior Sanitizer to check for various

[GitHub] [arrow] ursabot edited a comment on pull request #11514: ARROW-14430: [Go] Basic Expression, Field Reference and Datum handling

2022-01-06 Thread GitBox
ursabot edited a comment on pull request #11514: URL: https://github.com/apache/arrow/pull/11514#issuecomment-1006061615 Benchmark runs are scheduled for baseline = ec38aebb36e99e54e69089cbc6a623a616575dde and contender = d30669b7f27f60b20f02c45f1e4ca22f4abd06e1. d30669b7f27f60b20f02c45f1

[GitHub] [arrow] dhruv9vats commented on a change in pull request #11946: ARROW-13663: [C++] RecordBatchReader STL-like iteration

2022-01-06 Thread GitBox
dhruv9vats commented on a change in pull request #11946: URL: https://github.com/apache/arrow/pull/11946#discussion_r779597891 ## File path: cpp/src/arrow/record_batch.h ## @@ -234,6 +234,67 @@ class ARROW_EXPORT RecordBatchReader { return batch; } + class RecordBatc

[GitHub] [arrow] Jimexist commented on a change in pull request #12028: ARROW-15192: [Java] Allow use of Jackson 2.12 and higher

2022-01-06 Thread GitBox
Jimexist commented on a change in pull request #12028: URL: https://github.com/apache/arrow/pull/12028#discussion_r779599743 ## File path: java/vector/src/main/java/org/apache/arrow/vector/util/ObjectMapperFactory.java ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Softwar

[GitHub] [arrow] Jimexist commented on a change in pull request #12028: ARROW-15192: [Java] Allow use of Jackson 2.12 and higher

2022-01-06 Thread GitBox
Jimexist commented on a change in pull request #12028: URL: https://github.com/apache/arrow/pull/12028#discussion_r779600487 ## File path: java/vector/src/main/java/org/apache/arrow/vector/util/ObjectMapperFactory.java ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Softwar

[GitHub] [arrow] zeroshade commented on a change in pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-06 Thread GitBox
zeroshade commented on a change in pull request #11538: URL: https://github.com/apache/arrow/pull/11538#discussion_r779607502 ## File path: go/parquet/internal/testutils/primitive_typed.go ## @@ -0,0 +1,305 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// o

[GitHub] [arrow] zeroshade commented on pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-06 Thread GitBox
zeroshade commented on pull request #11538: URL: https://github.com/apache/arrow/pull/11538#issuecomment-1006659303 @emkornfield I'll update based on the comments, but as far as integration testing, currently the integration testing I have is that I use several of the test parquet files fr

[GitHub] [arrow] edponce commented on a change in pull request #12089: ARROW-9285: [C++] Detect unauthorized memory allocations in function kernels

2022-01-06 Thread GitBox
edponce commented on a change in pull request #12089: URL: https://github.com/apache/arrow/pull/12089#discussion_r779611886 ## File path: cpp/src/arrow/compute/exec.cc ## @@ -110,6 +112,78 @@ int64_t ExecBatch::TotalBufferSize() const { return sum; } +bool AddBuffersToSet

[GitHub] [arrow] laurentgo commented on a change in pull request #12028: ARROW-15192: [Java] Allow use of Jackson 2.12 and higher

2022-01-06 Thread GitBox
laurentgo commented on a change in pull request #12028: URL: https://github.com/apache/arrow/pull/12028#discussion_r779615535 ## File path: java/vector/src/main/java/org/apache/arrow/vector/util/ObjectMapperFactory.java ## @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Softwa

[GitHub] [arrow] edponce commented on pull request #12014: ARROW-10924: [C++] Validate temporal data in ValidateArrayFull

2022-01-06 Thread GitBox
edponce commented on pull request #12014: URL: https://github.com/apache/arrow/pull/12014#issuecomment-1006674558 @pitrou So there is a limit for `Date64Type` but not for `Date32Type`? I now understand that `TimestampType` does not has limits. -- This is an automated message from the Apa

[GitHub] [arrow] wjones127 commented on a change in pull request #12077: ARROW-15109: [Python] Add arrow_info() to print build, component, and system info

2022-01-06 Thread GitBox
wjones127 commented on a change in pull request #12077: URL: https://github.com/apache/arrow/pull/12077#discussion_r779632641 ## File path: python/pyarrow/__init__.py ## @@ -75,19 +77,74 @@ def show_versions(): """ Print various version information, to help with error

[GitHub] [arrow] wjones127 commented on a change in pull request #12077: ARROW-15109: [Python] Add arrow_info() to print build, component, and system info

2022-01-06 Thread GitBox
wjones127 commented on a change in pull request #12077: URL: https://github.com/apache/arrow/pull/12077#discussion_r779634887 ## File path: python/pyarrow/__init__.py ## @@ -75,19 +77,74 @@ def show_versions(): """ Print various version information, to help with error

[GitHub] [arrow] lidavidm commented on pull request #12014: ARROW-10924: [C++] Validate temporal data in ValidateArrayFull

2022-01-06 Thread GitBox
lidavidm commented on pull request #12014: URL: https://github.com/apache/arrow/pull/12014#issuecomment-1006699051 Date32Type's unit is days and so there's nothing to check, but Date64Type's unit is milliseconds, hence it doesn't have a limit per se, but it must be a multiple of 8640 (
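To illustrate the distinction drawn in this message, here is a minimal plain-Python sketch (not Arrow's `ValidateArrayFull` implementation). It assumes Date64 values are milliseconds since the epoch and must fall on whole-day boundaries; the archived snippet is truncated before the exact figure, so the constant of 86,400,000 ms per day and the helper name `invalid_date64_indices` are assumptions introduced here.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # assumed: milliseconds in one day


def invalid_date64_indices(values):
    """Return indices of non-null values that are not a whole number of days."""
    return [
        i
        for i, v in enumerate(values)
        if v is not None and v % MS_PER_DAY != 0
    ]


# 1234 ms is not a whole day; None models a null slot and is skipped.
print(invalid_date64_indices([0, MS_PER_DAY, 1234, None, -MS_PER_DAY]))
```

Date32 needs no equivalent check in this model because its unit is already days, so every integer is a valid date.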

[GitHub] [arrow] edponce edited a comment on pull request #12014: ARROW-10924: [C++] Validate temporal data in ValidateArrayFull

2022-01-06 Thread GitBox
edponce edited a comment on pull request #12014: URL: https://github.com/apache/arrow/pull/12014#issuecomment-1006674558 @pitrou So there is a limit for `Date64Type` but not for `Date32Type`? I now understand that `TimestampType` does not have limits. -- This is an automated message from

[GitHub] [arrow] edponce commented on a change in pull request #12077: ARROW-15109: [Python] Add arrow_info() to print build, component, and system info

2022-01-06 Thread GitBox
edponce commented on a change in pull request #12077: URL: https://github.com/apache/arrow/pull/12077#discussion_r779651524 ## File path: python/pyarrow/__init__.py ## @@ -75,19 +77,74 @@ def show_versions(): """ Print various version information, to help with error r

[GitHub] [arrow] thisisnic commented on a change in pull request #12073: ARROW-14919: [R] write_parquet() drops attributes for grouped dataframes

2022-01-06 Thread GitBox
thisisnic commented on a change in pull request #12073: URL: https://github.com/apache/arrow/pull/12073#discussion_r779651711 ## File path: r/R/metadata.R ## @@ -133,24 +133,24 @@ remove_attributes <- function(x) { } arrow_attributes <- function(x, only_top_level = FALSE) {

[GitHub] [arrow-datafusion] ic4y commented on pull request #1520: use bumpalo for GroupState

2022-01-06 Thread GitBox
ic4y commented on pull request #1520: URL: https://github.com/apache/arrow-datafusion/pull/1520#issuecomment-1006702019 @alamb Thank you, I tried your method, and the `Drop` does consumes so much time. But I added the following to the code (this is mentioned in the [user guide](https:/

[GitHub] [arrow] edponce commented on a change in pull request #12077: ARROW-15109: [Python] Add arrow_info() to print build, component, and system info

2022-01-06 Thread GitBox
edponce commented on a change in pull request #12077: URL: https://github.com/apache/arrow/pull/12077#discussion_r779652288 ## File path: python/pyarrow/__init__.py ## @@ -75,19 +77,74 @@ def show_versions(): """ Print various version information, to help with error r

[GitHub] [arrow] jonkeane commented on a change in pull request #12073: ARROW-14919: [R] write_parquet() drops attributes for grouped dataframes

2022-01-06 Thread GitBox
jonkeane commented on a change in pull request #12073: URL: https://github.com/apache/arrow/pull/12073#discussion_r779656272 ## File path: r/R/metadata.R ## @@ -133,24 +133,24 @@ remove_attributes <- function(x) { } arrow_attributes <- function(x, only_top_level = FALSE) {

[GitHub] [arrow] thisisnic commented on a change in pull request #12073: ARROW-14919: [R] write_parquet() drops attributes for grouped dataframes

2022-01-06 Thread GitBox
thisisnic commented on a change in pull request #12073: URL: https://github.com/apache/arrow/pull/12073#discussion_r779660661 ## File path: r/R/metadata.R ## @@ -133,24 +133,24 @@ remove_attributes <- function(x) { } arrow_attributes <- function(x, only_top_level = FALSE) {

[GitHub] [arrow-datafusion] ic4y closed issue #1504: The destruction of GroupState in high cardinality aggregation takes a lot of time

2022-01-06 Thread GitBox
ic4y closed issue #1504: URL: https://github.com/apache/arrow-datafusion/issues/1504 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubs

[GitHub] [arrow-datafusion] ic4y commented on issue #1504: The destruction of GroupState in high cardinality aggregation takes a lot of time

2022-01-06 Thread GitBox
ic4y commented on issue #1504: URL: https://github.com/apache/arrow-datafusion/issues/1504#issuecomment-1006710778 > By using --features "mimalloc", it was found that the test results did not differ much. Using --features "mimalloc" did not take effect. Add the following code

[GitHub] [arrow] jonkeane commented on a change in pull request #12073: ARROW-14919: [R] write_parquet() drops attributes for grouped dataframes

2022-01-06 Thread GitBox
jonkeane commented on a change in pull request #12073: URL: https://github.com/apache/arrow/pull/12073#discussion_r779662565 ## File path: r/R/metadata.R ## @@ -133,24 +133,24 @@ remove_attributes <- function(x) { } arrow_attributes <- function(x, only_top_level = FALSE) {

[GitHub] [arrow] jonkeane commented on a change in pull request #11894: ARROW-14029: [R] Repair map_batches()

2022-01-06 Thread GitBox
jonkeane commented on a change in pull request #11894: URL: https://github.com/apache/arrow/pull/11894#discussion_r779659047 ## File path: r/tests/testthat/test-dataset.R ## @@ -453,15 +453,38 @@ test_that("Creating UnionDataset", { }) test_that("map_batches", { - skip("ma

[GitHub] [arrow] dragosmg commented on pull request #11921: ARROW-12743 [R] Add DESCRIPTION fields for dev dependencies

2022-01-06 Thread GitBox
dragosmg commented on pull request #11921: URL: https://github.com/apache/arrow/pull/11921#issuecomment-1006722558 There is a [new GHA workflow for installing dependencies](https://github.com/r-lib/actions/tree/v2-branch/setup-r-dependencies). Do we want to make use of it or should I aim t

[GitHub] [arrow] dragosmg edited a comment on pull request #11921: ARROW-12743 [R] Add DESCRIPTION fields for dev dependencies

2022-01-06 Thread GitBox
dragosmg edited a comment on pull request #11921: URL: https://github.com/apache/arrow/pull/11921#issuecomment-1006722558 There is a [new(ish) GHA workflow for installing dependencies](https://github.com/r-lib/actions/tree/v2-branch/setup-r-dependencies). Do we want to make use of it or sh

[GitHub] [arrow] zeroshade commented on a change in pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-06 Thread GitBox
zeroshade commented on a change in pull request #11538: URL: https://github.com/apache/arrow/pull/11538#discussion_r779674695 ## File path: go/parquet/file/page_writer.go ## @@ -0,0 +1,466 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] ursabot edited a comment on pull request #12040: ARROW-15199: [Java] Update protobuf-maven-plugin to avoid 'Text file busy' failure

2022-01-06 Thread GitBox
ursabot edited a comment on pull request #12040: URL: https://github.com/apache/arrow/pull/12040#issuecomment-1006067533 Benchmark runs are scheduled for baseline = d30669b7f27f60b20f02c45f1e4ca22f4abd06e1 and contender = 4d8fe01c0ce6804f1d30e156756707e6c0daf9d2. 4d8fe01c0ce6804f1d30e1567

[GitHub] [arrow] eerhardt commented on issue #12049: How to read arrow files data in C#

2022-01-06 Thread GitBox
eerhardt commented on issue #12049: URL: https://github.com/apache/arrow/issues/12049#issuecomment-1006729352 ``` var col = readBatch.Column(0); ``` Here `col` is an `IArrowArray`. In order to read the values from an array, you need to know which type of data is contained in th

[GitHub] [arrow] github-actions[bot] commented on pull request #12091: ARROW-14798: [C++][Python] Add child window to PrettyPrintOptions [WIP]

2022-01-06 Thread GitBox
github-actions[bot] commented on pull request #12091: URL: https://github.com/apache/arrow/pull/12091#issuecomment-1006732723 https://issues.apache.org/jira/browse/ARROW-14798 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

[GitHub] [arrow] zeroshade commented on a change in pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-06 Thread GitBox
zeroshade commented on a change in pull request #11538: URL: https://github.com/apache/arrow/pull/11538#discussion_r779685988 ## File path: go/parquet/file/page_writer.go ## @@ -0,0 +1,466 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] wjones127 commented on a change in pull request #11894: ARROW-14029: [R] Repair map_batches()

2022-01-06 Thread GitBox
wjones127 commented on a change in pull request #11894: URL: https://github.com/apache/arrow/pull/11894#discussion_r779687169 ## File path: r/tests/testthat/test-dataset.R ## @@ -453,15 +453,38 @@ test_that("Creating UnionDataset", { }) test_that("map_batches", { - skip("m

[GitHub] [arrow] wjones127 commented on a change in pull request #11894: ARROW-14029: [R] Repair map_batches()

2022-01-06 Thread GitBox
wjones127 commented on a change in pull request #11894: URL: https://github.com/apache/arrow/pull/11894#discussion_r779694269 ## File path: r/tests/testthat/test-dataset.R ## @@ -453,15 +453,38 @@ test_that("Creating UnionDataset", { }) test_that("map_batches", { - skip("m

[GitHub] [arrow] wjones127 commented on a change in pull request #11894: ARROW-14029: [R] Repair map_batches()

2022-01-06 Thread GitBox
wjones127 commented on a change in pull request #11894: URL: https://github.com/apache/arrow/pull/11894#discussion_r779696866 ## File path: r/R/dataset-scan.R ## @@ -174,8 +174,6 @@ ScanTask <- R6Class("ScanTask", #' a `data.frame` for further aggregation, even if you couldn't

[GitHub] [arrow] jonkeane commented on a change in pull request #11894: ARROW-14029: [R] Repair map_batches()

2022-01-06 Thread GitBox
jonkeane commented on a change in pull request #11894: URL: https://github.com/apache/arrow/pull/11894#discussion_r779698651 ## File path: r/tests/testthat/test-dataset.R ## @@ -453,15 +453,38 @@ test_that("Creating UnionDataset", { }) test_that("map_batches", { - skip("ma

[GitHub] [arrow] jonkeane commented on a change in pull request #11894: ARROW-14029: [R] Repair map_batches()

2022-01-06 Thread GitBox
jonkeane commented on a change in pull request #11894: URL: https://github.com/apache/arrow/pull/11894#discussion_r779700156 ## File path: r/R/dataset-scan.R ## @@ -174,8 +174,6 @@ ScanTask <- R6Class("ScanTask", #' a `data.frame` for further aggregation, even if you couldn't

[GitHub] [arrow] jonkeane commented on a change in pull request #11894: ARROW-14029: [R] Repair map_batches()

2022-01-06 Thread GitBox
jonkeane commented on a change in pull request #11894: URL: https://github.com/apache/arrow/pull/11894#discussion_r779700853 ## File path: r/tests/testthat/test-dataset.R ## @@ -453,15 +453,38 @@ test_that("Creating UnionDataset", { }) test_that("map_batches", { - skip("ma

[GitHub] [arrow] westonpace commented on issue #11846: Crash when import/export between C++ and Rust

2022-01-06 Thread GitBox
westonpace commented on issue #11846: URL: https://github.com/apache/arrow/issues/11846#issuecomment-1006754217 @hu6360567 I noticed that you reopened this issue. Is there still work to be done here? -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [arrow] westonpace closed issue #11661: [Python] hi, are there any ways to copy files from hdfs to local or vice versa, but preserving metadata as the last modification date, etc?

2022-01-06 Thread GitBox
westonpace closed issue #11661: URL: https://github.com/apache/arrow/issues/11661 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr.

[GitHub] [arrow] tachyonwill commented on a change in pull request #11984: PARQUET-2109: [C++] Check if Parquet page has too few values

2022-01-06 Thread GitBox
tachyonwill commented on a change in pull request #11984: URL: https://github.com/apache/arrow/pull/11984#discussion_r779717505 ## File path: cpp/src/parquet/column_reader.cc ## @@ -970,6 +970,9 @@ int64_t TypedColumnReaderImpl::ReadBatchWithDictionary( // Read dictionary i

[GitHub] [arrow] lidavidm commented on pull request #12085: ARROW-15248: [C++][Docs] Improve docs about linting/formatting

2022-01-06 Thread GitBox
lidavidm commented on pull request #12085: URL: https://github.com/apache/arrow/pull/12085#issuecomment-1006771350 Thanks for all the suggestions, I've tweaked the wording. Rendered ![image](https://user-images.githubusercontent.com/327919/148424872-be0e7006-aa8b-4554-

[GitHub] [arrow] tachyonwill commented on a change in pull request #11984: PARQUET-2109: [C++] Check if Parquet page has too few values

2022-01-06 Thread GitBox
tachyonwill commented on a change in pull request #11984: URL: https://github.com/apache/arrow/pull/11984#discussion_r779722792 ## File path: cpp/src/parquet/column_reader.cc ## @@ -993,6 +996,9 @@ int64_t TypedColumnReaderImpl::ReadBatch(int64_t batch_size, int16_t* def

[GitHub] [arrow-rs] helgikrs opened a new pull request #1140: feat(ipc): support for reading union arrays through IPC

2022-01-06 Thread GitBox
helgikrs opened a new pull request #1140: URL: https://github.com/apache/arrow-rs/pull/1140 # Which issue does this PR close? Together with #1135, closes #654. # What changes are included in this PR? With this change, IPC serialized arrow data containing unions can be d

[GitHub] [arrow] wjones127 commented on a change in pull request #11894: ARROW-14029: [R] Repair map_batches()

2022-01-06 Thread GitBox
wjones127 commented on a change in pull request #11894: URL: https://github.com/apache/arrow/pull/11894#discussion_r779754498 ## File path: r/R/dataset-scan.R ## @@ -185,17 +183,36 @@ ScanTask <- R6Class("ScanTask", #' `data.frame`? Default `TRUE` #' @export map_batches <- f

[GitHub] [arrow] zeroshade commented on a change in pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-06 Thread GitBox
zeroshade commented on a change in pull request #11538: URL: https://github.com/apache/arrow/pull/11538#discussion_r779755973 ## File path: go/parquet/file/page_writer.go ## @@ -0,0 +1,466 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] zeroshade commented on a change in pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-06 Thread GitBox
zeroshade commented on a change in pull request #11538: URL: https://github.com/apache/arrow/pull/11538#discussion_r779756809 ## File path: go/parquet/internal/testutils/primitive_typed.go ## @@ -0,0 +1,305 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// o

[GitHub] [arrow] zeroshade commented on pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-06 Thread GitBox
zeroshade commented on pull request #11538: URL: https://github.com/apache/arrow/pull/11538#issuecomment-1006808664 @emkornfield Updated from feedback, rebased/merged changes from master. Let me know if you're happy with the changes/fixes I made from your comments. Thanks again! -- This

[GitHub] [arrow] wjones127 commented on a change in pull request #11894: ARROW-14029: [R] Repair map_batches()

2022-01-06 Thread GitBox
wjones127 commented on a change in pull request #11894: URL: https://github.com/apache/arrow/pull/11894#discussion_r779761467 ## File path: r/R/dataset-scan.R ## @@ -174,8 +174,6 @@ ScanTask <- R6Class("ScanTask", #' a `data.frame` for further aggregation, even if you couldn't

[GitHub] [arrow] ursabot edited a comment on pull request #12082: MINOR: [Docs] Add fill_null functions to docs

2022-01-06 Thread GitBox
ursabot edited a comment on pull request #12082: URL: https://github.com/apache/arrow/pull/12082#issuecomment-1006067549 Benchmark runs are scheduled for baseline = 4d8fe01c0ce6804f1d30e156756707e6c0daf9d2 and contender = 63010622e6d817321106ee392eff69962806c2b3. 63010622e6d817321106ee392

[GitHub] [arrow] emkornfield commented on pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-06 Thread GitBox
emkornfield commented on pull request #11538: URL: https://github.com/apache/arrow/pull/11538#issuecomment-1006815354 Also I forget if I CCed you on https://issues.apache.org/jira/browse/PARQUET-2067 -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [arrow] wjones127 commented on a change in pull request #11894: ARROW-14029: [R] Repair map_batches()

2022-01-06 Thread GitBox
wjones127 commented on a change in pull request #11894: URL: https://github.com/apache/arrow/pull/11894#discussion_r779764431 ## File path: r/tests/testthat/test-dataset.R ## @@ -453,15 +453,38 @@ test_that("Creating UnionDataset", { }) test_that("map_batches", { - skip("m
