[GitHub] [arrow] amol- commented on a change in pull request #11886: ARROW-13035: [C++] indices_nonzero compute function

2022-01-04 Thread GitBox
amol- commented on a change in pull request #11886: URL: https://github.com/apache/arrow/pull/11886#discussion_r777976116 ## File path: cpp/src/arrow/compute/kernels/vector_selection.cc ## @@ -2378,26 +2378,44 @@ struct NonZeroVisitor { using T = typename GetViewType::T;

[GitHub] [arrow] amol- commented on a change in pull request #11886: ARROW-13035: [C++] indices_nonzero compute function

2022-01-04 Thread GitBox
amol- commented on a change in pull request #11886: URL: https://github.com/apache/arrow/pull/11886#discussion_r777980504 ## File path: cpp/src/arrow/compute/kernels/vector_selection.cc ## @@ -2355,6 +2358,98 @@ const FunctionDoc array_take_doc( "given by `indices`. Null

[GitHub] [arrow] amol- commented on a change in pull request #11886: ARROW-13035: [C++] indices_nonzero compute function

2022-01-04 Thread GitBox
amol- commented on a change in pull request #11886: URL: https://github.com/apache/arrow/pull/11886#discussion_r777981323 ## File path: cpp/src/arrow/compute/kernels/vector_selection.cc ## @@ -2355,6 +2358,98 @@ const FunctionDoc array_take_doc( "given by `indices`. Null

[GitHub] [arrow] zhixingheyi-tian commented on a change in pull request #11763: ARROW-14153: [C++][Dataset] Add support for batch_size in the ORC Scanner

2022-01-04 Thread GitBox
zhixingheyi-tian commented on a change in pull request #11763: URL: https://github.com/apache/arrow/pull/11763#discussion_r777983208 ## File path: cpp/src/arrow/adapters/orc/adapter.h ## @@ -231,6 +231,19 @@ class ARROW_EXPORT ORCFileReader { Status NextStripeReader(int64_t

[GitHub] [arrow] pitrou closed pull request #11841: ARROW-13923: [C++] Faster CSV chunker with long CSV cells

2022-01-04 Thread GitBox
pitrou closed pull request #11841: URL: https://github.com/apache/arrow/pull/11841 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

[GitHub] [arrow-rs] tustvold commented on pull request #1130: Fix reading of dictionary encoded pages with null values (#1111)

2022-01-04 Thread GitBox
tustvold commented on pull request #1130: URL: https://github.com/apache/arrow-rs/pull/1130#issuecomment-1004710388 Thank you for this :tada:. I will take a look today if I have time

[GitHub] [arrow-rs] alamb commented on a change in pull request #1130: Fix reading of dictionary encoded pages with null values (#1111)

2022-01-04 Thread GitBox
alamb commented on a change in pull request #1130: URL: https://github.com/apache/arrow-rs/pull/1130#discussion_r778014042 ## File path: parquet/src/arrow/arrow_array_reader.rs ## @@ -420,6 +456,24 @@ impl<'a, C: ArrayConverter + 'a> ArrowArrayReader<'a, C> { } }

[GitHub] [arrow-datafusion] alamb commented on pull request #1477: Fix SortExec discards field metadata on the output schema

2022-01-04 Thread GitBox
alamb commented on pull request #1477: URL: https://github.com/apache/arrow-datafusion/pull/1477#issuecomment-1004741399 > @alamb If I recall the reason this change was made was because in our cases the schema was stripped of the timezone information but the underlying record batches pres

[GitHub] [arrow-rs] alamb commented on issue #1128: Implement `Array` for `ArrayRef`

2022-01-04 Thread GitBox
alamb commented on issue #1128: URL: https://github.com/apache/arrow-rs/issues/1128#issuecomment-1004742974 Another challenge for ``` fn as_primitive_array<'a, T>(arr: &'a impl AsRef) -> &'a PrimitiveArray ``` Is that I think all the existing callsites like `as_primtive

[GitHub] [arrow] zhixingheyi-tian commented on a change in pull request #11763: ARROW-14153: [C++][Dataset] Add support for batch_size in the ORC Scanner

2022-01-04 Thread GitBox
zhixingheyi-tian commented on a change in pull request #11763: URL: https://github.com/apache/arrow/pull/11763#discussion_r778025410 ## File path: cpp/src/arrow/dataset/file_orc.cc ## @@ -85,24 +85,20 @@ class OrcScanTask : public ScanTask { included_fields.push_back

[GitHub] [arrow] pitrou opened a new pull request #12069: ARROW-15243: [CI][Python] Make PyArrow installation more robust in CI

2022-01-04 Thread GitBox
pitrou opened a new pull request #12069: URL: https://github.com/apache/arrow/pull/12069 On Debian/Ubuntu, the system-provided setuptools is patched to install into the system-specific `dist-packages` directory. However, the newer setuptools from upstream would install PyArrow in the

[GitHub] [arrow] pitrou commented on pull request #12069: ARROW-15243: [CI][Python] Make PyArrow installation more robust in CI

2022-01-04 Thread GitBox
pitrou commented on pull request #12069: URL: https://github.com/apache/arrow/pull/12069#issuecomment-1004753094 @github-actions crossbow submit -g python

[GitHub] [arrow] github-actions[bot] commented on pull request #12069: ARROW-15243: [CI][Python] Make PyArrow installation more robust in CI

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #12069: URL: https://github.com/apache/arrow/pull/12069#issuecomment-1004753133

[GitHub] [arrow] github-actions[bot] commented on pull request #12069: ARROW-15243: [CI][Python] Make PyArrow installation more robust in CI

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #12069: URL: https://github.com/apache/arrow/pull/12069#issuecomment-1004753842 Revision: b79a50a38a5677e9d0ea810ef01829c4cad30d58 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1362](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] paleolimbot opened a new pull request #12070: ARROW-13087: [R] Expose Parquet ArrowReaderProperties::coerce_int96_timestamp_unit_

2022-01-04 Thread GitBox
paleolimbot opened a new pull request #12070: URL: https://github.com/apache/arrow/pull/12070 This PR adds a setter and getter for `ArrowReaderProperties::coerce_int96_timestamp_unit_`. Reprex: ``` r library(arrow, warn.conflicts = FALSE) tf <- tempfile() write_pa

[GitHub] [arrow] github-actions[bot] commented on pull request #12070: ARROW-13087: [R] Expose Parquet ArrowReaderProperties::coerce_int96_timestamp_unit_

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #12070: URL: https://github.com/apache/arrow/pull/12070#issuecomment-1004760367

[GitHub] [arrow] pitrou commented on pull request #12069: ARROW-15243: [CI][Python] Make PyArrow installation more robust in CI

2022-01-04 Thread GitBox
pitrou commented on pull request #12069: URL: https://github.com/apache/arrow/pull/12069#issuecomment-1004774814 @github-actions crossbow submit -g python

[GitHub] [arrow] github-actions[bot] commented on pull request #12069: ARROW-15243: [CI][Python] Make PyArrow installation more robust in CI

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #12069: URL: https://github.com/apache/arrow/pull/12069#issuecomment-1004775652 Revision: 491ea6cffc8592f26a9f3c7334ac15781aa65684 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1363](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] lidavidm commented on a change in pull request #12066: ARROW-15241: [C++] MakeArrayOfNull fails on extension types with a nested storage type

2022-01-04 Thread GitBox
lidavidm commented on a change in pull request #12066: URL: https://github.com/apache/arrow/pull/12066#discussion_r778084326 ## File path: cpp/src/arrow/array/util.cc ## @@ -511,6 +511,9 @@ class NullArrayFactory { } Status Visit(const ExtensionType& type) { +std::v

[GitHub] [arrow] pitrou commented on pull request #12069: ARROW-15243: [CI][Python] Make PyArrow installation more robust in CI

2022-01-04 Thread GitBox
pitrou commented on pull request #12069: URL: https://github.com/apache/arrow/pull/12069#issuecomment-1004817151 @github-actions crossbow submit -g python

[GitHub] [arrow] github-actions[bot] commented on pull request #12069: ARROW-15243: [CI][Python] Make PyArrow installation more robust in CI

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #12069: URL: https://github.com/apache/arrow/pull/12069#issuecomment-1004817901 Revision: 6995ac404704b5f8c2d6965b44e82626228ed4ba Submitted crossbow builds: [ursacomputing/crossbow @ actions-1364](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] amol- commented on pull request #11886: ARROW-13035: [C++] indices_nonzero compute function

2022-01-04 Thread GitBox
amol- commented on pull request #11886: URL: https://github.com/apache/arrow/pull/11886#issuecomment-1004825795 > Thanks for the update, here are two more comments. Should have addressed all of them

[GitHub] [arrow] lidavidm commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
lidavidm commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778089595 ## File path: python/pyarrow/tests/test_compute.py ## @@ -1899,6 +1900,71 @@ def test_assume_timezone(): result.equals(pa.array(expected)) +def

[GitHub] [arrow] domoritz edited a comment on issue #11746: [JS] Where is the filter function?

2022-01-04 Thread GitBox
domoritz edited a comment on issue #11746: URL: https://github.com/apache/arrow/issues/11746#issuecomment-988826978 The docs are outdated. We only have filter on data frames. The data frame type extends table.

[GitHub] [arrow] github-actions[bot] commented on pull request #10371: ARROW-12549: [JS] Table and RecordBatch should not extend Vector, make JS lib smaller

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #10371: URL: https://github.com/apache/arrow/pull/10371#issuecomment-1004838632 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] lidavidm commented on a change in pull request #11853: ARROW-1699: [C++] forward, backward fill kernel functions

2022-01-04 Thread GitBox
lidavidm commented on a change in pull request #11853: URL: https://github.com/apache/arrow/pull/11853#discussion_r778106647 ## File path: cpp/src/arrow/compute/kernels/vector_replace_test.cc ## @@ -793,5 +851,772 @@ TYPED_TEST(TestReplaceBinary, ReplaceWithMaskRandom) { }

[GitHub] [arrow] lidavidm commented on a change in pull request #11853: ARROW-1699: [C++] forward, backward fill kernel functions

2022-01-04 Thread GitBox
lidavidm commented on a change in pull request #11853: URL: https://github.com/apache/arrow/pull/11853#discussion_r778107623 ## File path: cpp/src/arrow/compute/kernels/vector_replace_test.cc ## @@ -793,5 +851,772 @@ TYPED_TEST(TestReplaceBinary, ReplaceWithMaskRandom) { }

[GitHub] [arrow] coryan commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-04 Thread GitBox
coryan commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1004869230 @xhochy ping, is there something I can do to make this move forward?

[GitHub] [arrow] jonkeane closed pull request #11904: ARROW-15010: [R] Create a function registry for our NSE funcs

2022-01-04 Thread GitBox
jonkeane closed pull request #11904: URL: https://github.com/apache/arrow/pull/11904

[GitHub] [arrow] vibhatha commented on a change in pull request #12033: ARROW-15091: [C++][Doc] Document nodes in C++ streaming execution engine

2022-01-04 Thread GitBox
vibhatha commented on a change in pull request #12033: URL: https://github.com/apache/arrow/pull/12033#discussion_r778143970 ## File path: docs/source/cpp/streaming_execution.rst ## @@ -305,3 +305,601 @@ Datasets may be scanned multiple times; just make multiple scan nodes fr

[GitHub] [arrow] AlenkaF commented on pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

2022-01-04 Thread GitBox
AlenkaF commented on pull request #12010: URL: https://github.com/apache/arrow/pull/12010#issuecomment-1004884636 I have corrected the code so that `pylist` is meant to be structured as a list of dicts, one dict per row. If the column name is missing from the schema or from the data, `None
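
The row-oriented layout described in this comment can be sketched in plain Python. This is only an illustration of "a list of dicts, one dict per row", not the code from the pull request: `pylist_to_columns` and its `column_names` parameter are hypothetical names, and filling missing keys with `None` is an assumption based on the truncated sentence above.

```python
def pylist_to_columns(pylist, column_names=None):
    """Turn a list of row dicts into a dict mapping column name to values.

    Hypothetical helper for illustration; not part of pyarrow.
    """
    if column_names is None:
        # Collect keys in first-seen order across all rows.
        column_names = []
        for row in pylist:
            for key in row:
                if key not in column_names:
                    column_names.append(key)
    # Assumption: a key missing from a row becomes None in that column.
    return {name: [row.get(name) for row in pylist] for name in column_names}


rows = [{"a": 1, "b": "x"}, {"a": 2}]
columns = pylist_to_columns(rows)
print(columns)
```

Passing an explicit `column_names` list plays the role of a schema here: names absent from every row still produce a column, filled entirely with `None`.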

[GitHub] [arrow] pitrou closed pull request #12015: ARROW-15184: [C++] Unit tests of reading delta-encoded Parquet files with and without nulls

2022-01-04 Thread GitBox
pitrou closed pull request #12015: URL: https://github.com/apache/arrow/pull/12015

[GitHub] [arrow] pitrou opened a new pull request #12071: MINOR: Update testing submodule

2022-01-04 Thread GitBox
pitrou opened a new pull request #12071: URL: https://github.com/apache/arrow/pull/12071 The testing submodule had been mistakenly reverted to a previous changeset. See https://github.com/apache/arrow/pull/11911#issuecomment-998226398

[GitHub] [arrow-datafusion] ic4y opened a new pull request #1520: use bumpalo for GroupState

2022-01-04 Thread GitBox
ic4y opened a new pull request #1520: URL: https://github.com/apache/arrow-datafusion/pull/1520 # Which issue does this PR close? Closes #1504 # Rationale for this change Using bumpalo to allocate GroupState to a chunk of memory as much as possible # What cha

[GitHub] [arrow] pitrou commented on pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

2022-01-04 Thread GitBox
pitrou commented on pull request #11855: URL: https://github.com/apache/arrow/pull/11855#issuecomment-1004896133 Thanks for the update! I rebased and will merge if CI is green.

[GitHub] [arrow] pitrou commented on a change in pull request #12062: ARROW-15173: [R] Provide backward compatibility for bridge to older versions of pyarrow

2022-01-04 Thread GitBox
pitrou commented on a change in pull request #12062: URL: https://github.com/apache/arrow/pull/12062#discussion_r778163085 ## File path: r/R/python.R ## @@ -181,9 +257,15 @@ r_to_py.RecordBatchReader <- function(x, convert = FALSE) { pa <- reticulate::import("pyarrow", conve

[GitHub] [arrow] rok commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
rok commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778163857 ## File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc ## @@ -1324,6 +1357,578 @@ TEST_F(ScalarTemporalTest, TestTemporalDifferenceZoned) { }

[GitHub] [arrow] rok commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
rok commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778167172 ## File path: python/pyarrow/tests/test_compute.py ## @@ -1899,6 +1900,71 @@ def test_assume_timezone(): result.equals(pa.array(expected)) +def _chec

[GitHub] [arrow] rok commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
rok commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778170813 ## File path: cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc ## @@ -452,6 +474,296 @@ struct Nanosecond { } }; +// -

[GitHub] [arrow] rok commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
rok commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778172402 ## File path: python/pyarrow/tests/test_compute.py ## @@ -1899,6 +1900,71 @@ def test_assume_timezone(): result.equals(pa.array(expected)) +def _chec

[GitHub] [arrow] zeroshade commented on pull request #11538: ARROW-13986: [Go][Parquet] Add File Writers and tests

2022-01-04 Thread GitBox
zeroshade commented on pull request #11538: URL: https://github.com/apache/arrow/pull/11538#issuecomment-1004908750 @emkornfield Sorry I've been out for a few weeks + a week for being sick. I'd love to be able to get this merged in time for the v7 release if at all possible. Would you have

[GitHub] [arrow] amol- commented on a change in pull request #12010: ARROW-6001 [Python]: Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

2022-01-04 Thread GitBox
amol- commented on a change in pull request #12010: URL: https://github.com/apache/arrow/pull/12010#discussion_r778175902 ## File path: python/pyarrow/table.pxi ## @@ -2442,6 +2602,46 @@ def _from_pydict(cls, mapping, schema, metadata): raise TypeError('Schema must be

[GitHub] [arrow] github-actions[bot] commented on pull request #12072: ARROW-15235: [R] drop support for R 3.3

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #12072: URL: https://github.com/apache/arrow/pull/12072#issuecomment-1004913872 https://issues.apache.org/jira/browse/ARROW-15235

[GitHub] [arrow] jonkeane commented on pull request #12072: ARROW-15235: [R] drop support for R 3.3

2022-01-04 Thread GitBox
jonkeane commented on pull request #12072: URL: https://github.com/apache/arrow/pull/12072#issuecomment-1004913965 @github-actions crossbow submit test-r-versions

[GitHub] [arrow] github-actions[bot] commented on pull request #12072: ARROW-15235: [R] drop support for R 3.3

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #12072: URL: https://github.com/apache/arrow/pull/12072#issuecomment-1004914846 Revision: 9ca260a0553677280150734e1a6a5988453985cb Submitted crossbow builds: [ursacomputing/crossbow @ actions-1365](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] pitrou commented on a change in pull request #11946: ARROW-13663: [C++] RecordBatchReader STL-like iteration

2022-01-04 Thread GitBox
pitrou commented on a change in pull request #11946: URL: https://github.com/apache/arrow/pull/11946#discussion_r778168422 ## File path: cpp/src/arrow/record_batch.h ## @@ -234,6 +234,53 @@ class ARROW_EXPORT RecordBatchReader { return batch; } + class RecordBatchRea

[GitHub] [arrow] pitrou commented on pull request #11886: ARROW-13035: [C++] indices_nonzero compute function

2022-01-04 Thread GitBox
pitrou commented on pull request #11886: URL: https://github.com/apache/arrow/pull/11886#issuecomment-1004919285 Thanks for the update! I rebased and will merge if CI is green.

[GitHub] [arrow] pitrou commented on pull request #12072: ARROW-15235: [R] drop support for R 3.3

2022-01-04 Thread GitBox
pitrou commented on pull request #12072: URL: https://github.com/apache/arrow/pull/12072#issuecomment-1004922153 Shouldn't you update other occurrences? ``` r/DESCRIPTION 21:Depends: R (>= 3.3) r/tests/testthat/test-compute-aggregate.R 286:"The median generic lacks dot

[GitHub] [arrow] kszucs closed pull request #12069: ARROW-15243: [CI][Python] Make PyArrow installation more robust in CI

2022-01-04 Thread GitBox
kszucs closed pull request #12069: URL: https://github.com/apache/arrow/pull/12069

[GitHub] [arrow] kszucs merged pull request #12071: MINOR: Update testing submodule

2022-01-04 Thread GitBox
kszucs merged pull request #12071: URL: https://github.com/apache/arrow/pull/12071

[GitHub] [arrow] pitrou commented on pull request #11911: ARROW-15019: [Python] Add bindings for new dataset writing options

2022-01-04 Thread GitBox
pitrou commented on pull request #11911: URL: https://github.com/apache/arrow/pull/11911#issuecomment-1004927985 Submodule updated in https://github.com/apache/arrow/pull/12071

[GitHub] [arrow] pitrou closed pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

2022-01-04 Thread GitBox
pitrou closed pull request #11855: URL: https://github.com/apache/arrow/pull/11855

[GitHub] [arrow-datafusion] Jimexist merged pull request #1518: remove python wrapper and redirect to the contrib repo

2022-01-04 Thread GitBox
Jimexist merged pull request #1518: URL: https://github.com/apache/arrow-datafusion/pull/1518

[GitHub] [arrow-datafusion] Jimexist closed issue #1324: move datafusion-python to the contrib org?

2022-01-04 Thread GitBox
Jimexist closed issue #1324: URL: https://github.com/apache/arrow-datafusion/issues/1324

[GitHub] [arrow] pitrou commented on pull request #12055: ARROW-11989: [C++][Python] Improve ChunkedArray's complexity for the access of elements

2022-01-04 Thread GitBox
pitrou commented on pull request #12055: URL: https://github.com/apache/arrow/pull/12055#issuecomment-1004932886 I think we should simply factor out and reuse the chunk resolver from `arrow/compute/kernels/chunked_internal.h`
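
For readers unfamiliar with the idea, a chunk resolver maps a logical index in a chunked array to a (chunk, offset-within-chunk) pair. The sketch below is a minimal Python illustration assuming a binary search over cumulative chunk offsets; the class name `ChunkResolverSketch` and its methods are hypothetical and do not reproduce the C++ code in `chunked_internal.h`.

```python
from bisect import bisect_right
from itertools import accumulate


class ChunkResolverSketch:
    """Hypothetical illustration of resolving a logical index to a chunk."""

    def __init__(self, chunk_lengths):
        # offsets[i] is the logical index of the first element of chunk i;
        # the final entry is the total length across all chunks.
        self.offsets = [0] + list(accumulate(chunk_lengths))

    def resolve(self, index):
        if not 0 <= index < self.offsets[-1]:
            raise IndexError(index)
        # Last chunk whose starting offset is <= index. bisect_right steps
        # past equal offsets, so empty chunks are skipped.
        chunk = bisect_right(self.offsets, index) - 1
        return chunk, index - self.offsets[chunk]


resolver = ChunkResolverSketch([3, 0, 2])
print(resolver.resolve(3))
```

The offsets are computed once, so each lookup costs O(log n) in the number of chunks rather than a linear walk over the chunks.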

[GitHub] [arrow] pitrou commented on a change in pull request #12019: (docs) Clarify that offsets are monotonic for binary like arrays

2022-01-04 Thread GitBox
pitrou commented on a change in pull request #12019: URL: https://github.com/apache/arrow/pull/12019#discussion_r778202893 ## File path: docs/source/format/Columnar.rst ## @@ -309,7 +309,11 @@ That is, a null value may occupy a **non-empty** memory space in the data buffer. W

[GitHub] [arrow] dhruv9vats commented on a change in pull request #11946: ARROW-13663: [C++] RecordBatchReader STL-like iteration

2022-01-04 Thread GitBox
dhruv9vats commented on a change in pull request #11946: URL: https://github.com/apache/arrow/pull/11946#discussion_r778203973 ## File path: cpp/src/arrow/record_batch.h ## @@ -234,6 +234,53 @@ class ARROW_EXPORT RecordBatchReader { return batch; } + class RecordBatc

[GitHub] [arrow] github-actions[bot] commented on pull request #12019: ARROW-15244: [Format] Clarify that offsets are monotonic for binary like arrays

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #12019: URL: https://github.com/apache/arrow/pull/12019#issuecomment-1004941015

[GitHub] [arrow-datafusion] ic4y commented on pull request #1520: use bumpalo for GroupState

2022-01-04 Thread GitBox
ic4y commented on pull request #1520: URL: https://github.com/apache/arrow-datafusion/pull/1520#issuecomment-1004941980 From ```rust struct Accumulators { map: RawTable<(u64, usize)>, group_states: Vec, } ``` To ```rust struct Accumulators {

[GitHub] [arrow] edponce commented on pull request #12055: ARROW-11989: [C++][Python] Improve ChunkedArray's complexity for the access of elements

2022-01-04 Thread GitBox
edponce commented on pull request #12055: URL: https://github.com/apache/arrow/pull/12055#issuecomment-1004941921 @pitrou Thanks! I had no idea of the existence of `ChunkResolver`.

[GitHub] [arrow] pitrou commented on pull request #12055: ARROW-11989: [C++][Python] Improve ChunkedArray's complexity for the access of elements

2022-01-04 Thread GitBox
pitrou commented on pull request #12055: URL: https://github.com/apache/arrow/pull/12055#issuecomment-1004942494 Sorry. I should have mentioned it on the JIRA.

[GitHub] [arrow] pitrou commented on pull request #11360: ARROW-13610: [R] Unvendor cpp11

2022-01-04 Thread GitBox
pitrou commented on pull request #11360: URL: https://github.com/apache/arrow/pull/11360#issuecomment-1004945800 Is an update needed here?

[GitHub] [arrow-datafusion] matthewmturner commented on issue #907: S3 Support

2022-01-04 Thread GitBox
matthewmturner commented on issue #907: URL: https://github.com/apache/arrow-datafusion/issues/907#issuecomment-1004945831 @houqp @alamb I'm very interested in getting s3 / glue support added to datafusion cli. do you know of a way to feature gate other libraries without adding their code

[GitHub] [arrow] pitrou closed pull request #12019: ARROW-15244: [Format] Clarify that offsets are monotonic for binary like arrays

2022-01-04 Thread GitBox
pitrou closed pull request #12019: URL: https://github.com/apache/arrow/pull/12019

[GitHub] [arrow] jonkeane commented on pull request #11360: ARROW-13610: [R] Unvendor cpp11

2022-01-04 Thread GitBox
jonkeane commented on pull request #11360: URL: https://github.com/apache/arrow/pull/11360#issuecomment-1004949714 Yeah, I haven't had a chance to see what's causing RTools 3.5 to fail with this change, but we need to address that before we merge

[GitHub] [arrow] vibhatha commented on a change in pull request #12033: ARROW-15091: [C++][Doc] Document nodes in C++ streaming execution engine

2022-01-04 Thread GitBox
vibhatha commented on a change in pull request #12033: URL: https://github.com/apache/arrow/pull/12033#discussion_r778214202 ## File path: docs/source/cpp/streaming_execution.rst ## @@ -305,3 +305,601 @@ Datasets may be scanned multiple times; just make multiple scan nodes fr

[GitHub] [arrow] vibhatha commented on a change in pull request #12033: ARROW-15091: [C++][Doc] Document nodes in C++ streaming execution engine

2022-01-04 Thread GitBox
vibhatha commented on a change in pull request #12033: URL: https://github.com/apache/arrow/pull/12033#discussion_r778215575 ## File path: docs/source/cpp/streaming_execution.rst ## @@ -305,3 +305,601 @@ Datasets may be scanned multiple times; just make multiple scan nodes fr

[GitHub] [arrow] jonkeane commented on pull request #12072: ARROW-15235: [R] drop support for R 3.3

2022-01-04 Thread GitBox
jonkeane commented on pull request #12072: URL: https://github.com/apache/arrow/pull/12072#issuecomment-1004954409 Thanks for catching those, I'll update the relevant ones (most of those are false positives/reference other things). At first I didn't want to pull out the various 3.3 bits (si

[GitHub] [arrow] zeroshade commented on a change in pull request #12009: ARROW-15172: [Go] Add Arm64 Neon implementation for Arrow-math

2022-01-04 Thread GitBox
zeroshade commented on a change in pull request #12009: URL: https://github.com/apache/arrow/pull/12009#discussion_r778217652 ## File path: go/arrow/math/Makefile ## @@ -37,27 +41,42 @@ INTEL_SOURCES := \ int64_avx2_amd64.s int64_sse4_amd64.s \ uint64_avx2_amd64

[GitHub] [arrow] zeroshade commented on pull request #11514: ARROW-14430: [Go] Basic Expression, Field Reference and Datum handling

2022-01-04 Thread GitBox
zeroshade commented on pull request #11514: URL: https://github.com/apache/arrow/pull/11514#issuecomment-1004956504 @emkornfield As with the parquet writer, if at all possible, I'd like to get this merged for v7 if you can carve some time out to give this a review. Thanks much!

[GitHub] [arrow] paleolimbot commented on a change in pull request #12062: ARROW-15173: [R] Provide backward compatibility for bridge to older versions of pyarrow

2022-01-04 Thread GitBox
paleolimbot commented on a change in pull request #12062: URL: https://github.com/apache/arrow/pull/12062#discussion_r778225004 ## File path: r/R/python.R ## @@ -181,9 +257,15 @@ r_to_py.RecordBatchReader <- function(x, convert = FALSE) { pa <- reticulate::import("pyarrow",

[GitHub] [arrow] vibhatha commented on a change in pull request #12033: ARROW-15091: [C++][Doc] Document nodes in C++ streaming execution engine

2022-01-04 Thread GitBox
vibhatha commented on a change in pull request #12033: URL: https://github.com/apache/arrow/pull/12033#discussion_r778232832 ## File path: docs/source/cpp/streaming_execution.rst ## @@ -175,9 +175,607 @@ their completion:: // alive until this future is marked finished.

[GitHub] [arrow] vibhatha commented on a change in pull request #12033: ARROW-15091: [C++][Doc] Document nodes in C++ streaming execution engine

2022-01-04 Thread GitBox
vibhatha commented on a change in pull request #12033: URL: https://github.com/apache/arrow/pull/12033#discussion_r778233710 ## File path: docs/source/cpp/streaming_execution.rst ## @@ -175,9 +175,607 @@ their completion:: // alive until this future is marked finished.

[GitHub] [arrow] lidavidm commented on a change in pull request #12033: ARROW-15091: [C++][Doc] Document nodes in C++ streaming execution engine

2022-01-04 Thread GitBox
lidavidm commented on a change in pull request #12033: URL: https://github.com/apache/arrow/pull/12033#discussion_r778073274 ## File path: cpp/examples/arrow/execution_plan_documentation_examples.cc ## @@ -177,6 +177,16 @@ cp::Expression Materialize(std::vector names, retu

[GitHub] [arrow] lidavidm commented on pull request #12033: ARROW-15091: [C++][Doc] Document nodes in C++ streaming execution engine

2022-01-04 Thread GitBox
lidavidm commented on pull request #12033: URL: https://github.com/apache/arrow/pull/12033#issuecomment-1004989201 Can we add a placeholder section and link to that perhaps?

[GitHub] [arrow] vibhatha commented on a change in pull request #12033: ARROW-15091: [C++][Doc] Document nodes in C++ streaming execution engine

2022-01-04 Thread GitBox
vibhatha commented on a change in pull request #12033: URL: https://github.com/apache/arrow/pull/12033#discussion_r778234433 ## File path: docs/source/cpp/streaming_execution.rst ## @@ -175,9 +175,607 @@ their completion:: // alive until this future is marked finished.

[GitHub] [arrow] vibhatha commented on a change in pull request #12033: ARROW-15091: [C++][Doc] Document nodes in C++ streaming execution engine

2022-01-04 Thread GitBox
vibhatha commented on a change in pull request #12033: URL: https://github.com/apache/arrow/pull/12033#discussion_r778239472 ## File path: docs/source/cpp/streaming_execution.rst ## @@ -175,9 +175,607 @@ their completion:: // alive until this future is marked finished.

[GitHub] [arrow] lidavidm commented on a change in pull request #12033: ARROW-15091: [C++][Doc] Document nodes in C++ streaming execution engine

2022-01-04 Thread GitBox
lidavidm commented on a change in pull request #12033: URL: https://github.com/apache/arrow/pull/12033#discussion_r778242335 ## File path: docs/source/cpp/streaming_execution.rst ## @@ -305,3 +305,484 @@ Datasets may be scanned multiple times; just make multiple scan nodes fr

[GitHub] [arrow] lidavidm commented on pull request #12033: ARROW-15091: [C++][Doc] Document nodes in C++ streaming execution engine

2022-01-04 Thread GitBox
lidavidm commented on pull request #12033: URL: https://github.com/apache/arrow/pull/12033#issuecomment-1005004369 Also, just a thought about some of the longer code blocks in reST…how about using the `literalinclude` directive to reference the example instead? It will be a little painful

[GitHub] [arrow] github-actions[bot] commented on pull request #12073: ARROW-14919: [R] write_parquet() drops attributes for grouped dataframes

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #12073: URL: https://github.com/apache/arrow/pull/12073#issuecomment-1005011917

[GitHub] [arrow] emkornfield commented on pull request #11514: ARROW-14430: [Go] Basic Expression, Field Reference and Datum handling

2022-01-04 Thread GitBox
emkornfield commented on pull request #11514: URL: https://github.com/apache/arrow/pull/11514#issuecomment-1005023871 Yes, sorry I will take a look tonight, hopefully getting back into a routine this week. Happy new year.

[GitHub] [arrow-rs] sunchao commented on pull request #1126: Replace ambiguous Any with All in comments

2022-01-04 Thread GitBox
sunchao commented on pull request #1126: URL: https://github.com/apache/arrow-rs/pull/1126#issuecomment-1005024566 Merged, thanks!

[GitHub] [arrow-rs] sunchao merged pull request #1126: Replace ambiguous Any with All in comments

2022-01-04 Thread GitBox
sunchao merged pull request #1126: URL: https://github.com/apache/arrow-rs/pull/1126

[GitHub] [arrow] zeroshade commented on pull request #11514: ARROW-14430: [Go] Basic Expression, Field Reference and Datum handling

2022-01-04 Thread GitBox
zeroshade commented on pull request #11514: URL: https://github.com/apache/arrow/pull/11514#issuecomment-1005025162 Thanks much and happy new year to you too! :) Hope all is well, I'm literally just recovering from Covid myself so I completely understand haha. :)

[GitHub] [arrow-rs] domoritz commented on issue #1064: Docs: Improve clarity by rewriting 'Any' -> 'All unless overwritten'

2022-01-04 Thread GitBox
domoritz commented on issue #1064: URL: https://github.com/apache/arrow-rs/issues/1064#issuecomment-1005038465 Fixed in https://github.com/apache/arrow-rs/pull/1126

[GitHub] [arrow] jonkeane commented on pull request #12072: ARROW-15235: [R] drop support for R 3.3

2022-01-04 Thread GitBox
jonkeane commented on pull request #12072: URL: https://github.com/apache/arrow/pull/12072#issuecomment-1005051636 @github-actions crossbow submit test-r-versions test-r-linux-as-cran test-r-arrow-backwards-compatibility

[GitHub] [arrow] github-actions[bot] commented on pull request #12072: ARROW-15235: [R] drop support for R 3.3

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #12072: URL: https://github.com/apache/arrow/pull/12072#issuecomment-1005052370 Revision: 210296a0f4d0445d4a3c8612332082d90ebe119f Submitted crossbow builds: [ursacomputing/crossbow @ actions-1366](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] rok commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
rok commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778289913 ## File path: cpp/src/arrow/compute/api_scalar.h ## @@ -90,6 +90,36 @@ class ARROW_EXPORT RoundOptions : public FunctionOptions { RoundMode round_mode; };

[GitHub] [arrow] rok commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
rok commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778291501 ## File path: python/pyarrow/tests/test_compute.py ## @@ -1899,6 +1900,71 @@ def test_assume_timezone(): result.equals(pa.array(expected)) +def _chec

[GitHub] [arrow] rok commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
rok commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778301184 ## File path: python/pyarrow/tests/test_compute.py ## @@ -1899,6 +1900,71 @@ def test_assume_timezone(): result.equals(pa.array(expected)) +def _chec

[GitHub] [arrow] rok commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
rok commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778301184 ## File path: python/pyarrow/tests/test_compute.py ## @@ -1899,6 +1900,71 @@ def test_assume_timezone(): result.equals(pa.array(expected)) +def _chec

[GitHub] [arrow] github-actions[bot] commented on pull request #12074: ARROW-15245: [Go] Address most of the staticcheck linting issues.

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #12074: URL: https://github.com/apache/arrow/pull/12074#issuecomment-1005074821

[GitHub] [arrow] lidavidm commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
lidavidm commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778306321 ## File path: python/pyarrow/tests/test_compute.py ## @@ -1899,6 +1900,71 @@ def test_assume_timezone(): result.equals(pa.array(expected)) +def

[GitHub] [arrow] rok commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
rok commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778307918 ## File path: python/pyarrow/tests/test_compute.py ## @@ -1899,6 +1900,71 @@ def test_assume_timezone(): result.equals(pa.array(expected)) +def _chec

[GitHub] [arrow] rok commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
rok commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778291501 ## File path: python/pyarrow/tests/test_compute.py ## @@ -1899,6 +1900,71 @@ def test_assume_timezone(): result.equals(pa.array(expected)) +def _chec

[GitHub] [arrow] rok commented on a change in pull request #11818: ARROW-14822: [C++] Implement floor/ceil/round for temporal objects

2022-01-04 Thread GitBox
rok commented on a change in pull request #11818: URL: https://github.com/apache/arrow/pull/11818#discussion_r778315758 ## File path: python/pyarrow/tests/test_compute.py ## @@ -1899,6 +1900,71 @@ def test_assume_timezone(): result.equals(pa.array(expected)) +def _chec

[GitHub] [arrow] WillAyd commented on a change in pull request #11990: ARROW-15032: [C++] Add DateStruct Function

2022-01-04 Thread GitBox
WillAyd commented on a change in pull request #11990: URL: https://github.com/apache/arrow/pull/11990#discussion_r778320769 ## File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc ## @@ -546,6 +591,23 @@ TEST_F(ScalarTemporalTest, TestZoned2) { auto unit = time

[GitHub] [arrow] paleolimbot opened a new pull request #12075: ARROW-15193: [R][Documentation] Update R binding documentation

2022-01-04 Thread GitBox
paleolimbot opened a new pull request #12075: URL: https://github.com/apache/arrow/pull/12075 Now that #11904 is merged (ARROW-15010), we have slightly different syntax for defining bindings between compute C++ and R functions. @thisisnic wrote some excellent documentation for creating bin

[GitHub] [arrow] github-actions[bot] commented on pull request #12075: ARROW-15193: [R][Documentation] Update R binding documentation

2022-01-04 Thread GitBox
github-actions[bot] commented on pull request #12075: URL: https://github.com/apache/arrow/pull/12075#issuecomment-1005101521
