[GitHub] [arrow] github-actions[bot] commented on pull request #12795: ARROW-16102: [C++] Add support for building with system gRPC and bundled GCS

2022-04-07 Thread GitBox
github-actions[bot] commented on PR #12795: URL: https://github.com/apache/arrow/pull/12795#issuecomment-1092502885 Revision: 3e279ab2928f9498909e106d3d5a89d266d3f82c Submitted crossbow builds: [ursacomputing/crossbow @ actions-1832](https://github.com/ursacomputing/crossbow/branches/

[GitHub] [arrow] kou commented on pull request #12795: ARROW-16102: [C++] Add support for building with system gRPC and bundled GCS

2022-04-07 Thread GitBox
kou commented on PR #12795: URL: https://github.com/apache/arrow/pull/12795#issuecomment-1092501994 @github-actions crossbow submit -g cpp -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec

[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #2156: Add an InSet as an optimized version for IN_LIST

2022-04-07 Thread GitBox
Ted-Jiang commented on code in PR #2156: URL: https://github.com/apache/arrow-datafusion/pull/2156#discussion_r845784661 ## datafusion/physical-expr/src/expressions/in_list.rs: ## @@ -32,13 +33,19 @@ use arrow::{ record_batch::RecordBatch, }; -use crate::PhysicalExpr; +u

[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #2156: Add an InSet as an optimized version for IN_LIST

2022-04-07 Thread GitBox
Ted-Jiang commented on code in PR #2156: URL: https://github.com/apache/arrow-datafusion/pull/2156#discussion_r845783834 ## datafusion/physical-expr/src/expressions/in_list.rs: ## @@ -32,13 +33,19 @@ use arrow::{ record_batch::RecordBatch, }; -use crate::PhysicalExpr; +u

[GitHub] [arrow] vibhatha commented on a diff in pull request #12672: ARROW-15779: [Python] Create python bindings for Substrait consumer

2022-04-07 Thread GitBox
vibhatha commented on code in PR #12672: URL: https://github.com/apache/arrow/pull/12672#discussion_r845780769 ## cpp/src/arrow/engine/substrait/serde_test.cc: ## @@ -724,5 +728,99 @@ TEST(Substrait, ExtensionSetFromPlan) { EXPECT_EQ(decoded_add_func.name, "add"); } +TEST(

[GitHub] [arrow] vibhatha commented on a diff in pull request #12672: ARROW-15779: [Python] Create python bindings for Substrait consumer

2022-04-07 Thread GitBox
vibhatha commented on code in PR #12672: URL: https://github.com/apache/arrow/pull/12672#discussion_r845776851 ## python/pyarrow/tests/test_substrait.py: ## @@ -0,0 +1,91 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

[GitHub] [arrow-datafusion] Dandandan commented on a diff in pull request #2156: Add an InSet as an optimized version for IN_LIST

2022-04-07 Thread GitBox
Dandandan commented on code in PR #2156: URL: https://github.com/apache/arrow-datafusion/pull/2156#discussion_r845769945 ## datafusion/physical-expr/src/expressions/in_list.rs: ## @@ -32,13 +33,19 @@ use arrow::{ record_batch::RecordBatch, }; -use crate::PhysicalExpr; +u

[GitHub] [arrow-datafusion] yjshen commented on a diff in pull request #2156: Add an InSet as an optimized version for IN_LIST

2022-04-07 Thread GitBox
yjshen commented on code in PR #2156: URL: https://github.com/apache/arrow-datafusion/pull/2156#discussion_r845769129 ## datafusion/physical-expr/src/expressions/in_list.rs: ## @@ -32,13 +33,19 @@ use arrow::{ record_batch::RecordBatch, }; -use crate::PhysicalExpr; +use

[GitHub] [arrow-rs] HaoYang670 closed issue #1400: Interesting benchmark results of `min_max_helper`

2022-04-07 Thread GitBox
HaoYang670 closed issue #1400: Interesting benchmark results of `min_max_helper` URL: https://github.com/apache/arrow-rs/issues/1400

[GitHub] [arrow] westonpace commented on pull request #12769: ARROW-16076: [R] Bindings for the new TPC-H generator

2022-04-07 Thread GitBox
westonpace commented on PR #12769: URL: https://github.com/apache/arrow/pull/12769#issuecomment-1092470363 Based on ARROW-16100 I think we are pausing this PR. @save-buffer Can we pull these C++ fixes into their own dedicated PR?

[GitHub] [arrow] westonpace commented on a diff in pull request #12812: ARROW-16131 [C++] support saving and retrieving custom metadata in batches for IPC file

2022-04-07 Thread GitBox
westonpace commented on code in PR #12812: URL: https://github.com/apache/arrow/pull/12812#discussion_r845759831 ## cpp/src/arrow/ipc/writer.cc: ## @@ -263,6 +263,14 @@ class RecordBatchSerializer { out_->body_length = offset - buffer_start_offset_; DCHECK(bit_util::Is

[GitHub] [arrow] westonpace commented on a diff in pull request #12672: ARROW-15779: [Python] Create python bindings for Substrait consumer

2022-04-07 Thread GitBox
westonpace commented on code in PR #12672: URL: https://github.com/apache/arrow/pull/12672#discussion_r845748534 ## cpp/src/arrow/engine/substrait/serde_test.cc: ## @@ -724,5 +728,99 @@ TEST(Substrait, ExtensionSetFromPlan) { EXPECT_EQ(decoded_add_func.name, "add"); } +TES

[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #2156: Add an InSet as an optimized version for IN_LIST

2022-04-07 Thread GitBox
Ted-Jiang commented on code in PR #2156: URL: https://github.com/apache/arrow-datafusion/pull/2156#discussion_r845757545 ## datafusion/physical-expr/src/expressions/in_list.rs: ## @@ -32,13 +33,19 @@ use arrow::{ record_batch::RecordBatch, }; -use crate::PhysicalExpr; +u

[GitHub] [arrow] niyue commented on a diff in pull request #12812: ARROW-16131 [C++] support saving and retrieving custom metadata in batches for IPC file

2022-04-07 Thread GitBox
niyue commented on code in PR #12812: URL: https://github.com/apache/arrow/pull/12812#discussion_r845757121 ## cpp/src/arrow/ipc/writer.cc: ## @@ -263,6 +263,14 @@ class RecordBatchSerializer { out_->body_length = offset - buffer_start_offset_; DCHECK(bit_util::IsMulti

[GitHub] [arrow] ursabot commented on pull request #12609: ARROW-15067: [C++] Add tracing spans to the scanner

2022-04-07 Thread GitBox
ursabot commented on PR #12609: URL: https://github.com/apache/arrow/pull/12609#issuecomment-1092431721 Benchmark runs are scheduled for baseline = 542158fa0847810f375189a36173693d1fe507b8 and contender = 8ea2c931abfe9b2dc76f274f79327056a3496140. 8ea2c931abfe9b2dc76f274f79327056a3496140 is

[GitHub] [arrow-datafusion] gaojun2048 commented on pull request #2131: [Ballista] Add ballista plugin manager and UDF plugin

2022-04-07 Thread GitBox
gaojun2048 commented on PR #2131: URL: https://github.com/apache/arrow-datafusion/pull/2131#issuecomment-1092397001 Thank you all. I will iterate on it lately.

[GitHub] [arrow-datafusion] happysalada commented on issue #2095: Create next DataFusion release (after 7.0)

2022-04-07 Thread GitBox
happysalada commented on issue #2095: URL: https://github.com/apache/arrow-datafusion/issues/2095#issuecomment-1092390291 Question related, do you plan to release the datafusion-cli as a crate as well ? I see that the 7.0.0 datafusion-cli crate has been yanked (for reasons that I ignore).

[GitHub] [arrow-datafusion] yahoNanJing commented on pull request #2131: [Ballista] Add ballista plugin manager and UDF plugin

2022-04-07 Thread GitBox
yahoNanJing commented on PR #2131: URL: https://github.com/apache/arrow-datafusion/pull/2131#issuecomment-1092382677 @andygrove, if the udf/udaf libraries can only be loaded from local disk, we need to build a new image and redeploy the whole cluster when there's any changes for the librar

[GitHub] [arrow] kou commented on a diff in pull request #12759: WIP: DO NOT MERGE: Apache Arrow 2022-04 board report

2022-04-07 Thread GitBox
kou commented on code in PR #12759: URL: https://github.com/apache/arrow/pull/12759#discussion_r845681987 ## board-report-2022-04.md: ## @@ -0,0 +1,47 @@ +## Description: + +The mission of Apache Arrow is the creation and maintenance of +software related to columnar in-memory pr

[GitHub] [arrow] ursabot commented on pull request #12442: ARROW-15706: [C++][FlightRPC] Implement a UCX transport

2022-04-07 Thread GitBox
ursabot commented on PR #12442: URL: https://github.com/apache/arrow/pull/12442#issuecomment-1092363345 Benchmark runs are scheduled for baseline = e5072bdacbe715f64d6d16f5deb0bb4a7f22c62f and contender = 542158fa0847810f375189a36173693d1fe507b8. 542158fa0847810f375189a36173693d1fe507b8 is

[GitHub] [arrow] nealrichardson commented on a diff in pull request #12759: WIP: DO NOT MERGE: Apache Arrow 2022-04 board report

2022-04-07 Thread GitBox
nealrichardson commented on code in PR #12759: URL: https://github.com/apache/arrow/pull/12759#discussion_r845671928 ## board-report-2022-04.md: ## @@ -0,0 +1,47 @@ +## Description: + +The mission of Apache Arrow is the creation and maintenance of +software related to columnar i

[GitHub] [arrow] github-actions[bot] commented on pull request #12795: ARROW-16102: [C++] Add support for building with system gRPC and bundled GCS

2022-04-07 Thread GitBox
github-actions[bot] commented on PR #12795: URL: https://github.com/apache/arrow/pull/12795#issuecomment-1092348800 Revision: 9685ef52fa2d2559cf3a47b2187308d43359e997 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1831](https://github.com/ursacomputing/crossbow/branches/

[GitHub] [arrow] kou commented on pull request #12795: ARROW-16102: [C++] Add support for building with system gRPC and bundled GCS

2022-04-07 Thread GitBox
kou commented on PR #12795: URL: https://github.com/apache/arrow/pull/12795#issuecomment-1092348182 @github-actions crossbow submit -g cpp

[GitHub] [arrow] westonpace commented on pull request #12828: MINOR: [C++] Fix typo that causes segfault when unknown functions are passed to Substrait

2022-04-07 Thread GitBox
westonpace commented on PR #12828: URL: https://github.com/apache/arrow/pull/12828#issuecomment-1092344157 Can you add a unit test to cover this case? Something like this should do: https://github.com/westonpace/arrow/commit/3e303518e7cbcadac34a75e520ddff31e4b62fd8

[GitHub] [arrow] kou commented on pull request #12795: ARROW-16102: [C++] Add support for building with system gRPC and bundled GCS

2022-04-07 Thread GitBox
kou commented on PR #12795: URL: https://github.com/apache/arrow/pull/12795#issuecomment-1092326575 @emkornfield OK. Thanks for confirming it.

[GitHub] [arrow] kou commented on pull request #12828: MINOR: [C++] Fix typo that causes segfault when unknown functions are passed to Substrait

2022-04-07 Thread GitBox
kou commented on PR #12828: URL: https://github.com/apache/arrow/pull/12828#issuecomment-1092323354 > I'm not sure if this is outside the scope of MINOR (I'm also happy to create a JIRA) Could you create a JIRA issue? https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#mino

[GitHub] [arrow] ianmcook commented on a diff in pull request #12460: ARROW-13530: [C++] Implement cumulative sum compute function

2022-04-07 Thread GitBox
ianmcook commented on code in PR #12460: URL: https://github.com/apache/arrow/pull/12460#discussion_r845643144 ## docs/source/python/api/compute.rst: ## @@ -45,6 +45,21 @@ Aggregations tdigest variance +Cumulative Functions + + +Cumulative functions

[GitHub] [arrow-rs] HaoYang670 commented on pull request #1527: fix clippy errors in 1.60

2022-04-07 Thread GitBox
HaoYang670 commented on PR #1527: URL: https://github.com/apache/arrow-rs/pull/1527#issuecomment-1092291976 Do we need a follow-on PR to really fix the lints?

[GitHub] [arrow] ursabot commented on pull request #12754: ARROW-15429: [Python] Address docstrings for ChunkedArray class, methods, attributes and constructor

2022-04-07 Thread GitBox
ursabot commented on PR #12754: URL: https://github.com/apache/arrow/pull/12754#issuecomment-1092291127 Benchmark runs are scheduled for baseline = 77db0cfdd689d69f0090b5153cb1cbaeaf8a7496 and contender = e5072bdacbe715f64d6d16f5deb0bb4a7f22c62f. e5072bdacbe715f64d6d16f5deb0bb4a7f22c62f is

[GitHub] [arrow] boshek opened a new pull request, #12831: ARROW-15879: [R] passing a schema calls open_dataset to fail on hive-partitioned csv files

2022-04-07 Thread GitBox
boshek opened a new pull request, #12831: URL: https://github.com/apache/arrow/pull/12831 This is a WIP that introduces a failing test for csv schema reading. I _think_ this is a duplicate of this ticket: https://issues.apache.org/jira/browse/ARROW-14743

[GitHub] [arrow] github-actions[bot] commented on pull request #12831: ARROW-15879: [R] passing a schema calls open_dataset to fail on hive-partitioned csv files

2022-04-07 Thread GitBox
github-actions[bot] commented on PR #12831: URL: https://github.com/apache/arrow/pull/12831#issuecomment-1092247764 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] github-actions[bot] commented on pull request #12831: ARROW-15879: [R] passing a schema calls open_dataset to fail on hive-partitioned csv files

2022-04-07 Thread GitBox
github-actions[bot] commented on PR #12831: URL: https://github.com/apache/arrow/pull/12831#issuecomment-1092247745 https://issues.apache.org/jira/browse/ARROW-15879

[GitHub] [arrow] Eugene-Roslikov-BQ commented on pull request #12827: getTimestamp() and getTime(), getDate()

2022-04-07 Thread GitBox
Eugene-Roslikov-BQ commented on PR #12827: URL: https://github.com/apache/arrow/pull/12827#issuecomment-1092217296 wrong base branch

[GitHub] [arrow] Eugene-Roslikov-BQ closed pull request #12827: getTimestamp() and getTime(), getDate()

2022-04-07 Thread GitBox
Eugene-Roslikov-BQ closed pull request #12827: getTimestamp() and getTime(), getDate() URL: https://github.com/apache/arrow/pull/12827

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1527: fix clippy errors in 1.60

2022-04-07 Thread GitBox
codecov-commenter commented on PR #1527: URL: https://github.com/apache/arrow-rs/pull/1527#issuecomment-1092215504 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1527?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow-rs] viirya commented on pull request #1507: Add `new_from_strings` to create `MapArrays`

2022-04-07 Thread GitBox
viirya commented on PR #1507: URL: https://github.com/apache/arrow-rs/pull/1507#issuecomment-1092210073 Thanks @alamb !

[GitHub] [arrow] lidavidm commented on pull request #12830: ARROW-15452: [FlightRPC][Java] JDBC driver for Flight SQL

2022-04-07 Thread GitBox
lidavidm commented on PR #12830: URL: https://github.com/apache/arrow/pull/12830#issuecomment-1092205766 Thanks @jduo! I see several ICLAs have been filed already, I'll move the list here: For the IP clearance: it looks like these are the current contributors. As far as I can s

[GitHub] [arrow-rs] alamb closed issue #1158: Nicer API to create `MapArrays`

2022-04-07 Thread GitBox
alamb closed issue #1158: Nicer API to create `MapArrays` URL: https://github.com/apache/arrow-rs/issues/1158

[GitHub] [arrow-rs] alamb merged pull request #1507: Add `new_from_strings` to create `MapArrays`

2022-04-07 Thread GitBox
alamb merged PR #1507: URL: https://github.com/apache/arrow-rs/pull/1507

[GitHub] [arrow-rs] alamb commented on pull request #1507: Add `new_from_strings` to create `MapArrays`

2022-04-07 Thread GitBox
alamb commented on PR #1507: URL: https://github.com/apache/arrow-rs/pull/1507#issuecomment-1092200844 Thanks @viirya

[GitHub] [arrow] jonkeane closed pull request #12818: ARROW-16038: [R] different behavior from dplyr when mutate's `.keep` option is set

2022-04-07 Thread GitBox
jonkeane closed pull request #12818: ARROW-16038: [R] different behavior from dplyr when mutate's `.keep` option is set URL: https://github.com/apache/arrow/pull/12818

[GitHub] [arrow-rs] viirya commented on pull request #1517: Fix reading nested lists from parquet files

2022-04-07 Thread GitBox
viirya commented on PR #1517: URL: https://github.com/apache/arrow-rs/pull/1517#issuecomment-1092200679 Thank you @alamb !

[GitHub] [arrow-rs] alamb commented on pull request #1517: Fix reading nested lists from parquet files

2022-04-07 Thread GitBox
alamb commented on PR #1517: URL: https://github.com/apache/arrow-rs/pull/1517#issuecomment-1092200231 Thanks @viirya

[GitHub] [arrow-rs] alamb closed issue #1515: cannot read parquet file

2022-04-07 Thread GitBox
alamb closed issue #1515: cannot read parquet file URL: https://github.com/apache/arrow-rs/issues/1515

[GitHub] [arrow-rs] alamb merged pull request #1517: Fix reading nested lists from parquet files

2022-04-07 Thread GitBox
alamb merged PR #1517: URL: https://github.com/apache/arrow-rs/pull/1517

[GitHub] [arrow-rs] alamb opened a new pull request, #1527: fix clippy errors in 1.60

2022-04-07 Thread GitBox
alamb opened a new pull request, #1527: URL: https://github.com/apache/arrow-rs/pull/1527 # Rationale Rust 1.60 is released 🎉 Clippy has added some new lints which were failing on CI # Changes "Fix" lints (by telling clippy to ignore them) to get CI clean

[GitHub] [arrow] westonpace commented on a diff in pull request #12812: ARROW-16131 [C++] support saving and retrieving custom metadata in batches for IPC file

2022-04-07 Thread GitBox
westonpace commented on code in PR #12812: URL: https://github.com/apache/arrow/pull/12812#discussion_r845559633 ## cpp/src/arrow/ipc/writer.cc: ## @@ -263,6 +263,14 @@ class RecordBatchSerializer { out_->body_length = offset - buffer_start_offset_; DCHECK(bit_util::Is

[GitHub] [arrow-rs] alamb closed issue #1516: Allow creating buffers from externally owned memory like Vec or String

2022-04-07 Thread GitBox
alamb closed issue #1516: Allow creating buffers from externally owned memory like Vec or String URL: https://github.com/apache/arrow-rs/issues/1516

[GitHub] [arrow-rs] alamb merged pull request #1494: Decouple buffer deallocation from ffi and allow creating buffers from rust vec

2022-04-07 Thread GitBox
alamb merged PR #1494: URL: https://github.com/apache/arrow-rs/pull/1494

[GitHub] [arrow-rs] alamb commented on pull request #1494: Decouple buffer deallocation from ffi and allow creating buffers from rust vec

2022-04-07 Thread GitBox
alamb commented on PR #1494: URL: https://github.com/apache/arrow-rs/pull/1494#issuecomment-1092196178 I believe the clippy errors are due to rust 1.60 being released. I will create a new PR

[GitHub] [arrow] jduo commented on pull request #12830: ARROW-15452: [FlightRPC][Java] JDBC driver for Flight SQL

2022-04-07 Thread GitBox
jduo commented on PR #12830: URL: https://github.com/apache/arrow/pull/12830#issuecomment-1092182615 This PR is a duplicate of #12254 , except targeting the flight-sql-jdbc branch. It's intended to be a snapshot of the prior PR for the purpose of IP clearance.

[GitHub] [arrow] github-actions[bot] commented on pull request #12830: ARROW-15452: [FlightRPC][Java] JDBC driver for Flight SQL

2022-04-07 Thread GitBox
github-actions[bot] commented on PR #12830: URL: https://github.com/apache/arrow/pull/12830#issuecomment-1092182046 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you open an issue

[GitHub] [arrow] jduo opened a new pull request, #12830: ARROW-15452: [FlightRPC][Java] JDBC driver for Flight SQL

2022-04-07 Thread GitBox
jduo opened a new pull request, #12830: URL: https://github.com/apache/arrow/pull/12830 This implements a JDBC driver able to communicate to Flight SQL sources. So far this covers: Metadata retrieval by DatabaseMetadata, ResultSetMetadata, etc. Query execution by stat

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #2170: Handle merged schemas in parquet pruning

2022-04-07 Thread GitBox
alamb commented on code in PR #2170: URL: https://github.com/apache/arrow-datafusion/pull/2170#discussion_r845539449 ## datafusion/core/src/physical_plan/file_format/parquet.rs: ## @@ -919,6 +955,73 @@ mod tests { assert_batches_sorted_eq!(expected, &read); } +

[GitHub] [arrow-rs] alamb closed issue #1511: Speed up the `substring` kernel

2022-04-07 Thread GitBox
alamb closed issue #1511: Speed up the `substring` kernel URL: https://github.com/apache/arrow-rs/issues/1511

[GitHub] [arrow-rs] alamb merged pull request #1512: Speed up the `substring` kernel by about 2x

2022-04-07 Thread GitBox
alamb merged PR #1512: URL: https://github.com/apache/arrow-rs/pull/1512

[GitHub] [arrow-rs] alamb commented on pull request #1512: Speed up the `substring` kernel by about 2x

2022-04-07 Thread GitBox
alamb commented on PR #1512: URL: https://github.com/apache/arrow-rs/pull/1512#issuecomment-1092173701 Thanks @HaoYang670 and @Dandandan

[GitHub] [arrow] ursabot commented on pull request #12796: ARROW-16079: [Python] Address docstrings in Parquet schema and metadata

2022-04-07 Thread GitBox
ursabot commented on PR #12796: URL: https://github.com/apache/arrow/pull/12796#issuecomment-1092173507 Benchmark runs are scheduled for baseline = dabb80df6e8fc28c5de16f4a856b0c7c2b5f90cd and contender = 77db0cfdd689d69f0090b5153cb1cbaeaf8a7496. 77db0cfdd689d69f0090b5153cb1cbaeaf8a7496 is

[GitHub] [arrow-datafusion] Cheappie commented on a diff in pull request #2170: Handle merged schemas in parquet pruning

2022-04-07 Thread GitBox
Cheappie commented on code in PR #2170: URL: https://github.com/apache/arrow-datafusion/pull/2170#discussion_r845538137 ## datafusion/core/src/physical_plan/file_format/parquet.rs: ## @@ -919,6 +955,73 @@ mod tests { assert_batches_sorted_eq!(expected, &read); }

[GitHub] [arrow-datafusion] alamb commented on pull request #2146: Buffer records in row format in memory for SortExec

2022-04-07 Thread GitBox
alamb commented on PR #2146: URL: https://github.com/apache/arrow-datafusion/pull/2146#issuecomment-1092164658 As I have mentioned, I am very interested in this work but I have not yet found time to give it the deep study it deserves (I am especially interested in the profiling). Ho

[GitHub] [arrow-datafusion] alamb merged pull request #2131: [Ballista] Add ballista plugin manager and UDF plugin

2022-04-07 Thread GitBox
alamb merged PR #2131: URL: https://github.com/apache/arrow-datafusion/pull/2131

[GitHub] [arrow-datafusion] alamb commented on pull request #2131: [Ballista] Add ballista plugin manager and UDF plugin

2022-04-07 Thread GitBox
alamb commented on PR #2131: URL: https://github.com/apache/arrow-datafusion/pull/2131#issuecomment-1092162600 I think that since @thinkharderdev has reviewed this and we have talked about it for a while, I will merge the code in and we can iterate on it as needed. Thank you for yo

[GitHub] [arrow-datafusion] alamb merged pull request #2170: Handle merged schemas in parquet pruning

2022-04-07 Thread GitBox
alamb merged PR #2170: URL: https://github.com/apache/arrow-datafusion/pull/2170

[GitHub] [arrow-datafusion] alamb closed issue #2161: Query execution fails with index out of bounds err

2022-04-07 Thread GitBox
alamb closed issue #2161: Query execution fails with index out of bounds err URL: https://github.com/apache/arrow-datafusion/issues/2161

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #2142: implement 'StringConcat' operator to support sql like "select 'aa' || 'b' "

2022-04-07 Thread GitBox
alamb commented on code in PR #2142: URL: https://github.com/apache/arrow-datafusion/pull/2142#discussion_r845526998 ## datafusion/physical-expr/src/expressions/binary.rs: ## @@ -485,22 +485,31 @@ fn bitwise_or(left: ArrayRef, right: ArrayRef) -> Result { } } -/// Use d

[GitHub] [arrow-datafusion] alamb closed issue #2141: Add 'StringConcat' operator to df

2022-04-07 Thread GitBox
alamb closed issue #2141: Add 'StringConcat' operator to df URL: https://github.com/apache/arrow-datafusion/issues/2141

[GitHub] [arrow-datafusion] alamb merged pull request #2142: implement 'StringConcat' operator to support sql like "select 'aa' || 'b' "

2022-04-07 Thread GitBox
alamb merged PR #2142: URL: https://github.com/apache/arrow-datafusion/pull/2142

[GitHub] [arrow-datafusion] alamb commented on issue #2175: [Discuss] Different implementation style between Expr, LogicalPlan and ExecutionPlan

2022-04-07 Thread GitBox
alamb commented on issue #2175: URL: https://github.com/apache/arrow-datafusion/issues/2175#issuecomment-1092157017 > And for physical ExecutionPlan, it is Trait/Trait Objects, I would prefer to use Enum also I agree this would be nice for code consistency. The fact it is a trait has

[GitHub] [arrow-cookbook] wjones127 opened a new pull request, #179: Improvements to support building on MacOS

2022-04-07 Thread GitBox
wjones127 opened a new pull request, #179: URL: https://github.com/apache/arrow-cookbook/pull/179 Clang versions need the no-unused flag added. Also made a note for others who are using homebrew outside this project and may encounter version mismatches. I also added ccache use to the

[GitHub] [arrow-rs] alamb commented on pull request #1494: Decouple buffer deallocation from ffi and allow creating buffers from rust vec

2022-04-07 Thread GitBox
alamb commented on PR #1494: URL: https://github.com/apache/arrow-rs/pull/1494#issuecomment-1092142878 I merged this PR with `master` and fixed a rustdoc issue with e6c6c07dee https://github.com/apache/arrow-rs/runs/5756218032?check_suite_focus=true -- thanks @jhorstmann

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #2168: Implement fast path of with_new_children() in ExecutionPlan

2022-04-07 Thread GitBox
alamb commented on code in PR #2168: URL: https://github.com/apache/arrow-datafusion/pull/2168#discussion_r844259856 ## ballista/rust/core/src/serde/mod.rs: ## @@ -508,15 +508,10 @@ mod tests { &self, children: Vec>, ) -> datafusion::error::Res

[GitHub] [arrow] github-actions[bot] commented on pull request #12776: ARROW-16108: [Gandiva][C++] Fix castINTERVALDAY and castINTERVALYEAR

2022-04-07 Thread GitBox
github-actions[bot] commented on PR #12776: URL: https://github.com/apache/arrow/pull/12776#issuecomment-1092131063 Revision: 97c3de337e030553fc1c72fbc8e3357836b05191 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1830](https://github.com/ursacomputing/crossbow/branches/

[GitHub] [arrow] Johnnathanalmeida commented on pull request #12776: ARROW-16108: [Gandiva][C++] Fix castINTERVALDAY and castINTERVALYEAR

2022-04-07 Thread GitBox
Johnnathanalmeida commented on PR #12776: URL: https://github.com/apache/arrow/pull/12776#issuecomment-1092129882 @github-actions crossbow submit java-jars

[GitHub] [arrow-datafusion] alamb closed issue #1965: Fast path of with_new_children() in ExecutionPlan

2022-04-07 Thread GitBox
alamb closed issue #1965: Fast path of with_new_children() in ExecutionPlan URL: https://github.com/apache/arrow-datafusion/issues/1965

[GitHub] [arrow-datafusion] alamb merged pull request #2168: Implement fast path of with_new_children() in ExecutionPlan

2022-04-07 Thread GitBox
alamb merged PR #2168: URL: https://github.com/apache/arrow-datafusion/pull/2168

[GitHub] [arrow-datafusion] alamb closed issue #2096: Enable explain query in Ballista.

2022-04-07 Thread GitBox
alamb closed issue #2096: Enable explain query in Ballista. URL: https://github.com/apache/arrow-datafusion/issues/2096

[GitHub] [arrow-datafusion] alamb merged pull request #2163: enable explain for ballista

2022-04-07 Thread GitBox
alamb merged PR #2163: URL: https://github.com/apache/arrow-datafusion/pull/2163

[GitHub] [arrow-datafusion] alamb commented on pull request #2163: enable explain for ballista

2022-04-07 Thread GitBox
alamb commented on PR #2163: URL: https://github.com/apache/arrow-datafusion/pull/2163#issuecomment-1092126184 Thanks @doki23 and @yjshen !

[GitHub] [arrow-rs] alamb merged pull request #1510: chore: Update `prost`, `prost-derive` and `prost-types` to 0.10, `tonic`, and `tonic-build` to `0.7`

2022-04-07 Thread GitBox
alamb merged PR #1510: URL: https://github.com/apache/arrow-rs/pull/1510

[GitHub] [arrow-datafusion] alamb commented on issue #587: Optionally Limit memory used by DataFusion plan

2022-04-07 Thread GitBox
alamb commented on issue #587: URL: https://github.com/apache/arrow-datafusion/issues/587#issuecomment-1092120541 Hi @hzh0425 -- There is no estimated completion time that I know of. Thanks to @yjshen there is a way to limit the memory used in Sort. The other major operators that need t

[GitHub] [arrow-datafusion] tustvold commented on issue #2175: [Discuss] Different implementation style between Expr, LogicalPlan and ExecutionPlan

2022-04-07 Thread GitBox
tustvold commented on issue #2175: URL: https://github.com/apache/arrow-datafusion/issues/2175#issuecomment-1092102331 Thank you for raising this, it's really good to have these discussions. I would definitely support making the enumeration style consistent, and the expression struct patter

[GitHub] [arrow] lidavidm closed pull request #12794: ARROW-15578: [Java][Doc] Document C Data Interface and how to interface with other languages

2022-04-07 Thread GitBox
lidavidm closed pull request #12794: ARROW-15578: [Java][Doc] Document C Data Interface and how to interface with other languages URL: https://github.com/apache/arrow/pull/12794

[GitHub] [arrow] github-actions[bot] commented on pull request #12763: ARROW-14892: [Python][C++] GCS Bindings

2022-04-07 Thread GitBox
github-actions[bot] commented on PR #12763: URL: https://github.com/apache/arrow/pull/12763#issuecomment-1092095185 Revision: 608b6ecc7f1841762462f66d06ce880fd9bb02a2 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1829](https://github.com/ursacomputing/crossbow/branches/

[GitHub] [arrow] emkornfield commented on pull request #12763: ARROW-14892: [Python][C++] GCS Bindings

2022-04-07 Thread GitBox
emkornfield commented on PR #12763: URL: https://github.com/apache/arrow/pull/12763#issuecomment-1092094306 @github-actions crossbow submit -g wheel

[GitHub] [arrow] lidavidm closed pull request #12789: ARROW-16128: [C++][FlightRPC] Fix Flight SQL static build on Windows

2022-04-07 Thread GitBox
lidavidm closed pull request #12789: ARROW-16128: [C++][FlightRPC] Fix Flight SQL static build on Windows URL: https://github.com/apache/arrow/pull/12789

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #2171: minor: Avoid per cell evaluation in Coalesce, use zip in CaseWhen

2022-04-07 Thread GitBox
alamb commented on code in PR #2171: URL: https://github.com/apache/arrow-datafusion/pull/2171#discussion_r845450645 ## datafusion/physical-expr/src/expressions/case.rs: ## @@ -19,7 +19,8 @@ use std::{any::Any, sync::Arc}; use crate::expressions::try_cast; use crate::Physica

[GitHub] [arrow-datafusion] WinkerDu commented on a diff in pull request #2144: fix `not(null)` with constant `null`

2022-04-07 Thread GitBox
WinkerDu commented on code in PR #2144: URL: https://github.com/apache/arrow-datafusion/pull/2144#discussion_r845434357 ## datafusion/physical-expr/src/expressions/not.rs: ## @@ -86,10 +86,29 @@ impl PhysicalExpr for NotExpr { ))) } Col

[GitHub] [arrow] nealrichardson commented on pull request #12826: ARROW-15260: [R] open_dataset - add file_name as column

2022-04-07 Thread GitBox
nealrichardson commented on PR #12826: URL: https://github.com/apache/arrow/pull/12826#issuecomment-1092052001 > Currently this fails with this error: If you haven't already, can you build arrow with `-DARROW_EXTRA_ERROR_CONTEXT=ON` and include the C++ traceback from the error? From

[GitHub] [arrow] lidavidm commented on a diff in pull request #12460: ARROW-13530: [C++] Implement cumulative sum compute function

2022-04-07 Thread GitBox
lidavidm commented on code in PR #12460: URL: https://github.com/apache/arrow/pull/12460#discussion_r845422455 ## cpp/src/arrow/compute/kernels/vector_cumulative_ops.cc: ## @@ -0,0 +1,183 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor l

[GitHub] [arrow] github-actions[bot] commented on pull request #12829: ARROW-16116: [C++] Handle non-nullable fields when reading Parquet

2022-04-07 Thread GitBox
github-actions[bot] commented on PR #12829: URL: https://github.com/apache/arrow/pull/12829#issuecomment-1092022546 https://issues.apache.org/jira/browse/ARROW-16116

[GitHub] [arrow] lidavidm commented on pull request #12706: ARROW-15961: [C++] Check nullability when validating fields on batches or struct arrays

2022-04-07 Thread GitBox
lidavidm commented on PR #12706: URL: https://github.com/apache/arrow/pull/12706#issuecomment-1092020432 See #12829 for an attempt at fixing the Parquet issue properly

[GitHub] [arrow] ursabot commented on pull request #12772: ARROW-16058: [Python] Address docstrings for Table class, methods, attributes and constructor

2022-04-07 Thread GitBox
ursabot commented on PR #12772: URL: https://github.com/apache/arrow/pull/12772#issuecomment-1092020048 Benchmark runs are scheduled for baseline = 96de9d58420826fe3eeaf8a2c62be1f25225e322 and contender = dabb80df6e8fc28c5de16f4a856b0c7c2b5f90cd. dabb80df6e8fc28c5de16f4a856b0c7c2b5f90cd is

[GitHub] [arrow] lidavidm commented on pull request #12829: ARROW-16116: [C++] Handle non-nullable fields when reading Parquet

2022-04-07 Thread GitBox
lidavidm commented on PR #12829: URL: https://github.com/apache/arrow/pull/12829#issuecomment-1092019478 I wasn't able to find a good way to test the int32/int64/byte array decimal cases, perhaps using the Parquet writer API directly to build the file?

[GitHub] [arrow] paleolimbot opened a new pull request, #12828: MINOR: [C++] Fix typo that causes segfault when unknown functions are passed to Substrait

2022-04-07 Thread GitBox
paleolimbot opened a new pull request, #12828: URL: https://github.com/apache/arrow/pull/12828 I'm not sure if this is outside the scope of MINOR (I'm also happy to create a JIRA), but there's a typo in `ExtensionSet::Make()` that causes a crash whenever somebody provides an unsupported fun

[GitHub] [arrow] github-actions[bot] commented on pull request #12827: getTimestamp() and getTime(), getDate()

2022-04-07 Thread GitBox
github-actions[bot] commented on PR #12827: URL: https://github.com/apache/arrow/pull/12827#issuecomment-1092011466 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes), could you open an issue

[GitHub] [arrow] lidavidm closed pull request #12820: ARROW-15917: [Java][Docs] Document how to use Flight artifacts

2022-04-07 Thread GitBox
lidavidm closed pull request #12820: ARROW-15917: [Java][Docs] Document how to use Flight artifacts URL: https://github.com/apache/arrow/pull/12820

[GitHub] [arrow] Eugene-Roslikov-BQ opened a new pull request, #12827: getTimestamp() and getTime(), getDate()

2022-04-07 Thread GitBox
Eugene-Roslikov-BQ opened a new pull request, #12827: URL: https://github.com/apache/arrow/pull/12827 Fix for getTime() and getDate() when calendar parameter is null by using default calendar. Complete fix for getTimestamp()

[GitHub] [arrow] lidavidm closed pull request #12028: ARROW-15192: [Java] Allow use of Jackson 2.12 and higher

2022-04-07 Thread GitBox
lidavidm closed pull request #12028: ARROW-15192: [Java] Allow use of Jackson 2.12 and higher URL: https://github.com/apache/arrow/pull/12028
