[GitHub] [arrow] pitrou commented on pull request #14599: ARROW-18238: [Docs][Python] Improve docs for S3FileSystem

2022-11-07 Thread GitBox
pitrou commented on PR #14599: URL: https://github.com/apache/arrow/pull/14599#issuecomment-1306777453 @wjones127 Would you like to review this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

[GitHub] [arrow] pitrou commented on a diff in pull request #14043: ARROW-17613: [C++] Add function execution API for a preconfigured kernel

2022-11-07 Thread GitBox
pitrou commented on code in PR #14043: URL: https://github.com/apache/arrow/pull/14043#discussion_r1016251035 ## cpp/src/arrow/compute/function_test.cc: ## @@ -351,5 +354,87 @@ TEST(ScalarAggregateFunction, DispatchExact) { ASSERT_TRUE(selected_kernel->signature->MatchesInput

[GitHub] [arrow-datafusion] jackwener commented on issue #4123: Bug in interpreting correctly parsed SQL with aliases

2022-11-07 Thread GitBox
jackwener commented on issue #4123: URL: https://github.com/apache/arrow-datafusion/issues/4123#issuecomment-1306769870 It looks like the planner builds the `sort` incorrectly. ```sql create table foo as select 1 as a; select a from (select a from foo group by 1 order by a) as c; Projection

[GitHub] [arrow] rtpsw commented on a diff in pull request #14043: ARROW-17613: [C++] Add function execution API for a preconfigured kernel

2022-11-07 Thread GitBox
rtpsw commented on code in PR #14043: URL: https://github.com/apache/arrow/pull/14043#discussion_r1016237509 ## cpp/src/arrow/compute/function_test.cc: ## @@ -351,5 +354,87 @@ TEST(ScalarAggregateFunction, DispatchExact) { ASSERT_TRUE(selected_kernel->signature->MatchesInputs

[GitHub] [arrow-datafusion] Ted-Jiang commented on pull request #4122: [Part3] Partition and Sort Enforcement, Enforcement rule implementation

2022-11-07 Thread GitBox
Ted-Jiang commented on PR #4122: URL: https://github.com/apache/arrow-datafusion/pull/4122#issuecomment-1306761638 I am not familiar with this part, but this `EquivalenceProperties` system is amazing 👍! Looking forward to seeing this move forward 😄

[GitHub] [arrow-datafusion] mingmwang commented on pull request #3912: Expression boundary analysis framework

2022-11-07 Thread GitBox
mingmwang commented on PR #3912: URL: https://github.com/apache/arrow-datafusion/pull/3912#issuecomment-1306760018 And for a simple expression like a + 100 < 20, can we safely derive the boundary of a as [min, -80)? Probably not, considering a + 100 might overflow the type's boundary.
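
The overflow concern above can be illustrated with a small sketch. It assumes an 8-bit signed column (`i8`) purely to keep the numbers small, and the helper names are hypothetical; it does not show how DataFusion itself evaluates overflowing arithmetic. With wrapping arithmetic, some values outside the naively derived range [min, -80) still satisfy `a + 100 < 20`:

```rust
/// Evaluate `a + 100 < 20` using wrapping (two's-complement) i8 arithmetic.
fn predicate_wrapping(a: i8) -> bool {
    a.wrapping_add(100) < 20
}

/// Evaluate `a + 100 < 20` with checked arithmetic; `None` signals overflow.
fn predicate_checked(a: i8) -> Option<bool> {
    a.checked_add(100).map(|sum| sum < 20)
}

/// The naively derived boundary for `a`: [i8::MIN, -80).
fn in_naive_boundary(a: i8) -> bool {
    a < -80
}

fn main() {
    // 28 + 100 = 128 does not fit in i8, so the sum wraps to -128.
    for a in [-100i8, -81, -80, 27, 28, 100] {
        println!(
            "a = {:4}: wrapping = {:5}, checked = {:?}, in [min, -80) = {}",
            a,
            predicate_wrapping(a),
            predicate_checked(a),
            in_naive_boundary(a)
        );
    }
}
```

Under these assumptions, a boundary derivation is only safe if the engine either rejects overflowing rows or the analysis accounts for the wrapped range.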

[GitHub] [arrow-datafusion] jackwener commented on issue #4123: Bug in interpreting correctly parsed SQL with aliases

2022-11-07 Thread GitBox
jackwener commented on issue #4123: URL: https://github.com/apache/arrow-datafusion/issues/4123#issuecomment-1306757385 I will try to fix it.

[GitHub] [arrow] rtpsw commented on a diff in pull request #14043: ARROW-17613: [C++] Add function execution API for a preconfigured kernel

2022-11-07 Thread GitBox
rtpsw commented on code in PR #14043: URL: https://github.com/apache/arrow/pull/14043#discussion_r1016229048 ## cpp/src/arrow/compute/function_test.cc: ## @@ -351,5 +354,87 @@ TEST(ScalarAggregateFunction, DispatchExact) { ASSERT_TRUE(selected_kernel->signature->MatchesInputs

[GitHub] [arrow-datafusion] mingmwang commented on pull request #3912: Expression boundary analysis framework

2022-11-07 Thread GitBox
mingmwang commented on PR #3912: URL: https://github.com/apache/arrow-datafusion/pull/3912#issuecomment-1306746408 Based on my understanding, I think the selectivity and row count estimation is more important; the column boundaries estimation and derivation is error prone, especially after co

[GitHub] [arrow] ursabot commented on pull request #14598: ARROW-18260: [C++][CMake] Add support for x64 for CMAKE_SYSTEM_PROCESSOR

2022-11-07 Thread GitBox
ursabot commented on PR #14598: URL: https://github.com/apache/arrow/pull/14598#issuecomment-1306744750 Benchmark runs are scheduled for baseline = 8776295ffb5dc635f2dfe016ce6f4e9847ae1131 and contender = 4ec92ec1f8cf54b8ad70ef4a973573bf17d9b43d. 4ec92ec1f8cf54b8ad70ef4a973573bf17d9b43d is

[GitHub] [arrow] kou merged pull request #14586: ARROW-18284: [Python][Docs] Add missing CMAKE_PREFIX_PATH to allow setup.py CMake invocations to find Arrow CMake package

2022-11-07 Thread GitBox
kou merged PR #14586: URL: https://github.com/apache/arrow/pull/14586

[GitHub] [arrow] github-actions[bot] commented on pull request #14586: ARROW-18284: [Python] Update bulding docs to allow setup.py CMake invocations to find Arrow

2022-11-07 Thread GitBox
github-actions[bot] commented on PR #14586: URL: https://github.com/apache/arrow/pull/14586#issuecomment-1306736789 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] github-actions[bot] commented on pull request #14586: ARROW-18284: [Python] Update bulding docs to allow setup.py CMake invocations to find Arrow

2022-11-07 Thread GitBox
github-actions[bot] commented on PR #14586: URL: https://github.com/apache/arrow/pull/14586#issuecomment-1306736754 https://issues.apache.org/jira/browse/ARROW-18284

[GitHub] [arrow-datafusion] mingmwang commented on pull request #3912: Expression boundary analysis framework

2022-11-07 Thread GitBox
mingmwang commented on PR #3912: URL: https://github.com/apache/arrow-datafusion/pull/3912#issuecomment-1306728884 Can we get this PR merged this week? I would like to start working on the Filter Selectivity estimation PR based on those new APIs.

[GitHub] [arrow-datafusion] HaoYang670 opened a new pull request, #4137: Reserve the `Cast` expression in `columnize_expr`

2022-11-07 Thread GitBox
HaoYang670 opened a new pull request, #4137: URL: https://github.com/apache/arrow-datafusion/pull/4137 Signed-off-by: remzi <1371656737...@gmail.com> # Which issue does this PR close? Closes #3499. # Rationale for this change # What changes are incl

[GitHub] [arrow] rtpsw commented on a diff in pull request #14043: ARROW-17613: [C++] Add function execution API for a preconfigured kernel

2022-11-07 Thread GitBox
rtpsw commented on code in PR #14043: URL: https://github.com/apache/arrow/pull/14043#discussion_r1016200159 ## cpp/src/arrow/compute/function.cc: ## @@ -167,6 +179,109 @@ const Kernel* DispatchExactImpl(const Function* func, return nullptr; } +struct FunctionExecutorImpl

[GitHub] [arrow] rtpsw commented on a diff in pull request #14043: ARROW-17613: [C++] Add function execution API for a preconfigured kernel

2022-11-07 Thread GitBox
rtpsw commented on code in PR #14043: URL: https://github.com/apache/arrow/pull/14043#discussion_r1016198833 ## cpp/src/arrow/compute/function.cc: ## @@ -167,6 +179,109 @@ const Kernel* DispatchExactImpl(const Function* func, return nullptr; } +struct FunctionExecutorImpl

[GitHub] [arrow-datafusion] ArkashaJavelin opened a new pull request, #4136: Set versions to dependencies with path in benchmarks Cargo.toml file

2022-11-07 Thread GitBox
ArkashaJavelin opened a new pull request, #4136: URL: https://github.com/apache/arrow-datafusion/pull/4136 # Which issue does this PR close? Closes #4134 # Rationale for this change See #4134 # Are these changes tested? # Are there any user-facing changes?

[GitHub] [arrow-datafusion] HaoYang670 commented on issue #3499: In some circumstances cast expression is not working

2022-11-07 Thread GitBox
HaoYang670 commented on issue #3499: URL: https://github.com/apache/arrow-datafusion/issues/3499#issuecomment-1306672468 By the way, there is no bug when you select from a view. For example: ``` ❯ create view v as select 1 as a; 0 rows in set. Query took 0.003 seconds. ❯ explain

[GitHub] [arrow-datafusion] HaoYang670 commented on issue #3499: In some circumstances cast expression is not working

2022-11-07 Thread GitBox
HaoYang670 commented on issue #3499: URL: https://github.com/apache/arrow-datafusion/issues/3499#issuecomment-1306670168 Thanks a lot for reporting this, @mpurins-coralogix! I will fix it. > I think this is happening because cast expression is replaced with column expression in co

[GitHub] [arrow-datafusion] timvw commented on a diff in pull request #4126: Enable TableProviderFactories to receive additional options when creating an external table

2022-11-07 Thread GitBox
timvw commented on code in PR #4126: URL: https://github.com/apache/arrow-datafusion/pull/4126#discussion_r1016162564 ## datafusion/sql/src/parser.rs: ## @@ -627,6 +677,48 @@ mod tests { "CREATE EXTERNAL TABLE t(c1 int) STORED AS CSV PARTITIONED BY (p1 int) LOCATIO

[GitHub] [arrow-datafusion] ursabot commented on pull request #4135: Fix links

2022-11-07 Thread GitBox
ursabot commented on PR #4135: URL: https://github.com/apache/arrow-datafusion/pull/4135#issuecomment-1306637591 Benchmark runs are scheduled for baseline = 531dfdc21178b3f0a2ca1b188c56e26143d7af87 and contender = 47f0e5a607acd0f400e9f6d998c0ea22e4c6a084. 47f0e5a607acd0f400e9f6d998c0ea22e

[GitHub] [arrow-datafusion] andygrove merged pull request #4135: Fix links

2022-11-07 Thread GitBox
andygrove merged PR #4135: URL: https://github.com/apache/arrow-datafusion/pull/4135

[GitHub] [arrow] kou merged pull request #14598: ARROW-18260: [C++][CMake] Add support for x64 for CMAKE_SYSTEM_PROCESSOR

2022-11-07 Thread GitBox
kou merged PR #14598: URL: https://github.com/apache/arrow/pull/14598

[GitHub] [arrow-ballista] dependabot[bot] closed pull request #498: Update arrow-flight requirement from 25.0.0 to 26.0.0

2022-11-07 Thread GitBox
dependabot[bot] closed pull request #498: Update arrow-flight requirement from 25.0.0 to 26.0.0 URL: https://github.com/apache/arrow-ballista/pull/498

[GitHub] [arrow-ballista] dependabot[bot] commented on pull request #498: Update arrow-flight requirement from 25.0.0 to 26.0.0

2022-11-07 Thread GitBox
dependabot[bot] commented on PR #498: URL: https://github.com/apache/arrow-ballista/pull/498#issuecomment-1306633799 Looks like arrow-flight is up-to-date now, so this is no longer needed.

[GitHub] [arrow-ballista] dependabot[bot] commented on pull request #497: Update arrow requirement from 25.0.0 to 26.0.0

2022-11-07 Thread GitBox
dependabot[bot] commented on PR #497: URL: https://github.com/apache/arrow-ballista/pull/497#issuecomment-1306633649 Looks like arrow is up-to-date now, so this is no longer needed.

[GitHub] [arrow-ballista] dependabot[bot] closed pull request #497: Update arrow requirement from 25.0.0 to 26.0.0

2022-11-07 Thread GitBox
dependabot[bot] closed pull request #497: Update arrow requirement from 25.0.0 to 26.0.0 URL: https://github.com/apache/arrow-ballista/pull/497

[GitHub] [arrow-ballista] andygrove closed issue #445: Upgrade to DataFusion 14.0.0

2022-11-07 Thread GitBox
andygrove closed issue #445: Upgrade to DataFusion 14.0.0 URL: https://github.com/apache/arrow-ballista/issues/445

[GitHub] [arrow-ballista] andygrove merged pull request #499: Upgrade to DataFusion 14.0.0 and Arrow 26.0.0

2022-11-07 Thread GitBox
andygrove merged PR #499: URL: https://github.com/apache/arrow-ballista/pull/499

[GitHub] [arrow] milesgranger commented on a diff in pull request #14395: ARROW-17960: [C++][Python] Implement list_slice kernel

2022-11-07 Thread GitBox
milesgranger commented on code in PR #14395: URL: https://github.com/apache/arrow/pull/14395#discussion_r1016137221 ## cpp/src/arrow/util/string_builder.h: ## @@ -44,6 +45,11 @@ class ARROW_EXPORT StringStreamWrapper { } // namespace detail +template +std::ostream& operat

[GitHub] [arrow] milesgranger commented on a diff in pull request #14395: ARROW-17960: [C++][Python] Implement list_slice kernel

2022-11-07 Thread GitBox
milesgranger commented on code in PR #14395: URL: https://github.com/apache/arrow/pull/14395#discussion_r1016136776 ## cpp/src/arrow/compute/kernels/scalar_nested.cc: ## @@ -87,6 +89,198 @@ Status GetListElementIndex(const ExecValue& value, T* out) { return Status::OK(); }

[GitHub] [arrow] milesgranger commented on a diff in pull request #14395: ARROW-17960: [C++][Python] Implement list_slice kernel

2022-11-07 Thread GitBox
milesgranger commented on code in PR #14395: URL: https://github.com/apache/arrow/pull/14395#discussion_r1016136363 ## cpp/src/arrow/compute/kernels/scalar_nested.cc: ## @@ -87,6 +89,198 @@ Status GetListElementIndex(const ExecValue& value, T* out) { return Status::OK(); }

[GitHub] [arrow] milesgranger commented on a diff in pull request #14395: ARROW-17960: [C++][Python] Implement list_slice kernel

2022-11-07 Thread GitBox
milesgranger commented on code in PR #14395: URL: https://github.com/apache/arrow/pull/14395#discussion_r1016136115 ## cpp/src/arrow/compute/kernels/scalar_nested_test.cc: ## @@ -116,6 +117,125 @@ TEST(TestScalarNested, ListElementInvalid) { Raises(StatusCode::Inv

[GitHub] [arrow] moria97 opened a new issue, #14606: [C++] memory consumption question

2022-11-07 Thread GitBox
moria97 opened a new issue, #14606: URL: https://github.com/apache/arrow/issues/14606 We are using the arrow cpp lib in a long-running service to continuously convert streaming data to parquet. However, we observe some native memory consumption even if there are no incoming dat

[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #4122: [Part3] Partition and Sort Enforcement, Enforcement rule implementation

2022-11-07 Thread GitBox
Ted-Jiang commented on code in PR #4122: URL: https://github.com/apache/arrow-datafusion/pull/4122#discussion_r1016121306 ## datafusion/core/src/physical_optimizer/enforcement.rs: ## @@ -0,0 +1,1646 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more co

[GitHub] [arrow-ballista] yahoNanJing closed issue #502: Don't throw error when job path not exist in remove_job_data

2022-11-07 Thread GitBox
yahoNanJing closed issue #502: Don't throw error when job path not exist in remove_job_data URL: https://github.com/apache/arrow-ballista/issues/502

[GitHub] [arrow-ballista] yahoNanJing merged pull request #503: Don't throw error when job shuffle data path not exist in executor

2022-11-07 Thread GitBox
yahoNanJing merged PR #503: URL: https://github.com/apache/arrow-ballista/pull/503

[GitHub] [arrow-datafusion-python] Jimexist commented on issue #7: Release version 0.7.0

2022-11-07 Thread GitBox
Jimexist commented on issue #7: URL: https://github.com/apache/arrow-datafusion-python/issues/7#issuecomment-1306589917 I used to do releases while it was hosted in the datafusion-contrib org; if help is needed I can share more.

[GitHub] [arrow] rok commented on pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-11-07 Thread GitBox
rok commented on PR #14472: URL: https://github.com/apache/arrow/pull/14472#issuecomment-1306582303 I'm still having issues with `mvn generate-resources -Pgenerate-libs-jni-macos-linux -N` so I opened a follow-up [jira](https://issues.apache.org/jira/browse/ARROW-18278).

[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #4131: Add parquet integration tests for explicitly smaller page sizes, page pruning

2022-11-07 Thread GitBox
Ted-Jiang commented on code in PR #4131: URL: https://github.com/apache/arrow-datafusion/pull/4131#discussion_r1016111080 ## benchmarks/src/bin/parquet_filter_pushdown.rs: ## @@ -73,7 +74,19 @@ async fn main() -> Result<()> { let path = opt.path.join("logs.parquet"); -

[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #4131: Add parquet integration tests for explicitly smaller page sizes, page pruning

2022-11-07 Thread GitBox
Ted-Jiang commented on code in PR #4131: URL: https://github.com/apache/arrow-datafusion/pull/4131#discussion_r1016110656 ## benchmarks/src/bin/parquet_filter_pushdown.rs: ## @@ -73,7 +74,19 @@ async fn main() -> Result<()> { let path = opt.path.join("logs.parquet"); -

[GitHub] [arrow] ursabot commented on pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-11-07 Thread GitBox
ursabot commented on PR #14472: URL: https://github.com/apache/arrow/pull/14472#issuecomment-1306580236 Benchmark runs are scheduled for baseline = 98943d90dacb0311fe0d7a17a8ef10762e1c0ef2 and contender = 8776295ffb5dc635f2dfe016ce6f4e9847ae1131. 8776295ffb5dc635f2dfe016ce6f4e9847ae1131 is

[GitHub] [arrow-datafusion] Ted-Jiang commented on a diff in pull request #4130: Consolidate `ParquetExec` tests in `parquet_exec` integration test

2022-11-07 Thread GitBox
Ted-Jiang commented on code in PR #4130: URL: https://github.com/apache/arrow-datafusion/pull/4130#discussion_r1016089851 ## datafusion/core/tests/parquet_exec.rs: ## @@ -0,0 +1,19 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

[GitHub] [arrow-datafusion] Ted-Jiang commented on pull request #4062: Enable tests for page index filtering in parquet filter pushdown test

2022-11-07 Thread GitBox
Ted-Jiang commented on PR #4062: URL: https://github.com/apache/arrow-datafusion/pull/4062#issuecomment-1306541879 @alamb Thanks! Let's see what's going on 😄

[GitHub] [arrow-ballista] yahoNanJing opened a new pull request, #503: Don't throw error when job shuffle data path not exist in executor

2022-11-07 Thread GitBox
yahoNanJing opened a new pull request, #503: URL: https://github.com/apache/arrow-ballista/pull/503 # Which issue does this PR close? Closes #502. # Rationale for this change # What changes are included in this PR? # Are there any user-facing chang

[GitHub] [arrow-datafusion] mvanschellebeeck opened a new pull request, #4135: Fix links

2022-11-07 Thread GitBox
mvanschellebeeck opened a new pull request, #4135: URL: https://github.com/apache/arrow-datafusion/pull/4135 # What changes are included in this PR? Fix links

[GitHub] [arrow-ballista] yahoNanJing opened a new issue, #502: Don't throw error when job path not exist in remove_job_data

2022-11-07 Thread GitBox
yahoNanJing opened a new issue, #502: URL: https://github.com/apache/arrow-ballista/issues/502 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Since current job shuffle data clean up policy is to call remove_job_data for e

[GitHub] [arrow] djnavarro commented on a diff in pull request #14514: ARROW-17887: [R][Doc][WIP] Improve readability of the Get Started and README pages

2022-11-07 Thread GitBox
djnavarro commented on code in PR #14514: URL: https://github.com/apache/arrow/pull/14514#discussion_r1016082278 ## r/README.md: ## @@ -1,331 +1,104 @@ -# arrow +# arrow <img src="https://arrow.apache.org/img/arrow-logo_hex_black-txt_white-bg.png" align="right" alt="" width="120" />

[GitHub] [arrow-rs] viirya commented on a diff in pull request #3040: Cast decimal256 to signed integer

2022-11-07 Thread GitBox
viirya commented on code in PR #3040: URL: https://github.com/apache/arrow-rs/pull/3040#discussion_r1016081759 ## arrow-cast/src/cast.rs: ## @@ -433,34 +434,52 @@ fn cast_reinterpret_arrays< )) } -// cast the decimal array to integer array -macro_rules! cast_decimal_to_i

[GitHub] [arrow] djnavarro commented on a diff in pull request #14514: ARROW-17887: [R][Doc][WIP] Improve readability of the Get Started and README pages

2022-11-07 Thread GitBox
djnavarro commented on code in PR #14514: URL: https://github.com/apache/arrow/pull/14514#discussion_r1016078767 ## r/README.md: ## @@ -1,331 +1,104 @@ -# arrow +# arrow <img src="https://arrow.apache.org/img/arrow-logo_hex_black-txt_white-bg.png" align="right" alt="" width="120" />

[GitHub] [arrow] djnavarro commented on a diff in pull request #14514: ARROW-17887: [R][Doc][WIP] Improve readability of the Get Started and README pages

2022-11-07 Thread GitBox
djnavarro commented on code in PR #14514: URL: https://github.com/apache/arrow/pull/14514#discussion_r1016078636 ## r/README.md: ## @@ -1,331 +1,104 @@ -# arrow +# arrow <img src="https://arrow.apache.org/img/arrow-logo_hex_black-txt_white-bg.png" align="right" alt="" width="120" />

[GitHub] [arrow] djnavarro commented on a diff in pull request #14514: ARROW-17887: [R][Doc][WIP] Improve readability of the Get Started and README pages

2022-11-07 Thread GitBox
djnavarro commented on code in PR #14514: URL: https://github.com/apache/arrow/pull/14514#discussion_r1016078081 ## r/vignettes/arrow.Rmd: ## @@ -1,227 +1,223 @@ --- -title: "Using the Arrow C++ Library in R" -description: "This document describes the low-level interface to the

[GitHub] [arrow] vibhatha commented on a diff in pull request #14596: ARROW-18258: [Docker] Substrait Integration Testing

2022-11-07 Thread GitBox
vibhatha commented on code in PR #14596: URL: https://github.com/apache/arrow/pull/14596#discussion_r1016066950 ## docker-compose.yml: ## @@ -1245,6 +1246,38 @@ services: /arrow/ci/scripts/python_build.sh /arrow /build && /arrow/ci/scripts/integration_turbodbc.

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #3040: Cast decimal256 to signed integer

2022-11-07 Thread GitBox
tustvold commented on code in PR #3040: URL: https://github.com/apache/arrow-rs/pull/3040#discussion_r1016055474 ## arrow-buffer/src/bigint.rs: ## @@ -478,6 +480,44 @@ define_as_primitive!(i16); define_as_primitive!(i32); define_as_primitive!(i64); +impl ToPrimitive for i256

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #3040: Cast decimal256 to signed integer

2022-11-07 Thread GitBox
tustvold commented on code in PR #3040: URL: https://github.com/apache/arrow-rs/pull/3040#discussion_r1016054603 ## arrow-cast/src/cast.rs: ## @@ -433,34 +434,52 @@ fn cast_reinterpret_arrays< )) } -// cast the decimal array to integer array -macro_rules! cast_decimal_to

[GitHub] [arrow-datafusion] tustvold commented on pull request #4133: Use f64::total_cmp instead of OrderedFloat

2022-11-07 Thread GitBox
tustvold commented on PR #4133: URL: https://github.com/apache/arrow-datafusion/pull/4133#issuecomment-1306477153 @comphead I would recommend creating a newtype wrapper around a num::Float that implements Hash using hash_utils, eq using total_cmp, etc.
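
A minimal sketch of the newtype idea suggested above, under stated assumptions: it wraps a plain `f64` rather than a generic `num::Float`, and it hashes the raw bit pattern with `to_bits` as a stand-in for the `hash_utils` helper mentioned in the comment. The type name `TotalF64` is hypothetical and not part of DataFusion.

```rust
use std::cmp::Ordering;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical newtype giving f64 a total order and a consistent hash.
#[derive(Debug, Clone, Copy)]
struct TotalF64(f64);

impl PartialEq for TotalF64 {
    fn eq(&self, other: &Self) -> bool {
        // total_cmp reports Equal only for identical bit patterns,
        // so NaN == NaN (same payload) and 0.0 != -0.0 here.
        self.0.total_cmp(&other.0) == Ordering::Equal
    }
}

impl Eq for TotalF64 {}

impl PartialOrd for TotalF64 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for TotalF64 {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.total_cmp(&other.0)
    }
}

impl Hash for TotalF64 {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hashing the bits keeps Hash consistent with the Eq impl above.
        self.0.to_bits().hash(state);
    }
}

fn main() {
    let mut set = HashSet::new();
    set.insert(TotalF64(f64::NAN));
    set.insert(TotalF64(f64::NAN)); // same bit pattern, deduplicated
    set.insert(TotalF64(1.5));
    println!("distinct values: {}", set.len());

    let mut values = vec![TotalF64(2.0), TotalF64(-1.0), TotalF64(0.5)];
    values.sort(); // possible because TotalF64 implements Ord
    println!("{:?}", values);
}
```

Because equality and hashing both derive from the bit pattern, the wrapper can serve as a `HashMap` or `HashSet` key without a third-party ordered-float type.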

[GitHub] [arrow-datafusion] andygrove opened a new issue, #4134: Update release scripts to support datafusion-benchmarks

2022-11-07 Thread GitBox
andygrove opened a new issue, #4134: URL: https://github.com/apache/arrow-datafusion/issues/4134 **Describe the bug** The Cargo.toml in the benchmark crate references dependencies by path only and not by version. **To Reproduce** N/A **Expected behavior** We should use

[GitHub] [arrow-datafusion-python] andygrove commented on issue #7: Release version 0.7.0

2022-11-07 Thread GitBox
andygrove commented on issue #7: URL: https://github.com/apache/arrow-datafusion-python/issues/7#issuecomment-1306465820 I would like to see us start releasing this project again, but I don't have sufficient knowledge about Python. I would love to pair with someone to get a release out.

[GitHub] [arrow-datafusion-python] andygrove commented on pull request #67: Upgrade to DataFusion 14.0.0

2022-11-07 Thread GitBox
andygrove commented on PR #67: URL: https://github.com/apache/arrow-datafusion-python/pull/67#issuecomment-1306462750 @francis-du fyi

[GitHub] [arrow-datafusion-python] andygrove opened a new pull request, #67: Upgrade to DataFusion 14.0.0

2022-11-07 Thread GitBox
andygrove opened a new pull request, #67: URL: https://github.com/apache/arrow-datafusion-python/pull/67 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing cha

[GitHub] [arrow] kou merged pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-11-07 Thread GitBox
kou merged PR #14472: URL: https://github.com/apache/arrow/pull/14472

[GitHub] [arrow] kou commented on pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-11-07 Thread GitBox
kou commented on PR #14472: URL: https://github.com/apache/arrow/pull/14472#issuecomment-1306447527 > I'm still struggling with `mvn generate-resources -Pgenerate-libs-jni-macos-linux -N` but I think that's related to my cmake situation. Could you open a Jira issue for it with full l

[GitHub] [arrow] westonpace commented on a diff in pull request #14444: ARROW-17784: [C++] Opening a dataset where partitioning variable is already in the dataset should error differently

2022-11-07 Thread GitBox
westonpace commented on code in PR #14444: URL: https://github.com/apache/arrow/pull/14444#discussion_r1016037559 ## cpp/src/arrow/dataset/discovery.cc: ## @@ -236,25 +236,35 @@ Result>> FileSystemDatasetFactory::InspectSc InspectOptions options) { std::vector> schemas;

[GitHub] [arrow] ursabot commented on pull request #14544: ARROW-18108: [Go] More scalar binary arithmetic (Multiply and Divide)

2022-11-07 Thread GitBox
ursabot commented on PR #14544: URL: https://github.com/apache/arrow/pull/14544#issuecomment-1306438818 Benchmark runs are scheduled for baseline = 619b034bd3e14937fa5d12f8e86fa83e7444b886 and contender = 98943d90dacb0311fe0d7a17a8ef10762e1c0ef2. 98943d90dacb0311fe0d7a17a8ef10762e1c0ef2 is

[GitHub] [arrow-adbc] lidavidm commented on pull request #161: fix(c/driver/postgres): fix wheel builds

2022-11-07 Thread GitBox
lidavidm commented on PR #161: URL: https://github.com/apache/arrow-adbc/pull/161#issuecomment-1306427358 Ah, thanks. I haven't tested on M1 yet but will do so soon. (In general I want to start setting up wheels for other platforms. Though: because there's no M1 runners available to this re

[GitHub] [arrow-rs] viirya commented on a diff in pull request #3040: Cast decimal256 to signed integer

2022-11-07 Thread GitBox
viirya commented on code in PR #3040: URL: https://github.com/apache/arrow-rs/pull/3040#discussion_r1016030872 ## arrow-buffer/src/bigint.rs: ## @@ -478,6 +480,44 @@ define_as_primitive!(i16); define_as_primitive!(i32); define_as_primitive!(i64); +impl ToPrimitive for i256 {

[GitHub] [arrow-rs] viirya commented on a diff in pull request #3040: Cast decimal256 to signed integer

2022-11-07 Thread GitBox
viirya commented on code in PR #3040: URL: https://github.com/apache/arrow-rs/pull/3040#discussion_r1016030044 ## arrow-cast/src/cast.rs: ## @@ -411,34 +412,51 @@ fn cast_reinterpret_arrays< )) } -// cast the decimal array to integer array -macro_rules! cast_decimal_to_i

[GitHub] [arrow-datafusion] liukun4515 commented on a diff in pull request #4132: Support parquet page filtering for string columns

2022-11-07 Thread GitBox
liukun4515 commented on code in PR #4132: URL: https://github.com/apache/arrow-datafusion/pull/4132#discussion_r1016029491 ## datafusion/core/src/physical_plan/file_format/parquet/page_filter.rs: ## @@ -421,7 +423,16 @@ macro_rules! get_min_max_values_for_page_index {

[GitHub] [arrow-ballista] avantgardnerio commented on pull request #501: Automatically register tables if env var specified

2022-11-07 Thread GitBox
avantgardnerio commented on PR #501: URL: https://github.com/apache/arrow-ballista/pull/501#issuecomment-1306419404 @andygrove this allows users to see tables in ballista from DataGrip (and probably tableau): ![image](https://user-images.githubusercontent.com/3855243/20075-c278b2

[GitHub] [arrow-ballista] avantgardnerio opened a new pull request, #501: Automatically register tables if env var specified

2022-11-07 Thread GitBox
avantgardnerio opened a new pull request, #501: URL: https://github.com/apache/arrow-ballista/pull/501 # Which issue does this PR close? Closes #500. # Rationale for this change Described in issue. # What changes are included in this PR? 1. upgrade of dataf

[GitHub] [arrow-datafusion] liukun4515 commented on a diff in pull request #4132: Support parquet page filtering for string columns

2022-11-07 Thread GitBox
liukun4515 commented on code in PR #4132: URL: https://github.com/apache/arrow-datafusion/pull/4132#discussion_r1016027321 ## datafusion/core/src/physical_plan/file_format/parquet/page_filter.rs: ## @@ -421,7 +423,16 @@ macro_rules! get_min_max_values_for_page_index {

[GitHub] [arrow-rs] viirya commented on a diff in pull request #3040: Cast decimal256 to signed integer

2022-11-07 Thread GitBox
viirya commented on code in PR #3040: URL: https://github.com/apache/arrow-rs/pull/3040#discussion_r1016027312 ## arrow-cast/src/cast.rs: ## @@ -411,34 +412,51 @@ fn cast_reinterpret_arrays< )) } -// cast the decimal array to integer array -macro_rules! cast_decimal_to_i

[GitHub] [arrow-datafusion-python] andygrove merged pull request #55: Add `register_object_store` to `SessionContext`

2022-11-07 Thread GitBox
andygrove merged PR #55: URL: https://github.com/apache/arrow-datafusion-python/pull/55 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr.

[GitHub] [arrow-ballista] avantgardnerio opened a new issue, #500: Use ListingSchema in Ballista

2022-11-07 Thread GitBox
avantgardnerio opened a new issue, #500: URL: https://github.com/apache/arrow-ballista/issues/500 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Ever since [this PR](https://github.com/apache/arrow-datafusion/pull/4112) was m

[GitHub] [arrow-ballista] andygrove opened a new pull request, #499: Upgrade to DataFusion 14.0.0

2022-11-07 Thread GitBox
andygrove opened a new pull request, #499: URL: https://github.com/apache/arrow-ballista/pull/499 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing changes?

[GitHub] [arrow-adbc] paleolimbot commented on pull request #161: fix(c/driver/postgres): fix wheel builds

2022-11-07 Thread GitBox
paleolimbot commented on PR #161: URL: https://github.com/apache/arrow-adbc/pull/161#issuecomment-1306405191 I installed libpq via homebrew on Mac M1...I don't know exactly when I'll get to it but am happy to debug + PR a fix when I do. It may just be a line in the readme (how to pass your

[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #4126: Enable TableProviderFactories to receive additional options when creating an external table

2022-11-07 Thread GitBox
andygrove commented on code in PR #4126: URL: https://github.com/apache/arrow-datafusion/pull/4126#discussion_r1016011698 ## datafusion/sql/src/parser.rs: ## @@ -381,6 +391,37 @@ impl<'a> DFParser<'a> { } } +fn parse_has_options(&mut self) -> bool { +

[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #4126: Enable TableProviderFactories to receive additional options when creating an external table

2022-11-07 Thread GitBox
andygrove commented on code in PR #4126: URL: https://github.com/apache/arrow-datafusion/pull/4126#discussion_r1016008111 ## datafusion/sql/src/parser.rs: ## @@ -627,6 +677,48 @@ mod tests { "CREATE EXTERNAL TABLE t(c1 int) STORED AS CSV PARTITIONED BY (p1 int) LOC

[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #4126: Enable TableProviderFactories to receive additional options when creating an external table

2022-11-07 Thread GitBox
andygrove commented on code in PR #4126: URL: https://github.com/apache/arrow-datafusion/pull/4126#discussion_r1016005713 ## datafusion/sql/src/parser.rs: ## @@ -381,6 +391,37 @@ impl<'a> DFParser<'a> { } } +fn parse_has_options(&mut self) -> bool { +

[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #4126: Enable TableProviderFactories to receive additional options when creating an external table

2022-11-07 Thread GitBox
avantgardnerio commented on PR #4126: URL: https://github.com/apache/arrow-datafusion/pull/4126#issuecomment-1306378118 > fyi Thanks @andygrove . @timvw read my mind - this was actually underway in the early versions of the PR but I abandoned it. LGTM = looks great to me :)

[GitHub] [arrow-datafusion] ursabot commented on pull request #4114: Template change

2022-11-07 Thread GitBox
ursabot commented on PR #4114: URL: https://github.com/apache/arrow-datafusion/pull/4114#issuecomment-1306373238 Benchmark runs are scheduled for baseline = b9eee503865b85d365ced94524f346a9aacc4ab6 and contender = 531dfdc21178b3f0a2ca1b188c56e26143d7af87. 531dfdc21178b3f0a2ca1b188c56e2614

[GitHub] [arrow-datafusion] ursabot commented on pull request #4109: Update SQL reference to state that decimal support is currently experimental

2022-11-07 Thread GitBox
ursabot commented on PR #4109: URL: https://github.com/apache/arrow-datafusion/pull/4109#issuecomment-1306373228 Benchmark runs are scheduled for baseline = 7b5842b91ebd00a2c7f894fcad797bea68a56d0f and contender = b9eee503865b85d365ced94524f346a9aacc4ab6. b9eee503865b85d365ced94524f346a9a

[GitHub] [arrow-datafusion] andygrove commented on pull request #4126: Enable TableProviderFactories to receive additional options when creating an external table

2022-11-07 Thread GitBox
andygrove commented on PR #4126: URL: https://github.com/apache/arrow-datafusion/pull/4126#issuecomment-1306372174 @avantgardnerio fyi

[GitHub] [arrow-datafusion] andygrove closed issue #4113: Add test field to PR template

2022-11-07 Thread GitBox
andygrove closed issue #4113: Add test field to PR template URL: https://github.com/apache/arrow-datafusion/issues/4113

[GitHub] [arrow-datafusion] andygrove merged pull request #4114: Template change

2022-11-07 Thread GitBox
andygrove merged PR #4114: URL: https://github.com/apache/arrow-datafusion/pull/4114

[GitHub] [arrow-datafusion] andygrove merged pull request #4109: Update SQL reference to state that decimal support is currently experimental

2022-11-07 Thread GitBox
andygrove merged PR #4109: URL: https://github.com/apache/arrow-datafusion/pull/4109

[GitHub] [arrow-datafusion] andygrove closed issue #4036: Add documentation to make it clear that decimal support is still experimental

2022-11-07 Thread GitBox
andygrove closed issue #4036: Add documentation to make it clear that decimal support is still experimental URL: https://github.com/apache/arrow-datafusion/issues/4036

[GitHub] [arrow-datafusion] comphead commented on pull request #4133: Use f64::total_cmp instead of OrderedFloat

2022-11-07 Thread GitBox
comphead commented on PR #4133: URL: https://github.com/apache/arrow-datafusion/pull/4133#issuecomment-1306344642 @tustvold I have replaced almost all OrderedFloat entries with f64. Still thinking about how to use your hasher to remove OrderedFloat from Hash. As your trait implement HashValue an

[GitHub] [arrow-datafusion] comphead opened a new pull request, #4133: Use f64::total_cmp instead of OrderedFloat

2022-11-07 Thread GitBox
comphead opened a new pull request, #4133: URL: https://github.com/apache/arrow-datafusion/pull/4133 # Which issue does this PR close? Closes #4051 . # Rationale for this change See #4051 # What changes are included in this PR? # Are there a

[GitHub] [arrow-datafusion] isidentical commented on pull request #3912: Expression boundary analysis framework

2022-11-07 Thread GitBox
isidentical commented on PR #3912: URL: https://github.com/apache/arrow-datafusion/pull/3912#issuecomment-1306340705 @alamb I think that sounds like a good plan! The primary place where this would be essential (or at least, I'd interpret it as essential; but maybe we can find a simpler sol

[GitHub] [arrow] rok commented on pull request #14472: ARROW-18042: [Java] Distribute Apple M1 compatible JNI libraries via mavencentral

2022-11-07 Thread GitBox
rok commented on PR #14472: URL: https://github.com/apache/arrow/pull/14472#issuecomment-1306337963 I think we can merge! On M1 I was able to run: ``` cd arrow/java mvn clean install mvn generate-resources -Pgenerate-libs-cdata-all-os -N mvn -Darrow.c.jni.dist.dir=/Users

[GitHub] [arrow-rs] ursabot commented on pull request #3036: Fix decoding long and/or padded RLE data (#3029) (#3035)

2022-11-07 Thread GitBox
ursabot commented on PR #3036: URL: https://github.com/apache/arrow-rs/pull/3036#issuecomment-130692 Benchmark runs are scheduled for baseline = b7bc79bf2cbf593fafa0dc552cc2bb16b084d132 and contender = 879b461af8c1259d48fbb1bc67d50fa2a38bea68. 879b461af8c1259d48fbb1bc67d50fa2a38bea68 i

[GitHub] [arrow-rs] tustvold closed issue #3029: RLEDecoder::get_batch_with_dict may panic on bit-packed runs longer than 1024

2022-11-07 Thread GitBox
tustvold closed issue #3029: RLEDecoder::get_batch_with_dict may panic on bit-packed runs longer than 1024 URL: https://github.com/apache/arrow-rs/issues/3029

[GitHub] [arrow-rs] tustvold merged pull request #3036: Fix decoding long and/or padded RLE data (#3029) (#3035)

2022-11-07 Thread GitBox
tustvold merged PR #3036: URL: https://github.com/apache/arrow-rs/pull/3036

[GitHub] [arrow-rs] tustvold closed issue #3035: RLEDecoder Panics on Null Padded Pages

2022-11-07 Thread GitBox
tustvold closed issue #3035: RLEDecoder Panics on Null Padded Pages URL: https://github.com/apache/arrow-rs/issues/3035

[GitHub] [arrow] github-actions[bot] commented on pull request #14605: ARROW-18109: [Go] Initial Unary Arithmetic

2022-11-07 Thread GitBox
github-actions[bot] commented on PR #14605: URL: https://github.com/apache/arrow/pull/14605#issuecomment-1306327474 https://issues.apache.org/jira/browse/ARROW-18109
