[GitHub] [arrow] cyb70289 commented on a diff in pull request #15201: GH-15200: [C++] Created benchmarks for round kernels.

2023-01-07 Thread GitBox
cyb70289 commented on code in PR #15201: URL: https://github.com/apache/arrow/pull/15201#discussion_r1064098091 ## cpp/src/arrow/compute/kernels/scalar_round_benchmark.cc: ## @@ -0,0 +1,111 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] cyb70289 commented on pull request #15201: GH-15200: [C++] Created benchmarks for round kernels.

2023-01-07 Thread GitBox
cyb70289 commented on PR #15201: URL: https://github.com/apache/arrow/pull/15201#issuecomment-1374747333 Do you plan to add benchmark for decimal types? It can be in a followup PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

[GitHub] [arrow] cyb70289 commented on pull request #15201: GH-15200: [C++] Created benchmarks for round kernels.

2023-01-07 Thread GitBox
cyb70289 commented on PR #15201: URL: https://github.com/apache/arrow/pull/15201#issuecomment-1374746931 > One thought about the number of benchmarks -- there is a benchmark for every type and round mode (10 each). Some alternatives could be to try to fold the round mode tests into the type

[GitHub] [arrow] rtpsw commented on pull request #14934: ARROW-18427: [C++] Support negative tolerance in `AsofJoinNode`

2023-01-07 Thread GitBox
rtpsw commented on PR #14934: URL: https://github.com/apache/arrow/pull/14934#issuecomment-1374743963 @westonpace, looks like this should be ready to go.

[GitHub] [arrow] cyb70289 commented on a diff in pull request #15250: GH-15249: [Documentation] Add PR template

2023-01-07 Thread GitBox
cyb70289 commented on code in PR #15250: URL: https://github.com/apache/arrow/pull/15250#discussion_r1064093753 ## .github/pull_request_template.md: ## @@ -0,0 +1,62 @@ +# Which issue does this PR close? + +

[GitHub] [arrow] eitsupi commented on issue #14969: [R] pkgdown built-in search doesn't work

2023-01-07 Thread GitBox
eitsupi commented on issue #14969: URL: https://github.com/apache/arrow/issues/14969#issuecomment-1374697350 According to my past experiments (#12541), I believe the search bar is not working because of https://github.com/apache/arrow/blob/3f31b327cd04e79e673b37ee684d438a72367483/r/pkgdown/

[GitHub] [arrow] kou merged pull request #15238: GH-15237: [C++] Add ::arrow::Unreachable() using std::string_view

2023-01-07 Thread GitBox
kou merged PR #15238: URL: https://github.com/apache/arrow/pull/15238

[GitHub] [arrow] kou commented on pull request #15238: GH-15237: [C++] Add ::arrow::Unreachable() using std::string_view

2023-01-07 Thread GitBox
kou commented on PR #15238: URL: https://github.com/apache/arrow/pull/15238#issuecomment-1374697219 Thanks. The failure will be fixed by https://github.com/apache/arrow/pull/15186 . We don't need to amend because we use squash merge.

[GitHub] [arrow] kou commented on issue #15242: how to serializer and deserializer arrow table? C++

2023-01-07 Thread GitBox
kou commented on issue #15242: URL: https://github.com/apache/arrow/issues/15242#issuecomment-1374696762 https://arrow.apache.org/docs/cpp/tutorials/io_tutorial.html#prepare-a-fileoutputstream will help you.

[GitHub] [arrow] kou commented on issue #14920: [Python] wheel cmake error on Raspberry PI

2023-01-07 Thread GitBox
kou commented on issue #14920: URL: https://github.com/apache/arrow/issues/14920#issuecomment-1374695906 Thanks for confirming it. I've opened a pull request: https://github.com/apache/arrow/pull/15251

[GitHub] [arrow] github-actions[bot] commented on pull request #15251: GH-14920: [C++][CMake] Add missing -latomic to Arrow CMake package

2023-01-07 Thread GitBox
github-actions[bot] commented on PR #15251: URL: https://github.com/apache/arrow/pull/15251#issuecomment-1374695923 :warning: GitHub issue #14920 **has no components**, please add labels for components.

[GitHub] [arrow] github-actions[bot] commented on pull request #15251: GH-14920: [C++][CMake] Add missing -latomic to Arrow CMake package

2023-01-07 Thread GitBox
github-actions[bot] commented on PR #15251: URL: https://github.com/apache/arrow/pull/15251#issuecomment-1374695911 * Closes: #14920

[GitHub] [arrow] github-actions[bot] commented on pull request #15251: GH-14920: [C++][CMake] Add missing -latomic to Arrow CMake package

2023-01-07 Thread GitBox
github-actions[bot] commented on PR #15251: URL: https://github.com/apache/arrow/pull/15251#issuecomment-1374695920 :warning: GitHub issue #14920 **has been automatically assigned in GitHub** to PR creator.

[GitHub] [arrow] kou opened a new pull request, #15251: GH-14920: [C++][CMake] Add missing -latomic to Arrow CMake package

2023-01-07 Thread GitBox
kou opened a new pull request, #15251: URL: https://github.com/apache/arrow/pull/15251 -latomic is needed for Raspberry Pi. The Arrow CMake package should specify it.

[GitHub] [arrow] assignUser commented on a diff in pull request #15245: ARROW-18377: MIGRATION: Automate component labels from issue form content

2023-01-07 Thread GitBox
assignUser commented on code in PR #15245: URL: https://github.com/apache/arrow/pull/15245#discussion_r1064072111 ## .github/workflows/issue_bot.yml: ## @@ -0,0 +1,63 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See

[GitHub] [arrow-datafusion] ygf11 commented on issue #4844: Incorrect results for join condition against current master branch

2023-01-07 Thread GitBox
ygf11 commented on issue #4844: URL: https://github.com/apache/arrow-datafusion/issues/4844#issuecomment-1374689844 It seems the `EliminateCrossJoin` rule only considers the equijoin predicates when finding an appropriate right input of join. We can extend equijoin to all join predicate(equ

[GitHub] [arrow-datafusion] jonmmease opened a new pull request, #4848: Support using var/var_pop/stddev/stddev_pop in window expressions with custom frames

2023-01-07 Thread GitBox
jonmmease opened a new pull request, #4848: URL: https://github.com/apache/arrow-datafusion/pull/4848 # Which issue does this PR close? Follow on to https://github.com/apache/arrow-datafusion/pull/4846 for `var`, `var_pop`, `stddev`, and `stddev_pop`. The actual `retract_batch

[GitHub] [arrow] rok commented on a diff in pull request #15245: ARROW-18377: MIGRATION: Automate component labels from issue form content

2023-01-07 Thread GitBox
rok commented on code in PR #15245: URL: https://github.com/apache/arrow/pull/15245#discussion_r1064066461 ## .github/workflows/issue_bot.yml: ## @@ -0,0 +1,63 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NO

[GitHub] [arrow] ursabot commented on pull request #15227: MINOR: [Docs] Fix typo in docs

2023-01-07 Thread GitBox
ursabot commented on PR #15227: URL: https://github.com/apache/arrow/pull/15227#issuecomment-1374675509 Benchmark runs are scheduled for baseline = 21d6374d2579c07d75832c5baf06479898e82fd5 and contender = 3f31b327cd04e79e673b37ee684d438a72367483. 3f31b327cd04e79e673b37ee684d438a72367483 is

[GitHub] [arrow-datafusion] jonmmease commented on a diff in pull request #4847: Update variance/stddev to work with single values

2023-01-07 Thread GitBox
jonmmease commented on code in PR #4847: URL: https://github.com/apache/arrow-datafusion/pull/4847#discussion_r1064064531 ## datafusion/physical-expr/src/aggregate/variance.rs: ## @@ -286,17 +286,17 @@ impl Accumulator for VarianceAccumulator { } }; -

[GitHub] [arrow-datafusion] jonmmease commented on pull request #4847: Update variance/stddev to work with single values

2023-01-07 Thread GitBox
jonmmease commented on PR #4847: URL: https://github.com/apache/arrow-datafusion/pull/4847#issuecomment-1374671628 Ok, the tests are sorted out now. Thanks for taking a look!

[GitHub] [arrow-julia] Moelf commented on issue #373: Addtional `Missing` gets injected into Schema

2023-01-07 Thread GitBox
Moelf commented on issue #373: URL: https://github.com/apache/arrow-julia/issues/373#issuecomment-1374670910 I see, but for a column of `Vector{Union{T, T2}}` you don't need Missing, right? Because the empty element would just be a ``` Union{T,T2}[] ```

[GitHub] [arrow-datafusion] Jefffrey commented on issue #4844: Incorrect results for join condition against current master branch

2023-01-07 Thread GitBox
Jefffrey commented on issue #4844: URL: https://github.com/apache/arrow-datafusion/issues/4844#issuecomment-1374664702 Looks to be a regression introduced by fddb3d3651041f41d66a801f10e27387e84374f7 (https://github.com/apache/arrow-datafusion/pull/4562) On the commit prior to it (2792

[GitHub] [arrow-datafusion] matthewwillian commented on pull request #4836: Remove tests from sql_integration that were ported to sqllogictest

2023-01-07 Thread GitBox
matthewwillian commented on PR #4836: URL: https://github.com/apache/arrow-datafusion/pull/4836#issuecomment-1374654108 Thanks @jackwener. One of the aggregates tests and all of the arrow_typeof tests are duplicates of tests that already exist.

[GitHub] [arrow-datafusion] ursabot commented on pull request #4831: DataFusion 16.0.0 release prep: Update version + add changelog

2023-01-07 Thread GitBox
ursabot commented on PR #4831: URL: https://github.com/apache/arrow-datafusion/pull/4831#issuecomment-1374647896 Benchmark runs are scheduled for baseline = 3cc607de4ce6e9e1fd537091e471858c62f58653 and contender = dcd52ee3d87c4dd9e2c176165e9e20644f66988b. dcd52ee3d87c4dd9e2c176165e9e20644

[GitHub] [arrow] github-actions[bot] commented on pull request #15250: GH-15249: [Documentation] Add PR template

2023-01-07 Thread GitBox
github-actions[bot] commented on PR #15250: URL: https://github.com/apache/arrow/pull/15250#issuecomment-1374646497 :warning: GitHub issue #15249 **has been automatically assigned in GitHub** to PR creator.

[GitHub] [arrow] github-actions[bot] commented on pull request #15250: GH-15249: [Documentation] Add PR template

2023-01-07 Thread GitBox
github-actions[bot] commented on PR #15250: URL: https://github.com/apache/arrow/pull/15250#issuecomment-1374646485 * Closes: #15249

[GitHub] [arrow-datafusion] andygrove merged pull request #4831: DataFusion 16.0.0 release prep: Update version + add changelog

2023-01-07 Thread GitBox
andygrove merged PR #4831: URL: https://github.com/apache/arrow-datafusion/pull/4831

[GitHub] [arrow-julia] quinnj commented on issue #373: Addtional `Missing` gets injected into Schema

2023-01-07 Thread GitBox
quinnj commented on issue #373: URL: https://github.com/apache/arrow-julia/issues/373#issuecomment-1374638211 H, yes, I think I remember that for the Union types, the arrow spec makes it hard because it always allows nulls, so we default to including `Missing` in the Union to account fo

[GitHub] [arrow-julia] quinnj merged pull request #372: Define defaults for `Missing`/`Nothing`

2023-01-07 Thread GitBox
quinnj merged PR #372: URL: https://github.com/apache/arrow-julia/pull/372

[GitHub] [arrow] assignUser commented on issue #14969: [R] pkgdown built-in search doesn't work

2023-01-07 Thread GitBox
assignUser commented on issue #14969: URL: https://github.com/apache/arrow/issues/14969#issuecomment-1374633546 The search bar only shows up in the dev version, maybe we can make it work and activate it on the release version :D

[GitHub] [arrow] assignUser merged pull request #15227: MINOR: [Docs] Fix typo in docs

2023-01-07 Thread GitBox
assignUser merged PR #15227: URL: https://github.com/apache/arrow/pull/15227

[GitHub] [arrow-datafusion] Jefffrey commented on a diff in pull request #4840: Support wildcard select on multiple column using joins

2023-01-07 Thread GitBox
Jefffrey commented on code in PR #4840: URL: https://github.com/apache/arrow-datafusion/pull/4840#discussion_r1064052544 ## datafusion/sql/src/planner.rs: ## @@ -2390,6 +2390,63 @@ mod tests { quick_test(sql, expected); } +#[test] +fn using_join_multiple_

[GitHub] [arrow-datafusion] ozankabak commented on pull request #4847: Update variance/stddev to work with single values

2023-01-07 Thread GitBox
ozankabak commented on PR #4847: URL: https://github.com/apache/arrow-datafusion/pull/4847#issuecomment-1374628175 Hmm, some tests seem to be failing -- needs some investigation

[GitHub] [arrow] rtpsw opened a new pull request, #14934: ARROW-18427: [C++] Support negative tolerance in `AsofJoinNode`

2023-01-07 Thread GitBox
rtpsw opened a new pull request, #14934: URL: https://github.com/apache/arrow/pull/14934 See https://issues.apache.org/jira/browse/ARROW-18427

[GitHub] [arrow] rtpsw closed pull request #14934: ARROW-18427: [C++] Support negative tolerance in `AsofJoinNode`

2023-01-07 Thread GitBox
rtpsw closed pull request #14934: ARROW-18427: [C++] Support negative tolerance in `AsofJoinNode` URL: https://github.com/apache/arrow/pull/14934

[GitHub] [arrow-datafusion] ozankabak commented on a diff in pull request #4847: Update variance/stddev to work with single values

2023-01-07 Thread GitBox
ozankabak commented on code in PR #4847: URL: https://github.com/apache/arrow-datafusion/pull/4847#discussion_r1064051561 ## datafusion/physical-expr/src/aggregate/variance.rs: ## @@ -286,17 +286,17 @@ impl Accumulator for VarianceAccumulator { } }; -

[GitHub] [arrow] kou commented on a diff in pull request #14533: ARROW-17777: [Dev] Update the pull request merge script to work with master or main

2023-01-07 Thread GitBox
kou commented on code in PR #14533: URL: https://github.com/apache/arrow/pull/14533#discussion_r1064051564 ## dev/merge_arrow_pr.py: ## @@ -110,10 +111,32 @@ def strip_ci_directives(commit_message): return _REGEX_CI_DIRECTIVE.sub('', commit_message) +def git_default_bra

[GitHub] [arrow] westonpace commented on pull request #14934: ARROW-18427: [C++] Support negative tolerance in `AsofJoinNode`

2023-01-07 Thread GitBox
westonpace commented on PR #14934: URL: https://github.com/apache/arrow/pull/14934#issuecomment-1374622011 I don't know why the checks didn't run. Once this is passing I think we can merge it. Can you close and reopen the PR real quick so the tests run?

[GitHub] [arrow] rtpsw commented on pull request #14934: ARROW-18427: [C++] Support negative tolerance in `AsofJoinNode`

2023-01-07 Thread GitBox
rtpsw commented on PR #14934: URL: https://github.com/apache/arrow/pull/14934#issuecomment-1374617059 @icexelloss confirms the use case of negative tolerance is a recurring one, and says that the use case of a double-sided tolerance is not, at least for the time being. @westonpace, could yo

[GitHub] [arrow] OfekShilon commented on issue #15246: [R] arrow::write_feather doesn't save row names

2023-01-07 Thread GitBox
OfekShilon commented on issue #15246: URL: https://github.com/apache/arrow/issues/15246#issuecomment-1374613900 This is caused by [explicit removal](https://github.com/apache/arrow/blob/master/r/R/metadata.R#L125) of `row.names` in `remove_attributes`. @nealrichardson is this intent

[GitHub] [arrow] rok commented on a diff in pull request #15245: ARROW-18377: MIGRATION: Automate component labels from issue form content

2023-01-07 Thread GitBox
rok commented on code in PR #15245: URL: https://github.com/apache/arrow/pull/15245#discussion_r1064046982 ## .github/workflows/issue_bot.yml: ## @@ -0,0 +1,63 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NO

[GitHub] [arrow-datafusion] Jefffrey commented on issue #4837: SQL statement (`UNION` + `EXCEPT`) causes panic

2023-01-07 Thread GitBox
Jefffrey commented on issue #4837: URL: https://github.com/apache/arrow-datafusion/issues/4837#issuecomment-1374611170 > Shouldn't such case be a part of tests? That would be a good idea to have more test coverage, though I confess I'm not entirely certain where such a test would be

[GitHub] [arrow-datafusion] jonmmease opened a new pull request, #4847: Update variance/stddev to work with single values

2023-01-07 Thread GitBox
jonmmease opened a new pull request, #4847: URL: https://github.com/apache/arrow-datafusion/pull/4847 # Which issue does this PR close? Closes #4843. # Rationale for this change Rather than crash when performing a variance-based aggregation on a single value, follow Postgres' se

[GitHub] [arrow] OfekShilon commented on issue #15246: [R] arrow::write_feather doesn't save row names

2023-01-07 Thread GitBox
OfekShilon commented on issue #15246: URL: https://github.com/apache/arrow/issues/15246#issuecomment-1374609657 It seems there is a problem in `AddMetadataFromDots`. @romainfrancois

[GitHub] [arrow] assignUser commented on a diff in pull request #15245: ARROW-18377: MIGRATION: Automate component labels from issue form content

2023-01-07 Thread GitBox
assignUser commented on code in PR #15245: URL: https://github.com/apache/arrow/pull/15245#discussion_r1064046080 ## .github/workflows/issue_bot.yml: ## @@ -0,0 +1,63 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See

[GitHub] [arrow] rok commented on pull request #14378: ARROW-17933: [C++] SparseCOOTensor raises error when created with zero elements

2023-01-07 Thread GitBox
rok commented on PR #14378: URL: https://github.com/apache/arrow/pull/14378#issuecomment-1374607185 @pitrou is there something left to address here?

[GitHub] [arrow] mapleFU commented on pull request #15238: GH-15237: [C++] Add ::arrow::Unreachable() using std::string_view

2023-01-07 Thread GitBox
mapleFU commented on PR #15238: URL: https://github.com/apache/arrow/pull/15238#issuecomment-1374605368 The description is updated, @kou. By the way, it seems CI failed because of a flaky test: ``` === FAILURES === _

[GitHub] [arrow-datafusion] comphead commented on issue #4644: Comparing a `Timestamp` to a `Date32` fails

2023-01-07 Thread GitBox
comphead commented on issue #4644: URL: https://github.com/apache/arrow-datafusion/issues/4644#issuecomment-1374605053 > > I'm still debugging this. Looks like logical plan doesn't include an extra cast in `type_coercion` optimizer rule for dates/ts binary ops > > Does you have test

[GitHub] [arrow] rok commented on pull request #15245: ARROW-18377: MIGRATION: Automate component labels from issue form content

2023-01-07 Thread GitBox
rok commented on PR #15245: URL: https://github.com/apache/arrow/pull/15245#issuecomment-1374604061 Thanks for doing this @assignUser !

[GitHub] [arrow-datafusion] ozankabak commented on issue #4844: Incorrect results for join condition against current master branch

2023-01-07 Thread GitBox
ozankabak commented on issue #4844: URL: https://github.com/apache/arrow-datafusion/issues/4844#issuecomment-1374601928 Wow, let's make sure to add some tests so regressions like this do not stealthily go through in the future 🤔

[GitHub] [arrow-datafusion] ozankabak commented on issue #4843: Match Postgres for stddev and variance on less than 3 values

2023-01-07 Thread GitBox
ozankabak commented on issue #4843: URL: https://github.com/apache/arrow-datafusion/issues/4843#issuecomment-1374595943 We should definitely calculate something for the two-sample case -- erroring simply doesn't make sense in that case. The second example has various behaviors going around

[GitHub] [arrow-datafusion] alamb commented on pull request #4831: DataFusion 16.0.0 release prep: Update version + add changelog

2023-01-07 Thread GitBox
alamb commented on PR #4831: URL: https://github.com/apache/arrow-datafusion/pull/4831#issuecomment-1374594361 I took the liberty of merging from `apache/master` to this branch to resolve a conflict

[GitHub] [arrow-datafusion] ozankabak commented on pull request #4846: Implement retract_batch for AvgAccumulator

2023-01-07 Thread GitBox
ozankabak commented on PR #4846: URL: https://github.com/apache/arrow-datafusion/pull/4846#issuecomment-1374593994 @mustafasrepo, PTAL

[GitHub] [arrow-datafusion] ozankabak commented on a diff in pull request #4846: Implement retract_batch for AvgAccumulator

2023-01-07 Thread GitBox
ozankabak commented on code in PR #4846: URL: https://github.com/apache/arrow-datafusion/pull/4846#discussion_r1064041043 ## datafusion/physical-expr/src/aggregate/average.rs: ## @@ -154,6 +159,14 @@ impl Accumulator for AvgAccumulator { Ok(()) } +fn retract_

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #4831: DataFusion 16.0.0 release prep

2023-01-07 Thread GitBox
alamb commented on code in PR #4831: URL: https://github.com/apache/arrow-datafusion/pull/4831#discussion_r1064041210 ## datafusion/CHANGELOG.md: ## @@ -19,6 +19,337 @@ # Changelog +## [16.0.0](https://github.com/apache/arrow-datafusion/tree/16.0.0) (2023-01-06) + +[Full

[GitHub] [arrow-datafusion] alamb commented on pull request #4581: WIP Support `select .. FROM 'parquet.file'` in datafusion-cli

2023-01-07 Thread GitBox
alamb commented on PR #4581: URL: https://github.com/apache/arrow-datafusion/pull/4581#issuecomment-1374592677 @unconsolable has a PR to add this feature -- I hope to review it tomorrow https://github.com/apache/arrow-datafusion/pull/4838

[GitHub] [arrow] rtpsw commented on a diff in pull request #14682: ARROW-17676: [C++] [Python] User-defined tabular functions

2023-01-07 Thread GitBox
rtpsw commented on code in PR #14682: URL: https://github.com/apache/arrow/pull/14682#discussion_r1064040644 ## python/pyarrow/tests/test_udf.py: ## @@ -504,3 +504,112 @@ def test_input_lifetime(unary_func_fixture): # Calling a UDF should not have kept `v` alive longer than

[GitHub] [arrow] rtpsw commented on a diff in pull request #14682: ARROW-17676: [C++] [Python] User-defined tabular functions

2023-01-07 Thread GitBox
rtpsw commented on code in PR #14682: URL: https://github.com/apache/arrow/pull/14682#discussion_r1064040160 ## python/pyarrow/src/arrow/python/udf.cc: ## @@ -105,21 +192,109 @@ Status RegisterScalarFunction(PyObject* user_function, ScalarUdfWrapperCallback } compute::Out

[GitHub] [arrow] rok commented on a diff in pull request #15245: ARROW-18377: MIGRATION: Automate component labels from issue form content

2023-01-07 Thread GitBox
rok commented on code in PR #15245: URL: https://github.com/apache/arrow/pull/15245#discussion_r1064040154 ## .github/workflows/issue_bot.yml: ## @@ -0,0 +1,63 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NO

[GitHub] [arrow] rtpsw commented on a diff in pull request #14682: ARROW-17676: [C++] [Python] User-defined tabular functions

2023-01-07 Thread GitBox
rtpsw commented on code in PR #14682: URL: https://github.com/apache/arrow/pull/14682#discussion_r1064038986 ## python/pyarrow/src/arrow/python/udf.cc: ## @@ -105,21 +158,117 @@ Status RegisterScalarFunction(PyObject* user_function, ScalarUdfWrapperCallback } compute::Out

[GitHub] [arrow-datafusion-python] dependabot[bot] opened a new pull request, #120: build(deps): bump mimalloc from 0.1.32 to 0.1.33

2023-01-07 Thread GitBox
dependabot[bot] opened a new pull request, #120: URL: https://github.com/apache/arrow-datafusion-python/pull/120 Bumps [mimalloc](https://github.com/purpleprotocol/mimalloc_rust) from 0.1.32 to 0.1.33. Release notes Sourced from https://github.com/purpleprotocol/mimalloc_rust/relea

[GitHub] [arrow-datafusion-python] dependabot[bot] opened a new pull request, #119: build(deps): bump tokio from 1.23.0 to 1.24.1

2023-01-07 Thread GitBox
dependabot[bot] opened a new pull request, #119: URL: https://github.com/apache/arrow-datafusion-python/pull/119 Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.23.0 to 1.24.1. Release notes Sourced from tokio's releases (https://github.com/tokio-rs/tokio/releases). T

[GitHub] [arrow-datafusion-python] dependabot[bot] opened a new pull request, #118: build(deps): bump async-trait from 0.1.60 to 0.1.61

2023-01-07 Thread GitBox
dependabot[bot] opened a new pull request, #118: URL: https://github.com/apache/arrow-datafusion-python/pull/118 Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.60 to 0.1.61. Release notes Sourced from https://github.com/dtolnay/async-trait/releases async-tra

[GitHub] [arrow-datafusion] jonmmease opened a new pull request, #4846: Implement retract_batch for AvgAccumulator

2023-01-07 Thread GitBox
jonmmease opened a new pull request, #4846: URL: https://github.com/apache/arrow-datafusion/pull/4846 # Which issue does this PR close? Closes #4845. # What changes are included in this PR? Adds an implementation of `retract_batch` to `AvgAccumulator` that is identical t

[GitHub] [arrow-rs] ursabot commented on pull request #3487: Fixes a broken link in the arrow lib.rs rustdoc

2023-01-07 Thread GitBox
ursabot commented on PR #3487: URL: https://github.com/apache/arrow-rs/pull/3487#issuecomment-1374558429 Benchmark runs are scheduled for baseline = 8492c27dfb6840e94843b0b2bb8de484280b6c5d and contender = c74665808439cb7020fb1cfb74b376a136c73259. c74665808439cb7020fb1cfb74b376a136c73259 i

[GitHub] [arrow-rs] viirya merged pull request #3487: Fixes a broken link in the arrow lib.rs rustdoc

2023-01-07 Thread GitBox
viirya merged PR #3487: URL: https://github.com/apache/arrow-rs/pull/3487 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apach

[GitHub] [arrow-rs] AdamGS commented on pull request #3487: Fixes a broken link in the arrow lib.rs rustdoc

2023-01-07 Thread GitBox
AdamGS commented on PR #3487: URL: https://github.com/apache/arrow-rs/pull/3487#issuecomment-1374556139 Glad I can help, thank you for the awesome work maintaining this project!

[GitHub] [arrow-rs] scsmithr opened a new pull request, #3489: feat: Allow providing a service account key directly for GCS

2023-01-07 Thread GitBox
scsmithr opened a new pull request, #3489: URL: https://github.com/apache/arrow-rs/pull/3489 # Which issue does this PR close? Closes https://github.com/apache/arrow-rs/issues/3488 # Rationale for this change Use case: We're storing service

[GitHub] [arrow-datafusion] melgenek commented on issue #4462: Replace python based integration test with sqllogictest

2023-01-07 Thread GitBox
melgenek commented on issue #4462: URL: https://github.com/apache/arrow-datafusion/issues/4462#issuecomment-1374552643 > I also left some feedback on https://github.com/apache/arrow-datafusion/pull/4834#pullrequestreview-1239686856 -- does that make sense? Thank you for the feedback

[GitHub] [arrow-rs] scsmithr opened a new issue, #3488: Allow providing service account key directly when building GCP object store client

2023-01-07 Thread GitBox
scsmithr opened a new issue, #3488: URL: https://github.com/apache/arrow-rs/issues/3488 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** The current implementation of the GCP builder requires that the service account key is

[GitHub] [arrow-datafusion] jonmmease opened a new issue, #4845: Support custom window frame with AVG aggregate function

2023-01-07 Thread GitBox
jonmmease opened a new issue, #4845: URL: https://github.com/apache/arrow-datafusion/issues/4845 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** I would like DataFusion to support using the `avg` aggregate function in a window wit

[GitHub] [arrow-datafusion] melgenek commented on a diff in pull request #4834: (#4462) Postgres compatibility tests using sqllogictest

2023-01-07 Thread GitBox
melgenek commented on code in PR #4834: URL: https://github.com/apache/arrow-datafusion/pull/4834#discussion_r1064029810 ## datafusion/core/tests/sqllogictests/postgres/test_files/simple_except.slt: ## @@ -0,0 +1,27 @@ +# Licensed to the Apache Software Foundation (ASF) under on

[GitHub] [arrow-datafusion] melgenek commented on a diff in pull request #4834: (#4462) Postgres compatibility tests using sqllogictest

2023-01-07 Thread GitBox
melgenek commented on code in PR #4834: URL: https://github.com/apache/arrow-datafusion/pull/4834#discussion_r1064029147 ## datafusion/core/tests/sqllogictests/postgres/test_files/simple_aggregation.slt: ## @@ -0,0 +1,29 @@ +# Licensed to the Apache Software Foundation (ASF) und

[GitHub] [arrow-datafusion] melgenek commented on a diff in pull request #4834: (#4462) Postgres compatibility tests using sqllogictest

2023-01-07 Thread GitBox
melgenek commented on code in PR #4834: URL: https://github.com/apache/arrow-datafusion/pull/4834#discussion_r1064028955 ## datafusion/core/tests/sqllogictests/src/main.rs: ## @@ -15,112 +15,135 @@ // specific language governing permissions and limitations // under the License

[GitHub] [arrow] mapleFU commented on pull request #15241: GH-14923: [C++][Parquet] Fix DELTA_BINARY_PACKED problem on reading the last block with malford bit-width

2023-01-07 Thread GitBox
mapleFU commented on PR #15241: URL: https://github.com/apache/arrow/pull/15241#issuecomment-1374547554 This time it failed because unpack32_avx512 hits an ASAN issue. Maybe I need to find out how the input data is generated and make some adjustments. Going to sleep now. -- This is an automated messa

[GitHub] [arrow-rs] AdamGS opened a new pull request, #3487: Fixes a broken link in the main lib.rs rustdoc

2023-01-07 Thread GitBox
AdamGS opened a new pull request, #3487: URL: https://github.com/apache/arrow-rs/pull/3487 # Which issue does this PR close? Seems too minor for an issue. # Rationale for this change Fixes a broken link # What changes are included in this PR? One small back

[GitHub] [arrow-datafusion] DDtKey commented on issue #4837: SQL statement (`UNION` + `EXCEPT`) causes panic

2023-01-07 Thread GitBox
DDtKey commented on issue #4837: URL: https://github.com/apache/arrow-datafusion/issues/4837#issuecomment-1374541585 After some tests against the current master branch I was able to discover a new bug (broken behavior); see new issue: #4844 -- This is an automated message from the Apache Git

[GitHub] [arrow-datafusion] DDtKey opened a new issue, #4844: Incorrect results for join broken in current master branch

2023-01-07 Thread GitBox
DDtKey opened a new issue, #4844: URL: https://github.com/apache/arrow-datafusion/issues/4844 **Describe the bug** It used to work with the latest stable release (`15.0.0` from `crates.io`), but I tested it against the current master branch, hash `3cc607de4ce6e9e1fd537091e471858c62f58653`.

[GitHub] [arrow] github-actions[bot] commented on pull request #15245: ARROW-18377: MIGRATION: Automate component labels from issue form content

2023-01-07 Thread GitBox
github-actions[bot] commented on PR #15245: URL: https://github.com/apache/arrow/pull/15245#issuecomment-1374541097 https://issues.apache.org/jira/browse/ARROW-18377 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow] github-actions[bot] commented on pull request #15245: ARROW-18377: MIGRATION: Automate component labels from issue form content

2023-01-07 Thread GitBox
github-actions[bot] commented on PR #15245: URL: https://github.com/apache/arrow/pull/15245#issuecomment-1374541108 :warning: Ticket **has no components in JIRA**, make sure you assign one. -- This is an automated message from the Apache Git Service. To respond to the message, please log o

[GitHub] [arrow] assignUser opened a new pull request, #15245: ARROW-18377: MIGRATION: Automate component labels from issue form content

2023-01-07 Thread GitBox
assignUser opened a new pull request, #15245: URL: https://github.com/apache/arrow/pull/15245 tested on fork: https://github.com/assignUser/arrow/issues/9 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t

[GitHub] [arrow-datafusion] alamb commented on issue #4462: Replace python based integration test with sqllogictest

2023-01-07 Thread GitBox
alamb commented on issue #4462: URL: https://github.com/apache/arrow-datafusion/issues/4462#issuecomment-1374538949 Hi @melgenek > Hi there! I am new to Datafusion and would like to learn more about it. That is great! Welcome! > My main question is: what level of compati

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #4834: (#4462) Postgres compatibility tests using sqllogictest

2023-01-07 Thread GitBox
alamb commented on code in PR #4834: URL: https://github.com/apache/arrow-datafusion/pull/4834#discussion_r1064023857 ## datafusion/core/tests/sqllogictests/postgres/test_files/self_join_with_alias.slt: ## @@ -0,0 +1,24 @@ +# Licensed to the Apache Software Foundation (ASF) unde

[GitHub] [arrow-rs] ursabot commented on pull request #3486: [doc] Fix broken URLs

2023-01-07 Thread GitBox
ursabot commented on PR #3486: URL: https://github.com/apache/arrow-rs/pull/3486#issuecomment-1374531299 Benchmark runs are scheduled for baseline = c28d69aa5be2cce2d065300c9a79c4063589f300 and contender = 8492c27dfb6840e94843b0b2bb8de484280b6c5d. 8492c27dfb6840e94843b0b2bb8de484280b6c5d i

[GitHub] [arrow] mapleFU commented on pull request #15241: GH-14923: [C++][Parquet] Fix DELTA_BINARY_PACKED problem on reading the last block with malford bit-width

2023-01-07 Thread GitBox
mapleFU commented on PR #15241: URL: https://github.com/apache/arrow/pull/15241#issuecomment-1374530796 Well, it seems that both buffers are 273B, and I need to pad the `good_data`. @rok would you mind adding the method used to generate these two data sets? It seems that leaving a bit-string here is confusing

[GitHub] [arrow-rs] tustvold merged pull request #3486: [doc] Fix broken URLs

2023-01-07 Thread GitBox
tustvold merged PR #3486: URL: https://github.com/apache/arrow-rs/pull/3486 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apa

[GitHub] [arrow-datafusion] alamb commented on issue #4804: Blog post about datafusion 16 release

2023-01-07 Thread GitBox
alamb commented on issue #4804: URL: https://github.com/apache/arrow-datafusion/issues/4804#issuecomment-1374527239 Here is a draft post we can use to collaborate on: https://github.com/apache/arrow-site/pull/294 -- This is an automated message from the Apache Git Service. To res

[GitHub] [arrow-datafusion] DDtKey commented on issue #4837: SQL statement (`UNION` + `EXCEPT`) causes panic

2023-01-07 Thread GitBox
DDtKey commented on issue #4837: URL: https://github.com/apache/arrow-datafusion/issues/4837#issuecomment-1374527144 So, with bisect I was able to figure out that it was fixed by #4666 (ac2e5d15e5452e83c835d793a95335e87bf35569). Thanks for the help @Jefffrey! Shouldn't such a case be a

[GitHub] [arrow] mapleFU commented on pull request #15241: GH-14923: [C++][Parquet] Fix DELTA_BINARY_PACKED problem on reading the last block with malford bit-width

2023-01-07 Thread GitBox
mapleFU commented on PR #15241: URL: https://github.com/apache/arrow/pull/15241#issuecomment-1374523168 Well, it's a bit weird. ```c++ TEST_F(DeltaBitPackEncoding, MalfordMiniblockBitWidth) { std::shared_ptr descr_ = ExampleDescr(); auto decoder = MakeTypedDecoder(Encoding

[GitHub] [arrow] github-actions[bot] commented on pull request #15244: GH-15239: [C++][Parquet] Parquet writer writes decimal as int32/64

2023-01-07 Thread GitBox
github-actions[bot] commented on PR #15244: URL: https://github.com/apache/arrow/pull/15244#issuecomment-1374521623 :warning: GitHub issue #15239 **has no components**, please add labels for components. -- This is an automated message from the Apache Git Service. To respond to the message

[GitHub] [arrow] github-actions[bot] commented on pull request #15244: GH-15239: [C++][Parquet] Parquet writer writes decimal as int32/64

2023-01-07 Thread GitBox
github-actions[bot] commented on PR #15244: URL: https://github.com/apache/arrow/pull/15244#issuecomment-1374521619 :warning: GitHub issue #15239 **has been automatically assigned in GitHub** to PR creator. -- This is an automated message from the Apache Git Service. To respond to the mes

[GitHub] [arrow] github-actions[bot] commented on pull request #15244: GH-15239: [C++][Parquet] Parquet writer writes decimal as int32/64

2023-01-07 Thread GitBox
github-actions[bot] commented on PR #15244: URL: https://github.com/apache/arrow/pull/15244#issuecomment-1374521610 * Closes: #15239 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c

[GitHub] [arrow] wgtmac opened a new pull request, #15244: GH-15239: [C++][Parquet] Parquet writer writes decimal as int32/64

2023-01-07 Thread GitBox
wgtmac opened a new pull request, #15244: URL: https://github.com/apache/arrow/pull/15244 As the parquet [specs](https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal) states, DECIMAL can be used to annotate the following types: - int32: for 1 <= precision <= 9

[GitHub] [arrow-datafusion] DDtKey commented on issue #4837: SQL statement (`UNION` + `EXCEPT`) causes panic

2023-01-07 Thread GitBox
DDtKey commented on issue #4837: URL: https://github.com/apache/arrow-datafusion/issues/4837#issuecomment-1374521248 > The output seems correct to me from my understanding, unless you're referring to the inconsistent output row ordering? > > The error seems straightforward enough, wh

[GitHub] [arrow-datafusion] jackwener closed issue #4563: tpch test exist duplicated

2023-01-07 Thread GitBox
jackwener closed issue #4563: tpch test exist duplicated URL: https://github.com/apache/arrow-datafusion/issues/4563 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscr

[GitHub] [arrow-datafusion] jackwener closed issue #4798: move the tests in planner

2023-01-07 Thread GitBox
jackwener closed issue #4798: move the tests in planner URL: https://github.com/apache/arrow-datafusion/issues/4798 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscri

[GitHub] [arrow-rs] vvv commented on a diff in pull request #3486: [doc] Fix broken URLs

2023-01-07 Thread GitBox
vvv commented on code in PR #3486: URL: https://github.com/apache/arrow-rs/pull/3486#discussion_r1064015351 ## arrow-array/src/iterator.rs: ## @@ -39,9 +39,9 @@ use crate::array::{ /// there are more efficient ways to iterate over just the non-null indices, this functionality

[GitHub] [arrow-datafusion] alamb commented on pull request #4826: Support non-tuple expression for in-subquery to join

2023-01-07 Thread GitBox
alamb commented on PR #4826: URL: https://github.com/apache/arrow-datafusion/pull/4826#issuecomment-1374512049 I plan to review this tomorrow. Thank you @ygf11 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #4842: Simplify the break rule of logical optimizer passes

2023-01-07 Thread GitBox
alamb commented on code in PR #4842: URL: https://github.com/apache/arrow-datafusion/pull/4842#discussion_r1064014394 ## datafusion/optimizer/src/optimizer.rs: ## @@ -309,17 +309,11 @@ impl Optimizer { } log_plan(&format!("Optimized plan (pass {i})"), &
