[GitHub] [arrow-datafusion] Dandandan merged pull request #2235: Add type coercion rule for date + interval

2022-04-14 Thread GitBox
Dandandan merged PR #2235: URL: https://github.com/apache/arrow-datafusion/pull/2235 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@

[GitHub] [arrow] michalursa commented on a diff in pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
michalursa commented on code in PR #12872: URL: https://github.com/apache/arrow/pull/12872#discussion_r851090256 ## cpp/src/arrow/compute/light_array.h: ## @@ -0,0 +1,384 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements

[GitHub] [arrow] ursabot commented on pull request #12874: ARROW-16185: [C++] Fix uninitialized output data in strptime kernel

2022-04-14 Thread GitBox
ursabot commented on PR #12874: URL: https://github.com/apache/arrow/pull/12874#issuecomment-1099891298 Benchmark runs are scheduled for baseline = 931907e94a5a83e78b0f2ede8f520c2f53edcce8 and contender = b61fb727fcc9a1c1e04002433bf502650b6ad369. b61fb727fcc9a1c1e04002433bf502650b6ad369 is

[GitHub] [arrow-datafusion] liukun4515 opened a new pull request, #2241: fix: find the right wider decimal datatype for comparison operation

2022-04-14 Thread GitBox
liukun4515 opened a new pull request, #2241: URL: https://github.com/apache/arrow-datafusion/pull/2241 # Which issue does this PR close? Closes #2232 For decimals in comparison operations, we use the wrong algorithm to calculate the precision/scale. Before:

[GitHub] [arrow] michalursa commented on a diff in pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
michalursa commented on code in PR #12872: URL: https://github.com/apache/arrow/pull/12872#discussion_r851085686 ## cpp/src/arrow/compute/light_array.cc: ## @@ -0,0 +1,736 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreement

[GitHub] [arrow] michalursa commented on pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
michalursa commented on PR #12872: URL: https://github.com/apache/arrow/pull/12872#issuecomment-1099882476 > Looks good to me, I guess I already reviewed a bunch of this code on the other PR. One thing: is it worth to add `Hashing32::HashBatch` and `Hashing64::HashBatch` in this PR? I need

[GitHub] [arrow] michalursa commented on a diff in pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
michalursa commented on code in PR #12872: URL: https://github.com/apache/arrow/pull/12872#discussion_r851079678 ## cpp/src/arrow/compute/light_array_test.cc: ## @@ -0,0 +1,504 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agre

[GitHub] [arrow] michalursa commented on a diff in pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
michalursa commented on code in PR #12872: URL: https://github.com/apache/arrow/pull/12872#discussion_r851078401 ## cpp/src/arrow/compute/light_array.h: ## @@ -0,0 +1,384 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements

[GitHub] [arrow] michalursa commented on a diff in pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
michalursa commented on code in PR #12872: URL: https://github.com/apache/arrow/pull/12872#discussion_r851077893 ## cpp/src/arrow/compute/light_array_test.cc: ## @@ -0,0 +1,504 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agre

[GitHub] [arrow] michalursa commented on a diff in pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
michalursa commented on code in PR #12872: URL: https://github.com/apache/arrow/pull/12872#discussion_r851077345 ## cpp/src/arrow/compute/light_array_test.cc: ## @@ -0,0 +1,504 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agre

[GitHub] [arrow] github-actions[bot] commented on pull request #12896: ARROW-16203: [Release] Remove all old artifacts on release

2022-04-14 Thread GitBox
github-actions[bot] commented on PR #12896: URL: https://github.com/apache/arrow/pull/12896#issuecomment-1099873104 https://issues.apache.org/jira/browse/ARROW-16203

[GitHub] [arrow] kou opened a new pull request, #12896: ARROW-16203: [Release] Remove all old artifacts on release

2022-04-14 Thread GitBox
kou opened a new pull request, #12896: URL: https://github.com/apache/arrow/pull/12896 * All RCs are removed * All releases except the latest release are removed The following change isn't related to "remove all old artifacts" but included. Sorry. * "svn mv" is used

[GitHub] [arrow-datafusion] liukun4515 commented on issue #2232: Error precision and scale for decimal coercion in logic comparison

2022-04-14 Thread GitBox
liukun4515 commented on issue #2232: URL: https://github.com/apache/arrow-datafusion/issues/2232#issuecomment-1099868338 https://github.com/apache/arrow-datafusion/pull/1483

[GitHub] [arrow-datafusion] liukun4515 commented on pull request #2235: Add type coercion rule for date + interval

2022-04-14 Thread GitBox
liukun4515 commented on PR #2235: URL: https://github.com/apache/arrow-datafusion/pull/2235#issuecomment-1099867238 The type-coercion rule should be extracted if there are too many `match` or `switch` paths

[GitHub] [arrow] lidavidm commented on a diff in pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
lidavidm commented on code in PR #12872: URL: https://github.com/apache/arrow/pull/12872#discussion_r851066765 ## cpp/src/arrow/compute/light_array.h: ## @@ -0,0 +1,384 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

[GitHub] [arrow] michalursa commented on a diff in pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
michalursa commented on code in PR #12872: URL: https://github.com/apache/arrow/pull/12872#discussion_r851066303 ## cpp/src/arrow/compute/light_array.h: ## @@ -0,0 +1,384 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements

[GitHub] [arrow] lidavidm commented on pull request #12895: ARROW-16043: [C++][Filesystem][S3] Add missing empty content for creating directory

2022-04-14 Thread GitBox
lidavidm commented on PR #12895: URL: https://github.com/apache/arrow/pull/12895#issuecomment-1099863197 Ah, ok, right. Thanks for the explanation.

[GitHub] [arrow] westonpace commented on a diff in pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
westonpace commented on code in PR #12872: URL: https://github.com/apache/arrow/pull/12872#discussion_r851066060 ## cpp/src/arrow/compute/light_array.h: ## @@ -0,0 +1,384 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements

[GitHub] [arrow] kou commented on pull request #12895: ARROW-16043: [C++][Filesystem][S3] Add missing empty content for creating directory

2022-04-14 Thread GitBox
kou commented on PR #12895: URL: https://github.com/apache/arrow/pull/12895#issuecomment-1099862034 Thanks! It seems that the docs say the checksum is optional but don't say anything about whether the body is optional.

[GitHub] [arrow] lidavidm commented on pull request #12895: ARROW-16043: [C++][Filesystem][S3] Add missing empty content for creating directory

2022-04-14 Thread GitBox
lidavidm commented on PR #12895: URL: https://github.com/apache/arrow/pull/12895#issuecomment-1099856247 I may have misunderstood. I was looking at this (emphasis added): > The base64-encoded 128-bit MD5 digest of the message (without the headers) according to RFC 1864. This header ca

[GitHub] [arrow] kou commented on pull request #12895: ARROW-16043: [C++][Filesystem][S3] Add missing empty content for creating directory

2022-04-14 Thread GitBox
kou commented on PR #12895: URL: https://github.com/apache/arrow/pull/12895#issuecomment-1099852467 Sorry. I describe this problem. Recent AWS SDK for C++ adds checksum automatically by https://github.com/aws/aws-sdk-cpp/commit/01e61b3137f90509e71765ee7e3b4a39e2e8de91 . The chec

[GitHub] [arrow] github-actions[bot] commented on pull request #12895: ARROW-16043: [C++][Filesystem][S3] Add missing empty content for creating directory

2022-04-14 Thread GitBox
github-actions[bot] commented on PR #12895: URL: https://github.com/apache/arrow/pull/12895#issuecomment-1099849207 https://issues.apache.org/jira/browse/ARROW-16043

[GitHub] [arrow] ursabot commented on pull request #12112: ARROW-15183: [Python][Docs] Add Missing Dataset Write Options

2022-04-14 Thread GitBox
ursabot commented on PR #12112: URL: https://github.com/apache/arrow/pull/12112#issuecomment-1099827121 Benchmark runs are scheduled for baseline = fc9af3cd3fe9abc792a90217578d0def7b2a9a84 and contender = 931907e94a5a83e78b0f2ede8f520c2f53edcce8. 931907e94a5a83e78b0f2ede8f520c2f53edcce8 is

[GitHub] [arrow] westonpace commented on pull request #12894: ARROW-14911: [C++] arrow-compute-hash-join-node-test failed

2022-04-14 Thread GitBox
westonpace commented on PR #12894: URL: https://github.com/apache/arrow/pull/12894#issuecomment-1099826784 I wasn't getting TSAN errors on this test case (this is before bloom filter)

[GitHub] [arrow] github-actions[bot] commented on pull request #12895: ARROW-16070: [C++][Filesystem][S3] Add missing empty content for creating directory

2022-04-14 Thread GitBox
github-actions[bot] commented on PR #12895: URL: https://github.com/apache/arrow/pull/12895#issuecomment-1099811223 Revision: 8ef2807ac87fdb830cf26998fd4bfa80998ff3a6 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1863](https://github.com/ursacomputing/crossbow/branches/

[GitHub] [arrow] kou commented on pull request #12895: ARROW-16070: [C++][Filesystem][S3] Add missing empty content for creating directory

2022-04-14 Thread GitBox
kou commented on PR #12895: URL: https://github.com/apache/arrow/pull/12895#issuecomment-1099810639 @github-actions crossbow submit homebrew-r-brew

[GitHub] [arrow] github-actions[bot] commented on pull request #12895: ARROW-16070: [C++][Filesystem][S3] Add missing empty content for creating directory

2022-04-14 Thread GitBox
github-actions[bot] commented on PR #12895: URL: https://github.com/apache/arrow/pull/12895#issuecomment-1099809130 https://issues.apache.org/jira/browse/ARROW-16070

[GitHub] [arrow] github-actions[bot] commented on pull request #12895: ARROW-16070: [C++][Filesystem][S3] Add missing empty content for creating directory

2022-04-14 Thread GitBox
github-actions[bot] commented on PR #12895: URL: https://github.com/apache/arrow/pull/12895#issuecomment-1099809146 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] kou opened a new pull request, #12895: ARROW-16070: [C++][Filesystem][S3] Add missing empty content for creating directory

2022-04-14 Thread GitBox
kou opened a new pull request, #12895: URL: https://github.com/apache/arrow/pull/12895 We can't omit body to create a directory.

[GitHub] [arrow] kou commented on a diff in pull request #12881: ARROW-16183: [C++][FlightRPC] Support bundled UCX

2022-04-14 Thread GitBox
kou commented on code in PR #12881: URL: https://github.com/apache/arrow/pull/12881#discussion_r851006778 ## cpp/cmake_modules/ThirdpartyToolchain.cmake: ## @@ -4513,6 +4523,88 @@ if(ARROW_S3) endif() endif() +# -

[GitHub] [arrow] save-buffer commented on pull request #12894: ARROW-14911: [C++] arrow-compute-hash-join-node-test failed

2022-04-14 Thread GitBox
save-buffer commented on PR #12894: URL: https://github.com/apache/arrow/pull/12894#issuecomment-1099776818 Epic detective effort! Does this fix thread sanitizer too?

[GitHub] [arrow] cyb70289 commented on a diff in pull request #12881: ARROW-16183: [C++][FlightRPC] Support bundled UCX

2022-04-14 Thread GitBox
cyb70289 commented on code in PR #12881: URL: https://github.com/apache/arrow/pull/12881#discussion_r850961740 ## cpp/cmake_modules/FindUcx.cmake: ## @@ -0,0 +1,25 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See th

[GitHub] [arrow] ursabot commented on pull request #12749: ARROW-16069: [C++][FlightRPC] Refactor out gRPC error code handling

2022-04-14 Thread GitBox
ursabot commented on PR #12749: URL: https://github.com/apache/arrow/pull/12749#issuecomment-1099765017 Benchmark runs are scheduled for baseline = 62d0b17989bc624abf1a33215b4fba8915512f3b and contender = fc9af3cd3fe9abc792a90217578d0def7b2a9a84. fc9af3cd3fe9abc792a90217578d0def7b2a9a84 is

[GitHub] [arrow] westonpace commented on pull request #12894: ARROW-14911: [C++] arrow-compute-hash-join-node-test failed

2022-04-14 Thread GitBox
westonpace commented on PR #12894: URL: https://github.com/apache/arrow/pull/12894#issuecomment-1099748302 CC @michalursa PTAL

[GitHub] [arrow] github-actions[bot] commented on pull request #12894: ARROW-14911: [C++] arrow-compute-hash-join-node-test failed

2022-04-14 Thread GitBox
github-actions[bot] commented on PR #12894: URL: https://github.com/apache/arrow/pull/12894#issuecomment-1099748054 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] github-actions[bot] commented on pull request #12894: ARROW-14911: [C++] arrow-compute-hash-join-node-test failed

2022-04-14 Thread GitBox
github-actions[bot] commented on PR #12894: URL: https://github.com/apache/arrow/pull/12894#issuecomment-1099748044 https://issues.apache.org/jira/browse/ARROW-14911

[GitHub] [arrow] westonpace opened a new pull request, #12894: ARROW-14911: [C++] arrow-compute-hash-join-node-test failed

2022-04-14 Thread GitBox
westonpace opened a new pull request, #12894: URL: https://github.com/apache/arrow/pull/12894 I identified and reproduced two possible ways this sort of segmentation fault could happen. The stack traces demonstrated that worker tasks were still running for a plan after the test case had co

[GitHub] [arrow] kou commented on a diff in pull request #12888: MINOR: [Docs] Fix link to Julia implementation

2022-04-14 Thread GitBox
kou commented on code in PR #12888: URL: https://github.com/apache/arrow/pull/12888#discussion_r850904161 ## docs/source/index.rst: ## @@ -47,7 +47,7 @@ target environment.** Go Java JavaScript - Julia

[GitHub] [arrow-datafusion] Dandandan opened a new pull request, #2240: Fix join without constraints

2022-04-14 Thread GitBox
Dandandan opened a new pull request, #2240: URL: https://github.com/apache/arrow-datafusion/pull/2240 # Which issue does this PR close? Closes #2230 # Rationale for this change # What changes are included in this PR? Fixes join without on constrain

[GitHub] [arrow-datafusion] Dandandan closed pull request #2239: Fix join

2022-04-14 Thread GitBox
Dandandan closed pull request #2239: Fix join URL: https://github.com/apache/arrow-datafusion/pull/2239

[GitHub] [arrow-datafusion] Dandandan opened a new pull request, #2239: Fix join

2022-04-14 Thread GitBox
Dandandan opened a new pull request, #2239: URL: https://github.com/apache/arrow-datafusion/pull/2239 # Which issue does this PR close? Closes #2230 # Rationale for this change Fixes join without keys. Also adds explicit test # What changes are in

[GitHub] [arrow] wjones127 commented on issue #12892: Arrow install on Databricks cluster takes 10+ minutes

2022-04-14 Thread GitBox
wjones127 commented on issue #12892: URL: https://github.com/apache/arrow/issues/12892#issuecomment-1099682718 Hmm yeah that's odd. Is `focal` not the right OS name for DBR 10.1? Might be worth trying the notebook-scoped command and see if that works. Should at least get better logs.

[GitHub] [arrow] ursabot commented on pull request #12858: ARROW-16167: [JS] refactor get and set visitors

2022-04-14 Thread GitBox
ursabot commented on PR #12858: URL: https://github.com/apache/arrow/pull/12858#issuecomment-1099681087 Benchmark runs are scheduled for baseline = 9d24ded7f7d58717c9d78308b0c59ab7a9636006 and contender = 62d0b17989bc624abf1a33215b4fba8915512f3b. 62d0b17989bc624abf1a33215b4fba8915512f3b is

[GitHub] [arrow] github-actions[bot] commented on pull request #12893: ARROW-16198: [CI][Packaging][Python] Update VCPKG version

2022-04-14 Thread GitBox
github-actions[bot] commented on PR #12893: URL: https://github.com/apache/arrow/pull/12893#issuecomment-1099678157 Revision: 52fc416eacde6de48f550c297709d396efdef0e8 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1862](https://github.com/ursacomputing/crossbow/branches/

[GitHub] [arrow] kszucs commented on pull request #12893: ARROW-16198: [CI][Packaging][Python] Update VCPKG version

2022-04-14 Thread GitBox
kszucs commented on PR #12893: URL: https://github.com/apache/arrow/pull/12893#issuecomment-1099677485 @github-actions crossbow submit *wheel*

[GitHub] [arrow] wjones127 commented on a diff in pull request #12817: ARROW-15168: [R] Add S3 generics to create main Arrow objects

2022-04-14 Thread GitBox
wjones127 commented on code in PR #12817: URL: https://github.com/apache/arrow/pull/12817#discussion_r850805719 ## r/R/record-batch-reader.R: ## @@ -176,3 +176,37 @@ RecordBatchFileReader$create <- function(file) { assert_is(file, "InputStream") ipc___RecordBatchFileReader

[GitHub] [arrow-datafusion] andygrove opened a new issue, #2238: Add SQL query planning support for simple/nested subqueries

2022-04-14 Thread GitBox
andygrove opened a new issue, #2238: URL: https://github.com/apache/arrow-datafusion/issues/2238 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** As a user of the DataFusion library for SQL parsing and logical query planning, I wou

[GitHub] [arrow-datafusion] sunchao commented on issue #2205: RFC: Spill-To-Disk Object Storage Download

2022-04-14 Thread GitBox
sunchao commented on issue #2205: URL: https://github.com/apache/arrow-datafusion/issues/2205#issuecomment-1099675328 FWIW within each Spark task, it currently processes each row group in a sequential manner, and for each of these it'll read all the projected column chunks (with filtered pag

[GitHub] [arrow-datafusion] andygrove commented on issue #2181: [Discuss] Add struct `Query` for datafusion

2022-04-14 Thread GitBox
andygrove commented on issue #2181: URL: https://github.com/apache/arrow-datafusion/issues/2181#issuecomment-1099673457 I just wanted to add a note here to say that I have dedicated time now to help with this effort. I think we can break this work down into phases and the logical starting

[GitHub] [arrow-datafusion] andygrove opened a new issue, #2237: Add SQL query planning support for IN subqueries

2022-04-14 Thread GitBox
andygrove opened a new issue, #2237: URL: https://github.com/apache/arrow-datafusion/issues/2237 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** As a user of the DataFusion library for SQL parsing and logical query planning, I wou

[GitHub] [arrow-datafusion] Dandandan commented on issue #2230: Panic while running inner join with predicate only on single relation

2022-04-14 Thread GitBox
Dandandan commented on issue #2230: URL: https://github.com/apache/arrow-datafusion/issues/2230#issuecomment-1099665076 Reproducing: ``` ❯ create table y as select 1 c; +---+ | c | +---+ | 1 | +---+ 1 row in set. Query took 0.005 seconds. ❯ select * from y t1 j

[GitHub] [arrow] lidavidm commented on a diff in pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
lidavidm commented on code in PR #12872: URL: https://github.com/apache/arrow/pull/12872#discussion_r850827924 ## cpp/src/arrow/compute/light_array.h: ## @@ -0,0 +1,384 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
tustvold commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850819477 ## datafusion/scheduler/src/query.rs: ## @@ -0,0 +1,337 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agree

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
tustvold commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850818283 ## datafusion/scheduler/src/task.rs: ## @@ -0,0 +1,225 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreem

[GitHub] [arrow-datafusion] andygrove opened a new issue, #2236: Implement physical planner support for DATE +/- INTERVAL

2022-04-14 Thread GitBox
andygrove opened a new issue, #2236: URL: https://github.com/apache/arrow-datafusion/issues/2236 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** PR https://github.com/apache/arrow-datafusion/issues/2229 adds support to the logical

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
tustvold commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850815904 ## datafusion/scheduler/src/query.rs: ## @@ -0,0 +1,337 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agree

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
tustvold commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850815433 ## datafusion/scheduler/src/task.rs: ## @@ -0,0 +1,225 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreem

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
tustvold commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850814519 ## datafusion/scheduler/src/task.rs: ## @@ -0,0 +1,225 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreem

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
tustvold commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850814077 ## datafusion/scheduler/src/pipeline/mod.rs: ## @@ -0,0 +1,72 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
tustvold commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850812580 ## datafusion/scheduler/src/query.rs: ## @@ -0,0 +1,337 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agree

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
tustvold commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850811841 ## datafusion/core/Cargo.toml: ## @@ -117,10 +117,6 @@ name = "scalar" harness = false name = "physical_plan" -[[bench]] -harness = false -name = "parquet_

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
tustvold commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850811404 ## datafusion/scheduler/src/lib.rs: ## @@ -0,0 +1,275 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
tustvold commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850811049 ## datafusion/scheduler/src/lib.rs: ## @@ -0,0 +1,275 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

[GitHub] [arrow] james-camacho-ab commented on issue #12892: Arrow install on Databricks cluster takes 10+ minutes

2022-04-14 Thread GitBox
james-camacho-ab commented on issue #12892: URL: https://github.com/apache/arrow/issues/12892#issuecomment-1099628817 @wjones127 I've been letting the install run for a while now but it seems to be stuck on "pending" ![image](https://user-images.githubusercontent.com/85585586/163477279-a

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
tustvold commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850810063 ## datafusion/scheduler/src/lib.rs: ## @@ -0,0 +1,275 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
tustvold commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850809762 ## datafusion/scheduler/src/lib.rs: ## @@ -0,0 +1,275 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

[GitHub] [arrow-datafusion] alamb merged pull request #2224: minor: add editor config file

2022-04-14 Thread GitBox
alamb merged PR #2224: URL: https://github.com/apache/arrow-datafusion/pull/2224

[GitHub] [arrow-datafusion] andygrove opened a new pull request, #2235: Add type coercion rule for date + interval

2022-04-14 Thread GitBox
andygrove opened a new pull request, #2235: URL: https://github.com/apache/arrow-datafusion/pull/2235 Signed-off-by: Andy Grove # Which issue does this PR close? Closes #https://github.com/apache/arrow-datafusion/issues/2229. # Rationale for this change

[GitHub] [arrow-datafusion] alamb commented on issue #2227: Add unit test for case-sensitive SQL identifiers

2022-04-14 Thread GitBox
alamb commented on issue #2227: URL: https://github.com/apache/arrow-datafusion/issues/2227#issuecomment-1099623413 And to be clear, the point is to add an **error** test showing that the query errors when incorrect capitalization is used

[GitHub] [arrow-datafusion] alamb commented on pull request #2231: chore: add `debug!` log in some execution operators

2022-04-14 Thread GitBox
alamb commented on PR #2231: URL: https://github.com/apache/arrow-datafusion/pull/2231#issuecomment-1099617552 > For these type of logs, it might be interesting to see if it makes sense to use the tracing crate for instrumenting the code. We use the tracing crate in IOx and i

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #2226: Introduce new optional scheduler, using Morsel-driven Parallelism + rayon (#2199)

2022-04-14 Thread GitBox
alamb commented on code in PR #2226: URL: https://github.com/apache/arrow-datafusion/pull/2226#discussion_r850764524 ## datafusion/scheduler/src/lib.rs: ## @@ -0,0 +1,275 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements

[GitHub] [arrow-julia] codecov-commenter commented on pull request #318: Add a missing release process to update ASF's report database

2022-04-14 Thread GitBox
codecov-commenter commented on PR #318: URL: https://github.com/apache/arrow-julia/pull/318#issuecomment-1099613280 # [Codecov](https://codecov.io/gh/apache/arrow-julia/pull/318?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apach

[GitHub] [arrow] lidavidm commented on pull request #12891: ARROW-12659: [C++] Support is_valid as a guarantee

2022-04-14 Thread GitBox
lidavidm commented on PR #12891: URL: https://github.com/apache/arrow/pull/12891#issuecomment-1099607810 This needs to return a datum of the right type: https://github.com/apache/arrow/blob/63d2a9c856969a2c05e12ae8857a135bceaf45c1/cpp/src/arrow/compute/exec/expression.cc#L632-L640

[GitHub] [arrow-julia] codecov-commenter commented on pull request #317: Fix wrong RC verification path

2022-04-14 Thread GitBox
codecov-commenter commented on PR #317: URL: https://github.com/apache/arrow-julia/pull/317#issuecomment-1099607686 # [Codecov](https://codecov.io/gh/apache/arrow-julia/pull/317?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apach

[GitHub] [arrow-julia] codecov-commenter commented on pull request #316: Use a large RC number for CI

2022-04-14 Thread GitBox
codecov-commenter commented on PR #316: URL: https://github.com/apache/arrow-julia/pull/316#issuecomment-1099606047 # [Codecov](https://codecov.io/gh/apache/arrow-julia/pull/316?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apach

[GitHub] [arrow] lidavidm commented on pull request #12891: ARROW-12659: [C++] Support is_valid as a guarantee

2022-04-14 Thread GitBox
lidavidm commented on PR #12891: URL: https://github.com/apache/arrow/pull/12891#issuecomment-1099606380 Hmm, it's crashing because `A == null[string]` is getting converted to `null[string]` by `FoldConstants`. But FilterNode assumes the mask has to be boolean.

[GitHub] [arrow-julia] codecov-commenter commented on pull request #315: Remove old releases and RCs on a new release

2022-04-14 Thread GitBox
codecov-commenter commented on PR #315: URL: https://github.com/apache/arrow-julia/pull/315#issuecomment-1099605623 # [Codecov](https://codecov.io/gh/apache/arrow-julia/pull/315?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apach

[GitHub] [arrow] westonpace commented on pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
westonpace commented on PR #12872: URL: https://github.com/apache/arrow/pull/12872#issuecomment-1099604844 @pitrou @lidavidm I'd appreciate your eyes on this if you have time. I think these utilities could be generally useful outside of hash-join and it might be nice to have some more eyes

[GitHub] [arrow] westonpace commented on a diff in pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
westonpace commented on code in PR #12872: URL: https://github.com/apache/arrow/pull/12872#discussion_r850787225 ## cpp/src/arrow/compute/light_array.h: ## @@ -0,0 +1,384 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements

[GitHub] [arrow-julia] kou opened a new pull request, #318: Add a missing release process to update ASF's report database

2022-04-14 Thread GitBox
kou opened a new pull request, #318: URL: https://github.com/apache/arrow-julia/pull/318 fix #311

[GitHub] [arrow-julia] kou opened a new pull request, #317: Fix wrong RC verification path

2022-04-14 Thread GitBox
kou opened a new pull request, #317: URL: https://github.com/apache/arrow-julia/pull/317 fix #313

[GitHub] [arrow-julia] kou opened a new pull request, #316: Use a large RC number for CI

2022-04-14 Thread GitBox
kou opened a new pull request, #316: URL: https://github.com/apache/arrow-julia/pull/316 fix #314 This is for avoiding verify release RC CI job failures with RC1 commit.

[GitHub] [arrow] jonkeane closed pull request #12887: ARROW-16201: [R] SafeCallIntoR on 3.4

2022-04-14 Thread GitBox
jonkeane closed pull request #12887: ARROW-16201: [R] SafeCallIntoR on 3.4 URL: https://github.com/apache/arrow/pull/12887

[GitHub] [arrow-julia] kou commented on pull request #315: Remove old releases and RCs on a new release

2022-04-14 Thread GitBox
kou commented on PR #315: URL: https://github.com/apache/arrow-julia/pull/315#issuecomment-1099593903 I used this to publish 2.3.0.

[GitHub] [arrow-julia] kou opened a new pull request, #315: Remove old releases and RCs on a new release

2022-04-14 Thread GitBox
kou opened a new pull request, #315: URL: https://github.com/apache/arrow-julia/pull/315 fix #307

[GitHub] [arrow-julia] kou commented on pull request #312: Bump version to 2.3.0

2022-04-14 Thread GitBox
kou commented on PR #312: URL: https://github.com/apache/arrow-julia/pull/312#issuecomment-1099592486 Thanks!

[GitHub] [arrow] assignUser commented on pull request #12893: ARROW-16198: [CI][Packaging][Python] Update VCPKG version

2022-04-14 Thread GitBox
assignUser commented on PR #12893: URL: https://github.com/apache/arrow/pull/12893#issuecomment-1099585461 @kszucs could you take a look if this is correct?

[GitHub] [arrow] assignUser commented on a diff in pull request #12893: ARROW-16198: [CI][Packaging][Python] Update VCPKG version

2022-04-14 Thread GitBox
assignUser commented on code in PR #12893: URL: https://github.com/apache/arrow/pull/12893#discussion_r850772166 ## ci/vcpkg/ports.patch: ## @@ -174,23 +32,11 @@ index 2d6bba4d..0ac47887 100644 PATCHES patch-relocatable-rpath.patch fix-aws-root.patch +

[GitHub] [arrow] assignUser commented on a diff in pull request #12893: ARROW-16198: [CI][Packaging][Python] Update VCPKG version

2022-04-14 Thread GitBox
assignUser commented on code in PR #12893: URL: https://github.com/apache/arrow/pull/12893#discussion_r850771825 ## ci/vcpkg/ports.patch: ## @@ -1,145 +1,3 @@ -diff --git a/ports/aws-c-auth/vcpkg.json b/ports/aws-c-auth/vcpkg.json -index dc8f75e8..be703324 100644 a/ports/aw

[GitHub] [arrow] github-actions[bot] commented on pull request #12893: ARROW-16198: [CI][Packaging][Python] Update VCPKG version

2022-04-14 Thread GitBox
github-actions[bot] commented on PR #12893: URL: https://github.com/apache/arrow/pull/12893#issuecomment-1099583192 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] github-actions[bot] commented on pull request #12893: ARROW-16198: [CI][Packaging][Python] Update VCPKG version

2022-04-14 Thread GitBox
github-actions[bot] commented on PR #12893: URL: https://github.com/apache/arrow/pull/12893#issuecomment-1099583184 https://issues.apache.org/jira/browse/ARROW-16198

[GitHub] [arrow] westonpace commented on a diff in pull request #12872: ARROW-16166: [C++][Compute] Utilities for assembling join output

2022-04-14 Thread GitBox
westonpace commented on code in PR #12872: URL: https://github.com/apache/arrow/pull/12872#discussion_r850768949 ## cpp/src/arrow/compute/light_array_test.cc: ## @@ -0,0 +1,504 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agre

[GitHub] [arrow] westonpace commented on a diff in pull request #12843: ARROW-16148: [C++] TPC-H generator cleanup

2022-04-14 Thread GitBox
westonpace commented on code in PR #12843: URL: https://github.com/apache/arrow/pull/12843#discussion_r850764806 ## cpp/src/arrow/compute/exec/tpch_node_test.cc: ## @@ -333,24 +343,15 @@ void CountModifiedComments(const Datum& d, int* good_count, int* bad_count) { } } -TE

[GitHub] [arrow-rs] alamb commented on issue #1568: Release Arrow 12.0.0 (next release after 11.1.0)

2022-04-14 Thread GitBox
alamb commented on issue #1568: URL: https://github.com/apache/arrow-rs/issues/1568#issuecomment-1099572942 The PRs are coming fast and furious -- I plan to prepare changelog / release tomorrow (Friday) so if there are any issues that we want to get in (as in not wait another 2 weeks) pleas

[GitHub] [arrow-datafusion] Dandandan commented on pull request #2231: chore: add `debug!` log in some execution operators

2022-04-14 Thread GitBox
Dandandan commented on PR #2231: URL: https://github.com/apache/arrow-datafusion/pull/2231#issuecomment-1099572190 Thanks @NGA-TRAN ! For these type of logs, it might be interesting to see if it makes sense to use the `tracing` crate for instrumenting the code.

[GitHub] [arrow-rs] alamb opened a new issue, #1568: Release Arrow 12.0.0 (next release after 11.1.0)

2022-04-14 Thread GitBox
alamb opened a new issue, #1568: URL: https://github.com/apache/arrow-rs/issues/1568 This is a tracking ticket for the next arrow release after https://crates.io/crates/arrow/11.1.0 was released 2022-04-04 The next release from master contains some significant API changes (new prost/

[GitHub] [arrow] ursabot commented on pull request #12867: ARROW-14168: [R] Warn only once about arrow function differences

2022-04-14 Thread GitBox
ursabot commented on PR #12867: URL: https://github.com/apache/arrow/pull/12867#issuecomment-1099568876 Benchmark runs are scheduled for baseline = 5d5ccebeed155f6d6ee10cefee3cb295fc300c85 and contender = 9d24ded7f7d58717c9d78308b0c59ab7a9636006. 9d24ded7f7d58717c9d78308b0c59ab7a9636006 is

[GitHub] [arrow-datafusion] alamb merged pull request #2231: chore: add `debug!` log in some execution operators

2022-04-14 Thread GitBox
alamb merged PR #2231: URL: https://github.com/apache/arrow-datafusion/pull/2231

[GitHub] [arrow-rs] alamb commented on a diff in pull request #1567: Fix incorrect `into_buffers` for UnionArray

2022-04-14 Thread GitBox
alamb commented on code in PR #1567: URL: https://github.com/apache/arrow-rs/pull/1567#discussion_r850750862 ## arrow/src/compute/kernels/filter.rs: ## @@ -1692,9 +1692,6 @@ mod tests { } #[test] -// Fails when validation enabled -// https://github.com/apache
