[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1633: Add `substring` support for `FixedSizeBinaryArray`

2022-05-01 Thread GitBox
HaoYang670 commented on code in PR #1633: URL: https://github.com/apache/arrow-rs/pull/1633#discussion_r862622056 ## arrow/src/compute/kernels/substring.rs: ## @@ -86,6 +86,52 @@ fn binary_substring( Ok(make_array(data)) } +fn fixed_size_binary_substring( +array: &Fi

[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1633: Add `substring` support for `FixedSizeBinaryArray`

2022-05-01 Thread GitBox
HaoYang670 commented on code in PR #1633: URL: https://github.com/apache/arrow-rs/pull/1633#discussion_r862620323 ## arrow/src/compute/kernels/substring.rs: ## @@ -86,6 +86,52 @@ fn binary_substring( Ok(make_array(data)) } +fn fixed_size_binary_substring( +array: &Fi

[GitHub] [arrow-rs] viirya commented on a diff in pull request #1633: Add `substring` support for `FixedSizeBinaryArray`

2022-05-01 Thread GitBox
viirya commented on code in PR #1633: URL: https://github.com/apache/arrow-rs/pull/1633#discussion_r862611258 ## arrow/src/compute/kernels/substring.rs: ## @@ -249,6 +304,8 @@ mod tests { #[allow(clippy::type_complexity)] fn with_nulls_generic_binary() -> Result<()> {

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1631: Do not assume dictionaries exists in footer

2022-05-01 Thread GitBox
codecov-commenter commented on PR #1631: URL: https://github.com/apache/arrow-rs/pull/1631#issuecomment-1114531612 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1631?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow-rs] viirya commented on a diff in pull request #1633: Add `substring` support for `FixedSizeBinaryArray`

2022-05-01 Thread GitBox
viirya commented on code in PR #1633: URL: https://github.com/apache/arrow-rs/pull/1633#discussion_r862608650 ## arrow/src/compute/kernels/substring.rs: ## @@ -86,6 +86,52 @@ fn binary_substring( Ok(make_array(data)) } +fn fixed_size_binary_substring( +array: &FixedS

[GitHub] [arrow-datafusion] Jimexist commented on issue #2378: Add support for `group by rollup` in SQL query planner and logical plan

2022-05-01 Thread GitBox
Jimexist commented on issue #2378: URL: https://github.com/apache/arrow-datafusion/issues/2378#issuecomment-1114522054 see also: - https://github.com/apache/arrow-datafusion/issues/1327 -- This is an automated message from the Apache Git Service. To respond to the message, please log o

[GitHub] [arrow-rs] viirya commented on pull request #1631: Do not assume dictionaries exists in footer

2022-05-01 Thread GitBox
viirya commented on PR #1631: URL: https://github.com/apache/arrow-rs/pull/1631#issuecomment-1114521629 There are some format errors.

[GitHub] [arrow-rs] viirya commented on a diff in pull request #1631: do not assume footer exists, fixes issue #1335

2022-05-01 Thread GitBox
viirya commented on code in PR #1631: URL: https://github.com/apache/arrow-rs/pull/1631#discussion_r862603471 ## arrow/src/ipc/reader.rs: ## @@ -651,43 +651,48 @@ impl FileReader { // Create an array of optional dictionary value arrays, one per field. let mut

[GitHub] [arrow] kou commented on pull request #12763: ARROW-14892: [Python][C++] GCS Bindings

2022-05-01 Thread GitBox
kou commented on PR #12763: URL: https://github.com/apache/arrow/pull/12763#issuecomment-1114515619 It seems that testbench doesn't work on arm64 macOS because of https://github.com/grpc/grpc/issues/28387 : https://github.com/ursacomputing/crossbow/runs/6252603874?check_suite_focus=t

[GitHub] [arrow] github-actions[bot] commented on pull request #12763: ARROW-14892: [Python][C++] GCS Bindings

2022-05-01 Thread GitBox
github-actions[bot] commented on PR #12763: URL: https://github.com/apache/arrow/pull/12763#issuecomment-1114505328 Revision: 7ce4813c79e6c54f38f5c667aa6c73d9e428045c Submitted crossbow builds: [ursacomputing/crossbow @ actions-1989](https://github.com/ursacomputing/crossbow/branches/

[GitHub] [arrow] kou commented on pull request #12763: ARROW-14892: [Python][C++] GCS Bindings

2022-05-01 Thread GitBox
kou commented on PR #12763: URL: https://github.com/apache/arrow/pull/12763#issuecomment-1114504691 @github-actions crossbow submit wheel-macos-big-sur-cp310-arm64

[GitHub] [arrow] github-actions[bot] commented on pull request #13043: ARROW-15787: [C++] Temporal floor/ceil/round kernels could be optimised with templating

2022-05-01 Thread GitBox
github-actions[bot] commented on PR #13043: URL: https://github.com/apache/arrow/pull/13043#issuecomment-1114496335 https://issues.apache.org/jira/browse/ARROW-15787

[GitHub] [arrow] rok opened a new pull request, #13043: ARROW-15787: [C++] Temporal floor/ceil/round kernels could be optimised with templating

2022-05-01 Thread GitBox
rok opened a new pull request, #13043: URL: https://github.com/apache/arrow/pull/13043 This is to resolve [ARROW-15787](https://issues.apache.org/jira/browse/ARROW-15787), that is, to simplify and optimize temporal rounding kernels by templating. It builds on top of [ARROW-14821](https://g

[GitHub] [arrow-datafusion] andygrove commented on issue #2374: Identifiers are made lower-case in SQL query

2022-05-01 Thread GitBox
andygrove commented on issue #2374: URL: https://github.com/apache/arrow-datafusion/issues/2374#issuecomment-1114493881 @dbr My view is that this is not a bug. If a non-lowercase table or column name is created via the low-level Schema/DataFrame APIs then I would expect to have to use doub

[GitHub] [arrow-datafusion] AdheipSingh commented on issue #2397: Helm Chart to deploy Ballista on kubernetes

2022-05-01 Thread GitBox
AdheipSingh commented on issue #2397: URL: https://github.com/apache/arrow-datafusion/issues/2397#issuecomment-1114488546 Thanks @andygrove, I will soon submit a PR for the Helm charts, with documentation.

[GitHub] [arrow-datafusion] andygrove commented on issue #2397: Helm Chart to deploy Ballista on kubernetes

2022-05-01 Thread GitBox
andygrove commented on issue #2397: URL: https://github.com/apache/arrow-datafusion/issues/2397#issuecomment-1114484305 I would love to see official docker images for DataFusion/Ballista as well as supported k8s / helm deployments. We should at least have documentation in the user guide fo

[GitHub] [arrow] ursabot commented on pull request #10883: ARROW-7272: [C++][Java][Dataset] JNI bridge between RecordBatch and VectorSchemaRoot

2022-05-01 Thread GitBox
ursabot commented on PR #10883: URL: https://github.com/apache/arrow/pull/10883#issuecomment-1114481878 Benchmark runs are scheduled for baseline = d46442549d3ff9c4c20646196d02e9cf5ad68c12 and contender = dc97883dee25ba8da55c7591060c44de2ea00865. dc97883dee25ba8da55c7591060c44de2ea00865 is

[GitHub] [arrow] ursabot commented on pull request #12990: ARROW-16292: [Java][Doc] Upgrade java documentation for JSE17/JSE18

2022-05-01 Thread GitBox
ursabot commented on PR #12990: URL: https://github.com/apache/arrow/pull/12990#issuecomment-1114414489 Benchmark runs are scheduled for baseline = cb41e069b51d65231cd8dde8bd19ee4fcc4eb9b1 and contender = d46442549d3ff9c4c20646196d02e9cf5ad68c12. d46442549d3ff9c4c20646196d02e9cf5ad68c12 is

[GitHub] [arrow] rok commented on a diff in pull request #13037: ARROW-16425: [C++] Add compute kernel test for scalar array timestamp comparison

2022-05-01 Thread GitBox
rok commented on code in PR #13037: URL: https://github.com/apache/arrow/pull/13037#discussion_r862551524 ## cpp/src/arrow/compute/kernels/scalar_compare_test.cc: ## @@ -502,6 +502,53 @@ TEST(TestCompareTimestamps, DifferentParameters) { } } +TEST(TestCompareTimestamps, Sc

[GitHub] [arrow-rs] HaoYang670 opened a new issue, #1638: Question: Why are there 3 types of `OffsetSizeTrait`s?

2022-05-01 Thread GitBox
HaoYang670 opened a new issue, #1638: URL: https://github.com/apache/arrow-rs/issues/1638 **Which part is this question about** In our code, there are 3 different `OffsetSizeTrait`: `OffsetSizeTrait`, `BinaryOffsetSizeTrait` and `StringOffsetSizeTrait`. The only difference between them i

[GitHub] [arrow-datafusion] yjshen merged pull request #2388: Re-organize and rename aggregates physical plan

2022-05-01 Thread GitBox
yjshen merged PR #2388: URL: https://github.com/apache/arrow-datafusion/pull/2388

[GitHub] [arrow-datafusion] yjshen closed issue #2387: Re-organize and rename aggregates physical plan

2022-05-01 Thread GitBox
yjshen closed issue #2387: Re-organize and rename aggregates physical plan URL: https://github.com/apache/arrow-datafusion/issues/2387

[GitHub] [arrow-datafusion] andygrove closed issue #2399: remove duplicated `fn aggregate()` in aggregate expression tests

2022-05-01 Thread GitBox
andygrove closed issue #2399: remove duplicated `fn aggregate()` in aggregate expression tests URL: https://github.com/apache/arrow-datafusion/issues/2399

[GitHub] [arrow-datafusion] andygrove merged pull request #2400: remove duplicated `fn aggregate()` in aggregate expression tests

2022-05-01 Thread GitBox
andygrove merged PR #2400: URL: https://github.com/apache/arrow-datafusion/pull/2400

[GitHub] [arrow-datafusion] andygrove merged pull request #2401: docs: update the renamed project Flock (Squirtle)

2022-05-01 Thread GitBox
andygrove merged PR #2401: URL: https://github.com/apache/arrow-datafusion/pull/2401

[GitHub] [arrow] ursabot commented on pull request #12982: ARROW-16311: [Java] Do not return table_schema column when it's not requested

2022-05-01 Thread GitBox
ursabot commented on PR #12982: URL: https://github.com/apache/arrow/pull/12982#issuecomment-1114353318 ['Python', 'R'] benchmarks have high level of regressions. [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/1fd49b7e01564f6a8a0a2d94d7df8a21...6eb700c493f14d62b3488c6ad6c42ce7/)

[GitHub] [arrow] ursabot commented on pull request #12982: ARROW-16311: [Java] Do not return table_schema column when it's not requested

2022-05-01 Thread GitBox
ursabot commented on PR #12982: URL: https://github.com/apache/arrow/pull/12982#issuecomment-1114353292 Benchmark runs are scheduled for baseline = 3e56a949c73168d789612a9022cdceae6088e2b7 and contender = cb41e069b51d65231cd8dde8bd19ee4fcc4eb9b1. cb41e069b51d65231cd8dde8bd19ee4fcc4eb9b1 is

[GitHub] [arrow-rs] tfeda opened a new issue, #1637: UnionBuilder produces incorrect Union DataType

2022-05-01 Thread GitBox
tfeda opened a new issue, #1637: URL: https://github.com/apache/arrow-rs/issues/1637 **Describe the bug** The Union DataType produced by UnionBuilder has non-nullable children Fields after appending nulls in the builder. **To Reproduce** Steps to reproduce the behavior: Try the

[GitHub] [arrow-datafusion] jon-chuang commented on issue #2248: [EPIC] Subquery support

2022-05-01 Thread GitBox
jon-chuang commented on issue #2248: URL: https://github.com/apache/arrow-datafusion/issues/2248#issuecomment-1114332289 Some additional ideas for subquery optimizations: - If rest of subquery is not correlated, push up correlated filter (in particular, correlated equality filter) into a

[GitHub] [arrow] kou closed pull request #13039: ARROW-16428: [Release] Add prefix to ENV variables

2022-05-01 Thread GitBox
kou closed pull request #13039: ARROW-16428: [Release] Add prefix to ENV variables URL: https://github.com/apache/arrow/pull/13039

[GitHub] [arrow] ursabot commented on pull request #12581: ARROW-15940: [Gandiva][C++] Add NEGATIVE function for decimal data type

2022-05-01 Thread GitBox
ursabot commented on PR #12581: URL: https://github.com/apache/arrow/pull/12581#issuecomment-1114320202 Benchmark runs are scheduled for baseline = 05e09f6414359834325a89cebbf3bf2f525e4e2a and contender = 3e56a949c73168d789612a9022cdceae6088e2b7. 3e56a949c73168d789612a9022cdceae6088e2b7 is

[GitHub] [arrow] ursabot commented on pull request #12996: ARROW-14651: [Release][Archery] Add support for retrying download

2022-05-01 Thread GitBox
ursabot commented on PR #12996: URL: https://github.com/apache/arrow/pull/12996#issuecomment-1114320196 Benchmark runs are scheduled for baseline = 3c75b17c405d2fb47167a27a8fb5fd356df9f9ba and contender = 05e09f6414359834325a89cebbf3bf2f525e4e2a. 05e09f6414359834325a89cebbf3bf2f525e4e2a is

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1636: Fix generate_nested_dictionary_case integration test failure for Rust cases

2022-05-01 Thread GitBox
codecov-commenter commented on PR #1636: URL: https://github.com/apache/arrow-rs/pull/1636#issuecomment-1114318356 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1636?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow-rs] viirya commented on a diff in pull request #1636: Fix generate_nested_dictionary_case integration test failure for Rust cases

2022-05-01 Thread GitBox
viirya commented on code in PR #1636: URL: https://github.com/apache/arrow-rs/pull/1636#discussion_r862511828 ## integration-testing/src/lib.rs: ## @@ -640,6 +645,7 @@ fn dictionary_array_from_json( dict_key: &DataType, dict_value: &DataType, dictionary: &ArrowJso

[GitHub] [arrow-rs] viirya commented on a diff in pull request #1636: Fix generate_nested_dictionary_case integration test failure for Rust cases

2022-05-01 Thread GitBox
viirya commented on code in PR #1636: URL: https://github.com/apache/arrow-rs/pull/1636#discussion_r862511688 ## integration-testing/src/lib.rs: ## @@ -640,6 +645,7 @@ fn dictionary_array_from_json( dict_key: &DataType, dict_value: &DataType, dictionary: &ArrowJso

[GitHub] [arrow-datafusion] timvw commented on issue #2393: Allow user to use glob/wildcard in file path

2022-05-01 Thread GitBox
timvw commented on issue #2393: URL: https://github.com/apache/arrow-datafusion/issues/2393#issuecomment-1114314986 The thing is, datafusion-objectstore-s3 currently already supports (some?) globbing, e.g. when I update a test in s3.rs to use globbing instead of a filename, it keeps

[GitHub] [arrow-rs] viirya commented on a diff in pull request #1636: Fix generate_nested_dictionary_case integration test failure for Rust cases

2022-05-01 Thread GitBox
viirya commented on code in PR #1636: URL: https://github.com/apache/arrow-rs/pull/1636#discussion_r862511593 ## arrow/src/ipc/reader.rs: ## @@ -563,16 +566,10 @@ pub fn read_dictionary( ArrowError::InvalidArgumentError("dictionary id not found in schema".to_string())

[GitHub] [arrow-rs] viirya opened a new pull request, #1636: Fix ipc nested dict

2022-05-01 Thread GitBox
viirya opened a new pull request, #1636: URL: https://github.com/apache/arrow-rs/pull/1636 # Which issue does this PR close? Closes #1635. # Rationale for this change # What changes are included in this PR? # Are there any user-facing chan

[GitHub] [arrow-rs] viirya opened a new issue, #1635: Fix generate_nested_dictionary_case integration test failure for Rust cases

2022-05-01 Thread GitBox
viirya opened a new issue, #1635: URL: https://github.com/apache/arrow-rs/issues/1635 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Part of #1404. This is for `generate_nested_dictionary_case`, Rust producing, Rust consumin

[GitHub] [arrow-datafusion] timvw commented on issue #2393: Allow user to use glob/wildcard in file path

2022-05-01 Thread GitBox
timvw commented on issue #2393: URL: https://github.com/apache/arrow-datafusion/issues/2393#issuecomment-1114308038 Your suggestion to make this explicit on ObjectStore makes a lot of sense (was not aware of the other implementations and the datafusion-contrib project till now).

[GitHub] [arrow-datafusion] timvw commented on pull request #2394: Issue 2393: Support glob patterns for (local) file(s)

2022-05-01 Thread GitBox
timvw commented on PR #2394: URL: https://github.com/apache/arrow-datafusion/pull/2394#issuecomment-1114306230 Had a look at the implementation of tokio fs operations, and copied their (non-public) asyncify method to run the globbing on...
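
The idea in the comment above (running blocking glob expansion off the async executor) can be sketched as follows. This is a Python analogue under stated assumptions, not the Rust code from the PR: `glob_async` and `demo` are hypothetical names, and `asyncio.to_thread` stands in for the role the comment attributes to tokio's non-public asyncify helper.

```python
import asyncio
import glob
import os
import tempfile


async def glob_async(pattern: str) -> list[str]:
    """Expand a glob pattern without blocking the event loop."""
    # glob.glob performs blocking filesystem calls, so hand it to a
    # worker thread (asyncio.to_thread requires Python 3.9+).
    matches = await asyncio.to_thread(glob.glob, pattern)
    return sorted(matches)


def demo() -> list[str]:
    """Create a few files in a temp directory and glob for the CSV ones."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.csv", "b.csv", "c.txt"):
            with open(os.path.join(tmp, name), "w") as f:
                f.write("x\n")
        paths = asyncio.run(glob_async(os.path.join(tmp, "*.csv")))
        return [os.path.basename(p) for p in paths]
```

Keeping all filesystem access behind one such wrapper is one way to avoid mixing blocking and non-blocking IO in the same code path.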

[GitHub] [arrow] pitrou commented on pull request #12055: ARROW-11989: [C++][Python] Improve ChunkedArray's complexity for the access of elements

2022-05-01 Thread GitBox
pitrou commented on PR #12055: URL: https://github.com/apache/arrow/pull/12055#issuecomment-1114300296 It's not related because the runtimes are incompatible with the CPU time needed for calculating chunk indices. That said, it's worth [opening an issue](https://arrow.apache.org/docs/develo

[GitHub] [arrow] eelxpeng commented on pull request #12055: ARROW-11989: [C++][Python] Improve ChunkedArray's complexity for the access of elements

2022-05-01 Thread GitBox
eelxpeng commented on PR #12055: URL: https://github.com/apache/arrow/pull/12055#issuecomment-1114299709 @pitrou How is it not related to this PR? In this case, 1 chunk is significantly faster than 488 chunks. I'm accessing via EFS (https://aws.amazon.com/efs/).

[GitHub] [arrow] eelxpeng commented on pull request #12055: ARROW-11989: [C++][Python] Improve ChunkedArray's complexity for the access of elements

2022-05-01 Thread GitBox
eelxpeng commented on PR #12055: URL: https://github.com/apache/arrow/pull/12055#issuecomment-1114298947 @edponce With the previous Arrow version (7.0.0), before this patch, the time is 1510 seconds.

[GitHub] [arrow] pitrou commented on pull request #12055: ARROW-11989: [C++][Python] Improve ChunkedArray's complexity for the access of elements

2022-05-01 Thread GitBox
pitrou commented on PR #12055: URL: https://github.com/apache/arrow/pull/12055#issuecomment-1114297975 > Not sure about why. I have a large arrow file with 115545630 rows and 488 chunks. Even using the fix in this pr, the time to randomly access 1000 rows is 2102 seconds This sounds

[GitHub] [arrow-datafusion] WinkerDu commented on pull request #2400: remove duplicated `fn aggregate()` in aggregate expression tests

2022-05-01 Thread GitBox
WinkerDu commented on PR #2400: URL: https://github.com/apache/arrow-datafusion/pull/2400#issuecomment-1114296882 cc @andygrove @alamb @yjshen Please review, thank you!

[GitHub] [arrow-datafusion] tustvold commented on issue #2393: Allow user to use glob/wildcard in file path

2022-05-01 Thread GitBox
tustvold commented on issue #2393: URL: https://github.com/apache/arrow-datafusion/issues/2393#issuecomment-1114294530 I'm not sure how I feel about having the local object store treat paths differently from the other implementations. Perhaps we should consistently support glob expressions

[GitHub] [arrow-datafusion] tustvold commented on pull request #2394: Issue 2393: Support glob patterns for (local) file(s)

2022-05-01 Thread GitBox
tustvold commented on PR #2394: URL: https://github.com/apache/arrow-datafusion/pull/2394#issuecomment-1114289132 I think the code now mixes blocking IO, as performed by glob, and non-blocking IO as performed by tokio::fs. I think we should consistently use one or the other...

[GitHub] [arrow] ursabot commented on pull request #12995: MINOR: [Release] Use temporary directory within the docker container while uploading binaries

2022-05-01 Thread GitBox
ursabot commented on PR #12995: URL: https://github.com/apache/arrow/pull/12995#issuecomment-1114285395 Benchmark runs are scheduled for baseline = d2fa3ad1e8029b1827554fd7a157c2433cf5ebc6 and contender = 3c75b17c405d2fb47167a27a8fb5fd356df9f9ba. 3c75b17c405d2fb47167a27a8fb5fd356df9f9ba is

[GitHub] [arrow-datafusion] WinkerDu opened a new pull request, #2400: remove duplicated `fn aggregate()` in aggregate expression tests

2022-05-01 Thread GitBox
WinkerDu opened a new pull request, #2400: URL: https://github.com/apache/arrow-datafusion/pull/2400 # Which issue does this PR close? Closes #2399. # Rationale for this change There are duplicated `fn aggregate()` implementations in multiple aggregate expressi

[GitHub] [arrow-datafusion] WinkerDu opened a new issue, #2399: remove duplicated `fn aggregate()` in aggregate expression tests

2022-05-01 Thread GitBox
WinkerDu opened a new issue, #2399: URL: https://github.com/apache/arrow-datafusion/issues/2399 There are duplicated `fn aggregate()` implementations in multiple aggregate expressions like `sum.rs`, `average.rs`, `correlation.rs`, etc. We can remove them and `use crate::expressions::tests

[GitHub] [arrow] shefali163 commented on a diff in pull request #12914: ARROW-2034: [C++] Filesystem implementation for Azure Blob Storage

2022-05-01 Thread GitBox
shefali163 commented on code in PR #12914: URL: https://github.com/apache/arrow/pull/12914#discussion_r862498378 ## cpp/src/arrow/filesystem/azure/azurefs.cc: ## @@ -0,0 +1,1651 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agr

[GitHub] [arrow] rtpsw commented on pull request #13037: ARROW-16425: [C++] Add compute kernel test for scalar array timestamp comparison

2022-05-01 Thread GitBox
rtpsw commented on PR #13037: URL: https://github.com/apache/arrow/pull/13037#issuecomment-1114281826 The build errors don't seem to be related to the change. Please advise how to proceed.

[GitHub] [arrow] edponce commented on pull request #12055: ARROW-11989: [C++][Python] Improve ChunkedArray's complexity for the access of elements

2022-05-01 Thread GitBox
edponce commented on PR #12055: URL: https://github.com/apache/arrow/pull/12055#issuecomment-1114281714 I can set up benchmarks this week to investigate further performance with varying chunk sizes.

[GitHub] [arrow-datafusion] jackwener opened a new pull request, #2398: Replace Union.alias with SubqueryAlias

2022-05-01 Thread GitBox
jackwener opened a new pull request, #2398: URL: https://github.com/apache/arrow-datafusion/pull/2398 # Which issue does this PR close? Closes #2213. # What changes are included in this PR? Replace Union.alias with SubqueryAlias # Are there any user-facing changes?

[GitHub] [arrow] edponce commented on pull request #12055: ARROW-11989: [C++][Python] Improve ChunkedArray's complexity for the access of elements

2022-05-01 Thread GitBox
edponce commented on PR #12055: URL: https://github.com/apache/arrow/pull/12055#issuecomment-1114279295 Hi @eelxpeng, the implementation in this PR caches the previous chunk used, so it is expected to be faster than randomly accessing chunks where chunk caching operations add a (small) pena
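
The chunk caching described in the comment above can be illustrated with a minimal sketch. It is not the Arrow C++ implementation: the class name `ChunkResolver`, the offsets table, and the binary-search fallback are assumptions chosen to show the idea of remembering the last chunk used.

```python
import bisect
from itertools import accumulate


class ChunkResolver:
    """Map a logical index to (chunk index, index within chunk)."""

    def __init__(self, chunk_lengths):
        # offsets[i] is the logical index of the first element of chunk i;
        # the last entry is the total number of elements.
        self.offsets = [0, *accumulate(chunk_lengths)]
        self.cached_chunk = 0

    def resolve(self, index):
        if not 0 <= index < self.offsets[-1]:
            raise IndexError(index)
        c = self.cached_chunk
        # Fast path: the index falls in the chunk used by the previous lookup.
        if self.offsets[c] <= index < self.offsets[c + 1]:
            return c, index - self.offsets[c]
        # Slow path: binary search over the offsets, then refresh the cache.
        c = bisect.bisect_right(self.offsets, index) - 1
        self.cached_chunk = c
        return c, index - self.offsets[c]
```

With sequential or clustered access most lookups take the fast path. With uniformly random access almost every lookup pays for the failed cache check plus the binary search, which matches the small penalty mentioned in the comment.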

[GitHub] [arrow-datafusion] andygrove merged pull request #2391: minor: SchemaError code cleanup and improvements

2022-05-01 Thread GitBox
andygrove merged PR #2391: URL: https://github.com/apache/arrow-datafusion/pull/2391

[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2389: Support struct_expr generate struct in sql

2022-05-01 Thread GitBox
andygrove commented on code in PR #2389: URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r862485794 ## datafusion/physical-expr/src/struct_expressions.rs: ## @@ -0,0 +1,109 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2389: Support struct_expr generate struct in sql

2022-05-01 Thread GitBox
andygrove commented on code in PR #2389: URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r862485635 ## datafusion/expr/src/function.rs: ## @@ -222,6 +222,14 @@ pub fn return_type( _ => Ok(DataType::Float64), }, +BuiltinScalarF

[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2389: Support struct_expr generate struct in sql

2022-05-01 Thread GitBox
andygrove commented on code in PR #2389: URL: https://github.com/apache/arrow-datafusion/pull/2389#discussion_r862485429 ## datafusion/core/tests/sql/expr.rs: ## @@ -510,6 +510,23 @@ async fn test_array_literals() -> Result<()> { Ok(()) } +#[tokio::test] +async fn test_s

[GitHub] [arrow] ursabot commented on pull request #12994: MINOR: [Release] Install gpg in all ubuntu verification images

2022-05-01 Thread GitBox
ursabot commented on PR #12994: URL: https://github.com/apache/arrow/pull/12994#issuecomment-1114251556 Benchmark runs are scheduled for baseline = 4010916f864783b2a8e7a7d1a9d0187060ec47e7 and contender = d2fa3ad1e8029b1827554fd7a157c2433cf5ebc6. d2fa3ad1e8029b1827554fd7a157c2433cf5ebc6 is

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #2388: Re-organize and rename aggregates physical plan

2022-05-01 Thread GitBox
alamb commented on code in PR #2388: URL: https://github.com/apache/arrow-datafusion/pull/2388#discussion_r862470943 ## ballista/rust/core/src/serde/physical_plan/mod.rs: ## @@ -306,19 +306,21 @@ impl AsExecutionPlan for PhysicalPlanNode { Arc::new((&input_s

[GitHub] [arrow] ursabot commented on pull request #12935: ARROW-16225: [C++][Parquet] Fix length of encryption AAD random byte generation

2022-05-01 Thread GitBox
ursabot commented on PR #12935: URL: https://github.com/apache/arrow/pull/12935#issuecomment-1114211985 Benchmark runs are scheduled for baseline = d85e9024b52e131f8dacde064962d53b04697a6f and contender = 4010916f864783b2a8e7a7d1a9d0187060ec47e7. 4010916f864783b2a8e7a7d1a9d0187060ec47e7 is

[GitHub] [arrow-datafusion] AdheipSingh opened a new issue, #2397: Helm Chart to deploy Ballista on kubernetes

2022-05-01 Thread GitBox
AdheipSingh opened a new issue, #2397: URL: https://github.com/apache/arrow-datafusion/issues/2397 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** - Ballista has a scheduler and executor, which need PVCs. - Currently it is

[GitHub] [arrow] ursabot commented on pull request #12960: ARROW-16283: [Go] Cleanup panics in new Buffered Reader

2022-05-01 Thread GitBox
ursabot commented on PR #12960: URL: https://github.com/apache/arrow/pull/12960#issuecomment-1114175574 Benchmark runs are scheduled for baseline = 3592f988b2aefab05357231378a539fc1766c4ec and contender = d85e9024b52e131f8dacde064962d53b04697a6f. d85e9024b52e131f8dacde064962d53b04697a6f is