Re: [PR] fix: CometExec's outputPartitioning might not be same as Spark expects after AQE interferes [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on code in PR #299: URL: https://github.com/apache/datafusion-comet/pull/299#discussion_r157455 ## spark/src/main/scala/org/apache/spark/sql/comet/operators.scala: ## @@ -377,7 +383,8 @@ case class CometProjectExec( override val output: Seq[Attribute],

Re: [PR] fix: CometExec's outputPartitioning might not be same as Spark expects after AQE interferes [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on code in PR #299: URL: https://github.com/apache/datafusion-comet/pull/299#discussion_r1574890396 ## spark/src/main/scala/org/apache/spark/sql/comet/operators.scala: ## @@ -586,7 +607,8 @@ case class CometHashAggregateExec( mode: Option[AggregateMode],

Re: [PR] feat: Support Variance [datafusion-comet]

2024-04-22 Thread via GitHub
huaxingao commented on PR #297: URL: https://github.com/apache/datafusion-comet/pull/297#issuecomment-2069835976 cc @andygrove @viirya -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] feat: Add extended explain info to Comet plan [datafusion-comet]

2024-04-22 Thread via GitHub
parthchandra commented on code in PR #255: URL: https://github.com/apache/datafusion-comet/pull/255#discussion_r1574880087 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -128,6 +158,26 @@ class CometSparkSessionExtensions _) if

Re: [I] Remove "Execution error: " prefix from error messages from Rust [datafusion-comet]

2024-04-22 Thread via GitHub
comphead commented on issue #293: URL: https://github.com/apache/datafusion-comet/issues/293#issuecomment-2070160145 My naive approach was to fix the formatter and for some alignment case the prefix won't be added ``` println!("{}", err) => Execution error: err println!("{:>}", er

Re: [PR] feat: Add extended explain info to Comet plan [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on PR #255: URL: https://github.com/apache/datafusion-comet/pull/255#issuecomment-2070462405 Thank you @parthchandra. I will merge this once CI passes.

Re: [PR] build: bump spark version to 3.4.3 [datafusion-comet]

2024-04-22 Thread via GitHub
huaxingao closed pull request #292: build: bump spark version to 3.4.3 URL: https://github.com/apache/datafusion-comet/pull/292

[PR] build: bump spark version to 3.4.3 [datafusion-comet]

2024-04-22 Thread via GitHub
huaxingao opened a new pull request, #292: URL: https://github.com/apache/datafusion-comet/pull/292 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes t

Re: [PR] fix: CometExec's outputPartitioning might not be same as Spark expects after AQE interferes [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on code in PR #299: URL: https://github.com/apache/datafusion-comet/pull/299#discussion_r1575235651 ## spark/src/main/scala/org/apache/spark/sql/comet/plans/PartitioningPreservingUnaryExecNode.scala: ## @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Fo

[PR] doc: Update DataFusion project name and url [datafusion-comet]

2024-04-22 Thread via GitHub
viirya opened a new pull request, #300: URL: https://github.com/apache/datafusion-comet/pull/300 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes test

Re: [PR] doc: Update DataFusion project name and url [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on PR #300: URL: https://github.com/apache/datafusion-comet/pull/300#issuecomment-2070684612 cc @sunchao @andygrove

Re: [PR] feat: Add extended explain info to Comet plan [datafusion-comet]

2024-04-22 Thread via GitHub
parthchandra commented on PR #255: URL: https://github.com/apache/datafusion-comet/pull/255#issuecomment-2070779215 Thanks for the review @viirya @andygrove @advancedxy

Re: [I] Support extended explain plan information [datafusion-comet]

2024-04-22 Thread via GitHub
viirya closed issue #253: Support extended explain plan information URL: https://github.com/apache/datafusion-comet/issues/253

Re: [PR] feat: Add extended explain info to Comet plan [datafusion-comet]

2024-04-22 Thread via GitHub
viirya merged PR #255: URL: https://github.com/apache/datafusion-comet/pull/255

Re: [PR] feat: Add extended explain info to Comet plan [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on PR #255: URL: https://github.com/apache/datafusion-comet/pull/255#issuecomment-2070833278 Merged. Thanks all.

Re: [PR] doc: Update DataFusion project name and url [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on PR #300: URL: https://github.com/apache/datafusion-comet/pull/300#issuecomment-2070834764 Thank you @kazuyukitanimura

Re: [I] [EPIC] Spark-compatible cast / try_cast operations [datafusion-comet]

2024-04-22 Thread via GitHub
andygrove commented on issue #286: URL: https://github.com/apache/datafusion-comet/issues/286#issuecomment-2070843903 I am now working on cast string -> integral types. I will have a PR up later this week.

Re: [PR] doc: Update DataFusion project name and url [datafusion-comet]

2024-04-22 Thread via GitHub
viirya merged PR #300: URL: https://github.com/apache/datafusion-comet/pull/300

Re: [PR] doc: Update DataFusion project name and url [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on PR #300: URL: https://github.com/apache/datafusion-comet/pull/300#issuecomment-2070860204 Merged. Thanks.

Re: [PR] feat: Support Variance [datafusion-comet]

2024-04-22 Thread via GitHub
parthchandra commented on code in PR #297: URL: https://github.com/apache/datafusion-comet/pull/297#discussion_r1575356501 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -426,6 +426,42 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde

[PR] chore: Remove unused functions [datafusion-comet]

2024-04-22 Thread via GitHub
kazuyukitanimura opened a new pull request, #301: URL: https://github.com/apache/datafusion-comet/pull/301 ## Which issue does this PR close? Closes #. ## Rationale for this change To clean up unused pub functions before the first release. ## What c

[PR] fix: Iceberg scan transition should be in front of other data source v2 [datafusion-comet]

2024-04-22 Thread via GitHub
viirya opened a new pull request, #302: URL: https://github.com/apache/datafusion-comet/pull/302 ## Which issue does this PR close? Closes #. ## Rationale for this change This is a follow up fix to #255. #255 added a case of `BatchScanExec` to catch all d

Re: [PR] fix: Iceberg scan transition should be in front of other data source v2 [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on PR #302: URL: https://github.com/apache/datafusion-comet/pull/302#issuecomment-2071054677 cc @parthchandra @andygrove @sunchao

[I] Improve CometSortMergeJoin statistics [datafusion-comet]

2024-04-22 Thread via GitHub
planga82 opened a new issue, #303: URL: https://github.com/apache/datafusion-comet/issues/303 ### What is the problem the feature request solves? Add all statistics SortMergeJoinExec datafusion node provides. ### Describe the potential solution Override metrics map in Com

[PR] Improve CometSortMergeJoin statistics [datafusion-comet]

2024-04-22 Thread via GitHub
planga82 opened a new pull request, #304: URL: https://github.com/apache/datafusion-comet/pull/304 ## Which issue does this PR close? Closes #303 . ## Rationale for this change Add all statistics SortMergeJoinExec datafusion node provides. ## What chang

Re: [I] Remove "Execution error: " prefix from error messages from Rust [datafusion-comet]

2024-04-22 Thread via GitHub
comphead commented on issue #293: URL: https://github.com/apache/datafusion-comet/issues/293#issuecomment-2071163130 Filed https://github.com/apache/datafusion/pull/10186

[I] Correct build commend in Comet Development Guide [datafusion-comet]

2024-04-22 Thread via GitHub
whynick1 opened a new issue, #305: URL: https://github.com/apache/datafusion-comet/issues/305 ### Describe the bug ``` make test-java: compile the project and run tests in Java side ``` in [DEVELOPMENT.md](https://github.com/apache/datafusion-comet/blob/main/DEVELOPMENT.md#bu

Re: [I] Correct build commend in Comet Development Guide [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on issue #305: URL: https://github.com/apache/datafusion-comet/issues/305#issuecomment-2071274677 Thank you @whynick1

Re: [PR] Update .asf.yaml to point to new mailing list [datafusion]

2024-04-22 Thread via GitHub
andygrove commented on PR #10189: URL: https://github.com/apache/datafusion/pull/10189#issuecomment-2071304739 Thanks @phillipleblanc

[PR] Update .asf.yaml to publish docs to datafusion.apache.org [datafusion]

2024-04-22 Thread via GitHub
phillipleblanc opened a new pull request, #10190: URL: https://github.com/apache/datafusion/pull/10190 ## Which issue does this PR close? Closes #10151 ## Rationale for this change ## What changes are included in this PR? Updates `.asf.yaml` to point to the new top

Re: [PR] Update .asf.yaml to publish docs to datafusion.apache.org [datafusion]

2024-04-22 Thread via GitHub
phillipleblanc commented on PR #10190: URL: https://github.com/apache/datafusion/pull/10190#issuecomment-2071314691 > LGTM. Thanks @phillipleblanc. I think it would be good to wait for @alamb to also review before merging. Yeah - there are some links to arrow.apache.org/datafusion sti

Re: [PR] Update .asf.yaml to publish docs to datafusion.apache.org [datafusion]

2024-04-22 Thread via GitHub
viirya commented on code in PR #10190: URL: https://github.com/apache/datafusion/pull/10190#discussion_r1575589155 ## .asf.yaml: ## @@ -27,7 +27,7 @@ notifications: jira_options: link label worklog github: description: "Apache DataFusion SQL Query Engine" - homepage: htt

Re: [PR] Update .asf.yaml to publish docs to datafusion.apache.org [datafusion]

2024-04-22 Thread via GitHub
viirya commented on code in PR #10190: URL: https://github.com/apache/datafusion/pull/10190#discussion_r1575590991 ## .asf.yaml: ## @@ -27,7 +27,7 @@ notifications: jira_options: link label worklog github: description: "Apache DataFusion SQL Query Engine" - homepage: htt

Re: [PR] Update .asf.yaml to publish docs to datafusion.apache.org [datafusion]

2024-04-22 Thread via GitHub
phillipleblanc commented on code in PR #10190: URL: https://github.com/apache/datafusion/pull/10190#discussion_r1575603462 ## .asf.yaml: ## @@ -27,7 +27,7 @@ notifications: jira_options: link label worklog github: description: "Apache DataFusion SQL Query Engine" - homep

Re: [PR] Update NOTICE.txt to be relevant to DataFusion [datafusion]

2024-04-22 Thread via GitHub
andygrove merged PR #10185: URL: https://github.com/apache/datafusion/pull/10185

Re: [I] Replace NOTICES.txt with version that is relevant to DataFusion [datafusion]

2024-04-22 Thread via GitHub
andygrove closed issue #10131: Replace NOTICES.txt with version that is relevant to DataFusion URL: https://github.com/apache/datafusion/issues/10131

Re: [PR] feat: Support murmur3_hash and sha2 family hash functions [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on code in PR #226: URL: https://github.com/apache/datafusion-comet/pull/226#discussion_r1575664500 ## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ## @@ -983,8 +983,7 @@ class CometExpressionSuite extends CometTestBase with AdaptiveSparkPl

Re: [PR] feat: Support murmur3_hash and sha2 family hash functions [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on code in PR #226: URL: https://github.com/apache/datafusion-comet/pull/226#discussion_r1575665016 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -1350,7 +1350,7 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde {

Re: [PR] feat: Support murmur3_hash and sha2 family hash functions [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on code in PR #226: URL: https://github.com/apache/datafusion-comet/pull/226#discussion_r1575666562 ## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ## @@ -983,8 +983,7 @@ class CometExpressionSuite extends CometTestBase with AdaptiveSparkPl

Re: [PR] feat: Support murmur3_hash and sha2 family hash functions [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on code in PR #226: URL: https://github.com/apache/datafusion-comet/pull/226#discussion_r1575674292 ## core/src/execution/datafusion/expressions/scalar_funcs.rs: ## @@ -629,3 +622,72 @@ fn spark_decimal_div( let result = result.with_data_type(DataType::Deci

Re: [I] Error "entered unreachable code: NamedStructField should be rewritten in OperatorToFunction" after upgrade to 37 [datafusion]

2024-04-22 Thread via GitHub
jayzhan211 commented on issue #10181: URL: https://github.com/apache/datafusion/issues/10181#issuecomment-2071461117 > done between the parse plan and the logical plan I had also thought about deprecating `Expr` and using functions directly in the parsing phase. I think it might be a good ide

Re: [PR] feat: Support murmur3_hash and sha2 family hash functions [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on code in PR #226: URL: https://github.com/apache/datafusion-comet/pull/226#discussion_r1575678310 ## core/src/execution/datafusion/expressions/scalar_funcs.rs: ## @@ -629,3 +622,72 @@ fn spark_decimal_div( let result = result.with_data_type(DataType::Deci

Re: [PR] feat: Support murmur3_hash and sha2 family hash functions [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on code in PR #226: URL: https://github.com/apache/datafusion-comet/pull/226#discussion_r1575679390 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -1646,6 +1647,37 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde {

Re: [PR] feat: Support murmur3_hash and sha2 family hash functions [datafusion-comet]

2024-04-22 Thread via GitHub
viirya commented on code in PR #226: URL: https://github.com/apache/datafusion-comet/pull/226#discussion_r1575681660 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -1646,6 +1647,37 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde {

Re: [PR] feat: Support murmur3_hash and sha2 family hash functions [datafusion-comet]

2024-04-22 Thread via GitHub
advancedxy commented on code in PR #226: URL: https://github.com/apache/datafusion-comet/pull/226#discussion_r1575732635 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -1350,7 +1350,7 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde {