Re: [PR] fix: Fall back to Spark for unsupported partition or sort expressions in window aggregates [datafusion-comet]
andygrove merged PR #1253: URL: https://github.com/apache/datafusion-comet/pull/1253 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for unsupported partition or sort expressions in window aggregates [datafusion-comet]
andygrove commented on PR #1253: URL: https://github.com/apache/datafusion-comet/pull/1253#issuecomment-2581468945 Thanks for the review @kazuyukitanimura -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for unsupported partition or sort expressions in window aggregates [datafusion-comet]
codecov-commenter commented on PR #1253: URL: https://github.com/apache/datafusion-comet/pull/1253#issuecomment-2581266227 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1253?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report Attention: Patch coverage is `42.85714%` with `4 lines` in your changes missing coverage. Please review. > Project coverage is 34.69%. Comparing base [(`4cf840f`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/4cf840f571b916b399e6ca6708bf0de4c4c1ca95?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) to head [(`5bddb87`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/5bddb87ef6d9f2221f3b00a6851d7857f3b49365?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache). > Report is 7 commits behind head on main. | [Files with missing lines](https://app.codecov.io/gh/apache/datafusion-comet/pull/1253?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Patch % | Lines | |---|---|---| | [.../scala/org/apache/comet/serde/QueryPlanSerde.scala](https://app.codecov.io/gh/apache/datafusion-comet/pull/1253?src=pr&el=tree&filepath=spark%2Fsrc%2Fmain%2Fscala%2Forg%2Fapache%2Fcomet%2Fserde%2FQueryPlanSerde.scala&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-c3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9jb21ldC9zZXJkZS9RdWVyeVBsYW5TZXJkZS5zY2FsYQ==) | 42.85% | [3 Missing and 1 partial :warning: ](https://app.codecov.io/gh/apache/datafusion-comet/pull/1253?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Additional details and impacted files ```diff @@ Coverage Diff @@ ## main#1253 +/- ## - Coverage 34.88% 34.69% -0.19% - Complexity 990 991 +1 Files 116 116 Lines 4375544891+1136 Branches 9569 9864 +295 + Hits 1526315574 +311 - Misses2551326168 +655 - Partials 2979 3149 +170 ``` [:umbrella: View full report in Codecov by Sentry](https://app.codecov.io/gh/apache/datafusion-comet/pull/1253?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache). :loudspeaker: Have feedback on the report? [Share it here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for unsupported partition or sort expressions in window aggregates [datafusion-comet]
andygrove commented on code in PR #1253: URL: https://github.com/apache/datafusion-comet/pull/1253#discussion_r1909275211 ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -89,6 +89,24 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } + // based on Spark's SQLWindowFunctionSuite test of the same name + test("window function: partition and order expressions") { +for (shuffleMode <- Seq("auto", "native", "jvm")) { + withSQLConf(CometConf.COMET_SHUFFLE_MODE.key -> shuffleMode) { +val df = + Seq((1, "a", 5), (2, "a", 6), (3, "b", 7), (4, "b", 8), (5, "c", 9), (6, "c", 10)).toDF( +"month", +"area", +"product") +df.createOrReplaceTempView("windowData") +checkSparkAnswer(sql(""" Review Comment: I updated the test. I added assertions to check for the expected shuffle exchanges and added some notes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for unsupported partition or sort expressions in window aggregates [datafusion-comet]
andygrove commented on code in PR #1253: URL: https://github.com/apache/datafusion-comet/pull/1253#discussion_r1909259510 ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -89,6 +89,24 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } + // based on Spark's SQLWindowFunctionSuite test of the same name + test("window function: partition and order expressions") { +for (shuffleMode <- Seq("auto", "native", "jvm")) { + withSQLConf(CometConf.COMET_SHUFFLE_MODE.key -> shuffleMode) { +val df = + Seq((1, "a", 5), (2, "a", 6), (3, "b", 7), (4, "b", 8), (5, "c", 9), (6, "c", 10)).toDF( +"month", +"area", +"product") +df.createOrReplaceTempView("windowData") +checkSparkAnswer(sql(""" Review Comment: With `auto` and `native` we fall back to Spark currently -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for unsupported partition or sort expressions in window aggregates [datafusion-comet]
andygrove commented on code in PR #1253: URL: https://github.com/apache/datafusion-comet/pull/1253#discussion_r1909260086 ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -89,6 +89,24 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } + // based on Spark's SQLWindowFunctionSuite test of the same name + test("window function: partition and order expressions") { +for (shuffleMode <- Seq("auto", "native", "jvm")) { + withSQLConf(CometConf.COMET_SHUFFLE_MODE.key -> shuffleMode) { +val df = + Seq((1, "a", 5), (2, "a", 6), (3, "b", 7), (4, "b", 8), (5, "c", 9), (6, "c", 10)).toDF( +"month", +"area", +"product") +df.createOrReplaceTempView("windowData") +checkSparkAnswer(sql(""" Review Comment: Let me see if I can conditionally use `checkSparkAnswerAndOperator` for the `jvm` shuffle case -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for unsupported partition or sort expressions in window aggregates [datafusion-comet]
andygrove commented on code in PR #1253: URL: https://github.com/apache/datafusion-comet/pull/1253#discussion_r1909258219 ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -89,6 +89,24 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } + // based on Spark's SQLWindowFunctionSuite test of the same name + test("window function: partition and order expressions") { +for (shuffleMode <- Seq("auto", "native", "jvm")) { Review Comment: Not directly, but due to an existing issue in Comet, we fall back to Spark for this query if shuffle mode is auto or native, so we only see the issue when specifying jvm shuffle. See https://github.com/apache/datafusion-comet/issues/1248 for more info. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for unsupported partition or sort expressions in window aggregates [datafusion-comet]
andygrove commented on code in PR #1253: URL: https://github.com/apache/datafusion-comet/pull/1253#discussion_r1909258219 ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -89,6 +89,24 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } + // based on Spark's SQLWindowFunctionSuite test of the same name + test("window function: partition and order expressions") { +for (shuffleMode <- Seq("auto", "native", "jvm")) { Review Comment: Not directly, but due to an existing issue in Comet, we fall back to Spark for if shuffle mode is auto or native, so we only see the issue when specifying jvm shuffle. See https://github.com/apache/datafusion-comet/issues/1248 for more info. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for unsupported partition or sort expressions in window aggregates [datafusion-comet]
kazuyukitanimura commented on code in PR #1253: URL: https://github.com/apache/datafusion-comet/pull/1253#discussion_r1909251792 ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -89,6 +89,24 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } + // based on Spark's SQLWindowFunctionSuite test of the same name + test("window function: partition and order expressions") { +for (shuffleMode <- Seq("auto", "native", "jvm")) { + withSQLConf(CometConf.COMET_SHUFFLE_MODE.key -> shuffleMode) { +val df = + Seq((1, "a", 5), (2, "a", 6), (3, "b", 7), (4, "b", 8), (5, "c", 9), (6, "c", 10)).toDF( +"month", +"area", +"product") +df.createOrReplaceTempView("windowData") +checkSparkAnswer(sql(""" Review Comment: What about using `checkSparkAnswerAndOperator` instead? ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -89,6 +89,24 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } + // based on Spark's SQLWindowFunctionSuite test of the same name + test("window function: partition and order expressions") { +for (shuffleMode <- Seq("auto", "native", "jvm")) { Review Comment: Does shuffle mode affect the behavior of window function? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org