Re: [PR] fix: Fall back to Spark for distinct aggregates [datafusion-comet]
andygrove merged PR #1262: URL: https://github.com/apache/datafusion-comet/pull/1262 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for distinct aggregates [datafusion-comet]
andygrove commented on PR #1262: URL: https://github.com/apache/datafusion-comet/pull/1262#issuecomment-2587330496 Thanks for the review @huaxingao -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for distinct aggregates [datafusion-comet]
andygrove commented on code in PR #1262: URL: https://github.com/apache/datafusion-comet/pull/1262#discussion_r1911777608 ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -974,7 +974,8 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } - test("distinct") { + // TODO enable once https://github.com/apache/datafusion-comet/issues/1267 is implemented Review Comment: I understand what is happening now. Spark optimizes some simple cases like `SELECT a, COUNT(DISTINCT b) ...` to first perform an aggregate to just get the distinct values, so by the time we perform the `COUNT` there are only distinct values. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for distinct aggregates [datafusion-comet]
andygrove commented on code in PR #1262: URL: https://github.com/apache/datafusion-comet/pull/1262#discussion_r1911770443 ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -974,7 +974,8 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } - test("distinct") { + // TODO enable once https://github.com/apache/datafusion-comet/issues/1267 is implemented Review Comment: I'm looking at this test again on main and do not understand how this one is passing now :confused: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for distinct aggregates [datafusion-comet]
codecov-commenter commented on PR #1262: URL: https://github.com/apache/datafusion-comet/pull/1262#issuecomment-2584940287 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1262?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report All modified and coverable lines are covered by tests :white_check_mark: > Project coverage is 34.72%. Comparing base [(`be48839`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/be488390dcd89b2a5bff6f24b9ae7724383cffde?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) to head [(`43c8f55`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/43c8f559fc948a94cf7e3a5dcb70149ef2e29acb?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache). > Report is 2 commits behind head on main. Additional details and impacted files ```diff @@ Coverage Diff @@ ## main#1262 +/- ## + Coverage 34.69% 34.72% +0.03% - Complexity 991 994 +3 Files 116 116 Lines 4489144907 +16 Branches 9864 9869 +5 + Hits 1557415595 +21 Misses2616826168 + Partials 3149 3144 -5 ``` [:umbrella: View full report in Codecov by Sentry](https://app.codecov.io/gh/apache/datafusion-comet/pull/1262?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache). :loudspeaker: Have feedback on the report? [Share it here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for distinct aggregates [datafusion-comet]
andygrove commented on code in PR #1262: URL: https://github.com/apache/datafusion-comet/pull/1262#discussion_r1911767513 ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -974,7 +974,8 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } - test("distinct") { + // TODO enable once https://github.com/apache/datafusion-comet/issues/1267 is implemented Review Comment: yes, many of the tests were passing because the aggregate was running in Spark rather than Comet or because the input values were already distinct -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for distinct aggregates [datafusion-comet]
comphead commented on code in PR #1262: URL: https://github.com/apache/datafusion-comet/pull/1262#discussion_r1911745757 ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -974,7 +974,8 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } - test("distinct") { + // TODO enable once https://github.com/apache/datafusion-comet/issues/1267 is implemented Review Comment: I'm thinking should we add a test with unique values, that fails now by correctness issue -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] fix: Fall back to Spark for distinct aggregates [datafusion-comet]
comphead commented on code in PR #1262: URL: https://github.com/apache/datafusion-comet/pull/1262#discussion_r1911726694 ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -974,7 +974,8 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } - test("distinct") { + // TODO enable once https://github.com/apache/datafusion-comet/issues/1267 is implemented Review Comment: so now this tests passed? ## spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala: ## @@ -974,7 +974,8 @@ class CometAggregateSuite extends CometTestBase with AdaptiveSparkPlanHelper { } } - test("distinct") { + // TODO enable once https://github.com/apache/datafusion-comet/issues/1267 is implemented Review Comment: so now this test passed? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org