EmilyMatt opened a new pull request, #1390: URL: https://github.com/apache/datafusion-comet/pull/1390
## What issue does this close?

Closes #1389.

## Rationale for this change

As described in the issue, we'd like to prevent the situation where the Partial aggregate is supported and converted, the shuffle is supported and converted, but the Final aggregate is not converted because its result expressions are not supported. This leads to an unrecoverable state: Spark expects an aggregate buffer created by the Partial hash aggregate (HA), and it doesn't exist.

## What changes are included in this PR?

I've moved the conversion of the hash aggregate into a separate function (I believe everything should be separated, to be honest; it's very hard to manage right now). The function also returns whether the result expressions were converted. When they are not, we create a new ProjectExec with those result expressions, convert the HA without them, and place a conversion between the two. That way we can ensure a valid state at all times.

This feature can be turned off by enforcing result conversion with "spark.comet.exec.aggregate.enforceResults=true". Result enforcing is disabled by default.

## How are these changes tested?

Many of the stability tests will have a new plan in which the aggregate is completed natively and the ProjectExec runs in Spark, instead of the current situation, where the final stage of the HashAggregate is done entirely in Spark. Those tests currently fail because I have been unable to run them with SPARK_GENERATE_GOLDEN_FILES; that may be a mistake on my side.
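The fallback described under "What changes are included in this PR?" can be illustrated with a small, language-neutral toy model. This is only a sketch under stated assumptions, not the code in the PR (which lives in Comet's Scala planner): `Node`, `Expr`, `NATIVE_SUPPORTED`, `convert_final_aggregate`, the `"Conversion"` operator name, and returning `None` when enforcement blocks the conversion are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Hypothetical: names of expressions the native engine can evaluate.
NATIVE_SUPPORTED: Set[str] = {"sum", "count", "col"}


@dataclass(frozen=True)
class Expr:
    name: str  # e.g. "sum" or "upper"


@dataclass
class Node:
    op: str                      # operator name in the toy plan
    native: bool                 # True if the operator runs natively
    result_exprs: List[Expr]     # result expressions evaluated by this operator
    child: Optional["Node"] = None


def convertible(exprs: List[Expr]) -> bool:
    """Return True if every expression is supported natively."""
    return all(e.name in NATIVE_SUPPORTED for e in exprs)


def convert_final_aggregate(result_exprs: List[Expr],
                            enforce_results: bool = False) -> Optional[Node]:
    """Toy version of the Final-aggregate conversion with a projection fallback."""
    if convertible(result_exprs):
        # Everything is supported: one native aggregate that keeps its results.
        return Node("HashAggregate(Final)", True, result_exprs)
    if enforce_results:
        # Assumption: with enforcement on, the aggregate is not converted at all.
        return None
    # Fallback: native aggregate without result expressions, a conversion step,
    # and a Spark-side projection that evaluates the unsupported expressions.
    native_agg = Node("HashAggregate(Final)", True, [])
    conversion = Node("Conversion", False, [], child=native_agg)
    return Node("Project", False, result_exprs, child=conversion)


plan = convert_final_aggregate([Expr("upper")])
```

In this model, an unsupported result expression such as the hypothetical `upper` yields a Spark-side `Project` on top of a conversion on top of a native aggregate, so the buffers produced by the Partial stage are always consumed by a native Final stage.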