alamb commented on PR #16625: URL: https://github.com/apache/datafusion/pull/16625#issuecomment-3113309576
> Thus, it seems optimal to be able to distinguish functions that
>
> * benefit from ordering, e.g. first_value (`Beneficial`)
> * those are simply better if input can be ordered, e.g. ordered array_agg in this PR (`SoftRequirement` being added in this PR)
> * those which cannot execute if input is not pre-ordered, e.g. ordered array_agg before this PR (`HardRequirement`)
> * those which do not care about input ordering (`Insensitive`)

I see. I understand the theoretical difference between

1. "needs the ordering to run correctly" (`HardRequirement`)
2. "can take advantage of the ordering" (`Beneficial`)
3. "is always better to use sorting" (`SoftRequirement`)

What I am still confused about is the practical difference between `HardRequirement` and `SoftRequirement`: specifically, what different plan / decision will be made.

I believe the result is that, with `SoftRequirement`, DataFusion will attempt to sort the input according to the requirement, but if it cannot (because doing so would conflict with another aggregate function's requirements, for example), the aggregate can still be run with the different ordering.
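To make the question concrete, here is a minimal, self-contained sketch of the decision as I understand it. It is a toy model, not DataFusion's planner code: `OrderSensitivity`, `Aggregate`, `merge`, and `choose_input_ordering` are hypothetical names, orderings are simplified to lists of column names, and treating conflicting hard requirements as an error is an assumption of the model.

```rust
/// Toy stand-in for the four sensitivities discussed above (hypothetical type).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OrderSensitivity {
    Insensitive,
    Beneficial,
    SoftRequirement,
    HardRequirement,
}

/// An aggregate plus the ordering it asks for, as a list of column names.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Aggregate {
    name: &'static str,
    sensitivity: OrderSensitivity,
    ordering: Vec<&'static str>,
}

/// Two orderings are compatible when one is a prefix of the other;
/// the merged ordering is the longer of the two.
fn merge(current: &[&'static str], candidate: &[&'static str]) -> Option<Vec<&'static str>> {
    if candidate.starts_with(current) {
        Some(candidate.to_vec())
    } else if current.starts_with(candidate) {
        Some(current.to_vec())
    } else {
        None
    }
}

/// Pick the ordering to sort the aggregate input by.
/// In this model, `Beneficial` and `Insensitive` never force a sort.
fn choose_input_ordering(aggs: &[Aggregate]) -> Result<Vec<&'static str>, String> {
    let mut chosen: Vec<&'static str> = Vec::new();

    // Hard requirements must all be satisfied; a conflict is an error (assumption).
    for agg in aggs
        .iter()
        .filter(|a| a.sensitivity == OrderSensitivity::HardRequirement)
    {
        chosen = merge(&chosen, &agg.ordering)
            .ok_or_else(|| format!("conflicting ordering requirement for {}", agg.name))?;
    }

    // Soft requirements are honored when compatible and skipped otherwise;
    // a skipped aggregate still runs, just on input in a different order.
    for agg in aggs
        .iter()
        .filter(|a| a.sensitivity == OrderSensitivity::SoftRequirement)
    {
        if let Some(merged) = merge(&chosen, &agg.ordering) {
            chosen = merged;
        }
    }

    Ok(chosen)
}
```

Under these assumptions, two `SoftRequirement` aggregates asking for `[a]` and `[b]` produce a plan sorted on `[a]`, with the second aggregate running on the unrequested ordering, while the same pair marked `HardRequirement` fails to plan. If that matches the intended behavior, that is the practical difference I was asking about.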