asolimando commented on issue #21120: URL: https://github.com/apache/datafusion/issues/21120#issuecomment-4138418915
> This makes sense, I think now at the API framework stage, the goal is clear. I was thinking a little bit ahead, when adding expression/operator coverage for advanced statistics like NDV, it should better be workload guided at that time.

Makes sense, I thought you were referring to the individual tasks composing this issue.

FWIW, https://github.com/apache/datafusion/pull/20292 is a good starting point for evaluating improvements to statistics propagation. It's a bit coarse, as reviewers have pointed out, but it can be refined over time and gives us a first measure.

At the moment stats aren't "consumed" in many places, so I understand that it's hard for reviewers to evaluate changes. What I have tried to do so far is to look at systems with a good CBO (Spark, Postgres, Trino, Hive, etc.) and make explicit references to them (e.g., https://github.com/apache/datafusion/pull/20904#discussion_r2924816786) for any non-trivial change.

> Also it looks like the API for [#21122](https://github.com/apache/datafusion/pull/21122) and [#19609](https://github.com/apache/datafusion/pull/19609) can be unified, I’ll try to help with the review.

Thanks, help with the review would be greatly appreciated. On my side, I will take another closer look at [#19609](https://github.com/apache/datafusion/pull/19609). [#21122](https://github.com/apache/datafusion/pull/21122) is probably already reviewable. I wanted to take another close look before removing the "Draft" label, but I don't expect any major changes, in case you want to take a look already.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
