MilanTyagi2004 commented on issue #64122: URL: https://github.com/apache/doris/issues/64122#issuecomment-4718766782
Thanks for the clarification.

My current PR uses ndv normalization (0 -> 1), but based on the feedback from @englefly and @morrySnow, I understand that this is not the preferred direction because it may affect cardinality estimation and plan quality.

Before revising the implementation, could you please clarify the expected fix path? My understanding is:

1. Statistics collection should not fail for this pattern.
2. The statistics can still be written into the statistics table.
3. When these statistics are consumed later, they should be treated as UNKNOWN rather than being used directly.

Could you point me to the preferred location for handling this conversion to UNKNOWN (for example, during ColumnStatistic construction/loading, in StatisticsUtil, or in another layer)? I would like to align the implementation with the intended design before updating the PR.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
