[ https://issues.apache.org/jira/browse/CALCITE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140741#comment-17140741 ]
Xiening Dai commented on CALCITE-3963:
--------------------------------------

{quote}
For sets of predicates and unique keys, the operation is union. For minRowCount, the operation is max.
{quote}

Do you have a concrete example where different RelNodes in a set have different unique keys and a union of those would make sense?

Regarding minRowCount, how do we know the max value is the best or most accurate? In your hypothetical example, the Project vs. Join case, if Project reports a big minRowCount, do you just pick the one from Project? How would this solve the problem?

I tend to agree with you that we might need to consider the input confidence when reporting the current estimate's confidence. But as you said, doing so would greatly complicate the solution, and it doesn't seem necessary at this point. In practice, the example you gave is not a problem. A MultiJoin has low confidence, and its RelSet stats will be replaced when it is converted into a LogicalJoin, which gives a better estimate. This change would propagate to its parent Project node, so the Project stats should eventually be the same as the Join's.

> Maintain logical properties at RelSet (equivalent group) instead of RelNode
> ---------------------------------------------------------------------------
>
> Key: CALCITE-3963
> URL: https://issues.apache.org/jira/browse/CALCITE-3963
> Project: Calcite
> Issue Type: Bug
> Reporter: Xiening Dai
> Assignee: Xiening Dai
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Currently the logical properties (such as row count, distinct row count, etc.)
> are maintained at the RelNode level. This creates a number of metadata
> consistency problems, e.g. CALCITE-1048, CALCITE-2166.
> In theory, all RelNodes in a RelSet should share the same logical properties
> per the definition of relational equivalence. So it makes more sense to keep
> logical properties at the RelSet level rather than the RelNode.
> And such properties shouldn't change when a new subset is created or a
> subset's best is changed.
> Specifically, I think the built-in metadata below should fall into the
> logical properties category:
> Selectivity
> UniqueKeys
> ColumnUniqueness
> RowCount
> MaxRowCount
> MinRowCount
> DistinctRowCount
> Size (averageRowSize, averageColumnSize)
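
For reference, the merge rule quoted at the top of the comment (union for unique keys, max for minRowCount) can be sketched as below. This is only an illustration: SetLogicalProps and merge are hypothetical names, not Calcite classes or methods, and the sketch assumes that every value reported by a member of the set is a valid bound for the whole set. It does not address the confidence question raised in the comment.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical holder for set-level logical properties; not part of Calcite. */
final class SetLogicalProps {
  /** Each unique key is a set of column ordinals. */
  final Set<Set<Integer>> uniqueKeys;
  /** Lower bound on the number of rows produced. */
  final double minRowCount;

  SetLogicalProps(Set<Set<Integer>> uniqueKeys, double minRowCount) {
    this.uniqueKeys = uniqueKeys;
    this.minRowCount = minRowCount;
  }

  /**
   * Combines the properties reported by two equivalent expressions:
   * unique keys are unioned, and the larger of the two lower bounds is kept.
   */
  SetLogicalProps merge(SetLogicalProps other) {
    Set<Set<Integer>> keys = new LinkedHashSet<>(uniqueKeys);
    keys.addAll(other.uniqueKeys);
    return new SetLogicalProps(keys, Math.max(minRowCount, other.minRowCount));
  }
}
```

Under that assumption, merging a node that reports key {0} and minRowCount 1 with a node that reports key {1, 2} and minRowCount 10 yields both keys and a minRowCount of 10.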