[ https://issues.apache.org/jira/browse/CALCITE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127240#comment-17127240 ]
Xiening Dai commented on CALCITE-3963:
--------------------------------------

Thanks for your reply, Julian.

I don't feel that the current RelSet can be modeled as a semigroup. What would be the binary operation on RelSet that satisfies the associative property?

My approach is actually pretty straightforward. By default I use the first RelNode in the set to report RelSubset statistics, with two exceptions: MultiJoin and TableScan (for MV). Following the same approach, I updated the other meta handlers, and it works fine. We no longer need the hacks put in previously (such as CALCITE-1018), and there are no more ambiguities around RelSubset stats calculation. See the latest patch - https://github.com/apache/calcite/pull/1992

I'd suggest we move forward with the current approach. If there's a need in the future for a fold operation across RelSubset, we could still update the meta handlers to do so. But at this point, I don't see that it is necessary.

> Maintain logical properties at RelSet (equivalent group) instead of RelNode
> ---------------------------------------------------------------------------
>
>                 Key: CALCITE-3963
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3963
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Xiening Dai
>            Assignee: Xiening Dai
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently the logical properties (such as row count, distinct row count, etc)
> are maintained at RelNode level. This creates a number of metadata
> consistency problems, e.g. CALCITE-1048, CALCITE-2166.
> In theory, all RelNodes in a RelSet should share the same logical properties
> per definition of relational equivalence. So it makes more sense to keep
> logical properties at RelSet level, rather than the RelNode. And such
> properties shouldn't change when a new subset is created or a subset's best is
> changed.
> Specifically, I think the built-in metadata below should fall into the logical
> properties category -
> Selectivity
> UniqueKeys
> ColumnUniqueness
> RowCount
> MaxRowCount
> MinRowCount
> DistinctRowCount
> Size (averageRowSize, averageColumnSize)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
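
For readers following the thread, the approach described in the comment (report a set's statistics from its first member, with MultiJoin and TableScan as exceptions) might be sketched roughly as below. This is a self-contained illustration, not the code in the patch and not Calcite's API: Node, representative and rowCount are hypothetical names, and reading the two exceptions as "prefer such a member when one is present" is an assumption the comment does not spell out.

```java
import java.util.Arrays;
import java.util.List;

public class RelSetStatsSketch {
    /** Hypothetical stand-in for a RelNode: a kind tag plus a row count estimate. */
    static final class Node {
        final String kind;
        final double rowCount;

        Node(String kind, double rowCount) {
            this.kind = kind;
            this.rowCount = rowCount;
        }
    }

    /**
     * Picks the member that reports statistics for the whole equivalence set.
     * Assumption: a MultiJoin or TableScan member, if present, takes precedence;
     * otherwise the first member of the set is used.
     */
    static Node representative(List<Node> set) {
        if (set.isEmpty()) {
            throw new IllegalArgumentException("equivalence set must not be empty");
        }
        for (Node n : set) {
            if (n.kind.equals("MultiJoin") || n.kind.equals("TableScan")) {
                return n;
            }
        }
        return set.get(0);
    }

    /** Every member of the set reports the same row count: the representative's. */
    static double rowCount(List<Node> set) {
        return representative(set).rowCount;
    }

    public static void main(String[] args) {
        // Two equivalent join orderings with different per-node estimates.
        List<Node> joins = Arrays.asList(new Node("Join", 100.0), new Node("Join", 250.0));
        System.out.println(rowCount(joins));

        // A set containing a scan (e.g. over a materialized view).
        List<Node> withScan = Arrays.asList(new Node("Project", 80.0), new Node("TableScan", 42.0));
        System.out.println(rowCount(withScan));
    }
}
```

The point of the sketch is only that the answer depends on the set and a fixed selection rule, so it does not change when subsets are added or a subset's best plan changes; no fold across members is involved.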