> > This gets back to my “What is this supposed to do?”
I can confidently say it's the first one: "Is this set of keys, when taken together, unique"? Sorry, earlier, I guess I misunderstood what you meant by join analysis. You are referring to the predicate-constant thing in the context of joins. I agree, that seems wrong to me, and in both join and filter cases, it makes the uniqueness test more conservative than it needs to be. On Fri, Mar 7, 2025 at 2:28 PM Ian Bertolacci <[email protected]> wrote: > That example makes sense but isn’t directly related to the issue here. > In this example: `select T1.id, count(*) from T1 inner join T2 on > T1.foreignKey = T2.ID where T2.foo = 1234 group by T1.id` > AggregateRemoveRule wants to know if `T1.id` after the filter is unique. > The answer should be “yes” because T1 is not expanded by the join (in these > examples, all ID fields are unique at the TableScan level). > For example: > T1 > ID | foreignKey > 1 | 10 > 2 | 10 > 3 | 20 > 4 | 20 > > T2 > ID (lets ignore foo) > 10 > 20 > > T1 inner join T2 on T1.foreignKey = T2.id > T1.ID | T2.ID > 1 | 10 > 2 | 10 > 3 | 20 > 4 | 20 > > And the analyser is already aware of this. > I ask the join “Is T1.id unique” it will say “yes”; and if I ask “is T2.id > unique” it will say “no”, because the non-uniqueness of the foreign key > field negates any uniqueness on the opposite/ primary key side. > > But if you ask the filter “is T1.id unique” it will ask the join “is > {T1.id T2.foo}” unique. > The join looks at T1.id and sees that its unique, then looks at T2.foo and > sees that its not unique. > And for join, it “ands” these together, so if any non-unique column > exists, then it claims that the whole key-set is not unique. > > This gets back to my “What is this supposed to do?” > Is it answering the question “Is this set of keys, when taken together, > unique?” or is it answering the question “are *all* of these keys > *independently* unique?” > If it’s the first, then the join analysis is too conservative, it’s the > second, then the table scan analysis is incorrect (I think) > > To me, I feel like the analysis could be as simple as “Are any of these > columns unique after the join?” if yes, then the key set as a whole is > unique. > > -Ian > >
