>
> This gets back to my “What is this supposed to do?”

I can confidently say it's the first one:  "Is this set of keys, when taken
together, unique"?

Sorry, earlier, I guess I misunderstood what you meant by join analysis.
You are referring to the predicate-constant thing in the context of joins.
I agree, that seems wrong to me, and in both join and filter cases, it
makes the uniqueness test more conservative than it needs to be.

On Fri, Mar 7, 2025 at 2:28 PM Ian Bertolacci
<[email protected]> wrote:

> That example makes sense but isn’t directly related to the issue here.
> In this example: `select T1.id, count(*) from T1 inner join T2 on
> T1.foreignKey = T2.ID where T2.foo = 1234 group by T1.id`
> AggregateRemoveRule wants to know if  `T1.id` after the filter is unique.
> The answer should be “yes” because T1 is not expanded by the join (in these
> examples, all ID fields are unique at the TableScan level).
> For example:
> T1
> ID | foreignKey
> 1  | 10
> 2  | 10
> 3  | 20
> 4  | 20
>
> T2
> ID  (lets ignore foo)
> 10
> 20
>
> T1 inner join T2 on T1.foreignKey = T2.id
> T1.ID | T2.ID
> 1         | 10
> 2         | 10
> 3         | 20
> 4         | 20
>
> And the analyser is already aware of this.
> I ask the join “Is T1.id unique” it will say “yes”; and if I ask “is T2.id
> unique” it will say “no”, because the non-uniqueness of the foreign key
> field negates any uniqueness on the opposite/ primary key side.
>
> But if you ask the filter “is T1.id unique” it will ask the join “is
> {T1.id T2.foo}” unique.
> The join looks at T1.id and sees that its unique, then looks at T2.foo and
> sees that its not unique.
> And for join, it “ands” these together, so if any non-unique column
> exists, then it claims that the whole key-set is not unique.
>
> This gets back to my “What is this supposed to do?”
> Is it answering the question “Is this set of keys, when taken together,
> unique?” or is it  answering the question “are *all* of these keys
> *independently* unique?”
> If it’s the first, then the join analysis is too conservative, it’s the
> second, then the table scan analysis is incorrect (I think)
>
> To me, I feel like the analysis could be as simple as “Are any of these
> columns unique after the join?” if yes, then the key set as a whole is
> unique.
>
> -Ian
>
>

Reply via email to