> Hang on -- upthread the context was inner join, and the gripe was join
> fast with '=', slow with INDF. When he said the nulls were > 'generated', I didn't follow that they were part of the original > query. If the nulls are generated along with the query, sure, an > index won't help. > > I maintain my earlier point; with respect to the original query, to > get from performance of INDF to =, you have three options: > a) expr index the nulls (assuming they are physically stored) > b) convert to ((a = b) or a is null and b is null) which can help with > a bitmap or plan > c) covert to union all equivalent of "b" > > merlin a) and b) would be workaround that would run an order of magnitude slower. The query starts with a full table scan of a large table. If the planner had started elsewhere it could have reduced the result to 1-2 rows from the start. It won't choose this plan without the help from =. c) could be a acceptable workaround, but it would clutter up if you would want more than one column to be IS NOT DISTINCT FROM. You end up with 2^n unions to simulate IS NOT DISTINCT FROM. Without knowing the work required, I will still argue that having IS NOT DISTINCT FROM use the same transitive rules as equality, would be a better approach. With fear of talking about things I know little(nothing) of, I think the description of EquivalenceClasses in postgres/src/backend/optimizer/README, should be extended to also include EquivalenceClasses of IS NOT DISTINCT FROM.