The current Memoize cost calculation can add Memoize nodes even when they are
useless because the outer side of the join is guaranteed unique by
constraints, adding unnecessary overhead. To prevent this, I am exploring
ways to check whether the outer side of the join is guaranteed unique before
adding a Memoize node. The simplest method I have found so far is to call
innerrel_is_unique, passing the outer relation where the inner relation would
normally go. This appears to work (although it is logically different from
the other uses of innerrel_is_unique) and adds negligible overhead for simple
joins, because it can often reuse values cached for other potential join
orders.
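To make the motivation concrete, here is a toy illustration (plain Python, not planner code, and not PostgreSQL's actual Memoize implementation): a Memoize node caches inner-side results keyed by the outer side's parameter values, so if the outer side is unique on the join key, every probe is a cache miss and the cache is pure overhead.

```python
def probe_with_cache(outer_keys, inner_lookup):
    """Probe the inner side once per outer row, counting cache hits."""
    cache = {}
    hits = 0
    for k in outer_keys:
        if k in cache:
            hits += 1  # cache pays off only on a repeated outer key
        else:
            cache[k] = inner_lookup(k)
    return hits

inner = {1: "a", 2: "b", 3: "c"}
unique_outer = [1, 2, 3]            # outer side unique: caching never helps
duplicated_outer = [1, 2, 1, 3, 2]  # repeats: caching saves inner probes

print(probe_with_cache(unique_outer, inner.get))      # 0 hits
print(probe_with_cache(duplicated_outer, inner.get))  # 2 hits
```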

However, with more complex joins, innerrel_is_unique fails because it can't
handle joined relations. My initial idea for working around this is to
compute innerrel_is_unique for each of the join's constituent relations
(which, as I understand it, should already be computed and cached for the
other join orders), and to skip the Memoize node only if every join making up
the outer relation is unique. The assumption is that if even one join is
non-unique, its join keys will produce cascading non-unique result sets. Does
this sound feasible, and is there a better approach? I am not entirely sure
it fits naturally with bottom-up query planning. I am new to Postgres
development, so I don't yet have a great understanding of how everything fits
together.
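The "cascading" assumption can be sketched with a toy hash join (again plain Python, my own reading of the idea rather than planner code): one non-unique constituent join duplicates the rows of everything it joins to, so a key that was unique in a base relation is no longer unique in the composite outer result.

```python
def hash_join(left, right, lkey, rkey):
    """Minimal inner hash join over lists of dict rows."""
    index = {}
    for r in right:
        index.setdefault(r[rkey], []).append(r)
    out = []
    for l in left:
        for r in index.get(l[lkey], []):
            out.append({**l, **r})
    return out

a = [{"a_id": 1}, {"a_id": 2}]                        # a_id is unique
b = [{"b_a": 1, "b_id": 10}, {"b_a": 1, "b_id": 11}]  # non-unique on b_a

ab = hash_join(a, b, "a_id", "b_a")
# The duplicate match in b duplicates a's row, so a_id is no longer
# unique in the joined result: the non-uniqueness cascades.
a_ids = [row["a_id"] for row in ab]
print(len(a_ids) != len(set(a_ids)))  # True
```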

Thanks!
Jacob
