On 11/27/20 7:05 AM, Ashutosh Bapat wrote:
On Tue, Nov 10, 2020 at 2:43 PM Richard Guo <guofengli...@gmail.com> wrote:

To recap, the problem we are fixing here is when generating join clauses
from equivalence classes, we only select the joinclause with the 'best
score', or the first joinclause with a score of 3. This may cause us to
miss some joinclause on partition keys and thus fail to generate
partitionwise join.

The initial idea for the fix is to create all the RestrictInfos from ECs
in order to check whether there exist equi-join conditions involving
pairs of matching partition keys of the relations being joined for all
partition keys. And then Tom proposed a much better idea which leverages
function exprs_known_equal() to tell whether the partkeys can be found
in the same eclass, which is the current implementation in the latest
patch.

In the example you gave earlier, the equi join on partition key was
there but it was replaced by individual constant assignment clauses.
So if we keep the original restrictclause in there with a new flag
indicating that it's redundant, have_partkey_equi_join will still be
able to use it without much change. Depending upon where all we need
to use avoid restrictclauses with the redundant flag, this might be an
easier approach. However, with Tom's idea partition-wise join may be
used even when there is no equi-join between partition keys but there
are clauses like pk = const for all tables involved and const is the
same for all such tables.

In the spirit of small improvement made to the performance of
have_partkey_equi_join(), pk_has_clause should be renamed as
pk_known_equal and pks_known_equal as num_equal_pks.

The loop traversing the partition keys at a given position, may be
optimized further if we pass lists to exprs_known_equal() which in
turns checks whether one expression from each list is member of a
given EC. This will avoid traversing all equivalence classes for each
partition key expression, which can be a huge improvement when there
are many ECs. But I think if one of the partition key expression at a
given position is member of an equivalence class all the other
partition key expressions at that position should be part of that
equivalence class since there should be an equi-join between those. So
the loop in loop may not be required to start with.

Richard, any thoughts on Ashutosh's comments?

Regards,
--
-David
da...@pgmasters.net


Reply via email to