Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> wrote:

> I originally thought to provide it along-with the changes to
> expand_inherited_rtentry(), but that thread is taking longer. Jeevan
> Chalke needs rebased patches for his work on aggregate pushdown and
> Thomas might need them for further review. So, here they are.

Since I have related patch in the current commitfest
(https://commitfest.postgresql.org/14/1247/), I spent some time reviewing your
patch:

* generate_partition_wise_join_paths()

Right parenthesis is missing in the prologue.


* get_partitioned_child_rels_for_join()

I think the Assert() statement is easier to understand inside the loop, see
the assert.diff attachment.


* have_partkey_equi_join()

As the function handles generic join, this comment doesn't seem to me
relevant:

    /*
     * The equi-join between partition keys is strict if equi-join between
     * at least one partition key is using a strict operator. See
     * explanation about outer join reordering identity 3 in
     * optimizer/README
     */
    strict_op = op_strict(opexpr->opno);

And I think the function can return true even if strict_op is false for all
the operators evaluated in the loop.


* match_expr_to_partition_keys()

I'm not sure this comment is clear enough:

    /*
     * If it's a strict equi-join a NULL partition key on one side will
     * not join a NULL partition key on the other side. So, rows with NULL
     * partition key from a partition on one side can not join with those
     * from a non-matching partition on the other side. So, search the
     * nullable partition keys as well.
     */
    if (!strict_op)
            continue;

My understanding of the problem of NULL values generated by outer join is:
these NULL values --- if evaluated by non-strict expression --- can make row
of N-th partition on one side of the join match row(s) of *other than* N-th
partition(s) on the other side. Thus the nullable input expressions may only
be evaluated by strict operators. I think it'd be clearer if you stressed that
(undesired) *match* of partition keys can be a problem, rather than mismatch.

If you insist on your wording, then I think you should at least move the
comment below to the part that only deals with strict operators.


* There are several places where lfirst_node() macro should be used. For
  example

rel = lfirst_node(RelOptInfo, lc);

instead of

rel = (RelOptInfo *) lfirst(lc);


* map_and_merge_partitions()

Besides a few changes proposed in map_and_merge_partitions.diff (a few of them
to suppress compiler warnings) I think that this part needs more thought:

    {
            Assert(mergemap1[index1] != mergemap2[index2] &&
                       mergemap1[index1] >= 0 && mergemap2[index2] >= 0);

            /*
             * Both the partitions map to different merged partitions. This
             * means that multiple partitions from one relation matches to one
             * partition from the other relation. Partition-wise join does not
             * handle this case right now, since it requires ganging multiple
             * partitions together (into one RelOptInfo).
             */
            merged_index = -1;
    }

I could hit this path with the following test:

CREATE TABLE a(i int) PARTITION BY LIST(i);
CREATE TABLE a_0 PARTITION OF a FOR VALUES IN (0, 2);
CREATE TABLE b(j int) PARTITION BY LIST(j);
CREATE TABLE b_0 PARTITION OF b FOR VALUES IN (1, 2);

SET enable_partition_wise_join TO on;

SELECT  *
FROM    a
        FULL JOIN
        b ON i = j;

I don't think there's a reason not to join a_0 partition to b_0, is there?

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index a75b1a3..3094b56
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** get_partitioned_child_rels_for_join(Plan
*** 6160,6170 ****
  		PartitionedChildRelInfo *pc = lfirst(l);
  
  		if (bms_is_member(pc->parent_relid, join_relids))
  			result = list_concat(result, list_copy(pc->child_rels));
  	}
  
- 	/* The root partitioned table is included as a child rel */
- 	Assert(list_length(result) >= bms_num_members(join_relids));
- 
  	return result;
  }
--- 6160,6172 ----
  		PartitionedChildRelInfo *pc = lfirst(l);
  
  		if (bms_is_member(pc->parent_relid, join_relids))
+ 		{
+ 			/* The root partitioned table is included as a child rel */
+ 			Assert(list_length(pc->child_rels) >= 1);
+ 
  			result = list_concat(result, list_copy(pc->child_rels));
+ 		}
  	}
  
  	return result;
  }
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
new file mode 100644
index eb35fab..aa9c70d
*** a/src/backend/catalog/partition.c
--- b/src/backend/catalog/partition.c
*************** partition_list_bounds_merge(int partnatt
*** 3110,3116 ****
--- 3110,3119 ----
  			 */
  			if (jointype == JOIN_INNER || jointype == JOIN_RIGHT ||
  				jointype == JOIN_SEMI)
+ 			{
  				merged_index = -1;
+ 				merged_datum = NULL;
+ 			}
  			else if (jointype == JOIN_LEFT || jointype == JOIN_FULL ||
  					 jointype == JOIN_ANTI)
  			{
*************** partition_list_bounds_merge(int partnatt
*** 3140,3146 ****
--- 3143,3152 ----
  			 */
  			if (jointype == JOIN_INNER || jointype == JOIN_LEFT ||
  				jointype == JOIN_SEMI || jointype == JOIN_ANTI)
+ 			{
  				merged_index = -1;
+ 				merged_datum = NULL;
+ 			}
  			else if (jointype == JOIN_RIGHT || jointype == JOIN_FULL)
  			{
  				/*
*************** map_and_merge_partitions(int *partmap1,
*** 3334,3346 ****
  
  	/*
  	 * If the given to partitions map to each other, find the corresponding
! 	 * merged partition index .
  	 */
  	if (partmap1[index1] == index2 && partmap2[index2] == index1)
  	{
  		/*
! 		 * If both the partitions are mapped to the same merged partition, get
! 		 * the index of merged partition.
  		 */
  		if (mergemap1[index1] == mergemap2[index2])
  		{
--- 3340,3352 ----
  
  	/*
  	 * If the given to partitions map to each other, find the corresponding
! 	 * merged partition index.
  	 */
  	if (partmap1[index1] == index2 && partmap2[index2] == index1)
  	{
  		/*
! 		 * If both the partitions are mapped to the same merged partition, or
! 		 * neither maps at all yet, get the index of merged partition.
  		 */
  		if (mergemap1[index1] == mergemap2[index2])
  		{
*************** map_and_merge_partitions(int *partmap1,
*** 3352,3359 ****
  			 */
  			if (merged_index < 0)
  			{
! 				merged_index = *next_index;
! 				*next_index = *next_index + 1;
  				mergemap1[index1] = merged_index;
  				mergemap2[index2] = merged_index;
  			}
--- 3358,3364 ----
  			 */
  			if (merged_index < 0)
  			{
! 				merged_index = (*next_index)++;
  				mergemap1[index1] = merged_index;
  				mergemap2[index2] = merged_index;
  			}
*************** map_and_merge_partitions(int *partmap1,
*** 3366,3380 ****
  		 * partitions map to the same merged partition.
  		 */
  		else if (mergemap1[index1] >= 0 && mergemap2[index2] < 0)
! 		{
! 			mergemap2[index2] = mergemap1[index1];
! 			merged_index = mergemap1[index1];
! 		}
  		else if (mergemap1[index1] < 0 && mergemap2[index2] >= 0)
! 		{
! 			mergemap1[index1] = mergemap2[index2];
! 			merged_index = mergemap2[index2];
! 		}
  		else
  		{
  			Assert(mergemap1[index1] != mergemap2[index2] &&
--- 3371,3379 ----
  		 * partitions map to the same merged partition.
  		 */
  		else if (mergemap1[index1] >= 0 && mergemap2[index2] < 0)
! 			merged_index = mergemap2[index2] = mergemap1[index1];
  		else if (mergemap1[index1] < 0 && mergemap2[index2] >= 0)
! 			merged_index = mergemap1[index1] = mergemap2[index2];
  		else
  		{
  			Assert(mergemap1[index1] != mergemap2[index2] &&
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to