Thanks for the review.
On 2018/07/12 22:01, Ashutosh Bapat wrote:
> On Thu, Jul 12, 2018 at 11:10 AM, Amit Langote
> <[email protected]> wrote:
>>>
>>> I think your fix is correct. I slightly modified it along with updating
>>> nearby comments and added regression tests.
>>
>> I updated regression tests to reduce lines. There is no point in
>> repeating tests like v2 patch did.
>
> + *
> + * For hash partitioning however, it is possible to combine null and non-
> + * null keys in a pruning step, so do this only if *all* partition keys
> + * are involved in IS NULL clauses.
>
> I don't think this is true. When equality conditions and IS NULL clauses cover
> all partition keys of a hash partitioned table and do not have contradictory
> clauses, we should be able to find the partition which will remain unpruned.
I was trying to say the same thing, but maybe the comment doesn't like it.
How about this:
+ * For hash partitioning, if we found IS NULL clauses for some keys and
+ * OpExpr's for others, gen_prune_steps_from_opexps() will generate the
+ * necessary pruning steps if all partition keys are taken care of.
+ * If we found only IS NULL clauses, then we can generate the pruning
+ * step here but only if all partition keys are covered.
> I
> see that we already have this supported in get_matching_hash_bounds()
> /*
> * For hash partitioning we can only perform pruning based on equality
> * clauses to the partition key or IS NULL clauses. We also can only
> * prune if we got values for all keys.
> */
> if (nvalues + bms_num_members(nullkeys) == partnatts)
> {
>
> */
> - if (!generate_opsteps)
> + if (!bms_is_empty(nullkeys) &&
> + (part_scheme->strategy != PARTITION_STRATEGY_HASH ||
> + bms_num_members(nullkeys) == part_scheme->partnatts))
>
> So, it looks like we don't need bms_num_members(nullkeys) ==
> part_scheme->partnatts there.
Yes, we can perform pruning in all three cases for hash partitioning:
* all keys are covered by OpExprs providing non-null keys
* some keys are covered by IS NULL clauses, others by OpExprs (all keys
covered)
* all keys are covered by IS NULL clauses
... as long as we generate a pruning step at all. The issue here was that
we skipped generating the pruning step due to poorly coded condition in
gen_partprune_steps_internal in some cases.
> Also, I think, we don't know how some new partition strategy will treat NULL
> values so above condition looks wrong to me. Instead it should explicitly
> check
> the strategies for which we know that the NULL values go to a single
> partition.
How about if we explicitly spell out the strategies like this:
+ if (!bms_is_empty(nullkeys) &&
+ (part_scheme->strategy == PARTITION_STRATEGY_LIST ||
+ part_scheme->strategy == PARTITION_STRATEGY_RANGE ||
+ (part_scheme->strategy == PARTITION_STRATEGY_HASH &&
+ bms_num_members(nullkeys) == part_scheme->partnatts)))
The proposed comment also describes why the condition looks like that.
> /*
> - * Note that for IS NOT NULL clauses, simply having step suffices;
> - * there is no need to propagate the exact details of which keys are
> - * required to be NOT NULL. Hash partitioning expects to see actual
> - * values to perform any pruning.
> + * There are no OpExpr's, but there are IS NOT NULL clauses, which
> + * can be used to eliminate the null-partition-key-only partition.
>
> I don't understand this. When there are IS NOT NULL clauses for all the
> partition keys, it's only then that we could eliminate the partition
> containing
> NULL values, not otherwise.
Actually, there is only one case where the pruning step generated by that
block of code is useful -- to prune a list partition that accepts only
nulls. List partitioning only allows one partition, key so this works,
but let's say only accidentally. I modified the condition as follows:
+ else if (!generate_opsteps && !bms_is_empty(notnullkeys) &&
+ bms_num_members(notnullkeys) == part_scheme->partnatts)
Attached updated patch.
Thanks,
Amit
diff --git a/src/backend/partitioning/partprune.c
b/src/backend/partitioning/partprune.c
index cdc61a8997..a89cec0759 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -854,44 +854,47 @@ gen_partprune_steps_internal(GeneratePruningStepsContext
*context,
}
/*
- * If generate_opsteps is set to false it means no OpExprs were directly
- * present in the input list.
+ * For list and range partitioning, null partition keys can only be
found
+ * in one designated partition, so if there are IS NULL clauses
containing
+ * partition keys we should generate a pruning step that gets rid of all
+ * partitions but the special null-accepting partitions. For range
+ * partitioning, that means we will end up disregarding OpExpr's that
may
+ * have been found for some other keys, but that's fine, because it is
not
+ * possible to prune range partitions with a combination of null and
+ * non-null keys.
+ *
+ * For hash partitioning, if we found IS NULL clauses for some keys and
+ * OpExpr's for others, gen_prune_steps_from_opexps() will generate the
+ * necessary pruning steps if all partition keys are taken care of.
+ * If we found only IS NULL clauses, then we can generate the pruning
+ * step here but only if all partition keys are covered.
*/
- if (!generate_opsteps)
+ if (!bms_is_empty(nullkeys) &&
+ (part_scheme->strategy == PARTITION_STRATEGY_LIST ||
+ part_scheme->strategy == PARTITION_STRATEGY_RANGE ||
+ (part_scheme->strategy == PARTITION_STRATEGY_HASH &&
+ bms_num_members(nullkeys) == part_scheme->partnatts)))
+ {
+ PartitionPruneStep *step;
+
+ step = gen_prune_step_op(context, InvalidStrategy,
+ false, NIL,
NIL, nullkeys);
+ result = lappend(result, step);
+ }
+ else if (!generate_opsteps && !bms_is_empty(notnullkeys) &&
+ bms_num_members(notnullkeys) == part_scheme->partnatts)
{
/*
- * Generate one prune step for the information derived from IS
NULL,
- * if any. To prune hash partitions, we must have found IS NULL
- * clauses for all partition keys.
+ * There are no OpExpr's, but there are IS NOT NULL clauses,
which
+ * can be used to eliminate the null-partition-key-only
partition.
*/
- if (!bms_is_empty(nullkeys) &&
- (part_scheme->strategy != PARTITION_STRATEGY_HASH ||
- bms_num_members(nullkeys) == part_scheme->partnatts))
- {
- PartitionPruneStep *step;
+ PartitionPruneStep *step;
- step = gen_prune_step_op(context, InvalidStrategy,
- false,
NIL, NIL, nullkeys);
- result = lappend(result, step);
- }
-
- /*
- * Note that for IS NOT NULL clauses, simply having step
suffices;
- * there is no need to propagate the exact details of which
keys are
- * required to be NOT NULL. Hash partitioning expects to see
actual
- * values to perform any pruning.
- */
- if (!bms_is_empty(notnullkeys) &&
- part_scheme->strategy != PARTITION_STRATEGY_HASH)
- {
- PartitionPruneStep *step;
-
- step = gen_prune_step_op(context, InvalidStrategy,
- false,
NIL, NIL, NULL);
- result = lappend(result, step);
- }
+ step = gen_prune_step_op(context, InvalidStrategy,
+ false, NIL,
NIL, NULL);
+ result = lappend(result, step);
}
- else
+ else if (generate_opsteps)
{
PartitionPruneStep *step;
diff --git a/src/test/regress/expected/partition_prune.out
b/src/test/regress/expected/partition_prune.out
index d15f1d37f1..a3b21130e3 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -3125,3 +3125,51 @@ explain (costs off) select * from pp_temp_parent where a
= 2;
(3 rows)
drop table pp_temp_parent;
+-- multi-column range partitioning and pruning with IS NULL clauses on some
+-- columns
+create table pp_multirange (a int, b int) partition by range (a, b);
+create table pp_multirange1 partition of pp_multirange for values from (1, 1)
to (100, 100);
+create table pp_multirange2 partition of pp_multirange for values from (100,
100) to (200, 200);
+create table pp_multirange_def partition of pp_multirange default;
+-- all partitions but the default one should be pruned
+explain (costs off) select * from pp_multirange where a = 1 and b is null;
+ QUERY PLAN
+-------------------------------------------
+ Append
+ -> Seq Scan on pp_multirange_def
+ Filter: ((b IS NULL) AND (a = 1))
+(3 rows)
+
+explain (costs off) select * from pp_multirange where a is null and b is null;
+ QUERY PLAN
+-----------------------------------------------
+ Append
+ -> Seq Scan on pp_multirange_def
+ Filter: ((a IS NULL) AND (b IS NULL))
+(3 rows)
+
+explain (costs off) select * from pp_multirange where a is null and b = 1;
+ QUERY PLAN
+-------------------------------------------
+ Append
+ -> Seq Scan on pp_multirange_def
+ Filter: ((a IS NULL) AND (b = 1))
+(3 rows)
+
+explain (costs off) select * from pp_multirange where a is null;
+ QUERY PLAN
+-------------------------------------
+ Append
+ -> Seq Scan on pp_multirange_def
+ Filter: (a IS NULL)
+(3 rows)
+
+explain (costs off) select * from pp_multirange where b is null;
+ QUERY PLAN
+-------------------------------------
+ Append
+ -> Seq Scan on pp_multirange_def
+ Filter: (b IS NULL)
+(3 rows)
+
+drop table pp_multirange;
diff --git a/src/test/regress/sql/partition_prune.sql
b/src/test/regress/sql/partition_prune.sql
index b8e823d562..cf56fdac40 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -823,3 +823,19 @@ create temp table pp_temp_part_def partition of
pp_temp_parent default;
explain (costs off) select * from pp_temp_parent where true;
explain (costs off) select * from pp_temp_parent where a = 2;
drop table pp_temp_parent;
+
+-- multi-column range partitioning and pruning with IS NULL clauses on some
+-- columns
+create table pp_multirange (a int, b int) partition by range (a, b);
+create table pp_multirange1 partition of pp_multirange for values from (1, 1)
to (100, 100);
+create table pp_multirange2 partition of pp_multirange for values from (100,
100) to (200, 200);
+create table pp_multirange_def partition of pp_multirange default;
+
+-- all partitions but the default one should be pruned
+explain (costs off) select * from pp_multirange where a = 1 and b is null;
+explain (costs off) select * from pp_multirange where a is null and b is null;
+explain (costs off) select * from pp_multirange where a is null and b = 1;
+explain (costs off) select * from pp_multirange where a is null;
+explain (costs off) select * from pp_multirange where b is null;
+
+drop table pp_multirange;