Hi,

While testing ALTER TABLE ... SPLIT PARTITION, I found a bug and a few 
behaviors and messages that seem worth improving.

0. A bound-overlap bug

I numbered this item as 0 because I found it after finishing items 1, 2, and 3. 
While doing a final verification before sending this email, I was surprised to 
find that the partitioned table ended up with two overlapping partitions.

Here is a simple repro:
```
evantest=# drop table t;
DROP TABLE
evantest=# CREATE TABLE t (i int) PARTITION BY RANGE(i);
CREATE TABLE
evantest=# CREATE TABLE p0a PARTITION OF t FOR VALUES FROM (0) TO (51);
CREATE TABLE
evantest=# CREATE TABLE p0b PARTITION OF t FOR VALUES FROM (51) TO (100);
CREATE TABLE
evantest=# \d+ t;
                                      Partitioned table "public.t"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | 
Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 i      | integer |           |          |         | plain   |             |    
          |
Partition key: RANGE (i)
Partitions:
    p0a FOR VALUES FROM (0) TO (51)
    p0b FOR VALUES FROM (51) TO (100)

evantest=# ALTER TABLE t SPLIT PARTITION p0a INTO
evantest-#   (PARTITION p0a FOR VALUES FROM (0) TO (53),
evantest(#    PARTITION pdef DEFAULT);
ALTER TABLE
evantest=#
evantest=#
evantest=# \d+ t;
                                      Partitioned table "public.t"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | 
Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 i      | integer |           |          |         | plain   |             |    
          |
Partition key: RANGE (i)
Partitions:
    p0a FOR VALUES FROM (0) TO (53)
    p0b FOR VALUES FROM (51) TO (100)
    pdef DEFAULT
```

As shown above, p0a and p0b now overlap. I think this is a real bug.

The problem seems to be in check_partition_bounds_for_split_range(). If the 
only non-default replacement partition is both first and last, the function 
checks only the lower bound because it uses if (first) ... else .... It never 
checks the upper bound in that case.

1. The documentation about splitting with a DEFAULT partition is a bit unclear

Since SPLIT PARTITION allows one of the new partitions to be specified as 
DEFAULT, I wondered whether that new DEFAULT partition means the remaining part 
of the split partition's bound, or the partitioned table's global DEFAULT 
partition.

The current documentation says:
```
     <para>
      This form splits a single partition of the target table into new
      partitions. Hash-partitioned target table is not supported.
      Only a simple, non-partitioned partition can be split.
      If the split partition is the <literal>DEFAULT</literal> partition,
      one of the new partitions must be <literal>DEFAULT</literal>.
      If the partitioned table does not have a <literal>DEFAULT</literal>
      partition, a <literal>DEFAULT</literal> partition can be defined as one
      of the new partitions.
     </para>

     <para>
      The bounds of new partitions should not overlap with those of new or
      existing partitions (except <replaceable 
class="parameter">partition_name</replaceable>).
      The combined bounds of new partitions <literal>
      <replaceable class="parameter">partition_name1</replaceable>,
      <replaceable class="parameter">partition_name2</replaceable>[, ...]
      </literal> should be equal to the bounds of the split partition
      <replaceable class="parameter">partition_name</replaceable>.
      One of the new partitions can have the same name as the split partition
      <replaceable class="parameter">partition_name</replaceable>
      (this is suitable in case of splitting the <literal>DEFAULT</literal>
      partition: after the split, the <literal>DEFAULT</literal> partition
      remains with the same name, but its partition bound changes).
     </para>
```

From the first paragraph, it seems that the new DEFAULT partition is a 
table-level default partition. However, the second paragraph says that the 
combined bounds of the new partitions should be equal to the bounds of the 
split partition, which can make it look as if the new DEFAULT partition only 
covers the remaining part of the split partition's bound.

My tests show that the new DEFAULT partition is the partitioned table's global 
DEFAULT partition. So I think the documentation can be improved to make that 
clearer.

2. I found this hint message confusing:
```
evantest=# ALTER TABLE t SPLIT PARTITION p0a INTO (PARTITION p0a1 FOR VALUES 
FROM (0) TO (5), PARTITION p0a2 FOR VALUES FROM (6) to (51), PARTITION pdef 
DEFAULT);
ERROR:  upper bound of partition "p0a2" is greater than upper bound of split 
partition "p0a"
LINE 1: ...0) TO (5), PARTITION p0a2 FOR VALUES FROM (6) to (51), PARTI...
                                                             ^
HINT:  ALTER TABLE ... SPLIT PARTITION require combined bounds of new 
partitions must exactly match the bound of the split partition.
```

In this command, one of the new explicit partition bounds exceeds the original 
partition bound, but the command also specifies a DEFAULT partition. In this 
case, the combined explicit bounds do not need to exactly match the original 
partition bound, they only need to stay within it. So the hint is not quite 
accurate for this case.

3. SPLIT PARTITION currently provides another way to add a DEFAULT partition:
```
evantest=# \d+ t;
                                      Partitioned table "public.t"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | 
Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 i      | integer |           |          |         | plain   |             |    
          |
Partition key: RANGE (i)
Partitions:
    p0a FOR VALUES FROM (0) TO (50)
    p0b FOR VALUES FROM (51) TO (100)
    p200 FOR VALUES FROM (200) TO (300)

evantest=# ALTER TABLE t SPLIT PARTITION p0a INTO (PARTITION p0a FOR VALUES 
FROM (0) TO (50), PARTITION pdef DEFAULT);
ALTER TABLE
```

Here, the new p0a partition has the same bound as the original p0a partition, 
so no real split happens. The command effectively only adds a new DEFAULT 
partition. However, it still goes through the full split-partition path: 
creating a new partition, moving data, attaching the new partition, and 
dropping the old partition.

Initially, I considered adding a fast path for this case so that it would only 
add the new DEFAULT partition. But after thinking about it more, I think it is 
better to reject this degenerate form instead. We already has direct ways to 
add a DEFAULT partition:
```
CREATE TABLE p PARTITION OF t DEFAULT;

or

CREATE TABLE p;
ALTER TABLE t ATTACH PARTITION p DEFAULT;
```

So I do not think SPLIT PARTITION needs to become another syntax for adding a 
DEFAULT partition when no actual split is performed. Accepting this form could 
also raise another question later: if this is allowed, why does the user have 
to repeat the original bound at all? Why not allow something like this?
```
ALTER TABLE t SPLIT PARTITION p0a INTO (PARTITION pdef DEFAULT);
```
That seems like an awkward direction.

The attached patch tries to address all these issues.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Attachment: v1-0001-Fix-SPLIT-PARTITION-validation-with-DEFAULT.patch
Description: Binary data

Reply via email to