On Thu, Sep 15, 2016 at 10:07 PM, Robert Haas <robertmh...@gmail.com> wrote: > On Thu, Sep 15, 2016 at 4:53 AM, Amit Langote > <langote_amit...@lab.ntt.co.jp> wrote: >> Wow, this is bad. What is needed in this case is "canonicalization" of >> the range partition bounds specified in the command. > > I think we shouldn't worry about this. It seems like unnecessary > scope creep. All human beings capable of using PostgreSQL will > understand that there are no integers between 4 and 5, so any > practical impact on this would be on someone creating partitions > automatically. But if someone is creating partitions automatically > they are highly likely to be using the same EXCLUSIVE/INCLUSIVE > settings for all of the partitions, in which case this won't arise. > And if they aren't, then I think we should just make them deal with > this limitation in their code instead of dealing with it in our code. > This patch is plenty complicated enough already; introducing a whole > new canonicalization concept that will help practically nobody seems > to me to be going in the wrong direction. If somebody really cares > enough to want to try to fix this, they can submit a followup patch > someday. > >> To mitigate this, how about we restrict range partition key to contain >> columns of only those types for which we know we can safely canonicalize a >> range bound (ie, discrete range types)? I don't think we can use, say, >> existing int4range_canonical but will have to write a version of it for >> partitioning usage (range bounds of partitions are different from what >> int4range_canonical is ready to handle). This approach will be very >> limiting as then range partitions will be limited to columns of int, >> bigint and date type only. > > -1. That is letting the tail wag the dog. Let's leave it the way you > had it and be happy.
Alright, let's leave this as something to work out later. We will have to document the fact that such limitation exists though, I'd think. >>> -- Observation 2 : able to create sub-partition out of the range set for >>> main table, causing not able to insert data satisfying any of the partition. >>> >>> create table test_subpart (c1 int) partition by range (c1); >>> create table test_subpart_p1 partition of test_subpart for values start (1) >>> end (100) inclusive partition by range (c1); >>> create table test_subpart_p1_sub1 partition of test_subpart_p1 for values >>> start (101) end (200); >> >> It seems that DDL should prevent the same column being used in partition >> key of lower level partitions. I don't know how much sense it would make, >> but being able to use the same column as partition key of lower level >> partitions may be a feature useful to some users if they know what they >> are doing. But this last part doesn't sound like a good thing. I >> modified the patch such that lower level partitions cannot use columns >> used by ancestor tables. > > I again disagree. If somebody defines partition bounds that make it > impossible to insert the data that they care about, that's operator > error. The fact that, across multiple levels, you can manage to make > it impossible to insert any data at all does not make it anything > other than operator error. If we take the contrary position that it's > the system's job to prevent this sort of thing, we may disallow some > useful cases, like partitioning by the year portion of a date and then > subpartitioning by the month portion of that same date. > > I think we'll probably also find that we're making the code > complicated to no purpose. For example, now you have to check when > attaching a partition that it doesn't violate the rule; otherwise you > end up with a table that can't be created directly (and thus can't > survive dump-and-restore) but can be created indirectly by exploiting > loopholes in the checks. It's tempting to think that we can check > simple cases - e.g. if the parent and the child are partitioning on > the same exact column, the child's range should be contained within > the parent's range - but more complicated cases are tricky. Suppose > the table is range-partitioned on (a, b) and range-subpartitioned on > b. It's not trivial to figure out whether the set of values that the > user can insert into that partition is non-empty. If we allow > partitioning on expressions, then it quickly becomes altogether > impossible to deduce anything useful - unless you can solve the > halting problem. > > And, really, why do we care? If the user creates a partitioning > scheme that permits no rows in some or all of the partitions, then > they will have an empty table that can be correctly dumped and > restored but which cannot be used for anything useful unless it is > modified first. They probably don't want that, but it's not any more > broken than a table inheritance setup with mutually exclusive CHECK > constraints, or for that matter a plain table with mutually exclusive > CHECK constraints - and we don't try to prohibit those things. This > patch is supposed to be implementing partitioning, not artificial > intelligence. Agree with your arguments. Certainly, I overlooked some use cases that my proposed solution would outright prevent. I withdraw it. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers