On Thu, Apr 6, 2017 at 7:30 AM, Rahila Syed <rahilasye...@gmail.com> wrote: > Thanks a lot for testing and reporting this. Please find attached an updated > patch with the fix. The patch also contains a fix > regarding operator used at the time of creating expression as default > partition constraint. This was notified offlist by Amit Langote.
I think that the syntax for this patch should probably be revised. Right now the proposal is for: CREATE TABLE .. PARTITION OF ... FOR VALUES IN (DEFAULT); But that's not a good idea for several reasons. For one thing, you can also write FOR VALUES IN (DEFAULT, 5) or which isn't sensible. For another thing, this kind of syntax won't generalize to range partitioning, which we've talked about making this feature support. Maybe something like: CREATE TABLE .. PARTITION OF .. DEFAULT; This patch makes the assumption throughout that any DefElem represents the word DEFAULT, which is true in the patch as written but doesn't seem very future-proof. I think the "def" in "DefElem" stands for "definition" or "define" or something like that, so this is actually pretty confusing. Maybe we should introduce a dedicated node type to represent a default-specification in the parser grammar. If not, then let's at least encapsulate the test a little better, e.g. by adding isDefaultPartitionBound() which tests not only IsA(..., DefElem) but also whether the name is DEFAULT as expected. BTW, we typically use lower-case internally, so if we stick with this representation it should really be "default" not "DEFAULT". Useless hunk: + bool has_def; /* Is there a default partition? Currently false + * for a range partitioned table */ + int def_index; /* Index of the default list partition. -1 for + * range partitioned tables */ Why abbreviate "default" to def here? Seems pointless. + if (found_def) + { + if (mapping[def_index] == -1) + mapping[def_index] = next_index++; + } Consider && @@ -717,7 +754,6 @@ check_new_partition_bound(char *relname, Relation parent, Node *bound) } } } - break; } + * default partiton for rows satisfying the new partition Spelling. + * constraint. If found dont allow addition of a new partition. Missing apostrophe. + defrel = heap_open(defid, AccessShareLock); + tupdesc = CreateTupleDescCopy(RelationGetDescr(defrel)); + + /* Build expression execution states for partition check quals */ + partqualstate = ExecPrepareCheck(partConstraint, + estate); + + econtext = GetPerTupleExprContext(estate); + snapshot = RegisterSnapshot(GetLatestSnapshot()); Definitely not safe against concurrency, since AccessShareLock won't exclude somebody else's update. In fact, it won't even cover somebody else's already-in-flight transaction. + errmsg("new default partition constraint is violated by some row"))); Normally in such cases we try to give more detail using ExecBuildSlotValueDescription. + bool is_def = true; This variable starts out true and is never set to any value other than true. Just get rid of it and, in the one place where it is currently used, write "true". That's shorter and clearer. + inhoids = find_inheritance_children(RelationGetRelid(parent), NoLock); If it's actually safe to do this with no lock, there ought to be a comment with a very compelling explanation of why it's safe. + boundspec = (Node *) stringToNode(TextDatumGetCString(datum)); + bspec = (PartitionBoundSpec *)boundspec; There's not really a reason to cast the result of stringToNode() to Node * and then turn around and cast it to PartitionBoundSpec *. Just cast it directly to whatever it needs to be. And use the new castNode macro. + foreach(cell1, bspec->listdatums) + { + Node *value = lfirst(cell1); + if (IsA(value, DefElem)) + { + def_elem = true; + *defid = inhrelid; + } + } + if (def_elem) + { + ReleaseSysCache(tuple); + continue; + } + foreach(cell3, bspec->listdatums) + { + Node *value = lfirst(cell3); + boundspecs = lappend(boundspecs, value); + } + ReleaseSysCache(tuple); + } + foreach(cell4, spec->listdatums) + { + Node *value = lfirst(cell4); + boundspecs = lappend(boundspecs, value); + } cell1, cell2, cell3, and cell4 are not very clear variable names. Between that and the lack of comments, this is not easy to understand. It's sort of spaghetti logic, too. The if (def_elem) test continues early, but if the point is that the loop using cell3 shouldn't execute in that case, why not just put if (!def_elem) { foreach(cell3, ...) { ... } } instead of reiterating the ReleaseSysCache in two places? + /* Collect bound spec nodes in a list. This is done if the partition is + * a default partition. In case of default partition, constraint is formed + * by performing <> operation over the partition constraints of the + * existing partitions. + */ I doubt that handles NULLs properly. + inhoids = find_inheritance_children(RelationGetRelid(parent), NoLock); Again, no lock? Really? The logic which follows looks largely cut-and-pasted, which makes me think you need to do some refactoring here to make it more clear what's going on, so that you have the relevant logic in just one place. It seems wrong anyway to shove all of this logic specific to the default case into get_qual_from_partbound() when the logic for the non-default case is inside get_qual_for_list. Where there were 2 lines of code before you've now got something like 30. + if(get_negator(operoid) == InvalidOid) + elog(ERROR, "no negator found for partition operator %u", + operoid); I really doubt that's OK. elog() shouldn't be reachable, but this will be reachable if the partitioning operator does not have a negator. And there's the NULL-handling issue I mentioned above, too. + if (partdesc->boundinfo->has_def && key->strategy + == PARTITION_STRATEGY_LIST) + result = parent->indexes[partdesc->boundinfo->def_index]; Testing for PARTITION_STRATEGY_LIST here seems unnecessary. If has_def (or has_default, as it probably should be) isn't allowed for range partitions, then it's redundant; if it is allowed, then that case should be handled too. Also, at this point we've already set *failed_at and *failed_slot; presumably you'd want to make this check before you get to that point. I suspect there are quite a few more problems here in addition to the ones mentioned above, but I don't think it makes sense to spend too much time searching for them until some of this basic stuff is cleaned up. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers