On Fri, May 12, 2017 at 6:08 PM, amul sul <sula...@gmail.com> wrote: > Hi, > > Please find the following updated patches attached: > > 0001-Cleanup.patch : Does some cleanup and code refactoring required > for hash partition patch. Otherwise, there will be unnecessary diff in > 0002 patch
Thanks for splitting the patch. + if (isnull[0]) + cur_index = partdesc->boundinfo->null_index; This code assumes that null_index will be set to -1 when has_null is false. Per RelationBuildPartitionDesc() this is true. But probably we should write this code as if (isnull[0]) { if (partdesc->boundinfo->has_null) cur_index = partdesc->boundinfo->null_index; } That way we are certain that when has_null is false, cur_index = -1 similar to the original code. Additional arguement to ComputePartitionAttrs() isn't used anywhere in this patch, so may be this better be part of 0002. If we do this the only change that will remain in patch is the refactoring of RelationBuildPartitionDesc(), so we may consider merging it into 0002, unless we find that some more refactoring is needed. But for now, having it as a separate patch helps. Here's some more comments on 0002 + * In the case of hash partitioning, datums is a 2-D array, stores modulus and + * remainder values at datums[x][0] and datums[x][1] respectively for each + * partition in the ascending order. This comment about datums should appear in a paragraph of itself and may be rephrased as in the attached patch. May be we could also add a line about ndatums for hash partitioned tables as in the attached patch. + * (see the above enum); NULL for has and list typo s/has/hash + if (key->strategy == PARTITION_STRATEGY_HASH) + { + ndatums = nparts; + hbounds = (PartitionHashBound **) palloc(nparts * + sizeof(PartitionHashBound *)); + i = 0; + foreach (cell, boundspecs) + { + PartitionBoundSpec *spec = lfirst(cell); + [ clipped ] + hbounds[i]->index = i; + i++; + } For list and range partitioned table we order the bounds so that two partitioned tables have them in the same order irrespective of order in which they are specified by the user or hence stored in the catalogs. The partitions then get indexes according the order in which their bounds appear in ordered arrays of bounds. Thus any two partitioned tables with same partition specification always have same PartitionBoundInfoData. This helps in partition-wise join to match partition bounds of two given tables. Above code assigns the indexes to the partitions as they appear in the catalogs. This means that two partitioned tables with same partition specification but different order for partition bound specification will have different PartitionBoundInfoData represenation. If we do that, probably partition_bounds_equal() would reduce to just matching indexes and the last element of datums array i.e. the greatest modulus datum. If ordered datums array of two partitioned table do not match exactly, the mismatch can be because missing datums or different datums. If it's a missing datum it will change the greatest modulus or have corresponding entry in indexes array as -1. If the entry differs it will cause mismatching indexes in the index arrays. + * is not a factor of 15. + * + * + * Get greatest bound in array boundinfo->datums which is An extra line here. + if (offset < 0) + { + nmod = DatumGetInt32(datums[0][0]); + valid_bound = (nmod % spec->modulus) == 0; + } + else + { + pmod = DatumGetInt32(datums[offset][0]); + valid_bound = (spec->modulus % pmod) == 0; + + if (valid_bound && (offset + 1) < ndatums) + { + nmod = DatumGetInt32(datums[offset + 1][0]); + valid_bound = (nmod % spec->modulus) == 0; + } + } May be name the variables as prev_mod(ulus) and next_mod(ulus) for better readability. + * for p_p1: satisfies_hash_partition(2, 1, hash_fn(a), hash_fn(b)) + * for p_p2: satisfies_hash_partition(4, 2, hash_fn(a), hash_fn(b)) + * for p_p3: satisfies_hash_partition(8, 0, hash_fn(a), hash_fn(b)) + * for p_p4: satisfies_hash_partition(8, 4, hash_fn(a), hash_fn(b)) The description here may be read as if we are calling the same hash function for both a and b, but that's not true. So, you may want to clarify that in hash_fn(a) hash_fn means hash function specified for key a. + if (key->partattrs[i] != 0) + { + keyCol = (Node *) makeVar(1, + key->partattrs[i], + key->parttypid[i], + key->parttypmod[i], + key->parttypcoll[i], + 0); + + /* Form hash_fn(value) expression */ + keyCol = (Node *) makeFuncExpr(key->partsupfunc[i].fn_oid, + get_fn_expr_rettype(&key->partsupfunc[i]), + list_make1(keyCol), + InvalidOid, + InvalidOid, + COERCE_EXPLICIT_CALL); + } + else + { + keyCol = (Node *) copyObject(lfirst(partexprs_item)); + partexprs_item = lnext(partexprs_item); + } I think we should add FuncExpr for column Vars as well as expressions. The logic to compare two bounds is duplicated in qsort_partition_hbound_cmp() and partition_bound_cmp(). Should we separate it into a separate function accepting moduli and remainders. That way in case we change it in future, we have to change only one place. I think we need more comments for compute_hash_value(), mix_hash_value() and satisfies_hash_partition() as to what each of them accepts and what it computes. + /* key's hash values start from third argument of function. */ + if (!PG_ARGISNULL(i + 2)) + { + values[i] = PG_GETARG_DATUM(i + 2); + isnull[i] = false; + } + else + isnull[i] = true; You could write this as isnull[i] = PG_ARGISNULL(i + 2); if (isnull[i]) values[i] = PG_GETARG_DATUM(i + 2); + * Identify a btree or hash opclass to use. Currently, we use only + * btree operators, which seems enough for list and range partitioning, + * and hash operators for hash partitioning. The wording, if not read carefully, might be read as "we use only btree operators". I suggest we rephrase it as "Identify opclass to use. For list and range partitioning we use only btree operators, which seems enough for those. For hash partitioning, we use hash operators." for clarity. + foreach (lc, $5) + { + DefElem *opt = (DefElem *) lfirst(lc); A search on WITH in gram.y shows that we do not handle WITH options in gram.y. Usually they are handled at the transformation stage. Why is this an exception? If you do that, we can have all the error handling in transformPartitionBound(). +DATA(insert OID = 5028 ( satisfies_hash_partition PGNSP PGUID 12 1 0 2276 0 f f f f f f i s 3 0 16 "23 23 2276" _null_ _null_ _null_ _null_ _null_ satisfies_hash_partition _null_ _null_ _null_ )); Why is third argument to this function ANY? Shouldn't it be INT4ARRAY (variadic INT4)? I am yet to review the testcases and thumb through all the places using PARTITION_STRATEGY_RANGE/LIST to make sure that we are handling PARTITION_STRATEGY_HASH in all those places. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c index 6bd4aa8..89c3b0a 100644 --- a/src/backend/catalog/partition.c +++ b/src/backend/catalog/partition.c @@ -59,9 +59,12 @@ * In the case of range partitioning, ndatums will typically be far less than * 2 * nparts, because a partition's upper bound and the next partition's lower * bound are the same in most common cases, and we only store one of them. - * In the case of hash partitioning, datums is a 2-D array, stores modulus and - * remainder values at datums[x][0] and datums[x][1] respectively for each - * partition in the ascending order. + * In case of hash partitioning, ndatums be same as the number of partitions. + * + * For range and list partitioned tables, datums is an array of datum-tuples + * with key->partnatts datums each. + * For hash partitioned tables, it is an array of datum-tuples with 2 datums, + * modulus and remainder, corresponding to a given partition. * * In the case of list partitioning, the indexes array stores one entry for * every datum, which is the index of the partition that accepts a given datum. @@ -87,12 +90,9 @@ typedef struct PartitionBoundInfoData { char strategy; /* hash, list or range bounds? */ int ndatums; /* Length of the datums following array */ - Datum **datums; /* For hash partitioned table, array of modulus - * and remainder. For range and list partitioned - * table, array of datum-tuples with - * key->partnatts datums each */ + Datum **datums; RangeDatumContent **content;/* what's contained in each range bound datum? - * (see the above enum); NULL for has and list + * (see the above enum); NULL for hash and list * partitioned tables */ int *indexes; /* Partition indexes */ bool has_null; /* Is there a null-accepting partition? false
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers