Re: [HACKERS] [POC] hash partitioning

Yugo Nagata Thu, 22 Jun 2017 21:42:07 -0700

On Tue, 6 Jun 2017 13:03:58 +0530
amul sul <[email protected]> wrote:



> Updated patch attached.

I looked into the latest patch (v13) and have some comments
althogh they might be trivial.

First, I couldn't apply this patch to the latest HEAD due to
a documentation fix and pgintend updates. It needes rebase.

$ git apply /tmp/0002-hash-partitioning_another_design-v13.patch  
error: patch failed: doc/src/sgml/ref/create_table.sgml:87
error: doc/src/sgml/ref/create_table.sgml: patch does not apply
error: patch failed: src/backend/catalog/partition.c:76
error: src/backend/catalog/partition.c: patch does not apply
error: patch failed: src/backend/commands/tablecmds.c:13371
error: src/backend/commands/tablecmds.c: patch does not apply


       <varlistentry>
+       <term>Hash Partitioning</term>
+
+       <listitem>
+        <para>
+         The table is partitioned by specifying modulus and remainder for each
+         partition. Each partition holds rows for which the hash value of
+         partition keys when divided by specified modulus produces specified
+         remainder. For more clarification on modulus and remainder please 
refer
+         <xref linkend="sql-createtable-partition">.
+        </para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
        <term>Range Partitioning</term>

I think this section should be inserted after List Partitioning section because
the order of the descriptions is Range, List, then Hash in other places of
the documentation. At least, 


-    <firstterm>partition bounds</firstterm>.  Currently supported
-    partitioning methods include range and list, where each partition is
-    assigned a range of keys and a list of keys, respectively.
+    <firstterm>partition bounds</firstterm>.  The currently supported
+    partitioning methods are list, range, and hash.
    </para>

Also in this hunk. I think "The currently supported partitioning methods are
range, list, and hash." is better. We don't need to change the order of
the original description.
 

       <listitem>
        <para>
-        Declarative partitioning only supports list and range partitioning,
-        whereas table inheritance allows data to be divided in a manner of
-        the user's choosing.  (Note, however, that if constraint exclusion is
-        unable to prune partitions effectively, query performance will be very
-        poor.)
+        Declarative partitioning only supports hash, list and range
+        partitioning, whereas table inheritance allows data to be divided in a
+        manner of the user's choosing.  (Note, however, that if constraint
+        exclusion is unable to prune partitions effectively, query performance
+        will be very poor.)

Similarly, I think "Declarative partitioning only supports range, list and hash
partitioning," is better.


+
+  <para>
+   Create a hash partitioned table:
+<programlisting>
+CREATE TABLE orders (
+    order_id     bigint not null,
+    cust_id      bigint not null,
+    status       text
+) PARTITION BY HASH (order_id);
+</programlisting></para>
+

This paragraph should be inserted between "Create a list partitioned table:"
paragraph and "Ceate partition of a range partitioned table:" paragraph
as well as range and list.


                *strategy = PARTITION_STRATEGY_LIST;
        else if (pg_strcasecmp(partspec->strategy, "range") == 0)
                *strategy = PARTITION_STRATEGY_RANGE;
+       else if (pg_strcasecmp(partspec->strategy, "hash") == 0)
+               *strategy = PARTITION_STRATEGY_HASH;
        else
                ereport(ERROR,

In the most of codes, the order is hash, range, then list, but only
in transformPartitionSpec(), the order is list, range, then hash,
as above. Maybe it is better to be uniform.


+                       {
+                               if (strategy == PARTITION_STRATEGY_HASH)
+                                       ereport(ERROR,
+                                                       
(errcode(ERRCODE_UNDEFINED_OBJECT),
+                                                        errmsg("data type %s 
has no default hash operator class",
+                                                                       
format_type_be(atttype)),
+                                                        errhint("You must 
specify a hash operator class or define a default hash operator class for the 
data type.")));
+                               else
+                                       ereport(ERROR,
+                                                       
(errcode(ERRCODE_UNDEFINED_OBJECT),
+                                                        errmsg("data type %s 
has no default btree operator class",
+                                                                       
format_type_be(atttype)),
+                                                        errhint("You must 
specify a btree operator class or define a default btree operator class for the 
data type.")));
+
+

                                                                                
           atttype,
-                                                                               
           "btree",
-                                                                               
           BTREE_AM_OID);
+                                                                               
           am_oid == HASH_AM_OID ? "hash" : "btree",
+                                                                               
           am_oid);

How about writing this part as following to reduce code redundancy?

+       Oid                     am_oid;
+       char       *am_name;

<snip> 

+               if (strategy == PARTITION_STRATEGY_HASH)
+               {
+                       am_oid = HASH_AM_OID;
+                       am_name = pstrdup("hash");
+               }
+               else
+               {
+                       am_oid = BTREE_AM_OID;
+                       am_name = pstrdup("btree");
+               }
+
                if (!pelem->opclass)
                {
-                       partopclass[attn] = GetDefaultOpClass(atttype, 
BTREE_AM_OID);
+                       partopclass[attn] = GetDefaultOpClass(atttype, am_oid);
 
                        if (!OidIsValid(partopclass[attn]))
                                ereport(ERROR,
                                                
(errcode(ERRCODE_UNDEFINED_OBJECT),
-                                  errmsg("data type %s has no default btree 
operator class",
-                                                 format_type_be(atttype)),
-                                                errhint("You must specify a 
btree operator class or define a default btree operator class for the data 
type.")));
+                                                errmsg("data type %s has no 
default %s operator class",
+                                                               
format_type_be(atttype), am_name),
+                                                errhint("You must specify a %s 
operator class or define a default %s operator class for the data type.",
+                                                                am_name, 
am_name)));
+
                }
                else
                        partopclass[attn] = ResolveOpClass(pelem->opclass,
                                                                                
           atttype,
-                                                                               
           "btree",
-                                                                               
           BTREE_AM_OID);
+                                                                               
           am_name,
+                                                                               
           am_oid);


There is meaningless indentation change.

@@ -2021,7 +2370,8 @@ get_partition_for_tuple(PartitionDispatch *pd,
                    /* bsearch in partdesc->boundinfo */
                    cur_offset = partition_bound_bsearch(key,
                                                         partdesc->boundinfo,
-                                                        values, false, &equal);
+                                                     values, false, &equal);
+
                    /*
                     * Offset returned is such that the bound at offset is


Fixing the comment of pg_get_partkeydef() is missing.

 * pg_get_partkeydef
 *
 * Returns the partition key specification, ie, the following:
 *
 * PARTITION BY { RANGE | LIST } (column opt_collation opt_opclass [, ...])
 */
Datum
pg_get_partkeydef(PG_FUNCTION_ARGS)
{

Regards,

> 
> Regards,
> Amul Sul


-- 
Yugo Nagata <[email protected]>


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [POC] hash partitioning

Reply via email to