Re: [HACKERS] hash partitioning based on v10Beta2

2017-08-28 Thread yang...@highgo.com
On Sat, Aug 26, 2017 at 10:10 AM, yang...@highgo.com <yang...@highgo.com>
wrote:

> Hi all,
>
> Now we have had the range / list partition, but hash partitioning is not
> implemented yet.
> Attached is a POC patch based on the v10Beta2 to add the
> hash partitioning feature.
> Although we will need more discussions about the syntax
> and other specifications before going ahead the project,
> but I think this runnable code might help to discuss
> what and how we implement this.
>
>
FYI, there is already an existing commitfest entry for this project.

https://commitfest.postgresql.org/14/1059/



> Description
>
> The hash partition's implement is on the basis of
> the original range / list partition,and using similar syntax.
>
> To create a partitioned table ,use:
>
> CREATE TABLE h (id int) PARTITION BY HASH(id);
>
> The partitioning key supports only one value, and I think
> the partition key can support multiple values,
> which may be difficult to implement when querying, but
> it is not impossible.
>
> A partition table can be create as bellow:
>
>  CREATE TABLE h1 PARTITION OF h;
>  CREATE TABLE h2 PARTITION OF h;
>  CREATE TABLE h3 PARTITION OF h;
>
> FOR VALUES clause cannot be used, and the partition bound
> is calclulated automatically as partition index of single integer value.
>
> An inserted record is stored in a partition whose index equals
> DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0],
> TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts
> /* Number of partitions */
> ;
> In the above example, this is DatumGetUInt32(OidFunctionCall1(lookup_type_
> cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, id)) % 3;
>
> postgres=# insert into h select generate_series(1,20);
> INSERT 0 20
> postgres=# select tableoid::regclass,* from h;
>  tableoid | id
> --+
>  h1   |  3
>  h1   |  5
>  h1   | 17
>  h1   | 19
>  h2   |  2
>  h2   |  6
>  h2   |  7
>  h2   | 11
>  h2   | 12
>  h2   | 14
>  h2   | 15
>  h2   | 18
>  h2   | 20
>  h3   |  1
>  h3   |  4
>  h3   |  8
>  h3   |  9
>  h3   | 10
>  h3   | 13
>  h3   | 16
> (20 rows)
>
> The number of partitions here can be dynamically added, and
> if a new partition is created, the number of partitions
> changes, the calculated target partitions will change,
>  and the same data is not reasonable in different
> partitions,So you need to re-calculate the existing data
> and insert the target partition when you create a new partition.
>
> postgres=# create table h4 partition of h;
> CREATE TABLE
> postgres=# select tableoid::regclass,* from h;
>  tableoid | id
> --+
>  h1   |  5
>  h1   | 17
>  h1   | 19
>  h1   |  6
>  h1   | 12
>  h1   |  8
>  h1   | 13
>  h2   | 11
>  h2   | 14
>  h3   |  1
>  h3   |  9
>  h3   |  2
>  h3   | 15
>  h4   |  3
>  h4   |  7
>  h4   | 18
>  h4   | 20
>  h4   |  4
>  h4   | 10
>  h4   | 16
> (20 rows)
>
> When querying the data, the hash partition uses the same
> algorithm as the insertion, and filters out the table
> that does not need to be scanned.
>
> postgres=# explain analyze select * from h where id = 1;
>  QUERY PLAN
>
> 
> 
>  Append  (cost=0.00..41.88 rows=13 width=4) (actual time=
> 0.020..0.023 rows=1 loops=1)
>->  Seq Scan on h3  (cost=0.00..41.88 rows=13 width=4) (
> actual time=0.013..0.016 rows=1 loops=1)
>  Filter: (id = 1)
>  Rows Removed by Filter: 3
>  Planning time: 0.346 ms
>  Execution time: 0.061 ms
> (6 rows)
>
> postgres=# explain analyze select * from h where id in (1,5);;
>  QUERY PLAN
>
> 
> 
>  Append  (cost=0.00..83.75 rows=52 width=4) (actual time=
> 0.016..0.028 rows=2 loops=1)
>->  Seq Scan on h1  (cost=0.00..41.88 rows=26 width=4) (
> actual time=0.015..0.018 rows=1 loops=1)
>  Filter: (id = ANY ('{1,5}'::integer[]))
>  Rows Removed by Filter: 6
>->  Seq Scan on h3  (cost=0.00..41.88 rows=26 width=4) (
> actual time=0.005..0.007 rows=1 loops=1)
>  Filter: (id = ANY ('{1,5}'::integer[]))
>  Rows Removed by Filter: 3
>  Planning time: 0.720 ms
>  Execution time: 0.074 ms
> (9 rows)
>
> postgres=

Re: [HACKERS] [POC] hash partitioning

2017-08-28 Thread yang...@highgo.com
Hello

Looking at your hash partitioning syntax, I implemented a hash partition in a 
more concise way, with no need to determine the number of sub-tables, and 
dynamically add partitions.

Description

The hash partition's implement is on the basis of the original range / list 
partition,and using similar syntax.

To create a partitioned table ,use:

CREATE TABLE h (id int) PARTITION BY HASH(id);

The partitioning key supports only one value, and I think the partition key can 
support multiple values, 
which may be difficult to implement when querying, but it is not impossible.

A partition table can be create as bellow:

 CREATE TABLE h1 PARTITION OF h;
 CREATE TABLE h2 PARTITION OF h;
 CREATE TABLE h3 PARTITION OF h;
 
FOR VALUES clause cannot be used, and the partition bound is calclulated 
automatically as partition index of single integer value.

An inserted record is stored in a partition whose index equals 
DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], 
TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts/* Number of partitions */
;
In the above example, this is 
DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], 
TYPECACHE_HASH_PROC)->hash_proc, id)) % 3;

postgres=# insert into h select generate_series(1,20);
INSERT 0 20
postgres=# select tableoid::regclass,* from h;
 tableoid | id 
--+
 h1   |  3
 h1   |  5
 h1   | 17
 h1   | 19
 h2   |  2
 h2   |  6
 h2   |  7
 h2   | 11
 h2   | 12
 h2   | 14
 h2   | 15
 h2   | 18
 h2   | 20
 h3   |  1
 h3   |  4
 h3   |  8
 h3   |  9
 h3   | 10
 h3   | 13
 h3   | 16
(20 rows)

The number of partitions here can be dynamically added, and if a new partition 
is created, the number of partitions changes, the calculated target partitions 
will change, and the same data is not reasonable in different partitions,So you 
need to re-calculate the existing data and insert the target partition when you 
create a new partition.

postgres=# create table h4 partition of h;
CREATE TABLE
postgres=# select tableoid::regclass,* from h;
 tableoid | id 
--+
 h1   |  5
 h1   | 17
 h1   | 19
 h1   |  6
 h1   | 12
 h1   |  8
 h1   | 13
 h2   | 11
 h2   | 14
 h3   |  1
 h3   |  9
 h3   |  2
 h3   | 15
 h4   |  3
 h4   |  7
 h4   | 18
 h4   | 20
 h4   |  4
 h4   | 10
 h4   | 16
(20 rows)

When querying the data, the hash partition uses the same algorithm as the 
insertion, and filters out the table that does not need to be scanned.

postgres=# explain analyze select * from h where id = 1;
 QUERY PLAN 


 Append  (cost=0.00..41.88 rows=13 width=4) (actual time=0.020..0.023 rows=1 
loops=1)
   ->  Seq Scan on h3  (cost=0.00..41.88 rows=13 width=4) (actual 
time=0.013..0.016 rows=1 loops=1)
 Filter: (id = 1)
 Rows Removed by Filter: 3
 Planning time: 0.346 ms
 Execution time: 0.061 ms
(6 rows)

postgres=# explain analyze select * from h where id in (1,5);;
 QUERY PLAN 


 Append  (cost=0.00..83.75 rows=52 width=4) (actual time=0.016..0.028 rows=2 
loops=1)
   ->  Seq Scan on h1  (cost=0.00..41.88 rows=26 width=4) (actual 
time=0.015..0.018 rows=1 loops=1)
 Filter: (id = ANY ('{1,5}'::integer[]))
 Rows Removed by Filter: 6
   ->  Seq Scan on h3  (cost=0.00..41.88 rows=26 width=4) (actual 
time=0.005..0.007 rows=1 loops=1)
 Filter: (id = ANY ('{1,5}'::integer[]))
 Rows Removed by Filter: 3
 Planning time: 0.720 ms
 Execution time: 0.074 ms
(9 rows)

postgres=# explain analyze select * from h where id = 1 or id = 5;;
 QUERY PLAN 


 Append  (cost=0.00..96.50 rows=50 width=4) (actual time=0.017..0.078 rows=2 
loops=1)
   ->  Seq Scan on h1  (cost=0.00..48.25 rows=25 width=4) (actual 
time=0.015..0.019 rows=1 loops=1)
 Filter: ((id = 1) OR (id = 5))
 Rows Removed by Filter: 6
   ->  Seq Scan on h3  (cost=0.00..48.25 rows=25 width=4) (actual 
time=0.005..0.010 rows=1 loops=1)
 Filter: ((id = 1) OR (id = 5))
 Rows Removed by Filter: 3
 Planning time: 0.396 ms
 Execution time: 0.139 ms
(9 rows)

Can not detach / attach / drop partition table.

Best regards,
young


yonj1e.github.io
yang...@highgo.com


[HACKERS] hash partitioning based on v10Beta2

2017-08-26 Thread yang...@highgo.com
Hi all,

Now we have had the range / list partition, but hash partitioning is not 
implemented yet. 
Attached is a POC patch based on the v10Beta2 to add the hash partitioning 
feature. 
Although we will need more discussions about the syntax and other 
specifications before going ahead the project, 
but I think this runnable code might help to discuss what and how we implement 
this.

Description

The hash partition's implement is on the basis of the original range / list 
partition,and using similar syntax.

To create a partitioned table ,use:

CREATE TABLE h (id int) PARTITION BY HASH(id);

The partitioning key supports only one value, and I think the partition key can 
support multiple values, 
which may be difficult to implement when querying, but it is not impossible.

A partition table can be create as bellow:

 CREATE TABLE h1 PARTITION OF h;
 CREATE TABLE h2 PARTITION OF h;
 CREATE TABLE h3 PARTITION OF h;
 
FOR VALUES clause cannot be used, and the partition bound is calclulated 
automatically as partition index of single integer value.

An inserted record is stored in a partition whose index equals 
DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], 
TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts /* Number of partitions */
;
In the above example, this is 
DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], 
TYPECACHE_HASH_PROC)->hash_proc, id)) % 3;

postgres=# insert into h select generate_series(1,20);
INSERT 0 20
postgres=# select tableoid::regclass,* from h;
 tableoid | id 
--+
 h1   |  3
 h1   |  5
 h1   | 17
 h1   | 19
 h2   |  2
 h2   |  6
 h2   |  7
 h2   | 11
 h2   | 12
 h2   | 14
 h2   | 15
 h2   | 18
 h2   | 20
 h3   |  1
 h3   |  4
 h3   |  8
 h3   |  9
 h3   | 10
 h3   | 13
 h3   | 16
(20 rows)

The number of partitions here can be dynamically added, and if a new partition 
is created, the number of partitions changes, the calculated target partitions 
will change, and the same data is not reasonable in different partitions,So you 
need to re-calculate the existing data and insert the target partition when you 
create a new partition.

postgres=# create table h4 partition of h;
CREATE TABLE
postgres=# select tableoid::regclass,* from h;
 tableoid | id 
--+
 h1   |  5
 h1   | 17
 h1   | 19
 h1   |  6
 h1   | 12
 h1   |  8
 h1   | 13
 h2   | 11
 h2   | 14
 h3   |  1
 h3   |  9
 h3   |  2
 h3   | 15
 h4   |  3
 h4   |  7
 h4   | 18
 h4   | 20
 h4   |  4
 h4   | 10
 h4   | 16
(20 rows)

When querying the data, the hash partition uses the same algorithm as the 
insertion, and filters out the table that does not need to be scanned.

postgres=# explain analyze select * from h where id = 1;
 QUERY PLAN 


 Append  (cost=0.00..41.88 rows=13 width=4) (actual time=0.020..0.023 rows=1 
loops=1)
   ->  Seq Scan on h3  (cost=0.00..41.88 rows=13 width=4) (actual 
time=0.013..0.016 rows=1 loops=1)
 Filter: (id = 1)
 Rows Removed by Filter: 3
 Planning time: 0.346 ms
 Execution time: 0.061 ms
(6 rows)

postgres=# explain analyze select * from h where id in (1,5);;
 QUERY PLAN 


 Append  (cost=0.00..83.75 rows=52 width=4) (actual time=0.016..0.028 rows=2 
loops=1)
   ->  Seq Scan on h1  (cost=0.00..41.88 rows=26 width=4) (actual 
time=0.015..0.018 rows=1 loops=1)
 Filter: (id = ANY ('{1,5}'::integer[]))
 Rows Removed by Filter: 6
   ->  Seq Scan on h3  (cost=0.00..41.88 rows=26 width=4) (actual 
time=0.005..0.007 rows=1 loops=1)
 Filter: (id = ANY ('{1,5}'::integer[]))
 Rows Removed by Filter: 3
 Planning time: 0.720 ms
 Execution time: 0.074 ms
(9 rows)

postgres=# explain analyze select * from h where id = 1 or id = 5;;
 QUERY PLAN 


 Append  (cost=0.00..96.50 rows=50 width=4) (actual time=0.017..0.078 rows=2 
loops=1)
   ->  Seq Scan on h1  (cost=0.00..48.25 rows=25 width=4) (actual 
time=0.015..0.019 rows=1 loops=1)
 Filter: ((id = 1) OR (id = 5))
 Rows Removed by Filter: 6
   ->  Seq Scan on h3  (cost=0.00..48.25 rows=25 width=4) (actual 
time=0.005..0.010 rows=1 loops=1)
 Filter: ((id = 1) OR (id = 5))
 Rows Removed by Filter: 3
 Planning time: 0.396 ms
 Execution time: 0.139 ms
(9 rows)

Can not detach / attach / drop partition table.