Re: [HACKERS] hash partitioning based on v10Beta2
On Sat, Aug 26, 2017 at 10:10 AM, yang...@highgo.com <yang...@highgo.com> wrote: > Hi all, > > Now we have had the range / list partition, but hash partitioning is not > implemented yet. > Attached is a POC patch based on the v10Beta2 to add the > hash partitioning feature. > Although we will need more discussions about the syntax > and other specifications before going ahead the project, > but I think this runnable code might help to discuss > what and how we implement this. > > FYI, there is already an existing commitfest entry for this project. https://commitfest.postgresql.org/14/1059/ > Description > > The hash partition's implement is on the basis of > the original range / list partition,and using similar syntax. > > To create a partitioned table ,use: > > CREATE TABLE h (id int) PARTITION BY HASH(id); > > The partitioning key supports only one value, and I think > the partition key can support multiple values, > which may be difficult to implement when querying, but > it is not impossible. > > A partition table can be create as bellow: > > CREATE TABLE h1 PARTITION OF h; > CREATE TABLE h2 PARTITION OF h; > CREATE TABLE h3 PARTITION OF h; > > FOR VALUES clause cannot be used, and the partition bound > is calclulated automatically as partition index of single integer value. > > An inserted record is stored in a partition whose index equals > DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], > TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts > /* Number of partitions */ > ; > In the above example, this is DatumGetUInt32(OidFunctionCall1(lookup_type_ > cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, id)) % 3; > > postgres=# insert into h select generate_series(1,20); > INSERT 0 20 > postgres=# select tableoid::regclass,* from h; > tableoid | id > --+ > h1 | 3 > h1 | 5 > h1 | 17 > h1 | 19 > h2 | 2 > h2 | 6 > h2 | 7 > h2 | 11 > h2 | 12 > h2 | 14 > h2 | 15 > h2 | 18 > h2 | 20 > h3 | 1 > h3 | 4 > h3 | 8 > h3 | 9 > h3 | 10 > h3 | 13 > h3 | 16 > (20 rows) > > The number of partitions here can be dynamically added, and > if a new partition is created, the number of partitions > changes, the calculated target partitions will change, > and the same data is not reasonable in different > partitions,So you need to re-calculate the existing data > and insert the target partition when you create a new partition. > > postgres=# create table h4 partition of h; > CREATE TABLE > postgres=# select tableoid::regclass,* from h; > tableoid | id > --+ > h1 | 5 > h1 | 17 > h1 | 19 > h1 | 6 > h1 | 12 > h1 | 8 > h1 | 13 > h2 | 11 > h2 | 14 > h3 | 1 > h3 | 9 > h3 | 2 > h3 | 15 > h4 | 3 > h4 | 7 > h4 | 18 > h4 | 20 > h4 | 4 > h4 | 10 > h4 | 16 > (20 rows) > > When querying the data, the hash partition uses the same > algorithm as the insertion, and filters out the table > that does not need to be scanned. > > postgres=# explain analyze select * from h where id = 1; > QUERY PLAN > > > > Append (cost=0.00..41.88 rows=13 width=4) (actual time= > 0.020..0.023 rows=1 loops=1) >-> Seq Scan on h3 (cost=0.00..41.88 rows=13 width=4) ( > actual time=0.013..0.016 rows=1 loops=1) > Filter: (id = 1) > Rows Removed by Filter: 3 > Planning time: 0.346 ms > Execution time: 0.061 ms > (6 rows) > > postgres=# explain analyze select * from h where id in (1,5);; > QUERY PLAN > > > > Append (cost=0.00..83.75 rows=52 width=4) (actual time= > 0.016..0.028 rows=2 loops=1) >-> Seq Scan on h1 (cost=0.00..41.88 rows=26 width=4) ( > actual time=0.015..0.018 rows=1 loops=1) > Filter: (id = ANY ('{1,5}'::integer[])) > Rows Removed by Filter: 6 >-> Seq Scan on h3 (cost=0.00..41.88 rows=26 width=4) ( > actual time=0.005..0.007 rows=1 loops=1) > Filter: (id = ANY ('{1,5}'::integer[])) > Rows Removed by Filter: 3 > Planning time: 0.720 ms > Execution time: 0.074 ms > (9 rows) > > postgres=
Re: [HACKERS] [POC] hash partitioning
Hello Looking at your hash partitioning syntax, I implemented a hash partition in a more concise way, with no need to determine the number of sub-tables, and dynamically add partitions. Description The hash partition's implement is on the basis of the original range / list partition,and using similar syntax. To create a partitioned table ,use: CREATE TABLE h (id int) PARTITION BY HASH(id); The partitioning key supports only one value, and I think the partition key can support multiple values, which may be difficult to implement when querying, but it is not impossible. A partition table can be create as bellow: CREATE TABLE h1 PARTITION OF h; CREATE TABLE h2 PARTITION OF h; CREATE TABLE h3 PARTITION OF h; FOR VALUES clause cannot be used, and the partition bound is calclulated automatically as partition index of single integer value. An inserted record is stored in a partition whose index equals DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts/* Number of partitions */ ; In the above example, this is DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, id)) % 3; postgres=# insert into h select generate_series(1,20); INSERT 0 20 postgres=# select tableoid::regclass,* from h; tableoid | id --+ h1 | 3 h1 | 5 h1 | 17 h1 | 19 h2 | 2 h2 | 6 h2 | 7 h2 | 11 h2 | 12 h2 | 14 h2 | 15 h2 | 18 h2 | 20 h3 | 1 h3 | 4 h3 | 8 h3 | 9 h3 | 10 h3 | 13 h3 | 16 (20 rows) The number of partitions here can be dynamically added, and if a new partition is created, the number of partitions changes, the calculated target partitions will change, and the same data is not reasonable in different partitions,So you need to re-calculate the existing data and insert the target partition when you create a new partition. postgres=# create table h4 partition of h; CREATE TABLE postgres=# select tableoid::regclass,* from h; tableoid | id --+ h1 | 5 h1 | 17 h1 | 19 h1 | 6 h1 | 12 h1 | 8 h1 | 13 h2 | 11 h2 | 14 h3 | 1 h3 | 9 h3 | 2 h3 | 15 h4 | 3 h4 | 7 h4 | 18 h4 | 20 h4 | 4 h4 | 10 h4 | 16 (20 rows) When querying the data, the hash partition uses the same algorithm as the insertion, and filters out the table that does not need to be scanned. postgres=# explain analyze select * from h where id = 1; QUERY PLAN Append (cost=0.00..41.88 rows=13 width=4) (actual time=0.020..0.023 rows=1 loops=1) -> Seq Scan on h3 (cost=0.00..41.88 rows=13 width=4) (actual time=0.013..0.016 rows=1 loops=1) Filter: (id = 1) Rows Removed by Filter: 3 Planning time: 0.346 ms Execution time: 0.061 ms (6 rows) postgres=# explain analyze select * from h where id in (1,5);; QUERY PLAN Append (cost=0.00..83.75 rows=52 width=4) (actual time=0.016..0.028 rows=2 loops=1) -> Seq Scan on h1 (cost=0.00..41.88 rows=26 width=4) (actual time=0.015..0.018 rows=1 loops=1) Filter: (id = ANY ('{1,5}'::integer[])) Rows Removed by Filter: 6 -> Seq Scan on h3 (cost=0.00..41.88 rows=26 width=4) (actual time=0.005..0.007 rows=1 loops=1) Filter: (id = ANY ('{1,5}'::integer[])) Rows Removed by Filter: 3 Planning time: 0.720 ms Execution time: 0.074 ms (9 rows) postgres=# explain analyze select * from h where id = 1 or id = 5;; QUERY PLAN Append (cost=0.00..96.50 rows=50 width=4) (actual time=0.017..0.078 rows=2 loops=1) -> Seq Scan on h1 (cost=0.00..48.25 rows=25 width=4) (actual time=0.015..0.019 rows=1 loops=1) Filter: ((id = 1) OR (id = 5)) Rows Removed by Filter: 6 -> Seq Scan on h3 (cost=0.00..48.25 rows=25 width=4) (actual time=0.005..0.010 rows=1 loops=1) Filter: ((id = 1) OR (id = 5)) Rows Removed by Filter: 3 Planning time: 0.396 ms Execution time: 0.139 ms (9 rows) Can not detach / attach / drop partition table. Best regards, young yonj1e.github.io yang...@highgo.com
[HACKERS] hash partitioning based on v10Beta2
Hi all, Now we have had the range / list partition, but hash partitioning is not implemented yet. Attached is a POC patch based on the v10Beta2 to add the hash partitioning feature. Although we will need more discussions about the syntax and other specifications before going ahead the project, but I think this runnable code might help to discuss what and how we implement this. Description The hash partition's implement is on the basis of the original range / list partition,and using similar syntax. To create a partitioned table ,use: CREATE TABLE h (id int) PARTITION BY HASH(id); The partitioning key supports only one value, and I think the partition key can support multiple values, which may be difficult to implement when querying, but it is not impossible. A partition table can be create as bellow: CREATE TABLE h1 PARTITION OF h; CREATE TABLE h2 PARTITION OF h; CREATE TABLE h3 PARTITION OF h; FOR VALUES clause cannot be used, and the partition bound is calclulated automatically as partition index of single integer value. An inserted record is stored in a partition whose index equals DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts /* Number of partitions */ ; In the above example, this is DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, id)) % 3; postgres=# insert into h select generate_series(1,20); INSERT 0 20 postgres=# select tableoid::regclass,* from h; tableoid | id --+ h1 | 3 h1 | 5 h1 | 17 h1 | 19 h2 | 2 h2 | 6 h2 | 7 h2 | 11 h2 | 12 h2 | 14 h2 | 15 h2 | 18 h2 | 20 h3 | 1 h3 | 4 h3 | 8 h3 | 9 h3 | 10 h3 | 13 h3 | 16 (20 rows) The number of partitions here can be dynamically added, and if a new partition is created, the number of partitions changes, the calculated target partitions will change, and the same data is not reasonable in different partitions,So you need to re-calculate the existing data and insert the target partition when you create a new partition. postgres=# create table h4 partition of h; CREATE TABLE postgres=# select tableoid::regclass,* from h; tableoid | id --+ h1 | 5 h1 | 17 h1 | 19 h1 | 6 h1 | 12 h1 | 8 h1 | 13 h2 | 11 h2 | 14 h3 | 1 h3 | 9 h3 | 2 h3 | 15 h4 | 3 h4 | 7 h4 | 18 h4 | 20 h4 | 4 h4 | 10 h4 | 16 (20 rows) When querying the data, the hash partition uses the same algorithm as the insertion, and filters out the table that does not need to be scanned. postgres=# explain analyze select * from h where id = 1; QUERY PLAN Append (cost=0.00..41.88 rows=13 width=4) (actual time=0.020..0.023 rows=1 loops=1) -> Seq Scan on h3 (cost=0.00..41.88 rows=13 width=4) (actual time=0.013..0.016 rows=1 loops=1) Filter: (id = 1) Rows Removed by Filter: 3 Planning time: 0.346 ms Execution time: 0.061 ms (6 rows) postgres=# explain analyze select * from h where id in (1,5);; QUERY PLAN Append (cost=0.00..83.75 rows=52 width=4) (actual time=0.016..0.028 rows=2 loops=1) -> Seq Scan on h1 (cost=0.00..41.88 rows=26 width=4) (actual time=0.015..0.018 rows=1 loops=1) Filter: (id = ANY ('{1,5}'::integer[])) Rows Removed by Filter: 6 -> Seq Scan on h3 (cost=0.00..41.88 rows=26 width=4) (actual time=0.005..0.007 rows=1 loops=1) Filter: (id = ANY ('{1,5}'::integer[])) Rows Removed by Filter: 3 Planning time: 0.720 ms Execution time: 0.074 ms (9 rows) postgres=# explain analyze select * from h where id = 1 or id = 5;; QUERY PLAN Append (cost=0.00..96.50 rows=50 width=4) (actual time=0.017..0.078 rows=2 loops=1) -> Seq Scan on h1 (cost=0.00..48.25 rows=25 width=4) (actual time=0.015..0.019 rows=1 loops=1) Filter: ((id = 1) OR (id = 5)) Rows Removed by Filter: 6 -> Seq Scan on h3 (cost=0.00..48.25 rows=25 width=4) (actual time=0.005..0.010 rows=1 loops=1) Filter: ((id = 1) OR (id = 5)) Rows Removed by Filter: 3 Planning time: 0.396 ms Execution time: 0.139 ms (9 rows) Can not detach / attach / drop partition table.