Re: [HACKERS] Secondary index access optimizations
On 08.10.2018 00:16, David Rowley wrote: On 5 October 2018 at 04:45, Konstantin Knizhnik wrote: Will the following test be enough: -- check that columns for parent table are correctly mapped to child partition of their order doesn't match create table paren (a int, b text) partition by range(a); create table child_1 partition of paren for values from (0) to (10); create table child_2 (b text, a int); alter table paren attach partition child_2 for values from (10) to (20); insert into paren values (generate_series(0,19), generate_series(100,119)); explain (costs off) select * from paren where a between 0 and 9; explain (costs off) select * from paren where a between 10 and 20; explain (costs off) select * from paren where a >= 5; explain (costs off) select * from paren where a <= 15; select count(*) from paren where a >= 5; select count(*) from paren where a < 15; drop table paren cascade; I started looking at this to see if this test would cause a crash with the original code, but it does not. The closest I can get is: drop table parent; create table parent (a bytea, b int) partition by range(a); create table child_1 (b int, a bytea); alter table parent attach partition child_1 for values from ('a') to ('z'); explain (costs off) select * from parent where b = 1; But despite the varattnos of the bytea and int accidentally matching, there's no crash due to the way operator_predicate_proof() requires more than just the varno and varattno to match. It requires the Vars to be equal(), which includes vartype, and those are not the same. So the proof just fails. In short, probably this test is doing nothing in addition to what various other tests are doing. So given the test is unable to crash the unfixed code, then I think it's likely not a worthwhile test to add. I wrote: create table listp (a int, b int) partition by list(a); create table listp1 partition of listp for values in(1); create index listp_a_b_idx on listp (a,b); and a query: select * from listp where a = 1 order by b; if we remove the "a = 1" qual, then listp_a_b_idx can't be used. I had a look at what allows this query still to use the index and it's down to pathkey_is_redundant() returning true because there's still an equivalence class containing {a,1}. I don't quite see any reason why it would not be okay to rely on that working, but that only works for pathkeys. If you have a case like: set max_parallel_workers_per_gather=0; create table listp (a int, b int) partition by list(a); create table listp1 partition of listp for values in(1); insert into listp select 1,x from generate_Series(1,100) x; create index listp_a_b_idx on listp (a,b); vacuum analyze listp; explain analyze select * from listp where a = 1 and b = 1; the "a = 1" will be dropped and the index on (a,b) does not get used. Patched results in: postgres=# explain analyze select * from listp where a = 1 and b = 1; QUERY PLAN Append (cost=0.00..16925.01 rows=1 width=8) (actual time=0.019..169.231 rows=1 loops=1) -> Seq Scan on listp1 (cost=0.00..16925.00 rows=1 width=8) (actual time=0.017..169.228 rows=1 loops=1) Filter: (b = 1) Rows Removed by Filter: 99 Planning Time: 0.351 ms Execution Time: 169.257 ms (6 rows) Whereas unpatched gets: postgres=# explain analyze select * from listp where a = 1 and b = 1; QUERY PLAN -- Append (cost=0.42..4.45 rows=1 width=8) (actual time=0.657..0.660 rows=1 loops=1) -> Index Only Scan using listp1_a_b_idx on listp1 (cost=0.42..4.44 rows=1 width=8) (actual time=0.653..0.655 rows=1 loops=1) Index Cond: ((a = 1) AND (b = 1)) Heap Fetches: 0 Planning Time: 32.303 ms Execution Time: 0.826 ms (6 rows) so I was probably wrong about suggesting set_append_rel_size() as a good place to remove these quals. It should perhaps be done later, or maybe we can add some sort of marker to the qual to say it does not need to be enforced during execution. Probably the former would be best as we don't want to show these in EXPLAIN. Well, I made a different conclusion from this problem (inability use of compound index because of redundant qual elimination). Is it really good idea to define compound index with first key equal to partitioning key? Restriction on this key in any case will lead to partition pruning. We do no need index for it... In your case if we create index listp_b_idx: create index listp_b_idx on listp (b); then right plan we be generated: Append (cost=0.42..8.45 rows=1 width=8) (actual time=0.046..0.047 rows=1 loops=1) -> Index Scan using listp1_b_idx on listp1 (cost=0.42..8.44 rows=1 width=8) (actu
Re: [HACKERS] Secondary index access optimizations
On 5 October 2018 at 04:45, Konstantin Knizhnik wrote: > Will the following test be enough: > > -- check that columns for parent table are correctly mapped to child > partition of their order doesn't match > create table paren (a int, b text) partition by range(a); > create table child_1 partition of paren for values from (0) to (10); > create table child_2 (b text, a int); > alter table paren attach partition child_2 for values from (10) to (20); > insert into paren values (generate_series(0,19), generate_series(100,119)); > > explain (costs off) select * from paren where a between 0 and 9; > explain (costs off) select * from paren where a between 10 and 20; > explain (costs off) select * from paren where a >= 5; > explain (costs off) select * from paren where a <= 15; > > select count(*) from paren where a >= 5; > select count(*) from paren where a < 15; > > drop table paren cascade; I started looking at this to see if this test would cause a crash with the original code, but it does not. The closest I can get is: drop table parent; create table parent (a bytea, b int) partition by range(a); create table child_1 (b int, a bytea); alter table parent attach partition child_1 for values from ('a') to ('z'); explain (costs off) select * from parent where b = 1; But despite the varattnos of the bytea and int accidentally matching, there's no crash due to the way operator_predicate_proof() requires more than just the varno and varattno to match. It requires the Vars to be equal(), which includes vartype, and those are not the same. So the proof just fails. In short, probably this test is doing nothing in addition to what various other tests are doing. So given the test is unable to crash the unfixed code, then I think it's likely not a worthwhile test to add. I wrote: > create table listp (a int, b int) partition by list(a); > create table listp1 partition of listp for values in(1); > create index listp_a_b_idx on listp (a,b); > > and a query: > > select * from listp where a = 1 order by b; > > if we remove the "a = 1" qual, then listp_a_b_idx can't be used. I had a look at what allows this query still to use the index and it's down to pathkey_is_redundant() returning true because there's still an equivalence class containing {a,1}. I don't quite see any reason why it would not be okay to rely on that working, but that only works for pathkeys. If you have a case like: set max_parallel_workers_per_gather=0; create table listp (a int, b int) partition by list(a); create table listp1 partition of listp for values in(1); insert into listp select 1,x from generate_Series(1,100) x; create index listp_a_b_idx on listp (a,b); vacuum analyze listp; explain analyze select * from listp where a = 1 and b = 1; the "a = 1" will be dropped and the index on (a,b) does not get used. Patched results in: postgres=# explain analyze select * from listp where a = 1 and b = 1; QUERY PLAN Append (cost=0.00..16925.01 rows=1 width=8) (actual time=0.019..169.231 rows=1 loops=1) -> Seq Scan on listp1 (cost=0.00..16925.00 rows=1 width=8) (actual time=0.017..169.228 rows=1 loops=1) Filter: (b = 1) Rows Removed by Filter: 99 Planning Time: 0.351 ms Execution Time: 169.257 ms (6 rows) Whereas unpatched gets: postgres=# explain analyze select * from listp where a = 1 and b = 1; QUERY PLAN -- Append (cost=0.42..4.45 rows=1 width=8) (actual time=0.657..0.660 rows=1 loops=1) -> Index Only Scan using listp1_a_b_idx on listp1 (cost=0.42..4.44 rows=1 width=8) (actual time=0.653..0.655 rows=1 loops=1) Index Cond: ((a = 1) AND (b = 1)) Heap Fetches: 0 Planning Time: 32.303 ms Execution Time: 0.826 ms (6 rows) so I was probably wrong about suggesting set_append_rel_size() as a good place to remove these quals. It should perhaps be done later, or maybe we can add some sort of marker to the qual to say it does not need to be enforced during execution. Probably the former would be best as we don't want to show these in EXPLAIN. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Secondary index access optimizations
On 04.10.2018 12:19, David Rowley wrote: On 4 October 2018 at 22:11, Konstantin Knizhnik wrote: On 04.10.2018 06:19, David Rowley wrote: Please, can you also add a test which tests this code which has a partition with columns in a different order than it's parent. Having an INT and a TEXT column is best as if the translations are done incorrectly it's likely to result in a crash which will alert us to the issue. It would be good to also verify the test causes a crash if you temporarily put the code back to using the untranslated qual. Thanks for working on this. Thank you very much for detecting and fixing this problem. I have checked that all changes in plan caused by this fix are correct. Updated version of the patch is attached. Can you add the test that I mentioned above? Will the following test be enough: -- check that columns for parent table are correctly mapped to child partition of their order doesn't match create table paren (a int, b text) partition by range(a); create table child_1 partition of paren for values from (0) to (10); create table child_2 (b text, a int); alter table paren attach partition child_2 for values from (10) to (20); insert into paren values (generate_series(0,19), generate_series(100,119)); explain (costs off) select * from paren where a between 0 and 9; explain (costs off) select * from paren where a between 10 and 20; explain (costs off) select * from paren where a >= 5; explain (costs off) select * from paren where a <= 15; select count(*) from paren where a >= 5; select count(*) from paren where a < 15; drop table paren cascade; -- If so, then updated patch is attached. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 5f74d3b..b628ac7 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -37,6 +37,7 @@ #include "optimizer/paths.h" #include "optimizer/plancat.h" #include "optimizer/planner.h" +#include "optimizer/predtest.h" #include "optimizer/prep.h" #include "optimizer/restrictinfo.h" #include "optimizer/tlist.h" @@ -1052,6 +1053,27 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel, /* Restriction reduces to constant TRUE, so drop it */ continue; } + + /* + * For partitions, we may be able to eliminate some quals if + * they're implied by the partition bound. + */ + if (childrel->partition_qual != NIL) + { +Node *checkqual = copyObject(childqual); + +/* + * Since the partition_qual has all Vars stored as varno=1, we + * must convert all Vars of the childqual to have their varnos + * set to 1 so that predicate_implied_by can properly match + * implied quals. + */ +ChangeVarNodes(checkqual, childrel->relid, 1, 0); + +if (predicate_implied_by(list_make1(checkqual), childrel->partition_qual, false)) + continue; + } + /* might have gotten an AND clause, if so flatten it */ foreach(lc2, make_ands_implicit((Expr *) childqual)) { diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index 8369e3a..8cd9b06 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -450,7 +450,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent, */ if (inhparent && relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) set_relation_partition_info(root, rel, relation); - + else if (relation->rd_rel->relispartition) + rel->partition_qual = RelationGetPartitionQual(relation); heap_close(relation, NoLock); /* diff --git a/src/common/zpq_stream.c b/src/common/zpq_stream.c new file mode 100644 index 000..afd42e9 --- /dev/null +++ b/src/common/zpq_stream.c @@ -0,0 +1,386 @@ +#include "postgres_fe.h" +#include "common/zpq_stream.h" +#include "c.h" +#include "pg_config.h" + +#if HAVE_LIBZSTD + +#include +#include + +#define ZPQ_BUFFER_SIZE (8*1024) +#define ZSTD_COMPRESSION_LEVEL 1 + +struct ZpqStream +{ + ZSTD_CStream* tx_stream; + ZSTD_DStream* rx_stream; + ZSTD_outBuffer tx; + ZSTD_inBuffer rx; + size_t tx_not_flushed; /* Amount of datas in internal zstd buffer */ + size_t tx_buffered;/* Data which is consumed by zpq_read but not yet sent */ + zpq_tx_functx_func; + zpq_rx_funcrx_func; + void* arg; + char const*rx_error;/* Decompress error message */ + size_t tx_total; + size_t tx_total_raw; + size_t rx_total; + size_t rx_total_raw; + char tx_buf[ZPQ_BUFFER_SIZE]; + char rx_buf[ZPQ_BUFFER_SIZE]; +}; + +ZpqStream* +zpq_create(zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg) +{ + ZpqStream* zs = (ZpqStream*)malloc(sizeof(ZpqStream)); + zs->tx_stream = ZSTD_createCStream(); + ZSTD_initCStream(zs->tx_stream, ZSTD_COMPRESSION_LEVEL); + zs
Re: [HACKERS] Secondary index access optimizations
On 4 October 2018 at 22:11, Konstantin Knizhnik wrote: > On 04.10.2018 06:19, David Rowley wrote: >> Please, can you also add a test which tests this code which has a >> partition with columns in a different order than it's parent. Having >> an INT and a TEXT column is best as if the translations are done >> incorrectly it's likely to result in a crash which will alert us to >> the issue. It would be good to also verify the test causes a crash if >> you temporarily put the code back to using the untranslated qual. >> >> Thanks for working on this. >> > Thank you very much for detecting and fixing this problem. > I have checked that all changes in plan caused by this fix are correct. > Updated version of the patch is attached. Can you add the test that I mentioned above? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Secondary index access optimizations
On 04.10.2018 06:19, David Rowley wrote: On 12 September 2018 at 08:32, Konstantin Knizhnik wrote: Also the patch proposed by you is much simple and does mostly the same. Yes, it is not covering CHECK constraints, I started to look at this and found a problem in regards to varno during the predicate_implied_by() test. The problem is that the partition bound is always stored as varno=1 (For example, see how get_qual_for_list() calls makeVar()). This causes the patch to fail in cases where the partitioned table is not varno=1. You're also incorrectly using rinfo->clause to pass to predicate_implied_by(). This is a problem because stored here have not been translated to have the child varattnos. childqual is the correct thing to use as that's just been translated. You may have not used it as the varnos will have been converted to the child's varno, which will never be varno=1, so you might have found that not to work due to the missing code to change the varnos to 1. I've attached the diff for allpaths.c (only) that I ended up with to make it work. This causes the output of many other regression test to change, so you'll need to go over these and verify everything is correct again. Please, can you also add a test which tests this code which has a partition with columns in a different order than it's parent. Having an INT and a TEXT column is best as if the translations are done incorrectly it's likely to result in a crash which will alert us to the issue. It would be good to also verify the test causes a crash if you temporarily put the code back to using the untranslated qual. Thanks for working on this. Thank you very much for detecting and fixing this problem. I have checked that all changes in plan caused by this fix are correct. Updated version of the patch is attached. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 5f74d3b..b628ac7 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -37,6 +37,7 @@ #include "optimizer/paths.h" #include "optimizer/plancat.h" #include "optimizer/planner.h" +#include "optimizer/predtest.h" #include "optimizer/prep.h" #include "optimizer/restrictinfo.h" #include "optimizer/tlist.h" @@ -1052,6 +1053,27 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel, /* Restriction reduces to constant TRUE, so drop it */ continue; } + + /* + * For partitions, we may be able to eliminate some quals if + * they're implied by the partition bound. + */ + if (childrel->partition_qual != NIL) + { +Node *checkqual = copyObject(childqual); + +/* + * Since the partition_qual has all Vars stored as varno=1, we + * must convert all Vars of the childqual to have their varnos + * set to 1 so that predicate_implied_by can properly match + * implied quals. + */ +ChangeVarNodes(checkqual, childrel->relid, 1, 0); + +if (predicate_implied_by(list_make1(checkqual), childrel->partition_qual, false)) + continue; + } + /* might have gotten an AND clause, if so flatten it */ foreach(lc2, make_ands_implicit((Expr *) childqual)) { diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index 8369e3a..8cd9b06 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -450,7 +450,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent, */ if (inhparent && relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) set_relation_partition_info(root, rel, relation); - + else if (relation->rd_rel->relispartition) + rel->partition_qual = RelationGetPartitionQual(relation); heap_close(relation, NoLock); /* diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out index 4f29d9f..67d7a41 100644 --- a/src/test/regress/expected/inherit.out +++ b/src/test/regress/expected/inherit.out @@ -1772,30 +1772,26 @@ explain (costs off) select * from list_parted where a is not null; - Append -> Seq Scan on part_ab_cd - Filter: (a IS NOT NULL) -> Seq Scan on part_ef_gh - Filter: (a IS NOT NULL) -> Seq Scan on part_null_xy Filter: (a IS NOT NULL) -(7 rows) +(5 rows) explain (costs off) select * from list_parted where a in ('ab', 'cd', 'ef'); QUERY PLAN -- Append -> Seq Scan on part_ab_cd - Filter: ((a)::text = ANY ('{ab,cd,ef}'::text[])) -> Seq Scan on part_ef_gh Filter: ((a)::text = ANY ('{ab,cd,ef}'::text[])) -(5 rows) +(4 rows) explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd'); - QUERY PLAN
Re: [HACKERS] Secondary index access optimizations
On 12 September 2018 at 08:32, Konstantin Knizhnik wrote: > Also the patch proposed by you is much simple and does mostly the same. Yes, > it is not covering CHECK constraints, I started to look at this and found a problem in regards to varno during the predicate_implied_by() test. The problem is that the partition bound is always stored as varno=1 (For example, see how get_qual_for_list() calls makeVar()). This causes the patch to fail in cases where the partitioned table is not varno=1. You're also incorrectly using rinfo->clause to pass to predicate_implied_by(). This is a problem because stored here have not been translated to have the child varattnos. childqual is the correct thing to use as that's just been translated. You may have not used it as the varnos will have been converted to the child's varno, which will never be varno=1, so you might have found that not to work due to the missing code to change the varnos to 1. I've attached the diff for allpaths.c (only) that I ended up with to make it work. This causes the output of many other regression test to change, so you'll need to go over these and verify everything is correct again. Please, can you also add a test which tests this code which has a partition with columns in a different order than it's parent. Having an INT and a TEXT column is best as if the translations are done incorrectly it's likely to result in a crash which will alert us to the issue. It would be good to also verify the test causes a crash if you temporarily put the code back to using the untranslated qual. Thanks for working on this. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services skip_implied_child_quals_allpaths.diff Description: Binary data
Re: [HACKERS] Secondary index access optimizations
On Wed, Sep 12, 2018 at 11:43:09AM +0300, Konstantin Knizhnik wrote: > Looks like this qual is considered for choosing optimal path before it is > removed from list of quals in set_append_rel_size. Hm... The latest reviews have not been addressed yet, so I have marked this as returned with feedback. -- Michael signature.asc Description: PGP signature
Re: [HACKERS] Secondary index access optimizations
On 12.09.2018 08:14, David Rowley wrote: On 12 September 2018 at 08:32, Konstantin Knizhnik wrote: Also the patch proposed by you is much simple and does mostly the same. Yes, it is not covering CHECK constraints, but as far as partitioning becomes now standard in Postgres, I do not think that much people will use old inheritance mechanism and CHECK constraints. In any case, there are now many optimizations which works only for partitions, but not for inherited tables. I've not had time to look at your updated patch yet, but one thing I thought about after my initial review, imagine you have a setup like: create table listp (a int, b int) partition by list(a); create table listp1 partition of listp for values in(1); create index listp_a_b_idx on listp (a,b); and a query: select * from listp where a = 1 order by b; if we remove the "a = 1" qual, then listp_a_b_idx can't be used. Looks like this qual is considered for choosing optimal path before it is removed from list of quals in set_append_rel_size. At least the presence of this patch is not breaking the plan in this case: create table listp (a int, b int) partition by list(a); create table listp1 partition of listp for values in(1); create table listp2 partition of listp for values in(2); create index listp_a_b_idx on listp (a,b); insert into listp values (1,generate_series(1,10)); insert into listp values (2,generate_series(11,20)); explain select * from listp where a = 1 order by b; QUERY PLAN Merge Append (cost=0.30..4630.43 rows=10 width=8) Sort Key: listp1.b -> Index Only Scan using listp1_a_b_idx on listp1 (cost=0.29..3630.42 rows=10 width=8) (3 rows) I didn't test this in your patch, but I guess since the additional quals are not applied to the children in set_append_rel_size() that by the time set_append_rel_pathlist() is called, then when we go generating the paths, the (a,b) index won't be any good. Perhaps there's some workaround like inventing some sort of "no-op" qual that exists in planning but never makes it way down to scans. Although I admit to not having fully thought that idea through. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Secondary index access optimizations
On 12 September 2018 at 08:32, Konstantin Knizhnik wrote: > Also the patch proposed by you is much simple and does mostly the same. Yes, > it is not covering CHECK constraints, > but as far as partitioning becomes now standard in Postgres, I do not think > that much people will use old inheritance mechanism and CHECK constraints. > In any case, there are now many optimizations which works only for > partitions, but not for inherited tables. I've not had time to look at your updated patch yet, but one thing I thought about after my initial review, imagine you have a setup like: create table listp (a int, b int) partition by list(a); create table listp1 partition of listp for values in(1); create index listp_a_b_idx on listp (a,b); and a query: select * from listp where a = 1 order by b; if we remove the "a = 1" qual, then listp_a_b_idx can't be used. I didn't test this in your patch, but I guess since the additional quals are not applied to the children in set_append_rel_size() that by the time set_append_rel_pathlist() is called, then when we go generating the paths, the (a,b) index won't be any good. Perhaps there's some workaround like inventing some sort of "no-op" qual that exists in planning but never makes it way down to scans. Although I admit to not having fully thought that idea through. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Secondary index access optimizations
Hi David, On 11.09.2018 06:49, David Rowley wrote: On 9 July 2018 at 13:26, David Rowley wrote: I started looking over this patch and have a few comments: Hi Konstantin, Wondering, are you going to be submitting an updated patch for this commitfest? If not then I think we can mark this as returned with feedback as it's been waiting on author for quite a while now. First of all thank your for review. I am very sorry for delay with answer: I was in vacation in July and just forgot about this mail. I have to agree with you that it is better to split this patch into two and that using range type for open and close intervals match is so good idea. Also the patch proposed by you is much simple and does mostly the same. Yes, it is not covering CHECK constraints, but as far as partitioning becomes now standard in Postgres, I do not think that much people will use old inheritance mechanism and CHECK constraints. In any case, there are now many optimizations which works only for partitions, but not for inherited tables. I attach to this mail your patch combined with corrected tests outputs. Unfortunately the performance gain is not so much as I expected (even without additional predicate_implied_by() smarts). On the following database: create table bt (k integer, v integer) partition by range (k); create table dt1 partition of bt for values from (1) to (10001); create table dt2 partition of bt for values from (10001) to (20001); create index dti1 on dt1(v); create index dti2 on dt2(v); insert into bt values (generate_series(1,2), generate_series(1,2)); analyze bt; and pgbench script: \set x random(1, 1) select * from bt where k between 1 and 20001 and v=:x; I got ~170k TPS with this patch and about ~160k TPS without it. My hypothesis was that we have to perform redundant predicate only once (for one selected record) and it adds no so much overhead comparing with index search cost. So I tried another version of the query which selects large number of records: select sum(*) from bt where k between 1 and 20001 and v between :x and :x + 1000; Now patch version shows 23k TPS vs. 19k for Vanilla. diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 5db1688..48359f4 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -37,6 +37,7 @@ #include "optimizer/paths.h" #include "optimizer/plancat.h" #include "optimizer/planner.h" +#include "optimizer/predtest.h" #include "optimizer/prep.h" #include "optimizer/restrictinfo.h" #include "optimizer/tlist.h" @@ -1039,6 +1040,12 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel, /* Restriction reduces to constant TRUE, so drop it */ continue; } + + /* We can skip any quals that are implied by any partition bound. */ + if (childrel->partition_qual != NIL && +predicate_implied_by(list_make1(rinfo->clause), childrel->partition_qual, false)) +continue; + /* might have gotten an AND clause, if so flatten it */ foreach(lc2, make_ands_implicit((Expr *) childqual)) { diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index 8369e3a..8cd9b06 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -450,7 +450,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent, */ if (inhparent && relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) set_relation_partition_info(root, rel, relation); - + else if (relation->rd_rel->relispartition) + rel->partition_qual = RelationGetPartitionQual(relation); heap_close(relation, NoLock); /* diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out index 4f29d9f..91219fa 100644 --- a/src/test/regress/expected/inherit.out +++ b/src/test/regress/expected/inherit.out @@ -1772,30 +1772,26 @@ explain (costs off) select * from list_parted where a is not null; - Append -> Seq Scan on part_ab_cd - Filter: (a IS NOT NULL) -> Seq Scan on part_ef_gh - Filter: (a IS NOT NULL) -> Seq Scan on part_null_xy Filter: (a IS NOT NULL) -(7 rows) +(5 rows) explain (costs off) select * from list_parted where a in ('ab', 'cd', 'ef'); QUERY PLAN -- Append -> Seq Scan on part_ab_cd - Filter: ((a)::text = ANY ('{ab,cd,ef}'::text[])) -> Seq Scan on part_ef_gh Filter: ((a)::text = ANY ('{ab,cd,ef}'::text[])) -(5 rows) +(4 rows) explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd'); - QUERY PLAN + QUERY PLAN +--
Re: [HACKERS] Secondary index access optimizations
On 9 July 2018 at 13:26, David Rowley wrote: > I started looking over this patch and have a few comments: Hi Konstantin, Wondering, are you going to be submitting an updated patch for this commitfest? If not then I think we can mark this as returned with feedback as it's been waiting on author for quite a while now. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Secondary index access optimizations
On 22 March 2018 at 22:38, Konstantin Knizhnik wrote: > Attached please find rebased version of the patch. Hi, I started looking over this patch and have a few comments: I don't think this range type idea is a great one. I don't think it's going to ever perform very well. I also see you're not checking the collation of the type anywhere. As of now, no range types have collation support, but you can't really go and code this with that assumption. I also don't really like the sequence scan over pg_range. Probably a better way to do this would be to add a new bt proc, like what was done in 0a459cec for 2 new functions. Something like BTISNEXTVAL_PROC and BTISPREVVAL_PROC. You'd then need to define functions for all the types based on integers, making functions which take 2 parameters of the type, and an additional collation param. The functions should return bool. int4_isnextval(2, 3, InvalidOid) would return true. You'd need to return false on wraparound. I also think that the patch is worth doing without the additional predicate_implied_by() smarts. In fact, I think strongly that the two should be considered as two independent patches. Partial indexes suffer from the same issue you're trying to address here and that would be resolved by any patch which makes changes to predicate_implied_by(). Probably the best place to put the code to skip the redundant quals is inside set_append_rel_size(). There's already code there that skips quals that are seen as const TRUE. This applies for UNION ALL targets with expressions that can be folded to constants once the qual is passed through adjust_appendrel_attrs() For example: explain select * from (select 1 as a from pg_class union all select 2 from pg_class) t where a = 1; I've attached a patch to show what I had in mind. I had to change how partition_qual is populated, which I was surprised to see only gets populated for sub-partitioned tables (the top-level parent won't have a qual since it's not partitioned) I didn't update the partition pruning code that assumes this is only populated for sub-partitioned table. That will need to be done. The patch comes complete with all the failing regression tests where the redundant quals have been removed by the new code. If you want to make this work for CHECK constraints too, then I think the code for that can be added to the same location as the code I added in the attached patch. You'd just need to fetch the check constraint quals and just add some extra code to check if the qual is redundant. Some new performance benchmarks would then be useful in order to find out how much overhead there is. We might learn that we don't want to enable it by default if it's too expensive. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services skip_redundant_partition_quals_poc.patch Description: Binary data
Re: [HACKERS] Secondary index access optimizations
On 21.03.2018 20:30, Konstantin Knizhnik wrote: On 01.03.2018 23:15, Andres Freund wrote: Hi, This patch seems like quite a good improvement. One thing I've not really looked at but am slightly concerned in passing: Are there cases where we now would do a lot of predicate pruning work even though the overhead of just evaluating the qual is trivial, e.g. because there's only one row due to a pkey? The predtest code is many things but lightning fast is not one of them. Obviously that won't matter for analytics queries, but I could see it hurt in OLTPish workloads... Greetings, Andres Freund Hi, I am sorry for delay with answer. I understand your concern and did the following experiment. I have initialized the database in the following way: create table base (k integer primary key, v integer); create table part1 (check (k between 1 and 1)) inherits (base); create table part2 (check (k between 10001 and 2)) inherits (base); create index pi1 on part1(v); create index pi2 on part2(v); insert into part1 values (generate series(1,1), random()*10); insert into part2 values (generate_series(10001,2), random()*10); vacuum analyze part1; vacuum analyze part2; Vanilla Postgres uses the following plan: explain select * from base where k between 1 and 2 and v = 100; QUERY PLAN --- Append (cost=0.00..16.63 rows=3 width=8) -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8) Filter: ((k >= 1) AND (k <= 2) AND (v = 100)) -> Index Scan using pi1 on part1 (cost=0.29..8.31 rows=1 width=8) Index Cond: (v = 100) Filter: ((k >= 1) AND (k <= 2)) -> Index Scan using pi2 on part2 (cost=0.29..8.31 rows=1 width=8) Index Cond: (v = 100) Filter: ((k >= 1) AND (k <= 2)) (9 rows) Execution of this query 10 times gives is done in 12 seconds. With applied patch query plan is changed to: QUERY PLAN --- Append (cost=0.00..16.62 rows=3 width=8) -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8) Filter: ((k >= 1) AND (k <= 2) AND (v = 100)) -> Index Scan using pi1 on part1 (cost=0.29..8.30 rows=1 width=8) Index Cond: (v = 100) -> Index Scan using pi2 on part2 (cost=0.29..8.30 rows=1 width=8) Index Cond: (v = 100) (7 rows) Elapsed time of 10 query executions is 13 seconds. So you was right that increased query optimization time exceeds advantage of extra checks elimination. But it is true only if we are not using prepare statements. With prepared statements results are the following: Vanilla: 0m3.915s This patch: 0m3.563s So improvement is not so large, but it exists. If somebody wants to repeat my experiments, I attached to this mail small shell script which I used to run query several times. Non-prepared query is launched using the following command: time ./repeat-query.sh "select * from base where k between 1 and 2 and v = 100" 10 and prepared query: time ./repeat-query.sh "execute select100" 10 "prepare select100 as select * from base where k between 1 and 2 and v = 100" And once again I want to notice that using prepared statements can increase performance almost 3 times! As far as using prepared statements is not always possible I want to recall autoprepare patch waiting at the commitfest: https://www.postgresql.org/message-id/flat/8eed9c23-19ba-5404-7a9e-0584b836b3f3%40postgrespro.ru#8eed9c23-19ba-5404-7a9e-0584b836b...@postgrespro.ru -- Konstantin Knizhnik Postgres Professional:http://www.postgrespro.com The Russian Postgres Company Attached please find rebased version of the patch. Also I do more testing, now using pgbench. Scripts for initialization of the database and for custom script for pgbench are attached. Results at my computer are the following: pgbench options Vanilla Patch -c 1 9208 8289 -c 1 -M prepared 38503 41206 -c 10 39224 34040 -c 10 -M prepared 165465 172874 -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index a2b1384..84b8543 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -631,12 +631,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL)) (3 rows) -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest - QUERY PLAN ---
Re: [HACKERS] Secondary index access optimizations
On 01.03.2018 23:15, Andres Freund wrote: Hi, This patch seems like quite a good improvement. One thing I've not really looked at but am slightly concerned in passing: Are there cases where we now would do a lot of predicate pruning work even though the overhead of just evaluating the qual is trivial, e.g. because there's only one row due to a pkey? The predtest code is many things but lightning fast is not one of them. Obviously that won't matter for analytics queries, but I could see it hurt in OLTPish workloads... Greetings, Andres Freund Hi, I am sorry for delay with answer. I understand your concern and did the following experiment. I have initialized the database in the following way: create table base (k integer primary key, v integer); create table part1 (check (k between 1 and 1)) inherits (base); create table part2 (check (k between 10001 and 2)) inherits (base); create index pi1 on part1(v); create index pi2 on part2(v); insert into part1 values (generate series(1,1), random()*10); insert into part2 values (generate_series(10001,2), random()*10); vacuum analyze part1; vacuum analyze part2; Vanilla Postgres uses the following plan: explain select * from base where k between 1 and 2 and v = 100; QUERY PLAN --- Append (cost=0.00..16.63 rows=3 width=8) -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8) Filter: ((k >= 1) AND (k <= 2) AND (v = 100)) -> Index Scan using pi1 on part1 (cost=0.29..8.31 rows=1 width=8) Index Cond: (v = 100) Filter: ((k >= 1) AND (k <= 2)) -> Index Scan using pi2 on part2 (cost=0.29..8.31 rows=1 width=8) Index Cond: (v = 100) Filter: ((k >= 1) AND (k <= 2)) (9 rows) Execution of this query 10 times gives is done in 12 seconds. With applied patch query plan is changed to: QUERY PLAN --- Append (cost=0.00..16.62 rows=3 width=8) -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8) Filter: ((k >= 1) AND (k <= 2) AND (v = 100)) -> Index Scan using pi1 on part1 (cost=0.29..8.30 rows=1 width=8) Index Cond: (v = 100) -> Index Scan using pi2 on part2 (cost=0.29..8.30 rows=1 width=8) Index Cond: (v = 100) (7 rows) Elapsed time of 10 query executions is 13 seconds. So you was right that increased query optimization time exceeds advantage of extra checks elimination. But it is true only if we are not using prepare statements. With prepared statements results are the following: Vanilla: 0m3.915s This patch: 0m3.563s So improvement is not so large, but it exists. If somebody wants to repeat my experiments, I attached to this mail small shell script which I used to run query several times. Non-prepared query is launched using the following command: time ./repeat-query.sh "select * from base where k between 1 and 2 and v = 100" 10 and prepared query: time ./repeat-query.sh "execute select100" 10 "prepare select100 as select * from base where k between 1 and 2 and v = 100" And once again I want to notice that using prepared statements can increase performance almost 3 times! As far as using prepared statements is not always possible I want to recall autoprepare patch waiting at the commitfest: https://www.postgresql.org/message-id/flat/8eed9c23-19ba-5404-7a9e-0584b836b3f3%40postgrespro.ru#8eed9c23-19ba-5404-7a9e-0584b836b...@postgrespro.ru -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company repeat-query.sh Description: application/shellscript
Re: [HACKERS] Secondary index access optimizations
Hi, This patch seems like quite a good improvement. One thing I've not really looked at but am slightly concerned in passing: Are there cases where we now would do a lot of predicate pruning work even though the overhead of just evaluating the qual is trivial, e.g. because there's only one row due to a pkey? The predtest code is many things but lightning fast is not one of them. Obviously that won't matter for analytics queries, but I could see it hurt in OLTPish workloads... Greetings, Andres Freund
Re: [HACKERS] Secondary index access optimizations
On 29.01.2018 16:24, Konstantin Knizhnik wrote: On 29.01.2018 07:34, Thomas Munro wrote: On Sat, Jan 20, 2018 at 5:41 AM, Konstantin Knizhnik wrote: On 19.01.2018 16:14, Antonin Houska wrote: you should test the operator B-tree strategy: BTLessStrategyNumber, BTLessEqualStrategyNumber, etc. The operator string alone does not tell enough about the operator semantics. The strategy can be found in the pg_amop catalog. get_op_btree_interpretation() function may be useful, but there may be something better in utils/cache/lsyscache.c. Thank you very much. Shame on me that I didn't notice such solution myself - such checking of B-tree strategy is done in the same predtest.c file! Now checking of predicate clauses compatibility is done in much more elegant and general way. Attached please find new version of the patch. I wonder if you should create a new index and SysCache entry for looking up range types by subtype. I'll be interested to see what others have to say about this range type-based technique -- it seems clever to me, but I'm not familiar enough with this stuff to say if there is some other approach you should be using instead. I think that it is good idea to add caching for range type lookup. If community think that it will be useful I can try to add such mechanism. But it seems to be not so trivial, especially properly handle invalidations. I have added caching of element_type->range_type mapping to typcache.c. But it is not invalidated now if pg_range relation is changed. Also if there are more than one range types for the specified element type then first of them is used. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index bce3348..6a7e7fb 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL)) (3 rows) -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest - QUERY PLAN -- +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest +QUERY PLAN +-- Foreign Scan on public.ft1 t1 Output: c1, c2, c3, c4, c5, c6, c7, c8 - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL)) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL)) (3 rows) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 1df1e3a..c421530 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -292,7 +292,7 @@ RESET enable_nestloop; EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1; -- OpExpr(l) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!; -- OpExpr(r) diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 44f6b03..c7bf118 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -345,6 +345,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel, switch (rel->rtekind) { case RTE_RELATION: +remove_restrictions_implied_by_constraints(root, rel, rte); if (rte->relkind == RELKIND_FOREIGN_TABLE) { /* Foreign table */ @@ -1130,6 +1131,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel, set_dummy_rel_pathlist(childrel); continue; } + remove_restrictions_implied_by_constraints(root, childrel, childRTE); /* CE failed, so finish copying/modifying join quals. */ childrel->joininfo = (List *) diff --git a/src/backe
Re: [HACKERS] Secondary index access optimizations
On 29.01.2018 07:34, Thomas Munro wrote: On Sat, Jan 20, 2018 at 5:41 AM, Konstantin Knizhnik wrote: On 19.01.2018 16:14, Antonin Houska wrote: you should test the operator B-tree strategy: BTLessStrategyNumber, BTLessEqualStrategyNumber, etc. The operator string alone does not tell enough about the operator semantics. The strategy can be found in the pg_amop catalog. get_op_btree_interpretation() function may be useful, but there may be something better in utils/cache/lsyscache.c. Thank you very much. Shame on me that I didn't notice such solution myself - such checking of B-tree strategy is done in the same predtest.c file! Now checking of predicate clauses compatibility is done in much more elegant and general way. Attached please find new version of the patch. I wonder if you should create a new index and SysCache entry for looking up range types by subtype. I'll be interested to see what others have to say about this range type-based technique -- it seems clever to me, but I'm not familiar enough with this stuff to say if there is some other approach you should be using instead. I think that it is good idea to add caching for range type lookup. If community think that it will be useful I can try to add such mechanism. But it seems to be not so trivial, especially properly handle invalidations. Some superficial project style comments (maybe these could be fixed automatically with pgindent?): +static TypeCacheEntry* lookup_range_type(Oid type) +static RangeType* operator_to_range(TypeCacheEntry *typcache, Oid oper, Const* literal) ... these should be like this: static RangeType * operator_to_range(... I think the idea is that you can always search for a function definition with by looking for "name(" at the start of a line, so we put a newline there. Then there is the whitespace before "*", not after it. + if (pred_range && clause_range && range_eq_internal(typcache, pred_range, clause_range)) + { + return true; + } Unnecessary braces. +/* + * Get range type for the corresprent scalar type. + * Returns NULl if such range type is not found. + * This function performs sequential scan in pg_range table, + * but since number of range rtype is not expected to be large (6 builtin range types), + * it should not be a problem. + */ Typos. Thank you. I fixed this issues. New patch is attached. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index bce3348..6a7e7fb 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL)) (3 rows) -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest - QUERY PLAN -- +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest +QUERY PLAN +-- Foreign Scan on public.ft1 t1 Output: c1, c2, c3, c4, c5, c6, c7, c8 - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL)) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL)) (3 rows) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 1df1e3a..c421530 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -292,7 +292,7 @@ RESET enable_nestloop; EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1; -- OpExpr(l) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!; -- OpExpr(r) diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allp
Re: [HACKERS] Secondary index access optimizations
On Sat, Jan 20, 2018 at 5:41 AM, Konstantin Knizhnik wrote: > On 19.01.2018 16:14, Antonin Houska wrote: >> you should test the operator B-tree strategy: BTLessStrategyNumber, >> BTLessEqualStrategyNumber, etc. The operator string alone does not tell >> enough >> about the operator semantics. >> >> The strategy can be found in the pg_amop catalog. >> get_op_btree_interpretation() function may be useful, but there may be >> something better in utils/cache/lsyscache.c. >> > Thank you very much. > Shame on me that I didn't notice such solution myself - such checking of > B-tree strategy is done in the same predtest.c file! > Now checking of predicate clauses compatibility is done in much more elegant > and general way. > Attached please find new version of the patch. I wonder if you should create a new index and SysCache entry for looking up range types by subtype. I'll be interested to see what others have to say about this range type-based technique -- it seems clever to me, but I'm not familiar enough with this stuff to say if there is some other approach you should be using instead. Some superficial project style comments (maybe these could be fixed automatically with pgindent?): +static TypeCacheEntry* lookup_range_type(Oid type) +static RangeType* operator_to_range(TypeCacheEntry *typcache, Oid oper, Const* literal) ... these should be like this: static RangeType * operator_to_range(... I think the idea is that you can always search for a function definition with by looking for "name(" at the start of a line, so we put a newline there. Then there is the whitespace before "*", not after it. + if (pred_range && clause_range && range_eq_internal(typcache, pred_range, clause_range)) + { + return true; + } Unnecessary braces. +/* + * Get range type for the corresprent scalar type. + * Returns NULl if such range type is not found. + * This function performs sequential scan in pg_range table, + * but since number of range rtype is not expected to be large (6 builtin range types), + * it should not be a problem. + */ Typos. -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] Secondary index access optimizations
On 19.01.2018 16:14, Antonin Houska wrote: Konstantin Knizhnik wrote: On 11.01.2018 12:34, Antonin Houska wrote: Konstantin Knizhnik wrote: I haven't thought that much about details, so just one comment: you shouldn't need the conversion to text and back to binary form. utils/adt/rangetypes.c contains constructors that accept the binary values. Attached please find new version of the patch which uses range type to interval matching in operator_predicate_proof function. I think that instead of checking the operator name, e.g. strcmp(oprname, "<") you should test the operator B-tree strategy: BTLessStrategyNumber, BTLessEqualStrategyNumber, etc. The operator string alone does not tell enough about the operator semantics. The strategy can be found in the pg_amop catalog. get_op_btree_interpretation() function may be useful, but there may be something better in utils/cache/lsyscache.c. Thank you very much. Shame on me that I didn't notice such solution myself - such checking of B-tree strategy is done in the same predtest.c file! Now checking of predicate clauses compatibility is done in much more elegant and general way. Attached please find new version of the patch. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index bce3348..6a7e7fb 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL)) (3 rows) -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest - QUERY PLAN -- +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest +QUERY PLAN +-- Foreign Scan on public.ft1 t1 Output: c1, c2, c3, c4, c5, c6, c7, c8 - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL)) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL)) (3 rows) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 1df1e3a..c421530 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -292,7 +292,7 @@ RESET enable_nestloop; EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1; -- OpExpr(l) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!; -- OpExpr(r) diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 44f6b03..c7bf118 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -345,6 +345,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel, switch (rel->rtekind) { case RTE_RELATION: +remove_restrictions_implied_by_constraints(root, rel, rte); if (rte->relkind == RELKIND_FOREIGN_TABLE) { /* Foreign table */ @@ -1130,6 +1131,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel, set_dummy_rel_pathlist(childrel); continue; } + remove_restrictions_implied_by_constraints(root, childrel, childRTE); /* CE failed, so finish copying/modifying join quals. */ childrel->joininfo = (List *) diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index f743871..f763a97 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -1466,6 +1466,51 @@ relation_excluded_by_constraints(PlannerInfo *root, return false; } +/* + * Remove from restrictions list items implied by table constraints + */ +void remove_restrictions_implied_by_constraints(PlannerInfo *root,
Re: [HACKERS] Secondary index access optimizations
Konstantin Knizhnik wrote: > On 11.01.2018 12:34, Antonin Houska wrote: > > Konstantin Knizhnik wrote: > > I haven't thought that much about details, so just one comment: you > > shouldn't > > need the conversion to text and back to binary form. utils/adt/rangetypes.c > > contains constructors that accept the binary values. > > > Attached please find new version of the patch which uses range type to > interval matching in operator_predicate_proof function. I think that instead of checking the operator name, e.g. strcmp(oprname, "<") you should test the operator B-tree strategy: BTLessStrategyNumber, BTLessEqualStrategyNumber, etc. The operator string alone does not tell enough about the operator semantics. The strategy can be found in the pg_amop catalog. get_op_btree_interpretation() function may be useful, but there may be something better in utils/cache/lsyscache.c. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26, A-2700 Wiener Neustadt Web: https://www.cybertec-postgresql.com
Re: [HACKERS] Secondary index access optimizations
On 11.01.2018 12:34, Antonin Houska wrote: Konstantin Knizhnik wrote: On 09.01.2018 19:48, Antonin Houska wrote: Have you considered using the range types (internally in operator_predicate_proof()) instead of hard-coding operator OIDs? The range types do have the knowledge that k < 20001 and k <= 2 are equivalent for integer type: postgres=# SELECT int4range '(, 20001)' = int4range '(, 2]'; ?column? -- t (1 row) It is bright idea, but it is not quit clear to me how to implement it. There are several builtin ranges types in Postgres: int4range, int8range, numrange, tsrange, tstzrange, daterange. Among them int4range, int8range and daterange are discrete types having canonical function, for which this transformation rules are applicable. Now I perform checks for all this types. So the challenge is to support user defined range types with canonical function. As input operator_predicate_proof function has Oid of comparison operator and Const * expression representing literal value. So I see the following generic way of checking equivalence of ranges: 1. Get name of operator. If it is '<=' or '>=' then it is closed interval, if it is '<' or '>' then it is open interval. 2. Convert Const to text (using type's out function) and construct interval: '(,"$value"]' for '<=', '["$value",)' for '>=', '(,"$value")' for '<' and '("$value",)' for '>'. 3. Find range type from type of the constant: select * from pg_range where rngsubtype=?; 4. Try to cast constructed above string to this range type (using type's in function). 5. Compare two produced ranges and if them are equal, then operator_predicate_proof should return true. I haven't thought that much about details, so just one comment: you shouldn't need the conversion to text and back to binary form. utils/adt/rangetypes.c contains constructors that accept the binary values. Attached please find new version of the patch which uses range type to interval matching in operator_predicate_proof function. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index bce3348..6a7e7fb 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL)) (3 rows) -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest - QUERY PLAN -- +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest +QUERY PLAN +-- Foreign Scan on public.ft1 t1 Output: c1, c2, c3, c4, c5, c6, c7, c8 - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL)) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL)) (3 rows) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 1df1e3a..c421530 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -292,7 +292,7 @@ RESET enable_nestloop; EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1; -- OpExpr(l) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!; -- OpExpr(r) diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 44f6b03..c7bf118 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -345,6 +345,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel, switch (rel->rtekind) { case RTE_RELATION: +remove_restrictions_implied_by_constraints(root, rel, rte); if (rte->relkind ==
Re: [HACKERS] Secondary index access optimizations
Konstantin Knizhnik wrote: > On 09.01.2018 19:48, Antonin Houska wrote: > >> Have you considered using the range types (internally in >> operator_predicate_proof()) instead of hard-coding operator OIDs? The range >> types do have the knowledge that k < 20001 and k <= 2 are equivalent for >> integer type: >> >> postgres=# SELECT int4range '(, 20001)' = int4range '(, 2]'; >> ?column? >> -- >> t >> (1 row) > It is bright idea, but it is not quit clear to me how to implement it. > There are several builtin ranges types in Postgres: int4range, int8range, > numrange, tsrange, tstzrange, daterange. > > Among them int4range, int8range and daterange are discrete types having > canonical function, for which this transformation rules are applicable. > Now I perform checks for all this types. So the challenge is to support user > defined range types with canonical function. > As input operator_predicate_proof function has Oid of comparison operator and > Const * expression representing literal value. > So I see the following generic way of checking equivalence of ranges: > > 1. Get name of operator. If it is '<=' or '>=' then it is closed interval, if > it is '<' or '>' then it is open interval. > 2. Convert Const to text (using type's out function) and construct interval: > '(,"$value"]' for '<=', '["$value",)' for '>=', '(,"$value")' for '<' and > '("$value",)' for '>'. > 3. Find range type from type of the constant: > select * from pg_range where rngsubtype=?; > 4. Try to cast constructed above string to this range type (using type's in > function). > 5. Compare two produced ranges and if them are equal, then > operator_predicate_proof should return true. I haven't thought that much about details, so just one comment: you shouldn't need the conversion to text and back to binary form. utils/adt/rangetypes.c contains constructors that accept the binary values. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26, A-2700 Wiener Neustadt Web: https://www.cybertec-postgresql.com
Re: [HACKERS] Secondary index access optimizations
On 09.01.2018 19:48, Antonin Houska wrote: Konstantin Knizhnik wrote: On 14.08.2017 19:33, Konstantin Knizhnik wrote: On 14.08.2017 12:37, Konstantin Knizhnik wrote: Hi hackers, I am trying to compare different ways of optimizing work with huge append-only tables in PostgreSQL where primary key is something like timestamp and queries are usually accessing most recent data using some secondary keys. Size of secondary index is one of the most critical factors limiting insert/search performance. As far as data is inserted in timestamp ascending order, access to primary key is well localized and accessed tables are present in memory. But if we create secondary key for the whole table, then access to it will require random reads from the disk and significantly decrease performance. There are two well known solutions of the problem: 1. Table partitioning 2. Partial indexes This approaches I want to compare. First of all I want to check if optimizer is able to generate efficient query execution plan covering different time intervals. Unfortunately in both cases generated plan is not optimal. 1. Table partitioning: create table base (k integer primary key, v integer); create table part1 (check (k between 1 and 1)) inherits (base); create table part2 (check (k between 10001 and 2)) inherits (base); create index pi1 on part1(v); create index pi2 on part2(v); insert int part1 values (generate series(1,1), random()); insert into part2 values (generate_series(10001,2), random()); explain select * from base where k between 1 and 2 and v = 100; QUERY PLAN --- Append (cost=0.00..15.65 rows=3 width=8) -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8) Filter: ((k >= 1) AND (k <= 2) AND (v = 100)) -> Index Scan using pi1 on part1 (cost=0.29..8.31 rows=1 width=8) Index Cond: (v = 100) Filter: ((k >= 1) AND (k <= 2)) -> Index Scan using pi2 on part2 (cost=0.29..7.34 rows=1 width=8) Index Cond: (v = 100) Filter: ((k >= 1) AND (k <= 2)) Questions: - Is there some way to avoid sequential scan of parent table? Yes, it is empty and so sequential scan will not take much time, but ... it still requires some additional actions and so increasing query execution time. - Why index scan of partition indexes includes filter condition if it is guaranteed by check constraint that all records of this partition match search predicate? 2. Partial indexes: create table t (k integer primary key, v integer); insert into t values (generate_series(1,2),random()); create index i1 on t(v) where k between 1 and 1; create index i2 on t(v) where k between 10001 and 2; postgres=# explain select * from t where k between 1 and 1 and v = 100; QUERY PLAN Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8) Index Cond: (v = 100) (2 rows) Here we get perfect plan. Let's try to extend search interval: postgres=# explain select * from t where k between 1 and 2 and v = 100; QUERY PLAN -- Index Scan using t_pkey on t (cost=0.29..760.43 rows=1 width=8) Index Cond: ((k >= 1) AND (k <= 2)) Filter: (v = 100) (3 rows) Unfortunately in this case Postgres is not able to apply partial indexes. And this is what I expected to get: postgres=# explain select * from t where k between 1 and 1 and v = 100 union all select * from t where k between 10001 and 2 and v = 100; QUERY PLAN -- Append (cost=0.29..14.58 rows=2 width=8) -> Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8) Index Cond: (v = 100) -> Index Scan using i2 on t t_1 (cost=0.29..7.28 rows=1 width=8) Index Cond: (v = 100) I wonder if there are some principle problems in supporting this two things in optimizer: 1. Remove search condition for primary key if it is fully satisfied by derived table check constraint. 2. Append index scans of several partial indexes if specified interval is covered by their conditions. I wonder if someone is familiar with this part of optimizer ad can easily fix it. Otherwise I am going to spend some time on solving this problems (if community think that such optimizations will be useful). Replying to myself: the following small patch removes redundant checks from index scans for derived tables: diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index 939045d..1f7c9cf 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -1441,6 +1441,20 @@ relation_excluded_by_constraints(PlannerInfo *root, if (predicate_refuted_by(safe_constraints, rel->baserestrictinfo, false)) return true; + /*
Re: [HACKERS] Secondary index access optimizations
Konstantin Knizhnik wrote: > On 14.08.2017 19:33, Konstantin Knizhnik wrote: > > On 14.08.2017 12:37, Konstantin Knizhnik wrote: > > Hi hackers, > > I am trying to compare different ways of optimizing work with huge > append-only tables in PostgreSQL where primary key is something like > timestamp and queries are usually accessing most recent data using some > secondary keys. Size of secondary index is one of the most critical > factors limiting insert/search performance. As far as data is inserted in > timestamp ascending order, access to primary key is well localized and > accessed tables are present in memory. But if we create secondary key for the > whole table, then access to it will require random reads from > the disk and significantly decrease performance. > > There are two well known solutions of the problem: > 1. Table partitioning > 2. Partial indexes > > This approaches I want to compare. First of all I want to check if optimizer > is able to generate efficient query execution plan covering different time > intervals. > Unfortunately in both cases generated plan is not optimal. > > 1. Table partitioning: > > create table base (k integer primary key, v integer); > create table part1 (check (k between 1 and 1)) inherits (base); > create table part2 (check (k between 10001 and 2)) inherits (base); > create index pi1 on part1(v); > create index pi2 on part2(v); > insert int part1 values (generate series(1,1), random()); > insert into part2 values (generate_series(10001,2), random()); > explain select * from base where k between 1 and 2 and v = 100; > QUERY PLAN > --- > Append (cost=0.00..15.65 rows=3 width=8) > -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8) > Filter: ((k >= 1) AND (k <= 2) AND (v = 100)) > -> Index Scan using pi1 on part1 (cost=0.29..8.31 rows=1 width=8) > Index Cond: (v = 100) > Filter: ((k >= 1) AND (k <= 2)) > -> Index Scan using pi2 on part2 (cost=0.29..7.34 rows=1 width=8) > Index Cond: (v = 100) > Filter: ((k >= 1) AND (k <= 2)) > > Questions: > - Is there some way to avoid sequential scan of parent table? Yes, it is > empty and so sequential scan will not take much time, but ... it still > requires some additional actions and so increasing query execution time. > - Why index scan of partition indexes includes filter condition if it is > guaranteed by check constraint that all records of this partition match > search predicate? > > 2. Partial indexes: > > create table t (k integer primary key, v integer); > insert into t values (generate_series(1,2),random()); > create index i1 on t(v) where k between 1 and 1; > create index i2 on t(v) where k between 10001 and 2; > postgres=# explain select * from t where k between 1 and 1 and v = 100; > QUERY PLAN > > Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8) > Index Cond: (v = 100) > (2 rows) > > Here we get perfect plan. Let's try to extend search interval: > > postgres=# explain select * from t where k between 1 and 2 and v = 100; > QUERY PLAN > -- > Index Scan using t_pkey on t (cost=0.29..760.43 rows=1 width=8) > Index Cond: ((k >= 1) AND (k <= 2)) > Filter: (v = 100) > (3 rows) > > Unfortunately in this case Postgres is not able to apply partial indexes. > And this is what I expected to get: > > postgres=# explain select * from t where k between 1 and 1 and v = 100 > union all select * from t where k between 10001 and 2 and v = 100; > QUERY PLAN > -- > Append (cost=0.29..14.58 rows=2 width=8) > -> Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8) > Index Cond: (v = 100) > -> Index Scan using i2 on t t_1 (cost=0.29..7.28 rows=1 width=8) > Index Cond: (v = 100) > > I wonder if there are some principle problems in supporting this two things > in optimizer: > 1. Remove search condition for primary key if it is fully satisfied by > derived table check constraint. > 2. Append index scans of several partial indexes if specified interval is > covered by their conditions. > > I wonder if someone is familiar with this part of optimizer ad can easily > fix it. > Otherwise I am going to spend some time on solving this problems (if > community think that such optimizations will be useful). > > Replying to myself: the following small patch removes redundant checks from > index scans for derived tables: > > diff --git a/src/backend/optimizer/util/plancat.c > b/src/backend/optimizer/util/plancat.c > index 939045d..1f7c9cf 100644 > --- a/src/backend/optimizer/util/plancat.c > +++ b/src/backend/optimizer/util/plancat.c > @@
Re: [HACKERS] Secondary index access optimizations
Greetings, * Konstantin Knizhnik (k.knizh...@postgrespro.ru) wrote: > On 04.12.2017 19:44, Alvaro Herrera wrote: > >Konstantin Knizhnik wrote: > >> > >>On 30.11.2017 05:16, Michael Paquier wrote: > >>>On Mon, Nov 6, 2017 at 10:13 PM, Konstantin Knizhnik > >>> wrote: > Concerning broken partition_join test: it is "expected" failure: my patch > removes from the plans redundant checks. > So the only required action is to update expected file with results. > Attached please find updated patch. > >>>The last patch version has conflicts in regression tests and did not > >>>get any reviews. Please provide a rebased version. I am moving the > >>>patch to next CF with waiting on author as status. Thanks! > >>Rebased patch is attached. > >I don't think this is a rebase on the previously posted patch ... it's > >about 10x as big and appears to be a thorough rewrite of the entire > >optimizer. > > Or, sorry. I really occasionally committed in this branch patch for > aggregate push down. > Correct reabased patch is attached. This patch applies, builds and passes 'make check-world', with no real review posted of it, so I don't believe it should be 'Waiting on Author' but really in 'Needs Review' status, so I've gone ahead and updated the CF with that status. Thanks! Stephen signature.asc Description: PGP signature
Re: [HACKERS] Secondary index access optimizations
On Tue, Sep 5, 2017 at 9:10 AM, Konstantin Knizhnik < k.knizh...@postgrespro.ru> wrote: > > > On 05.09.2017 04:02, Amit Langote wrote: > > Like Thomas, I'm not so sure about the whole predtest.c patch. The core > logic in operator_predicate_proof() should be able to conclude that, say, > k < 21 is implied by k <= 20, which you are trying to address with some > special case code. If there is still problem you think need to be fixed > here, a better place to look at would be somewhere around get_btree_test_op(). > > > Frankly speaking I also do not like this part of my patch. > I will be pleased if you or somebody else can propose better solution. > I do not understand how get_btree_test_op() can help here. > > Yes, k < 21 is implied by k <= 20. It is based on generic properties of < > and <= operators. > But I need to proof something different: having table partition constraint > (k < 21) I want to remove predicate (k <= 20) from query. > In other words, operator_predicate_proof() should be able to conclude > that (k <= 20) is implied by (k < 21). > But it is true only for integer types, not for floating point types. And > Postgres operator definition > doesn't provide some way to determine that user defined type is integer > type: has integer values for which such conclusion is true. > > Why I think that it is important? Certainly, it is possible to rewrite > query as (k < 21) and no changes in operator_predicate_proof() are needed. > Assume the most natural use case: I have some positive integer key and I > wan to range partition table by such key, for example with interval 1. > Currently standard PostgreSQL partitioning mechanism requires to specify > intervals with open high boundary. > So if I want first partition to contain interval [1,1], second - > [10001,20001],... I have to create partitions in such way: > > create table bt (k integer, v integer) partition by range (k); > create table dt1 partition of bt for values from (1) to (10001); > create table dt2 partition of bt for values from (10001) to (20001); > ... > > If I want to write query inspecting data of the particular partition, then > most likely I will use BETWEEN operator: > > SELECT * FROM t WHERE k BETWEEN 1 and 1; > > But right now operator_predicate_proof() is not able to conclude that > predicate (k BETWEEN 1 and 1) transformed to (k >= 1 AND k <= 1) is > equivalent to (k >= 1 AND k < 10001) > which is used as partition constraint. > > Another very popular use case (for example mentioned in PostgreSQL > documentation of partitioning: https://www.postgresql.org/ > docs/10/static/ddl-partitioning.html) > is using date as partition key: > > CREATE TABLE measurement ( > city_id int not null, > logdate date not null, > peaktempint, > unitsales int > ) PARTITION BY RANGE (logdate); > > > CREATE TABLE measurement_y2006m03 PARTITION OF measurement > FOR VALUES FROM ('2006-03-01') TO ('2006-04-01') > > > Assume that now I want to get measurements for March: > > There are three ways to write this query: > > select * from measurement where extract(month from logdate) = 3; > select * from measurement where logdate between '2006-03-01' AND > '2006-03-31'; > select * from measurement where logdate >= '2006-03-01' AND logdate < > '2006-04-01'; > > Right now only for the last query optimal query plan will be constructed. > Perhaps the relative pages (about partitioning and optimization) should mention to avoid BETWEEN and using closed-open checks, as the last query. Dates are a perfect example to demonstrate that BETWEEN shouldn't be used, in my opinion. Dates (and timestamps) are not like integers as they are often used with different levels of precisions, day, month, year, hour, minute, second, etc. (month in your example). Constructing the correct expressions for the different precisions can be a nightmare with BETWEEN but very simple with >= and < (in the example: get start_date, '2006-03-01', and add one month). So, just my 2c, is it worth the trouble to implement this feature (conversion of (k<21) to (k<=20) and vice versa) and how much work would it need for all data types that are commonly used for partitioning? > Unfortunately my patch is not covering date type. > > -- > Konstantin Knizhnik > Postgres Professional: http://www.postgrespro.com > The Russian Postgres Company > >
Re: [HACKERS] Secondary index access optimizations
On 04.12.2017 19:44, Alvaro Herrera wrote: Konstantin Knizhnik wrote: On 30.11.2017 05:16, Michael Paquier wrote: On Mon, Nov 6, 2017 at 10:13 PM, Konstantin Knizhnik wrote: Concerning broken partition_join test: it is "expected" failure: my patch removes from the plans redundant checks. So the only required action is to update expected file with results. Attached please find updated patch. The last patch version has conflicts in regression tests and did not get any reviews. Please provide a rebased version. I am moving the patch to next CF with waiting on author as status. Thanks! Rebased patch is attached. I don't think this is a rebase on the previously posted patch ... it's about 10x as big and appears to be a thorough rewrite of the entire optimizer. Or, sorry. I really occasionally committed in this branch patch for aggregate push down. Correct reabased patch is attached. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index bce3348..6a7e7fb 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL)) (3 rows) -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest - QUERY PLAN -- +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest +QUERY PLAN +-- Foreign Scan on public.ft1 t1 Output: c1, c2, c3, c4, c5, c6, c7, c8 - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL)) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL)) (3 rows) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 1df1e3a..c421530 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -292,7 +292,7 @@ RESET enable_nestloop; EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1; -- OpExpr(l) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!; -- OpExpr(r) diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 44f6b03..c7bf118 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -345,6 +345,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel, switch (rel->rtekind) { case RTE_RELATION: +remove_restrictions_implied_by_constraints(root, rel, rte); if (rte->relkind == RELKIND_FOREIGN_TABLE) { /* Foreign table */ @@ -1130,6 +1131,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel, set_dummy_rel_pathlist(childrel); continue; } + remove_restrictions_implied_by_constraints(root, childrel, childRTE); /* CE failed, so finish copying/modifying join quals. */ childrel->joininfo = (List *) diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index f743871..f763a97 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -1466,6 +1466,51 @@ relation_excluded_by_constraints(PlannerInfo *root, return false; } +/* + * Remove from restrictions list items implied by table constraints + */ +void remove_restrictions_implied_by_constraints(PlannerInfo *root, +RelOptInfo *rel, RangeTblEntry *rte) +{ + List *constraint_pred; + List *safe_constraints = NIL; + List *safe_restrictions = NIL; + ListCell *lc; + + if (rte->rtekind != RTE_RELATION || rte->inh) + return; + + /* + * OK to fetch the constraint expressions. Include "col IS NOT
Re: [HACKERS] Secondary index access optimizations
Konstantin Knizhnik wrote: > > > On 30.11.2017 05:16, Michael Paquier wrote: > > On Mon, Nov 6, 2017 at 10:13 PM, Konstantin Knizhnik > > wrote: > > > Concerning broken partition_join test: it is "expected" failure: my patch > > > removes from the plans redundant checks. > > > So the only required action is to update expected file with results. > > > Attached please find updated patch. > > The last patch version has conflicts in regression tests and did not > > get any reviews. Please provide a rebased version. I am moving the > > patch to next CF with waiting on author as status. Thanks! > > Rebased patch is attached. I don't think this is a rebase on the previously posted patch ... it's about 10x as big and appears to be a thorough rewrite of the entire optimizer. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Secondary index access optimizations
On Mon, Nov 6, 2017 at 10:13 PM, Konstantin Knizhnik wrote: > Concerning broken partition_join test: it is "expected" failure: my patch > removes from the plans redundant checks. > So the only required action is to update expected file with results. > Attached please find updated patch. The last patch version has conflicts in regression tests and did not get any reviews. Please provide a rebased version. I am moving the patch to next CF with waiting on author as status. Thanks! -- Michael