Re: Should we add GUCs to allow partition pruning to be disabled?
On Fri, Apr 12, 2019 at 02:01:39PM +1200, David Rowley wrote: > On Thu, 11 Apr 2019 at 17:40, Justin Pryzby wrote: > > I tweaked this patch some more (sorry): > > - remove "especially"; > > I think that likely needs to be kept for the PG11 version. I was > hoping it was stop a casual tester testing a SELECT and seeing that > it's not so bad only to find later that UPDATE/DELETE OOMs. With "especially", it reads as if "excessive memory usage" might happen for SELECT, and it'll be additionally worse for UPDATE/DELETE. Without "especially", it makes "excessive RAM use" apply only to UPDATE/DELETE, which I think is what's intended. |Larger partition hierarchies may incur long planning time, and [especially] in |the case of UPDATE and DELETE, excessive |memory usage. I think as long as UPDATE/DELETE are specifically mentioned, that would handle your concern. If I were to suggest an alternative: |Larger partition hierarchies may incur long planning time; and, in |the case of UPDATE and DELETE, may also |incur excessive memory usage. ..after which I'll stop wrestling with words. Justin
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, 11 Apr 2019 at 17:40, Justin Pryzby wrote: > I tweaked this patch some more (sorry): > - remove "currently" since that's not expected to be changed (right?); Seems like a good idea. I think the way we exclude inheritance child relations will never scale well. Other improvements that we'll see will most likely be as a consequence of speeding up declarative partitioning. For example the planner improvements in PG12 just did that for UPDATE/DELETE. > - remove "especially"; I think that likely needs to be kept for the PG11 version. I was hoping it was stop a casual tester testing a SELECT and seeing that it's not so bad only to find later that UPDATE/DELETE OOMs. > - refer to "partition hierarchies" not "partitioning hierarchies"; fine > - rewrite bit about "When partition pruning is not possible" fine. > Also, I noticed awhile ago while grepping for "probably be fixed in future > releases" that some items under ddl-inherit-caveats are actually possible for > relkind=p partitions in v11. I assume those will never be implemented for > inheritence partitioning, so I propose another update to docs (if preferred, > I'll bring up on a new thread). Not sure about that. It may be very simple to implement if we one day get global indexes. It may just be a matter of pointing all the tables at the same index and letting the wonders of global indexes handle all the hard stuff. I'm not that excited about removing that. I'd be equally excited about adding the text if it wasn't already there and you were proposing to add it. > - unique constraints on parent table; > - FK constraints on parent table; > > Note that FK constraints *referencing* a partitiond table are possible in v12 > but not in v11. So if there's any finer-grained update to documentation of > the > individual limitations, it'd need to be tweaked for back branches (v10 and > 11). Don't we just need to remove or update: While primary keys are supported on partitioned tables, foreign keys referencing partitioned tables are not supported. (Foreign key references from a partitioned table to some other table are supported.) I didn't follow this work, but on testing, I see the foreign key does not CASCADE when doing DETACH PARTITION, it errors instead. Perhaps that's worth a mention here. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, Apr 11, 2019 at 03:34:30PM +1200, David Rowley wrote: > On Thu, 21 Mar 2019 at 00:51, David Rowley > wrote: > > Just so I don't forget about this, I've added it to the July 'fest. > > > > https://commitfest.postgresql.org/23/2065/ > > Now that we have 428b260f8, I think the version of this that goes into > master should be more like the attached. I tweaked this patch some more (sorry): - remove "currently" since that's not expected to be changed (right?); - remove "especially"; - refer to "partition hierarchies" not "partitioning hierarchies"; - rewrite bit about "When partition pruning is not possible" Also, I noticed awhile ago while grepping for "probably be fixed in future releases" that some items under ddl-inherit-caveats are actually possible for relkind=p partitions in v11. I assume those will never be implemented for inheritence partitioning, so I propose another update to docs (if preferred, I'll bring up on a new thread). - unique constraints on parent table; - FK constraints on parent table; Note that FK constraints *referencing* a partitiond table are possible in v12 but not in v11. So if there's any finer-grained update to documentation of the individual limitations, it'd need to be tweaked for back branches (v10 and 11). Justin >From 3a787b95f5a35b53cd958855ec6fc4ff9fc9a455 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Thu, 11 Apr 2019 00:24:44 -0500 Subject: [PATCH v1 1/2] Reinstate warnings regarding large heirarchies Put back warnings regarding high planning time and/or RAM use for large inheritance heirarchies, and high planning time for large number of partitions not pruned during planning with declaratively partitioned tables. Discussion: https://www.postgresql.org/message-id/CAKJS1f8RW-mHQ8aEWD5Dv0%2B8A1wH5tHHdYMGW9y5sXqnE0X9wA%40mail.gmail.com https://commitfest.postgresql.org/23/2065/ Author: Robert Haas, David Rowley Reviewed by: Amit Langote, Justin Pryzby --- doc/src/sgml/ddl.sgml | 20 1 file changed, 20 insertions(+) diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 244d5ce..83cbc66 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -3582,6 +3582,26 @@ VALUES ('Albany', NULL, NULL, 'NY'); offer flexibility but do not have some of the performance benefits of built-in declarative partitioning. + + + + When using table inheritance, partition hierarchies with more than a few + hundred partitions are not recommended. Larger partition hierarchies may + incur long planning time, and, in the case of UPDATE + and DELETE, excessive memory usage. When inheritance + is used, see also the limitations described in + . + + + + When using declarative partitioning, the overhead of query planning + is directly related to the number of unpruned partitions. Planning is + generally fast with small numbers of unpruned partitions, even in + partition hierarchies containing many thousands of partitions. However, + long planning time will be incurred by large partition hierarchies if + partition pruning is not possible during the planning phase. + + -- 2.1.4 >From 6bd80e7cdddc3c9552d44439b4b8e9843c1007e4 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Thu, 11 Apr 2019 00:22:56 -0500 Subject: [PATCH v1 2/2] Document features of declarative partitioning.. ..which will never be implemented for legacy inheritance. --- doc/src/sgml/ddl.sgml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 83cbc66..3495a66 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -3450,8 +3450,9 @@ VALUES ('Albany', NULL, NULL, 'NY'); - These deficiencies will probably be fixed in some future release, - but in the meantime considerable care is needed in deciding whether + Some functionality not implemented for inheritance hierarchies is + implemented for declarative partitioning. + Considerable care is needed in deciding whether partitioning with legacy inheritance is useful for your application. -- 2.1.4
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2019/04/11 13:50, David Rowley wrote: > On Thu, 11 Apr 2019 at 16:06, Amit Langote > wrote: >> I've posted a patch last week on the "speed up partition planning" thread >> [1] which modifies ddl.sgml to remove the text about UPDATE/DELETE using >> constraint exclusion under the covers. Do you think there's any merit to >> combining that with this one? > > Probably separate is better. I don't think anything you're proposing > there is for back-patching, but I think the original patch over here > should be. OK, no problem. I just thought to point out my patch because you've posted a version of the patch here for HEAD *because of* 428b260f8, the commit which also obsoleted the text that the other patch fixes. Anyway, let's leave the other patch on its own thread where there are a few other things to be sorted out. Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, 11 Apr 2019 at 16:06, Amit Langote wrote: > > On 2019/04/11 12:34, David Rowley wrote: > > Now that we have 428b260f8, I think the version of this that goes into > > master should be more like the attached. > > Thanks, looks good. Thanks for looking. > I've posted a patch last week on the "speed up partition planning" thread > [1] which modifies ddl.sgml to remove the text about UPDATE/DELETE using > constraint exclusion under the covers. Do you think there's any merit to > combining that with this one? Probably separate is better. I don't think anything you're proposing there is for back-patching, but I think the original patch over here should be. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2019/04/11 12:34, David Rowley wrote: > On Thu, 21 Mar 2019 at 00:51, David Rowley > wrote: >> Just so I don't forget about this, I've added it to the July 'fest. >> >> https://commitfest.postgresql.org/23/2065/ > > Now that we have 428b260f8, I think the version of this that goes into > master should be more like the attached. Thanks, looks good. I've posted a patch last week on the "speed up partition planning" thread [1] which modifies ddl.sgml to remove the text about UPDATE/DELETE using constraint exclusion under the covers. Do you think there's any merit to combining that with this one? Thanks, Amit [1] https://www.postgresql.org/message-id/4f049572-9440-3c99-afa1-f7ca7f38fe80%40lab.ntt.co.jp
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, 21 Mar 2019 at 00:51, David Rowley wrote: > Just so I don't forget about this, I've added it to the July 'fest. > > https://commitfest.postgresql.org/23/2065/ Now that we have 428b260f8, I think the version of this that goes into master should be more like the attached. I think the original patch is fine for the back branches. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services docs_partitioning_warning_master.patch Description: Binary data
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, 14 Mar 2019 at 02:10, Robert Haas wrote: > > On Tue, Mar 12, 2019 at 7:28 PM David Rowley > wrote: > > I think I've done that in the attached patch. > > Cool, thanks. Just so I don't forget about this, I've added it to the July 'fest. https://commitfest.postgresql.org/23/2065/ -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Tue, Mar 12, 2019 at 7:28 PM David Rowley wrote: > I think I've done that in the attached patch. Cool, thanks. > However, do think the > just saying "excessive memory usage" seems strange without prefixing > it with "can result in" and dropping the "especially". I'm fairly > used to having my wording debated, so I've left your words in the > patch. I'm not direly opposed to that. I included "especially" so as not to rule out the possibility that there might be cases other than UPDATE and DELETE that, in some circumstances, also use a lot of memory. I didn't prefix it with "can result in" because I don't think English grammar requires it to be there. It would be grammatically correct to say "Larger partitioning hierarchies may incur long planning time and excessive memory usage," and I don't think that injecting an appositive phrase before "excessive memory usage" changes that calculus. However, somebody might find your way easier to follow. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2019/03/13 8:28, David Rowley wrote: > On Wed, 13 Mar 2019 at 04:07, Robert Haas wrote: >> I think it should be added to one of the existing sub-headings. I >> suggest adding it to the end of 5.10.1 and rephrasing it so that it >> makes clearer the distinction between what will happen with >> inheritance and what will happen with table partitioning, e.g. +1. >> When using either declarative partitioning or table inheritance, >> partitioning hierarchies with more than a few hundred partitions are >> not currently recommended. Larger partition hierarchies may incur long >> planning time, and especially in the case of UPDATE and DELETE, >> excessive memory usage. When inheritance is used, see also the >> limitations described in Section 5.10.5, Partitioning and Constraint >> Exclusion. > > I think I've done that in the attached patch. However, do think the > just saying "excessive memory usage" seems strange without prefixing > it with "can result in" and dropping the "especially". FWIW, I've gotten used to reading the kind of English that Robert wrote (meaning it makes perfect sense to me), but wording tweaks you suggest will work to. Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On Wed, 13 Mar 2019 at 04:07, Robert Haas wrote: > I think it should be added to one of the existing sub-headings. I > suggest adding it to the end of 5.10.1 and rephrasing it so that it > makes clearer the distinction between what will happen with > inheritance and what will happen with table partitioning, e.g. > > When using either declarative partitioning or table inheritance, > partitioning hierarchies with more than a few hundred partitions are > not currently recommended. Larger partition hierarchies may incur long > planning time, and especially in the case of UPDATE and DELETE, > excessive memory usage. When inheritance is used, see also the > limitations described in Section 5.10.5, Partitioning and Constraint > Exclusion. I think I've done that in the attached patch. However, do think the just saying "excessive memory usage" seems strange without prefixing it with "can result in" and dropping the "especially". I'm fairly used to having my wording debated, so I've left your words in the patch. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services docs_partitioning_warning.patch Description: Binary data
Re: Should we add GUCs to allow partition pruning to be disabled?
On Mon, Mar 11, 2019 at 12:30 AM Amit Langote wrote: > Now the question is where to put this text? Currently, we have: > > 5.10. Table Partitioning > 5.10.1. Overview > 5.10.2. Declarative Partitioning > 5.10.3. Implementation Using Inheritance > 5.10.4. Partition Pruning > 5.10.5. Partitioning and Constraint Exclusion > > Should we add 5.10.6 Notes for the above "note", or should it be stuffed > under one of the existing sub-headings? I think it should be added to one of the existing sub-headings. I suggest adding it to the end of 5.10.1 and rephrasing it so that it makes clearer the distinction between what will happen with inheritance and what will happen with table partitioning, e.g. When using either declarative partitioning or table inheritance, partitioning hierarchies with more than a few hundred partitions are not currently recommended. Larger partition hierarchies may incur long planning time, and especially in the case of UPDATE and DELETE, excessive memory usage. When inheritance is used, see also the limitations described in Section 5.10.5, Partitioning and Constraint Exclusion. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2019/03/11 13:22, Justin Pryzby wrote: > On Mon, Mar 11, 2019 at 01:06:08PM +0900, Amit Langote wrote: >> On 2019/03/11 11:13, David Rowley wrote: >>> On Mon, 11 Mar 2019 at 15:00, David Rowley >>> wrote: On Mon, 11 Mar 2019 at 14:33, Amit Langote wrote: > PG 11 moved the needle a bit for SELECT queries: > > Excluding unnecessary partitions is slow for UPDATE and DELETE queries, With those words I expect the user might be surprised that it's still slow after doing SET enable_partition_pruning = off; >>> >>> I had in mind in 10, 11 and master add a note to mention: >> >> Thanks for putting this together. >> >>> Currently, it is not recommended to have partition hierarchies more >>> than a few hundred partitions. Larger partition hierarchies can >>> suffer from slow planning times with SELECT >>> queries. Planning times for UPDATE and >>> DELETE commands may also suffer slow planning >>> times, but in addition, memory consumption may also become an issue >>> due to how the planner currently plans the query once per partition. >>> These limitations are likely to be resolved in a future version of >>> PostgreSQL. > > Can I offer the following variation: > > | Currently, it is not recommended to have partition hierarchies with more > than > | a few hundred partitions. Larger partition hierarchies may incur long > | planning time. > | In addition, UPDATE and DELETE > | commands on larger hierarchies may cause excessive memory consumption. > | These deficiencies are likely to be fixed in a future release of > | PostgreSQL. Says essentially the same thing but with fewer words, so +1. Now the question is where to put this text? Currently, we have: 5.10. Table Partitioning 5.10.1. Overview 5.10.2. Declarative Partitioning 5.10.3. Implementation Using Inheritance 5.10.4. Partition Pruning 5.10.5. Partitioning and Constraint Exclusion Should we add 5.10.6 Notes for the above "note", or should it be stuffed under one of the existing sub-headings? Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On Mon, Mar 11, 2019 at 01:06:08PM +0900, Amit Langote wrote: > On 2019/03/11 11:13, David Rowley wrote: > > On Mon, 11 Mar 2019 at 15:00, David Rowley > > wrote: > >> On Mon, 11 Mar 2019 at 14:33, Amit Langote > >> wrote: > >>> PG 11 moved the needle a bit for SELECT queries: > >>> > >>> Excluding unnecessary partitions is slow for UPDATE and DELETE queries, > >> > >> With those words I expect the user might be surprised that it's still > >> slow after doing SET enable_partition_pruning = off; > > > > I had in mind in 10, 11 and master add a note to mention: > > Thanks for putting this together. > > > Currently, it is not recommended to have partition hierarchies more > > than a few hundred partitions. Larger partition hierarchies can > > suffer from slow planning times with SELECT > > queries. Planning times for UPDATE and > > DELETE commands may also suffer slow planning > > times, but in addition, memory consumption may also become an issue > > due to how the planner currently plans the query once per partition. > > These limitations are likely to be resolved in a future version of > > PostgreSQL. Can I offer the following variation: | Currently, it is not recommended to have partition hierarchies with more than | a few hundred partitions. Larger partition hierarchies may incur long | planning time. | In addition, UPDATE and DELETE | commands on larger hierarchies may cause excessive memory consumption. | These deficiencies are likely to be fixed in a future release of | PostgreSQL.
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2019/03/11 11:13, David Rowley wrote: > On Mon, 11 Mar 2019 at 15:00, David Rowley > wrote: >> >> On Mon, 11 Mar 2019 at 14:33, Amit Langote >> wrote: >>> PG 11 moved the needle a bit for SELECT queries: >>> >>> Excluding unnecessary partitions is slow for UPDATE and DELETE queries, >> >> With those words I expect the user might be surprised that it's still >> slow after doing SET enable_partition_pruning = off; > > I had in mind in 10, 11 and master add a note to mention: Thanks for putting this together. > Currently, it is not recommended to have partition hierarchies more > than a few hundred partitions. Larger partition hierarchies can > suffer from slow planning times with SELECT > queries. Planning times for UPDATE and > DELETE commands may also suffer slow planning > times, but in addition, memory consumption may also become an issue > due to how the planner currently plans the query once per partition. > These limitations are likely to be resolved in a future version of > PostgreSQL. How about slightly rewriting the sentence toward the end as: memory consumption may also become an issue, because planner currently plans the query once for every partition. > I've not really thought too much on the fact that the issue also > exists with inheritance tables in earlier version too. That's fine maybe. Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On Mon, 11 Mar 2019 at 15:00, David Rowley wrote: > > On Mon, 11 Mar 2019 at 14:33, Amit Langote > wrote: > > PG 11 moved the needle a bit for SELECT queries: > > > > Excluding unnecessary partitions is slow for UPDATE and DELETE queries, > > With those words I expect the user might be surprised that it's still > slow after doing SET enable_partition_pruning = off; I had in mind in 10, 11 and master add a note to mention: Currently, it is not recommended to have partition hierarchies more than a few hundred partitions. Larger partition hierarchies can suffer from slow planning times with SELECT queries. Planning times for UPDATE and DELETE commands may also suffer slow planning times, but in addition, memory consumption may also become an issue due to how the planner currently plans the query once per partition. These limitations are likely to be resolved in a future version of PostgreSQL. I've not really thought too much on the fact that the issue also exists with inheritance tables in earlier version too. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2019/03/11 11:00, David Rowley wrote: > On Mon, 11 Mar 2019 at 14:33, Amit Langote > wrote: >> PG 11 moved the needle a bit for SELECT queries: >> >> Excluding unnecessary partitions is slow for UPDATE and DELETE queries, > > With those words I expect the user might be surprised that it's still > slow after doing SET enable_partition_pruning = off; > > I'm not really talking about constraint exclusion or partition > pruning. The memory growth problem the user was experiencing was down > to the fact that we plan once per partition and each of the > PlannerInfos used for each planner run has a RangeTblEntry for all > partitions. This means if you add one more partition and you get N > partitions more RangeTblEntry items in memory. This is the quadratic > memory growth that I mentioned in the -general post. Yeah, I get it. As I said in my email, all we have ever mentioned in the documentation as the reason for queries on partitioned tables being slow is that partition exclusion is slow and nothing else. Can we put quadratic memory growth during planning as the reason for performance degradation into the documentation? Maybe we could, but every time I tried it, it didn't read like user-facing documentation to me. Do you have something in mind that we could add? Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On Mon, 11 Mar 2019 at 14:33, Amit Langote wrote: > PG 11 moved the needle a bit for SELECT queries: > > Excluding unnecessary partitions is slow for UPDATE and DELETE queries, With those words I expect the user might be surprised that it's still slow after doing SET enable_partition_pruning = off; I'm not really talking about constraint exclusion or partition pruning. The memory growth problem the user was experiencing was down to the fact that we plan once per partition and each of the PlannerInfos used for each planner run has a RangeTblEntry for all partitions. This means if you add one more partition and you get N partitions more RangeTblEntry items in memory. This is the quadratic memory growth that I mentioned in the -general post. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2019/03/11 0:25, Justin Pryzby wrote: > On Sun, Mar 10, 2019 at 10:53:02PM +1300, David Rowley wrote: >> On Fri, 11 May 2018 at 17:37, Amit Langote >> wrote: >>> 5. The last sentence in caveats, that is, >>> >>> "Partitioning using these techniques will work well with up to perhaps a >>> hundred partitions; don't try to use many thousands of partitions." >>> >>> should perhaps be reworded as: >>> >>> "So the legacy inheritance based partitioning will work well with up to >>> perhaps a hundred partitions; don't try to use many thousands of >>> partitions." > >> In the -general post, I was just about to point them at the part in >> the documents that warn against these large partition hierarchies, but >> it looks like the warning was removed in bebc46931a1, or at least >> modified to say that constraint exclusion with heritance tables is >> slow. I really wonder if we shouldn't put something back in there to >> warn against this sort of thing. > > +1 > > I believe I was of the same mind when I wrote: > https://www.postgresql.org/message-id/flat/20180525215002.GD14378%40telsasoft.com#c9de33b17fe63cecad4ac30fb1662531 I agree PG 11 didn't improve things enough to have removed such a warning from the documentation even for partitioning. Actually, we only ever had a warning about constraint exclusion getting slower as more children are added, but nothing about UPDATE/DELETE planning being slow in itself; perhaps more importantly, much slower than SELECT. It seems very hard to put that in the documentation though. In PG 10: Excluding unnecessary partitions is slow, especially as the number of partitions increases, because constraint exclusion needs to look at each partition to determine whether it could be excluded. Also, planning for UPDATE and DELETE queries is significantly slower than for SELECT queries for $REASONS. Given that, it is wise to use up to a few hundred partitions but not more. PG 11 moved the needle a bit for SELECT queries: Excluding unnecessary partitions is slow for UPDATE and DELETE queries, especially as the number of partitions increases, because constraint exclusion needs to look at each partition to determine whether it could be excluded. Also, planning for UPDATE and DELETE queries is significantly slower than for SELECT queries for $REASONS. Given that, it is wise to use up to a few hundred partitions but not more. Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On Sun, Mar 10, 2019 at 10:53:02PM +1300, David Rowley wrote: > On Fri, 11 May 2018 at 17:37, Amit Langote > wrote: > > 5. The last sentence in caveats, that is, > > > > "Partitioning using these techniques will work well with up to perhaps a > > hundred partitions; don't try to use many thousands of partitions." > > > > should perhaps be reworded as: > > > > "So the legacy inheritance based partitioning will work well with up to > > perhaps a hundred partitions; don't try to use many thousands of > > partitions." > In the -general post, I was just about to point them at the part in > the documents that warn against these large partition hierarchies, but > it looks like the warning was removed in bebc46931a1, or at least > modified to say that constraint exclusion with heritance tables is > slow. I really wonder if we shouldn't put something back in there to > warn against this sort of thing. +1 I believe I was of the same mind when I wrote: https://www.postgresql.org/message-id/flat/20180525215002.GD14378%40telsasoft.com#c9de33b17fe63cecad4ac30fb1662531 Justin PS. Sorry to dredge up another 10 month old thread..
Re: Should we add GUCs to allow partition pruning to be disabled?
On Fri, 11 May 2018 at 17:37, Amit Langote wrote: > 5. The last sentence in caveats, that is, > > "Partitioning using these techniques will work well with up to perhaps a > hundred partitions; don't try to use many thousands of partitions." > > should perhaps be reworded as: > > "So the legacy inheritance based partitioning will work well with up to > perhaps a hundred partitions; don't try to use many thousands of partitions." (digging up 10-month-old thread [1]) There was a report [2] on -general today where someone had a 4000 partition partitioned table and were complaining about memory consumption in the planner during DELETE. They didn't mention the exact version they were using, but mentioned that the problem exists on 10, 11 and master. Of course, we're well aware of this issue with DELETE and UPDATE of large partition hierarchies, Amit has been working hard with trying to solve it for PG12. In the -general post, I was just about to point them at the part in the documents that warn against these large partition hierarchies, but it looks like the warning was removed in bebc46931a1, or at least modified to say that constraint exclusion with heritance tables is slow. I really wonder if we shouldn't put something back in there to warn against this sort of thing. It might be a bit late for the people who've read the docs and done it already, but a warning might at least stop new people making the mistake. Hopefully one day we can remove the warning again, but it won't be for PG12. Thoughts? [1] https://www.postgresql.org/message-id/6bc4e96a-0e30-e9b6-dcc7-791c7486a491%40lab.ntt.co.jp [2] https://www.postgresql.org/message-id/739b7a5e-1192-1011-5aa2-41adad55682d%40perfexpert.ch -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, May 24, 2018 at 02:23:17PM -0400, Alvaro Herrera wrote: > On 2018-May-24, Bruce Momjian wrote: > > > On Wed, May 2, 2018 at 07:49:42PM -0700, David G. Johnston wrote: > > > > I toss my +1 to removing it altogether. > > > > +1 We are terrible at removing old GUCs and having it around means > > everyone has to decide if they need to change it, so having it is not a > > zero cost. > > Are you voting to remove the GUC? Yes. Sorry but I am very late on this and maybe too late to vote. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2018-May-24, Bruce Momjian wrote: > On Wed, May 2, 2018 at 07:49:42PM -0700, David G. Johnston wrote: > > I toss my +1 to removing it altogether. > > +1 We are terrible at removing old GUCs and having it around means > everyone has to decide if they need to change it, so having it is not a > zero cost. Are you voting to remove the GUC? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Wed, May 2, 2018 at 07:49:42PM -0700, David G. Johnston wrote: > On Wednesday, May 2, 2018, Alvaro Herrera wrote: > > Robert Haas wrote: > > On Wed, May 2, 2018 at 9:28 AM, Alvaro Herrera > > wrote: > > > I admit I am more concerned about the possibility of bugs than I am > > > about providing a performance-related tool. > > > > I agree that if partition pruning has bugs, somebody might want to > > turn it off. On the other hand, when they do, there's a good chance > > that they will lose so much performance that they'll still be pretty > > sad. Somebody certainly could have a workload where the pruning > > helps, but by a small enough amount that shutting it off is > > acceptable. But I suspect that's a somewhat narrow target. > > > > I'm not going to go to war over this, though. I'm just telling you > > what I think. > > Well, we didn't have a GUC initially, evidently because none of us > thought that this would be a huge problem. So maybe you're both right > and it's overkill to have it. I'm not set on having it, either. Does > anybody else have an opinion? > > > I toss my +1 to removing it altogether. +1 We are terrible at removing old GUCs and having it around means everyone has to decide if they need to change it, so having it is not a zero cost. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: Should we add GUCs to allow partition pruning to be disabled?
On Sat, May 19, 2018 at 5:02 AM, Robert Haas wrote: > On Fri, May 18, 2018 at 4:22 AM, Amit Langote > wrote: >> Yeah, I think it'd help to have Append be annotated as suggested by Robert >> above. I guess if "at executor startup" is shown, then the subnodes >> listed under Append will consist of only those that survived >> executor-startup pruning and thus will help understand why there are fewer >> than shown with EXPLAIN (without ANALYZE). Also, if "at runtime" is >> shown, a user may want look at nloops property of the individual subnodes >> to guess at how much pruning has occurred; although only the latter (that >> is, inspecting nloops) suffices to know that runtime pruning has occurred >> as also currently written in the documentation about pruning [1], the >> first piece of information (the "at runtime" annotation) seems nice to have. > > Having EXPLAIN and EXPLAIN ANALYZE show different things doesn't sound > like a good idea. Hmm yeah. I think I was misunderstanding how executor-startup pruning works when I wrote: ...and thus will help understand why there are fewer than shown with EXPLAIN (without ANALYZE). Actually, because ExecInitAppend would run for both EXPLAIN and EXPLAIN ANALYZE, executor-startup pruning should occur in both cases and will result in the same plan shape to be shown. Sorry about the confusion. Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
I reread this and have some more comments. https://www.postgresql.org/docs/devel/static/ddl-partitioning.html "however, it is not possible to use some of the inheritance features discussed in the previous section with partitioned tables and partitions" => The referenced section now follows rather than precedes the text; I suggest to say: "however, it is not possible to use some features of inheritance (discussed below) with declaratively partitioned tables or their partitions" "It is neither possible to specify columns when creating partitions with CREATE TABLE nor is it possible to add columns to partitions after-the-fact using ALTER TABLE" => change to: "It is not possible .. nor is it possible .." Immediately after the section in inheritence: https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#DDL-PARTITION-PRUNING "Partition pruning is a query optimization technique that improves performance for partitioned tables" => I think should say "improves performance for DECLARATIVELY partitioned tables" "You can use the EXPLAIN command to show the difference between a plan whose partitions have been pruned from one whose partitions haven't, by using the enable_partition_pruning configuration parameter. A typical unoptimized plan for this type of table setup is:" => should say "difference between .. AND", not FROM. => Also, should avoid repeating "use...using". Also, remove the comma or rearrange the sentence: "By using the EXPLAIN command and the enable_partition_pruning configuration parameter, one can show the difference between a plan whose partitions have been pruned from one whose partitions haven't. "Constraint exclusion is only applied during query planning; it is not applied at execution time like partition pruning does." => Remove "does" ? "Partitioning enforces a rule that all partitions must have exactly the same set of columns as the parent" => I think should say "Declarative partitioning enforces"; or maybe: "Partitions of a partitioned table must have exactly the same set of columns as the parent" or: "For declarative partitioning, partitions must have exactly the same set of columns as the partitioned table" Let me know if it's useful to provide a patch. Justin
Re: Should we add GUCs to allow partition pruning to be disabled?
On Fri, May 18, 2018 at 4:22 AM, Amit Langote wrote: > Yeah, I think it'd help to have Append be annotated as suggested by Robert > above. I guess if "at executor startup" is shown, then the subnodes > listed under Append will consist of only those that survived > executor-startup pruning and thus will help understand why there are fewer > than shown with EXPLAIN (without ANALYZE). Also, if "at runtime" is > shown, a user may want look at nloops property of the individual subnodes > to guess at how much pruning has occurred; although only the latter (that > is, inspecting nloops) suffices to know that runtime pruning has occurred > as also currently written in the documentation about pruning [1], the > first piece of information (the "at runtime" annotation) seems nice to have. Having EXPLAIN and EXPLAIN ANALYZE show different things doesn't sound like a good idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2018/05/17 23:24, Robert Haas wrote: > On Thu, May 17, 2018 at 12:04 AM, David Rowley wrote: >> I'm not really a fan of overloading properties with a bunch of text. >> Multiple int or text properties would be easier to deal with, >> especially so when you consider the other explain formats. Remember, >> all 3 pruning methods could be used for a single Append node. > > I was imagining it as two properties in non-text format that got > displayed in a special way in text mode. I intended that this would > only give information about execution-time pruning, so there would > only two methods to consider here, but, yeah, you might have something > like: > > Execution-Time Pruning: order_lines (at executor startup, at runtime) This looks short enough and useful. >> I guess doing work here would require additional code in the planner >> to track how many relations were removed by both partition pruning and >> constraint exclusion. Dunno if that would be tracked together or >> separately. However, I'd prefer to have a clear idea of what exactly >> the design should be before I go write some code that perhaps nobody >> will like. > > I don't feel strongly about adding more code to track the number of > removed partitions. I think that the important thing is whether or > not partitioning is happening and at what stage, and I think it's > useful to show the relation name if we can. As you pointed out, it's > largely possible already to figure out how well we did at removing > stuff and at which stages, but to me it seems quite easy to be > confused about which stages tried to remove things. For example, > consider: > > Gather > -> Nested Loop > -> Seq Scan > Filter: something > -> Append > -> Index Scan > -> Index Scan > -> Index Scan > > I think it's going to be quite tricky to figure out whether that > Append node is trying to do execution-time pruning without some > annotation. The nloops values are going to be affected by how many > rows are in which partitions and how many workers got which rows as > well as by whether execution-time pruning worked and how effectively. > You might be able to figure out it out by staring at the EXPLAIN > output for a while... but it sure seems like it would be a lot nicer > to have an explicit indicator... especially if you're some random user > rather than a PostgreSQL expect. Yeah, I think it'd help to have Append be annotated as suggested by Robert above. I guess if "at executor startup" is shown, then the subnodes listed under Append will consist of only those that survived executor-startup pruning and thus will help understand why there are fewer than shown with EXPLAIN (without ANALYZE). Also, if "at runtime" is shown, a user may want look at nloops property of the individual subnodes to guess at how much pruning has occurred; although only the latter (that is, inspecting nloops) suffices to know that runtime pruning has occurred as also currently written in the documentation about pruning [1], the first piece of information (the "at runtime" annotation) seems nice to have. Thanks, Amit [1] https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#DDL-PARTITION-PRUNING
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, May 17, 2018 at 12:04 AM, David Rowley wrote: >> Append >> Execution-Time Pruning: order_lines (at executor startup) >> -> Index Scan ... > > Perhaps Append should be shown as "Unordered Partitioned Table Scan on > ". That seems more aligned to how else we show which relation a > node belongs to. The partition being scanned is simple to obtain. It's > just the first item in the partitioned_rels List. (MergeAppend would > be an "Ordered Partitioned Table Scan") Hmm, that's a radical proposal but I'm not sure I like it. For one thing, table scan might mean sequential scan to some users. For another, it's not really unordered. Unless it's parallel-aware, we're going to scan them strictly in the order they're given. > I'm not really a fan of overloading properties with a bunch of text. > Multiple int or text properties would be easier to deal with, > especially so when you consider the other explain formats. Remember, > all 3 pruning methods could be used for a single Append node. I was imagining it as two properties in non-text format that got displayed in a special way in text mode. I intended that this would only give information about execution-time pruning, so there would only two methods to consider here, but, yeah, you might have something like: Execution-Time Pruning: order_lines (at executor startup, at runtime) > I guess doing work here would require additional code in the planner > to track how many relations were removed by both partition pruning and > constraint exclusion. Dunno if that would be tracked together or > separately. However, I'd prefer to have a clear idea of what exactly > the design should be before I go write some code that perhaps nobody > will like. I don't feel strongly about adding more code to track the number of removed partitions. I think that the important thing is whether or not partitioning is happening and at what stage, and I think it's useful to show the relation name if we can. As you pointed out, it's largely possible already to figure out how well we did at removing stuff and at which stages, but to me it seems quite easy to be confused about which stages tried to remove things. For example, consider: Gather -> Nested Loop -> Seq Scan Filter: something -> Append -> Index Scan -> Index Scan -> Index Scan I think it's going to be quite tricky to figure out whether that Append node is trying to do execution-time pruning without some annotation. The nloops values are going to be affected by how many rows are in which partitions and how many workers got which rows as well as by whether execution-time pruning worked and how effectively. You might be able to figure out it out by staring at the EXPLAIN output for a while... but it sure seems like it would be a lot nicer to have an explicit indicator... especially if you're some random user rather than a PostgreSQL expect. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should we add GUCs to allow partition pruning to be disabled?
On 17 May 2018 at 01:19, Robert Haas wrote: > Hmm, that's actually not as bad as I thought. Thanks for the > explanation. I think if I were going to try to improve things, I'd > try to annotate the Append node with the name of the partitioned table > that it's using for pruning in case #2 and case #3, and maybe > something to indicate which type of pruning is in use. That would > make it really clear whether pruning is enabled or not. The methods > you mention above sort of require reading the tea leaves -- and it > might not always be very easy to distinguish between cases where > pruning is possible but nothing got pruned (imagine an inequality > qual) and where it's not even possible in the first place. > > e.g. > > Append > Execution-Time Pruning: order_lines (at executor startup) > -> Index Scan ... Perhaps Append should be shown as "Unordered Partitioned Table Scan on ". That seems more aligned to how else we show which relation a node belongs to. The partition being scanned is simple to obtain. It's just the first item in the partitioned_rels List. (MergeAppend would be an "Ordered Partitioned Table Scan") I'm not really a fan of overloading properties with a bunch of text. Multiple int or text properties would be easier to deal with, especially so when you consider the other explain formats. Remember, all 3 pruning methods could be used for a single Append node. I guess doing work here would require additional code in the planner to track how many relations were removed by both partition pruning and constraint exclusion. Dunno if that would be tracked together or separately. However, I'd prefer to have a clear idea of what exactly the design should be before I go write some code that perhaps nobody will like. Unsure what you have in mind for the pruning done during actual execution; just a yay or nay as to whether we're attempting it or not? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, May 10, 2018 at 10:22 PM, David Rowley wrote: > Here's a recap of the current way to determine where the pruning occurred: > > Phase 1: Plan time pruning: > > EXPLAIN/EXPLAIN ANALYZE shows Append/MergeAppend/ModifyTable shows > fewer subnodes than there are partitions. > Both EXPLAIN and EXPLAIN ANALYZE output gives no indication of any pruning. > > Phase 2: Executor init pruning: > > EXPLAIN and EXPLAIN ANALYZE shows Append with fewer subnodes than > there are partitions + "Subplans Removed: " appears to indicate the > number of subnodes removed by this phase. > > MergeAppend and ModifyTable are unsupported in PG11. > > Phase 3: Executor run pruning: > > EXPLAIN/EXPLAIN ANALYZE shows all nodes that survived phase 1+2. > > EXPLAIN ANALYZE shows that if a given node was never executed then the > runtime times appear as "(never executed)". If the Append was executed > and a subnode the Append was "(never executed)" then it was pruned by > this phase. Hmm, that's actually not as bad as I thought. Thanks for the explanation. I think if I were going to try to improve things, I'd try to annotate the Append node with the name of the partitioned table that it's using for pruning in case #2 and case #3, and maybe something to indicate which type of pruning is in use. That would make it really clear whether pruning is enabled or not. The methods you mention above sort of require reading the tea leaves -- and it might not always be very easy to distinguish between cases where pruning is possible but nothing got pruned (imagine an inequality qual) and where it's not even possible in the first place. e.g. Append Execution-Time Pruning: order_lines (at executor startup) -> Index Scan ... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2018/05/14 9:55, David Rowley wrote: > On 13 May 2018 at 03:30, Alvaro Herrera wrote: >> Amit Langote wrote: >> >>> +1 to this more radical overhaul of this part of the documentation. >> >> Thanks. I pushed now after some more tweaking, > > Thanks for pushing. Thank you. Regards, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On 13 May 2018 at 03:30, Alvaro Herrera wrote: > Amit Langote wrote: > >> +1 to this more radical overhaul of this part of the documentation. > > Thanks. I pushed now after some more tweaking, Thanks for pushing. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
Hello Amit Langote wrote: > +1 to this more radical overhaul of this part of the documentation. Thanks. I pushed now after some more tweaking, including your suggested corrections. I removed the examples, because they were both wrong. We can give this more polish if anybody has the energy, but I think we're in a pretty decent place now. I'm not convinced that we need to show so much detail on pruning as proposed by Robert elsewhere; we didn't have a lot of detail for exclusion either and I don't have any evidence that it was a terrible problem for users. Also, one possible use of the new GUC is that you can compare plans if you so wish. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Fri, May 11, 2018 at 12:59:27PM +0900, Amit Langote wrote: > On 2018/05/11 2:13, Robert Haas wrote: >> On Thu, May 10, 2018 at 12:58 PM, Alvaro Herrera >> wrote: >>> David G. Johnston wrote: As a user I don't really need to know which model is implemented and the name doesn't necessarily imply the implementation. Pruning seems to be the commonly-used term for this feature and we should stick with that. >>> >>> I agree with this conclusion. So we have it right and we shouldn't >>> change it. >> >> +1. > > +1 from me too. +1. -- Michael signature.asc Description: PGP signature
Re: Should we add GUCs to allow partition pruning to be disabled?
Hi. On 2018/05/11 4:45, Alvaro Herrera wrote: > I'm thinking something like this. +1 to this more radical overhaul of this part of the documentation. > The examples for runtime pruning are lame -- in the first, the text says > "watch out for Subplans Removed" and then the example provided doesn't > show one. (That example is probably exercising the wrong thing.) > > Anyway, wording suggestions for 5.10.4 and 5.10.5 in the attached file > are welcome. A few comments. 1. At the beginning of 5.10.4, in this example EXPLAIN's output: SET enable_partition_pruning = off; EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01'; There used to be [1] ellipses to show discontinuation between partitions shown in the output plan, which no longer exists. Should be like this: -> Seq Scan on measurement_y2006m03 (cost=0.00..33.12 rows=617 width=0) Filter: (logdate >= '2008-01-01'::date) ... -> Seq Scan on measurement_y2007m11 (cost=0.00..33.12 rows=617 width=0) Filter: (logdate >= '2008-01-01'::date) 2. In the following sentence in 5.10.5 "Constraint exclusion works in a very similar way to partition pruning, except that it uses each table's CHECK constraints — which gives it its name — instead of the partitioning constraints, as with partition pruning. Another difference is that it is only applied at plan time; there is no attempt to remove partitions at execution time." I think that saying "instead of the partitioning constraints, as with partition pruning" here may be a bit misleading, because it may give readers an impression that *all* tables have a partitioning constraint but constraint exclusion ignores it in favor of using CHECK constraints. How about saying: whereas partition pruning uses a table's partitioning constraint which exists only in the case of declarative partitioning. 3. Do we want the following sentence 5.10.5 to be revised now? "The default (and recommended) setting of constraint_exclusion is actually neither on nor off, but an intermediate setting called partition, which causes the technique to be applied only to queries that are likely to be working on inheritance partitioned tables." I'm not sure if it's the time yet, but maybe we would want to recommend "on" and mention that users may want to switch to "partition" if they need to use legacy inheritance partitioning for one reason or another. 4. In the following sentence in the caveats part of 5.10.5. Partitioning and Constraint Exclusion "A good rule of thumb is that partitioning constraints should contain only comparisons of the partitioning column(s) to constants using B-tree-indexable operators, which applies even to partitioned tables, because only B-tree-indexable column(s) are allowed in the partition key." The part beginning with ", which applies even to partitioned tables" is no longer needed as I had pointed out upthread [2]. The reason is we no longer pass the partition key derived partition constraints to constraint exclusion algorithm, as the new pruning covers that base. 5. The last sentence in caveats, that is, "Partitioning using these techniques will work well with up to perhaps a hundred partitions; don't try to use many thousands of partitions." should perhaps be reworded as: "So the legacy inheritance based partitioning will work well with up to perhaps a hundred partitions; don't try to use many thousands of partitions." Thanks, Amit [1] https://www.postgresql.org/docs/10/static/ddl-partitioning.html [2] https://www.postgresql.org/message-id/a8ad3dd8-ef30-bbd0-6732-a673710378fa%40lab.ntt.co.jp
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2018/05/11 2:13, Robert Haas wrote: > On Thu, May 10, 2018 at 12:58 PM, Alvaro Herrera > wrote: >> David G. Johnston wrote: >>> As a user I don't really need to know which model is implemented and the >>> name doesn't necessarily imply the implementation. Pruning seems to be the >>> commonly-used term for this feature and we should stick with that. >> >> I agree with this conclusion. So we have it right and we shouldn't >> change it. > > +1. +1 from me too. Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On 11 May 2018 at 08:05, Robert Haas wrote: > > On Thu, May 10, 2018 at 3:45 PM, Alvaro Herrera > wrote: > > The examples for runtime pruning are lame -- in the first, the text says > > "watch out for Subplans Removed" and then the example provided doesn't > > show one. (That example is probably exercising the wrong thing.) > > It seems to me that EXPLAIN output should have a clear way to show -- > and to distinguish -- (1) plan-time pruning, (2) executor startup time > pruning, (3) mid-execution pruning. I don't think that's entirely the > case right now. I'm open to improving this, but I've just not come up with any bright ideas on how to, yet. Here's a recap of the current way to determine where the pruning occurred: Phase 1: Plan time pruning: EXPLAIN/EXPLAIN ANALYZE shows Append/MergeAppend/ModifyTable shows fewer subnodes than there are partitions. Both EXPLAIN and EXPLAIN ANALYZE output gives no indication of any pruning. Phase 2: Executor init pruning: EXPLAIN and EXPLAIN ANALYZE shows Append with fewer subnodes than there are partitions + "Subplans Removed: " appears to indicate the number of subnodes removed by this phase. MergeAppend and ModifyTable are unsupported in PG11. Phase 3: Executor run pruning: EXPLAIN/EXPLAIN ANALYZE shows all nodes that survived phase 1+2. EXPLAIN ANALYZE shows that if a given node was never executed then the runtime times appear as "(never executed)". If the Append was executed and a subnode the Append was "(never executed)" then it was pruned by this phase. Changing parameters may cause some nodes to be scanned fewer times than other nodes. The "nloops" count being lower than the nloop count of the Append indicates this. e.g nloops=5 on an Append subnode vs nloops=8 on the Append node indicates the node was eliminated 3 times. Although complications around Parallel Append could make it quite difficult to count nloops, since a node running a partial plan could be executed by may workers which would increase the nloops. Solutions? The best I can think of right now is to add 2 more int properties to the EXPLAIN output: 1. Subplans removed by plan-time constraints exclusion: N 2. Subplans removed by plan-time partition pruning: N The rename the "Subplans Removed" that's there today to "Subplans removed by run-time pruning" These names are not very good, also. I'm also not very excited about adding this. This also does nothing for phase 3. Would something like that address your concern? Or do you have another idea? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, May 10, 2018 at 3:45 PM, Alvaro Herrera wrote: > The examples for runtime pruning are lame -- in the first, the text says > "watch out for Subplans Removed" and then the example provided doesn't > show one. (That example is probably exercising the wrong thing.) It seems to me that EXPLAIN output should have a clear way to show -- and to distinguish -- (1) plan-time pruning, (2) executor startup time pruning, (3) mid-execution pruning. I don't think that's entirely the case right now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should we add GUCs to allow partition pruning to be disabled?
I'm thinking something like this. The examples for runtime pruning are lame -- in the first, the text says "watch out for Subplans Removed" and then the example provided doesn't show one. (That example is probably exercising the wrong thing.) Anyway, wording suggestions for 5.10.4 and 5.10.5 in the attached file are welcome. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services Title: 5.10. Table Partitioning 5.10. Table PartitioningPrev UpChapter 5. Data DefinitionHome Next5.10. Table Partitioning5.10.1. Overview5.10.2. Declarative Partitioning5.10.3. Implementation Using Inheritance5.10.4. Partitioning and Partition Pruning5.10.5. Partitioning and Constraint Exclusion PostgreSQL supports basic table partitioning. This section describes why and how to implement partitioning as part of your database design. 5.10.1. Overview Partitioning refers to splitting what is logically one large table into smaller physical pieces. Partitioning can provide several benefits: Query performance can be improved dramatically in certain situations, particularly when most of the heavily accessed rows of the table are in a single partition or a small number of partitions. The partitioning substitutes for leading columns of indexes, reducing index size and making it more likely that the heavily-used parts of the indexes fit in memory. When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of sequential scan of that partition instead of using an index and random access reads scattered across the whole table. Bulk loads and deletes can be accomplished by adding or removing partitions, if that requirement is planned into the partitioning design. Doing ALTER TABLE DETACH PARTITION or dropping an individual partition using DROP TABLE is far faster than a bulk operation. These commands also entirely avoid the VACUUM overhead caused by a bulk DELETE. Seldom-used data can be migrated to cheaper and slower storage media. The benefits will normally be worthwhile only when a table would otherwise be very large. The exact point at which a table will benefit from partitioning depends on the application, although a rule of thumb is that the size of the table should exceed the physical memory of the database server. PostgreSQL offers built-in support for the following forms of partitioning: Range Partitioning The table is partitioned into “ranges” defined by a key column or set of columns, with no overlap between the ranges of values assigned to different partitions. For example, one might partition by date ranges, or by ranges of identifiers for particular business objects. List Partitioning The table is partitioned by explicitly listing which key values appear in each partition. Hash Partitioning The table is partitioned by specifying a modulus and a remainder for each partition. Each partition will hold the rows for which the hash value of the partition key divided by the specified modulus will produce the specified remainder. If your application needs to use other forms of partitioning not listed above, alternative methods such as inheritance and UNION ALL views can be used instead. Such methods offer flexibility but do not have some of the performance benefits of built-in declarative partitioning. 5.10.2. Declarative Partitioning PostgreSQL offers a way to specify how to divide a table into pieces called partitions. The table that is divided is referred to as a partitioned table. The specification consists of the partitioning method and a list of columns or expressions to be used as the partition key. All rows inserted into a partitioned table will be routed to one of the partitions based on the value of the partition key. Each partition has a subset of the data defined by its partition bounds. The currently supported partitioning methods are range, list, and hash. Partitions may themselves be defined as partitioned tables, using what is called sub-partitioning. Partitions may have their own indexes, constraints and default values, distinct from those of other partitions. See CREATE TABLE for more details on creating partitioned tables and partitions. It is not possible to turn a regular table into a partitioned table or vice versa. However, it is possible to add a regular or partitioned table containing data as a partition of a partitioned table, or remove a partition from a partitioned table turning it into a stand
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, May 10, 2018 at 1:51 PM, Alvaro Herrera wrote: > David G. Johnston wrote: >> Seems like if it stays the name is good - but at this point no has voiced >> opposition to removing it and making the name a moot point. > > If we think the probability of bugs is 0%, then I'm all for removing it. > I don't. I vote to remove the GUC in a couple of releases, once it's > proven completely useless. No feature ever written has a 0% probability of bugs. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should we add GUCs to allow partition pruning to be disabled?
David G. Johnston wrote: > Seems like if it stays the name is good - but at this point no has voiced > opposition to removing it and making the name a moot point. If we think the probability of bugs is 0%, then I'm all for removing it. I don't. I vote to remove the GUC in a couple of releases, once it's proven completely useless. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, May 10, 2018 at 10:13 AM, Robert Haas wrote: > On Thu, May 10, 2018 at 12:58 PM, Alvaro Herrera > wrote: > > David G. Johnston wrote: > >> As a user I don't really need to know which model is implemented and the > >> name doesn't necessarily imply the implementation. Pruning seems to be > the > >> commonly-used term for this feature and we should stick with that. > > > > I agree with this conclusion. So we have it right and we shouldn't > > change it. > > +1. > > Seems like if it stays the name is good - but at this point no has voiced opposition to removing it and making the name a moot point. David J.
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, May 10, 2018 at 12:58 PM, Alvaro Herrera wrote: > David G. Johnston wrote: >> As a user I don't really need to know which model is implemented and the >> name doesn't necessarily imply the implementation. Pruning seems to be the >> commonly-used term for this feature and we should stick with that. > > I agree with this conclusion. So we have it right and we shouldn't > change it. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should we add GUCs to allow partition pruning to be disabled?
David G. Johnston wrote: > As a user I don't really need to know which model is implemented and the > name doesn't necessarily imply the implementation. Pruning seems to be the > commonly-used term for this feature and we should stick with that. I agree with this conclusion. So we have it right and we shouldn't change it. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, May 10, 2018 at 8:57 AM, Alvaro Herrera wrote: > b) by default, no partitions are > scanned, and we examine the query to determine which ones must be > scanned. > There is an element of logic that says "by default, no partitions are scanned" is not a reasonable behavior mode. Thus an alternative analogy would be: Bucket A is the set of all relevant partitions in the tree Pruning: remove from bucket A those which we know we can skip; then iterate over A Selection: choose those items from A that are possible holders of our data and process each one (place all selected items into bucket B and iterate over B if you want to perform selection in total first). As a user I don't really need to know which model is implemented and the name doesn't necessarily imply the implementation. Pruning seems to be the commonly-used term for this feature and we should stick with that. David J.
Re: Should we add GUCs to allow partition pruning to be disabled?
David Rowley wrote: > On 1 May 2018 at 21:44, Amit Langote wrote: > > About the patch in general, it seems like the newly added documentation > > talks about "Partition Pruning" as something that *replaces* constraint > > exclusion. But, I think "Partition Pruning" is not the thing that > > replaces constraint exclusion. > > Just thinking about this a bit more. I've become a bit concerned that > we've completely misnamed this feature. It's true that at the moment > we build RelOptInfos for all partitions then eliminate what we can, > but the new algorithm that we've been calling "partition pruning" is > really not pruning anything at all, it's selecting the smallest set of > matching partitions. It's only the current usage of the algorithm > that's using it that way, and I kinda hope to change that for PG12. > > Isn't the whole thing better to be named "partition selection"? I think that approach makes it more difficult to explain, not less so. There are two logically opposite ways to explain this feature: a) by default, all partitions must be scanned, and we examine the query to determine which ones can be pruned. b) by default, no partitions are scanned, and we examine the query to determine which ones must be scanned. The whole "enable_partition_pruning" thing is based on the idea that we do a). You propose that we do b) instead. The only difference is what happens if the feature is disabled -- the "by default" clause gets inverted. So it would have to be b) if the feature is enabled, by default no partitions are scanned, and we examine the query to determine which ones must be scanned; if the feature is disabled, all partitions must be scanned. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
Robert Haas wrote: > In defense of constraint exclusion, let me note that constraint > exclusion is not restricted to inheritance cases. It could eliminate > the need to scan a completely unpartitioned table if the WHERE clause > can be refuted by CHECK constraints. It could eliminate the need to > scan some partitions of a partitioned table based on whatever > additional CHECK constraints exist beyond the partitioning > constraints. This is a great point that hadn't occurred to me. It means that we should keep constraint exclusion on its own rather than relegate it to , as my proposed patch does. I think it's a good idea to add this point there too. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Wed, May 9, 2018 at 10:10 PM, David Rowley wrote: > On 10 May 2018 at 14:01, Alvaro Herrera wrote: >> I'm thinking something a bit more radical. First, since partition >> pruning is the future and constraint exclusion is soon to be a thing of >> the past, we should describe pruning first, and then describe exclusion >> in terms of pruning. > > But... that's not true. The chapter describes inheritance partitioned > tables too, and we're not getting rid of constraint exclusion because > it's needed for those. However, that might not mean your patch has to > be changed. I'd better have a look... I agree that constraint exclusion isn't going to die any time soon, but I think Alvaro is right to say that we should explain the new partition pruning technique first, and then later explain, hey, we have this constraint exclusion thing, too. It takes more work to reorganize the documentation along those lines, but nobody wants to read about the techniques in the order we implemented them. They want to read about the important stuff first, and in this case, that's the new form of partition pruning. In defense of constraint exclusion, let me note that constraint exclusion is not restricted to inheritance cases. It could eliminate the need to scan a completely unpartitioned table if the WHERE clause can be refuted by CHECK constraints. It could eliminate the need to scan some partitions of a partitioned table based on whatever additional CHECK constraints exist beyond the partitioning constraints. These are less likely scenarios, perhaps, but not out of the question. For example, imagine a partitioned order table that is range-partitioned by order ID. You could add CHECK constraints based on the order_date that appears in each partition, and then constraint exclusion could eliminate partitions based on quals related to order_date. The order date correlates with the order ID, but partition pruning doesn't know that, so it can can only help with quals based on order ID. Constraint exclusion doesn't have that restriction. That's potentially useful, I think, although BRIN indexes on each partition are another way to tackle this sort of problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should we add GUCs to allow partition pruning to be disabled?
David Rowley wrote: > On 10 May 2018 at 14:01, Alvaro Herrera wrote: > > I'm thinking something a bit more radical. First, since partition > > pruning is the future and constraint exclusion is soon to be a thing of > > the past, we should describe pruning first, and then describe exclusion > > in terms of pruning. > > But... that's not true. The chapter describes inheritance partitioned > tables too, and we're not getting rid of constraint exclusion because > it's needed for those. Oh, I'm sure it is, but nobody is going to set up new inheritance partitioned tables anymore, except people who pg_upgrade from older releases. (And while I haven't tried, I'm sure it's possible to migrate from old-style to new-style partitioned tables without incurring full table rewrites, with little downside and lots to gain.) Now, maybe you argue that we could have a structure like this instead: 5.10.1. Overview 5.10.2. Declarative Partitioning 5.10.3. Partition Pruning 5.10.4. Implementation Using Inheritance 5.10.5. Constraint Exclusion I wouldn't oppose that. > However, that might not mean your patch has to be changed. I'd better > have a look... Thanks :-) -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On 10 May 2018 at 14:01, Alvaro Herrera wrote: > I'm thinking something a bit more radical. First, since partition > pruning is the future and constraint exclusion is soon to be a thing of > the past, we should describe pruning first, and then describe exclusion > in terms of pruning. But... that's not true. The chapter describes inheritance partitioned tables too, and we're not getting rid of constraint exclusion because it's needed for those. However, that might not mean your patch has to be changed. I'd better have a look... -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
David Rowley wrote: > Thanks for reviewing again. Hi, I'm thinking something a bit more radical. First, since partition pruning is the future and constraint exclusion is soon to be a thing of the past, we should describe pruning first, and then describe exclusion in terms of pruning. Second, I'd put constraint exclusion as a inside the that describes pruning (but keep the XML "id" the same, so that old links continue to work.) I took a stab at this, but ran out of time before trimming the text for constraint exclusion. What do you think of this rough sketch? I'm thinking 5.10.4 is close to its final form (wording suggestions of course welcome), but 5.10.4.1 still needs to be trimmed heavily, to avoid repeating what was already explained in 5.10.4 (we need only explain how exclusion differs from pruning.) I'm a bit undecided on where to leave the . (Note: make -C doc/src/sgml html XSLTPROCFLAGS='--stringparam rootid ddl' builds only the 'ddl' chapter, which is nice when proofreading.) -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 3f3f567222..2152b4d16d 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -3759,7 +3759,151 @@ ANALYZE measurement; - + + Partition Pruning + + +partition pruning + + + +Partition pruning is a query optimization technique +that improves performance for partitioned tables. As an example: + + +SET enable_partition_pruning = on;-- the default +SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01'; + + +Without partition pruning, the above query would scan each of the +the partitions of the measurement table. With +partition pruning enabled, the planner will examine the definition of each +partition and prove that the partition need not +be scanned because it could not contain any rows meeting the query's +WHERE clause. When the planner can prove this, it +excludes the partition from the query plan. + + + +You can use the EXPLAIN command to show the difference +between a plan with enable_partition_pruning on and a plan +with it off. A typical unoptimized plan for this type of table setup is: + + +SET enable_partition_pruning = off; +EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01'; +QUERY PLAN +─── + Aggregate (cost=188.76..188.77 rows=1 width=8) + -> Append (cost=0.00..181.05 rows=3085 width=0) + -> Seq Scan on measurement_y2006m02 (cost=0.00..33.12 rows=617 width=0) + Filter: (logdate >= '2008-01-01'::date) + -> Seq Scan on measurement_y2006m03 (cost=0.00..33.12 rows=617 width=0) + Filter: (logdate >= '2008-01-01'::date) + -> Seq Scan on measurement_y2007m11 (cost=0.00..33.12 rows=617 width=0) + Filter: (logdate >= '2008-01-01'::date) + -> Seq Scan on measurement_y2007m12 (cost=0.00..33.12 rows=617 width=0) + Filter: (logdate >= '2008-01-01'::date) + -> Seq Scan on measurement_y2008m01 (cost=0.00..33.12 rows=617 width=0) + Filter: (logdate >= '2008-01-01'::date) + + +Some or all of the partitions might use index scans instead of +full-table sequential scans, but the point here is that there +is no need to scan the older partitions at all to answer this query. +When we enable partition pruning, we get a significantly +cheaper plan that will deliver the same answer: + + +SET enable_partition_pruning = on; +EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01'; +QUERY PLAN +─── + Aggregate (cost=37.75..37.76 rows=1 width=8) + -> Append (cost=0.00..36.21 rows=617 width=0) + -> Seq Scan on measurement_y2008m01 (cost=0.00..33.12 rows=617 width=0) + Filter: (logdate >= '2008-01-01'::date) + + + + +Note that partition pruning is driven only by the constraints defined by +the partition keys, not by the presence of indexes. Therefore it isn't +necessary to define indexes on the key columns. Whether an index +needs to be created for a given partition depends on whether you +expect that queries that scan the partition will generally scan +a large part of the partition or just a small part. An index will +be helpful in the latter case but not the former. + + + +Partition pruning +can be performed not only during the planning of a given query, but also +during its execution. This is useful as it can allow more partitions to +
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2018/05/09 13:14, Amit Langote wrote: > Hi David. > > Thanks for addressing my comments. > > On 2018/05/07 15:00, David Rowley wrote: >> v2 patch is attached. > > Looks good to me. Sorry, I should've seen noticed v3 before sending my email. v3 looks good too, but when going through it, I noticed one bit in 5.10.4. Partitioning and Constraint Exclusion: A good rule of thumb is that partitioning constraints should contain only comparisons of the partitioning column(s) to constants using B-tree-indexable operators, which applies even to partitioned tables, because only B-tree-indexable column(s) are allowed in the partition key. I think the part after ", which applies even to partitioned tables,.." should be removed. Attached find the updated patch. Thanks, Amit diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index ffea744cb8..76606a8535 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -3841,6 +3841,11 @@ ANY num_sync ( for more information +on partition pruning and partitioning. + diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 004ecacbbf..d02edd771f 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -3760,7 +3760,7 @@ ANALYZE measurement; - Partitioning and Constraint Exclusion + Inheritance Partitioning and Constraint Exclusion constraint exclusion @@ -3768,9 +3768,8 @@ ANALYZE measurement; Constraint exclusion is a query optimization technique -that improves performance for partitioned tables defined in the -fashion described above (both declaratively partitioned tables and those -implemented using inheritance). As an example: +that improves performance for inheritance partitioned tables defined in the +fashion described above. As an example: SET constraint_exclusion = on; @@ -3847,15 +3846,14 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01'; is actually neither on nor off, but an intermediate setting called partition, which causes the technique to be -applied only to queries that are likely to be working on partitioned +applied only to queries that are likely to be working on inheritance partitioned tables. The on setting causes the planner to examine CHECK constraints in all queries, even simple ones that are unlikely to benefit. -The following caveats apply to constraint exclusion, which is used by -both inheritance and partitioned tables: +The following caveats apply to constraint exclusion: @@ -3877,11 +3875,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01'; range tests for range partitioning, as illustrated in the preceding examples. A good rule of thumb is that partitioning constraints should contain only comparisons of the partitioning column(s) to constants - using B-tree-indexable operators, which applies even to partitioned - tables, because only B-tree-indexable column(s) are allowed in the - partition key. (This is not a problem when using declarative - partitioning, since the automatically generated constraints are simple - enough to be understood by the planner.) + using B-tree-indexable operators. @@ -3898,6 +3892,94 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01'; + + + Declarative Partitioning and Partition Pruning + + +partition pruning + + + +Partition pruning is a query optimization technique +similar to constraint exclusion, but applies only to declaratively +partitioned tables. Like constraint exclusion, this uses (but is not +limited to using) the query's WHERE clause to exclude +partitions which cannot possibly contain any matching records. + + + +Partition pruning is much more efficient than constraint exclusion, since +it avoids scanning each partition's metadata to determine if the partition +is required for a particular query. + + + +Partition pruning is also more powerful than constraint exclusion as it +can be performed not only during the planning of a given query, but also +during its execution. This is useful as it can allow more partitions to +be pruned when clauses contain expressions whose values are unknown to the +query planner. For example, parameters defined in a +PREPARE statement, using a value obtained from a +subquery or using a parameterized value on the inner side of a nested loop +join. + + + +Partition pruning during execution can be performed at any of the +following times: + + + + + During initialization of the query plan. Partition pruning can be + performed here for parameter values which are known during the + initialization phase of execution. Pa
Re: Should we add GUCs to allow partition pruning to be disabled?
Hi David. Thanks for addressing my comments. On 2018/05/07 15:00, David Rowley wrote: > v2 patch is attached. Looks good to me. Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
Thanks for reviewing again. On 9 May 2018 at 01:32, Justin Pryzby wrote: > On Mon, May 07, 2018 at 06:00:59PM +1200, David Rowley wrote: >> Many thanks for reviewing this. > > 2nd round - from the minimalist department: > > +partitions which cannot possibly contain any matching records. > maybe: partitions which cannot match any records. I don't think that's an improvement. I don't think there's such a thing as "partitions which match records". A partition can contain a record, it never matches one. > + > +Partition pruning done during execution can be performed at any of the > +following times: > > remove "done"? Removed. > + number of partitions which were removed during this phase of pruning > by > remove "of prunning" Removed. v3 attached. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services further_enable_partition_pruning_doc_updates_v3.patch Description: Binary data
Re: Should we add GUCs to allow partition pruning to be disabled?
On Mon, May 07, 2018 at 06:00:59PM +1200, David Rowley wrote: > Many thanks for reviewing this. 2nd round - from the minimalist department: +partitions which cannot possibly contain any matching records. maybe: partitions which cannot match any records. + +Partition pruning done during execution can be performed at any of the +following times: remove "done"? + number of partitions which were removed during this phase of pruning by remove "of prunning" Justin
Re: Should we add GUCs to allow partition pruning to be disabled?
Many thanks for reviewing this. On 2 May 2018 at 20:07, Amit Langote wrote: > + > +Partition Pruning is also more powerful than constraint exclusion as > +partition pruning is not something that is performed only during the > +planning of a given query. > > Maybe, don't repeat "partition pruning" again in the same sentence. How > about: > > .. more powerful than constraint exclusion as *it* is not something.. changed. > Or may suggest to rewrite it as: > > Partition pruning is also more powerful than constraint exclusion as it > can be performed not only during the planning of a given query, but also > during its execution. > > If you accept the above rewrite, the next sentences in the paragraph: > > +In certain cases, partition pruning may also > +be performed during execution of the query as well. This allows pruning > +to be performed using values which are unknown during query planning, for > +example, using parameters defined in a PREPARE > +statement, using a value obtained from a subquery or using parameters > from > +a parameterized nested loop join. > > could be adjusted a bit to read as: > > For example, this allows pruning to be performed using values which are > unknown during query planning but will be known during execution, such as > using parameters defined in a PREPARE statement (if a > generic plan is chosen), or using a value obtained from a subquery, or > using values from an outer row of a parameterized nested loop join. I've changed this a bit but I didn't mention generic plans. What you say is true, but I didn't think we needed to be so specific. > + > +The partition pruning which is performed during execution is done so at > +either one or both of the following times: > > done so at -> done at Changed > + If partition pruning can be > + performed here then there is the added benefit of not having to > + initialize partitions which are pruned. Partitions which are pruned > + during this stage will not show up in the query's > + EXPLAIN or EXPLAIN ANALYZE. It > + is possible to determine the number of partitions which were removed > + using this method by observing the Subplans Removed > + property in the EXPLAIN output. > > While it might be OK to keep the last two sentences, not sure about the > 1st, which seems like it's spelling out an implementation detail -- that > there is an initialization step for partitions. It's a nice performance > enhancement, sure, but might be irrelevant to the users reading this > documentation. I've reworded this. I think it's important to inform the reader that this is performed during initialization of the plan as without that they might ask why there are two phases of pruning and not just one. Not having to initialize the subnode for pruned partitions is the sole advantage of doing this pruning phase, so I would rather be specific about when it occurs. > + nested loop joins. Since the value of these parameters may change > many > + times during the execution of the query, partition pruning is > performed > + whenever one of the execution parameters which is being compared to a > + partition column or expression changes. > > How about writing the last part as: whenever one of the execution > parameters relevant to pruning changes I've reworded this. > + > + > + Currently, partition pruning of partitions during the planning of an > + UPDATE or DELETE command are > + internally implemented using the constraint exclusion method. Only > + SELECT uses the faster partition pruning method. > Also > + partition pruning performed during execution is only done so for the > + Append node type. Both of these limitations are likely to be removed > + in a future release of PostgreSQL. > + > + > > Do we need to write this given that we decided to decouple even the > UPDATE/DELETE pruning from the constraint_exclusion configuration? I think it's important to inform people of the limitations. I know there's a lot of opinions floating around about the usability of partitioning in PostgreSQL with a large number of partitions. I included this here so interested parties know that their problems are not all solved by partition pruning. Perhaps those people can watch for the removal of this notice. > Also, > noting that only Append nodes can use execution-time pruning seems > unnecessary. I don't see plan node names mentioned like this elsewhere in > the documentation. But more to the point, it seems like spilling out > finer implementation details (and/or limitations thereof) in the > user-facing documentation. I thought about this while writing the patch, and it forced me to grep for instances of "Append" in the docs. There were some, so I didn't think I was breaking any rules. I also have no idea how else we might explain that it works for Append and not MergeAppend. It's likely g
Re: Should we add GUCs to allow partition pruning to be disabled?
Thanks for looking at this. I've taken most of your suggestions, some I had changed around as a result of Amit's review. On 7 May 2018 at 15:34, Justin Pryzby wrote: > + During actual execution of the query plan. Partition pruning may also > Remove "actual" ? I left this out one. I imagined it would be common to think of executor startup/execution/execution shutdown as "query execution", but I really only mean during the middle of those three things. If you can think of a better way to make that more clear, then it might be worth considering. For now, I think removing "actual" won't help. A patch will follow shortly, in response to Amit's review. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, Apr 26, 2018 at 07:29:37PM +1200, David Rowley wrote: > On 25 April 2018 at 09:59, Alvaro Herrera wrote: > > Amit Langote wrote: > >> Although the config.sgml coverage of the new capabilities seems pretty > >> good, some may find their being mentioned in 5.10 Table Partitioning > >> helpful. Or if we don't want to hijack 5.10.4, maybe create a 5.10.5. > > > > Can you (or someone) describe what would that section contain? > > I've drafted and attached a patch of how I think this should look. > Likely it will need some tweaking, but it is probably a good starting > point for discussion. diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 34da0d8d57..89735b4804 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml + +Unlike constraint exclusion, partition pruning can be performed much more +quickly as it does not have to scan each individual partition's metadata quickly COMMA But actually I suggest: Partition pruning is much more efficient than constraint exclusion, since pruning avoids scanning each partition's metadata... +Partition Pruning is also more powerful than constraint exclusion as +partition pruning is not something that is performed only during the remove "something that is" ? Or just merge into the next sentence. Note: Amit and David commented on this previously. +planning of a given query. In certain cases, partition pruning may also +be performed during execution of the query as well. This allows pruning "also" is redundant with "as well" +to be performed using values which are unknown during query planning, for could say "are not yet known" +The partition pruning which is performed during execution is done so at +either one or both of the following times: remove "either" ? + During initialization of the query plan. Partition pruning can be + initialization phase of execution. If partition pruning can be + performed here then there is the added benefit of not having to here COMMA + initialize partitions which are pruned. Partitions which are pruned + during this stage will not show up in the query's + During actual execution of the query plan. Partition pruning may also Remove "actual" ? + be performed here to remove partitions using values which are only known + during actual query execution. This includes values from subqueries and + values from execution time parameters such as ones from parameterized execution-time? s/ones/those/ + partition column or expression changes. In order to determine if + partitions were pruned at this stage requires careful inspection of the + nloops property in the + EXPLAIN ANALYZE output. s/In order to determine/Determining/ + + Currently, partition pruning of partitions during the planning of an s/partition //1 (just "pruning of partitions") + UPDATE or DELETE command are s/are/is/ + internally implemented using the constraint exclusion method. Only remove "internally"? + SELECT uses the faster partition pruning method. Also Also COMMA + partition pruning performed during execution is only done so for the Remove "so". Justin
Should we add GUCs to allow partition pruning to be disabled?
On Wednesday, May 2, 2018, Alvaro Herrera wrote: > Robert Haas wrote: > > On Wed, May 2, 2018 at 9:28 AM, Alvaro Herrera > wrote: > > > I admit I am more concerned about the possibility of bugs than I am > > > about providing a performance-related tool. > > > > I agree that if partition pruning has bugs, somebody might want to > > turn it off. On the other hand, when they do, there's a good chance > > that they will lose so much performance that they'll still be pretty > > sad. Somebody certainly could have a workload where the pruning > > helps, but by a small enough amount that shutting it off is > > acceptable. But I suspect that's a somewhat narrow target. > > > > I'm not going to go to war over this, though. I'm just telling you > > what I think. > > Well, we didn't have a GUC initially, evidently because none of us > thought that this would be a huge problem. So maybe you're both right > and it's overkill to have it. I'm not set on having it, either. Does > anybody else have an opinion? > I toss my +1 to removing it altogether. David J.
Re: Should we add GUCs to allow partition pruning to be disabled?
Robert Haas wrote: > On Wed, May 2, 2018 at 9:28 AM, Alvaro Herrera > wrote: > > I admit I am more concerned about the possibility of bugs than I am > > about providing a performance-related tool. > > I agree that if partition pruning has bugs, somebody might want to > turn it off. On the other hand, when they do, there's a good chance > that they will lose so much performance that they'll still be pretty > sad. Somebody certainly could have a workload where the pruning > helps, but by a small enough amount that shutting it off is > acceptable. But I suspect that's a somewhat narrow target. > > I'm not going to go to war over this, though. I'm just telling you > what I think. Well, we didn't have a GUC initially, evidently because none of us thought that this would be a huge problem. So maybe you're both right and it's overkill to have it. I'm not set on having it, either. Does anybody else have an opinion? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Wed, May 2, 2018 at 9:28 AM, Alvaro Herrera wrote: > I admit I am more concerned about the possibility of bugs than I am > about providing a performance-related tool. I agree that if partition pruning has bugs, somebody might want to turn it off. On the other hand, when they do, there's a good chance that they will lose so much performance that they'll still be pretty sad. Somebody certainly could have a workload where the pruning helps, but by a small enough amount that shutting it off is acceptable. But I suspect that's a somewhat narrow target. I'm not going to go to war over this, though. I'm just telling you what I think. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should we add GUCs to allow partition pruning to be disabled?
On 3 May 2018 at 11:38, David G. Johnston wrote: > Maybe "Partition Filtering" (I'm disliking selection, I'm thinking we must > always select partitions) I don't see why "Filtering" is any different from pruning, they both imply removing something that was once there. What I'm saying is, that it's backward to think of what we have now as pruning, so I don't think renaming it to "partition filtering" addresses my concern. FWIW, I'm not set on changing this. I just want to discuss this now so that the chances of having regrets about this later are reduced. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Wed, May 2, 2018 at 4:06 PM, David Rowley wrote: > On 1 May 2018 at 21:44, Amit Langote > wrote: > > About the patch in general, it seems like the newly added documentation > > talks about "Partition Pruning" as something that *replaces* constraint > > exclusion. But, I think "Partition Pruning" is not the thing that > > replaces constraint exclusion. > > Just thinking about this a bit more. I've become a bit concerned that > we've completely misnamed this feature. It's true that at the moment > we build RelOptInfos for all partitions then eliminate what we can, > but the new algorithm that we've been calling "partition pruning" is > really not pruning anything at all, it's selecting the smallest set of > matching partitions. It's only the current usage of the algorithm > that's using it that way, and I kinda hope to change that for PG12. > > Isn't the whole thing better to be named "partition selection"? The user-exposed Name/GUC need (and in some ways should) not reflect the implementation. Partitioning creates a tree and during planning and execution we prune those branches/paths from the tree that are not going to yield fruit. Its not like you can outright ignore their existence so at some point you choose to ignore them which is a form of pruning. Writing that I can support partition_pruning on technical grounds but to what extent are we alienating the international community that we serve? Maybe "Partition Filtering" (I'm disliking selection, I'm thinking we must always select partitions) Then again a Google search suggests we will be keeping good company by sticking with "Partition Pruning" - any language dynamic is probably overcome through extent of use. On the whole I'd stick with what we've got. David J.
Re: Should we add GUCs to allow partition pruning to be disabled?
On 1 May 2018 at 21:44, Amit Langote wrote: > About the patch in general, it seems like the newly added documentation > talks about "Partition Pruning" as something that *replaces* constraint > exclusion. But, I think "Partition Pruning" is not the thing that > replaces constraint exclusion. Just thinking about this a bit more. I've become a bit concerned that we've completely misnamed this feature. It's true that at the moment we build RelOptInfos for all partitions then eliminate what we can, but the new algorithm that we've been calling "partition pruning" is really not pruning anything at all, it's selecting the smallest set of matching partitions. It's only the current usage of the algorithm that's using it that way, and I kinda hope to change that for PG12. Isn't the whole thing better to be named "partition selection"? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Wed, May 2, 2018 at 1:07 AM, Amit Langote wrote: > Hi David. > > On 2018/05/02 8:18, David Rowley wrote: > > On 1 May 2018 at 21:44, Amit Langote > wrote: > > > > I re-read the patch and it still looks fine to me. I'm sure it could > > be made better, but I just don't currently see how. I think it would > > be better if you commented on the specifics of what you think could be > > improved rather than a general comment that it could be improved. > > Sorry, I may have been a bit vague. I've read the patch one more time by > considering the phrase "partition pruning" as the name of the new feature > and that constraint exclusion is an optimization technique which doubled > as partition pruning until now. The new feature achieves results faster > and can be used in more cases than constraint exclusion. With that > reading, I don't see much to complain about with your patch at a high > level. > > Except some nitpicking: > > + > +Partition Pruning is also more powerful than constraint exclusion as > +partition pruning is not something that is performed only during the > +planning of a given query. > > Maybe, don't repeat "partition pruning" again in the same sentence. How > about: > good thought > > .. more powerful than constraint exclusion as *it* is not something.. > technically "it" refers to "constraint exclusion" when written this way. Better would be: Partition pruning, unlike constraint exclusion, may be performed during query execution. Saying "not only planning" where there is only one other possible time it happens is unnecessarily vague. > + If partition pruning can be > + performed here then there is the added benefit of not having to > + initialize partitions which are pruned. Partitions which are > pruned > + during this stage will not show up in the query's > + EXPLAIN or EXPLAIN ANALYZE. > It > + is possible to determine the number of partitions which were > removed > + using this method by observing the Subplans Removed > + property in the EXPLAIN output. > > While it might be OK to keep the last two sentences, not sure about the > 1st, which seems like it's spelling out an implementation detail -- that > there is an initialization step for partitions. It's a nice performance > enhancement, sure, but might be irrelevant to the users reading this > documentation. > I would concur with omitting the initialization implementation detail. > > + nested loop joins. Since the value of these parameters may change > many > + times during the execution of the query, partition pruning is > performed > + whenever one of the execution parameters which is being compared > to a > + partition column or expression changes. > > How about writing the last part as: whenever one of the execution > parameters relevant to pruning changes > Is it when the values change or for each different value? The difference being if values are not sorted an something like: 1,2,3,2,3,4,1,2 were to appear. > > + > + > + Currently, partition pruning of partitions during the planning of an > + UPDATE or DELETE command are > + internally implemented using the constraint exclusion method. Only > + SELECT uses the faster partition pruning method. > Also > + partition pruning performed during execution is only done so for the > + Append node type. Both of these limitations are likely to be removed > + in a future release of PostgreSQL. > + > + > > Do we need to write this given that we decided to decouple even the > UPDATE/DELETE pruning from the constraint_exclusion configuration? Also, > noting that only Append nodes can use execution-time pruning seems > unnecessary. I don't see plan node names mentioned like this elsewhere in > the documentation. But more to the point, it seems like spilling out > finer implementation details (and/or limitations thereof) in the > user-facing documentation. > I suppose it would matter relative to what explain plans the user might see. I do think the distinction between UPDATE/DELETE and SELECT can be removed here though. The execution limitation seems potentially worthy though as written I am unable to convert the provided information into something I can use. Knowing when it cannot happen, even if incomplete, would be more helpful to me. David J.
Re: Should we add GUCs to allow partition pruning to be disabled?
Robert Haas wrote: > On Tue, Apr 24, 2018 at 5:59 PM, Alvaro Herrera > wrote: > > Constraint > > exclusion was pretty easy to get wrong, hence the need for a separate > > section, and I suppose the new partition pruning may be prey to the same > > problems, so it seems worth to document them specially. But not sure > > about the others, if they are mostly debugging tools. > > Weighing in here late, but I have a hard time understanding why we > want a GUC to control partition pruning at all. With constraint > exclusion, the issue is whether you want to spend planner cycles to > try to deduce things using CHECK constraints when, quite possibly, > your CHECK constraints are unrelated to table inheritance and thus > won't help. But seems extremely unlikely that the same thing would > happen with partition pruning. Unlike your CHECK constraints, your > partition bounds are, by definition, potentially useful for pruning. I admit I am more concerned about the possibility of bugs than I am about providing a performance-related tool. If partition prune can do its thing with only a 1.1% of overhead, that's a great result. While I'm sure that some real-world partitioning scenarios exist that have a higher overhead than that, that's not what I am worried about the most. In a couple of releases, once we know for sure that all this new code is absolutely stable and that there are no bugs (keeping in mind that PG12 will boast additional pruning for MergeAppend as well as for UPDATE/ DELETE queries,) we can remove the GUC -- hoping that no user will bark at us about they having to keep it disabled by default. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
Hi David. On 2018/05/02 8:18, David Rowley wrote: > On 1 May 2018 at 21:44, Amit Langote wrote: >> About the patch in general, it seems like the newly added documentation >> talks about "Partition Pruning" as something that *replaces* constraint >> exclusion. But, I think "Partition Pruning" is not the thing that >> replaces constraint exclusion. > > Not sure where you see the mention partition pruning replacing > constraint exclusion. > >> We used to do partition pruning even >> before and used constraint exclusion as the algorithm. > > That depends on if you think of partition pruning as the new feature > or the act of removing unneeded partitions. We seem to have settled on > partition pruning being the new feature given that we named the GUC > this way. So I don't quite understand what you mean here. > >> What's new is the >> algorithm that we now use to perform partition pruning for declaratively >> partitioned tables. Also, the characteristics of the new algorithm are >> such that it can now be used in more situations, thus making it more >> useful than the earlier method of partition pruning, so that new features >> like runtime pruning could be realized. I like that the patch adds >> various details about the new pruning features, but think that the wording >> and the flow could be improved a bit. >> >> What do you think? > > I re-read the patch and it still looks fine to me. I'm sure it could > be made better, but I just don't currently see how. I think it would > be better if you commented on the specifics of what you think could be > improved rather than a general comment that it could be improved. Sorry, I may have been a bit vague. I've read the patch one more time by considering the phrase "partition pruning" as the name of the new feature and that constraint exclusion is an optimization technique which doubled as partition pruning until now. The new feature achieves results faster and can be used in more cases than constraint exclusion. With that reading, I don't see much to complain about with your patch at a high level. Except some nitpicking: + +Partition Pruning is also more powerful than constraint exclusion as +partition pruning is not something that is performed only during the +planning of a given query. Maybe, don't repeat "partition pruning" again in the same sentence. How about: .. more powerful than constraint exclusion as *it* is not something.. Or may suggest to rewrite it as: Partition pruning is also more powerful than constraint exclusion as it can be performed not only during the planning of a given query, but also during its execution. If you accept the above rewrite, the next sentences in the paragraph: +In certain cases, partition pruning may also +be performed during execution of the query as well. This allows pruning +to be performed using values which are unknown during query planning, for +example, using parameters defined in a PREPARE +statement, using a value obtained from a subquery or using parameters from +a parameterized nested loop join. could be adjusted a bit to read as: For example, this allows pruning to be performed using values which are unknown during query planning but will be known during execution, such as using parameters defined in a PREPARE statement (if a generic plan is chosen), or using a value obtained from a subquery, or using values from an outer row of a parameterized nested loop join. + +The partition pruning which is performed during execution is done so at +either one or both of the following times: done so at -> done at + If partition pruning can be + performed here then there is the added benefit of not having to + initialize partitions which are pruned. Partitions which are pruned + during this stage will not show up in the query's + EXPLAIN or EXPLAIN ANALYZE. It + is possible to determine the number of partitions which were removed + using this method by observing the Subplans Removed + property in the EXPLAIN output. While it might be OK to keep the last two sentences, not sure about the 1st, which seems like it's spelling out an implementation detail -- that there is an initialization step for partitions. It's a nice performance enhancement, sure, but might be irrelevant to the users reading this documentation. + nested loop joins. Since the value of these parameters may change many + times during the execution of the query, partition pruning is performed + whenever one of the execution parameters which is being compared to a + partition column or expression changes. How about writing the last part as: whenever one of the execution parameters relevant to pruning changes + + + Currently, partition pruning of partitions during the planning of an + UPDATE or DELETE command are + internally implemented using the constraint exclusion method. Only + SELECT
Re: Should we add GUCs to allow partition pruning to be disabled?
Hi Amit, Thanks for looking at the patch. On 1 May 2018 at 21:44, Amit Langote wrote: > About the patch in general, it seems like the newly added documentation > talks about "Partition Pruning" as something that *replaces* constraint > exclusion. But, I think "Partition Pruning" is not the thing that > replaces constraint exclusion. Not sure where you see the mention partition pruning replacing constraint exclusion. > We used to do partition pruning even > before and used constraint exclusion as the algorithm. That depends on if you think of partition pruning as the new feature or the act of removing unneeded partitions. We seem to have settled on partition pruning being the new feature given that we named the GUC this way. So I don't quite understand what you mean here. > What's new is the > algorithm that we now use to perform partition pruning for declaratively > partitioned tables. Also, the characteristics of the new algorithm are > such that it can now be used in more situations, thus making it more > useful than the earlier method of partition pruning, so that new features > like runtime pruning could be realized. I like that the patch adds > various details about the new pruning features, but think that the wording > and the flow could be improved a bit. > > What do you think? I re-read the patch and it still looks fine to me. I'm sure it could be made better, but I just don't currently see how. I think it would be better if you commented on the specifics of what you think could be improved rather than a general comment that it could be improved. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2 May 2018 at 09:14, David Rowley wrote: > It might be worth running a series of benchmarks to test where the worst case > performance hit is with partition pruning. I just did this: Setup: create table parttable (a int, b int) partition by list (a); create table parttable1 partition of parttable for values in(1); Benchmark: $ echo "select * from parttable where b in(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20);" > parttable.sql $ pgbench -T 30 -f parttable.sql -n postgres transaction type: parttable.sql scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 30 s number of transactions actually processed: 230908 latency average = 0.130 ms tps = 7696.884795 (including connections establishing) tps = 7697.304782 (excluding connections establishing) $ psql -c "alter system set enable_partition_pruning = off" postgres ALTER SYSTEM $ pgbench -T 30 -f parttable.sql -n postgres transaction type: parttable.sql scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 30 s number of transactions actually processed: 233545 latency average = 0.128 ms tps = 7784.800130 (including connections establishing) tps = 7785.225490 (excluding connections establishing) So about 1.1% performance improvement. That's not very much. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2 May 2018 at 07:24, Robert Haas wrote: > On Tue, Apr 24, 2018 at 5:59 PM, Alvaro Herrera > wrote: > > Constraint > > exclusion was pretty easy to get wrong, hence the need for a separate > > section, and I suppose the new partition pruning may be prey to the same > > problems, so it seems worth to document them specially. But not sure > > about the others, if they are mostly debugging tools. > > Weighing in here late, but I have a hard time understanding why we > want a GUC to control partition pruning at all. With constraint > exclusion, the issue is whether you want to spend planner cycles to > try to deduce things using CHECK constraints when, quite possibly, > your CHECK constraints are unrelated to table inheritance and thus > won't help. But seems extremely unlikely that the same thing would > happen with partition pruning. Unlike your CHECK constraints, your > partition bounds are, by definition, potentially useful for pruning. > Thanks for weighing in here. It's certainly true that I was a bit undecided about this, hence the subject. I ended up leaning more towards having the GUC due to the fact that partition pruning, although much cheaper than constraint exclusion, it's still not free. There's a good chance of there being workloads that just never benefit from it. People running those workloads might be quite glad we added the ability to switch it off. It might be worth running a series of benchmarks to test where the worst case performance hit is with partition pruning. We'd need some fast to execute query that has items in the WHERE clause, but none that match the partition key. It should be easy to test the overhead of this now that the GUC is committed. Perhaps if we're unable to measure the performance drop then the GUC is not worth it, but if we can, then perhaps it is, as it will help speed up someone's workload. I'll try to do this today. I imagine something like: SELECT * FROM parttable WHERE non_part_key_but_indexed_col IN() might be the best bet. Another reason to have the GUC is in case some bug is discovered in the pruning code. Being able to disable it could be useful until we can release a minor version containing a fix. From my time reviewing the faster partition pruning code, I very much am aware that it's not simple code, so it would not surprise me if we find a few bugs in it down the track. The problem with this reason is that it carries less weight every day that passes with no bug discovered. If no bug is found in 10 years then we'll likely wonder why we bothered doing it for this reason. Lack of any sort of crystal ball makes it hard to know what to do here, so let's focus on the performance reason first. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Tue, Apr 24, 2018 at 5:59 PM, Alvaro Herrera wrote: > Constraint > exclusion was pretty easy to get wrong, hence the need for a separate > section, and I suppose the new partition pruning may be prey to the same > problems, so it seems worth to document them specially. But not sure > about the others, if they are mostly debugging tools. Weighing in here late, but I have a hard time understanding why we want a GUC to control partition pruning at all. With constraint exclusion, the issue is whether you want to spend planner cycles to try to deduce things using CHECK constraints when, quite possibly, your CHECK constraints are unrelated to table inheritance and thus won't help. But seems extremely unlikely that the same thing would happen with partition pruning. Unlike your CHECK constraints, your partition bounds are, by definition, potentially useful for pruning. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2018/04/26 16:29, David Rowley wrote: > On 25 April 2018 at 09:59, Alvaro Herrera wrote: >> Amit Langote wrote: >>> Although the config.sgml coverage of the new capabilities seems pretty >>> good, some may find their being mentioned in 5.10 Table Partitioning >>> helpful. Or if we don't want to hijack 5.10.4, maybe create a 5.10.5. >> >> Can you (or someone) describe what would that section contain? > > I've drafted and attached a patch of how I think this should look. > Likely it will need some tweaking, but it is probably a good starting > point for discussion. Thanks David for drafting this. I see that you have not included the description of other partitioning-based optimizations such partitionwise plans that the planner can now consider. I guess that's fine though, as Alvaro also seemed a bit doubtful about the value of providing the description of those optimizations in this part of the documentation (that is, in 5.10 Table Partitioning). About the patch in general, it seems like the newly added documentation talks about "Partition Pruning" as something that *replaces* constraint exclusion. But, I think "Partition Pruning" is not the thing that replaces constraint exclusion. We used to do partition pruning even before and used constraint exclusion as the algorithm. What's new is the algorithm that we now use to perform partition pruning for declaratively partitioned tables. Also, the characteristics of the new algorithm are such that it can now be used in more situations, thus making it more useful than the earlier method of partition pruning, so that new features like runtime pruning could be realized. I like that the patch adds various details about the new pruning features, but think that the wording and the flow could be improved a bit. What do you think? Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On 25 April 2018 at 09:59, Alvaro Herrera wrote: > Amit Langote wrote: >> Although the config.sgml coverage of the new capabilities seems pretty >> good, some may find their being mentioned in 5.10 Table Partitioning >> helpful. Or if we don't want to hijack 5.10.4, maybe create a 5.10.5. > > Can you (or someone) describe what would that section contain? I've drafted and attached a patch of how I think this should look. Likely it will need some tweaking, but it is probably a good starting point for discussion. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services further_enable_partition_pruning_doc_updates.patch Description: Binary data
Re: Should we add GUCs to allow partition pruning to be disabled?
On 24 April 2018 at 09:10, Alvaro Herrera wrote: > I just pushed David patch, with some pretty minor changes. I hope not > to have broken anything. Thanks for pushing and thanks Amit for reviewing. The only thing that stands out in the actual commit is: +executor to remove (ignore) partitions during query execution. The I had originally written: +executor to remove or ignore partitions during query execution. The The reason I was using "remove or ignore" was that partitions pruned during init plan are effectively "removed" from the plan, whereas partitions pruned during the running of the planner are just "ignored". It's minor details but I thought I'd better point it out. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
Amit Langote wrote: > On 2018/04/24 6:10, Alvaro Herrera wrote: > > Not really sure how best to handle that one. For starters, I think it need > > to stop mentioning the GUC name in the title; > > Hmm, "Constraint Exclusion" that's used in the title is a concept, not a > GUC, although pretty close. Yeah, I meant that if we want that section to cover the general concept of partition pruning, with either technique, better not use the words "constraint exclusion" in the title. > > maybe rename it to > > "Partition Pruning" or some such, and then in the text explain that > > sometimes the enable_partition_pruning param is used in one case and > > constraint_exclusion in the other, and approximately what effects they > > have. I don't think it's worth going into too much detail on exactly > > how they differ, but then I'm not 100% sure of that either. > > Just a thought -- How about making 5.10.4 cover partitioning based > optimizations in general? I see that a number of partitioning-based > optimizations have been developed in this release cycle, but I only see > various enable_partition* GUCs listed in config.sgml and not much else. I think we should not rely on the config.sgml blurbs exclusively; some narrative is always welcome -- except that for planner enable_* settings I'm not sure we really need all that much text after all. Constraint exclusion was pretty easy to get wrong, hence the need for a separate section, and I suppose the new partition pruning may be prey to the same problems, so it seems worth to document them specially. But not sure about the others, if they are mostly debugging tools. > Although the config.sgml coverage of the new capabilities seems pretty > good, some may find their being mentioned in 5.10 Table Partitioning > helpful. Or if we don't want to hijack 5.10.4, maybe create a 5.10.5. Can you (or someone) describe what would that section contain? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
Jim Finnerty wrote: > The patch looks good to me, David. > Thanks for checking! It's already pushed. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2018/04/24 6:10, Alvaro Herrera wrote: >> BTW, while we're at it, would it also be a good idea to consider the patch >> you had proposed, which I then posted an updated version of, to adjust the >> documentation in ddl.sgml (in the section 5.10. Table Partitioning) >> regarding the relationship between constraint exclusion and declarative >> partitioning? > > I looked at this one. That patch has two hunks. I applied a change > where the first hunk is, to replace constraint_exclusion with the new > GUC -- seemed easy enough. Looks good. > However, the second hunk is on "5.10.4. > Partitioning and Constraint Exclusion" which needs major editing. Reading 5.10.4 again, I tend to agree with this. > Not really sure how best to handle that one. For starters, I think it need > to stop mentioning the GUC name in the title; Hmm, "Constraint Exclusion" that's used in the title is a concept, not a GUC, although pretty close. > maybe rename it to > "Partition Pruning" or some such, and then in the text explain that > sometimes the enable_partition_pruning param is used in one case and > constraint_exclusion in the other, and approximately what effects they > have. I don't think it's worth going into too much detail on exactly > how they differ, but then I'm not 100% sure of that either. Just a thought -- How about making 5.10.4 cover partitioning based optimizations in general? I see that a number of partitioning-based optimizations have been developed in this release cycle, but I only see various enable_partition* GUCs listed in config.sgml and not much else. Although the config.sgml coverage of the new capabilities seems pretty good, some may find their being mentioned in 5.10 Table Partitioning helpful. Or if we don't want to hijack 5.10.4, maybe create a 5.10.5. Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
Hi, I just pushed David patch, with some pretty minor changes. I hope not to have broken anything. Amit Langote wrote: > Your proposed changes to inheritance_planner() look fine to me. In the > comment added by the patch in relation_excluded_by_constraints(): > > + /* > + * When constraint_exclusion is set to 'partition' we only handle > + * OTHER_MEMBER_RELs, or BASERELs in cases where the result target is an > + * inheritance parent or a partitioned table. > + */ > > Just to clarify this a bit, would it be a good idea to be specific by > appending " (see inheritance_planner() where this is determined)" or some > such to this sentence? I didn't think that was really required. > BTW, while we're at it, would it also be a good idea to consider the patch > you had proposed, which I then posted an updated version of, to adjust the > documentation in ddl.sgml (in the section 5.10. Table Partitioning) > regarding the relationship between constraint exclusion and declarative > partitioning? I looked at this one. That patch has two hunks. I applied a change where the first hunk is, to replace constraint_exclusion with the new GUC -- seemed easy enough. However, the second hunk is on "5.10.4. Partitioning and Constraint Exclusion" which needs major editing. Not really sure how best to handle that one. For starters, I think it need to stop mentioning the GUC name in the title; maybe rename it to "Partition Pruning" or some such, and then in the text explain that sometimes the enable_partition_pruning param is used in one case and constraint_exclusion in the other, and approximately what effects they have. I don't think it's worth going into too much detail on exactly how they differ, but then I'm not 100% sure of that either. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
Hi David. On 2018/04/21 14:09, David Rowley wrote: > On 20 April 2018 at 20:51, Amit Langote wrote: >> set constraint_exclusion to off; >> >> -- not ok! > > It needed a bit more effort than I put in the first time around to > make this work properly. constraint_exclusion = 'off' becomes a bit of > a special case for partitioned tables now. To make this work I had to > get rid of hasInheritedTarget and make a new enum that tracks if we're > inheritance planning for an inheritance parent or a partitioned table. > We can't simply only set hasInheritedTarget to true when planning with > inheritance parents as constraint_exclusion = 'partition' must still > know that we're planning using the inheritance planner. > > v2 patch attached. Thanks for the updated patch. Your proposed changes to inheritance_planner() look fine to me. In the comment added by the patch in relation_excluded_by_constraints(): + /* + * When constraint_exclusion is set to 'partition' we only handle + * OTHER_MEMBER_RELs, or BASERELs in cases where the result target is an + * inheritance parent or a partitioned table. + */ Just to clarify this a bit, would it be a good idea to be specific by appending " (see inheritance_planner() where this is determined)" or some such to this sentence? BTW, while we're at it, would it also be a good idea to consider the patch you had proposed, which I then posted an updated version of, to adjust the documentation in ddl.sgml (in the section 5.10. Table Partitioning) regarding the relationship between constraint exclusion and declarative partitioning? https://www.postgresql.org/message-id/c2838545-0e77-3c08-cd14-1c3bbf9eb62d%40lab.ntt.co.jp Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
Thank you both of you for looking at this. On 21 April 2018 at 06:28, Alvaro Herrera wrote: > + {"enable_partition_pruning", PGC_USERSET, QUERY_TUNING_METHOD, > + gettext_noop("Enables the planner's ability to remove > non-required partitions from the query plan."), > + NULL > + }, > + &enable_partition_pruning, > + true, > + NULL, NULL, NULL > + }, > > I would make the short description shorter, and use the long description > to elaborate. So gettext_noop("Enable plan-time and run-time partition > pruning.") > followed by something like > > gettext_noop("Allows the query planner and executor to compare partition > bounds to conditions in the query, and determine which partitions {can be > skipped | must be scanned} ...") I've taken a slight variation of this, but instead of ", and" I used "to" and went with the "must be scanned" option. select * from pg_settings where name like 'enable%'; does show that this is the only enable_* GUC to have a long description, but perhaps that does not matter. On 20 April 2018 at 20:51, Amit Langote wrote: > set constraint_exclusion to off; > > -- not ok! It needed a bit more effort than I put in the first time around to make this work properly. constraint_exclusion = 'off' becomes a bit of a special case for partitioned tables now. To make this work I had to get rid of hasInheritedTarget and make a new enum that tracks if we're inheritance planning for an inheritance parent or a partitioned table. We can't simply only set hasInheritedTarget to true when planning with inheritance parents as constraint_exclusion = 'partition' must still know that we're planning using the inheritance planner. v2 patch attached. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services v2-0001-Add-GUC-to-allow-partition-pruning-to-be-disabled.patch Description: Binary data
Re: Should we add GUCs to allow partition pruning to be disabled?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index fa92ce2e68..c51a9270e4 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -951,6 +951,15 @@ static struct config_bool ConfigureNamesBool[] = true, NULL, NULL, NULL }, + { + {"enable_partition_pruning", PGC_USERSET, QUERY_TUNING_METHOD, + gettext_noop("Enables the planner's ability to remove non-required partitions from the query plan."), + NULL + }, + &enable_partition_pruning, + true, + NULL, NULL, NULL + }, I would make the short description shorter, and use the long description to elaborate. So gettext_noop("Enable plan-time and run-time partition pruning.") followed by something like gettext_noop("Allows the query planner and executor to compare partition bounds to conditions in the query, and determine which partitions {can be skipped | must be scanned} ...") (Not wedded to those particular phrasings.) -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
Amit Langote wrote: > Sorry, I should have said what I said after quoting only the last sentence > of what you had said. That is, I want to the new GUC to be the only > determiner of whether the pruning occurs or not for partitioned tables. > To implement that behavior, it will have to override the setting of > constraint_exclusion (the parameter) in *some* cases, because some > commands still rely on constraint exclusion (the algorithm) as the > underlying pruning mechanism. I agree -- it will make more sense now, and will continue to make sense later when we remove usage of constraint exclusion for upd/del, to make it work as you suggest: * if the table is partitioned, do constraint exclusion based on enable_partition_prune=on rather than constraint_exclusion=partition. This will only affect upd/del, because the select queries would be affected by the enable_partition_prune anyway since constraint_exclusion does not apply. * If the table is using regular inheritance, continue to use the original behavior. > Now, the "override the setting of constraint_exclusion" implementation > may not be the most popular choice in the end. I guess there are different ways to implement it. Supposedly this is going to disappear in pg12, so I don't think it's a big deal. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2018/04/20 17:51, Amit Langote wrote: > On 2018/04/20 14:47, David Rowley wrote: >> On 20 April 2018 at 14:07, Amit Langote >> wrote: >>> To clarify: if we're going to add a new parameter *for partitioned tables* >>> to configure whether or not pruning occurs, even if UPDATE and DELETE now >>> rely on constraint exclusion for pruning, we should ignore the setting of >>> constraint_exclusion the configuration parameter. For UPDATE and DELETE, >>> if enable_partition_pruning is on, we proceed to prune using constraint >>> exclusion (because that's the only method available now), irrespective of >>> the setting of constraint_exclusion. >>> >>> So to users, enable_partition_pruning should be the only way to configure >>> whether or not pruning occurs. >> >> I hope the attached implements what is being discussed here. >> >> Please test it to ensure it behaves as you'd expect. >> >> I was a little unsure if the new GUCs declaration should live in >> costsize.c or not since it really has no effect on plan costs, but in >> the end, I stuck it there anyway so that it can be with its friends. > > The patch looks good except one thing, OK, I forgot to comment on a couple of minor issues. + + enable_partition_pruning (boolean) + +enable_partition_pruning configuration parameter + + + + +Enables or disables the query planner's ability to eliminate a +partitioned table's subpartitions from query plans. Why subpartitions? Maybe, just "partitions" is fine. + This also +controls the planner's ability to generate query plans which allow the +query executor to remove or ignoring partitions during query Here: ignoring -> ignore Also, maybe add the GUC to postgresql.conf.sample. Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
Hi David. Thanks for writing the patch. On 2018/04/20 14:47, David Rowley wrote: > On 20 April 2018 at 14:07, Amit Langote wrote: >> To clarify: if we're going to add a new parameter *for partitioned tables* >> to configure whether or not pruning occurs, even if UPDATE and DELETE now >> rely on constraint exclusion for pruning, we should ignore the setting of >> constraint_exclusion the configuration parameter. For UPDATE and DELETE, >> if enable_partition_pruning is on, we proceed to prune using constraint >> exclusion (because that's the only method available now), irrespective of >> the setting of constraint_exclusion. >> >> So to users, enable_partition_pruning should be the only way to configure >> whether or not pruning occurs. > > I hope the attached implements what is being discussed here. > > Please test it to ensure it behaves as you'd expect. > > I was a little unsure if the new GUCs declaration should live in > costsize.c or not since it really has no effect on plan costs, but in > the end, I stuck it there anyway so that it can be with its friends. The patch looks good except one thing, which I was trying to emphasize shouldn't be the behavior. drop table p; create table p (a int) partition by list (a); create table p1 partition of p for values in (1); create table p2 partition of p for values in (2); set enable_partition_pruning to off; -- ok explain select * from p where a = 1; QUERY PLAN -- Append (cost=0.00..83.88 rows=26 width=4) -> Seq Scan on p1 (cost=0.00..41.88 rows=13 width=4) Filter: (a = 1) -> Seq Scan on p2 (cost=0.00..41.88 rows=13 width=4) Filter: (a = 1) (5 rows) reset enable_partition_pruning; -- ok explain select * from p where a = 1; QUERY PLAN -- Append (cost=0.00..41.94 rows=13 width=4) -> Seq Scan on p1 (cost=0.00..41.88 rows=13 width=4) Filter: (a = 1) (3 rows) set enable_partition_pruning to off; -- ok explain update p set a = 2 where a = 1; QUERY PLAN --- Update on p (cost=0.00..83.75 rows=26 width=10) Update on p1 Update on p2 -> Seq Scan on p1 (cost=0.00..41.88 rows=13 width=10) Filter: (a = 1) -> Seq Scan on p2 (cost=0.00..41.88 rows=13 width=10) Filter: (a = 1) (7 rows) reset enable_partition_pruning; -- ok explain update p set a = 2 where a = 1; QUERY PLAN --- Update on p (cost=0.00..41.88 rows=13 width=10) Update on p1 -> Seq Scan on p1 (cost=0.00..41.88 rows=13 width=10) Filter: (a = 1) (4 rows) set constraint_exclusion to off; -- not ok! explain update p set a = 2 where a = 1; QUERY PLAN --- Update on p (cost=0.00..83.75 rows=26 width=10) Update on p1 Update on p2 -> Seq Scan on p1 (cost=0.00..41.88 rows=13 width=10) Filter: (a = 1) -> Seq Scan on p2 (cost=0.00..41.88 rows=13 width=10) Filter: (a = 1) (7 rows) I think we should teach relation_excluded_by_constraints() to forge ahead based on the value of enable_partition_pruning, ignoring whatever constraint_exclusion has been set to. What do you think of doing that sort of thing? Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2018/04/20 15:00, Ashutosh Bapat wrote: > On Fri, Apr 20, 2018 at 7:37 AM, Amit Langote wrote: >> On 2018/04/19 21:50, Ashutosh Bapat wrote: >>> There's no point in confusing users >>> with by adding dependencies between these two GUCs. >> >> That's exactly what I'm trying to propose. > > Not really. By pruning based on the partition bounds I didn't mean > constraint exclusion working on partition bound based constraints. Sorry, I should have said what I said after quoting only the last sentence of what you had said. That is, I want to the new GUC to be the only determiner of whether the pruning occurs or not for partitioned tables. To implement that behavior, it will have to override the setting of constraint_exclusion (the parameter) in *some* cases, because some commands still rely on constraint exclusion (the algorithm) as the underlying pruning mechanism. Now, the "override the setting of constraint_exclusion" implementation may not be the most popular choice in the end. Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On Fri, Apr 20, 2018 at 7:37 AM, Amit Langote wrote: > On 2018/04/19 21:50, Ashutosh Bapat wrote: >> On Thu, Apr 19, 2018 at 5:02 PM, Amit Langote >>> I can imagine having a enable_partition_pruning which defaults to true, if >>> only to avoid the performance overhead of pruning code when a user knows >>> for sure that it won't help for some queries. Although, I'm a bit dubious >>> why they'd write such queries if they're using partitioning in the first >>> place. >>> >>> Also, I'd think that enable_partition_pruning set to false means pruning >>> doesn't occur at all, not even using constraint exclusion. That is, >>> behavior equivalent of constraint_exclusion < partition (that is, off/on). >>> >>> Also, if we do have such a GUC, it should apply to all command types, >>> including UPDATE and DELETE which don't yet invoke the new pruning code, >>> from the start. So, if enable_partition_pruning is false, we won't load >>> the partition constraints at all, which we currently do for UPDATE and >>> DELETE so that constraint exclusion can be used for pruning. OTOH, if >>> enable_partition_pruning is on, we perform constraint exclusion -based >>> pruning for UPDATE and DELETE irrespective of the setting of >>> constraint_exclusion GUC. In other words, we completely dissociate >>> partitioned table pruning from the setting of constraint_exclusion. >> >> Isn't word "dissociate" turns the last sentence into a sentence >> contradicting everything you wrote prior to it? >> >> I think we should keep these two things separate. > > Yes, that's what I meant. > > To clarify: if we're going to add a new parameter *for partitioned tables* > to configure whether or not pruning occurs, even if UPDATE and DELETE now > rely on constraint exclusion for pruning, we should ignore the setting of > constraint_exclusion the configuration parameter. For UPDATE and DELETE, > if enable_partition_pruning is on, we proceed to prune using constraint > exclusion (because that's the only method available now), irrespective of > the setting of constraint_exclusion. > > So to users, enable_partition_pruning should be the only way to configure > whether or not pruning occurs. > > Does that make sense? > > It seems like talking about the finer implementation details is making > this discussion a bit confusing. > >> enable_partition_pruning affects the partition pruning based on the >> partition bounds and that currently does not work for UPDATE/DELETE. >> When it does work in those case, we might think of not loading >> partition bound based constraints. constraint_exclusion affects >> whether constraints can be used to exclude a relation (with partition >> option affecting the child tables). Once we stop loading partition >> bound based constraints, constraint exclusion would stop pruning >> partitions based on the bounds. There's no point in confusing users >> with by adding dependencies between these two GUCs. > > That's exactly what I'm trying to propose. Not really. By pruning based on the partition bounds I didn't mean constraint exclusion working on partition bound based constraints. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: Should we add GUCs to allow partition pruning to be disabled?
On 20 April 2018 at 14:07, Amit Langote wrote: > To clarify: if we're going to add a new parameter *for partitioned tables* > to configure whether or not pruning occurs, even if UPDATE and DELETE now > rely on constraint exclusion for pruning, we should ignore the setting of > constraint_exclusion the configuration parameter. For UPDATE and DELETE, > if enable_partition_pruning is on, we proceed to prune using constraint > exclusion (because that's the only method available now), irrespective of > the setting of constraint_exclusion. > > So to users, enable_partition_pruning should be the only way to configure > whether or not pruning occurs. I hope the attached implements what is being discussed here. Please test it to ensure it behaves as you'd expect. I was a little unsure if the new GUCs declaration should live in costsize.c or not since it really has no effect on plan costs, but in the end, I stuck it there anyway so that it can be with its friends. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services 0001-Add-GUC-to-allow-partition-pruning-to-be-disabled.patch Description: Binary data
Re: Should we add GUCs to allow partition pruning to be disabled?
On 20 April 2018 at 14:33, Amit Langote wrote: > On 2018/04/20 11:18, David Rowley wrote: >> 4. Replace test doing (constraint_exclusion == >> CONSTRAINT_EXCLUSION_PARTITION) with (enable_partition_pruning). >> 5. Get rid of CONSTRAINT_EXCLUSION_PARTITION. > > About 4 & 5: > > Perhaps we should leave constraint_exclusion = partition alone because > there might be users who want to continue using the old inheritance method > to set up partitioning for whatever reason? Yeah, for some reason that keeps falling out my brain. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
Hi. On 2018/04/20 11:18, David Rowley wrote: > On 20 April 2018 at 14:07, Amit Langote wrote: >> To clarify: if we're going to add a new parameter *for partitioned tables* >> to configure whether or not pruning occurs, even if UPDATE and DELETE now >> rely on constraint exclusion for pruning, we should ignore the setting of >> constraint_exclusion the configuration parameter. For UPDATE and DELETE, >> if enable_partition_pruning is on, we proceed to prune using constraint >> exclusion (because that's the only method available now), irrespective of >> the setting of constraint_exclusion. >> >> So to users, enable_partition_pruning should be the only way to configure >> whether or not pruning occurs. >> >> Does that make sense? > > So to summarise my understanding (plus filling in the blanks): > > 1. Add single GUC named enable_partition_pruning, default = on. > 2. Check this setting in set_append_rel_size to only perform > prune_append_rel_partitions when enable_partition_pruning is true. > 3. Add code in create_append_plan to only call > make_partition_pruneinfo when enable_partition_pruning is true. > 4. Replace test doing (constraint_exclusion == > CONSTRAINT_EXCLUSION_PARTITION) with (enable_partition_pruning). > 5. Get rid of CONSTRAINT_EXCLUSION_PARTITION. > > I don't think you mentioned 5. but if I understand you correctly then > it would leave that option doing nothing. So we should remove it. About 4 & 5: Perhaps we should leave constraint_exclusion = partition alone because there might be users who want to continue using the old inheritance method to set up partitioning for whatever reason? >> BTW, should this thread be listed somewhere on the open items page? > > Yeah. we need to decide this before PG11 is let loose. I will add it. OK, thanks. Regards, Amit [1] https://www.postgresql.org/docs/devel/static/ddl-partitioning.html#DDL-PARTITIONING-IMPLEMENTATION-INHERITANCE
Re: Should we add GUCs to allow partition pruning to be disabled?
On 20 April 2018 at 14:07, Amit Langote wrote: > To clarify: if we're going to add a new parameter *for partitioned tables* > to configure whether or not pruning occurs, even if UPDATE and DELETE now > rely on constraint exclusion for pruning, we should ignore the setting of > constraint_exclusion the configuration parameter. For UPDATE and DELETE, > if enable_partition_pruning is on, we proceed to prune using constraint > exclusion (because that's the only method available now), irrespective of > the setting of constraint_exclusion. > > So to users, enable_partition_pruning should be the only way to configure > whether or not pruning occurs. > > Does that make sense? So to summarise my understanding (plus filling in the blanks): 1. Add single GUC named enable_partition_pruning, default = on. 2. Check this setting in set_append_rel_size to only perform prune_append_rel_partitions when enable_partition_pruning is true. 3. Add code in create_append_plan to only call make_partition_pruneinfo when enable_partition_pruning is true. 4. Replace test doing (constraint_exclusion == CONSTRAINT_EXCLUSION_PARTITION) with (enable_partition_pruning). 5. Get rid of CONSTRAINT_EXCLUSION_PARTITION. I don't think you mentioned 5. but if I understand you correctly then it would leave that option doing nothing. So we should remove it. > BTW, should this thread be listed somewhere on the open items page? Yeah. we need to decide this before PG11 is let loose. I will add it. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2018/04/19 21:50, Ashutosh Bapat wrote: > On Thu, Apr 19, 2018 at 5:02 PM, Amit Langote >> I can imagine having a enable_partition_pruning which defaults to true, if >> only to avoid the performance overhead of pruning code when a user knows >> for sure that it won't help for some queries. Although, I'm a bit dubious >> why they'd write such queries if they're using partitioning in the first >> place. >> >> Also, I'd think that enable_partition_pruning set to false means pruning >> doesn't occur at all, not even using constraint exclusion. That is, >> behavior equivalent of constraint_exclusion < partition (that is, off/on). >> >> Also, if we do have such a GUC, it should apply to all command types, >> including UPDATE and DELETE which don't yet invoke the new pruning code, >> from the start. So, if enable_partition_pruning is false, we won't load >> the partition constraints at all, which we currently do for UPDATE and >> DELETE so that constraint exclusion can be used for pruning. OTOH, if >> enable_partition_pruning is on, we perform constraint exclusion -based >> pruning for UPDATE and DELETE irrespective of the setting of >> constraint_exclusion GUC. In other words, we completely dissociate >> partitioned table pruning from the setting of constraint_exclusion. > > Isn't word "dissociate" turns the last sentence into a sentence > contradicting everything you wrote prior to it? > > I think we should keep these two things separate. Yes, that's what I meant. To clarify: if we're going to add a new parameter *for partitioned tables* to configure whether or not pruning occurs, even if UPDATE and DELETE now rely on constraint exclusion for pruning, we should ignore the setting of constraint_exclusion the configuration parameter. For UPDATE and DELETE, if enable_partition_pruning is on, we proceed to prune using constraint exclusion (because that's the only method available now), irrespective of the setting of constraint_exclusion. So to users, enable_partition_pruning should be the only way to configure whether or not pruning occurs. Does that make sense? It seems like talking about the finer implementation details is making this discussion a bit confusing. > enable_partition_pruning affects the partition pruning based on the > partition bounds and that currently does not work for UPDATE/DELETE. > When it does work in those case, we might think of not loading > partition bound based constraints. constraint_exclusion affects > whether constraints can be used to exclude a relation (with partition > option affecting the child tables). Once we stop loading partition > bound based constraints, constraint exclusion would stop pruning > partitions based on the bounds. There's no point in confusing users > with by adding dependencies between these two GUCs. That's exactly what I'm trying to propose. I don't want any new GUC to work only for SELECT now and UPDATE/DELETE only later when we teach the code path handling the latter to use the new pruning implementation. In other words, I don't want a situation where two parameters control pruning for partitioned tables in PG 11. BTW, should this thread be listed somewhere on the open items page? Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, Apr 19, 2018 at 5:02 PM, Amit Langote wrote: > On 2018/04/19 13:32, Ashutosh Bapat wrote: >> On Thu, Apr 19, 2018 at 2:54 AM, David Rowley >>> The more I think about this the more undecided I am as to whether we >>> need to add a GUC for this at all, so I'm keen to hear more people >>> voice their opinion about this. If bugs are the only true reason to >>> add it, then the need for the GUC should diminish every day that >>> nobody reports any bugs. >>> >> >> Apart from bugs, I think, this GUC can be used to avoid extra planning >> time/memory/CPU incurred in pruning, when users know for sure that >> pruning is not going to happen e.g. the cases like no qual on >> partition key or no equality qual on hash partition key etc. Do we >> know how much planning time can be saved this way? > > I can imagine having a enable_partition_pruning which defaults to true, if > only to avoid the performance overhead of pruning code when a user knows > for sure that it won't help for some queries. Although, I'm a bit dubious > why they'd write such queries if they're using partitioning in the first > place. > > Also, I'd think that enable_partition_pruning set to false means pruning > doesn't occur at all, not even using constraint exclusion. That is, > behavior equivalent of constraint_exclusion < partition (that is, off/on). > > Also, if we do have such a GUC, it should apply to all command types, > including UPDATE and DELETE which don't yet invoke the new pruning code, > from the start. So, if enable_partition_pruning is false, we won't load > the partition constraints at all, which we currently do for UPDATE and > DELETE so that constraint exclusion can be used for pruning. OTOH, if > enable_partition_pruning is on, we perform constraint exclusion -based > pruning for UPDATE and DELETE irrespective of the setting of > constraint_exclusion GUC. In other words, we completely dissociate > partitioned table pruning from the setting of constraint_exclusion. Isn't word "dissociate" turns the last sentence into a sentence contradicting everything you wrote prior to it? I think we should keep these two things separate. enable_partition_pruning affects the partition pruning based on the partition bounds and that currently does not work for UPDATE/DELETE. When it does work in those case, we might think of not loading partition bound based constraints. constraint_exclusion affects whether constraints can be used to exclude a relation (with partition option affecting the child tables). Once we stop loading partition bound based constraints, constraint exclusion would stop pruning partitions based on the bounds. There's no point in confusing users with by adding dependencies between these two GUCs. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: Should we add GUCs to allow partition pruning to be disabled?
On 2018/04/19 13:32, Ashutosh Bapat wrote: > On Thu, Apr 19, 2018 at 2:54 AM, David Rowley >> The more I think about this the more undecided I am as to whether we >> need to add a GUC for this at all, so I'm keen to hear more people >> voice their opinion about this. If bugs are the only true reason to >> add it, then the need for the GUC should diminish every day that >> nobody reports any bugs. >> > > Apart from bugs, I think, this GUC can be used to avoid extra planning > time/memory/CPU incurred in pruning, when users know for sure that > pruning is not going to happen e.g. the cases like no qual on > partition key or no equality qual on hash partition key etc. Do we > know how much planning time can be saved this way? I can imagine having a enable_partition_pruning which defaults to true, if only to avoid the performance overhead of pruning code when a user knows for sure that it won't help for some queries. Although, I'm a bit dubious why they'd write such queries if they're using partitioning in the first place. Also, I'd think that enable_partition_pruning set to false means pruning doesn't occur at all, not even using constraint exclusion. That is, behavior equivalent of constraint_exclusion < partition (that is, off/on). Also, if we do have such a GUC, it should apply to all command types, including UPDATE and DELETE which don't yet invoke the new pruning code, from the start. So, if enable_partition_pruning is false, we won't load the partition constraints at all, which we currently do for UPDATE and DELETE so that constraint exclusion can be used for pruning. OTOH, if enable_partition_pruning is on, we perform constraint exclusion -based pruning for UPDATE and DELETE irrespective of the setting of constraint_exclusion GUC. In other words, we completely dissociate partitioned table pruning from the setting of constraint_exclusion. Now as Justin pointed out upthread, the new GUC might cause confusion for users who are long accustomed to using constraint_exclusion for this, but I'm not sure anybody tries to change its setting a lot. The new GUC defaulting to pruning=on would be useful for occasional debugging, as we all seem to more or less agree. Thanks, Amit
Re: Should we add GUCs to allow partition pruning to be disabled?
On Thu, Apr 19, 2018 at 2:54 AM, David Rowley wrote: > If we just did it at plan time then > pre-PREPAREd queries might still prune. That does not seem very > useful if it's being disabled due to the discovery of some bug. > As you have pointed out upthread, that's a problem with every enable_* GUC. After seeing a bug, users would usually re-prepare their statements with pruning turned off. So, I don't see this as a reason for introducing two GUCs. > The more I think about this the more undecided I am as to whether we > need to add a GUC for this at all, so I'm keen to hear more people > voice their opinion about this. If bugs are the only true reason to > add it, then the need for the GUC should diminish every day that > nobody reports any bugs. > Apart from bugs, I think, this GUC can be used to avoid extra planning time/memory/CPU incurred in pruning, when users know for sure that pruning is not going to happen e.g. the cases like no qual on partition key or no equality qual on hash partition key etc. Do we know how much planning time can be saved this way? -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: Should we add GUCs to allow partition pruning to be disabled?
On 18 April 2018 at 21:36, Ashutosh Bapat wrote: > On Wed, Apr 18, 2018 at 5:37 AM, David Rowley >> a) Disable run-time pruning during execution. >> b) Disable run-time pruning during planning. >> c) Both of the above. >> >> The differentiation of the above is important when you consider >> PREPAREd statements. Currently, no enable_ GUC will affect a >> pre-PREPAREd query. We might want to keep that rule despite there >> being flexibility not to, in this case. > > > If run-time pruning is disabled, why do we want to waste CPU cycles > and memory to produce plan time details? It might be useful to do so, > if there was a large chance that people prepared a statement which > could use partition pruning with run-time pruning disables but > EXECUTEd it with run-time pruning enabled. It will be less likely that > the session which prepares a plan would change the GUCs before > executing it. I have to admit, can't really imagine any valid cases were disabling this feature would be useful. Generally, enable_* properties can be used to coax the planner into producing some plan shape that it otherwise didn't due to some costing problem. I can only imagine it might be useful to disable either for testing or as a workaround for some bug that might crop up. Perhaps that's not enough reason to go and add a GUC that'll likely need to exist forever. But it probably does mean that we'd want c) so that the code is completely disabled as soon as the setting is off. If we just did it at plan time then pre-PREPAREd queries might still prune. That does not seem very useful if it's being disabled due to the discovery of some bug. The more I think about this the more undecided I am as to whether we need to add a GUC for this at all, so I'm keen to hear more people voice their opinion about this. If bugs are the only true reason to add it, then the need for the GUC should diminish every day that nobody reports any bugs. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Wed, Apr 18, 2018 at 5:37 AM, David Rowley wrote: > In PG10 the planner's partition pruning could be disabled by changing > the constraint_exclusion GUC to off. This is still the case for PG11, > but only for UPDATE and DELETE queries. There is currently no way to > disable partition pruning for SELECT. > > Should we allow this? I think GUC would be useful for debugging purposes for sure. Given that we have added this feature late in v11, there might be some bugs that will bite customers in production. It's better to provide them some way to work-around. > > 1. Add a single enable_ GUC which allows both plan-time and run-time > pruning to be disabled. I would go for this. Both of those features have common code and it will get cumbersome to carefully enable/disable them separately. > 2. Add two new enable_ GUCs, one for plan-time and one for run-time pruning. This would give more granularity but > 3. No new GUCs / Do nothing. > > Run-time pruning is a little special here too, as it's the first > feature to exist in PostgreSQL which changes the plan in the executor. > From searching through the code I see no other enable_* GUC being > referenced in the executor. So there's also questions here as to > where we'd disable run-time pruning. We could disable it in the > planner so that the plan does not include the details that the > executor needs to enable the pruning, or we could just disable it in > the executor and have the planner still form plans with these details. > This separates #1 and #2 into: > > a) Disable run-time pruning during execution. > b) Disable run-time pruning during planning. > c) Both of the above. > > The differentiation of the above is important when you consider > PREPAREd statements. Currently, no enable_ GUC will affect a > pre-PREPAREd query. We might want to keep that rule despite there > being flexibility not to, in this case. If run-time pruning is disabled, why do we want to waste CPU cycles and memory to produce plan time details? It might be useful to do so, if there was a large chance that people prepared a statement which could use partition pruning with run-time pruning disables but EXECUTEd it with run-time pruning enabled. It will be less likely that the session which prepares a plan would change the GUCs before executing it. > > For UPDATE/DELETE: > It would also be quite strange if someone disabled plan-time pruning > and still got partition pruning. So I suggest we require both > constraint_exclusion and the plan-time GUC not off for pruning to be > enabled for UPDATE/DELETE. Alternatively, we just ditch > constraint_exclusion = 'partition'. > > Personally, I'm for 2b and ditching constraint_exclusion = > 'partition'. I don't see any sense in keeping constraint_exclusion = > 'partition' if we have something else to mean the same thing. > That will still be useful for inheritance based partitioning. We might re-use constraint_exclusion = 'partition' to mean enable_partition_pruning (ok, I suggested a name as well) = true, although that's not my favourite. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: Should we add GUCs to allow partition pruning to be disabled?
On Tue, Apr 17, 2018 at 6:12 PM, David Rowley wrote: > On 18 April 2018 at 13:03, David G. Johnston > wrote: > > My initial reaction is that we need to fix the bug introduced in v10 - > > leaving constraint_exclusion working as it has historically and not > affect > > the new-as-of-10 ability to prune (maybe better termed as skip...) > > partitions known during execution to contain no qualified tuples. > > Can you explain which bug in PG10 you are talking about? Did you > perhaps mean PG11? > "In PG10 the planner's partition pruning could be disabled by changing the constraint_exclusion GUC to off. This is still the case for PG11, but only for UPDATE and DELETE queries. There is currently no way to disable partition pruning for SELECT." I read the word "currently" in your initial paragraph as meaning "currently released", hence version v10. Re-reading it now I'm understanding you meant currently to mean v11 and thus now so do I. I'm not onboard with overloading the constraint_exclusion GUC any > further to mean something it shouldn't. The PG11 partition pruning > code does not use CHECK constraints to eliminate partitions, so I see > no reason why constraint_exclusion should turn it on or off. You propose that the "This is still the case for PG11, but only for UPDATE and DELETE queries" is actually wrong and none of the query types should be impacted? Basically go with partition pruning is always on, check constraint evaluation defaults to off and can be turned on - and the current default for "constraint_exclusion" changes to 'off' and if someone tries to explicitly set it to 'partition' it fails. Add some new knobs for partitions if desired. I'd go that route in a green-field...I'm less convinced it is the best way forward from today. non-partition related exclusion is something I'm not understanding conceptually; and I don't know why one, outside of debugging system code, would want to not perform partition related exclusion. I could live with straight removal of the existing option and behave as if it was indeed set to 'partition'. David J.
Re: Should we add GUCs to allow partition pruning to be disabled?
On 18 April 2018 at 13:03, David G. Johnston wrote: > My initial reaction is that we need to fix the bug introduced in v10 - > leaving constraint_exclusion working as it has historically and not affect > the new-as-of-10 ability to prune (maybe better termed as skip...) > partitions known during execution to contain no qualified tuples. Can you explain which bug in PG10 you are talking about? Did you perhaps mean PG11? I'm not onboard with overloading the constraint_exclusion GUC any further to mean something it shouldn't. The PG11 partition pruning code does not use CHECK constraints to eliminate partitions, so I see no reason why constraint_exclusion should turn it on or off. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Should we add GUCs to allow partition pruning to be disabled?
On Tue, Apr 17, 2018 at 5:42 PM, Justin Pryzby wrote: > On Wed, Apr 18, 2018 at 12:07:18PM +1200, David Rowley wrote: > > In PG10 the planner's partition pruning could be disabled by changing > > the constraint_exclusion GUC to off. This is still the case for PG11, > > but only for UPDATE and DELETE queries. There is currently no way to > > disable partition pruning for SELECT. > > > > Should we allow this? > > > 3. No new GUCs / Do nothing. > > Maybe this is divergent from the details of the implementation; but, from a > user's perspective: why not continue to use constraint_exclusion? > > I would suggest to add zero new GUCs: > > 0. constraint_exclusion={off,partition,on,PLANNER*,EXECUTOR*} > My initial reaction is that we need to fix the bug introduced in v10 - leaving constraint_exclusion working as it has historically and not affect the new-as-of-10 ability to prune (maybe better termed as skip...) partitions known during execution to contain no qualified tuples. David J.