Re: [HACKERS] Surjective functional indexes
Stephen Frost writes: > * Tom Lane (t...@sss.pgh.pa.us) wrote: >> What I'm not willing to do is write hacks for pg_upgrade or pg_dump >> to mask cases where the option has been set on a v11 index. I judge >> that it's not worth the trouble. If someone else disagrees, they >> can do that work. > I'd love for there to be a better option beyond "just let people run the > pg_upgrade and have it fail half-way through"... Particularly after > someone's run the pg_upgrade check against the database... > If there isn't, then I'll write the code to add the check to pg_upgrade. Go for it. regards, tom lane
Re: [HACKERS] Surjective functional indexes
Greetings, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Andres Freund writes: > > On 2019-01-14 18:53:02 -0500, Tom Lane wrote: > >> But I suspect just doing the revert is already going to be painful > >> enough :-( > > > I assume you're not particularly interested in doing that? > > No, I'm willing to do it, and will do so tomorrow if there haven't > been objections. > > What I'm not willing to do is write hacks for pg_upgrade or pg_dump > to mask cases where the option has been set on a v11 index. I judge > that it's not worth the trouble. If someone else disagrees, they > can do that work. I'd love for there to be a better option beyond "just let people run the pg_upgrade and have it fail half-way through"... Particularly after someone's run the pg_upgrade check against the database... If there isn't, then I'll write the code to add the check to pg_upgrade. I'll also offer to add other such checks, if there's similar cases that people are aware of. Thanks, Stephen signature.asc Description: PGP signature
Re: [HACKERS] Surjective functional indexes
Hi, On 2019-01-14 19:04:07 -0500, Stephen Frost wrote: > As that's the case, then I guess I'm thinking we really should make > pg_upgrade complain if it finds it during the check phase. I really > don't like having a case like this where the pg_upgrade will fail from > something that we could have detected during the pre-flight check, > that's what it's for, after all. I suggest you write a separate patch for that in that case. Greetings, Andres Freund
Re: [HACKERS] Surjective functional indexes
Greetings, * Andres Freund (and...@anarazel.de) wrote: > On 2019-01-14 18:55:18 -0500, Stephen Frost wrote: > > * Andres Freund (and...@anarazel.de) wrote: > > > > Or are you suggesting that pg_dump in v12+ would throw errors if it > > > > finds that set? Or that we'll dump it, but fail to allow it into a > > > > v12+ database? What if v12 sees "recheck_on_update='false'", as a v11 > > > > pg_dump might output today? > > > > > > It'll just error out on restore (including the internal one by > > > pg_upgrade). I don't see much point in doing more, this isn't a widely > > > used option by any stretch. > > > > This.. doesn't actually make sense. The pg_upgrade will use v12+ > > pg_dump. You're saying that the v12+ pg_dump will dump out the option, > > but then the restore will fail to understand it? > > Why does that not make sense? With one exception the reloptions code in > pg_dump doesn't have knowledge of individual reloptions. So without > adding any special case code pg_dump will just continue to dump the > option. And the server will just refuse to restore it, because it > doesn't know it anymore. Ugh, yeah, I was catching up to realize that we'd have to special-case add this into pg_dump to get it avoided; most things in pg_dump don't work that way. As that's the case, then I guess I'm thinking we really should make pg_upgrade complain if it finds it during the check phase. I really don't like having a case like this where the pg_upgrade will fail from something that we could have detected during the pre-flight check, that's what it's for, after all. Thanks, Stephen signature.asc Description: PGP signature
Re: [HACKERS] Surjective functional indexes
Andres Freund writes: > On 2019-01-14 18:53:02 -0500, Tom Lane wrote: >> But I suspect just doing the revert is already going to be painful >> enough :-( > I assume you're not particularly interested in doing that? No, I'm willing to do it, and will do so tomorrow if there haven't been objections. What I'm not willing to do is write hacks for pg_upgrade or pg_dump to mask cases where the option has been set on a v11 index. I judge that it's not worth the trouble. If someone else disagrees, they can do that work. regards, tom lane
Re: [HACKERS] Surjective functional indexes
Hi, On 2019-01-14 18:55:18 -0500, Stephen Frost wrote: > * Andres Freund (and...@anarazel.de) wrote: > > > Or are you suggesting that pg_dump in v12+ would throw errors if it > > > finds that set? Or that we'll dump it, but fail to allow it into a > > > v12+ database? What if v12 sees "recheck_on_update='false'", as a v11 > > > pg_dump might output today? > > > > It'll just error out on restore (including the internal one by > > pg_upgrade). I don't see much point in doing more, this isn't a widely > > used option by any stretch. > > This.. doesn't actually make sense. The pg_upgrade will use v12+ > pg_dump. You're saying that the v12+ pg_dump will dump out the option, > but then the restore will fail to understand it? Why does that not make sense? With one exception the reloptions code in pg_dump doesn't have knowledge of individual reloptions. So without adding any special case code pg_dump will just continue to dump the option. And the server will just refuse to restore it, because it doesn't know it anymore. Greetings, Andres Freund
Re: [HACKERS] Surjective functional indexes
Greetings, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > * Tom Lane (t...@sss.pgh.pa.us) wrote: > >> After a few minutes' more thought, I think that the most attractive > >> option is to leave v11 alone and do a full revert in HEAD. In this > >> way, if anyone's attached "recheck_on_update" options to their indexes, > >> it'll continue to work^H^H^H^Hdo nothing in v11, though they won't be > >> able to migrate to v12 till they remove the options. That way we > >> aren't bound to the questionable design and naming of that storage > >> option if/when we try this again. > > > So the plan is to add a check into pg_upgrade to complain if it comes > > across any cases where recheck_on_update is set during its pre-flight > > checks..? > > It wasn't my plan particularly. I think the number of databases with > that option set is probably negligible, not least because it was > on-by-default during its short lifespan. So there really has never been > a point where someone would have had a reason to turn it on explicitly. > > Now if somebody else is excited enough to add such logic to pg_upgrade, > I wouldn't stand in their way. But I suspect just doing the revert is > already going to be painful enough :-( It seems like the thing to do would be to just ignore the option in v12+ pg_dump then, meaning that pg_dump wouldn't output it and pg_restore/v12+ wouldn't ever see it, and therefore users wouldn't get an error even if that option was used when they upgrade. I could live with that, but you seemed to be suggesting that something else would happen earlier. > > What if v12 sees "recheck_on_update='false'", as a v11 > > pg_dump might output today? > > It'll complain that that's an unknown option. Right, that's kinda what I figured. I'm not thrilled with that either, but hopefully it won't be too big a deal for users. Thanks! Stephen signature.asc Description: PGP signature
Re: [HACKERS] Surjective functional indexes
Hi, On 2019-01-14 18:53:02 -0500, Tom Lane wrote: > But I suspect just doing the revert is already going to be painful > enough :-( I assume you're not particularly interested in doing that? I'm more than happy to leave this to others, but if nobody signals interest I'll give it a go, because it'll get a lot harder after the pluggable storage work (and it'd work even less usefully afterwards, given the location in heapam.c). Greetings, Andres Freund
Re: [HACKERS] Surjective functional indexes
Greetings, * Andres Freund (and...@anarazel.de) wrote: > On 2019-01-14 18:46:18 -0500, Stephen Frost wrote: > > * Tom Lane (t...@sss.pgh.pa.us) wrote: > > > Andres Freund writes: > > > > On 2019-01-14 18:03:24 -0500, Tom Lane wrote: > > > >> Do we want to revert entirely, or leave the "recheck_on_update" option > > > >> present but nonfunctional? > > > > > > > I think it depends a bit on whether we want to revert in master or > > > > master and 11. If only master, I don't see much point in leaving the > > > > option around. If both, I think we should (need to?) leave it around in > > > > 11 only. > > > > > > After a few minutes' more thought, I think that the most attractive > > > option is to leave v11 alone and do a full revert in HEAD. In this > > > way, if anyone's attached "recheck_on_update" options to their indexes, > > > it'll continue to work^H^H^H^Hdo nothing in v11, though they won't be > > > able to migrate to v12 till they remove the options. That way we > > > aren't bound to the questionable design and naming of that storage > > > option if/when we try this again. > > > > So the plan is to add a check into pg_upgrade to complain if it comes > > across any cases where recheck_on_update is set during its pre-flight > > checks..? > > I don't think so. I could possibly see us just ignoring the option in pg_dump, but that goes against what Tom was suggesting, since users wouldn't see an error if we don't dump the option out.. > > Or are you suggesting that pg_dump in v12+ would throw errors if it > > finds that set? Or that we'll dump it, but fail to allow it into a > > v12+ database? What if v12 sees "recheck_on_update='false'", as a v11 > > pg_dump might output today? > > It'll just error out on restore (including the internal one by > pg_upgrade). I don't see much point in doing more, this isn't a widely > used option by any stretch. This.. doesn't actually make sense. The pg_upgrade will use v12+ pg_dump. You're saying that the v12+ pg_dump will dump out the option, but then the restore will fail to understand it? Thanks! Stephen signature.asc Description: PGP signature
Re: [HACKERS] Surjective functional indexes
Stephen Frost writes: > * Tom Lane (t...@sss.pgh.pa.us) wrote: >> After a few minutes' more thought, I think that the most attractive >> option is to leave v11 alone and do a full revert in HEAD. In this >> way, if anyone's attached "recheck_on_update" options to their indexes, >> it'll continue to work^H^H^H^Hdo nothing in v11, though they won't be >> able to migrate to v12 till they remove the options. That way we >> aren't bound to the questionable design and naming of that storage >> option if/when we try this again. > So the plan is to add a check into pg_upgrade to complain if it comes > across any cases where recheck_on_update is set during its pre-flight > checks..? It wasn't my plan particularly. I think the number of databases with that option set is probably negligible, not least because it was on-by-default during its short lifespan. So there really has never been a point where someone would have had a reason to turn it on explicitly. Now if somebody else is excited enough to add such logic to pg_upgrade, I wouldn't stand in their way. But I suspect just doing the revert is already going to be painful enough :-( > What if v12 sees "recheck_on_update='false'", as a v11 > pg_dump might output today? It'll complain that that's an unknown option. regards, tom lane
Re: [HACKERS] Surjective functional indexes
On 2019-01-14 18:46:18 -0500, Stephen Frost wrote: > Greetings, > > * Tom Lane (t...@sss.pgh.pa.us) wrote: > > Andres Freund writes: > > > On 2019-01-14 18:03:24 -0500, Tom Lane wrote: > > >> Do we want to revert entirely, or leave the "recheck_on_update" option > > >> present but nonfunctional? > > > > > I think it depends a bit on whether we want to revert in master or > > > master and 11. If only master, I don't see much point in leaving the > > > option around. If both, I think we should (need to?) leave it around in > > > 11 only. > > > > After a few minutes' more thought, I think that the most attractive > > option is to leave v11 alone and do a full revert in HEAD. In this > > way, if anyone's attached "recheck_on_update" options to their indexes, > > it'll continue to work^H^H^H^Hdo nothing in v11, though they won't be > > able to migrate to v12 till they remove the options. That way we > > aren't bound to the questionable design and naming of that storage > > option if/when we try this again. > > So the plan is to add a check into pg_upgrade to complain if it comes > across any cases where recheck_on_update is set during its pre-flight > checks..? I don't think so. > Or are you suggesting that pg_dump in v12+ would throw errors if it > finds that set? Or that we'll dump it, but fail to allow it into a > v12+ database? What if v12 sees "recheck_on_update='false'", as a v11 > pg_dump might output today? It'll just error out on restore (including the internal one by pg_upgrade). I don't see much point in doing more, this isn't a widely used option by any stretch. Greetings, Andres Freund
Re: [HACKERS] Surjective functional indexes
Greetings, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Andres Freund writes: > > On 2019-01-14 18:03:24 -0500, Tom Lane wrote: > >> Do we want to revert entirely, or leave the "recheck_on_update" option > >> present but nonfunctional? > > > I think it depends a bit on whether we want to revert in master or > > master and 11. If only master, I don't see much point in leaving the > > option around. If both, I think we should (need to?) leave it around in > > 11 only. > > After a few minutes' more thought, I think that the most attractive > option is to leave v11 alone and do a full revert in HEAD. In this > way, if anyone's attached "recheck_on_update" options to their indexes, > it'll continue to work^H^H^H^Hdo nothing in v11, though they won't be > able to migrate to v12 till they remove the options. That way we > aren't bound to the questionable design and naming of that storage > option if/when we try this again. So the plan is to add a check into pg_upgrade to complain if it comes across any cases where recheck_on_update is set during its pre-flight checks..? Or are you suggesting that pg_dump in v12+ would throw errors if it finds that set? Or that we'll dump it, but fail to allow it into a v12+ database? What if v12 sees "recheck_on_update='false'", as a v11 pg_dump might output today? To be clear, I'm in agreement with reverting this, just trying to think through what's going to happen and how users will be impacted. Thanks! Stephen signature.asc Description: PGP signature
Re: [HACKERS] Surjective functional indexes
Andres Freund writes: > On 2019-01-14 18:03:24 -0500, Tom Lane wrote: >> Do we want to revert entirely, or leave the "recheck_on_update" option >> present but nonfunctional? > I think it depends a bit on whether we want to revert in master or > master and 11. If only master, I don't see much point in leaving the > option around. If both, I think we should (need to?) leave it around in > 11 only. After a few minutes' more thought, I think that the most attractive option is to leave v11 alone and do a full revert in HEAD. In this way, if anyone's attached "recheck_on_update" options to their indexes, it'll continue to work^H^H^H^Hdo nothing in v11, though they won't be able to migrate to v12 till they remove the options. That way we aren't bound to the questionable design and naming of that storage option if/when we try this again. A revert in v11 would have ABI compatibility issues to think about, and would also likely be a bunch more work on top of what we'll have to do for HEAD, so leaving it as-is seems like a good idea. regards, tom lane
Re: [HACKERS] Surjective functional indexes
Hi, On 2019-01-14 18:03:24 -0500, Tom Lane wrote: > Andres Freund writes: > > On 2018-11-07 14:25:54 -0500, Tom Lane wrote: > >> In short, it seems likely to me that large parts of this patch need to > >> be pulled out, rewritten, and then put back in different places than > >> they are today. I'm not sure if a complete revert is the best next > >> step, or if we can make progress without that. > > > We've not really made progress on this. I continue to think that we > > ought to revert this feature, and then work to re-merge it an > > architecturally correct way afterwards. Other opinions? > > Given the lack of progress, I'd agree with a revert. It's probably > already going to be a bit painful to undo due to subsequent changes, > so we shouldn't wait too much longer. Yea, the reason I re-encountered this is cleaning up the pluggable storage patch. Which certainly would make this revert harder... > Do we want to revert entirely, or leave the "recheck_on_update" option > present but nonfunctional? I think it depends a bit on whether we want to revert in master or master and 11. If only master, I don't see much point in leaving the option around. If both, I think we should (need to?) leave it around in 11 only. Greetings, Andres Freund
Re: [HACKERS] Surjective functional indexes
Andres Freund writes: > On 2018-11-07 14:25:54 -0500, Tom Lane wrote: >> In short, it seems likely to me that large parts of this patch need to >> be pulled out, rewritten, and then put back in different places than >> they are today. I'm not sure if a complete revert is the best next >> step, or if we can make progress without that. > We've not really made progress on this. I continue to think that we > ought to revert this feature, and then work to re-merge it an > architecturally correct way afterwards. Other opinions? Given the lack of progress, I'd agree with a revert. It's probably already going to be a bit painful to undo due to subsequent changes, so we shouldn't wait too much longer. Do we want to revert entirely, or leave the "recheck_on_update" option present but nonfunctional? regards, tom lane
Re: [HACKERS] Surjective functional indexes
On 2018-11-07 14:25:54 -0500, Tom Lane wrote: > Robert Haas writes: > > On Fri, May 11, 2018 at 11:01 AM, Simon Riggs wrote: > I have no problem if you want to replace this with an even better > design in a later release. > > >>> Meh. The author / committer should get a patch into the right shape > > >> They have done, at length. Claiming otherwise is just trash talk. > > > Some people might call it "honest disagreement". > > So where we are today is that this patch was lobotomized by commits > 77366d90f/d06fe6ce2 as a result of this bug report: > > https://postgr.es/m/20181106185255.776mstcyehnc63ty@alvherre.pgsql > > We need to move forward, either by undertaking a more extensive > clean-out, or by finding a path to a version of the code that is > satisfactory. I wanted to enumerate my concerns while yesterday's > events are still fresh in mind. (Andres or Robert might have more.) > > * I do not understand why this feature is on-by-default in the first > place. It can only be a win for expression indexes that are many-to-one > mappings; for indexes that are one-to-one or few-to-one, it's a pretty > big loss. I see no reason to assume that most expression indexes fall > into the first category. I suggest that the design ought to be to use > this optimization only for indexes for which the user has explicitly > enabled recheck_on_update. That would allow getting rid of the cost check > in IsProjectionFunctionalIndex, about which we'd otherwise have to have > an additional fight: I do not like its ad-hoc-ness, nor the modularity > violation (and potential circularity) involved in having the relcache call > cost_qual_eval. > > * Having heapam.c calling the executor also seems like a > modularity/layering violation that will bite us in the future. > > * The implementation seems incredibly inefficient. ProjIndexIsUnchanged > is doing creation/destruction of an EState, index_open/index_close > (including acquisition of a lock we should already have), BuildIndexInfo, > and expression compilation, all of which are completely redundant with > work done elsewhere in the executor. And it's doing that *over again for > every tuple*, which totally aside from wasted cycles probably means there > are large query-lifespan memory leaks in an UPDATE affecting many rows. > I think a minimum expectation should be that one-time work is done only > one time; but ideally none of those things would happen at all because > we could share work with the regular code path. Perhaps it's too much > to hope that we could also avoid duplicate computation of the new index > expression values, but as long as I'm listing complaints ... > > * As noted in the bug thread, the implementation of the new reloption > is broken because (a) it fails to work for some built-in index AMs > and (b) by design, it can't work for add-on AMs. I agree with Andres > that that's not acceptable. > > * This seems like bad data structure design: > > + Bitmapset *rd_projidx; /* Oids of projection indexes */ > > That comment is a lie, although IMO it'd be better if it were true; > a list of target index OIDs would be a far better design here. The use > of rd_projidx as a set of indexes into the relation's indexlist is > inefficient and overcomplicated, plus it creates an unnecessary and not > very safe (even if it were documented) coupling between rd_indexlist and > rd_projidx. > > * Having ProjIndexIsUnchanged examine relation->rd_projidx directly is > broken by design anyway, both from a modularity standpoint and because > its inner loop involves catalog accesses that could result in relcache > flushes. I'm somewhat amazed that the regression tests passed on > CLOBBER_CACHE_ALWAYS buildfarm animals, although possibly this is > explained by the fact that they're only testing cases with a single > expression index, so that the bitmap isn't checked again after the cache > flush happens. Again, this could be fixed if what was returned was just > a list of relevant index OIDs. > > * This bit of coding is unsafe, for the reason explained in the existing > comment: > > /* > * Now save copies of the bitmaps in the relcache entry. We > intentionally > * set rd_indexattr last, because that's the one that signals validity of > * the values; if we run out of memory before making that copy, we won't > * leave the relcache entry looking like the other ones are valid but > * empty. > */ > oldcxt = MemoryContextSwitchTo(CacheMemoryContext); > relation->rd_keyattr = bms_copy(uindexattrs); > relation->rd_pkattr = bms_copy(pkindexattrs); > relation->rd_idattr = bms_copy(idindexattrs); > relation->rd_indexattr = bms_copy(indexattrs); > +relation->rd_projindexattr = bms_copy(projindexattrs); > +relation->rd_projidx = bms_copy(projindexes); > MemoryContextSwitchTo(oldcxt); > > * There's a lot of other inattention to comments. For example, I noticed > that
Re: [HACKERS] Surjective functional indexes
On 08.11.2018 22:05, Alvaro Herrera wrote: On 2018-Nov-08, Konstantin Knizhnik wrote: Before doing any other refactoring of projection indexes I want to attach small bug fix patch which fixes the original problem (SIGSEGV) and also disables recheck_on_update by default. As Laurenz has suggested, I replaced boolean recheck_on_update option with "on","auto,"off" (default). I think this causes an ABI break for GenericIndexOpts. Not sure to what extent that is an actual problem (i.e. how many modules were compiled with 11.0 that are gonna be reading that struct with later Pg), but I think it should be avoided anyhow. Ok, I reverted back my change of reach_on_update option type. Now if reach_on_update option is not explicitly specified, then decision is made based on the expression cost. Patches becomes very small and fix only error in comparison of index tuple values. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index fb63471..81ab0ac 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -4546,9 +4546,10 @@ ProjIndexIsUnchanged(Relation relation, HeapTuple oldtup, HeapTuple newtup) } else if (!old_isnull[i]) { - Form_pg_attribute att = TupleDescAttr(RelationGetDescr(indexDesc), i); - - if (!datumIsEqual(old_values[i], new_values[i], att->attbyval, att->attlen)) + int16 elmlen; + bool elmbyval; + get_typlenbyval(indexDesc->rd_opcintype[i], &elmlen, &elmbyval); + if (!datumIsEqual(old_values[i], new_values[i], elmbyval, elmlen)) { equals = false; break; diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index fd3d010..ed8a01e 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -4745,10 +4745,11 @@ RelationGetIndexPredicate(Relation relation) * for old and new tuple values. * * Decision made by this function is based on two sources: - * 1. Calculated cost of index expression: if greater than some heuristic limit - then extra comparison of index expression values is expected to be too - expensive, so we don't attempt it by default. - * 2. "recheck_on_update" index option explicitly set by user, which overrides 1) + * 1. If "recheck_on_update" is true, then index is considered as projection. + * 2. If "recheck_on_update" is false, then changing any column on which index depends disables hot update + * 3. If "recheck_on_update" is not explicitly specified then recheck on update is + *enabled if cost of index expression is not greater than some heuristic limit (1000). + *In this case extra comparison of index expression values is expected to be too expensive. */ static bool IsProjectionFunctionalIndex(Relation index, IndexInfo *ii) @@ -4760,26 +4761,9 @@ IsProjectionFunctionalIndex(Relation index, IndexInfo *ii) HeapTuple tuple; Datum reloptions; bool isnull; + bool is_explicitly_set = false; QualCost index_expr_cost; - /* by default functional index is considered as non-injective */ - is_projection = true; - - cost_qual_eval(&index_expr_cost, ii->ii_Expressions, NULL); - - /* - * If index expression is too expensive, then disable projection - * optimization, because extra evaluation of index expression is - * expected to be more expensive than index update. Currently the - * projection optimization has to calculate index expression twice - * when the value of index expression has not changed and three times - * when values differ because the expression is recalculated when - * inserting a new index entry for the changed value. - */ - if ((index_expr_cost.startup + index_expr_cost.per_tuple) > - HEURISTIC_MAX_HOT_RECHECK_EXPR_COST) - is_projection = false; - tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(RelationGetRelid(index))); if (!HeapTupleIsValid(tuple)) elog(ERROR, "cache lookup failed for relation %u", RelationGetRelid(index)); @@ -4795,9 +4779,25 @@ IsProjectionFunctionalIndex(Relation index, IndexInfo *ii) if (idxopts != NULL) { is_projection = idxopts->recheck_on_update; +is_explicitly_set = true; pfree(idxopts); } } + if (!is_explicitly_set) + { + cost_qual_eval(&index_expr_cost, ii->ii_Expressions, NULL); + /* + * If index expression is too expensive, then disable projection + * optimization, because extra evaluation of index expression is + * expected to be more expensive than index update. Currently the + * projection optimization has to calculate index expression twice + * when the value of index expression has not changed and three times + * when values differ because the expression is recalculated when + * inserting a new index entry for the changed value. + */ + is_projection = (index_expr_cost.startup + index_expr_cost.per_tuple) +<= HEURIST
Re: [HACKERS] Surjective functional indexes
On 09.11.2018 2:27, Tom Lane wrote: I wrote: The bigger picture here, and the reason for my skepticism about having any intelligence in the enabling logic, is that there is no scenario in which this code can be smarter than the user about what to do. We have no insight today, and are unlikely to have any in future, about whether a specific index expression is many-to-one or not. Hmm ... actually, I take that back. Since we're only interested in this for expression indexes, we can expect that statistics will be available for the expression index, at least for tables that have been around long enough that UPDATE performance is really an exciting topic. So you could imagine pulling up the stadistinct numbers for the index column(s) and the underlying column(s), and enabling the optimization when their ratio is less than $something. Figuring out how to merge numbers for multiple columns might be tricky, but it's not going to be completely fact-free. (I still think that the cost-estimate logic is quite bogus, however.) Another issue in all this is the cost of doing this work over again after any relcache flush. Maybe we could move the responsibility into ANALYZE? BTW, the existing code appears to be prepared to enable this logic if *any* index column is an expression, but surely we should do so only if they are *all* expressions? regards, tom lane From my point of view "auto" value should be default, otherwise it has not so much sense. If somebody decides to switch on this optimization for some particular index, then it will set it to "on", not "auto". So I agree with your previous opinion, that if this optimization is disabled by default, then it is enough to have boolean parameter. Concerning muticolumn indexes: why we should apply this optimization only if *all* of index columns are expressions? Assume very simple example: we have some kind of document storage represented by the following table: create table document(owner integer, name text, last_updated timestamp, description json); So there are some static document attributes (name, date,...) and some dynamic, stored in json field. Consider that most frequently users will search among their own documents. So we may create index like this: create index by_title on documents(owner,(description->>'title')); Document description may include many attributes which are updated quite frequently, like "comments", "keywords",... But "title" is rarely changed, so this optimization will be very useful for such index. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Surjective functional indexes
I wrote: > The bigger picture here, and the reason for my skepticism about having > any intelligence in the enabling logic, is that there is no scenario > in which this code can be smarter than the user about what to do. > We have no insight today, and are unlikely to have any in future, about > whether a specific index expression is many-to-one or not. Hmm ... actually, I take that back. Since we're only interested in this for expression indexes, we can expect that statistics will be available for the expression index, at least for tables that have been around long enough that UPDATE performance is really an exciting topic. So you could imagine pulling up the stadistinct numbers for the index column(s) and the underlying column(s), and enabling the optimization when their ratio is less than $something. Figuring out how to merge numbers for multiple columns might be tricky, but it's not going to be completely fact-free. (I still think that the cost-estimate logic is quite bogus, however.) Another issue in all this is the cost of doing this work over again after any relcache flush. Maybe we could move the responsibility into ANALYZE? BTW, the existing code appears to be prepared to enable this logic if *any* index column is an expression, but surely we should do so only if they are *all* expressions? regards, tom lane
Re: [HACKERS] Surjective functional indexes
Alvaro Herrera writes: > On 2018-Nov-08, Konstantin Knizhnik wrote: >> Before doing any other refactoring of projection indexes I want to attach >> small bug fix patch which >> fixes the original problem (SIGSEGV) and also disables recheck_on_update by >> default. >> As Laurenz has suggested, I replaced boolean recheck_on_update option with >> "on","auto,"off" (default). > I think this causes an ABI break for GenericIndexOpts. Not sure to what > extent that is an actual problem (i.e. how many modules were compiled > with 11.0 that are gonna be reading that struct with later Pg), but I > think it should be avoided anyhow. I do not see the point of having more than a boolean option anyway, if the default is going to be "off". If the user is going to the trouble of marking a specific index for this feature, what else is she likely to want other than having it "on"? The bigger picture here, and the reason for my skepticism about having any intelligence in the enabling logic, is that there is no scenario in which this code can be smarter than the user about what to do. We have no insight today, and are unlikely to have any in future, about whether a specific index expression is many-to-one or not. I have exactly no faith in cost-estimate-based logic either, because of the extremely poor quality of our function cost estimates --- very little effort has been put into assigning trustworthy procost numbers to the built-in functions, and as for user-defined ones, it's probably much worse. So that's why I'm on the warpath against the cost-based logic that's there now; I think it's just garbage-in-garbage-out. regards, tom lane
Re: [HACKERS] Surjective functional indexes
Hi, On 2018-11-07 14:25:54 -0500, Tom Lane wrote: > We need to move forward, either by undertaking a more extensive > clean-out, or by finding a path to a version of the code that is > satisfactory. > [...] > In short, it seems likely to me that large parts of this patch need to > be pulled out, rewritten, and then put back in different places than > they are today. I'm not sure if a complete revert is the best next > step, or if we can make progress without that. I think the feature has merit, but I don't think it makes that much sense to start with the current in-tree version. There's just too many architectural issues. So I think we should clean it out as much as we can, and then have the feature re-submitted with proper review etc. Greetings, Andres Freund
Re: [HACKERS] Surjective functional indexes
On 2018-Nov-08, Konstantin Knizhnik wrote: > Before doing any other refactoring of projection indexes I want to attach > small bug fix patch which > fixes the original problem (SIGSEGV) and also disables recheck_on_update by > default. > As Laurenz has suggested, I replaced boolean recheck_on_update option with > "on","auto,"off" (default). I think this causes an ABI break for GenericIndexOpts. Not sure to what extent that is an actual problem (i.e. how many modules were compiled with 11.0 that are gonna be reading that struct with later Pg), but I think it should be avoided anyhow. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On 08.11.2018 15:23, Laurenz Albe wrote: Tom Lane wrote: I wanted to enumerate my concerns while yesterday's events are still fresh in mind. (Andres or Robert might have more.) * I do not understand why this feature is on-by-default in the first place. It can only be a win for expression indexes that are many-to-one mappings; for indexes that are one-to-one or few-to-one, it's a pretty big loss. I see no reason to assume that most expression indexes fall into the first category. I suggest that the design ought to be to use this optimization only for indexes for which the user has explicitly enabled recheck_on_update. That would allow getting rid of the cost check in IsProjectionFunctionalIndex, about which we'd otherwise have to have an additional fight: I do not like its ad-hoc-ness, nor the modularity violation (and potential circularity) involved in having the relcache call cost_qual_eval. That was my impression too when I had a closer look at this feature. What about an option "hot_update_check" with values "off" (default), "on" and "always"? Yours, Laurenz Albe Before doing any other refactoring of projection indexes I want to attach small bug fix patch which fixes the original problem (SIGSEGV) and also disables recheck_on_update by default. As Laurenz has suggested, I replaced boolean recheck_on_update option with "on","auto,"off" (default). -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml index 4df3d75..e17dd09 100644 --- a/doc/src/sgml/ref/create_index.sgml +++ b/doc/src/sgml/ref/create_index.sgml @@ -365,8 +365,9 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] Specifies whether to recheck a functional index value to see whether - we can use a HOT update or not. The default value is on for functional - indexes with an total expression cost less than 1000, otherwise off. + we can use a HOT update or not. Accepted values are "on", "auto" and "off" + (default). In case of "auto", recheck is done for indexes with an total expression + cost less than 1000. You might decide to turn this off if you knew that a function used in an index is unlikely to return the same value when one of the input columns is updated and so the recheck is not worth the additional cost diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c index db84da0..9fb5ebc 100644 --- a/src/backend/access/common/reloptions.c +++ b/src/backend/access/common/reloptions.c @@ -19,6 +19,7 @@ #include "access/gist_private.h" #include "access/hash.h" +#include "access/heapam.h" #include "access/htup_details.h" #include "access/nbtree.h" #include "access/reloptions.h" @@ -131,15 +132,6 @@ static relopt_bool boolRelOpts[] = }, { { - "recheck_on_update", - "Recheck functional index expression for changed value after update", - RELOPT_KIND_INDEX, - ShareUpdateExclusiveLock /* since only applies to later UPDATEs */ - }, - true - }, - { - { "security_barrier", "View acts as a row security barrier", RELOPT_KIND_VIEW, @@ -448,6 +440,18 @@ static relopt_string stringRelOpts[] = validateWithCheckOption, NULL }, + { + { + "recheck_on_update", + "Recheck functional index expression for changed value after update", + RELOPT_KIND_INDEX, + ShareUpdateExclusiveLock /* since only applies to later UPDATEs */ + }, + 3, + false, + validateRecheckOnUpdateOption, + "off" + }, /* list terminator */ {{NULL}} }; @@ -1499,7 +1503,7 @@ index_generic_reloptions(Datum reloptions, bool validate) GenericIndexOpts *idxopts; relopt_value *options; static const relopt_parse_elt tab[] = { - {"recheck_on_update", RELOPT_TYPE_BOOL, offsetof(GenericIndexOpts, recheck_on_update)} + {"recheck_on_update", RELOPT_TYPE_STRING, offsetof(GenericIndexOpts, recheck_on_update)} }; options = parseRelOptions(reloptions, validate, diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index fb63471..21cdc1b 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -4491,6 +4491,21 @@ heap_tuple_attr_equals(TupleDesc tupdesc, int attrnum, } } + +void validateRecheckOnUpdateOption(const char *value) +{ + if (value == NULL || + (strcmp(value, "on") != 0 && + strcmp(value, "off") != 0 && + strcmp(value, "auto") != 0)) + { + ereport(ERROR, +(errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid value for \"recheck_on_update\" option"), + errdetail("Valid values are \"on\", \"off\", and \"auto\"."))); + } +} + /* * Check whether the value is unchanged after update of a projection * functional index. Compare the new and old values of the indexed @@ -4546,9 +4561,10 @@ ProjIndexIsUnchanged(Relation relation, HeapTuple oldtup, HeapTuple newtup) } else
Re: [HACKERS] Surjective functional indexes
Tom Lane wrote: > I wanted to enumerate my concerns while yesterday's > events are still fresh in mind. (Andres or Robert might have more.) > > * I do not understand why this feature is on-by-default in the first > place. It can only be a win for expression indexes that are many-to-one > mappings; for indexes that are one-to-one or few-to-one, it's a pretty > big loss. I see no reason to assume that most expression indexes fall > into the first category. I suggest that the design ought to be to use > this optimization only for indexes for which the user has explicitly > enabled recheck_on_update. That would allow getting rid of the cost check > in IsProjectionFunctionalIndex, about which we'd otherwise have to have > an additional fight: I do not like its ad-hoc-ness, nor the modularity > violation (and potential circularity) involved in having the relcache call > cost_qual_eval. That was my impression too when I had a closer look at this feature. What about an option "hot_update_check" with values "off" (default), "on" and "always"? Yours, Laurenz Albe
Re: [HACKERS] Surjective functional indexes
On 07.11.2018 22:25, Tom Lane wrote: Robert Haas writes: On Fri, May 11, 2018 at 11:01 AM, Simon Riggs wrote: I have no problem if you want to replace this with an even better design in a later release. Meh. The author / committer should get a patch into the right shape They have done, at length. Claiming otherwise is just trash talk. Some people might call it "honest disagreement". So where we are today is that this patch was lobotomized by commits 77366d90f/d06fe6ce2 as a result of this bug report: https://postgr.es/m/20181106185255.776mstcyehnc63ty@alvherre.pgsql We need to move forward, either by undertaking a more extensive clean-out, or by finding a path to a version of the code that is satisfactory. I wanted to enumerate my concerns while yesterday's events are still fresh in mind. (Andres or Robert might have more.) * I do not understand why this feature is on-by-default in the first place. It can only be a win for expression indexes that are many-to-one mappings; for indexes that are one-to-one or few-to-one, it's a pretty big loss. I see no reason to assume that most expression indexes fall into the first category. I suggest that the design ought to be to use this optimization only for indexes for which the user has explicitly enabled recheck_on_update. That would allow getting rid of the cost check in IsProjectionFunctionalIndex, about which we'd otherwise have to have an additional fight: I do not like its ad-hoc-ness, nor the modularity violation (and potential circularity) involved in having the relcache call cost_qual_eval. * Having heapam.c calling the executor also seems like a modularity/layering violation that will bite us in the future. * The implementation seems incredibly inefficient. ProjIndexIsUnchanged is doing creation/destruction of an EState, index_open/index_close (including acquisition of a lock we should already have), BuildIndexInfo, and expression compilation, all of which are completely redundant with work done elsewhere in the executor. And it's doing that *over again for every tuple*, which totally aside from wasted cycles probably means there are large query-lifespan memory leaks in an UPDATE affecting many rows. I think a minimum expectation should be that one-time work is done only one time; but ideally none of those things would happen at all because we could share work with the regular code path. Perhaps it's too much to hope that we could also avoid duplicate computation of the new index expression values, but as long as I'm listing complaints ... * As noted in the bug thread, the implementation of the new reloption is broken because (a) it fails to work for some built-in index AMs and (b) by design, it can't work for add-on AMs. I agree with Andres that that's not acceptable. * This seems like bad data structure design: + Bitmapset *rd_projidx; /* Oids of projection indexes */ That comment is a lie, although IMO it'd be better if it were true; a list of target index OIDs would be a far better design here. The use of rd_projidx as a set of indexes into the relation's indexlist is inefficient and overcomplicated, plus it creates an unnecessary and not very safe (even if it were documented) coupling between rd_indexlist and rd_projidx. * Having ProjIndexIsUnchanged examine relation->rd_projidx directly is broken by design anyway, both from a modularity standpoint and because its inner loop involves catalog accesses that could result in relcache flushes. I'm somewhat amazed that the regression tests passed on CLOBBER_CACHE_ALWAYS buildfarm animals, although possibly this is explained by the fact that they're only testing cases with a single expression index, so that the bitmap isn't checked again after the cache flush happens. Again, this could be fixed if what was returned was just a list of relevant index OIDs. * This bit of coding is unsafe, for the reason explained in the existing comment: /* * Now save copies of the bitmaps in the relcache entry. We intentionally * set rd_indexattr last, because that's the one that signals validity of * the values; if we run out of memory before making that copy, we won't * leave the relcache entry looking like the other ones are valid but * empty. */ oldcxt = MemoryContextSwitchTo(CacheMemoryContext); relation->rd_keyattr = bms_copy(uindexattrs); relation->rd_pkattr = bms_copy(pkindexattrs); relation->rd_idattr = bms_copy(idindexattrs); relation->rd_indexattr = bms_copy(indexattrs); +relation->rd_projindexattr = bms_copy(projindexattrs); +relation->rd_projidx = bms_copy(projindexes); MemoryContextSwitchTo(oldcxt); * There's a lot of other inattention to comments. For example, I noticed that this patch added new responsibilities to RelationGetIndexList without updating its header comment to mention them. * There's a lack of attention to terminology, too. I do not think that "projection index
Re: [HACKERS] Surjective functional indexes
Robert Haas writes: > On Fri, May 11, 2018 at 11:01 AM, Simon Riggs wrote: I have no problem if you want to replace this with an even better design in a later release. >>> Meh. The author / committer should get a patch into the right shape >> They have done, at length. Claiming otherwise is just trash talk. > Some people might call it "honest disagreement". So where we are today is that this patch was lobotomized by commits 77366d90f/d06fe6ce2 as a result of this bug report: https://postgr.es/m/20181106185255.776mstcyehnc63ty@alvherre.pgsql We need to move forward, either by undertaking a more extensive clean-out, or by finding a path to a version of the code that is satisfactory. I wanted to enumerate my concerns while yesterday's events are still fresh in mind. (Andres or Robert might have more.) * I do not understand why this feature is on-by-default in the first place. It can only be a win for expression indexes that are many-to-one mappings; for indexes that are one-to-one or few-to-one, it's a pretty big loss. I see no reason to assume that most expression indexes fall into the first category. I suggest that the design ought to be to use this optimization only for indexes for which the user has explicitly enabled recheck_on_update. That would allow getting rid of the cost check in IsProjectionFunctionalIndex, about which we'd otherwise have to have an additional fight: I do not like its ad-hoc-ness, nor the modularity violation (and potential circularity) involved in having the relcache call cost_qual_eval. * Having heapam.c calling the executor also seems like a modularity/layering violation that will bite us in the future. * The implementation seems incredibly inefficient. ProjIndexIsUnchanged is doing creation/destruction of an EState, index_open/index_close (including acquisition of a lock we should already have), BuildIndexInfo, and expression compilation, all of which are completely redundant with work done elsewhere in the executor. And it's doing that *over again for every tuple*, which totally aside from wasted cycles probably means there are large query-lifespan memory leaks in an UPDATE affecting many rows. I think a minimum expectation should be that one-time work is done only one time; but ideally none of those things would happen at all because we could share work with the regular code path. Perhaps it's too much to hope that we could also avoid duplicate computation of the new index expression values, but as long as I'm listing complaints ... * As noted in the bug thread, the implementation of the new reloption is broken because (a) it fails to work for some built-in index AMs and (b) by design, it can't work for add-on AMs. I agree with Andres that that's not acceptable. * This seems like bad data structure design: + Bitmapset *rd_projidx; /* Oids of projection indexes */ That comment is a lie, although IMO it'd be better if it were true; a list of target index OIDs would be a far better design here. The use of rd_projidx as a set of indexes into the relation's indexlist is inefficient and overcomplicated, plus it creates an unnecessary and not very safe (even if it were documented) coupling between rd_indexlist and rd_projidx. * Having ProjIndexIsUnchanged examine relation->rd_projidx directly is broken by design anyway, both from a modularity standpoint and because its inner loop involves catalog accesses that could result in relcache flushes. I'm somewhat amazed that the regression tests passed on CLOBBER_CACHE_ALWAYS buildfarm animals, although possibly this is explained by the fact that they're only testing cases with a single expression index, so that the bitmap isn't checked again after the cache flush happens. Again, this could be fixed if what was returned was just a list of relevant index OIDs. * This bit of coding is unsafe, for the reason explained in the existing comment: /* * Now save copies of the bitmaps in the relcache entry. We intentionally * set rd_indexattr last, because that's the one that signals validity of * the values; if we run out of memory before making that copy, we won't * leave the relcache entry looking like the other ones are valid but * empty. */ oldcxt = MemoryContextSwitchTo(CacheMemoryContext); relation->rd_keyattr = bms_copy(uindexattrs); relation->rd_pkattr = bms_copy(pkindexattrs); relation->rd_idattr = bms_copy(idindexattrs); relation->rd_indexattr = bms_copy(indexattrs); +relation->rd_projindexattr = bms_copy(projindexattrs); +relation->rd_projidx = bms_copy(projindexes); MemoryContextSwitchTo(oldcxt); * There's a lot of other inattention to comments. For example, I noticed that this patch added new responsibilities to RelationGetIndexList without updating its header comment to mention them. * There's a lack of attention to terminology, too. I do not think that "projection index" is an appropriate term at all, no
Re: [HACKERS] Surjective functional indexes
On Fri, May 11, 2018 at 11:01 AM, Simon Riggs wrote: >>> I have no problem if you want to replace this with an even better >>> design in a later release. >> >> Meh. The author / committer should get a patch into the right shape > > They have done, at length. Claiming otherwise is just trash talk. Some people might call it "honest disagreement". -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Surjective functional indexes
On 11 May 2018 at 16:37, Andres Freund wrote: > On 2018-05-11 14:56:12 +0200, Simon Riggs wrote: >> On 11 May 2018 at 05:32, Andres Freund wrote: >> > No. Simon just claimed it's not actually a concern: >> > https://www.postgresql.org/message-id/canp8+j+vtskphep_gmqmeqdwakst2kbotee0yz-my+agh0a...@mail.gmail.com >> > >> > And yes, it got committed without doing squat to address the >> > architectural concerns. >> >> "Squat" means "zero, nothing" to me. So that comment would be inaccurate. > > Yes, I know it means that. And I don't see how it is inaccurate. > > >> I have no problem if you want to replace this with an even better >> design in a later release. > > Meh. The author / committer should get a patch into the right shape They have done, at length. Claiming otherwise is just trash talk. As you pointed out, the design of the patch avoids layering violations that could have led to architectural objections. Are you saying I should have ignored your words and rewritten the patch to introduce a layering violation? What other objection do you think has not been addressed? -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On 2018-05-11 14:56:12 +0200, Simon Riggs wrote: > On 11 May 2018 at 05:32, Andres Freund wrote: > > No. Simon just claimed it's not actually a concern: > > https://www.postgresql.org/message-id/canp8+j+vtskphep_gmqmeqdwakst2kbotee0yz-my+agh0a...@mail.gmail.com > > > > And yes, it got committed without doing squat to address the > > architectural concerns. > > "Squat" means "zero, nothing" to me. So that comment would be inaccurate. Yes, I know it means that. And I don't see how it is inaccurate. > I have no problem if you want to replace this with an even better > design in a later release. Meh. The author / committer should get a patch into the right shape, not other people that are concerned with the consequences. Greetings, Andres Freund
Re: [HACKERS] Surjective functional indexes
On Fri, May 11, 2018 at 4:58 AM, Konstantin Knizhnik < k.knizh...@postgrespro.ru> wrote: > > Sorry, may be I do not completely understand you. > So whats happed before this patch: > > - On update postgres compares old and new values of all changed attributes > to determine whether them are actually changed. > - If value of some indexed attribute is changed, then hot update is not > applicable and we have to rebuild indexed. > - Otherwise hot update is used and indexes should not be updated. > > What is changed: > > - When some of attributes, on which functional index depends, is changed, > then we calculate value of index expression. It is done using existed > FormIndexDatum function which stores calculated expression value in the > provided slot. This evaluation of index expressions and their comparison is > done in access/heap/heapam.c file. > - Only if old and new values of index expression are different, then hot > update is really not applicable. > > Thanks. So, in my version of layman's terms, before this patch we treated simple column referenced indexes and expression indexes differently because in a functional index we didn't actually compare the expression results, only the original values of the depended-on columns. With this patch both are treated the same - and in a manner that I would think would be normal/expected. Treating expression indexes this way, however, is potentially more costly than before and thus we want to provide the user a way to say "hey, PG, don't bother with the extra effort of comparing the entire expression, it ain't gonna matter since if I change the depended-on values odds are the expression result will change as well." Changing an option named "recheck_on_update" doesn't really communicate that to me (I dislike the word recheck, too implementation specific). Something like "only_compare_dependencies_on_update (default false)" would do a better job at that - tell me what the abnormal behavior is, not the normal, and lets me enable it instead of disabling the default behavior and not know what it is falling back to. The description for the parameter does a good job of describing this - just suggesting that the name match how the user might see things. On the whole the change makes sense - and the general assumptions around runtime cost seem sound and support the trade-off simply re-computing the expression on the old tuple versus engineering an inexpensive way to retrieve the existing value from the index. I don't have a problem with "total expression cost" being used as a heuristic - though maybe that would mean we should make this an enum with (off, on, never/always) with the third option overriding all heuristics. David J.
Re: [HACKERS] Surjective functional indexes
On 11 May 2018 at 05:32, Andres Freund wrote: > On 2018-05-10 23:25:58 -0400, Robert Haas wrote: >> On Thu, Mar 1, 2018 at 2:48 PM, Andres Freund wrote: >> > I still don't think, as commented upon by Tom and me upthread, that we >> > want this feature in the current form. >> >> Was this concern ever addressed, or did the patch just get committed anyway? > > No. Simon just claimed it's not actually a concern: > https://www.postgresql.org/message-id/canp8+j+vtskphep_gmqmeqdwakst2kbotee0yz-my+agh0a...@mail.gmail.com > > And yes, it got committed without doing squat to address the > architectural concerns. "Squat" means "zero, nothing" to me. So that comment would be inaccurate. I've spent a fair amount of time reviewing and grooming the patch, and I am happy that Konstantin has changed his patch significantly in response to those reviews, as well as explaining why the patch is fine as it is. It's a useful patch that PostgreSQL needs, so all good. I have no problem if you want to replace this with an even better design in a later release. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On 11.05.2018 07:48, David G. Johnston wrote: On Thursday, February 1, 2018, Konstantin Knizhnik mailto:k.knizh...@postgrespro.ru>> wrote: Old + New for check = 2 plus calculate again in index = 3 Yes, we have to calculate the value of index expression for original and updated version of the record. If them are equal, then it is all we have to do with this index: hot update is applicable. In this case we calculate index expression twice. But if them are not equal, then hot update is not applicable and we have to update index. In this case expression will be calculated one more time. So totally three times. This is why, if calculation of index expression is very expensive, then effect of this optimization may be negative even if value of index expression is not changed. For the old/new comparison and the re-calculate if changed dynamics - is this a side effect of separation of concerns only or is there some other reason the old computed value already stored in the index isn't compared to the one and only function result of the new tuple which, if found to be different, is then stored. One function invocation, which has to happen anyway, and one extra equality check. Seems like this could be used for non-functional indexes too, so that mere presence in the update listing doesn't negate HOT if the column didn't actually change (if I'm not mis-remembering something here anyway...) Also, create index page doc typo from site: "with an total" s/b "with a total" (expression cost less than 1000) - maybe add a comma for 1,000 David J. Sorry, may be I do not completely understand you. So whats happed before this patch: - On update postgres compares old and new values of all changed attributes to determine whether them are actually changed. - If value of some indexed attribute is changed, then hot update is not applicable and we have to rebuild indexed. - Otherwise hot update is used and indexes should not be updated. What is changed: - When some of attributes, on which functional index depends, is changed, then we calculate value of index expression. It is done using existed FormIndexDatum function which stores calculated expression value in the provided slot. This evaluation of index expressions and their comparison is done in access/heap/heapam.c file. - Only if old and new values of index expression are different, then hot update is really not applicable. - In this case we have to rebuild indexes. It is done by ExecInsertIndexTuples in executor/execIndexing.c which calls FormIndexDatum once again to calculate index expression. So in principle, it is certainly possible to store value of index expression calculated in ProjIndexIsUnchanged and reuse it ExecInsertIndexTuples. But I do not know right place (context) where this value can be stored. And also it will add some dependency between heapam and execIndexing modules. Also it is necessary to take in account that ProjIndexIsUnchanged is not always called. So index expression value may be not present. Finally, most of practically used index expressions are not expensive. It is usually something like extraction of JSON component. Cost of execution of this function is much smaller than cost of extracting and unpacking toasted JSON value. So I do not think that such optimization will be really useful, while it is obvious that it significantly complicate code. Also please notice that FormIndexDatum is used in some other places: utils/adt/selfuncs.c utils/sort/tuplesort.c mmands/constraint.c So this patch just adds two more calls of this function. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Surjective functional indexes
On Thursday, February 1, 2018, Konstantin Knizhnik < k.knizh...@postgrespro.ru> wrote: > > Old + New for check = 2 >> plus calculate again in index = 3 >> > > Yes, we have to calculate the value of index expression for original and > updated version of the record. If them are equal, then it is all we have to > do with this index: hot update is applicable. > In this case we calculate index expression twice. > But if them are not equal, then hot update is not applicable and we have > to update index. In this case expression will be calculated one more time. > So totally three times. > This is why, if calculation of index expression is very expensive, then > effect of this optimization may be negative even if value of index > expression is not changed. > For the old/new comparison and the re-calculate if changed dynamics - is this a side effect of separation of concerns only or is there some other reason the old computed value already stored in the index isn't compared to the one and only function result of the new tuple which, if found to be different, is then stored. One function invocation, which has to happen anyway, and one extra equality check. Seems like this could be used for non-functional indexes too, so that mere presence in the update listing doesn't negate HOT if the column didn't actually change (if I'm not mis-remembering something here anyway...) Also, create index page doc typo from site: "with an total" s/b "with a total" (expression cost less than 1000) - maybe add a comma for 1,000 David J.
Re: [HACKERS] Surjective functional indexes
On 2018-05-10 23:25:58 -0400, Robert Haas wrote: > On Thu, Mar 1, 2018 at 2:48 PM, Andres Freund wrote: > > I still don't think, as commented upon by Tom and me upthread, that we > > want this feature in the current form. > > Was this concern ever addressed, or did the patch just get committed anyway? No. Simon just claimed it's not actually a concern: https://www.postgresql.org/message-id/canp8+j+vtskphep_gmqmeqdwakst2kbotee0yz-my+agh0a...@mail.gmail.com And yes, it got committed without doing squat to address the architectural concerns. Greetings, Andres Freund
Re: [HACKERS] Surjective functional indexes
On Thu, Mar 1, 2018 at 2:48 PM, Andres Freund wrote: > I still don't think, as commented upon by Tom and me upthread, that we > want this feature in the current form. Was this concern ever addressed, or did the patch just get committed anyway? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Surjective functional indexes
On 27.03.2018 22:08, Simon Riggs wrote: On 23 March 2018 at 15:54, Simon Riggs wrote: So please could you make the change? Committed, but I still think that change would be good. Thank you. But I still not sure whether replacement of bitmap with List or array of Oids is really good idea. There are two aspects: size and speed. It seems to me that memory overhead is most important. At least you rejected your own idea about using autotune for this parameters because of presence of extra 16 bytes in statistic. But RelationData structure is even more space critical: its size is multiplied by number of relations and backends. Bitmap seems to provide the most compact representation of the projective index list. Concerning speed aspect, certainly iteration through the list of all indexes with checking presence of index in the bitmap is more expensive than just direct iteration through Oid list or array. But since check of bitmap can be done in constant time, both approaches have the same complexity. Also for typical case (< 5 normal indexes for a table and 1-2 functional indexes) difference in performance can not be measured. Also bitmap provides convenient interface for adding new members. To construct array of Oids I need first to determine number of indexes, so I have to perform two loops. So if you and Alvaro insist on this change, then I will definitely do it. But from my point of view, using bitmap here is acceptable solution. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Surjective functional indexes
On 23 March 2018 at 15:54, Simon Riggs wrote: > So please could you make the change? Committed, but I still think that change would be good. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On 23.03.2018 18:45, Alvaro Herrera wrote: Konstantin Knizhnik wrote: rd_projidx is not a list, it is Bitmapset. It is just one of many bitmap sets in RelationData: Yes, but the other bitmapsets are of AttrNumber of the involved column. They new one is of list_nth() counters for items in the index list. That seems weird and it scares me -- do we use that coding pattern elsewhere? Maybe use an array of OIDs instead? Using list or array instead of bitmap requires much more space... And bitmaps are much more efficient for many operations: check for element presence, combine, intersect,... Check in bitmap has O(0) complexity so iterating through list of all indexes or attributes with bitmap check doesn't add any essential overhead. Sorry, I do not understand this: "They new one is of list_nth() counters for items in the index list" -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Surjective functional indexes
On 22.03.2018 23:53, Alvaro Herrera wrote: The whole IsProjectionFunctionalIndex looks kinda bogus/ugly to me. Set the boolean to false, but keep evaluating anyway? But then, I thought the idea was to do this based on the reloption, not by comparing the expression cost to a magical (unmodifiable) value? The idea is very simple, I tried to explain it in the comments. Sorry if my explanation was not so clear. By default we assume all functional indexes as projectional. This is done by first assignment is_projection = true. Then we check cost of index function. I agree that "magical (unmodifiable) value" used as cost threshold is not a good idea. But I tried to avoid as much as possible adding yet another configuration parameter. I do no think that flexibility here is so important. I believe that in 99% cases functional indexes are projectional, and in all other cases DBA cna explicitly specify it using recheck_on_update index option. Which should override cost threshold: if DBA thinks that recheck is useful for this index, then it should be used despite to previsions guess based on index expression cost. This is why after setting "is_projection" to false because of too large cost of index expression, we continue work and check index options for explicitly specified value. In RelationGetIndexAttrBitmap(), indexattrs no longer gets the columns corresponding to projection indexes. Isn't that weird/error prone/confusing? I think it'd be saner to add these bits to both bitmaps. This bitmap is needed only to mark attributes, which modification prevents hot update. Should I rename rd_indexattr to rd_hotindexattr or whatever else to avoid confusion? Please notice, that rd_indexattr is used only inside RelationGetIndexAttrBitmap and is not visible from outside. It is returned for INDEX_ATTR_BITMAP_HOT IndexAttrBitmapKind. Please update the comments ending in heapam.c:4188, and generally all comments that you should update. Sorry, did you apply projection-7.patch? I have checked all trailing spaces and other indentation issues. At least there is no trailing space at heapam.c:4188... Please keep serial_schedule in sync with parallel_schedule. Sorry, once again do not understand your complaint: I have added func_index both to serial and parallel schedule. Also, pgindent. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Surjective functional indexes
On 23 March 2018 at 15:39, Konstantin Knizhnik wrote: > > > On 22.03.2018 23:37, Alvaro Herrera wrote: >> >> The rd_projidx (list of each nth element in the index list that is a >> projection index) thing looks odd. Wouldn't it make more sense to have >> a list of index OIDs that are projection indexes? >> > > rd_projidx is not a list, it is Bitmapset. It is just one of many bitmap > sets in RelationData: > > Bitmapset *rd_indexattr;/* identifies columns used in indexes */ > Bitmapset *rd_keyattr;/* cols that can be ref'd by foreign keys > */ > Bitmapset *rd_pkattr;/* cols included in primary key */ > Bitmapset *rd_idattr;/* included in replica identity index */ > > Yes, it is possible to construct Oid list of projectional indexes instead of > having bitmap which marks some indexes in projectional, but I am not sure > that > it will be more efficient both from CPU and memory footprint point of view > (in most indexes bitmap will consists of just one integer). In any case, I > do not think that it can have some measurable impact on performance or > memory size: > number of indexes and especially projectional indexes will very rarely > exceed 10. Alvaro's comment seems reasonable to me. The patch adds both rd_projindexattr and rd_projidx rd_projindexattr should be a Bitmapset, but rd_projidx looks strange now he mentions it. Above that in RelationData we have other structures that are List of OIDs, so Alvaro's proposal make sense. That would simplify the code in ProjectionIsNotChanged() by just looping over the list of projection indexes rather than the list of indexes So please could you make the change? -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
Konstantin Knizhnik wrote: > rd_projidx is not a list, it is Bitmapset. It is just one of many bitmap > sets in RelationData: Yes, but the other bitmapsets are of AttrNumber of the involved column. They new one is of list_nth() counters for items in the index list. That seems weird and it scares me -- do we use that coding pattern elsewhere? Maybe use an array of OIDs instead? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On 22.03.2018 23:37, Alvaro Herrera wrote: The rd_projidx (list of each nth element in the index list that is a projection index) thing looks odd. Wouldn't it make more sense to have a list of index OIDs that are projection indexes? rd_projidx is not a list, it is Bitmapset. It is just one of many bitmap sets in RelationData: Bitmapset *rd_indexattr; /* identifies columns used in indexes */ Bitmapset *rd_keyattr; /* cols that can be ref'd by foreign keys */ Bitmapset *rd_pkattr; /* cols included in primary key */ Bitmapset *rd_idattr; /* included in replica identity index */ Yes, it is possible to construct Oid list of projectional indexes instead of having bitmap which marks some indexes in projectional, but I am not sure that it will be more efficient both from CPU and memory footprint point of view (in most indexes bitmap will consists of just one integer). In any case, I do not think that it can have some measurable impact on performance or memory size: number of indexes and especially projectional indexes will very rarely exceed 10. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Surjective functional indexes
The whole IsProjectionFunctionalIndex looks kinda bogus/ugly to me. Set the boolean to false, but keep evaluating anyway? But then, I thought the idea was to do this based on the reloption, not by comparing the expression cost to a magical (unmodifiable) value? In RelationGetIndexAttrBitmap(), indexattrs no longer gets the columns corresponding to projection indexes. Isn't that weird/error prone/confusing? I think it'd be saner to add these bits to both bitmaps. Please update the comments ending in heapam.c:4188, and generally all comments that you should update. Please keep serial_schedule in sync with parallel_schedule. Also, pgindent. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
The rd_projidx (list of each nth element in the index list that is a projection index) thing looks odd. Wouldn't it make more sense to have a list of index OIDs that are projection indexes? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
Please excuse my absence from this thread. On 2 March 2018 at 12:34, Konstantin Knizhnik wrote: > > > On 01.03.2018 22:48, Andres Freund wrote: >> >> Hi, >> >> I still don't think, as commented upon by Tom and me upthread, that we >> want this feature in the current form. The performance benefit of this patch is compelling. I don't see a comment from Tom along those lines, just a queston as to whether the code is in the right place. >> Your arguments about layering violations seem to be counteracted by the >> fact that ProjectionIsNotChanged() basically appears to do purely >> executor work inside inside heapam.c. > > > ProjectionIsNotChanged doesn't perform evaluation itslef, is calls > FormIndexDatum. > FormIndexDatum is also called in 13 other places in Postgres: > analyze.c, constraint.c, index.c, tuplesort, execIndexing.c > > I still think that trying to store somewhere result odf index expression > evaluation to be able to reuse in index update is not so good idea: > it complicates code and add extra dependencies between different modules. > And benefits of such optimization are not obvious: is index expression > evaluation is so expensive, then recheck_on_update should be prohibited for > such index at all. Agreed. What we are doing is trying to avoid HOT updates. We make the check for HOT update dynamically when it could be made statically in some cases. Simplicity says just test it dynamically. We could also do work to check for HOT update for functional indexes earlier but for the same reason, I don't think its worth adding. The patch itself is about as simple as it can be, so the code is in the right place. My problems with it have been around the actual user interface to request it. Index option handling has changed (and this needs rebase!), but other than that I think we want this and am planning to commit something early next week. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On 01.03.2018 22:48, Andres Freund wrote: Hi, I still don't think, as commented upon by Tom and me upthread, that we want this feature in the current form. Your arguments about layering violations seem to be counteracted by the fact that ProjectionIsNotChanged() basically appears to do purely executor work inside inside heapam.c. ProjectionIsNotChanged doesn't perform evaluation itslef, is calls FormIndexDatum. FormIndexDatum is also called in 13 other places in Postgres: analyze.c, constraint.c, index.c, tuplesort, execIndexing.c I still think that trying to store somewhere result odf index expression evaluation to be able to reuse in index update is not so good idea: it complicates code and add extra dependencies between different modules. And benefits of such optimization are not obvious: is index expression evaluation is so expensive, then recheck_on_update should be prohibited for such index at all. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Surjective functional indexes
Hi, I still don't think, as commented upon by Tom and me upthread, that we want this feature in the current form. Your arguments about layering violations seem to be counteracted by the fact that ProjectionIsNotChanged() basically appears to do purely executor work inside inside heapam.c. Greetings, Andres Freund
Re: [HACKERS] Surjective functional indexes
On 1 February 2018 at 09:32, Simon Riggs wrote: > OK, thanks. Just checking my understanding and will add to the > comments in the patch. I'm feeling good about ths now, but for personal reasons won't be committing this until late Feb/early March. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On 1 February 2018 at 08:49, Konstantin Knizhnik wrote: > > > On 01.02.2018 03:10, Simon Riggs wrote: >> >> On 10 January 2018 at 09:54, Konstantin Knizhnik >> wrote: >> >>> (new version attached) >> >> Why this comment? >> >> Current implementation of projection optimization has to calculate >> index expression twice >> in case of hit (value of index expression is not changed) and three >> times if values are different. >> >> Old + New for check = 2 >> plus calculate again in index = 3 >> ? >> > Sorry, I do not completely understand your question. > You do not agree with this statement or you think that this comment is > irrelevant in this place? > Or you just want to understand why index expression is calculated 2/3 times? > If so, then you have answered this question yourself: >> >> Old + New for check = 2 >> plus calculate again in index = 3 > > > Yes, we have to calculate the value of index expression for original and > updated version of the record. If them are equal, then it is all we have to > do with this index: hot update is applicable. > In this case we calculate index expression twice. > But if them are not equal, then hot update is not applicable and we have to > update index. In this case expression will be calculated one more time. So > totally three times. > This is why, if calculation of index expression is very expensive, then > effect of this optimization may be negative even if value of index > expression is not changed. OK, thanks. Just checking my understanding and will add to the comments in the patch. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On 01.02.2018 03:10, Simon Riggs wrote: On 10 January 2018 at 09:54, Konstantin Knizhnik wrote: (new version attached) Why this comment? Current implementation of projection optimization has to calculate index expression twice in case of hit (value of index expression is not changed) and three times if values are different. Old + New for check = 2 plus calculate again in index = 3 ? Sorry, I do not completely understand your question. You do not agree with this statement or you think that this comment is irrelevant in this place? Or you just want to understand why index expression is calculated 2/3 times? If so, then you have answered this question yourself: Old + New for check = 2 plus calculate again in index = 3 Yes, we have to calculate the value of index expression for original and updated version of the record. If them are equal, then it is all we have to do with this index: hot update is applicable. In this case we calculate index expression twice. But if them are not equal, then hot update is not applicable and we have to update index. In this case expression will be calculated one more time. So totally three times. This is why, if calculation of index expression is very expensive, then effect of this optimization may be negative even if value of index expression is not changed. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Surjective functional indexes
On 10 January 2018 at 09:54, Konstantin Knizhnik wrote: > (new version attached) Why this comment? Current implementation of projection optimization has to calculate index expression twice in case of hit (value of index expression is not changed) and three times if values are different. Old + New for check = 2 plus calculate again in index = 3 ? -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On 18 January 2018 at 08:59, Konstantin Knizhnik wrote: > > > On 18.01.2018 11:38, Simon Riggs wrote: >> >> On 10 January 2018 at 09:54, Konstantin Knizhnik >> wrote: >> >>> Sorry, issue with documentation is fixed. >> >> OK, thanks. >> >> Patch appears to work cleanly now. >> >> I'm wondering now about automatically inferring "recheck_on_update = >> true" for certain common datatype/operators. It doesn't need to be an >> exhaustive list, but it would be useful if we detected the main use >> case of >> >> (JSONB datatype column)->>CONSTANT >> >> Seems like we could do a test to see if the index function is >> FUNCTION(COLUMNNAME, CONSTANTs...) >> {JSONB, ->>} or >> {jsonb_object_field_text(Columnname, Constant)} >> {substring(Columname, Constants...)} >> >> It would be a shame if people had to remember to use this for the >> common and obvious cases. >> > Right now by default index is considered as projective. So even if you do > not specify "recheck_on_update" option, then recheck will be done. That's good > This decision is based on the assumption that most of functional indexes are > actually projective and looks likes (JSONB datatype column)->>CONSTANT. > So do you think that this assumption is not correct and we should switch > disable recheck_on_update by default? No thanks > If not, then there is an opposite challenge: find out class of functions > which definitely are not projective and recheck on them will have no sense. If there are some. Projective is not quite correct, since sin((col->>'angle')::numeric)) could stay same but the result is not a subset of the input. I think it would be better to avoid the use of mathematical terms and keep the description simple "If the indexed value depends upon only a subset of the data, it is possible that the function value will remain constant after an UPDATE that changes the non-indexed data. e.g. If a column is updated from '/some/url/before' to '/some/url/after' then the value of substing(col, 1, 5) will not change when updated -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On 18.01.2018 11:38, Simon Riggs wrote: On 10 January 2018 at 09:54, Konstantin Knizhnik wrote: Sorry, issue with documentation is fixed. OK, thanks. Patch appears to work cleanly now. I'm wondering now about automatically inferring "recheck_on_update = true" for certain common datatype/operators. It doesn't need to be an exhaustive list, but it would be useful if we detected the main use case of (JSONB datatype column)->>CONSTANT Seems like we could do a test to see if the index function is FUNCTION(COLUMNNAME, CONSTANTs...) {JSONB, ->>} or {jsonb_object_field_text(Columnname, Constant)} {substring(Columname, Constants...)} It would be a shame if people had to remember to use this for the common and obvious cases. Right now by default index is considered as projective. So even if you do not specify "recheck_on_update" option, then recheck will be done. This decision is based on the assumption that most of functional indexes are actually projective and looks likes (JSONB datatype column)->>CONSTANT. So do you think that this assumption is not correct and we should switch disable recheck_on_update by default? If not, then there is an opposite challenge: find out class of functions which definitely are not projective and recheck on them will have no sense. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Surjective functional indexes
On 10 January 2018 at 09:54, Konstantin Knizhnik wrote: > Sorry, issue with documentation is fixed. OK, thanks. Patch appears to work cleanly now. I'm wondering now about automatically inferring "recheck_on_update = true" for certain common datatype/operators. It doesn't need to be an exhaustive list, but it would be useful if we detected the main use case of (JSONB datatype column)->>CONSTANT Seems like we could do a test to see if the index function is FUNCTION(COLUMNNAME, CONSTANTs...) {JSONB, ->>} or {jsonb_object_field_text(Columnname, Constant)} {substring(Columname, Constants...)} It would be a shame if people had to remember to use this for the common and obvious cases. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On 07.01.2018 01:59, Stephen Frost wrote: Greetings, * Konstantin Knizhnik (k.knizh...@postgrespro.ru) wrote: On 15.12.2017 01:21, Michael Paquier wrote: On Fri, Dec 15, 2017 at 6:15 AM, Alvaro Herrera wrote: Konstantin Knizhnik wrote: If you still thing that additional 16 bytes per relation in statistic is too high overhead, then I will also remove autotune. I think it's pretty clear that these additional bytes are excessive. The bar to add new fields in PgStat_TableCounts in very high, and one attempt to tackle its scaling problems with many relations is here by Horiguchi-san: https://www.postgresql.org/message-id/20171211.201523.24172046.horiguchi.kyot...@lab.ntt.co.jp His patch may be worth a look if you need more fields for your feature. So it seems to me that the patch as currently presented has close to zero chance to be committed as long as you keep your changes to pgstat.c. Ok, looks like everybody think that autotune based on statistic is bad idea. Attached please find patch without autotune. This patch appears to apply with a reasonable amount of fuzz, builds, and passes 'make check', at least, therefore I'm going to mark it 'Needs Review'. I will note that the documentation doesn't currently build due to this: /home/sfrost/git/pg/dev/cleanup/doc/src/sgml/ref/create_index.sgml:302: parser error : Opening and ending tag mismatch: literal line 302 and unparseable recheck_on_update but I don't think it makes sense for that to stand in the way of someone doing a review of the base patch. Still, please do fix the documentation build when you get a chance. Thanks! Stephen Sorry, issue with documentation is fixed. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml index 0255375..acccaa5 100644 --- a/doc/src/sgml/ref/create_index.sgml +++ b/doc/src/sgml/ref/create_index.sgml @@ -294,8 +294,33 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] The optional WITH clause specifies storage parameters for the index. Each index method has its own set of allowed -storage parameters. The B-tree, hash, GiST and SP-GiST index methods all -accept this parameter: +storage parameters. All indexes accept the following parameter: + + + + +recheck_on_update + + + Functional index is based on on projection function: function which extract subset of its argument. + In mathematic such functions are called non-injective. For injective function if any attribute used in the indexed + expression is changed, then value of index expression is also changed. So to check that index is affected by the + update, it is enough to check the set of changed fields. By default this parameters is assigned true value and function is considered + as non-injective. + In this case change of any of indexed key doesn't mean that value of the function is changed. For example, for + the expression expression(bookinfo->>'isbn') defined + for column of JSON type is changed only when ISBN is changed, which rarely happen. The same is true for most + functional indexes. For non-injective functions, Postgres compares values of indexed expression for old and updated tuple and updates + index only when function results are different. It allows to eliminate index update and use HOT update. + But there are extra evaluations of the functions. So if function is expensive or probability that change of indexed column will not effect + the function value is small, then marking index as recheck_on_update may increase update speed. + + + + + + + The B-tree, hash, GiST and SP-GiST index methods all accept this parameter: diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c index aa9c0f1..1aaab78 100644 --- a/src/backend/access/common/reloptions.c +++ b/src/backend/access/common/reloptions.c @@ -131,6 +131,15 @@ static relopt_bool boolRelOpts[] = }, { { + "recheck_on_update", + "Evaluate functional index expression on update to check if its values is changed", + RELOPT_KIND_INDEX, + AccessExclusiveLock + }, + true + }, + { + { "security_barrier", "View acts as a row security barrier", RELOPT_KIND_VIEW, @@ -1311,7 +1320,7 @@ fillRelOptions(void *rdopts, Size basesize, break; } } - if (validate && !found) + if (validate && !found && options[i].gen->kinds != RELOPT_KIND_INDEX) elog(ERROR, "reloption \"%s\" not found in parse table", options[i].gen->name); } @@ -1468,6 +1477,40 @@ index_reloptions(amoptions_function amoptions, Datum reloptions, bool validate) } /* + * Parse generic options for all indexes. + * + * reloptions options as text[] datum + * validate error flag + */ +bytea * +index_generic_reloptions(Datum reloptions, bool validate) +{
Re: [HACKERS] Surjective functional indexes
Greetings, * Konstantin Knizhnik (k.knizh...@postgrespro.ru) wrote: > On 15.12.2017 01:21, Michael Paquier wrote: > >On Fri, Dec 15, 2017 at 6:15 AM, Alvaro Herrera > >wrote: > >>Konstantin Knizhnik wrote: > >>>If you still thing that additional 16 bytes per relation in statistic is > >>>too > >>>high overhead, then I will also remove autotune. > >>I think it's pretty clear that these additional bytes are excessive. > >The bar to add new fields in PgStat_TableCounts in very high, and one > >attempt to tackle its scaling problems with many relations is here by > >Horiguchi-san: > >https://www.postgresql.org/message-id/20171211.201523.24172046.horiguchi.kyot...@lab.ntt.co.jp > >His patch may be worth a look if you need more fields for your > >feature. So it seems to me that the patch as currently presented has > >close to zero chance to be committed as long as you keep your changes > >to pgstat.c. > > Ok, looks like everybody think that autotune based on statistic is bad idea. > Attached please find patch without autotune. This patch appears to apply with a reasonable amount of fuzz, builds, and passes 'make check', at least, therefore I'm going to mark it 'Needs Review'. I will note that the documentation doesn't currently build due to this: /home/sfrost/git/pg/dev/cleanup/doc/src/sgml/ref/create_index.sgml:302: parser error : Opening and ending tag mismatch: literal line 302 and unparseable recheck_on_update but I don't think it makes sense for that to stand in the way of someone doing a review of the base patch. Still, please do fix the documentation build when you get a chance. Thanks! Stephen signature.asc Description: PGP signature
Re: [HACKERS] Surjective functional indexes
On 15.12.2017 01:21, Michael Paquier wrote: On Fri, Dec 15, 2017 at 6:15 AM, Alvaro Herrera wrote: Konstantin Knizhnik wrote: If you still thing that additional 16 bytes per relation in statistic is too high overhead, then I will also remove autotune. I think it's pretty clear that these additional bytes are excessive. The bar to add new fields in PgStat_TableCounts in very high, and one attempt to tackle its scaling problems with many relations is here by Horiguchi-san: https://www.postgresql.org/message-id/20171211.201523.24172046.horiguchi.kyot...@lab.ntt.co.jp His patch may be worth a look if you need more fields for your feature. So it seems to me that the patch as currently presented has close to zero chance to be committed as long as you keep your changes to pgstat.c. Ok, looks like everybody think that autotune based on statistic is bad idea. Attached please find patch without autotune. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml index 0255375..c4ee15e 100644 --- a/doc/src/sgml/ref/create_index.sgml +++ b/doc/src/sgml/ref/create_index.sgml @@ -294,8 +294,33 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] The optional WITH clause specifies storage parameters for the index. Each index method has its own set of allowed -storage parameters. The B-tree, hash, GiST and SP-GiST index methods all -accept this parameter: +storage parameters. All indexes accept the following parameter: + + + + +recheck_on_update + + + Functional index is based on on projection function: function which extract subset of its argument. + In mathematic such functions are called non-injective. For injective function if any attribute used in the indexed + expression is changed, then value of index expression is also changed. So to check that index is affected by the + update, it is enough to check the set of changed fields. By default this parameters is assigned true value and function is considered + as non-injective. + In this case change of any of indexed key doesn't mean that value of the function is changed. For example, for + the expression expression(bookinfo->>'isbn') defined + for column of JSON type is changed only when ISBN is changed, which rarely happen. The same is true for most + functional indexes. For non-injective functions, Postgres compares values of indexed expression for old and updated tuple and updates + index only when function results are different. It allows to eliminate index update and use HOT update. + But there are extra evaluations of the functions. So if function is expensive or probability that change of indexed column will not effect + the function value is small, then marking index as recheck_on_update may increase update speed. + + + + + + + The B-tree, hash, GiST and SP-GiST index methods all accept this parameter: diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c index aa9c0f1..1aaab78 100644 --- a/src/backend/access/common/reloptions.c +++ b/src/backend/access/common/reloptions.c @@ -131,6 +131,15 @@ static relopt_bool boolRelOpts[] = }, { { + "recheck_on_update", + "Evaluate functional index expression on update to check if its values is changed", + RELOPT_KIND_INDEX, + AccessExclusiveLock + }, + true + }, + { + { "security_barrier", "View acts as a row security barrier", RELOPT_KIND_VIEW, @@ -1311,7 +1320,7 @@ fillRelOptions(void *rdopts, Size basesize, break; } } - if (validate && !found) + if (validate && !found && options[i].gen->kinds != RELOPT_KIND_INDEX) elog(ERROR, "reloption \"%s\" not found in parse table", options[i].gen->name); } @@ -1468,6 +1477,40 @@ index_reloptions(amoptions_function amoptions, Datum reloptions, bool validate) } /* + * Parse generic options for all indexes. + * + * reloptions options as text[] datum + * validate error flag + */ +bytea * +index_generic_reloptions(Datum reloptions, bool validate) +{ + int numoptions; + GenericIndexOpts *idxopts; + relopt_value *options; + static const relopt_parse_elt tab[] = { + {"recheck_on_update", RELOPT_TYPE_BOOL, offsetof(GenericIndexOpts, recheck_on_update)} + }; + + options = parseRelOptions(reloptions, validate, + RELOPT_KIND_INDEX, + &numoptions); + + /* if none set, we're done */ + if (numoptions == 0) + return NULL; + + idxopts = allocateReloptStruct(sizeof(GenericIndexOpts), options, numoptions); + + fillRelOptions((void *)idxopts, sizeof(GenericIndexOpts), options, numoptions, + validate, tab, lengthof(tab)); + + pfree(options); + + return (bytea*) idxopts; +} + +/* * Option parser for attribute reloptions */ bytea * diff --git a/src/backend/acces
Re: [HACKERS] Surjective functional indexes
On Fri, Dec 15, 2017 at 6:15 AM, Alvaro Herrera wrote: > Konstantin Knizhnik wrote: >> If you still thing that additional 16 bytes per relation in statistic is too >> high overhead, then I will also remove autotune. > > I think it's pretty clear that these additional bytes are excessive. The bar to add new fields in PgStat_TableCounts in very high, and one attempt to tackle its scaling problems with many relations is here by Horiguchi-san: https://www.postgresql.org/message-id/20171211.201523.24172046.horiguchi.kyot...@lab.ntt.co.jp His patch may be worth a look if you need more fields for your feature. So it seems to me that the patch as currently presented has close to zero chance to be committed as long as you keep your changes to pgstat.c. -- Michael
Re: [HACKERS] Surjective functional indexes
Konstantin Knizhnik wrote: > If you still thing that additional 16 bytes per relation in statistic is too > high overhead, then I will also remove autotune. I think it's pretty clear that these additional bytes are excessive. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On Wed, Dec 13, 2017 at 12:32 PM, Konstantin Knizhnik wrote: > I can not believe that there can be more than thousand non-temporary > relations in any database. I ran across a cluster with more than 5 million non-temporary relations just this week. That's extreme, but having more than a thousand non-temporary tables is not particularly uncommon. I would guess that the percentage of EnterpriseDB's customers who have more than a thousand non-temporary tables in a database is in the double digits. That number is only going to go up as our partitioning capabilities improve. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Surjective functional indexes
On 13.12.2017 14:29, Simon Riggs wrote: On 4 December 2017 at 15:35, Konstantin Knizhnik wrote: On 30.11.2017 05:02, Michael Paquier wrote: On Wed, Sep 27, 2017 at 4:07 PM, Simon Riggs wrote: On 15 September 2017 at 16:34, Konstantin Knizhnik wrote: Attached please find yet another version of the patch. Thanks. I'm reviewing it. Two months later, this patch is still waiting for a review (you are listed as well as a reviewer of this patch). The documentation of the patch has conflicts, please provide a rebased version. I am moving this patch to next CF with waiting on author as status. Attached please find new patch with resolved documentation conflict. I’ve not posted a review because it was my hope that I could fix up the problems with this clever and useful patch and just commit it. I spent 2 days trying to do that but some problems remain and I’ve run out of time. Patch 3 has got some additional features that on balance I don’t think we need. Patch 1 didn’t actually work which was confusing when I tried to go back to that. Patch 3 Issues * Auto tuning added 2 extra items to Stats for each relation. That is too high a price to pay, so we should remove that. If we can’t do autotuning without that, please remove it. * Patch needs a proper test case added. We shouldn’t just have a DEBUG1 statement in there for testing - very ugly. Please rewrite tests using functions that show how many changes have occurred during the current statement/transaction. * Parameter coding was specific to that section of code. We need a capability to parse that parameter from other locations, just as we do with all other reloptions. It looks like it was coded that way because of difficulties with code placement. Please solve this with generic code, not just code that solves only the current issue. I’d personally be happier without any option at all, but if y’all want it, then please add the code in the right place. * The new coding for heap_update() uses the phrase “warm”, which is already taken by another patch. Please avoid confusing things by re-using the same term for different things. The use of these technical terms like projection and surjective doesn’t seem to add anything useful to the patch * Rename parameter to recheck_on_update * Remove random whitespace changes in the patch So at this stage the concept is good and works, but the patch is only just at the prototype stage and nowhere near committable. I hope you can correct these issues and if you do I will be happy to review and perhaps commit. Thanks Attached please find new patch which fix all the reported issues except disabling autotune. If you still thing that additional 16 bytes per relation in statistic is too high overhead, then I will also remove autotune. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml index 0255375..c4ee15e 100644 --- a/doc/src/sgml/ref/create_index.sgml +++ b/doc/src/sgml/ref/create_index.sgml @@ -294,8 +294,33 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] The optional WITH clause specifies storage parameters for the index. Each index method has its own set of allowed -storage parameters. The B-tree, hash, GiST and SP-GiST index methods all -accept this parameter: +storage parameters. All indexes accept the following parameter: + + + + +recheck_on_update + + + Functional index is based on on projection function: function which extract subset of its argument. + In mathematic such functions are called non-injective. For injective function if any attribute used in the indexed + expression is changed, then value of index expression is also changed. So to check that index is affected by the + update, it is enough to check the set of changed fields. By default this parameters is assigned true value and function is considered + as non-injective. + In this case change of any of indexed key doesn't mean that value of the function is changed. For example, for + the expression expression(bookinfo->>'isbn') defined + for column of JSON type is changed only when ISBN is changed, which rarely happen. The same is true for most + functional indexes. For non-injective functions, Postgres compares values of indexed expression for old and updated tuple and updates + index only when function results are different. It allows to eliminate index update and use HOT update. + But there are extra evaluations of the functions. So if function is expensive or probability that change of indexed column will not effect + the function value is small, then marking index as recheck_on_update may increase update speed. + + + + + + + The B-tree, hash, GiST and SP-GiST index methods all accept this parameter: diff --git a/src/backend/ac
Re: [HACKERS] Surjective functional indexes
Thank you for review. On 13.12.2017 14:29, Simon Riggs wrote: On 4 December 2017 at 15:35, Konstantin Knizhnik wrote: On 30.11.2017 05:02, Michael Paquier wrote: On Wed, Sep 27, 2017 at 4:07 PM, Simon Riggs wrote: On 15 September 2017 at 16:34, Konstantin Knizhnik wrote: Attached please find yet another version of the patch. Thanks. I'm reviewing it. Two months later, this patch is still waiting for a review (you are listed as well as a reviewer of this patch). The documentation of the patch has conflicts, please provide a rebased version. I am moving this patch to next CF with waiting on author as status. Attached please find new patch with resolved documentation conflict. I’ve not posted a review because it was my hope that I could fix up the problems with this clever and useful patch and just commit it. I spent 2 days trying to do that but some problems remain and I’ve run out of time. Patch 3 has got some additional features that on balance I don’t think we need. Patch 1 didn’t actually work which was confusing when I tried to go back to that. Patch 3 Issues * Auto tuning added 2 extra items to Stats for each relation. That is too high a price to pay, so we should remove that. If we can’t do autotuning without that, please remove it. Right now size of this structure is 168 bytes. Adding two extra fields increase it to 184 bytes - 9%. Is it really so critical? What is the typical/maximal number of relation in a database? I can not believe that there can be more than thousand non-temporary relations in any database. Certainly application can generate arbitrary number of temporary tables. But millions of such tables cause catalog bloating and will have awful impact on performance in any case. And for any reasonable number of relations (< million), extra memory used by this statistic is negligible. Also I think that these two fields can be useful not only for projection indexes. It can be also used by DBA and optimizer because them more precisely explain relation update pattern. I really love you idea with autotune and it will be pity to reject it just because of extra 16 bytes per relation. * Patch needs a proper test case added. We shouldn’t just have a DEBUG1 statement in there for testing - very ugly. Please rewrite tests using functions that show how many changes have occurred during the current statement/transaction. * Parameter coding was specific to that section of code. We need a capability to parse that parameter from other locations, just as we do with all other reloptions. It looks like it was coded that way because of difficulties with code placement. Please solve this with generic code, not just code that solves only the current issue. I’d personally be happier without any option at all, but if y’all want it, then please add the code in the right place. Sorry, I do not completely understand what you mean by "parameter coding". Do you mean IsProjectionFunctionalIndex function and filling relation options in it? It is the only reported issue which I do not understand how to fix. * The new coding for heap_update() uses the phrase “warm”, which is already taken by another patch. Please avoid confusing things by re-using the same term for different things. The use of these technical terms like projection and surjective doesn’t seem to add anything useful to the patch * Rename parameter to recheck_on_update * Remove random whitespace changes in the patch So at this stage the concept is good and works, but the patch is only just at the prototype stage and nowhere near committable. I hope you can correct these issues and if you do I will be happy to review and perhaps commit. Thank you once again for your efforts in improving this patch. I will fix all reported issues. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Surjective functional indexes
On 4 December 2017 at 15:35, Konstantin Knizhnik wrote: > On 30.11.2017 05:02, Michael Paquier wrote: >> >> On Wed, Sep 27, 2017 at 4:07 PM, Simon Riggs >> wrote: >>> >>> On 15 September 2017 at 16:34, Konstantin Knizhnik >>> wrote: >>> Attached please find yet another version of the patch. >>> >>> Thanks. I'm reviewing it. >> >> Two months later, this patch is still waiting for a review (you are >> listed as well as a reviewer of this patch). The documentation of the >> patch has conflicts, please provide a rebased version. I am moving >> this patch to next CF with waiting on author as status. > > Attached please find new patch with resolved documentation conflict. I’ve not posted a review because it was my hope that I could fix up the problems with this clever and useful patch and just commit it. I spent 2 days trying to do that but some problems remain and I’ve run out of time. Patch 3 has got some additional features that on balance I don’t think we need. Patch 1 didn’t actually work which was confusing when I tried to go back to that. Patch 3 Issues * Auto tuning added 2 extra items to Stats for each relation. That is too high a price to pay, so we should remove that. If we can’t do autotuning without that, please remove it. * Patch needs a proper test case added. We shouldn’t just have a DEBUG1 statement in there for testing - very ugly. Please rewrite tests using functions that show how many changes have occurred during the current statement/transaction. * Parameter coding was specific to that section of code. We need a capability to parse that parameter from other locations, just as we do with all other reloptions. It looks like it was coded that way because of difficulties with code placement. Please solve this with generic code, not just code that solves only the current issue. I’d personally be happier without any option at all, but if y’all want it, then please add the code in the right place. * The new coding for heap_update() uses the phrase “warm”, which is already taken by another patch. Please avoid confusing things by re-using the same term for different things. The use of these technical terms like projection and surjective doesn’t seem to add anything useful to the patch * Rename parameter to recheck_on_update * Remove random whitespace changes in the patch So at this stage the concept is good and works, but the patch is only just at the prototype stage and nowhere near committable. I hope you can correct these issues and if you do I will be happy to review and perhaps commit. Thanks -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Surjective functional indexes
On 30.11.2017 05:02, Michael Paquier wrote: On Wed, Sep 27, 2017 at 4:07 PM, Simon Riggs wrote: On 15 September 2017 at 16:34, Konstantin Knizhnik wrote: Attached please find yet another version of the patch. Thanks. I'm reviewing it. Two months later, this patch is still waiting for a review (you are listed as well as a reviewer of this patch). The documentation of the patch has conflicts, please provide a rebased version. I am moving this patch to next CF with waiting on author as status. Attached please find new patch with resolved documentation conflict. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml index 0255375..7ec6a0d 100644 --- a/doc/src/sgml/ref/create_index.sgml +++ b/doc/src/sgml/ref/create_index.sgml @@ -294,8 +294,33 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] The optional WITH clause specifies storage parameters for the index. Each index method has its own set of allowed -storage parameters. The B-tree, hash, GiST and SP-GiST index methods all -accept this parameter: +storage parameters. All indexes accept the following parameter: + + + + +projection + + + Functional index is based on on projection function: function which extract subset of its argument. + In mathematic such functions are called non-injective. For injective function if any attribute used in the indexed + expression is changed, then value of index expression is also changed. So to check that index is affected by the + update, it is enough to check the set of changed fields. By default this parameters is assigned true value and function is considered + as non-injective. + In this case change of any of indexed key doesn't mean that value of the function is changed. For example, for + the expression expression(bookinfo->>'isbn') defined + for column of JSON type is changed only when ISBN is changed, which rarely happen. The same is true for most + functional indexes. For non-injective functions, Postgres compares values of indexed expression for old and updated tuple and updates + index only when function results are different. It allows to eliminate index update and use HOT update. + But there are extra evaluations of the functions. So if function is expensive or probability that change of indexed column will not effect + the function value is small, then marking index as projection may increase update speed. + + + + + + + The B-tree, hash, GiST and SP-GiST index methods all accept this parameter: diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c index aa9c0f1..0dae5f1 100644 --- a/src/backend/access/common/reloptions.c +++ b/src/backend/access/common/reloptions.c @@ -131,6 +131,15 @@ static relopt_bool boolRelOpts[] = }, { { + "projection", + "Evaluate functional index expression on update to check if its values is changed", + RELOPT_KIND_INDEX, + AccessExclusiveLock + }, + true + }, + { + { "security_barrier", "View acts as a row security barrier", RELOPT_KIND_VIEW, @@ -1311,7 +1320,7 @@ fillRelOptions(void *rdopts, Size basesize, break; } } - if (validate && !found) + if (validate && !found && options[i].gen->kinds != RELOPT_KIND_INDEX) elog(ERROR, "reloption \"%s\" not found in parse table", options[i].gen->name); } diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 3acef27..7f356e7 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -56,6 +56,7 @@ #include "access/xlogutils.h" #include "catalog/catalog.h" #include "catalog/namespace.h" +#include "catalog/index.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -74,7 +75,9 @@ #include "utils/snapmgr.h" #include "utils/syscache.h" #include "utils/tqual.h" - +#include "utils/memutils.h" +#include "nodes/execnodes.h" +#include "executor/executor.h" /* GUC variable */ bool synchronize_seqscans = true; @@ -126,6 +129,7 @@ static bool ConditionalMultiXactIdWait(MultiXactId multi, MultiXactStatus status static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup); static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_modified, bool *copy); +static bool ProjectionIsNotChanged(Relation relation, HeapTuple oldtup, HeapTuple newtup); /* @@ -3485,6 +3489,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup, HTSU_Result result; TransactionId xid = GetCurrentTransactionId(); Bitmapset *hot_attrs; + Bitmapset *warm_attrs; Bitmapset *key_attrs; Bitmapset *id_attrs; Bitmapset *interesting_attrs; @@ -3548,12 +3553,11 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup
Re: [HACKERS] Surjective functional indexes
On Wed, Sep 27, 2017 at 4:07 PM, Simon Riggs wrote: > On 15 September 2017 at 16:34, Konstantin Knizhnik > wrote: > >> Attached please find yet another version of the patch. > > Thanks. I'm reviewing it. Two months later, this patch is still waiting for a review (you are listed as well as a reviewer of this patch). The documentation of the patch has conflicts, please provide a rebased version. I am moving this patch to next CF with waiting on author as status. -- Michael