Re: Eliminate redundant tuple visibility check in vacuum
On Sat, Sep 30, 2023 at 1:02 PM Andres Freund wrote:
> The only thought I have is that it might be worth amending the comment in
> lazy_scan_prune() to mention that such a tuple won't need to be frozen,
> because it was visible to another session when vacuum started.

I revised the comment a bit, incorporating that language, and committed.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Eliminate redundant tuple visibility check in vacuum
Hi,

On 2023-09-28 11:25:04 -0400, Robert Haas wrote:
> I went ahead and committed 0001. If Andres still wants to push for
> more renaming there, that can be a follow-up patch.

Agreed.

> And we can see if he or anyone else has any comments on this new version of
> 0002. To me we're down into the level of details that probably don't matter
> very much one way or the other, but others may disagree.

The only thought I have is that it might be worth amending the comment in
lazy_scan_prune() to mention that such a tuple won't need to be frozen,
because it was visible to another session when vacuum started.

Greetings,

Andres Freund
Re: Eliminate redundant tuple visibility check in vacuum
On Thu, Sep 28, 2023 at 8:46 AM Melanie Plageman wrote:
> Once I started writing the function comment, however, I felt a bit
> awkward. In order to make the function available to both pruneheap.c
> and vacuumlazy.c, I had to put it in a header file. Writing a
> function, available to anyone including heapam.h, which takes an int
> and returns an HTSV_Result feels a bit odd. Do we want it to be common
> practice to use an int value outside the valid enum range to store
> "status not yet computed" for HTSV_Results?

I noticed the awkwardness of that return convention when I reviewed the
first version of this patch, but I decided it wasn't worth spending time
discussing. To avoid it, we'd either need to add a new HTSV_Result that
is only used here, or add a new type HTSV_Result_With_An_Extra_Value and
translate between the two, or pass back a boolean + an enum instead of
an array of int8. And all of those seem to me to suck -- the first two
are messy and the third would make the return value much wider. So, no,
I don't really like this, but also, what would actually be any better?

Also, IMV at least, it's more of an issue of it being sort of ugly than
of anything becoming common practice, because how many callers of
heap_page_prune() are there ever going to be? AFAIK, we've only ever had
two since forever, and even if we grow one or two more at some point,
that's still not that many.

I went ahead and committed 0001. If Andres still wants to push for more
renaming there, that can be a follow-up patch. And we can see if he or
anyone else has any comments on this new version of 0002. To me we're
down into the level of details that probably don't matter very much one
way or the other, but others may disagree.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Eliminate redundant tuple visibility check in vacuum
On Thu, Sep 21, 2023 at 3:53 PM Robert Haas wrote:
> On Wed, Sep 13, 2023 at 1:29 PM Andres Freund wrote:
> > >  /*
> > >   * The criteria for counting a tuple as live in this block need to
> > > @@ -1682,7 +1664,7 @@ retry:
> > >   * (Cases where we bypass index vacuuming will violate this optimistic
> > >   * assumption, but the overall impact of that should be negligible.)
> > >   */
> > > - switch (res)
> > > + switch ((HTSV_Result) presult.htsv[offnum])
> > >   {
> > >     case HEAPTUPLE_LIVE:
> >
> > I think we should assert that we have a valid HTSV_Result here, i.e. not
> > -1. You could wrap the cast and Assert into an inline function as well.
>
> This isn't a bad idea, although I don't find it completely necessary either.

Attached v5 does this. Even though a value of -1 would hit the default
switch case and error out, I can see the argument for this validation --
all other places switching on an HTSV_Result are doing so on a value
which was always an HTSV_Result.

Once I started writing the function comment, however, I felt a bit
awkward. In order to make the function available to both pruneheap.c and
vacuumlazy.c, I had to put it in a header file. Writing a function,
available to anyone including heapam.h, which takes an int and returns
an HTSV_Result feels a bit odd. Do we want it to be common practice to
use an int value outside the valid enum range to store "status not yet
computed" for HTSV_Results?

Anyway, on a tactical note, I added the inline function to heapam.h
below the PruneResult definition since it is fairly tightly coupled to
the htsv array in PruneResult. All of the function prototypes are under
a comment that says "function prototypes for heap access method" --
which didn't feel like an accurate description of this function. I
wonder if it makes sense to have pruneheap.c include vacuum.h and move
pruning-specific stuff like this helper and PruneResult over there?
I can't remember why I didn't do this before, but maybe there is a
reason not to? I also wasn't sure if I needed to forward declare the
inline function or not.

Oh, and, one more note. I've dropped the former patch 0001 which changed
the function comment about off_loc above heap_page_prune(). I have plans
to write a separate patch adding an error context callback for HOT
pruning with the offset number and would include such a change in that
patch.

- Melanie

From aa1ec4fa1a689f9c6328d02fd13af1a42ecb0396 Mon Sep 17 00:00:00 2001
From: Melanie Plageman
Date: Wed, 6 Sep 2023 16:54:41 -0400
Subject: [PATCH v5 2/2] Reuse heap_page_prune() tuple visibility statuses

heap_page_prune() obtains the HTSV_Result (tuple visibility status)
returned from HeapTupleSatisfiesVacuum() for every tuple on the page and
stores them in an array. By making this array available to
heap_page_prune()'s caller lazy_scan_prune(), we can avoid an additional
call to HeapTupleSatisfiesVacuum() when freezing the tuples and
recording LP_DEAD items for vacuum. This saves resources and eliminates
the possibility that vacuuming corrupted data results in a hang due to
endless retry looping.

This replaces the retry mechanism introduced in 8523492d4 to handle
cases in which a tuple's inserting transaction aborted between the
visibility check in heap_page_prune() and lazy_scan_prune()'s call to
HeapTupleSatisfiesVacuum() -- rendering it dead but without a dead line
pointer. We can instead reuse the tuple's original visibility status,
circumventing any disagreements.
Discussion: https://postgr.es/m/CAAKRu_br124qsGJieuYA0nGjywEukhK1dKBfRdby_4yY3E9SXA%40mail.gmail.com
---
 src/backend/access/heap/pruneheap.c  | 36 +++-
 src/backend/access/heap/vacuumlazy.c | 42
 src/include/access/heapam.h          | 25 +
 3 files changed, 54 insertions(+), 49 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index d5892a2db4..c5f1abd95a 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -53,16 +53,6 @@ typedef struct
 	 * 1. Otherwise every access would need to subtract 1.
 	 */
 	bool		marked[MaxHeapTuplesPerPage + 1];
-
-	/*
-	 * Tuple visibility is only computed once for each tuple, for correctness
-	 * and efficiency reasons; see comment in heap_page_prune() for details.
-	 * This is of type int8[], instead of HTSV_Result[], so we can use -1 to
-	 * indicate no visibility has been computed, e.g. for LP_DEAD items.
-	 *
-	 * Same indexing as ->marked.
-	 */
-	int8		htsv[MaxHeapTuplesPerPage + 1];
 } PruneState;

 /* Local functions */
@@ -71,6 +61,7 @@ static HTSV_Result heap_prune_satisfies_vacuum(PruneState *prstate,
 											   Buffer buffer);
 static int	heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
+							 int8 *htsv,
 							 PruneState *p
Re: Eliminate redundant tuple visibility check in vacuum
On Thu, Sep 21, 2023 at 3:53 PM Robert Haas wrote:
> > > static int	heap_prune_chain(Buffer buffer,
> > >								 OffsetNumber rootoffnum,
> > > +							 int8 *htsv,
> > >								 PruneState *prstate);
> >
> > Hm, do we really want to pass this explicitly to a bunch of functions? Seems
> > like it might be better to either pass the PruneResult around or to have a
> > pointer in PruneState?
>
> As far as I can see, 0002 adds it to one function (heap_page_prune) and
> 0003 adds it to one more (heap_prune_chain). That's not much of a bunch.

I didn't read this carefully enough. Actually, heap_prune_chain() is the
*only* function that gets int8 *htsv as an argument. I don't understand
how that's a bunch ... unless there are later patches not shown here
that you're worried about. What happens in 0002 is a function getting
PruneResult * as an argument, not int8 *htsv.

Honestly I think 0002 and 0003 are ready to commit, if you're not too
opposed to them, or if we can find some relatively small changes that
would address your objections.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Eliminate redundant tuple visibility check in vacuum
[ sorry for the delay getting back to this ]

On Wed, Sep 13, 2023 at 1:29 PM Andres Freund wrote:
> I think it might be worth making the names a bit less ambiguous than they
> were. It's a bit odd that one has "new" in the name, the other doesn't,
> despite both being about newly marked things. And "deleted" seems somewhat
> ambiguous, it could also be understood as marking things LP_DEAD. Maybe
> nnewunused?

I like it better the way Melanie did it. The current name may not be for
the best, but that can be changed some other time, in a separate patch,
if someone likes. For now, changing the name seems like a can of worms
we don't need to open; the existing names have precedent on their side
if nothing else.

> > static int	heap_prune_chain(Buffer buffer,
> >								 OffsetNumber rootoffnum,
> > +							 int8 *htsv,
> >								 PruneState *prstate);
>
> Hm, do we really want to pass this explicitly to a bunch of functions? Seems
> like it might be better to either pass the PruneResult around or to have a
> pointer in PruneState?

As far as I can see, 0002 adds it to one function (heap_page_prune) and
0003 adds it to one more (heap_prune_chain). That's not much of a bunch.

> >  /*
> >   * The criteria for counting a tuple as live in this block need to
> > @@ -1682,7 +1664,7 @@ retry:
> >   * (Cases where we bypass index vacuuming will violate this optimistic
> >   * assumption, but the overall impact of that should be negligible.)
> >   */
> > - switch (res)
> > + switch ((HTSV_Result) presult.htsv[offnum])
> >   {
> >     case HEAPTUPLE_LIVE:
>
> I think we should assert that we have a valid HTSV_Result here, i.e. not
> -1. You could wrap the cast and Assert into an inline function as well.

This isn't a bad idea, although I don't find it completely necessary
either.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Eliminate redundant tuple visibility check in vacuum
Hi,

On 2023-09-07 18:23:22 -0400, Melanie Plageman wrote:
> From e986940e546171d1f1d06f62a101d695a8481e7a Mon Sep 17 00:00:00 2001
> From: Melanie Plageman
> Date: Wed, 6 Sep 2023 14:57:20 -0400
> Subject: [PATCH v4 2/3] Move heap_page_prune output parameters into struct
>
> Add PruneResult, a structure containing the output parameters and result
> of heap_page_prune(). Reorganizing the results of heap_page_prune() into
> a struct simplifies the function signature and provides a location for
> future commits to store additional output parameters.
>
> Discussion: https://postgr.es/m/CAAKRu_br124qsGJieuYA0nGjywEukhK1dKBfRdby_4yY3E9SXA%40mail.gmail.com
> ---
>  src/backend/access/heap/pruneheap.c  | 33 +---
>  src/backend/access/heap/vacuumlazy.c | 17 +-
>  src/include/access/heapam.h          | 13 +--
>  src/tools/pgindent/typedefs.list     |  1 +
>  4 files changed, 33 insertions(+), 31 deletions(-)
>
> diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
> index 392b54f093..5ac286e152 100644
> --- a/src/backend/access/heap/pruneheap.c
> +++ b/src/backend/access/heap/pruneheap.c
> @@ -155,15 +155,13 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
>  	 */
>  	if (PageIsFull(page) || PageGetHeapFreeSpace(page) < minfree)
>  	{
> -		int			ndeleted,
> -					nnewlpdead;
> +		PruneResult presult;
>
> -		ndeleted = heap_page_prune(relation, buffer, vistest,
> -								   &nnewlpdead, NULL);
> +		heap_page_prune(relation, buffer, vistest, &presult, NULL);
>
>  		/*
>  		 * Report the number of tuples reclaimed to pgstats. This is
> -		 * ndeleted minus the number of newly-LP_DEAD-set items.
> +		 * presult.ndeleted minus the number of newly-LP_DEAD-set items.
>  		 *
>  		 * We derive the number of dead tuples like this to avoid totally
>  		 * forgetting about items that were set to LP_DEAD, since they
> @@ -175,9 +173,9 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
>  		 * tracks ndeleted, since it will set the same LP_DEAD items to
>  		 * LP_UNUSED separately.
>  		 */
> -		if (ndeleted > nnewlpdead)
> +		if (presult.ndeleted > presult.nnewlpdead)
>  			pgstat_update_heap_dead_tuples(relation,
> -										   ndeleted - nnewlpdead);
> +										   presult.ndeleted - presult.nnewlpdead);
>  	}
>
>  	/* And release buffer lock */
> @@ -204,24 +202,22 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
>   * (see heap_prune_satisfies_vacuum and
>   * HeapTupleSatisfiesVacuum).
>   *
> - * Sets *nnewlpdead for caller, indicating the number of items that were
> - * newly set LP_DEAD during prune operation.
> + * presult contains output parameters needed by callers such as the number of
> + * tuples removed and the number of line pointers newly marked LP_DEAD.
> + * heap_page_prune() is responsible for initializing it.
>   *
>   * off_loc is the current offset into the line pointer array while pruning.
>   * This is used by vacuum to populate the error context message. On-access
>   * pruning has no such callback mechanism for populating the error context, so
>   * it passes NULL. When provided by the caller, off_loc is set exclusively by
>   * heap_page_prune().
> - *
> - * Returns the number of tuples deleted from the page during this call.
>   */
> -int
> +void
>  heap_page_prune(Relation relation, Buffer buffer,
>  				GlobalVisState *vistest,
> -				int *nnewlpdead,
> +				PruneResult *presult,
>  				OffsetNumber *off_loc)
>  {
> -	int			ndeleted = 0;
>  	Page		page = BufferGetPage(buffer);
>  	BlockNumber blockno = BufferGetBlockNumber(buffer);
>  	OffsetNumber offnum,
> @@ -247,6 +243,9 @@ heap_page_prune(Relation relation, Buffer buffer,
>  	prstate.nredirected = prstate.ndead = prstate.nunused = 0;
>  	memset(prstate.marked, 0, sizeof(prstate.marked));
>
> +	presult->ndeleted = 0;
> +	presult->nnewlpdead = 0;
> +
>  	maxoff = PageGetMaxOffsetNumber(page);
>  	tup.t_tableOid = RelationGetRelid(prstate.rel);
>
> @@ -321,7 +320,7 @@ heap_page_prune(Relation relation, Buffer buffer,
>  			continue;
>
>  		/* Process
Re: Eliminate redundant tuple visibility check in vacuum
On Fri, Sep 8, 2023 at 11:06 AM Robert Haas wrote:
> On Thu, Sep 7, 2023 at 6:23 PM Melanie Plageman wrote:
> > I mostly wanted to remove the NULL checks because I found them
> > distracting (so, a stylistic complaint). However, upon further
> > reflection, I actually think it is better if heap_page_prune_opt()
> > passes NULL. heap_page_prune() has no error callback mechanism that
> > could use it, and passing a valid value implies otherwise. Also, as
> > you said, off_loc will always be InvalidOffsetNumber when
> > heap_page_prune() returns normally, so having heap_page_prune_opt()
> > pass a dummy value might actually be more confusing for future
> > programmers.
>
> I'll look at the new patches more next week, but I wanted to comment
> on this point. I think this is kind of six of one, half a dozen of the
> other. It's not that hard to spot a variable that's only used in a
> function call and never initialized beforehand or used afterward, and
> if someone really feels the need to hammer home the point, they could
> always name it dummy or dummy_loc or whatever. So this point doesn't
> really carry a lot of weight with me. I actually think that the
> proposed change is probably better, but it seems like such a minor
> improvement that I get slightly reluctant to make it, only because
> churning the source code for arguable points sometimes annoys other
> developers.
>
> But I also had the thought that maybe it wouldn't be such a terrible
> idea if heap_page_prune_opt() actually used off_loc for some error
> reporting goodness. I mean, if HOT pruning fails, and we don't get the
> detail as to which tuple caused the failure, we can always run VACUUM
> and it will give us that information, assuming of course that the same
> failure happens again. But is there any reason why HOT pruning
> shouldn't include that error detail? If it did, then off_loc would be
> passed by all callers, at which point we surely would want to get rid
> of the branches.
This is a good idea. I will work on a separate patch set to add an error context callback for on-access HOT pruning. - Melanie
Re: Eliminate redundant tuple visibility check in vacuum
On Thu, Sep 7, 2023 at 6:23 PM Melanie Plageman wrote:
> I mostly wanted to remove the NULL checks because I found them
> distracting (so, a stylistic complaint). However, upon further
> reflection, I actually think it is better if heap_page_prune_opt()
> passes NULL. heap_page_prune() has no error callback mechanism that
> could use it, and passing a valid value implies otherwise. Also, as
> you said, off_loc will always be InvalidOffsetNumber when
> heap_page_prune() returns normally, so having heap_page_prune_opt()
> pass a dummy value might actually be more confusing for future
> programmers.

I'll look at the new patches more next week, but I wanted to comment on
this point. I think this is kind of six of one, half a dozen of the
other. It's not that hard to spot a variable that's only used in a
function call and never initialized beforehand or used afterward, and if
someone really feels the need to hammer home the point, they could
always name it dummy or dummy_loc or whatever. So this point doesn't
really carry a lot of weight with me. I actually think that the proposed
change is probably better, but it seems like such a minor improvement
that I get slightly reluctant to make it, only because churning the
source code for arguable points sometimes annoys other developers.

But I also had the thought that maybe it wouldn't be such a terrible
idea if heap_page_prune_opt() actually used off_loc for some error
reporting goodness. I mean, if HOT pruning fails, and we don't get the
detail as to which tuple caused the failure, we can always run VACUUM
and it will give us that information, assuming of course that the same
failure happens again. But is there any reason why HOT pruning shouldn't
include that error detail? If it did, then off_loc would be passed by
all callers, at which point we surely would want to get rid of the
branches.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Eliminate redundant tuple visibility check in vacuum
On Thu, Sep 7, 2023 at 3:30 PM Robert Haas wrote:
> On Thu, Sep 7, 2023 at 3:10 PM Melanie Plageman wrote:
> > I can fix it by changing the type of PruneResult->off_loc to be an
> > OffsetNumber pointer. This does mean that PruneResult will be
> > initialized partially by heap_page_prune() callers. I wonder if you
> > think that undermines the case for making a new struct.
>
> I think that it undermines the case for including that particular
> argument in the struct. It's not really a Prune*Result* if the caller
> initializes it in part. It seems fairly reasonable to still have a
> PruneResult struct for the other stuff, though, at least to me. How do
> you see it?

Yes, I think off_loc probably didn't belong in PruneResult to begin
with. It is inherently not a result of pruning, since it will only be
used in the event that pruning doesn't complete (it errors out). In the
attached v4 patch set, I have both PruneResult and off_loc as parameters
to heap_page_prune(). I have also added a separate commit which adds
comments both above heap_page_prune()'s call site in lazy_scan_prune()
and in the heap_page_prune() function header clarifying the various
points we discussed.

> > I still want to eliminate the NULL check of off_loc in
> > heap_page_prune() by making it a required parameter. Even though
> > on-access pruning does not have an error callback mechanism which uses
> > the offset, it seems better to have a pointless local variable in
> > heap_page_prune_opt() than to do a NULL check for every tuple pruned.
>
> It doesn't seem important to me unless it improves performance. If
> it's just stylistic, I don't object, but I also don't really see a
> reason to care.

I did some performance testing but, as expected, I couldn't concoct a
scenario where the overhead was noticeable in a profile. So much else is
happening in that code, the NULL check basically doesn't matter (even
though it isn't optimized out).

I mostly wanted to remove the NULL checks because I found them
distracting (so, a stylistic complaint). However, upon further
reflection, I actually think it is better if heap_page_prune_opt()
passes NULL. heap_page_prune() has no error callback mechanism that
could use it, and passing a valid value implies otherwise. Also, as you
said, off_loc will always be InvalidOffsetNumber when heap_page_prune()
returns normally, so having heap_page_prune_opt() pass a dummy value
might actually be more confusing for future programmers.

> > > I haven't run the regression suite with 0001 applied. I'm guessing
> > > that you did, and that they passed, which if true means that we don't
> > > have a test for this, which is a shame, although writing such a test
> > > might be a bit tricky. If there is a test case for this and you didn't
> > > run it, woops.
> >
> > There is no test coverage for the vacuum error callback context
> > currently (tests passed for me). I looked into how we might add such a
> > test. First, I investigated what kind of errors might occur during
> > heap_prune_satisfies_vacuum(). Some of the multixact functions called
> > by HeapTupleSatisfiesVacuumHorizon() could error out -- for example
> > GetMultiXactIdMembers(). It seems difficult to trigger the errors in
> > GetMultiXactIdMembers(), as we would have to cause wraparound. It
> > would be even more difficult to ensure that we hit those specific
> > errors from a call stack containing heap_prune_satisfies_vacuum(). As
> > such, I'm not sure I can think of a way to protect future developers
> > from repeating my mistake--apart from improving the comment like you
> > mentioned.
>
> 004_verify_heapam.pl has some tests that intentionally corrupt pages
> and then use pg_amcheck to detect the corruption. Such an approach
> could probably also be used here. But it's a pain to get such tests
> right, because any change to the page format due to endianness,
> different block size, or whatever can make carefully-written tests go
> boom.

Cool! I hadn't examined how these tests work until now. I took a stab at
writing a test in the existing 004_verify_heapam.pl. The simplest thing
would be if we could just vacuum the corrupted table ("test") after
running pg_amcheck and compare the error context to our expectation. I
found that this didn't work, though. In an assert build, vacuum trips an
assert before it hits an error while vacuuming a corrupted tuple in the
"test" table.

There might be a way of modifying the existing test code to avoid this,
but I tried the next easiest thing -- corrupting a tuple in the other
existing table in the file, "junk". This is possible to do, but we have
to repeat a lot of the setup code done for the "test" table to get the
line pointer array and loop through and corrupt a tuple. In order to do
this well, I would want to refactor some of the boilerplate into a
function. There are other fiddly bits as well that I needed to consider.
I think a test like this could be useful coverage of some of the
possible errors th
Re: Eliminate redundant tuple visibility check in vacuum
On Thu, Sep 7, 2023 at 3:10 PM Melanie Plageman wrote:
> I can fix it by changing the type of PruneResult->off_loc to be an
> OffsetNumber pointer. This does mean that PruneResult will be
> initialized partially by heap_page_prune() callers. I wonder if you
> think that undermines the case for making a new struct.

I think that it undermines the case for including that particular
argument in the struct. It's not really a Prune*Result* if the caller
initializes it in part. It seems fairly reasonable to still have a
PruneResult struct for the other stuff, though, at least to me. How do
you see it?

(One could also argue that this is a somewhat more byzantine way of
doing error reporting than would be desirable, but fixing that problem
doesn't seem too straightforward, so perhaps it's prudent to leave it
well enough alone.)

> I still want to eliminate the NULL check of off_loc in
> heap_page_prune() by making it a required parameter. Even though
> on-access pruning does not have an error callback mechanism which uses
> the offset, it seems better to have a pointless local variable in
> heap_page_prune_opt() than to do a NULL check for every tuple pruned.

It doesn't seem important to me unless it improves performance. If it's
just stylistic, I don't object, but I also don't really see a reason to
care.

> > I haven't run the regression suite with 0001 applied. I'm guessing
> > that you did, and that they passed, which if true means that we don't
> > have a test for this, which is a shame, although writing such a test
> > might be a bit tricky. If there is a test case for this and you didn't
> > run it, woops.
>
> There is no test coverage for the vacuum error callback context
> currently (tests passed for me). I looked into how we might add such a
> test. First, I investigated what kind of errors might occur during
> heap_prune_satisfies_vacuum(). Some of the multixact functions called
> by HeapTupleSatisfiesVacuumHorizon() could error out -- for example
> GetMultiXactIdMembers(). It seems difficult to trigger the errors in
> GetMultiXactIdMembers(), as we would have to cause wraparound. It
> would be even more difficult to ensure that we hit those specific
> errors from a call stack containing heap_prune_satisfies_vacuum(). As
> such, I'm not sure I can think of a way to protect future developers
> from repeating my mistake--apart from improving the comment like you
> mentioned.

004_verify_heapam.pl has some tests that intentionally corrupt pages and
then use pg_amcheck to detect the corruption. Such an approach could
probably also be used here. But it's a pain to get such tests right,
because any change to the page format due to endianness, different block
size, or whatever can make carefully-written tests go boom.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Eliminate redundant tuple visibility check in vacuum
On Thu, Sep 7, 2023 at 1:37 PM Robert Haas wrote:
> On Wed, Sep 6, 2023 at 5:21 PM Melanie Plageman wrote:
> > Yeah, I think this is a fair concern. I have addressed it in the
> > attached patches.
> >
> > I thought a lot about whether or not adding a PruneResult which
> > contains only the output parameters and result of heap_page_prune() is
> > annoying since we have so many other *Prune* data structures. I
> > decided it's not annoying. In some cases, four outputs don't merit a
> > new structure. In this case, I think it declutters the code a bit --
> > independent of any other patches I may be writing :)
>
> I took a look at 0001 and I think that it's incorrect. In the existing
> code, *off_loc gets updated before each call to
> heap_prune_satisfies_vacuum(). This means that if an error occurs in
> heap_prune_satisfies_vacuum(), *off_loc will as of that moment contain
> the relevant offset number. In your version, the relevant offset
> number will only be stored in some local structure to which the caller
> doesn't yet have access. The difference is meaningful. lazy_scan_prune
> passes off_loc as vacrel->offnum, which means that if an error
> happens, vacrel->offnum will have the right value, and so when the
> error context callback installed by heap_vacuum_rel() fires, namely
> vacuum_error_callback(), it can look at vacrel->offnum and know where
> the error happened. With your patch, I think that would no longer
> work.

You are right. That is a major problem. Thank you for finding that. I
was able to confirm the breakage by patching in an error to
heap_page_prune() after we have set off_loc and confirming that the
error context has the offset in master and is missing it with my patch
applied.

I can fix it by changing the type of PruneResult->off_loc to be an
OffsetNumber pointer. This does mean that PruneResult will be
initialized partially by heap_page_prune() callers. I wonder if you
think that undermines the case for making a new struct.

I still want to eliminate the NULL check of off_loc in heap_page_prune()
by making it a required parameter. Even though on-access pruning does
not have an error callback mechanism which uses the offset, it seems
better to have a pointless local variable in heap_page_prune_opt() than
to do a NULL check for every tuple pruned.

> I haven't run the regression suite with 0001 applied. I'm guessing
> that you did, and that they passed, which if true means that we don't
> have a test for this, which is a shame, although writing such a test
> might be a bit tricky. If there is a test case for this and you didn't
> run it, woops.

There is no test coverage for the vacuum error callback context
currently (tests passed for me). I looked into how we might add such a
test. First, I investigated what kind of errors might occur during
heap_prune_satisfies_vacuum(). Some of the multixact functions called by
HeapTupleSatisfiesVacuumHorizon() could error out -- for example
GetMultiXactIdMembers(). It seems difficult to trigger the errors in
GetMultiXactIdMembers(), as we would have to cause wraparound. It would
be even more difficult to ensure that we hit those specific errors from
a call stack containing heap_prune_satisfies_vacuum(). As such, I'm not
sure I can think of a way to protect future developers from repeating my
mistake--apart from improving the comment like you mentioned.

- Melanie
Re: Eliminate redundant tuple visibility check in vacuum
On Wed, Sep 6, 2023 at 5:21 PM Melanie Plageman wrote:
> Yeah, I think this is a fair concern. I have addressed it in the
> attached patches.
>
> I thought a lot about whether or not adding a PruneResult which
> contains only the output parameters and result of heap_page_prune() is
> annoying since we have so many other *Prune* data structures. I
> decided it's not annoying. In some cases, four outputs don't merit a
> new structure. In this case, I think it declutters the code a bit --
> independent of any other patches I may be writing :)

I took a look at 0001 and I think that it's incorrect. In the existing
code, *off_loc gets updated before each call to
heap_prune_satisfies_vacuum(). This means that if an error occurs in
heap_prune_satisfies_vacuum(), *off_loc will as of that moment contain
the relevant offset number. In your version, the relevant offset number
will only be stored in some local structure to which the caller doesn't
yet have access. The difference is meaningful. lazy_scan_prune passes
off_loc as vacrel->offnum, which means that if an error happens,
vacrel->offnum will have the right value, and so when the error context
callback installed by heap_vacuum_rel() fires, namely
vacuum_error_callback(), it can look at vacrel->offnum and know where
the error happened. With your patch, I think that would no longer work.

I haven't run the regression suite with 0001 applied. I'm guessing that
you did, and that they passed, which if true means that we don't have a
test for this, which is a shame, although writing such a test might be a
bit tricky. If there is a test case for this and you didn't run it,
woops.

This is also why I think it's *extremely* important for the header
comment of a function that takes pointer parameters to document the
semantics of those pointers. Normally they are in parameters or out
parameters or in-out parameters, but here it's something even more
complicated. The existing header comment says "off_loc is the offset
location required by the caller to use in error callback," which I
didn't really understand until I actually looked at what the code is
doing, so I consider that somebody could've done a better job writing
this comment, but in theory you could've also noticed that, at least
AFAICS, there's no way for the function to return with *off_loc set to
anything other than InvalidOffsetNumber. That means that the statements
which set *off_loc to other values must have some other purpose.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Eliminate redundant tuple visibility check in vacuum
On Wed, Sep 6, 2023 at 1:04 PM Robert Haas wrote: > > On Thu, Aug 31, 2023 at 6:29 PM Melanie Plageman > wrote: > > I have changed this. > > I spent a bunch of time today looking at this, thinking maybe I could > commit it. But then I got cold feet. > > With these patches applied, PruneResult ends up being declared in > heapam.h, with a comment that says /* State returned from pruning */. > But that comment isn't accurate. The two new members that get added to > the structure by 0002, namely nnewlpdead and htsv, are in fact state > that is returned from pruning. But the other 5 members aren't. They're > just initialized to constant values by pruning and then filled in for > real by the vacuum logic. That's extremely weird. It would be fine if > heap_page_prune() just grew a new output argument that only returned > the HTSV results, or perhaps it could make sense to bundle any > existing out parameters together into a struct and then add new things > to that struct instead of adding even more parameters to the function > itself. But there doesn't seem to be any good reason to muddle > together the new output parameters for heap_page_prune() with a bunch > of state that is currently internal to vacuumlazy.c. > > I realize that the shape of the patches probably stems from the fact > that they started out life as part of a bigger patch set. But to be > committed independently, they need to be shaped in a way that makes > sense independently, and I don't think this qualifies. On the plus > side, it seems to me that it's probably not that much work to fix this > issue and that the result would likely be a smaller patch than what > you have now, which is something. Yeah, I think this is a fair concern. I have addressed it in the attached patches. I thought a lot about whether or not adding a PruneResult which contains only the output parameters and result of heap_page_prune() is annoying since we have so many other *Prune* data structures. I decided it's not annoying. 
In some cases, four outputs don't merit a new structure. In this case, I think it declutters the code a bit -- independent of any other patches I may be writing :) - Melanie From 11f5d895e4efe7b3b3a7bbc58efbb4d6c40274c0 Mon Sep 17 00:00:00 2001 From: Melanie Plageman Date: Wed, 6 Sep 2023 14:57:20 -0400 Subject: [PATCH v3 1/2] Move heap_page_prune output parameters into struct Add PruneResult, a structure containing the output parameters and result of heap_page_prune(). Reorganizing the results of heap_page_prune() into a struct simplifies the function signature and provides a location for future commits to store additional output parameters. Note that heap_page_prune() no longer NULL checks off_loc, the current tuple's offset in the line pointer array. It will never be NULL now that it is a member of a required output parameter, PruneResult. Discussion: https://postgr.es/m/CAAKRu_br124qsGJieuYA0nGjywEukhK1dKBfRdby_4yY3E9SXA%40mail.gmail.com --- src/backend/access/heap/pruneheap.c | 47 +++- src/backend/access/heap/vacuumlazy.c | 19 +-- src/include/access/heapam.h | 20 ++-- src/tools/pgindent/typedefs.list | 1 + 4 files changed, 45 insertions(+), 42 deletions(-) diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c index 18193efa23..f0847795a7 100644 --- a/src/backend/access/heap/pruneheap.c +++ b/src/backend/access/heap/pruneheap.c @@ -155,15 +155,13 @@ heap_page_prune_opt(Relation relation, Buffer buffer) */ if (PageIsFull(page) || PageGetHeapFreeSpace(page) < minfree) { - int ndeleted, - nnewlpdead; + PruneResult presult; - ndeleted = heap_page_prune(relation, buffer, vistest, - &nnewlpdead, NULL); + heap_page_prune(relation, buffer, vistest, &presult); /* * Report the number of tuples reclaimed to pgstats. This is - * ndeleted minus the number of newly-LP_DEAD-set items. + * presult.ndeleted minus the number of newly-LP_DEAD-set items. 
* * We derive the number of dead tuples like this to avoid totally * forgetting about items that were set to LP_DEAD, since they @@ -175,9 +173,9 @@ heap_page_prune_opt(Relation relation, Buffer buffer) * tracks ndeleted, since it will set the same LP_DEAD items to * LP_UNUSED separately. */ - if (ndeleted > nnewlpdead) + if (presult.ndeleted > presult.nnewlpdead) pgstat_update_heap_dead_tuples(relation, - ndeleted - nnewlpdead); + presult.ndeleted - presult.nnewlpdead); } /* And release buffer lock */ @@ -204,21 +202,15 @@ heap_page_prune_opt(Relation relation, Buffer buffer) * (see heap_prune_satisfies_vacuum and * HeapTupleSatisfiesVacuum). * - * Sets *nnewlpdead for caller, indicating the number of items that were - * newly set LP_DEAD during prune operation. - * - * off_loc is the offset location required by the caller to use in error - * callback. - * - * Returns the number o
Re: Eliminate redundant tuple visibility check in vacuum
On Thu, Aug 31, 2023 at 6:29 PM Melanie Plageman wrote: > I have changed this. I spent a bunch of time today looking at this, thinking maybe I could commit it. But then I got cold feet. With these patches applied, PruneResult ends up being declared in heapam.h, with a comment that says /* State returned from pruning */. But that comment isn't accurate. The two new members that get added to the structure by 0002, namely nnewlpdead and htsv, are in fact state that is returned from pruning. But the other 5 members aren't. They're just initialized to constant values by pruning and then filled in for real by the vacuum logic. That's extremely weird. It would be fine if heap_page_prune() just grew a new output argument that only returned the HTSV results, or perhaps it could make sense to bundle any existing out parameters together into a struct and then add new things to that struct instead of adding even more parameters to the function itself. But there doesn't seem to be any good reason to muddle together the new output parameters for heap_page_prune() with a bunch of state that is currently internal to vacuumlazy.c. I realize that the shape of the patches probably stems from the fact that they started out life as part of a bigger patch set. But to be committed independently, they need to be shaped in a way that makes sense independently, and I don't think this qualifies. On the plus side, it seems to me that it's probably not that much work to fix this issue and that the result would likely be a smaller patch than what you have now, which is something. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Eliminate redundant tuple visibility check in vacuum
Hi,

On 9/1/23 03:25, Peter Geoghegan wrote:
> On Thu, Aug 31, 2023 at 3:35 PM Melanie Plageman wrote:
> > Any inserting transaction which aborts after heap_page_prune()'s
> > visibility check will now be of no concern to lazy_scan_prune(). Since
> > we don't do the visibility check again, we won't find the tuple
> > HEAPTUPLE_DEAD and thus won't have the problem of adding the tuple to
> > the array of tuples for vacuum to reap. This does mean that we won't
> > reap and remove tuples whose inserting transactions abort right after
> > heap_page_prune()'s visibility check. But, this doesn't seem like an
> > issue.
>
> It's definitely not an issue. The loop is essentially a hacky way of
> getting a consistent picture of which tuples should be treated as
> HEAPTUPLE_DEAD, and which tuples need to be left behind (consistent at
> the level of each page and each HOT chain, at least). Making that
> explicit seems strictly better.
>
> > They may not be removed until the next vacuum, but ISTM it is
> > actually worse to pay the cost of re-pruning the whole page just to
> > clean up that one tuple. Maybe if that renders the page all visible
> > and we can mark it as such in the visibility map -- but that seems
> > like a relatively narrow use case.
>
> The chances of actually hitting the retry are microscopic anyway. It
> has nothing to do with making sure that dead tuples from aborted
> tuples get removed for its own sake, or anything. Rather, the retry is
> all about making sure that all TIDs that get removed from indexes can
> only point to LP_DEAD stubs. Prior to Postgres 14, HEAPTUPLE_DEAD
> tuples with storage would very occasionally be left behind, which made
> life difficult in a bunch of other places -- for no good reason.

That makes sense and seems like a much better compromise. Thanks for the explanations.

Please update the comment to document the corner case and how we handle it.

-- David Geier (ServiceNow)
Re: Eliminate redundant tuple visibility check in vacuum
On Thu, Aug 31, 2023 at 3:35 PM Melanie Plageman wrote: > Any inserting transaction which aborts after heap_page_prune()'s > visibility check will now be of no concern to lazy_scan_prune(). Since > we don't do the visibility check again, we won't find the tuple > HEAPTUPLE_DEAD and thus won't have the problem of adding the tuple to > the array of tuples for vacuum to reap. This does mean that we won't > reap and remove tuples whose inserting transactions abort right after > heap_page_prune()'s visibility check. But, this doesn't seem like an > issue. It's definitely not an issue. The loop is essentially a hacky way of getting a consistent picture of which tuples should be treated as HEAPTUPLE_DEAD, and which tuples need to be left behind (consistent at the level of each page and each HOT chain, at least). Making that explicit seems strictly better. > They may not be removed until the next vacuum, but ISTM it is > actually worse to pay the cost of re-pruning the whole page just to > clean up that one tuple. Maybe if that renders the page all visible > and we can mark it as such in the visibility map -- but that seems > like a relatively narrow use case. The chances of actually hitting the retry are microscopic anyway. It has nothing to do with making sure that dead tuples from aborted tuples get removed for its own sake, or anything. Rather, the retry is all about making sure that all TIDs that get removed from indexes can only point to LP_DEAD stubs. Prior to Postgres 14, HEAPTUPLE_DEAD tuples with storage would very occasionally be left behind, which made life difficult in a bunch of other places -- for no good reason. -- Peter Geoghegan
Re: Eliminate redundant tuple visibility check in vacuum
On Thu, Aug 31, 2023 at 5:39 AM David Geier wrote: > Regarding the 2nd patch (disclaimer: I'm not too familiar with that area > of the code): I don't completely understand why the retry loop is not > needed anymore and how you now detect/handle the possible race > condition? It can still happen that an aborting transaction changes the > state of a row after heap_page_prune() looked at that row. Would that > case now not be ignored? Thanks for asking. I've updated the comment in the code and the commit message about this, as it seems important to be clear. Any inserting transaction which aborts after heap_page_prune()'s visibility check will now be of no concern to lazy_scan_prune(). Since we don't do the visibility check again, we won't find the tuple HEAPTUPLE_DEAD and thus won't have the problem of adding the tuple to the array of tuples for vacuum to reap. This does mean that we won't reap and remove tuples whose inserting transactions abort right after heap_page_prune()'s visibility check. But, this doesn't seem like an issue. They may not be removed until the next vacuum, but ISTM it is actually worse to pay the cost of re-pruning the whole page just to clean up that one tuple. Maybe if that renders the page all visible and we can mark it as such in the visibility map -- but that seems like a relatively narrow use case. - Melanie
Re: Eliminate redundant tuple visibility check in vacuum
On Thu, Aug 31, 2023 at 2:03 PM Robert Haas wrote: > > I have a few suggestions: > > - Rather than removing the rather large comment block at the top of > lazy_scan_prune() I'd like to see it rewritten to explain how we now > deal with the problem. I'd suggest leaving the first paragraph ("Prior > to...") just as it is and replace all the words following "The > approach we take now is" with a description of the approach that this > patch takes to the problem. Good idea. I've updated the comment. I also explain why this new approach works in the commit message and reference the commit which added the previous approach. > - I'm not a huge fan of the caller of heap_page_prune() having to know > how to initialize the PruneResult. Can heap_page_prune() itself do > that work, so that presult is an out parameter rather than an in-out > parameter? Or, if not, can it be moved to a new helper function, like > heap_page_prune_init(), rather than having that logic in 2+ places? Ah, yes. Now that it has two callers, and since it is exclusively an output parameter, it is quite ugly to initialize it in both callers. Fixed in the attached. > - int ndeleted, > - nnewlpdead; > - > - ndeleted = heap_page_prune(relation, buffer, vistest, limited_xmin, > -limited_ts, &nnewlpdead, NULL); > + int ndeleted = heap_page_prune(relation, buffer, vistest, limited_xmin, > +limited_ts, &presult, NULL); > > - I don't particularly like merging the declaration with the > assignment unless the call is narrow enough that we don't need a line > break in there, which is not the case here. I have changed this. > - I haven't thoroughly investigated yet, but I wonder if there are any > other places where comments need updating. As a general note, I find > it desirable for a function's header comment to mention whether any > pointer parameters are in parameters, out parameters, or in-out > parameters, and what the contract between caller and callee actually > is. 
I've investigated vacuumlazy.c and pruneheap.c and looked at the commit that added the retry loop (8523492d4e349) to see everywhere it added comments and don't see anywhere else that needs updating. I have updated lazy_scan_prune()'s function header comment to describe the nature of the in-out and output parameters and the contract. - Melanie From 6ea7a2aa5698e08ce67d47efd3dbfb54be30d9cb Mon Sep 17 00:00:00 2001 From: Melanie Plageman Date: Thu, 3 Aug 2023 15:08:58 -0400 Subject: [PATCH v2 1/2] Rebrand LVPagePruneState as PruneResult With this rename, and relocation to heapam.h, we can use PruneResult as an output parameter for on-access pruning as well. It also makes it harder to confuse with PruneState. We'll be adding and removing PruneResult fields in future commits, but this rename and relocation is separate to make the diff easier to understand. Discussion: https://postgr.es/m/CAAKRu_br124qsGJieuYA0nGjywEukhK1dKBfRdby_4yY3E9SXA%40mail.gmail.com --- src/backend/access/heap/vacuumlazy.c | 122 --- src/include/access/heapam.h | 19 + src/tools/pgindent/typedefs.list | 2 +- 3 files changed, 76 insertions(+), 67 deletions(-) diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index 6a41ee635d..5e0a7422bb 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -212,24 +212,6 @@ typedef struct LVRelState int64 missed_dead_tuples; /* # removable, but not removed */ } LVRelState; -/* - * State returned by lazy_scan_prune() - */ -typedef struct LVPagePruneState -{ - bool hastup; /* Page prevents rel truncation? */ - bool has_lpdead_items; /* includes existing LP_DEAD items */ - - /* - * State describes the proper VM bit states to set for the page following - * pruning and freezing. all_visible implies !has_lpdead_items, but don't - * trust all_frozen result unless all_visible is also set to true. - */ - bool all_visible; /* Every item visible to all? 
*/ - bool all_frozen; /* provided all_visible is also true */ - TransactionId visibility_cutoff_xid; /* For recovery conflicts */ -} LVPagePruneState; - /* Struct for saving and restoring vacuum error information. */ typedef struct LVSavedErrInfo { @@ -250,7 +232,7 @@ static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, bool sharelock, Buffer vmbuffer); static void lazy_scan_prune(LVRelState *vacrel, Buffer buf, BlockNumber blkno, Page page, - LVPagePruneState *prunestate); + PruneResult *presult); static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf, BlockNumber blkno, Page page, bool *hastup, bool *recordfreespace); @@ -854,7 +836,7 @@ lazy_scan_heap(LVRelState *vacrel) Buffer buf; Page page; bool all_visible_according_to_vm; - LVPagePruneState prunestate; + PruneResult presult; if (blkno == next_unskippable_block) { @@ -1019,12 +1001,12 @@ lazy_scan_heap(LVRelState *vacrel) * wer
Re: Eliminate redundant tuple visibility check in vacuum
On Mon, Aug 28, 2023 at 7:49 PM Melanie Plageman wrote: > While working on a set of patches to combine the freeze and visibility > map WAL records into the prune record, I wrote the attached patches > reusing the tuple visibility information collected in heap_page_prune() > back in lazy_scan_prune(). > > heap_page_prune() collects the HTSV_Result for every tuple on a page > and saves it in an array used by heap_prune_chain(). If we make that > array available to lazy_scan_prune(), it can use it when collecting > stats for vacuum and determining whether or not to freeze tuples. > This avoids calling HeapTupleSatisfiesVacuum() again on every tuple in > the page. > > It also gets rid of the retry loop in lazy_scan_prune(). In general, I like these patches. I think it's a clever approach, and I don't really see any down side. It should just be straight-up better than what we have now, and if it's not better, it still shouldn't be any worse. I have a few suggestions: - Rather than removing the rather large comment block at the top of lazy_scan_prune() I'd like to see it rewritten to explain how we now deal with the problem. I'd suggest leaving the first paragraph ("Prior to...") just as it is and replace all the words following "The approach we take now is" with a description of the approach that this patch takes to the problem. - I'm not a huge fan of the caller of heap_page_prune() having to know how to initialize the PruneResult. Can heap_page_prune() itself do that work, so that presult is an out parameter rather than an in-out parameter? Or, if not, can it be moved to a new helper function, like heap_page_prune_init(), rather than having that logic in 2+ places? 
-    int            ndeleted,
-                   nnewlpdead;
-
-    ndeleted = heap_page_prune(relation, buffer, vistest, limited_xmin,
-                               limited_ts, &nnewlpdead, NULL);
+    int            ndeleted = heap_page_prune(relation, buffer, vistest, limited_xmin,
+                                              limited_ts, &presult, NULL);

- I don't particularly like merging the declaration with the assignment unless the call is narrow enough that we don't need a line break in there, which is not the case here.

- I haven't thoroughly investigated yet, but I wonder if there are any other places where comments need updating. As a general note, I find it desirable for a function's header comment to mention whether any pointer parameters are in parameters, out parameters, or in-out parameters, and what the contract between caller and callee actually is.

-- Robert Haas EDB: http://www.enterprisedb.com
Re: Eliminate redundant tuple visibility check in vacuum
Hi Melanie,

On 8/31/23 02:59, Melanie Plageman wrote:
> I created a large table and then updated a tuple on every page in the
> relation and vacuumed it. I saw a consistent slight improvement in
> vacuum execution time. I profiled a bit with perf stat as well. The
> difference is relatively small for this kind of example, but I can
> work on a more compelling, realistic example. I think eliminating the
> retry loop is also useful, as I have heard that users have had to
> cancel vacuums which were in this retry loop for too long.
>
> Just to provide a specific test case, if you create a small table like this
>
> create table foo (a int, b int, c int) with(autovacuum_enabled=false);
> insert into foo select i, i, i from generate_series(1, 1000);
>
> And then vacuum it. I find that with my patch applied I see a
> consistent ~9% speedup (averaged across multiple runs).
>
> master: ~533ms
> patch: ~487ms
>
> And in the profile, with my patch applied, you notice less time spent
> in HeapTupleSatisfiesVacuumHorizon()
>
> master:
>   11.83%  postgres  postgres  [.] heap_page_prune
>   11.59%  postgres  postgres  [.] heap_prepare_freeze_tuple
>    8.77%  postgres  postgres  [.] lazy_scan_prune
>    8.01%  postgres  postgres  [.] HeapTupleSatisfiesVacuumHorizon
>    7.79%  postgres  postgres  [.] heap_tuple_should_freeze
>
> patch:
>   13.41%  postgres  postgres  [.] heap_prepare_freeze_tuple
>    9.88%  postgres  postgres  [.] heap_page_prune
>    8.61%  postgres  postgres  [.] lazy_scan_prune
>    7.00%  postgres  postgres  [.] heap_tuple_should_freeze
>    6.43%  postgres  postgres  [.] HeapTupleSatisfiesVacuumHorizon

Thanks a lot for providing additional information and the test case. I tried it on a release build and I also see a 10% speed-up.

I reset the visibility map between VACUUM runs, see:

CREATE EXTENSION pg_visibility;
CREATE TABLE foo (a INT, b INT, c INT) WITH(autovacuum_enabled=FALSE);
INSERT INTO foo SELECT i, i, i from generate_series(1, 1000) i;
VACUUM foo;
SELECT pg_truncate_visibility_map('foo');
VACUUM foo;
SELECT pg_truncate_visibility_map('foo');
VACUUM foo;
...
The first patch, which refactors the code so we can pass the result of the visibility checks to the caller, looks good to me. Regarding the 2nd patch (disclaimer: I'm not too familiar with that area of the code): I don't completely understand why the retry loop is not needed anymore and how you now detect/handle the possible race condition? It can still happen that an aborting transaction changes the state of a row after heap_page_prune() looked at that row. Would that case now not be ignored? -- David Geier (ServiceNow)
Re: Eliminate redundant tuple visibility check in vacuum
> On Tue, Aug 29, 2023 at 5:07 AM David Geier wrote:
> > Could you measure any performance difference?
> >
> > If so could you provide your test case?
>
> I created a large table and then updated a tuple on every page in the
> relation and vacuumed it. I saw a consistent slight improvement in
> vacuum execution time. I profiled a bit with perf stat as well. The
> difference is relatively small for this kind of example, but I can
> work on a more compelling, realistic example. I think eliminating the
> retry loop is also useful, as I have heard that users have had to
> cancel vacuums which were in this retry loop for too long.

Just to provide a specific test case, if you create a small table like this

create table foo (a int, b int, c int) with(autovacuum_enabled=false);
insert into foo select i, i, i from generate_series(1, 1000);

And then vacuum it. I find that with my patch applied I see a consistent ~9% speedup (averaged across multiple runs).

master: ~533ms
patch: ~487ms

And in the profile, with my patch applied, you notice less time spent in HeapTupleSatisfiesVacuumHorizon()

master:
  11.83%  postgres  postgres  [.] heap_page_prune
  11.59%  postgres  postgres  [.] heap_prepare_freeze_tuple
   8.77%  postgres  postgres  [.] lazy_scan_prune
   8.01%  postgres  postgres  [.] HeapTupleSatisfiesVacuumHorizon
   7.79%  postgres  postgres  [.] heap_tuple_should_freeze

patch:
  13.41%  postgres  postgres  [.] heap_prepare_freeze_tuple
   9.88%  postgres  postgres  [.] heap_page_prune
   8.61%  postgres  postgres  [.] lazy_scan_prune
   7.00%  postgres  postgres  [.] heap_tuple_should_freeze
   6.43%  postgres  postgres  [.] HeapTupleSatisfiesVacuumHorizon

- Melanie
Re: Eliminate redundant tuple visibility check in vacuum
Hi David, Thanks for taking a look! On Tue, Aug 29, 2023 at 5:07 AM David Geier wrote: > > Hi Melanie, > > On 8/29/23 01:49, Melanie Plageman wrote: > > While working on a set of patches to combine the freeze and visibility > > map WAL records into the prune record, I wrote the attached patches > > reusing the tuple visibility information collected in heap_page_prune() > > back in lazy_scan_prune(). > > > > heap_page_prune() collects the HTSV_Result for every tuple on a page > > and saves it in an array used by heap_prune_chain(). If we make that > > array available to lazy_scan_prune(), it can use it when collecting > > stats for vacuum and determining whether or not to freeze tuples. > > This avoids calling HeapTupleSatisfiesVacuum() again on every tuple in > > the page. > > > > It also gets rid of the retry loop in lazy_scan_prune(). > > How did you test this change? I didn't add a new test, but you can confirm some existing test coverage if you, for example, set every HTSV_Result in the array to HEAPTUPLE_LIVE and run the regression tests. Tests relying on vacuum removing the right tuples may fail. For example, select count(*) from gin_test_tbl where i @> array[1, 999]; in src/test/regress/sql/gin.sql fails for me since it sees a tuple it shouldn't. > Could you measure any performance difference? > > If so could you provide your test case? I created a large table and then updated a tuple on every page in the relation and vacuumed it. I saw a consistent slight improvement in vacuum execution time. I profiled a bit with perf stat as well. The difference is relatively small for this kind of example, but I can work on a more compelling, realistic example. I think eliminating the retry loop is also useful, as I have heard that users have had to cancel vacuums which were in this retry loop for too long. - Melanie
Re: Eliminate redundant tuple visibility check in vacuum
Hi Melanie, On 8/29/23 01:49, Melanie Plageman wrote: While working on a set of patches to combine the freeze and visibility map WAL records into the prune record, I wrote the attached patches reusing the tuple visibility information collected in heap_page_prune() back in lazy_scan_prune(). heap_page_prune() collects the HTSV_Result for every tuple on a page and saves it in an array used by heap_prune_chain(). If we make that array available to lazy_scan_prune(), it can use it when collecting stats for vacuum and determining whether or not to freeze tuples. This avoids calling HeapTupleSatisfiesVacuum() again on every tuple in the page. It also gets rid of the retry loop in lazy_scan_prune(). How did you test this change? Could you measure any performance difference? If so could you provide your test case? -- David Geier (ServiceNow)
Eliminate redundant tuple visibility check in vacuum
While working on a set of patches to combine the freeze and visibility map WAL records into the prune record, I wrote the attached patches reusing the tuple visibility information collected in heap_page_prune() back in lazy_scan_prune(). heap_page_prune() collects the HTSV_Result for every tuple on a page and saves it in an array used by heap_prune_chain(). If we make that array available to lazy_scan_prune(), it can use it when collecting stats for vacuum and determining whether or not to freeze tuples. This avoids calling HeapTupleSatisfiesVacuum() again on every tuple in the page. It also gets rid of the retry loop in lazy_scan_prune(). - Melanie From bdb432b220571086077085b29d3c90a3a1c68c47 Mon Sep 17 00:00:00 2001 From: Melanie Plageman Date: Thu, 3 Aug 2023 15:37:39 -0400 Subject: [PATCH v1 2/2] Reuse heap_page_prune() tuple visibility statuses heap_page_prune() obtains the HTSV_Result (tuple visibility status) returned from HeapTupleSatisfiesVacuum() for every tuple on the page and stores them in an array. By making this array available to heap_page_prune()'s caller lazy_scan_prune(), we can avoid an additional call to HeapTupleSatisfiesVacuum() when freezing the tuples and recording LP_DEAD items for vacuum. This saves resources and eliminates the possibility that vacuuming corrupted data results in a hang due to endless retry looping. We pass the HTSV array from heap_page_prune() back to lazy_scan_prune() in PruneResult. Now that we are passing in PruneResult to heap_page_prune(), we can move the other output parameters passed individually into heap_page_prune() into the PruneResult. 
--- src/backend/access/heap/pruneheap.c | 68 +++- src/backend/access/heap/vacuumlazy.c | 66 --- src/include/access/heapam.h | 14 +- 3 files changed, 68 insertions(+), 80 deletions(-) diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c index 47b9e20915..b5bc126399 100644 --- a/src/backend/access/heap/pruneheap.c +++ b/src/backend/access/heap/pruneheap.c @@ -65,16 +65,6 @@ typedef struct * 1. Otherwise every access would need to subtract 1. */ bool marked[MaxHeapTuplesPerPage + 1]; - - /* - * Tuple visibility is only computed once for each tuple, for correctness - * and efficiency reasons; see comment in heap_page_prune() for details. - * This is of type int8[], instead of HTSV_Result[], so we can use -1 to - * indicate no visibility has been computed, e.g. for LP_DEAD items. - * - * Same indexing as ->marked. - */ - int8 htsv[MaxHeapTuplesPerPage + 1]; } PruneState; /* Local functions */ @@ -83,7 +73,7 @@ static HTSV_Result heap_prune_satisfies_vacuum(PruneState *prstate, Buffer buffer); static int heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum, - PruneState *prstate); + PruneState *prstate, PruneResult *presult); static void heap_prune_record_prunable(PruneState *prstate, TransactionId xid); static void heap_prune_record_redirect(PruneState *prstate, OffsetNumber offnum, OffsetNumber rdoffnum); @@ -191,10 +181,25 @@ heap_page_prune_opt(Relation relation, Buffer buffer) if (PageIsFull(page) || PageGetHeapFreeSpace(page) < minfree) { + PruneResult presult; + /* OK, try to get exclusive buffer lock */ if (!ConditionalLockBufferForCleanup(buffer)) return; + /* Initialize prune result fields */ + presult.hastup = false; + presult.has_lpdead_items = false; + presult.all_visible = true; + presult.all_frozen = true; + presult.visibility_cutoff_xid = InvalidTransactionId; + + /* + * nnewlpdead only includes those items which were newly set to + * LP_DEAD during pruning. 
+ */ + presult.nnewlpdead = 0; + /* * Now that we have buffer lock, get accurate information about the * page's free space, and recheck the heuristic about whether to @@ -202,11 +207,8 @@ heap_page_prune_opt(Relation relation, Buffer buffer) */ if (PageIsFull(page) || PageGetHeapFreeSpace(page) < minfree) { - int ndeleted, - nnewlpdead; - - ndeleted = heap_page_prune(relation, buffer, vistest, limited_xmin, - limited_ts, &nnewlpdead, NULL); + int ndeleted = heap_page_prune(relation, buffer, vistest, limited_xmin, + limited_ts, &presult, NULL); /* * Report the number of tuples reclaimed to pgstats. This is @@ -222,9 +224,9 @@ heap_page_prune_opt(Relation relation, Buffer buffer) * tracks ndeleted, since it will set the same LP_DEAD items to * LP_UNUSED separately. */ - if (ndeleted > nnewlpdead) + if (ndeleted > presult.nnewlpdead) pgstat_update_heap_dead_tuples(relation, - ndeleted - nnewlpdead); + ndeleted - presult.nnewlpdead); } /* And release buffer lock */ @@ -253,12 +255,13 @@ heap_page_prune_opt(Relation relation, Buffer buffer) * either have been set by TransactionIdLimitedForOldSnapshots, or * Inval