On Wed, Sep 6, 2023 at 5:21 PM Melanie Plageman <melanieplage...@gmail.com> wrote: > Yeah, I think this is a fair concern. I have addressed it in the > attached patches. > > I thought a lot about whether or not adding a PruneResult which > contains only the output parameters and result of heap_page_prune() is > annoying since we have so many other *Prune* data structures. I > decided it's not annoying. In some cases, four outputs don't merit a > new structure. In this case, I think it declutters the code a bit -- > independent of any other patches I may be writing :)
I took a look at 0001 and I think that it's incorrect. In the existing code, *off_loc gets updated before each call to heap_prune_satisfies_vacuum(). This means that if an error occurs in heap_prune_satisfies_vacuum(), *off_loc will as of that moment contain the relevant offset number. In your version, the relevant offset number will only be stored in some local structure to which the caller doesn't yet have access. The difference is meaningful. lazy_scan_prune passes off_loc as vacrel->offnum, which means that if an error happens, vacrel->offnum will have the right value, and so when the error context callback installed by heap_vacuum_rel() fires, namely vacuum_error_callback(), it can look at vacrel->offnum and know where the error happened. With your patch, I think that would no longer work. I haven't run the regression suite with 0001 applied. I'm guessing that you did, and that they passed, which if true means that we don't have a test for this, which is a shame, although writing such a test might be a bit tricky. If there is a test case for this and you didn't run it, woops. This is also why I think it's *extremely* important for the header comment of a function that takes pointer parameters to document the semantics of those pointers. Normally they are in parameters or out parameters or in-out parameters, but here it's something even more complicated. The existing header comment says "off_loc is the offset location required by the caller to use in error callback," which I didn't really understand until I actually looked at what the code is doing, so I consider that somebody could've done a better job writing this comment, but in theory you could've also noticed that, at least AFAICS, there's no way for the function to return with *off_loc set to anything other than InvalidOffsetNumber. That means that the statements which set *off_loc to other values must have some other purpose. -- Robert Haas EDB: http://www.enterprisedb.com