On Mon, Mar 30, 2026 at 7:53 AM Robert Haas <[email protected]> wrote: > > Hi, > > Thanks for taking the time to respond. My reading of your comments is > that we are in overall agreement on the design, with the possible > exception of persisting data cross restarts. I will write more about > that topic below; but I think if that's the only design disagreement > we have, it makes sense to go forward with committing the patch that I > have, and we can continue to discuss whether we want to add something > related to persistence. The only reason not to do that would be if > there were a consensus that the lack of a persistence framework was > such a critical defect that we shouldn't ship this at all without > that, but I don't agree with that idea and I think it would be a > pretty strong position for someone to take.
Yeah, I think my position is that having a solution to persistence would be very good, but if that's not doable for this release, I think we have a potential way forward in future releases, at least when it comes to being restart-safe. That said, I still think it'll make a big difference in practice to be restart safe right away, if we can make it happen. > > On Sun, Mar 29, 2026 at 2:59 PM Lukas Fittl <[email protected]> wrote: > > I think a simple disk file is the way to go, similar to how > > autoprewarm works with its "autoprewarm.blocks" file. Its a bit > > awkward that that just sits in the main data directory, but since > > pg_prewarm already does it today, I think its okay to have another > > contrib module do the same. As noted I'm mainly worried about restarts > > that the user didn't control, causing advice that was set to be lost. > > > > I've attached a patch of how that could look like on top of your v23, > > that copies the modified stash information to a > > "pg_stash_advice.entries" file, and loads it after restarts. > > I'll be honest: I don't like this design much at all, but I do see the > practical advantages of it, and we have done similar things elsewhere, > in particular in autoprewarm. Before I get to the specifics of your > patch, let me complain about some things that I don't like at the > design level. We lose a lot by directing data through a bespoke > mechanism rather than handling it as table data. There are no > checksums, so we have less protection against corruption. There is no > write-ahead logging, so data does not make it to standbys, which is > more of a potential issue for pg_stash_advice than it is for > autoprewarm. All the code to read and write the file is specific to > this contrib module, so it can have its own bugs separate from every > similar module's bugs. The data can't easily be examined and > manipulated from SQL as table data can. It's just a messy one-off that > solves a practical problem but is otherwise not very nice. Of course, > sometimes such messy one-offs are the right answer. I think if we wanted a table, we should make it a table - but I think the fact that we want this to be low-overhead for running queries to examine whether there is anything for them to apply, would require some kind of cache in front of it, and that gets complicated pretty quickly. For the file-based direction, just for reference, I'm attaching an updated version of that (on top of Robert's earlier v23), that utilizes a background worker to write out the dump file as needed, at most every 60 seconds. It also reworks some of the output logic to do better memory management, and uses a TSV file format that can be easily read again. To be clear, I think its okay to go ahead with merging pg_stash_advice without that and make it a best effort to get the file saving in too - but I think with the current design in this patch represents a reasonable solution to what we can do in terms of persistence across restarts in either 19 or 20. > > > Because the number of entries here is controlled by the user (i.e. its > > not a function of the workload, but a function of how much advice you > > as a user have set), I'm much less worried about memory usage, as long > > as we document it clearly how to measure the amount of memory used. > > The module doesn't have a built-in way to do that right now. Are you > thinking we would document that pg_get_dsm_registry_allocations() can > be used? Yeah, for example. Alternatively we could provide a function/view that lists all advice across all stashes, so you can more easily see the result size of that and estimate what the in-memory use is. But pointing to pg_get_dsm_registry_allocations seems easier. > > In practice for a good amount of our user base these days the question > > will be "Does my cloud provider give me access to create stash > > entries", so its maybe worth thinking about if we could also allow > > pg_maintain to manage entries by default? > > Wouldn't it make more sense for the cloud provider to grant execute > permissions on these functions to pg_maintain if they are so inclined? > This is a brand-new facility, so I think we had better be conservative > in terms of default permissions. I guess. I'm always worried that providers get that wrong and forget to give end users the permissions - but I suppose end users can complain to their providers if that's the case. I've done another look over pg_set_stashed_advice and I think its in good shape. The only trailing thought I have is that we could consider running a fuzzer against the pg_set_advice function in particular, just to see if anything pops up (beyond having the ability to make a very large memory allocation through a large advice string, which is maybe a problem?). Thanks, Lukas -- Lukas Fittl
vnocfbot-2-0001-Make-pg_stash_advice-dump-advice-to-disk-.patch
Description: Binary data
