On Wed, Mar 4, 2026 at 4:50 AM Jakub Wartak <[email protected]> wrote:
> This is a micro-thing, feel free to ignore, but I was after something much
> simpler: `CREATE EXTENSION pg_plan_advice` being a no-op without any failure,
> even if it doesn't provide any views or functions right now (so an empty
> share/extension/pg_plan_advice.control and --1.0.sql), or at least some dummy
> function, just so that CREATE EXTENSION pg_plan_advice wouldn't fail.
> This is nothing technical, it's just a sharp rough edge for the user (but
> technically sound) that 2 are deployed with CREATE EXTENSION but the 3rd one
> is not. I just think that if we put all 3 into shared_preload_libraries, then
> I won't have to think about which ones to exec CREATE EXTENSION for (I would
> just blindly do it for all, and the error wouldn't make somebody unhappy,
> worrying that something is potentially not working because CREATE EXTENSION
> is not for it) - it's pure user-focused usability feedback.
Absolutely not. We don't do that for other contrib modules that don't require SQL definitions (e.g. auth_delay, auto_explain, basebackup_to_shell, passwordcheck), and I don't think we should start now. I agree that it can be confusing for users that there is a distinction between extensions and loadable modules, but the right solution to that problem is to educate the users. I think people who do know how things are supposed to work would find it quite irritating to be handed an extension that doesn't actually install anything. Also, I have some hope that at some point in the future we might decide to ingest pg_plan_advice into core while keeping the other things as contrib modules, and if that ever happens, it will be a lot easier if it's only a loadable module and has no extension associated with it.

> Yes, with -M extended it is instant. I have also found that with
> -M prepared I can do a simple one-time `ANALYZE pgbench_accounts` (when
> changing SEQ_SCAN <-> INDEX_SCAN for that table), and that is also enough for
> the backend to immediately see (and react to) what's in the active
> configured stash, even for future changes without further ANALYZEs.
> Not sure if pg_stash_advice needs a function to flush-force all backends,
> so the plans are 100% effective, as apparently we seem to have ANALYZE
> already, but it is not that obvious that one might want to use it.
>
> If there were such a function to guarantee this, we probably wouldn't see
> complaints like 'I have done this and my session is still using the old plan'.

True, but causing a system-wide cache invalidation can also create quite a performance hit. I'm not quite sure what the best thing to do here is. I can see an argument that changing the advice for a certain query ought to invalidate stored plans for that query, but I don't think our invalidation infrastructure is capable of doing anything that targeted.
Just invalidating everything seems pretty heavy-handed, but maybe it will turn out to be the right answer. I think we should wait for more people to play with this before deciding anything.

> > > 5. QQ: will pg_stash_advice persist the stashes one day?
> >
> > I have no current plan to implement that, but who knows?
>
> OK, so perhaps the docs for pg_create_advice_stash() and pg_set_stashed_advice()
> should mention that those 'stashes' are not persistent across restarts. Without
> this I can already hear the screams of some users from the future who
> applied advice, it fixed the problem, and after some time it disappeared (in
> other RDBMSes it is persistent, so users coming from there might have such
> expectations).

I mean, there is already text about this in the very first paragraph of the pg_stash_advice documentation. Perhaps you're saying you think that information needs to be mentioned in multiple places in that documentation, or in a different place than where it's currently mentioned, but I'm slightly suspicious that you might not have actually read what I already wrote.

> Well, IMHO all the rest of the naming is in pretty great shape, and I think
> that pg_[collect|plan]_advice are great names too. It's just that the `stash`
> keyword doesn't ring a bell to me at all that `pg_stash_advice` is
> related in any way to online/transparent/runtime plan modification and
> can be used to alter plans for other backends. Something like
> `pg_deploy_advice` or `pg_apply_advice` would be more in line with the
> other two, but perhaps it's just me..

I don't think it's just you. I originally named this pg_auto_advice. However, the naming problem that I then ran into was: what do you call the containers that actually store the advice? I ended up calling them "advice stores". But I didn't like that very much, because now the name of the module (pg_auto_advice) and the name of the objects that it creates (advice stores) are not obviously related.
Moreover, I had an unpleasant hunch that people weren't going to like the idea of using "store" as a noun. So I tried to think of a word that I could use both for the name of the extension and for the name of the objects, and "stash" is what I came up with. Your suggestions here are in much the same vein as my original idea, but I would argue that they also have the same problem: we're not going to call the named advice containers "advice deploys" or "advice applies". Now, perhaps there's an argument that those names don't have to match, but I think it makes things a lot less confusing when they do.

> How about if we just measured (estimated) the number of entries vs. memory
> used with some small table and put that into the docs, so users are wary that
> they shouldn't just blindly set it to a high value? E.g. with a
> 1000 local limit I get ~280kB for the pg_collect_advice context, and with a
> 1000000 local limit I got it to ~50MB and stopped looking further (it was
> still growing). By itself that's not terrible, but higher values with lots
> of backends might cause huge memory pressure (OOMs).

It depends a lot on the length of the query strings and the advice strings. I wondered whether there was some point in having a setting to collect only the advice string and not the query string, but in the end I feel like pg_collect_advice is very much 1.0 software. It works, but it's fairly primitive. It definitely shouldn't be confused with industrial-strength, battle-grade, Teflon-hardened code. The problem from my point of view is that not only are we short on time for this release, but I do not feel like I know what the requirements for something better really are. I think your suggestions so far -- deduplication, better memory control -- are very reasonable ideas, but I think it's hard to know what is really important until more people get their hands on this, try it out, and then (probably) complain about it.
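For what it's worth, the kind of back-of-the-envelope docs estimate Jakub suggests could be sketched like this. The function and its default per-entry byte counts are illustrative assumptions I picked so that 1000 entries comes out near the observed ~280kB; they are not measured internals of pg_collect_advice, and real usage will scale with the actual query and advice string lengths:

```python
# Rough, illustrative model of per-backend pg_collect_advice memory use.
# The byte counts below are assumptions, not measurements: real cost
# depends on actual query/advice string lengths and allocator overhead.

def estimate_bytes(entries, avg_query_len=120, avg_advice_len=100, overhead=60):
    """Estimate collector memory: two strings plus fixed per-entry overhead."""
    return entries * (avg_query_len + avg_advice_len + overhead)

if __name__ == "__main__":
    for n in (1_000, 10_000, 1_000_000):
        print(f"{n:>9} entries -> ~{estimate_bytes(n) / 1024 / 1024:.1f} MiB")
```

With these assumed defaults, 1000 entries lands around 280kB, while 1000000 entries extrapolates to far more than the ~50MB Jakub saw with short pgbench queries, which just underlines Robert's point that the string lengths dominate.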
I think between the people on this thread, or even between you and me, we could easily come up with a list of 30, maybe even 50, possible improvements to pg_collect_advice, but I am not at all confident we'd correctly guess which 3-5 of those were going to be most important to users. I think we just need to wait for more data before deciding how to evolve this.

> Or maybe another idea: is there any possibility of making GUCs like
> local_collection_limits/local_collector settable only using
> SET/SET LOCAL, but not globally? I mean, what's the point of being
> able to collect locally system-wide when realistically I cannot
> pull it back from backend-local memory? (And this removes the danger
> of multiple backends going wild with memory together.)

The GUC infrastructure doesn't support this.

> Resetting it back to 0 or disabling the local collector and reloading won't
> fix it; the backend needs to re-establish its connection. Even with just a
> 10000 local collection limit it goes down from ~1500 tps to 900 tps.
> The impact on CPU seems to come from exec_simple_query()
> -> finish_xact_command() -> MemoryContextCheck() -> AllocSetCheck(),
> so memory context validation that has literally nothing to do
> with this patch (other than it using a lot of memory in those scenarios).

Sounds like you are running a debug build with asserts enabled, which you probably don't want to do if you're trying to benchmark.

> > > 9. If IsQueryIdEnabled() is false (even after trying to use 'auto'),
> > > shouldn't this module raise a warning when pg_stash_advice.stash_name !=
> > > NULL?
> >
> > I think the bar for printing a warning on every single query is
> > extremely high. Doing that just because of a questionable
> > configuration setting doesn't seem like a good idea.
>
> Why on every single query? I'm thinking that this should bail out in pgca & pgsa
> during initialization of those modules. What's the point of those modules
> if queryid is always 0?
> (I'm assuming somebody has compute_query_id=off and
> still loaded those modules.)

That seems like a tremendous overreaction, given that (1) it might cause the server to fail to start and (2) compute_query_id can be changed at any time.

> BTW: the good news is that I haven't seen a single crash when throwing
> wild stuff at it or having strange ideas about pg_stash_advice usage.

I don't think there are too many crashes left (famous last words). Here's a list of things I'm currently most worried about, approximately in order starting from the most worrying:

* Maybe the advice-generation stuff doesn't correctly analyze all possible plan trees, especially cases not covered by our core regression tests.

* Maybe the stuff that uses DSM isn't careful enough and can therefore cause server-lifespan memory leaks in some scenario.

* Maybe I haven't got the security model right and some aspect of what I've done there is CVE-worthy.

* Maybe advice application is broken in some cases in a way that can't be fixed without additional core changes.

I'd be very grateful for review targeting any of these areas.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com
