Re: pg_plan_advice (now with transparent SQL plan performance overrides - pg_stash_advice)

Robert Haas Wed, 04 Mar 2026 06:17:55 -0800

On Wed, Mar 4, 2026 at 4:50 AM Jakub Wartak
<[email protected]> wrote:
> This is micro-thing, feel free to ignore, but well I was after something much
> more easy: `CREATE EXTENSION pg_plan_advice`to be no-op without any failure
> even if it doesnt provide any views or fucntions right now (so empty
> share/extension/pg_plan_advice.control and -1.0.sql) or at least some dummy
> function, just so that CREATE EXTENSION pg_plan_advice wouldn't fail.
> This is nothing technical, it's just sharp rough edge for user (but
> technically sound), that 2 are deployed with CREATE EXTENSION but 3rd one is
> not. I just think that if we put 3 into shared_preload_libraries then I won't
> have to think for which ones to exec CREATE EXTENSION (I would just blindly do
> it for all and the error wouldn't make somebody unhappy that something is
> potentially not working because CREATE EXTENSION is not for it) - it's pure
> user-focused usability feedback.


Absolutely not. We don't do that for other contrib modules that don't
require SQL definitions (e.g. auth_delay, auto_explain,
basebackup_to_shell, passwordcheck) and I don't think we should start
now. I agree that it can be confusing for users that there is a
distinction between extensions and loadable modules, but the right
solution to that problem is to educate the users. I think people who
do know how things are supposed to work would find it quite irritating
to be handed an extension that doesn't actually install anything.
Also, I have some hope that at some point in the future, we might
decide to ingest pg_plan_advice into core while keeping the other
things as contrib modules, and if that ever happens, it will be a lot
easier if it's only a loadable module and has no extension associated
with it.

> Yes with -M extended it is instant. I have found also that with
> -M prepared I can do simple one-time `analyze pgbench_accounts` (when changing
> SEQ_SCAN <-> INDEX_SCAN for that table) and that is also enough for
> the backend to immediatley see (and react to) to what's in the active
> configured stash even for future changes without further ANALYZEs.
> Not sure if pg_stash_advice needs a function to flush-force all backends,
> so the plans are 100% effecitve, as apparently we seem to have ANALYZE
> already, but it is not that obvious that one might want to use it.
>
> If there would be such function to gurantee, we probably wouldn't see
> complaints like 'I have done this and session still is using old plan'.

True, but causing a system-wide cache invalidation can also create
quite a performance hit. I'm not quite sure what the best thing to do
here is. I can see an argument that changing the advice for a certain
query ought to invalidated stored plans for that query, but I don't
think our invalidation infrastructure is capable of doing anything
that targeted. Just invalidating everything seems pretty heavy-handed,
but maybe it will turn out to be the right answer. I think we should
wait for more people to play with this before deciding anything.

> > > 5. QQ: will pg_stash_advice persist the stashes one day?
> >
> > I have no current plan to implement that, but who knows?
>
> OK, so perhaps docs for pg_create_advice_stash() and pg_set_stashed_advice()
> should mention those 'stashes' are not persistent across restarts. Without
> this I can already hear scream of some users from the future that they
> applied advice, it fixed problem and after some time it disappeared (In
> other RDBMS it is persistent, so users coming from there might have such
> expectations).

I mean, there is already text about this in the very first paragraph
of the pg_stash_advice documentation. Perhaps you're saying you think
that information needs to be mentioned in multiple places in that
documentation, or in a different place than where it's currently
mentioned, but I'm slightly suspicious that you might not have
actually read what I already wrote.

> Well IMHO all the rest naming in pretty great shape and I think that
> pg_[collect|plan]_advice are great names too. It's just that `stash`
> keyword doesn't ring a bell to me at all that `pg_stash_advice` is
> related in any way to online/transparent/runtime plan modification and
> can be used to alter plans for other backends. Something like
> `pg_deploy_advice` or `pg_apply_advice` would be more in line with the
> other two, but perhaps it's just me..

I don't think it's just you. I originally named this pg_auto_advice.
However, the naming problem that then ran into was: what do you call
the containers that actually store the advice? I ended up calling them
"advice stores". But I didn't like that very much, because now the
name of the module (pg_auto_advice) and the name of the objects that
it creates (advice stores) are not obviously related. Moreover, I had
an unpleasant hunch that people weren't going to like the idea of
using "store" as a noun. So I tried to think of a word that I could
use for both the name of the extension and the name of the objects,
and stash is what I came up with. Your suggestions here are in much
the same vein as my original idea, but I would argue that they also
have the same problem: we're not going to call the named advice
containers "advice deploys" or "advice applies". Now, perhaps there's
an argument that those names don't have to match, but I think it makes
it a lot less confusing that they do.

> How about if we would just measure (estimate) with some small table number of
> entries vs memory used and put that into the docs?, so users are wary that
> they shouldn't just blidnly set it to high value? E.g. with
> 1000 local limit I get ~280kB for pg_collect_advice context and with 1000000
> local limit I've got it to ~50MB and stopped looking further (it was still
> growing). Itself that's not terrible but higher values with lots of backends
> might cause huge memory pressure (OOMs).

It depends a lot on the length of the query strings and the advice
strings. I wondered whether there was some point in having a setting
to collect only the advice string and not the query string, but in the
end I feel like pg_collect_advice is very much 1.0 software. It works,
but it's fairly primitive. It definitely shouldn't be confused with
industrial-strength, battle-grade, Teflon-hardened code. The problem
from my point of view is that not only are we short on time for this
release, but I do not feel like I know what the requirements for
something better really are. I think your suggestions so far --
deduplication, better memory control -- are very reasonable ideas, but
I think it's hard to know what is really important until more people
get their hands on this, try it out, and then (probably) complain
about it. I think between the people on this thread, or even between
you and I, we could easily come up with a list of 30, maybe even 50,
possible improvements to pg_collect_advice, but I am not at all
confident we'd correctly guess which 3-5 of those were going to be
most important to users. I think we just need to wait for more data
before deciding how to evolve this.

> Or maybe other idea: is there is possibility of making GUCs like
> local_collection_limits/local_collector settable only using
> SET/SET LOCAL, but not global? I mean what's the point of having being
> able to collect locally system-wide when realistically I cannot
> pull it back from backend-local memory? (and this removes the danger
> of multple backends goind wild with memory together).

The GUC infrastructure doesn't support this.

> resetting it back to 0 or disabling local collector and reloading won't
> fix it, backend needs to re-establish connection. Even with just 10000
> local collection limit it just gets down from ~1500 tps to 900 tps.
> It's seems the impact on CPU coming to be from exec_simple_query()
> -> finish_xact_command() -> MemoryContextCheck() -> AllocSetCheck(),
> so memory context validation that have literally nothing to do with
> with this patch (other than it using a lot of memory in those scenarios)

Sounds like you are running a debug build with asserts enabled, which
you probably don't want to do if you're trying to benchmark.

> > > 9. If IsQueryIdEnabled() is false (even after trying out to use 'auto')
> > > shouldn't this module raise warning when pg_stash_advice.stash_name != 
> > > NULL?
> >
> > I think the bar for printing a warning on every single query is
> > extremely high. Doing that just because of a questionable
> > configuration setting doesn't seem like a good idea.
>
> Why on every single query? I'm thinking that this should bail out in pgca&pgsa
> during initialization of those modules. What's the point of those modules
> if queryid is always 0? (I'm assuming somebody has compue_query_id=off and
> still loaded those modules).

That seems like a tremendous overreaction, given that (1) it might
cause the server to fail to start and (2) compute_query_id can be
changed at any time.

> BTW: the good news is that I haven't seen a single crash when throwing
> wild stuff on it or having strange ideas at pg_stash_advice usage.

I don't think there are too many crashes left (famous last words).
Here's a list of things I'm currently most worried about,
approximately in order starting from the most worrying:

* Maybe the advice-generation stuff doesn't correctly analyze all
possible plan trees, esp. cases not covered by our core regression
tests.
* Maybe the stuff that uses DSM isn't careful enough and can therefore
cause server-lifespan memory leaks in some scenario.
* Maybe I haven't got the security model right and some aspect of what
I've done there is CVE-worthy.
* Maybe advice application is broken in some cases in a way that can't
be fixed without additional core changes.

I'd be very grateful for review targeting any of these areas.

Thanks,

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: pg_plan_advice (now with transparent SQL plan performance overrides - pg_stash_advice)

Reply via email to