One of the interesting properties of Oracle-compatible variable references in plpgsql is that the set of variables referenced by a given query could change during a forced replan. For example, consider
declare x int; r record; ... for r in select x,y from tab loop ... If tab contains a column "x" then the "x" reference in the SELECT refers to tab.x; if not, it refers to the plpgsql variable x. So when first executing the SELECT we might find that it requires a Param reference to the plpgsql variable, and then after a replan is forced by ALTER TABLE tab ADD COLUMN x, there is no need for the Param anymore. Or vice versa. This kinda calls into question whether the Oracle way is actually a good idea or not; but my purpose here is not to debate that, just to look at what it takes to implement it. Currently, plpgsql generates a list of the variables referenced by any SQL statement or expression immediately upon seeing the text, before it's ever even fed to the core parser. I had been envisioning having the parser callback hook construct the list on-the-fly during parsing, but the possibility that the list will change from time to time means that other changes are needed too. Notably: 1. plancache.c does not have any provision for letting the Param type array associated with a stored statement change when the statement is replanned due to SI invalidation. 2. The control flow for a replan is that plpgsql calls SPI_execute_plan, which calls RevalidateCachedPlan, which does the replan if the cached plan is discovered to be stale. However, plpgsql already had to set up the list of actual parameter values before it called SPI_execute_plan, which means it is *way* too late to change the list of required Params even if plancache let us do it. After chewing on these facts for awhile, I am thinking that the best solution is for plpgsql to abandon the notion of a predetermined list of parameters for a SQL query altogether. What that list basically provides is a mapping from Param numbers ($n) to plpgsql "datum numbers" (indexes in the list of a plpgsql function's variables). We could make that mapping always be one-to-one, since there's no real reason that the Params available to a query have to be consecutively numbered. So the transformColumnRef hook would just pass back a Param using the referenced variable's datum number as paramid; it wouldn't bother at all with building a data structure listing the specific variables actually used in the query. As far as plancache goes, it would therefore always see a null array of Param type OIDs associated with a plpgsql-generated query, and we'd not have to provide a way to update that. (We'd still keep the ability to store such an array, because most other callers of plancache will still want a fixed list of Params.) What we'd have to add to plancache instead is the ability to install caller-determined parser callback hooks when it is calling the parser for a replan. This seems fairly easy to do --- I'm envisioning a sort of meta-hook function that gets called with the new ParseState and can insert hook function pointers in it. The other issue with this is what to do at runtime. We could do it with no other changes if we had plpgsql always set up Values/Nulls arrays listing *every* datum's current value. This seems a bit brute-force though --- it could be slow in a function with a lot of variables, and in most cases any specific query or expression would not need most of those values. What I think we should do instead is extend the ParamListInfo structure to add a callback hook function that populates individual ParamExternData array entries on-demand. The core executor would call the hook when it tried to fetch the value of a Param that was currently invalid (ptype == 0). So the hook would be invoked only once per query per referenced parameter, which shouldn't be much overhead. Another interesting property of this approach is that it'd fix the longstanding user complaint that constructions like if (TG_OP = 'INSERT' and NEW.foo = 'bar') ... fail prematurely. The executor would never demand the value of NEW.foo, and thus not fail, if TG_OP isn't INSERT. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers