On Thu, May 4, 2017 at 6:05 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> PLpgSQL_datum is really a symbol table entry.  The conflict against what
> we mean by "Datum" elsewhere is pretty unfortunate.

Yeah.  That's particularly bad because datum is a somewhat vague word
under the best of circumstances (like "thing").  Maybe I'm missing
something, but isn't it more like a parse tree than a symbol table
entry?  The symbol table entries seem to be based on
PLpgSQL_nsitem_type, not PLpgSQL_datum, and they only come in the
flavors you'd expect to find in a symbol table: label, var, row, rec.
The datums on the other hand seem to consist of every kind of thing
that PLpgSQL might be expected to interpret itself, either as an
rvalue or as an lvalue.

> ROW - this is where it gets fun.  A ROW is effectively a variable
> of a possibly-anonymous composite type, and it is defined by a list
> (in its own datum) of links to PLpgSQL_datums representing the
> individual columns.  Typically the member datums would be VARs
> but that's not the only possibility.
>
> As I mentioned earlier, the case that ROW is actually well adapted
> for is multiple targets in INTO and similar constructs.  For example,
> if you have
>
>         SELECT ...blah blah... INTO a,b,c
>
> then the target of the PLpgSQL_stmt_execsql is represented as a single
> ROW datum whose members are the datums for a, b, and c.  That's totally
> determined by the text of the function and can't change under us.
>
> However ... somebody thought it'd be cute to use the ROW infrastructure
> for variables of named composite types, too.  So if you have
>
>         DECLARE foo some_composite_type;
>
> then the name "foo" refers to a ROW datum, and the plpgsql compiler
> generates additional anonymous VAR datums, one for each declared column
> in some_composite_type, which become the members of the ROW datum.
> The runtime representation is actually that each field value is stored
> separately in its datum, as though it were an independent VAR.  Field
> references "foo.col1" are not compiled into RECFIELD datums; we just look
> up the appropriate member datum during compile and make the expression
> tree point to that datum directly.

Ugh.

> So, this representation is great for speed of access and modification
> of individual fields of the composite variable.  It sucks when you
> want to assign to the composite as a whole or retrieve its value as
> a whole, because you have to deconstruct or reconstruct a tuple to
> do that.  (The REC/RECFIELD approach has approximately the opposite
> strengths and weaknesses.)  Also, dealing with changes in the named
> composite type is a complete fail, because we've built its structure
> into the function's symbol table at parse time.

It would probably be possible to come up with a representation that
allowed both things to be efficient, the way a slot can contain either
a Datum/isnull array or a heap tuple or both.  Call the resulting data
structure a record slot.  foo.col1 could retain the original column
name and also a cache containing a pointer to the record slot and a
column offset within the slot, so that it can say e.g.
assign_record_slot_column(slot, col, val).  But it could also register
the reference with the slot, so that if the slot structure changes,
the cached slot and column offset (or at least the column offset) are
cleared and get recomputed on next access.

I'm not volunteering to do the work, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
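
To make the record slot idea above a bit more concrete, here is a rough,
standalone sketch.  It is not PostgreSQL code: apart from
assign_record_slot_column, every name here (RecordSlot, FieldRef,
slot_set_structure, fieldref_init, fieldref_resolve, fieldref_assign) is
made up for illustration, a plain long stands in for Datum, and only the
values/isnull array form of the slot is modeled (no heap tuple).  It just
shows a field reference keeping its column name, caching an offset, and
having that cache cleared when the slot's structure changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_COLS 8   /* arbitrary limits, just to keep the sketch static */
#define MAX_REFS 8

typedef struct FieldRef FieldRef;

/* Hypothetical record slot; only the values/isnull array form is modeled. */
typedef struct RecordSlot
{
    int         ncols;
    const char *colnames[MAX_COLS];
    long        values[MAX_COLS];   /* stand-in for Datum */
    bool        isnull[MAX_COLS];
    int         nrefs;
    FieldRef   *refs[MAX_REFS];     /* references to invalidate on change */
} RecordSlot;

/* Compiled form of "foo.col1": the name plus a cached column offset. */
struct FieldRef
{
    const char *colname;
    RecordSlot *slot;
    int         cached_col;         /* -1 means "recompute on next access" */
};

/* Install a new column layout and clear every registered cached offset. */
void
slot_set_structure(RecordSlot *slot, const char **names, int ncols)
{
    assert(ncols >= 0 && ncols <= MAX_COLS);
    slot->ncols = ncols;
    for (int i = 0; i < ncols; i++)
    {
        slot->colnames[i] = names[i];
        slot->values[i] = 0;
        slot->isnull[i] = true;
    }
    for (int i = 0; i < slot->nrefs; i++)
        slot->refs[i]->cached_col = -1;
}

/* Register a field reference with the slot so it can be invalidated. */
void
fieldref_init(FieldRef *ref, RecordSlot *slot, const char *colname)
{
    assert(slot->nrefs < MAX_REFS);
    ref->colname = colname;
    ref->slot = slot;
    ref->cached_col = -1;
    slot->refs[slot->nrefs++] = ref;
}

/* Return the column offset, looking it up by name only if not cached. */
int
fieldref_resolve(FieldRef *ref)
{
    if (ref->cached_col < 0)
    {
        for (int i = 0; i < ref->slot->ncols; i++)
        {
            if (strcmp(ref->slot->colnames[i], ref->colname) == 0)
            {
                ref->cached_col = i;
                break;
            }
        }
    }
    return ref->cached_col;
}

/* Store one column value directly in the slot's array form. */
void
assign_record_slot_column(RecordSlot *slot, int col, long val)
{
    assert(col >= 0 && col < slot->ncols);
    slot->values[col] = val;
    slot->isnull[col] = false;
}

/* Assign through a field reference; false if the column no longer exists. */
bool
fieldref_assign(FieldRef *ref, long val)
{
    int         col = fieldref_resolve(ref);

    if (col < 0)
        return false;
    assign_record_slot_column(ref->slot, col, val);
    return true;
}
```

In this sketch the common path (cached offset still valid) is a direct
array store, and a structure change costs one name lookup per reference
on its next access.  Whole-row assignment and retrieval would need the
heap tuple side of the slot, which is left out here.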