subject:"\"\\\\\\\[HACKERS\\\\\\\] Better support for whole\\\\\\\-row operations and composite\""

[HACKERS] Better support for whole-row operations and composite types

2004-03-29 Thread Tom Lane

We have a number of issues revolving around the fact that composite types
(row types) aren't first-class objects.  I think it's past time to fix
that.  Here are some notes about doing it.  I am not sure all these ideas
are fully-baked ... comments appreciated.

When represented as a Datum, the format of a row-type object needs to be
something like this:

* overall length:   int4(this makes the Datum a valid varlena item)
* row type id:  Oid (either a composite type id or RECORDOID)
* row type typmod:  int4(see below for usage)
-- pad if needed to MAXALIGN boundary
* heap tuple representation, beginning with a HeapTupleHeaderData struct

If we do it exactly as above then we will be wasting some space, because
the xmin/xmax/cmax and ctid fields of HeapTupleHeaderData are of no use
in a row that isn't actually a table member row.  It is very tempting to
overlay the length and rowtype fields with the HeapTupleHeaderData struct.
This would save some code as well as space --- see discussion below.

Only named composite types, not RECORD, will be allowed to be used as
table column types.  This ensures that any row object stored on disk will
have a valid composite type ID embedded in it, so that the row structure
can be retrieved when the row is read.  However, we want to be able to
support row objects in memory that are of transient record types (for
example, the output of a function returning RECORD will have a record type
determined by the query itself).  I propose that we handle this case by
setting the type id to RECORDOID and using the typmod to identify the
particular record type --- the typmod will essentially be an index into
a backend-local cache of record types.  More detail below.

We'll add "tdtypeid" and "tdtypmod" fields to TupleDesc structs.  This
will make it easy to set the embedded type information correctly when
manufacturing a row datum using a TupleDesc.  For TupleDescs associated
with relations, tdtypeid is just the relation's row type OID, and tdtypmod
is -1.  For TupleDescs representing transient row types, we initially set
tdtypeid to RECORDOID and tdtypmod to -1 (indicating a completely
anonymous row type).  If the row type actually needs to be identifiable
then we establish a cache entry for it and set the typmod to an index for
the cache entry.  I think this will only need to happen when the query
contains a function-returning-RECORD or a whole-row variable referencing
what would otherwise be an anonymous row type, such as a JOIN result.

Composite types, as well as the RECORD type, will be marked in pg_type as
pass-by-ref, varlena (typlen -1), typalign 'd'.  (We will use the maximum
alignment always to avoid any dependency on types of the contained
columns.)

The present function call and return conventions involving TupleTableSlots
will be replaced by simply passing and returning these row objects as
pass-by-reference Datums.  In the case of functions returning rowtypes,
we'll continue to support the present ReturnSetInfo convention for
returning a separate TupleDesc describing the result type --- but this
will just be a crosscheck.

We will be able to make generic I/O routines for composite types,
comparable to those used now for arrays.  Not sure what a convenient
external format would look like.  (Possibly use the same conventions as
for a 1-D array?)  We will need to make the convention that the type OID
of a composite type is passed to the input routine, in the same way that
an array input routine gets the typelem OID; else the input routine won't
know what to do.

We could also think about allowing functions that are declared as
accepting RECORD (ie, polymorphic-across-row-types functions).  They would
use the same methods already used by polymorphic functions to find out the
true types of their inputs.  (Might be best to invent a separate
pseudotype, say ANYRECORD, rather than overloading RECORD for this purpose.)

The recently developed SRF API is a bit unfortunate since it exposes the
assumption that a TupleTableSlot must be involved in returning a tuple.
If we don't overlay the Datum header with HeapTupleHeader then I think we
have to make TupleGetDatum copy the passed tuple and insert the row type
info from the slot's tupledesc, which'd be pretty inefficient because it
means making an extra copy of the row data.  But if we do overlay the
header fields, then I think we can set up backwards-compatibility
definitions in which the slot is simply ignored.  Specifically:

TupleDescGetSlot: no-op, returns NULL
TupleGetDatum: ignore slot, return tuple t_data pointer as datum

This will work because heap_formtuple and BuildTupleFromCStrings can
return a HeapTuple whose t_data part is already a valid row Datum, simply
by setting the appropriate length and type fields in it.  (If the tuple is
ever stored to disk as a regular table row, these fields will be
overwritten with xmin/cmin info at that time.)

To convert a row Datum into something

Re: [HACKERS] Better support for whole-row operations and composite

2004-03-30 Thread Andrew Dunstan

Tom Lane wrote:

We have a number of issues revolving around the fact that composite types
(row types) aren't first-class objects.  I think it's past time to fix
that.  Here are some notes about doing it.  I am not sure all these ideas
are fully-baked ... comments appreciated.
 

[snip]

Only named composite types, not RECORD, will be allowed to be used as
table column types.  

[snip]

Interesting. I'm slightly curious to know if there's an external driver 
for this.

Will this apply recursively (an a has a b which has an array of c's)? 
Are there indexing implications? Could one index on a subfield?

cheers

andrew

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Better support for whole-row operations and composite

2004-03-30 Thread Tom Lane

Andrew Dunstan <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Only named composite types, not RECORD, will be allowed to be used as
>> table column types.  

> Interesting. I'm slightly curious to know if there's an external driver 
> for this.

There's noplace to store a permanent record of an anonymous rowtype's
structure.  To do otherwise would amount to executing an implicit CREATE
TYPE AS for the user, so we might as well just say up front that you
have to create the type.

> Will this apply recursively (an a has a b which has an array of c's)? 

Yup.

> Are there indexing implications? Could one index on a subfield?

Using an expression index, sure.  I don't think we need to support it as
a "primitive" index type.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Better support for whole-row operations and composite

2004-04-02 Thread Joe Conway

Tom Lane wrote:
We have a number of issues revolving around the fact that composite
types (row types) aren't first-class objects.  I think it's past time
to fix that.  Here are some notes about doing it.  I am not sure all
these ideas are fully-baked ... comments appreciated.
[Sorry for the delay in responding]

Nice work, and in general it makes sense to me. A few comments below.

We will be able to make generic I/O routines for composite types, 
comparable to those used now for arrays.  Not sure what a convenient 
external format would look like.  (Possibly use the same conventions
as for a 1-D array?)
So you mean like an array, but with possibly mixed datatypes?

'{1 , "abc def", 2.3}'

Seems to make sense.

Another option might be to use the ROW keyword, something like:

ROW[1 , 'abc', 2.3]

We could also think about allowing functions that are declared as 
accepting RECORD (ie, polymorphic-across-row-types functions).  They
would use the same methods already used by polymorphic functions to
find out the true types of their inputs.  (Might be best to invent a
separate pseudotype, say ANYRECORD, rather than overloading RECORD
for this purpose.)
Check. I really like this idea.

TupleDescGetSlot: no-op, returns NULL TupleGetDatum: ignore slot,
return tuple t_data pointer as datum
This will work because heap_formtuple and BuildTupleFromCStrings can 
return a HeapTuple whose t_data part is already a valid row Datum,
simply by setting the appropriate length and type fields in it.  (If
the tuple is ever stored to disk as a regular table row, these fields
will be overwritten with xmin/cmin info at that time.)
Is this the way you did things in your recent commit?

To convert a row Datum into something that can be passed to
heap_getattr, one could use a local variable of type HeapTupleData
and set its t_data field to the datum's pointer value.  t_len is
copied from the datum contents, while the other fields of
HeapTupleData can just be set to zeroes.
I think I understand this, but an example would help.

* We have to be able to re-use an already-existing cache entry if it 
matches a requested TupleDesc.
For anonymous record types, how will that lookup be done efficiently? 
Can the hash key be an array of attribute oids?

If an ALTER TABLE command does something that requires examining or 
changing every row of a table, it would presumably have to do the
same to all entries in any composite-type column of the table's
rowtype.  To avoid surprises and interesting debates about who has
permissions to do this, it might be wise to restrict on-disk
composite columns to be only of standalone composite types (ie, those
made with CREATE TYPE AS).  This restriction would also avoid debates
about whether table constraints apply to composite-type columns.
I agree.

As an aside, it would be quite useful to have support for arrays of 
tuples. Any idea on how to do that without needing to define an explicit 
array type for each tuple type?

Joe

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Better support for whole-row operations and composite

2004-04-02 Thread Joe Conway

Tom Lane wrote:
Joe Conway <[EMAIL PROTECTED]> writes:
So you mean like an array, but with possibly mixed datatypes?
'{1 , "abc def", 2.3}'
Seems to make sense.
The unresolved question in my mind is how to represent NULL elements.
However, we have to solve that sooner or later for arrays too.  Any
thoughts?
Good point. What's really ugly is that the external representation of 
string types differs depending on whether quotes are needed or not. If 
strings were *always* surrounded by quotes, we could just use the word 
NULL, without the quotes.

Another option might be to use the ROW keyword, something like:
ROW[1 , 'abc', 2.3]


This is a separate issue, just as the ARRAY[] constructor has different
uses from the array I/O representation.  I do want some kind of runtime
constructor, but ROW[...] doesn't get the job done because it doesn't
provide any place to specify the rowtype name.  Maybe we could combine
ROW[...] with some sort of cast notation?
ROW[1 , 'abc', 2.3] :: composite_type_name
CAST(ROW[1 , 'abc', 2.3] AS composite_type_name)
Does SQL99 provide any guidance here?
The latter seems to agree with 6.12 () of SQL2003. 
I'd think we'd want the former supported anyway as an extension to standard.

Almost.  I ended up keeping TupleDescGetSlot as a live function, but its
true purpose is only to ensure that the tupledesc gets registered with
the type cache (see BlessTupleDesc() in CVS tip).  The slot per se never
gets used.  I believe that CVS tip is source-code-compatible with
existing SRFs, even though I adjusted all the ones in the distribution
to stop using the TupleTableSlot stuff.
Almost compatible. I found that, to my surprise, PL/R compiles with no 
changes after your commit. However it no segfaults (as I expected) on 
composite type arguments. Should be easy to fix though (I think, really 
haven't looked at it hard yet).

The main point though is that row Datums now contain sufficient info
embedded in them to allow runtime type lookup the same as we do for arrays.
Sounds good to me.

There are several in the PL sources now, for instance plpgsql does this
with an incoming rowtype argument:
Perfect -- thanks.

As an aside, it would be quite useful to have support for arrays of 
tuples. Any idea on how to do that without needing to define an explicit 
array type for each tuple type?
Hmm, messy ...

I wonder now whether we still really need a separate pg_type entry for
every array type.  The original motivation for doing that has been at
least partly subsumed by storing element type OIDs inside the arrays
themselves.  I wonder if we could go over to a scheme where, say,
atttypid is the base type ID and attndims being nonzero is what you
check to find out it's really an array of atttypid.  Not sure how we
could map that idea into function and expression args/results, though.
Hmmm. I had thought maybe we could use a single datatype (anyarray?) 
with in/out functions that would need to do the right thing based on the 
element type. This would also allow, for example, arrays-of-arrays, 
which is the way that SQL99/2003 seem to allow for multidimensional arrays.

Plan B would be to go ahead and create array types.  Not sure I would
want to do this for table rowtypes, but if we did it only for CREATE
TYPE AS then it doesn't sound like an unreasonable amount of overhead.
I was hoping we wouldn't need to do that.

Joe

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Better support for whole-row operations and composite

2004-04-02 Thread Joe Conway

Tom Lane wrote:
Joe Conway <[EMAIL PROTECTED]> writes:
Almost compatible. I found that, to my surprise, PL/R compiles with no 
changes after your commit. However it no segfaults (as I expected) on 
composite type arguments. Should be easy to fix though (I think, really 
haven't looked at it hard yet).
Let me know what you find out --- if I missed a trick on compatibility,
there's still plenty of time to fix it.
I still haven't had time to look closely, and well may have been doing 
something non-standard all along, but in any case this is the current 
failing code:

  else if (function->arg_is_rel[i])
  {
/* for tuple args, convert to a one row data.frame */
TupleTableSlot *slot = (TupleTableSlot *) arg[i];
HeapTuple   tuples = slot->val;
TupleDesc   tupdesc = slot->ttc_tupleDescriptor;
PROTECT(el = pg_tuple_get_r_frame(1, &tuples, tupdesc));
  }
The problem was (I think -- I'll check a little later) that 
slot->ttc_tupleDescriptor is now '\0'.

Hmmm. I had thought maybe we could use a single datatype (anyarray?) 
with in/out functions that would need to do the right thing based on the 
element type.
If we have just one datatype, how will the parser determine the type of
a "foo[subscript]" expression?  After thinking a bit, I don't see how to
do that except by adding an out-of-line decoration to the underlying
type, somewhat like we do for "setof" or atttypmod.  This is doable as
far as the backend itself is concerned, but the compatibility
implications for clients and user-written extensions seem daunting :-(
I'll think-about/play-with this some more, hopefully this weekend.

Thanks,

Joe

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Better support for whole-row operations and composite

2004-04-02 Thread Joe Conway

Tom Lane wrote:
Joe Conway <[EMAIL PROTECTED]> writes:
/* for tuple args, convert to a one row data.frame */ 
TupleTableSlot *slot = (TupleTableSlot *) arg[i]; HeapTuple		tuples
= slot->val; TupleDesc		tupdesc = slot->ttc_tupleDescriptor;
Um.  Well, the arg is not a TupleTableSlot * anymore, so this is 
guaranteed to fail.  This isn't part of what I thought the documented
 SRF API was though.
I'm sure you're correct. The SRF API was for user defined functions, not
procedural languages anyway. I'll look at how the other procedural
languages handle tuple arguments.
If you take the arg[i] value and pass it to GetAttributeByName or
GetAttributeByNum it will work (with some compiler warnings) and
AFAICS we never documented more than that.
OK, thanks,

Joe

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match

Re: [HACKERS] Better support for whole-row operations and composite

2004-04-02 Thread Joe Conway

Tom Lane wrote:
Joe Conway <[EMAIL PROTECTED]> writes:
... The SRF API was for user defined functions, not
procedural languages anyway. I'll look at how the other procedural
languages handle tuple arguments.
It was a dozen-or-so-lines change in each of the PL languages AFAIR.
You will probably also want to look at what you do to return tuple
results.
OK, thanks.

Just for reference, what is arg[i] if it isn't a (TupleTableSlot *) 
anymore -- is it just a HeapTuple?

Joe

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Better support for whole-row operations and composite

2004-04-03 Thread Joe Conway

Tom Lane wrote:
No, it's a HeapTupleHeader pointer.  You need to reconstruct a
HeapTuple on top of that to work with heap_getattr and most other
core backend routines.
Thanks.

For triggers, I was previously building up the arguments thus:

slot = TupleDescGetSlot(tupdesc);
slot->val = trigdata->tg_trigtuple;
arg[7] = PointerGetDatum(slot);
I suppose now I should do this instead?

arg[7] = PointerGetDatum(trigdata->tg_trigtuple->t_data);


Also don't forget to ensure that you detoast the datum; this is not
useful at the moment but will be important Real Soon Now. I added
standard argument-fetch macros to fmgr.h to help with the detoasting
bit.
OK. This is the net result:

#ifdef PG_VERSION_75_COMPAT
  Oid   tupType;
  int32 tupTypmod;
  TupleDesc tupdesc;
  HeapTuple tuple = palloc(sizeof(HeapTupleData));
  HeapTupleHeader   tuple_hdr = DatumGetHeapTupleHeader(arg[i]);
  tupType = HeapTupleHeaderGetTypeId(tuple_hdr);
  tupTypmod = HeapTupleHeaderGetTypMod(tuple_hdr);
  tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);
  tuple->t_len = HeapTupleHeaderGetDatumLength(tuple_hdr);
  ItemPointerSetInvalid(&(tuple->t_self));
  tuple->t_tableOid = InvalidOid;
  tuple->t_data = tuple_hdr;
  PROTECT(el = pg_tuple_get_r_frame(1, &tuple, tupdesc));

  pfree(tuple);
#else
  TupleTableSlot *slot = (TupleTableSlot *) arg[i];
  HeapTuple   tuple = slot->val;
  TupleDesc   tupdesc = slot->ttc_tupleDescriptor;
  PROTECT(el = pg_tuple_get_r_frame(1, &tuple, tupdesc));
#endif /* PG_VERSION_75_COMPAT */
Given the above changes, it's almost working now -- only problem left is 
with triggers:

  insert into foo values(11,'cat99',1.89);
+ ERROR:  record type has not been registered
+ CONTEXT:  In PL/R function rejectfoo
  delete from foo;
+ ERROR:  cache lookup failed for type 0
+ CONTEXT:  In PL/R function rejectfoo
(and a few other similar failures)

Any ideas why the trigger tuple type isn't registered, or what I'm doing 
wrong?

Thanks,

Joe

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Better support for whole-row operations and composite

2004-04-03 Thread Joe Conway

Joe Conway wrote:
Given the above changes, it's almost working now -- only problem left is 
with triggers:

  insert into foo values(11,'cat99',1.89);
+ ERROR:  record type has not been registered
+ CONTEXT:  In PL/R function rejectfoo
  delete from foo;
+ ERROR:  cache lookup failed for type 0
+ CONTEXT:  In PL/R function rejectfoo
(and a few other similar failures)

Any ideas why the trigger tuple type isn't registered, or what I'm doing 
wrong?
A little more info on this. It appears that the tuple type is set to 
either 2249 (RECORDOID) or 0. In the case of RECORDOID this traces all 
the way back to here:

/* 
 *   CreateTemplateTupleDesc
 *
 *   This function allocates and zeros a tuple descriptor structure.
 *
 *   Tuple type ID information is initially set for an anonymous record
 *   type; caller can overwrite this if needed.
 * 
 */
But the type id is never overwritten for a BEFORE INSERT trigger. It 
appears that somewhere it is explictly set to InvalidOid for both BEFORE 
DELETE and AFTER INSERT triggers (and possibly others). My take is that 
we now need to explicitly set the tuple type id for INSERT/UPDATE/DELETE 
statements -- not sure where the best place to do that is though. Does 
this sound correct?

Joe

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org

Re: [HACKERS] Better support for whole-row operations and composite

2004-04-03 Thread Tom Lane

Joe Conway <[EMAIL PROTECTED]> writes:
>> Any ideas why the trigger tuple type isn't registered, or what I'm doing 
>> wrong?

> A little more info on this. It appears that the tuple type is set to 
> either 2249 (RECORDOID) or 0.

After further thought, we could possibly make it work for BEFORE
triggers, but there's just no way for AFTER triggers: in that case what
you are getting is an image of what went to disk, which is going to
contain transaction info not type info.

If you really want the trigger API for PL/R to be indistinguishable from
the function-call API, then I think you will need to copy the passed
tuple and insert type information.  This is more or less what
ExecEvalVar does now in the whole-tuple case (the critical code is
actually in heap_getsysattr though):

HeapTupleHeaderdtup;

dtup = (HeapTupleHeader) palloc(tup->t_len);
memcpy((char *) dtup, (char *) tup->t_data, tup->t_len);

HeapTupleHeaderSetDatumLength(dtup, tup->t_len);
HeapTupleHeaderSetTypeId(dtup, tupleDesc->tdtypeid);
HeapTupleHeaderSetTypMod(dtup, tupleDesc->tdtypmod);

result = PointerGetDatum(dtup);

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Better support for whole-row operations and composite

2004-04-03 Thread Joe Conway

Tom Lane wrote:

Joe Conway <[EMAIL PROTECTED]> writes:

For triggers, I was previously building up the arguments thus:
slot = TupleDescGetSlot(tupdesc);
slot->val = trigdata->tg_trigtuple;
arg[7] = PointerGetDatum(slot);


I suppose now I should do this instead?
arg[7] = PointerGetDatum(trigdata->tg_trigtuple->t_data);


Hm, no, that won't work because a tuple being passed to a trigger
probably isn't going to contain valid type information.  The API for
calling triggers is different from calling ordinary functions, so
I never thought about trying to make it look the same.  At what point
are you trying to do the above, anyway?
That's a shame -- it used to work fine -- done this way so the same 
function could handle tuple arguments to regular functions, and old/new 
tuples to trigger functions. It is in plr_trigger_handler(); vaguely 
similar to pltcl_trigger_handler(). I'll have to figure out a workaround 
I guess.

Joe

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Better support for whole-row operations and composite

2004-04-03 Thread Joe Conway

Tom Lane wrote:
If you really want the trigger API for PL/R to be indistinguishable from
the function-call API, then I think you will need to copy the passed
tuple and insert type information.  This is more or less what
ExecEvalVar does now in the whole-tuple case (the critical code is
actually in heap_getsysattr though):
That got me there. It may not be the best in terms of pure speed, but it 
is easier and simpler than refactoring, at least at the moment. And I 
don't think the reason people will choose PL/R for triggers is speed in 
any case ;-)

Thanks!

Joe

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Better support for whole-row operations and composite types

2004-03-29 Thread Greg Stark

Tom Lane <[EMAIL PROTECTED]> writes:

> We have a number of issues revolving around the fact that composite types
> (row types) aren't first-class objects.  I think it's past time to fix
> that.  
...
> Only named composite types, not RECORD, will be allowed to be used as
> table column types.  

If I understand what you're talking about, you would be allowed to CREATE TYPE
a composite type, like say, "address" and then use that as a datatype all over
your database? And then if you find "address" needs a new field you can add it
to the type and automatically have it added all over your database to any
table column using that type?

Speaking as a user, that would be **very** nice. I've often found myself
wishing for just such a feature. It would simplify data model maintenance a
whole heck of a lot.

How will client programs see the data if i do a "select *"? In my ideal world
it would be shipped over in a binary representation that a driver would
translate to a perl hash / php array / whatever. But maybe it would be simpler
to just ship them over the subcolumns with names like "shipping.line_1" and
"shipping.country".

-- 
greg

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Better support for whole-row operations and composite types

2004-03-29 Thread Tom Lane

Greg Stark <[EMAIL PROTECTED]> writes:
> If I understand what you're talking about, you would be allowed to
> CREATE TYPE a composite type, like say, "address" and then use that as
> a datatype all over your database? And then if you find "address"
> needs a new field you can add it to the type and automatically have it
> added all over your database to any table column using that type?

I believe that would work, though you might have some issues with cached
plans.

> How will client programs see the data if i do a "select *"?

TBD.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Better support for whole-row operations and composite types

2004-03-30 Thread Josh Berkus

Tom,

> We have a number of issues revolving around the fact that composite types
> (row types) aren't first-class objects.  I think it's past time to fix
> that.  Here are some notes about doing it.  I am not sure all these ideas
> are fully-baked ... comments appreciated.

I'll want to add to the documentation on composite types, then.   We'll need a 
stern warning to users not to abuse them.  Easily done, I think.

Composite types are frequently abused by OO and Windows programmers to break 
the relational model.I used to be an MSDN member (thank you, I've 
recovered) and frequently ran into, on the mailing list, users getting 
themselves into some unresolvable mess becuase they'd used composite types in 
SQL server to combine several rows ... or even effectively an entire child 
table ... into one field.

Othewise, looks good to me.   I don't think I'm qualified to second-guess you 
on the implementation.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Better support for whole-row operations and composite types

2004-04-02 Thread Tom Lane

Joe Conway <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> We will be able to make generic I/O routines for composite types, 
>> comparable to those used now for arrays.  Not sure what a convenient 
>> external format would look like.  (Possibly use the same conventions
>> as for a 1-D array?)

> So you mean like an array, but with possibly mixed datatypes?
> '{1 , "abc def", 2.3}'
> Seems to make sense.

The unresolved question in my mind is how to represent NULL elements.
However, we have to solve that sooner or later for arrays too.  Any
thoughts?

> Another option might be to use the ROW keyword, something like:
> ROW[1 , 'abc', 2.3]

This is a separate issue, just as the ARRAY[] constructor has different
uses from the array I/O representation.  I do want some kind of runtime
constructor, but ROW[...] doesn't get the job done because it doesn't
provide any place to specify the rowtype name.  Maybe we could combine
ROW[...] with some sort of cast notation?

ROW[1 , 'abc', 2.3] :: composite_type_name
CAST(ROW[1 , 'abc', 2.3] AS composite_type_name)

Does SQL99 provide any guidance here?

>> TupleDescGetSlot: no-op, returns NULL TupleGetDatum: ignore slot,
>> return tuple t_data pointer as datum
>> 
>> This will work because heap_formtuple and BuildTupleFromCStrings can 
>> return a HeapTuple whose t_data part is already a valid row Datum,
>> simply by setting the appropriate length and type fields in it.  (If
>> the tuple is ever stored to disk as a regular table row, these fields
>> will be overwritten with xmin/cmin info at that time.)

> Is this the way you did things in your recent commit?

Almost.  I ended up keeping TupleDescGetSlot as a live function, but its
true purpose is only to ensure that the tupledesc gets registered with
the type cache (see BlessTupleDesc() in CVS tip).  The slot per se never
gets used.  I believe that CVS tip is source-code-compatible with
existing SRFs, even though I adjusted all the ones in the distribution
to stop using the TupleTableSlot stuff.

The main point though is that row Datums now contain sufficient info
embedded in them to allow runtime type lookup the same as we do for arrays.

>> To convert a row Datum into something that can be passed to
>> heap_getattr, one could use a local variable of type HeapTupleData
>> and set its t_data field to the datum's pointer value.  t_len is
>> copied from the datum contents, while the other fields of
>> HeapTupleData can just be set to zeroes.

> I think I understand this, but an example would help.

There are several in the PL sources now, for instance plpgsql does this
with an incoming rowtype argument:

if (!fcinfo->argnull[i])
{
HeapTupleHeader td;
OidtupType;
int32tupTypmod;
TupleDesctupdesc;
HeapTupleData tmptup;

td = DatumGetHeapTupleHeader(fcinfo->arg[i]);

/* Extract rowtype info and find a tupdesc */
tupType = HeapTupleHeaderGetTypeId(td);
tupTypmod = HeapTupleHeaderGetTypMod(td);
tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);

/* Build a temporary HeapTuple control structure */
tmptup.t_len = HeapTupleHeaderGetDatumLength(td);
ItemPointerSetInvalid(&(tmptup.t_self));
tmptup.t_tableOid = InvalidOid;
tmptup.t_data = td;

exec_move_row(&estate, NULL, row, &tmptup, tupdesc);
 }

This is okay because the HeapTupleData is not needed after the call to
exec_move_row.

>> * We have to be able to re-use an already-existing cache entry if it 
>> matches a requested TupleDesc.

> For anonymous record types, how will that lookup be done efficiently? 
> Can the hash key be an array of attribute oids?

Right, that's the way I did it.  See src/backend/utils/cache/typcache.c

> As an aside, it would be quite useful to have support for arrays of 
> tuples. Any idea on how to do that without needing to define an explicit 
> array type for each tuple type?

Hmm, messy ...

I wonder now whether we still really need a separate pg_type entry for
every array type.  The original motivation for doing that has been at
least partly subsumed by storing element type OIDs inside the arrays
themselves.  I wonder if we could go over to a scheme where, say,
atttypid is the base type ID and attndims being nonzero is what you
check to find out it's really an array of atttypid.  Not sure how we
could map that idea into function and expression args/results, though.

Plan B would be to go ahead and create array types.  Not sure I would
want to do this for table rowtypes, but if we did it only for CREATE
TYPE AS then it doesn't sound like an unreasonable amount of overhead.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Better support for whole-row operations and composite types

2004-04-02 Thread Tom Lane

Joe Conway <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> ...  I believe that CVS tip is source-code-compatible with
>> existing SRFs, even though I adjusted all the ones in the distribution
>> to stop using the TupleTableSlot stuff.

> Almost compatible. I found that, to my surprise, PL/R compiles with no 
> changes after your commit. However it no segfaults (as I expected) on 
> composite type arguments. Should be easy to fix though (I think, really 
> haven't looked at it hard yet).

Let me know what you find out --- if I missed a trick on compatibility,
there's still plenty of time to fix it.

>> ... I wonder if we could go over to a scheme where, say,
>> atttypid is the base type ID and attndims being nonzero is what you
>> check to find out it's really an array of atttypid.  Not sure how we
>> could map that idea into function and expression args/results, though.

> Hmmm. I had thought maybe we could use a single datatype (anyarray?) 
> with in/out functions that would need to do the right thing based on the 
> element type.

If we have just one datatype, how will the parser determine the type of
a "foo[subscript]" expression?  After thinking a bit, I don't see how to
do that except by adding an out-of-line decoration to the underlying
type, somewhat like we do for "setof" or atttypmod.  This is doable as
far as the backend itself is concerned, but the compatibility
implications for clients and user-written extensions seem daunting :-(

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Better support for whole-row operations and composite types

2004-04-02 Thread Tom Lane

Joe Conway <[EMAIL PROTECTED]> writes:
> I still haven't had time to look closely, and well may have been doing 
> something non-standard all along, but in any case this is the current 
> failing code:

>  /* for tuple args, convert to a one row data.frame */
>  TupleTableSlot *slot = (TupleTableSlot *) arg[i];
>  HeapTupletuples = slot->val;
>  TupleDesctupdesc = slot->ttc_tupleDescriptor;

Um.  Well, the arg is not a TupleTableSlot * anymore, so this is
guaranteed to fail.  This isn't part of what I thought the documented
SRF API was though.  If you take the arg[i] value and pass it to
GetAttributeByName or GetAttributeByNum it will work (with some compiler
warnings) and AFAICS we never documented more than that.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Better support for whole-row operations and composite types

2004-04-02 Thread Tom Lane

Joe Conway <[EMAIL PROTECTED]> writes:
> ... The SRF API was for user defined functions, not
> procedural languages anyway. I'll look at how the other procedural
> languages handle tuple arguments.

It was a dozen-or-so-lines change in each of the PL languages AFAIR.
You will probably also want to look at what you do to return tuple
results.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] Better support for whole-row operations and composite types

2004-04-02 Thread Tom Lane

Joe Conway <[EMAIL PROTECTED]> writes:
> Just for reference, what is arg[i] if it isn't a (TupleTableSlot *) 
> anymore -- is it just a HeapTuple?

No, it's a HeapTupleHeader pointer.  You need to reconstruct a HeapTuple
on top of that to work with heap_getattr and most other core backend
routines.  Also don't forget to ensure that you detoast the datum;
this is not useful at the moment but will be important Real Soon Now.
I added standard argument-fetch macros to fmgr.h to help with the
detoasting bit.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Better support for whole-row operations and composite types

2004-04-03 Thread Tom Lane

Joe Conway <[EMAIL PROTECTED]> writes:
> For triggers, I was previously building up the arguments thus:
>  slot = TupleDescGetSlot(tupdesc);
>  slot->val = trigdata->tg_trigtuple;
>  arg[7] = PointerGetDatum(slot);

> I suppose now I should do this instead?
>  arg[7] = PointerGetDatum(trigdata->tg_trigtuple->t_data);

Hm, no, that won't work because a tuple being passed to a trigger
probably isn't going to contain valid type information.  The API for
calling triggers is different from calling ordinary functions, so
I never thought about trying to make it look the same.  At what point
are you trying to do the above, anyway?

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

[HACKERS] Better support for whole-row operations and composite types

Re: [HACKERS] Better support for whole-row operations and composite

Re: [HACKERS] Better support for whole-row operations and composite

Re: [HACKERS] Better support for whole-row operations and composite

Re: [HACKERS] Better support for whole-row operations and composite

Re: [HACKERS] Better support for whole-row operations and composite

Re: [HACKERS] Better support for whole-row operations and composite

Re: [HACKERS] Better support for whole-row operations and composite

Re: [HACKERS] Better support for whole-row operations and composite

Re: [HACKERS] Better support for whole-row operations and composite

Re: [HACKERS] Better support for whole-row operations and composite

Re: [HACKERS] Better support for whole-row operations and composite

Re: [HACKERS] Better support for whole-row operations and composite

Re: [HACKERS] Better support for whole-row operations and composite types

Re: [HACKERS] Better support for whole-row operations and composite types

Re: [HACKERS] Better support for whole-row operations and composite types

Re: [HACKERS] Better support for whole-row operations and composite types

Re: [HACKERS] Better support for whole-row operations and composite types

Re: [HACKERS] Better support for whole-row operations and composite types

Re: [HACKERS] Better support for whole-row operations and composite types

Re: [HACKERS] Better support for whole-row operations and composite types

Re: [HACKERS] Better support for whole-row operations and composite types

22 matches

Site Navigation

Mail list logo

Footer information