Re: [HACKERS] I/O support for composite types

2004-06-10 Thread Greg Stark

  regression=# insert into bar values (row(row(1.1, 2.2), row(3.3, 4.4)));
 
 BTW, I forgot to mention that the keyword ROW is optional as long as
 you've got at least two items in the row expression, so the above can
 be simplified to
 
 regression=# insert into bar values (((1.1, 2.2), (3.3,4.4)));
 INSERT 155011 1
 
 Some other examples:
 
 regression=# select (1,2)::complex;
 ERROR:  output of composite types not implemented yet
 regression=# select cast ((1,2) as complex);
 ERROR:  output of composite types not implemented yet
 
 Looking at these, it does seem like it would be natural to get back
 
  complex
 ---------
  (1,2)
 
 so I'll now agree with you that the I/O syntax should use parens not
 braces as the outer delimiters.


Following this path, perhaps the array i/o syntax should be changed to use []s
and the keyword ARRAY should likewise be optional in the array constructor.

That would let people do things like insert into bar values ([(1,2),(2,3)])
to insert a list of point/complex data structures, and get back
'[(1,2),(2,3)]' in their dumps.


Personally I would have been more inclined to use braces for structs in both
places. And either parens or brackets for arrays. But eh. This whole thing is
just too cool to worry about the choice of delimiters.

-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] I/O support for composite types

2004-06-10 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Following this path, perhaps the array i/o syntax should be changed to
 use []s

I would think about that if there weren't compatibility issues to worry
about, but in practice the pain from such an incompatible change would
vastly outweigh the benefit.

 and the keyword ARRAY should likewise be optional in the array constructor.

Not sure this is syntactically feasible, or a good idea even if it is
possible to get bison to take it --- it might foreclose more useful
syntactic ideas later on.  (I wouldn't think that omitting ROW is a
good idea either, but the spec says we have to.)

regards, tom lane



Re: [HACKERS] I/O support for composite types

2004-06-05 Thread Thomas Hallgren
Tom Lane wrote:
 I am inclined to define this similarly to the representation for arrays;
 however, we need to allow for NULLs.  I suggest
    {item,item,item}
 The separator is always comma (it can't be type-specific since the items
 might have different types).  Backslashes and double quotes can be used
 in the usual ways to quote characters in the item strings.  If an item
 string is completely empty it is taken as NULL; to write an actual
 empty-string value, you must write "".  There is an ambiguity whether
 '{}' represents a zero-column row or a one-column row containing a NULL,
 but I don't think this is a problem since the input converter will
 always know how many columns it is expecting.

 There are a couple of fine points of the array I/O behavior that I think
 we should not emulate.  One is that leading whitespace in an item string
 is discarded.  This seems inconsistent, mainly because trailing
 whitespace isn't discarded.  In the cases where it really makes sense to
 discard whitespace (namely numeric datatypes), the underlying datatype's
 input converter can do that just fine, and so I suggest that the record
 converter itself should not discard whitespace.  It seems OK to ignore
 whitespace before and after the outer braces, however.

 The other fine point has to do with double quoting.  In the array code,
    {"ab""cd"}
 is legal input representing an item 'abcd'.  I think it would be more
 consistent with usual SQL conventions to treat it as meaning 'ab"cd',
 that is a doubled double quote within double quotes should represent a
 double quote not nothing.  Anyone have a strong feeling one way or the
 other?

Why not use standard C semantics for the textual representation with
your addition that empty items are NULL? It becomes fairly
straightforward, IMO highly readable, and the rules to define both arrays
and complex types are well known and documented.

Here's an array of two composite elements of the same type. The last two 
items of the second element are NULL. The type is {int, double, string, char}

{
  {12, 123.4, "some string with \"a quoted string\" inside of it", 'c'},
  {13, -3.2,,}
}
This will also allow you to distinguish strings from identifiers. That 
might prove extremely important if you ever plan to serialize self 
referencing structures (a structure could then represent itself as 
ref_oid or something and thereby refer to itself).

Kind regards,
Thomas Hallgren
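
For readers following the thread, here is a minimal Python sketch of the
record input rules quoted at the top of this message: comma separator,
backslash and double-quote quoting, a completely empty item read as NULL,
a doubled double quote inside quotes read as one double quote, and no
whitespace trimming inside items. The name parse_record and its parameters
are hypothetical; this is not PostgreSQL code. It reads '{}' as a one-column
row containing NULL, which is only one of the two readings mentioned, since
a real converter would know the expected column count.

```python
from typing import List, Optional


def parse_record(text: str, open_ch: str = "{", close_ch: str = "}") -> List[Optional[str]]:
    """Split a record literal into item strings; None stands for SQL NULL."""
    s = text.strip()  # whitespace outside the outer delimiters is ignored
    if len(s) < 2 or s[0] != open_ch or s[-1] != close_ch:
        raise ValueError("malformed record literal: " + repr(text))
    body = s[1:-1]

    items: List[Optional[str]] = []
    buf: List[str] = []
    quoted_seen = False  # True once the current item has used double quotes
    in_quotes = False
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # A backslash takes the next character literally.
            if i + 1 >= len(body):
                raise ValueError("dangling backslash")
            buf.append(body[i + 1])
            i += 2
            continue
        if in_quotes:
            if ch == '"':
                if body[i + 1:i + 2] == '"':
                    buf.append('"')  # doubled quote stands for one quote
                    i += 2
                    continue
                in_quotes = False
            else:
                buf.append(ch)  # commas and whitespace are kept as-is
        elif ch == '"':
            in_quotes = True
            quoted_seen = True
        elif ch == ",":
            # An item with no characters and no quotes is NULL.
            items.append("".join(buf) if (buf or quoted_seen) else None)
            buf, quoted_seen = [], False
        else:
            buf.append(ch)  # inner whitespace is left for the item's own type
        i += 1
    if in_quotes:
        raise ValueError("unterminated double quote")
    items.append("".join(buf) if (buf or quoted_seen) else None)
    return items


print(parse_record('{13, -3.2,,}'))   # ['13', ' -3.2', None, None]
print(parse_record('  {a,"",b}  '))   # ['a', '', 'b']
print(parse_record('{"ab""cd"}'))     # ['ab"cd']
```

Note that the leading space in ' -3.2' survives; under the proposal it is
the numeric type's input converter, not the record converter, that would
discard it.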


Re: [HACKERS] I/O support for composite types

2004-06-05 Thread Tom Lane
Thomas Hallgren [EMAIL PROTECTED] writes:
 Why not use standard C semantics for the textual representation with 
 your addition that empty items are NULL?

This isn't C, it's SQL; and I think the array I/O representation is a
closer precedent for us than the C standard.

In any case, how much of C syntax are you proposing to emulate exactly?
Comments?  Backslashed newlines?  Joining of adjacent double-quoted
strings?  Conversion of octal and hex integer constants (and what about
L, U, LL, etc suffixes)?  There's a lot more stuff there than meets the
eye, and most of it isn't something I want to code.

regards, tom lane



Re: [HACKERS] I/O support for composite types

2004-06-05 Thread elein
Composite types will work recursively, right?
That is a composite type inside of a composite type column?
Does the SQL dot syntax support this nested referencing?
Or are we only allowing one level.


Why not just use the syntax of the insert values with parens?

insert into table values (...);

is very familiar so the corresponding:

insert into table values ( 'xxx', ('yyy', 123), 456 );

is also easy to understand and remember: a row is being inserted.  

Is there a specific reason why you want curly brackets?
I have not been following this much, to my chagrin.

On Sat, Jun 05, 2004 at 12:57:27PM -0400, Tom Lane wrote:
 good stuff deleted...
 
 There are a couple of fine points of the array I/O behavior that I think
 we should not emulate.  One is that leading whitespace in an item string
 is discarded.  This seems inconsistent, mainly because trailing
 whitespace isn't discarded.  In the cases where it really makes sense to
 discard whitespace (namely numeric datatypes), the underlying datatype's
 input converter can do that just fine, and so I suggest that the record
 converter itself should not discard whitespace.  It seems OK to ignore
 whitespace before and after the outer braces, however.

If the whitespace is inside of the item, do not discard it; let the
underlying type deal with it. If the white space is outside of the
item, ignore it.  I think you probably meant this, but just to be sure.  
{ "   item number one   " } == input_text('   item number one   ')


 more good stuff deleted
 
 Comments, objections, better ideas?
 
   regards, tom lane
 

--elein

[EMAIL PROTECTED]    Varlena, LLC    www.varlena.com

PostgreSQL Consulting, Support & Training

PostgreSQL General Bits   http://www.varlena.com/GeneralBits/
==============================================================
I have always depended on the [QA] of strangers.




Re: [HACKERS] I/O support for composite types

2004-06-05 Thread Thomas Hallgren
Tom Lane wrote:
 Thomas Hallgren [EMAIL PROTECTED] writes:
  Why not use standard C semantics for the textual representation with
  your addition that empty items are NULL?

 This isn't C, it's SQL; and I think the array I/O representation is a
 closer precedent for us than the C standard.

 In any case, how much of C syntax are you proposing to emulate exactly?
 Comments?  Backslashed newlines?  Joining of adjacent double-quoted
 strings?  Conversion of octal and hex integer constants (and what about
 L, U, LL, etc suffixes)?  There's a lot more stuff there than meets the
 eye, and most of it isn't something I want to code.

I'm not proposing a full C parser implementation :-) Just the static data
initializer part.

To answer how much of the C syntax:
Comments, no. SQL has a standard for comments that doesn't conflict with 
C semantics for data initializers.

Joining of adjacent double-quoted strings. Yes, of course. That's what 
you already do for arrays today. Without this, it will be hard to write 
long strings in a readable way.

Conversion of backslashed newlines, octal and integer constants within 
strings, yes, why not? The issue of non-printables needs to be addressed 
somehow. What do you propose?

Regarding the L, U, LL suffixes, that depends on how you plan to
tackle different character sets. Perhaps UTF-8 with Unicode escapes
would be better. Some mechanism is needed, that's for sure.

Kind regards,
Thomas Hallgren


Re: [HACKERS] I/O support for composite types

2004-06-05 Thread Tom Lane
elein [EMAIL PROTECTED] writes:
 Composite types will work recursively, right?
 That is a composite type inside of a composite type column?

You can have that, but I'm not intending that the I/O syntax be
explicitly aware of it.  A composite type field would just be an
item (and hence would have to be suitably quoted).  So it would
look something like
{somecolumn,"{anothercolumn,\"a quoted column\"}",column3}
if we go with the syntax I originally suggested.

Note that just as we offer ARRAY[] to avoid having to write this sort
of thing in SQL statements, we offer ROW() so you can synthesize
composite values without actually having to write this junk.  I see
this mainly as a dump/reload representation, so I'm not too worried
about whether complex cases can be written simply.

 Does the SQL dot syntax support this nested referencing?
 Or are we only allowing one level.

You have to parenthesize to avoid ambiguity against the normal
table.field notation, but beyond that it works.  For instance
(this is a real example with CVS tip + error check removed):

regression=# create type complex as (r float8, i float8);
CREATE TYPE
regression=# create table foo (c complex);
CREATE TABLE
regression=# insert into foo values(row(1.1, 2.2));
INSERT 154998 1
-- this doesn't work yet:
regression=# select c from foo f;
ERROR:  output of composite types not implemented yet
-- here is the ambiguity problem:
regression=# select c.r from foo f;
NOTICE:  adding missing FROM-clause entry for table c
ERROR:  column c.r does not exist
-- which you can fix like this:
regression=# select (c).r, (f.c).i from foo f;
  r  |  i
-----+-----
 1.1 | 2.2
(1 row)

-- nested types work about like you'd expect:
regression=# create type quad as (c1 complex, c2 complex);
CREATE TYPE
regression=# create table bar (q quad);
CREATE TABLE
regression=# insert into bar values (row(row(1.1, 2.2), row(3.3, 4.4)));
INSERT 155006 1
regression=# select (q).c2.r from bar;
  r
-----
 3.3
(1 row)

 Why not just use the syntax of the insert values with parens?
   insert into table values (...);
 is very familiar so the corresponding:
   insert into table values ( 'xxx', ('yyy', 123), 456 );
 is also easy to understand and remember: a row is being inserted.  

I don't particularly care one way or the other about parens versus
braces; anyone else have an opinion on that?

However, I do want to follow the array syntax to the extent of using
double not single quotes for quoting items.  Otherwise you've got a mess
when you do try to write one of these things as a SQL literal.
For instance, instead of
'{1.1,2.2}'::complex
you'd have to write
'{\'1.1\',\'2.2\'}'::complex
which is just painful.  (In this particular example of course the inner
quotes could just be dropped entirely, but with textual fields they
would often be necessary.)

regards, tom lane
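
A note on the nested case described earlier in this message: since a
composite field is just another item, the inner record is written out first
and then quoted like any other item, with its own double quotes
backslash-escaped. Below is a minimal Python sketch of that output step,
assuming the brace syntax used in this message. The names quote_item and
record_out are hypothetical, and quoting every item that contains
whitespace is an assumption of the sketch, not something settled in the
thread.

```python
from typing import Optional, Sequence

# Characters that would confuse the reader if left unquoted (assumed set).
SPECIAL = set('{}(),"\\')


def quote_item(item: Optional[str]) -> str:
    """Render one item; None (SQL NULL) becomes a completely empty item."""
    if item is None:
        return ""
    needs_quotes = item == "" or any(c in SPECIAL or c.isspace() for c in item)
    if not needs_quotes:
        return item
    # Escape backslashes first, then double quotes, then wrap in quotes.
    escaped = item.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def record_out(items: Sequence[Optional[str]]) -> str:
    """Join already-rendered item strings into one record literal."""
    return "{" + ",".join(quote_item(i) for i in items) + "}"


inner = record_out(["anothercolumn", "a quoted column"])
print(inner)   # {anothercolumn,"a quoted column"}
print(record_out(["somecolumn", inner, "column3"]))
# {somecolumn,"{anothercolumn,\"a quoted column\"}",column3}
print(record_out(["1.1", None, ""]))   # {1.1,,""}
```

The outer call knows nothing about nesting; it only sees a string that
happens to contain braces, commas and quotes, which is why the I/O syntax
does not need to be explicitly aware of composite fields.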



Re: [HACKERS] I/O support for composite types

2004-06-05 Thread Tom Lane
I wrote:
 regression=# insert into bar values (row(row(1.1, 2.2), row(3.3, 4.4)));

BTW, I forgot to mention that the keyword ROW is optional as long as
you've got at least two items in the row expression, so the above can
be simplified to

regression=# insert into bar values (((1.1, 2.2), (3.3,4.4)));
INSERT 155011 1

Some other examples:

regression=# select (1,2)::complex;
ERROR:  output of composite types not implemented yet
regression=# select cast ((1,2) as complex);
ERROR:  output of composite types not implemented yet

Looking at these, it does seem like it would be natural to get back

 complex
---------
 (1,2)

so I'll now agree with you that the I/O syntax should use parens not
braces as the outer delimiters.

regards, tom lane



Re: [HACKERS] I/O support for composite types

2004-06-05 Thread elein
Good reason. Now I'm excited.  I'll download and run
tests and try to do a write up in general bits next week.

cheers,
elein

On Sat, Jun 05, 2004 at 05:00:24PM -0400, Tom Lane wrote:
 I wrote:
  regression=# insert into bar values (row(row(1.1, 2.2), row(3.3, 4.4)));
 
 BTW, I forgot to mention that the keyword ROW is optional as long as
 you've got at least two items in the row expression, so the above can
 be simplified to
 
 regression=# insert into bar values (((1.1, 2.2), (3.3,4.4)));
 INSERT 155011 1
 
 Some other examples:
 
 regression=# select (1,2)::complex;
 ERROR:  output of composite types not implemented yet
 regression=# select cast ((1,2) as complex);
 ERROR:  output of composite types not implemented yet
 
 Looking at these, it does seem like it would be natural to get back
 
  complex
 ---------
  (1,2)
 
 so I'll now agree with you that the I/O syntax should use parens not
 braces as the outer delimiters.
 
   regards, tom lane
 


Re: [HACKERS] I/O support for composite types

2004-06-05 Thread Thomas Hallgren
Tom Lane wrote:
  Why not just use the syntax of the insert values with parens?
    insert into table values (...);
  is very familiar so the corresponding:
    insert into table values ( 'xxx', ('yyy', 123), 456 );
  is also easy to understand and remember: a row is being inserted.

 I don't particularly care one way or the other about parens versus
 braces; anyone else have an opinion on that?

My vote would be on parens. It's more coherent. Do you use braces
anywhere else?

Kind regards,
Thomas Hallgren