[HACKERS] column b is of type X but expression is of type text

2013-07-12 Thread Benedikt Grundmann
A third party application we use generates SQL queries.  Here is query it
generated that broke today and for which I have a hard time arguing that
the postgres behavior is correct (minimally the error message is confusing):

=# create temporary table foo (b double precision );
CREATE TABLE
Time: 40.368 ms
=# insert into foo select min(NULL);
ERROR:  column b is of type double precision but expression is of type
text
LINE 1: insert into foo select min(NULL);
   ^
HINT:  You will need to rewrite or cast the expression.

So why does min(NULL) have type text?  According to the docs it has the
type of the input.  The value is itself NULL which is a valid member of all
types in SQL isn't it?

So what is going on?

Thanks,

Bene


Re: [HACKERS] column b is of type X but expression is of type text

2013-07-12 Thread David Johnston
Benedikt Grundmann wrote
 A third party application we use generates SQL queries.  Here is query it
 generated that broke today and for which I have a hard time arguing that
 the postgres behavior is correct (minimally the error message is
 confusing):
 
 =# create temporary table foo (b double precision );
 CREATE TABLE
 Time: 40.368 ms
 =# insert into foo select min(NULL);
 ERROR:  column b is of type double precision but expression is of type
 text
 LINE 1: insert into foo select min(NULL);
^
 HINT:  You will need to rewrite or cast the expression.
 
 So why does min(NULL) have type text?  According to the docs it has the
 type of the input.  The value is itself NULL which is a valid member of
 all
 types in SQL isn't it?
 
 So what is going on?
 
 Thanks,
 
 Bene

Ideally PostgreSQL would be smart enough to recognize that min(NULL) is of
an unknown type and thus would use the definition of foo to coerce NULL to
the desired type.  I cannot explain why it does not do this but from the
example it cannot.

Using a literal NULL without an explicit type-cast is not recommended as the
system cannot always accurately figure out what type you mean for it to use. 
Being a valid value for all types does not mean it magically switches to fit
whatever usage is required.  Columns are typed, not values per-se, and so
NULL can belong in any column but once it is part of that column it takes on
that column's type.

The query you show is pretty pointless since the intent of min is to take
a column over which to aggregate; not a literal which will only ever return
itself.

In short the SELECT query is trying its best to execute and so in the
presence of an unadorned NULL - and being unable to infer the type from
context - it simply uses the default type which is text.  The SELECT
executes just fine, and outputs a min column of type text which when
supplied to the table foo causes the type mis-match for column b on
foo.

The PostgreSQL behavior is simple because it does not infer the type of
NULL from the column in foo but it is not required to do so its failure is
not wrong.  The error message, given what does occur, makes perfect sense
and is easy enough to trace (i.e., what column is feeding foo.b from the
SELECT statement; then, why is that column being seen as text).

PostgreSQL is in the opinion of some too verbose in its requirement to be
explicit regarding types but it does make for less buggy code overall.  This
particular use-case may be solvable but I'd argue that your example is not
likely to convince anyone that it is a serious enough problem worth the
effort it would take to do so.

David J.






--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/column-b-is-of-type-X-but-expression-is-of-type-text-tp5763586p5763587.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] column b is of type X but expression is of type text

2013-07-12 Thread Merlin Moncure
On Fri, Jul 12, 2013 at 8:47 AM, Benedikt Grundmann
bgrundm...@janestreet.com wrote:
 A third party application we use generates SQL queries.  Here is query it
 generated that broke today and for which I have a hard time arguing that the
 postgres behavior is correct (minimally the error message is confusing):

 =# create temporary table foo (b double precision );
 CREATE TABLE
 Time: 40.368 ms
 =# insert into foo select min(NULL);
 ERROR:  column b is of type double precision but expression is of type
 text
 LINE 1: insert into foo select min(NULL);
^
 HINT:  You will need to rewrite or cast the expression.

 So why does min(NULL) have type text?  According to the docs it has the type
 of the input.  The value is itself NULL which is a valid member of all types
 in SQL isn't it?

 So what is going on?

This is not a question for -hackers.

Postgres is strictly typed -- there is no variant type.  So even
though some functions can configured to support multiple input types
via overloading, 'any' arguments, etc. when stuff actually gets done
type coercion has to take place and text is chosen as a type of last
resort.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] column b is of type X but expression is of type text

2013-07-12 Thread Benedikt Grundmann
Thanks David,

I like the fact that postgres is explicit in it's types.  All I'm arguing
is that error message is misleading. And that I had a hard time
understanding why happened what happened.  The part I was missing is that
despite supporting an any type the necessary type inference is very very
local and quickly resorts to the default type.

thanks everyone,

Bene


On Fri, Jul 12, 2013 at 3:17 PM, David Johnston pol...@yahoo.com wrote:

 Benedikt Grundmann wrote
  A third party application we use generates SQL queries.  Here is query it
  generated that broke today and for which I have a hard time arguing that
  the postgres behavior is correct (minimally the error message is
  confusing):
 
  =# create temporary table foo (b double precision );
  CREATE TABLE
  Time: 40.368 ms
  =# insert into foo select min(NULL);
  ERROR:  column b is of type double precision but expression is of type
  text
  LINE 1: insert into foo select min(NULL);
 ^
  HINT:  You will need to rewrite or cast the expression.
 
  So why does min(NULL) have type text?  According to the docs it has the
  type of the input.  The value is itself NULL which is a valid member of
  all
  types in SQL isn't it?
 
  So what is going on?
 
  Thanks,
 
  Bene

 Ideally PostgreSQL would be smart enough to recognize that min(NULL) is
 of
 an unknown type and thus would use the definition of foo to coerce NULL
 to
 the desired type.  I cannot explain why it does not do this but from the
 example it cannot.

 Using a literal NULL without an explicit type-cast is not recommended as
 the
 system cannot always accurately figure out what type you mean for it to
 use.
 Being a valid value for all types does not mean it magically switches to
 fit
 whatever usage is required.  Columns are typed, not values per-se, and so
 NULL can belong in any column but once it is part of that column it takes
 on
 that column's type.

 The query you show is pretty pointless since the intent of min is to take
 a column over which to aggregate; not a literal which will only ever return
 itself.

 In short the SELECT query is trying its best to execute and so in the
 presence of an unadorned NULL - and being unable to infer the type from
 context - it simply uses the default type which is text.  The SELECT
 executes just fine, and outputs a min column of type text which when
 supplied to the table foo causes the type mis-match for column b on
 foo.

 The PostgreSQL behavior is simple because it does not infer the type of
 NULL from the column in foo but it is not required to do so its failure is
 not wrong.  The error message, given what does occur, makes perfect sense
 and is easy enough to trace (i.e., what column is feeding foo.b from the
 SELECT statement; then, why is that column being seen as text).

 PostgreSQL is in the opinion of some too verbose in its requirement to be
 explicit regarding types but it does make for less buggy code overall.
  This
 particular use-case may be solvable but I'd argue that your example is not
 likely to convince anyone that it is a serious enough problem worth the
 effort it would take to do so.

 David J.






 --
 View this message in context:
 http://postgresql.1045698.n5.nabble.com/column-b-is-of-type-X-but-expression-is-of-type-text-tp5763586p5763587.html
 Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] column b is of type X but expression is of type text

2013-07-12 Thread Josh Berkus
On 07/12/2013 07:28 AM, Benedikt Grundmann wrote:
 Thanks David,
 
 I like the fact that postgres is explicit in it's types.  All I'm arguing
 is that error message is misleading. And that I had a hard time
 understanding why happened what happened.  The part I was missing is that
 despite supporting an any type the necessary type inference is very very
 local and quickly resorts to the default type.

No argument that it would be nice to have a more apropos error message.
 However, that's harder to achieve than you realize.

Here's a simplified version what happens:

1. you hand PostgreSQL an unadorned NULL.  It realizes it doesn't have a
type, and makes it temporarily the default type (text) in hopes that the
next stage will provide a type.

2. you call min().  Min() works for many datatypes.  Min() says: can I
work for text?  The answer is yes, so at this point the NULL which
was default text becomes *really* text.

3. you try to assign the result of MIN() to a column of type double.
This is when the error is encountered.  The planner/executor doesn't
know that the reason min() is emitting text is because you handed it an
unadorned NULL; it just knows that it was expecting a double, and it got
text.  At this point, it can't tell the difference between min(NULL) and
min('Josh'::TEXT).

To get a better error message, the query engine would need to reach back
to step (1) when it encounters the error at step (3).

The alternative would be to disallow unadorned NULLs entirely, which
would break thousands of applications.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] column b is of type X but expression is of type text

2013-07-12 Thread David Johnston
Josh Berkus wrote
 On 07/12/2013 07:28 AM, Benedikt Grundmann wrote:
 Thanks David,
 
 I like the fact that postgres is explicit in it's types.  All I'm arguing
 is that error message is misleading. And that I had a hard time
 understanding why happened what happened.  The part I was missing is that
 despite supporting an any type the necessary type inference is very very
 local and quickly resorts to the default type.
 
 2. you call min().  Min() works for many datatypes.  Min() says: can I
 work for text?  The answer is yes, so at this point the NULL which
 was default text becomes *really* text.
 
 .
 .
 .
 
 The alternative would be to disallow unadorned NULLs entirely, which
 would break thousands of applications.

In the absence of the function call the system is able to delay resolving
the type until later in the query:

SELECT *
FROM (VALUES ('2013-02-01'::date), ('2013-01-01'), (NULL) ) vals (col1);
--works


SELECT *
FROM (VALUES ('2013-02-01'::date), ('2013-01-01'), (min(NULL)) ) vals
(col1); --fails

I have no idea how this mechanism works but ISTM that the planner could, for
anyelement, look at where the result of the function call is used and add
a cast to the function input value to match the desired result type if the
input type is undefined.

I'm curious what you would consider to be a more apropos error message in
this situation; regardless of how difficult it would be to implement.

I am also curious if you can think of a better example of where this
behavior is problematic.  The query for this thread is not something that I
would deem to be good SQL.

David J.





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/column-b-is-of-type-X-but-expression-is-of-type-text-tp5763586p5763615.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] column b is of type X but expression is of type text

2013-07-12 Thread Josh Berkus
David,

 I have no idea how this mechanism works but ISTM that the planner could, for
 anyelement, look at where the result of the function call is used and add
 a cast to the function input value to match the desired result type if the
 input type is undefined.

Well, that's not how anyelement works, actually.  And the input type
for min() is not anyelement.

 I'm curious what you would consider to be a more apropos error message in
 this situation; regardless of how difficult it would be to implement.

ERROR: unable to determine appropriate type for 'NULL'

But again, don't hold your breath, per above.

 I am also curious if you can think of a better example of where this
 behavior is problematic.  The query for this thread is not something that I
 would deem to be good SQL.

Yeah, but it gets generated a lot.  And per your other example,
sometimes it *does* work, so developers/ORM authors start to rely on it.
 And then it breaks.

Mostly the problematic cases are involving function parameters, where
adding a new version of a function can suddently cause a call with an
unadorned NULL to break, when it used to work.  For example, suppose I
have only one function dingbat

dingbat( timestamptz, text, text, float )

I can easily call it with:

SELECT dingbat( '2013-01-01', 'Josh', 'pgsql-hackers', NULL )

But if someone else adds a second function, possibly due to a typo with
the version control system:

dingbat(timestamptz, text, text, text)

... then the above SELECT call will automatically choose the second
function, because NULL defaults to TEXT if unadorned.  Among other
things, that could make a fun exploit if people have been careless with
their SECURITY DEFINER functions.

A worse example is the CIText type.  A couple versions ago, I attempted
to force default case-insensitive comparisons for:

'val'::CITEXT = 'val'::TEXT

... which is what the user would intuitively believe would happen,
instead of the case-sensitive comparison, which is what *does* happen.
After a long weekend of messy bug-hunting and breaking built-in
postgresql functions, I gave up.

The root cause of this is that we treat default TEXT the same as real
TEXT as a type.  Changing that logic, though, would require a massive
refactoring and debugging of PostgreSQL.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] column b is of type X but expression is of type text

2013-07-12 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 The root cause of this is that we treat default TEXT the same as real
 TEXT as a type.

No, we do not do that at all.  A NULL is initially of type unknown, and
that is definitely not the same as text.  The type resolution rules
treat the two cases differently.

The real cause of what David is complaining about is that we resolve
expression datatypes bottom up.  Once we've determined that we're going
to consider foo(NULL) as an invocation of foo(text), that's what it is,
and the context won't cause us to go back and change that.

 Changing that logic, though, would require a massive
 refactoring and debugging of PostgreSQL.

This is true enough; and you forgot to mention all the existing
applications that would also need changes if we changed the expression
resolution rules.  We could possibly make marginal changes without too
much pain, but making function resolution context-dependent would hardly
be a marginal change.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers