[BUGS] BUG #8228: Unexpected set-valued function with varchar(n) but not varchar
The following bug has been logged on the website: Bug reference: 8228 Logged by: David Johnston Email address: david.g.johns...@gmail.com PostgreSQL version: 9.0.13 Operating system: Ubuntu Linux 10.04 Description: The following query results in SQL Error: ERROR: set-valued function called in context that cannot accept a set SELECT *, CASE WHEN id = 2 THEN (regexp_matches(input_string,'^0*([1-9]\d+)$'))[1] ELSE input_string END::varchar(30) AS o_l2_a FROM ( VALUES (1,''), (2,'49404'),(3,'FROM 1000876') ) l0_src (id, input_string) The nearly identical query: SELECT *, CASE WHEN id = 2 THEN (regexp_matches(input_string,'^0*([1-9]\d+)$'))[1] ELSE input_string END::varchar AS o_l2_a FROM ( VALUES (1,''), (2,'49404'),(3,'FROM 1000876') ) l0_src (id, input_string) returns 3 records as expected. The only difference is that the cast at the end of the case construct uses varchar(30) in the failure situation but a plain varchar in the successful situation. version PostgreSQL 9.0.13 on x86_64-unknown-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit Using substring(CASE ... END::varchar,1,30) also results in the set-valued function error message. David J. -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #8228: Unexpected set-valued function with varchar(n) but not varchar
david.g.johns...@gmail.com writes: The following query results in SQL Error: ERROR: set-valued function called in context that cannot accept a set SELECT *, CASE WHEN id = 2 THEN (regexp_matches(input_string,'^0*([1-9]\d+)$'))[1] ELSE input_string END::varchar(30) AS o_l2_a FROM ( VALUES (1,''), (2,'49404'),(3,'FROM 1000876') ) l0_src (id, input_string) Hm, interesting example. What seems to be happening is that during evaluation of the SELECT list for the first VALUES row, the CASE expression doesn't call regexp_matches() but just returns the ELSE expression. The ExecMakeFunctionResult() call for the cast function then decides that the function's argument expression doesn't return a set, so it changes the node execution pointer so that subsequent executions go through the much simpler ExecMakeFunctionResultNoSets() execution function. And then that spits up when on the next row, the argument expression *does* return a set :-( You could work around that using the trick documented in the regexp_matches documentation to force it to return exactly one row, ie interpose a sub-SELECT: regression=# SELECT *, CASE WHEN id = 2 THEN (select (regexp_matches(input_string,'^0*([1-9]\d+)$'))[1]) ELSE input_string END::varchar(30) AS o_l2_a FROM ( VALUES (1,''), (2,'49404'),(3,'FROM 1000876') ) l0_src (id, input_string); id | input_string | o_l2_a +--+-- 1 | | 2 | 49404| 49404 3 | FROM 1000876 | FROM 1000876 (3 rows) Not sure about non-hack fixes. I guess we need to analyze can-it-return-a-set statically instead of believing the first execution result, but that might add an unpleasant amount of startup overhead. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #8228: Unexpected set-valued function with varchar(n) but not varchar
David Johnston david.g.johns...@gmail.com writes: The issue with the regexp_matches call generally is that absence of a g modifier means that the set-returning function will never return a set. It would seem to make more sense to not make that a modifier but instead have one function defined to return a set (i.e., the current definition) and another one defined to return a simply text[]. Well, it does return a set, namely either zero or one row. The point of the sub-SELECT workaround is to transform the zero-row case to a scalar NULL. I tend to agree that this API wasn't that well thought out, but it's really not regexp_matches()'s fault that you're running into this problem --- rather, it's the fact that one arm of the CASE can return a set while the other can't. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #8228: Unexpected set-valued function with varchar(n) but not varchar
On Thu, Jun 13, 2013 at 4:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: david.g.johns...@gmail.com writes: The following query results in SQL Error: ERROR: set-valued function called in context that cannot accept a set SELECT *, CASE WHEN id = 2 THEN (regexp_matches(input_string,'^0*([1-9]\d+)$'))[1] ELSE input_string END::varchar(30) AS o_l2_a FROM ( VALUES (1,''), (2,'49404'),(3,'FROM 1000876') ) l0_src (id, input_string) Hm, interesting example. What seems to be happening is that during evaluation of the SELECT list for the first VALUES row, the CASE expression doesn't call regexp_matches() but just returns the ELSE expression. Does all this explain why it DOES work if the cast on the END is a plain varchar? Not sure about non-hack fixes. I guess we need to analyze can-it-return-a-set statically instead of believing the first execution result, but that might add an unpleasant amount of startup overhead. regards, tom lane The issue with the regexp_matches call generally is that absence of a g modifier means that the set-returning function will never return a set. It would seem to make more sense to not make that a modifier but instead have one function defined to return a set (i.e., the current definition) and another one defined to return a simply text[]. This would make using the call in a scalar context easier. Is there any reason why a UDF defined as such would have a problem? The set-returning one accepting the parameter is nice since you can toggle global/single within the same query - but in many use-cases only the single-match mode is desired. Are there any other functions that have this same risk profile that would increase the applicability of such a patch? David J.