Re: [HACKERS] Possible bug in CASE evaluation

2013-06-26 Thread Noah Misch
On Mon, Jun 24, 2013 at 01:41:51PM +, Albe Laurenz wrote:
 Noah Misch wrote:
  If fixing the behaviour is undesirable, at least the documentation
  should be fixed.
  
  A brief documentation mention sounds fine.  Perhaps add a paragraph on
  constant folding in general and reference that from the CASE page.
 
 How about the attached?

Works for me; committed.  Thanks.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-25 Thread Andres Freund
On 2013-06-24 21:35:53 -0400, Noah Misch wrote:
 On Sat, Jun 22, 2013 at 04:54:50PM +0200, Andres Freund wrote:
  On 2013-06-21 16:45:28 +0200, Andres Freund wrote:
   On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
That being said, if we discover a simple-enough fix that performs well, 
we may
as well incorporate it.
  
   What about passing another parameter down eval_const_expressions_mutator
   (which is static, so changing the API isn't a problem) that basically
   tells us whether we actually should evaluate expressions or just perform
   the transformation?
   There's seems to be basically a couple of places where we call dangerous
   things:
   * simplify_function (via -evaluate_function-evaluate_expr) which is
 called in a bunch of places
   * evaluate_expr which is directly called in T_ArrayExpr
 T_ArrayCoerceExpr
  
   All places I've inspected so far need to deal with simplify_function
   returning NULL anyway, so that seems like a viable fix.
 
  *Prototype* patch - that seems simple enough - attached. Opinions?

 Simple enough, yes.  The other point still stands.

You mean performance? Primarily I still think we should first worry
about correctness first and then about performance. And CASE is the
documented (and really only, without writing procedual code) solution to
use for the cases where evaluation order actually *is* important.

But anyway, the question is to find realistic cases to measure the
performance of. Obviously you can just make arbitrarily expensive
expressions that can be computed full during constant folding. Which I
don't find very interesting, do you?

So, what I've done is to measure the performance difference when doing
full table queries of some CASE containing system views.

best of 5 everytime:
SELECT * FROM pg_stats;
master: 28.287 patched: 28.565

SELECT * FROM information_schema.applicable_roles;
master: 0.757 patched: 0.755

regression=# SELECT * FROM information_schema.attributes:
master: 8.392 patched: 8.555

SELECT * FROM information_schema.column_privileges;
master: 90.853 patched: 88.551

SELECT * FROM information_schema.columns;
master: 259.436 patched: 274.145

SELECT * FROM information_schema.constraint_column_usage ;
master: 14.736 patched 15.005

SELECT * FROM information_schema.parameters;
master: 76.173 patched: 79.850

SELECT * FROM information_schema.routines;
master: 45.102 patched: 46.517 ms

...

So, on those queries there's some difference (I've left out the ones
which are too short), but it's not big.

Now, for the other extreme, the following completely random query I just
typed out:
SELECT f FROM (SELECT (CASE g.i WHEN -1 THEN 0 WHEN 1 THEN 3.0/1 WHEN g.i THEN 
2.0/3 END) f FROM generate_series(1, 100) g(i)) s WHERE f = 0;
master: 491.931 patched: 943.629

suffers way much worse because the division is so expensive...

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-25 Thread Pavel Stehule
2013/6/25 Andres Freund and...@2ndquadrant.com:
 On 2013-06-24 21:35:53 -0400, Noah Misch wrote:
 On Sat, Jun 22, 2013 at 04:54:50PM +0200, Andres Freund wrote:
  On 2013-06-21 16:45:28 +0200, Andres Freund wrote:
   On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
That being said, if we discover a simple-enough fix that performs 
well, we may
as well incorporate it.
  
   What about passing another parameter down eval_const_expressions_mutator
   (which is static, so changing the API isn't a problem) that basically
   tells us whether we actually should evaluate expressions or just perform
   the transformation?
   There's seems to be basically a couple of places where we call dangerous
   things:
   * simplify_function (via -evaluate_function-evaluate_expr) which is
 called in a bunch of places
   * evaluate_expr which is directly called in T_ArrayExpr
 T_ArrayCoerceExpr
  
   All places I've inspected so far need to deal with simplify_function
   returning NULL anyway, so that seems like a viable fix.
 
  *Prototype* patch - that seems simple enough - attached. Opinions?

 Simple enough, yes.  The other point still stands.

 You mean performance? Primarily I still think we should first worry
 about correctness first and then about performance. And CASE is the
 documented (and really only, without writing procedual code) solution to
 use for the cases where evaluation order actually *is* important.

 But anyway, the question is to find realistic cases to measure the
 performance of. Obviously you can just make arbitrarily expensive
 expressions that can be computed full during constant folding. Which I
 don't find very interesting, do you?

 So, what I've done is to measure the performance difference when doing
 full table queries of some CASE containing system views.

 best of 5 everytime:
 SELECT * FROM pg_stats;
 master: 28.287 patched: 28.565

 SELECT * FROM information_schema.applicable_roles;
 master: 0.757 patched: 0.755

 regression=# SELECT * FROM information_schema.attributes:
 master: 8.392 patched: 8.555

 SELECT * FROM information_schema.column_privileges;
 master: 90.853 patched: 88.551

 SELECT * FROM information_schema.columns;
 master: 259.436 patched: 274.145

 SELECT * FROM information_schema.constraint_column_usage ;
 master: 14.736 patched 15.005

 SELECT * FROM information_schema.parameters;
 master: 76.173 patched: 79.850

 SELECT * FROM information_schema.routines;
 master: 45.102 patched: 46.517 ms

 ...

 So, on those queries there's some difference (I've left out the ones
 which are too short), but it's not big.

 Now, for the other extreme, the following completely random query I just
 typed out:
 SELECT f FROM (SELECT (CASE g.i WHEN -1 THEN 0 WHEN 1 THEN 3.0/1 WHEN g.i 
 THEN 2.0/3 END) f FROM generate_series(1, 100) g(i)) s WHERE f = 0;
 master: 491.931 patched: 943.629

 suffers way much worse because the division is so expensive...

:-(

it is too high price

Pavel



 Greetings,

 Andres Freund

 --
  Andres Freund http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-25 Thread Noah Misch
On Tue, Jun 25, 2013 at 03:01:52PM +0200, Andres Freund wrote:
 On 2013-06-24 21:35:53 -0400, Noah Misch wrote:
  Simple enough, yes.  The other point still stands.
 
 You mean performance? Primarily I still think we should first worry
 about correctness first and then about performance. And CASE is the
 documented (and really only, without writing procedual code) solution to
 use for the cases where evaluation order actually *is* important.

I largely share that sentiment, but it's tempered here by the incorrect
behavior's long tenure, the difficulty of encountering a problem without
constructing a test case for the purpose of doing so, the availability of
workarounds, and the open-ended negative performance implications of your
proposed correction.

 But anyway, the question is to find realistic cases to measure the
 performance of. Obviously you can just make arbitrarily expensive
 expressions that can be computed full during constant folding. Which I
 don't find very interesting, do you?

 Now, for the other extreme, the following completely random query I just
 typed out:
 SELECT f FROM (SELECT (CASE g.i WHEN -1 THEN 0 WHEN 1 THEN 3.0/1 WHEN g.i 
 THEN 2.0/3 END) f FROM generate_series(1, 100) g(i)) s WHERE f = 0;
 master: 491.931 patched: 943.629
 
 suffers way much worse because the division is so expensive...

That's a clear indicator for this strategy being a dead end.  It's not far
from that to a realistic use case; e.g. log(10,2)/g.i or g.i*(2.0/3).

I'm still interested in your answer to my first question here:
http://www.postgresql.org/message-id/20130621150554.gc740...@tornado.leadboat.com

nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-25 Thread Andres Freund
Hi Noah,

On 2013-06-25 19:05:15 -0400, Noah Misch wrote:
 On Tue, Jun 25, 2013 at 03:01:52PM +0200, Andres Freund wrote:
  On 2013-06-24 21:35:53 -0400, Noah Misch wrote:
   Simple enough, yes.  The other point still stands.
  
  You mean performance? Primarily I still think we should first worry
  about correctness first and then about performance. And CASE is the
  documented (and really only, without writing procedual code) solution to
  use for the cases where evaluation order actually *is* important.
 
 I largely share that sentiment, but it's tempered here by the incorrect
 behavior's long tenure, the difficulty of encountering a problem without
 constructing a test case for the purpose of doing so, the availability of
 workarounds, and the open-ended negative performance implications of your
 proposed correction.

The workaround being that you need to know which version of postgres can
inline/evaluate which parts of a query?

  But anyway, the question is to find realistic cases to measure the
  performance of. Obviously you can just make arbitrarily expensive
  expressions that can be computed full during constant folding. Which I
  don't find very interesting, do you?
 
  Now, for the other extreme, the following completely random query I just
  typed out:
  SELECT f FROM (SELECT (CASE g.i WHEN -1 THEN 0 WHEN 1 THEN 3.0/1 WHEN g.i 
  THEN 2.0/3 END) f FROM generate_series(1, 100) g(i)) s WHERE f = 0;
  master: 491.931 patched: 943.629
  
  suffers way much worse because the division is so expensive...
 
 That's a clear indicator for this strategy being a dead end.  It's not far
 from that to a realistic use case; e.g. log(10,2)/g.i or g.i*(2.0/3).

Sure. But in most realistic scenarios the actual overhead of the query
will be larger - the above example is that expensive because basically
nothing but those computation are happening.


 I'm still interested in your answer to my first question here:
 http://www.postgresql.org/message-id/20130621150554.gc740...@tornado.leadboat.com

I guess you're referring to:

   Would it work to run eval_const_expressions() lazily on THEN clauses?  The
   plan-time eval_const_expressions() would not descend to them.  The first
   ExecEvalCase() to use a given THEN clause would run 
   eval_const_expressions()
   before proceeding.
  
  Ugh. Doesn't sound nice.
 
 Would you elaborate?

Several things:

1) As you reminded me of, other parts of the planner rely on a good part
of the transformations in eval_const_expressions() having already been
performed. So we cannot simply not descend there.

So what we would have to do is to apply something akin to the proposed
and then remember for various nodes (at least the individual CaseWhen's
expr and result, potentially  CaseTestExpr?) whether we already did
the full constant folding.

2) modifying the expression tree during execution seems unclean and a
layering violation and requires doing type lookups and all that during
runtime. That's rather subjective, I agree.

3) lists are cool and I don't remember the third thing I had in mind.

But I guess given the objections on performance the combined approach is
the way to go?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-25 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 But I guess given the objections on performance the combined approach is
 the way to go?

I think the documentation approach is the way to go.

It was pointed out in the pgsql-general thread about this that a naive
user might expect that, say, syntax or datatype violations in a CASE arm
would not get raised if the CASE doesn't select that arm.  We, who know
that all such errors are thrown in the parser, might say that's silly
--- but why?  I think it's not that far from acknowledging that some
errors will be thrown pre-execution to acknowledging that errors
produced by constant-folding can be thrown pre-execution, even if
they're inside a CASE.  Where is the bright line that says we can
complain about 42+'bogus' in that context but not about 1/0?

If there were realistic use-cases for this sort of thing then I'd have
more sympathy for contorting the planner code to support them, but I'm
not seeing that.  The examples shown so far are all pretty artificial,
unless your intent is to throw an error when the CASE fails; and in that
case why not do it with a volatile function containing a RAISE?  Not
only will the planner handle that as you want, but it lets you report
an actually-useful message.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-24 Thread Albe Laurenz
Noah Misch wrote:
 If fixing the behaviour is undesirable, at least the documentation
 should be fixed.
 
 A brief documentation mention sounds fine.  Perhaps add a paragraph on
 constant folding in general and reference that from the CASE page.

How about the attached?

Yours,
Laurenz Albe


case-doc.patch
Description: case-doc.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-24 Thread Noah Misch
On Sat, Jun 22, 2013 at 04:54:50PM +0200, Andres Freund wrote:
 On 2013-06-21 16:45:28 +0200, Andres Freund wrote:
  On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
   That being said, if we discover a simple-enough fix that performs well, 
   we may
   as well incorporate it.
  
  What about passing another parameter down eval_const_expressions_mutator
  (which is static, so changing the API isn't a problem) that basically
  tells us whether we actually should evaluate expressions or just perform
  the transformation?
  There's seems to be basically a couple of places where we call dangerous
  things:
  * simplify_function (via -evaluate_function-evaluate_expr) which is
called in a bunch of places
  * evaluate_expr which is directly called in T_ArrayExpr
T_ArrayCoerceExpr
  
  All places I've inspected so far need to deal with simplify_function
  returning NULL anyway, so that seems like a viable fix.
 
 *Prototype* patch - that seems simple enough - attached. Opinions?

Simple enough, yes.  The other point still stands.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-22 Thread Andres Freund
On 2013-06-21 16:45:28 +0200, Andres Freund wrote:
 On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
  That being said, if we discover a simple-enough fix that performs well, we 
  may
  as well incorporate it.
 
 What about passing another parameter down eval_const_expressions_mutator
 (which is static, so changing the API isn't a problem) that basically
 tells us whether we actually should evaluate expressions or just perform
 the transformation?
 There's seems to be basically a couple of places where we call dangerous
 things:
 * simplify_function (via -evaluate_function-evaluate_expr) which is
   called in a bunch of places
 * evaluate_expr which is directly called in T_ArrayExpr
   T_ArrayCoerceExpr
 
 All places I've inspected so far need to deal with simplify_function
 returning NULL anyway, so that seems like a viable fix.

*Prototype* patch - that seems simple enough - attached. Opinions?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From 1349f8e4a8133eebbf66753b1aa3787f88ee3e33 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Sat, 22 Jun 2013 16:53:39 +0200
Subject: [PATCH] Don't evaluate potentially unreachable parts of CASE during
 planning

---
 src/backend/optimizer/util/clauses.c | 31 +--
 src/test/regress/expected/case.out   | 13 ++---
 src/test/regress/sql/case.sql|  4 ++--
 3 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 6d5b204..94b52f6 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -63,6 +63,7 @@ typedef struct
 	List	   *active_fns;
 	Node	   *case_val;
 	bool		estimate;
+	bool		doevil;
 } eval_const_expressions_context;
 
 typedef struct
@@ -2202,6 +2203,7 @@ eval_const_expressions(PlannerInfo *root, Node *node)
 	context.active_fns = NIL;	/* nothing being recursively simplified */
 	context.case_val = NULL;	/* no CASE being examined */
 	context.estimate = false;	/* safe transformations only */
+	context.doevil = true;		/* allowed to evaluate const expressions */
 	return eval_const_expressions_mutator(node, context);
 }
 
@@ -2233,6 +2235,7 @@ estimate_expression_value(PlannerInfo *root, Node *node)
 	context.active_fns = NIL;	/* nothing being recursively simplified */
 	context.case_val = NULL;	/* no CASE being examined */
 	context.estimate = true;	/* unsafe transformations OK */
+	context.doevil = true;		/* allowed to evaluate const expressions */
 	return eval_const_expressions_mutator(node, context);
 }
 
@@ -2751,7 +2754,7 @@ eval_const_expressions_mutator(Node *node,
  * If constant argument and it's a binary-coercible or
  * immutable conversion, we can simplify it to a constant.
  */
-if (arg  IsA(arg, Const) 
+if (context-doevil  arg  IsA(arg, Const) 
 	(!OidIsValid(newexpr-elemfuncid) ||
 func_volatile(newexpr-elemfuncid) == PROVOLATILE_IMMUTABLE))
 	return (Node *) evaluate_expr((Expr *) newexpr,
@@ -2843,16 +2846,20 @@ eval_const_expressions_mutator(Node *node,
 CaseExpr   *caseexpr = (CaseExpr *) node;
 CaseExpr   *newcase;
 Node	   *save_case_val;
+bool	save_doevil;
 Node	   *newarg;
 List	   *newargs;
 bool		const_true_cond;
 Node	   *defresult = NULL;
 ListCell   *arg;
 
+save_doevil = context-doevil;
+
 /* Simplify the test expression, if any */
 newarg = eval_const_expressions_mutator((Node *) caseexpr-arg,
 		context);
 
+
 /* Set up for contained CaseTestExpr nodes */
 save_case_val = context-case_val;
 if (newarg  IsA(newarg, Const))
@@ -2894,6 +2901,17 @@ eval_const_expressions_mutator(Node *node,
 		const_true_cond = true;
 	}
 
+	/*
+	 * We can only evaluate expressions as long we are sure the
+	 * the expression will be reached at runtime - otherwise we
+	 * might cause spurious errors to be thrown. The first
+	 * eval_const_expression above can always evaluate
+	 * expressions (unless we are in a nested CaseExpr) since
+	 * it will always be reached at runtime.
+	 */
+	if (!const_true_cond)
+		context-doevil = false;
+
 	/* Simplify this alternative's result value */
 	caseresult = eval_const_expressions_mutator((Node *) oldcasewhen-result,
 context);
@@ -2926,6 +2944,8 @@ eval_const_expressions_mutator(Node *node,
 
 context-case_val = save_case_val;
 
+context-doevil = save_doevil;
+
 /*
  * If no non-FALSE alternatives, CASE reduces to the default
  * result
@@ -2982,7 +3002,7 @@ eval_const_expressions_mutator(Node *node,
 newarray-multidims = arrayexpr-multidims;
 newarray-location = arrayexpr-location;
 
-if (all_const)
+if (all_const  context-doevil)
 	return (Node *) evaluate_expr((Expr *) newarray,
   

[HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Albe Laurenz
I want to draw attention to this thread on -general:
camq5dgq4sujpbht2-9xlapasvknul2-bb0cpyci2fp+pfsf...@mail.gmail.com

Would you concur that this is a bug?

The fine manual says about CASE:

  If the condition's result is true, the value of the CASE expression
  is the result that follows the condition, and the remainder of the
  CASE expression is not processed.

But:

test= CREATE FUNCTION zero() RETURNS integer IMMUTABLE LANGUAGE SQL AS 'SELECT 
0';
CREATE FUNCTION
test= SELECT CASE WHEN (SELECT zero()) = 0 THEN -1 ELSE 1/zero() END;
ERROR:  division by zero

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Andres Freund
Hi,


On 2013-06-21 08:16:22 +, Albe Laurenz wrote:
 I want to draw attention to this thread on -general:
 camq5dgq4sujpbht2-9xlapasvknul2-bb0cpyci2fp+pfsf...@mail.gmail.com

There's also a bug reported for it:
#8237: e1uovmc-0007ft...@wrigleys.postgresql.org

 Would you concur that this is a bug?

Yes, I think it's pretty clearly a bug - Tom doesn't seem think so
though. If we can agree it is, the fix outlined over on -bugs seems to
be easily enough implemented...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Albe Laurenz
Andres Freund wrote:
 On 2013-06-21 08:16:22 +, Albe Laurenz wrote:
  I want to draw attention to this thread on -general:
  camq5dgq4sujpbht2-9xlapasvknul2-bb0cpyci2fp+pfsf...@mail.gmail.com
 
 There's also a bug reported for it:
 #8237: e1uovmc-0007ft...@wrigleys.postgresql.org
 
  Would you concur that this is a bug?
 
 Yes, I think it's pretty clearly a bug - Tom doesn't seem think so
 though. If we can agree it is, the fix outlined over on -bugs seems to
 be easily enough implemented...

I think that it is surprising behaviour.

If fixing the behaviour is undesirable, at least the documentation
should be fixed.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Pavel Stehule
2013/6/21 Andres Freund and...@2ndquadrant.com:
 Hi,


 On 2013-06-21 08:16:22 +, Albe Laurenz wrote:
 I want to draw attention to this thread on -general:
 camq5dgq4sujpbht2-9xlapasvknul2-bb0cpyci2fp+pfsf...@mail.gmail.com

 There's also a bug reported for it:
 #8237: e1uovmc-0007ft...@wrigleys.postgresql.org

 Would you concur that this is a bug?

 Yes, I think it's pretty clearly a bug - Tom doesn't seem think so
 though. If we can agree it is, the fix outlined over on -bugs seems to
 be easily enough implemented...

fix is not easy :-( you should to catch any possible exception, so it
means, so you should to do some optimalization under subtransactions -
or you should not do this kind of optimizations.

Pavel


 Greetings,

 Andres Freund

 --
  Andres Freund http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Noah Misch
On Fri, Jun 21, 2013 at 09:20:21AM +, Albe Laurenz wrote:
 Andres Freund wrote:
  Yes, I think it's pretty clearly a bug - Tom doesn't seem think so
  though. If we can agree it is, the fix outlined over on -bugs seems to
  be easily enough implemented...

If you refer to this:

On Tue, Jun 18, 2013 at 03:31:32PM +0200, Andres Freund wrote:
 So it seems we need to stop processing after finding a single WHEN
 that's not const? Does anybody have a better idea?

eval_const_expressions() is not just an optimization: it performs mandatory
transformations such as the conversion of named-argument function calls to
positional function calls.  Even if you could skip it, queries with expensive
constant expressions would notice the performance loss.  The situations helped
by a change like this are too marginal to accept that cost.

Would it work to run eval_const_expressions() lazily on THEN clauses?  The
plan-time eval_const_expressions() would not descend to them.  The first
ExecEvalCase() to use a given THEN clause would run eval_const_expressions()
before proceeding.

 I think that it is surprising behaviour.

That's about how I characterize it, too.

I question whether real applications care.  It's important to have CASE usable
for avoiding data-dependent errors, but what's the use of including in your
query an expression that can do nothing but throw an error?  Does anyone have
a real-world example?  Perhaps some generated-query scenario.

That being said, if we discover a simple-enough fix that performs well, we may
as well incorporate it.

 If fixing the behaviour is undesirable, at least the documentation
 should be fixed.

A brief documentation mention sounds fine.  Perhaps add a paragraph on
constant folding in general and reference that from the CASE page.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Andres Freund
On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
 On Fri, Jun 21, 2013 at 09:20:21AM +, Albe Laurenz wrote:
  Andres Freund wrote:
   Yes, I think it's pretty clearly a bug - Tom doesn't seem think so
   though. If we can agree it is, the fix outlined over on -bugs seems to
   be easily enough implemented...
 
 If you refer to this:
 
 On Tue, Jun 18, 2013 at 03:31:32PM +0200, Andres Freund wrote:
  So it seems we need to stop processing after finding a single WHEN
  that's not const? Does anybody have a better idea?
 
 eval_const_expressions() is not just an optimization: it performs mandatory
 transformations such as the conversion of named-argument function calls to
 positional function calls.

Ah yes. Forgot about that... Scrap that. Although it surely isn't nice
that all that is done in a function calleed eval_const_expressions()...

 Even if you could skip it, queries with expensive
 constant expressions would notice the performance loss.  The situations helped
 by a change like this are too marginal to accept that cost.

I have to say, that argument doesn't excite me mu8ch. It's not like we
don't want to do the constant expression evaluation at all anymore. Just
not inside CASE WHEN blocks which already are barring some optimizations
anyway...

 Would it work to run eval_const_expressions() lazily on THEN clauses?  The
 plan-time eval_const_expressions() would not descend to them.  The first
 ExecEvalCase() to use a given THEN clause would run eval_const_expressions()
 before proceeding.

Ugh. Doesn't sound nice. I don't have any better ideas than to actually
split eval_const_expressions into one function that does the necessary
things like canonicalization of AND/OR and one that actually evaluates
expressions inside though.
So maybe that's the way to go :/

  I think that it is surprising behaviour.

 That's about how I characterize it, too.

 I question whether real applications care.  It's important to have CASE usable
 for avoiding data-dependent errors, but what's the use of including in your
 query an expression that can do nothing but throw an error?  Does anyone have
 a real-world example?  Perhaps some generated-query scenario.

It doesn't need to be an actual constant. Something that evaluates to
the value at plan time is enough:
PREPARE foo AS SELECT CASE WHEN (SELECT $1::int)=0 THEN 0 ELSE 1/$1::int END;
EXECUTE foo(0);

That example will most likely only crashes in 9.2+ because it will
replan it with the acutal parameter values in place. But you could have
the same in earlier versions e.g. using PQExecParams(), but that's
harder to demonstrate.

Now, that example only crashes because one place uses (SELECT $1) and
the other doesn't, but...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Andres Freund
On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
 That being said, if we discover a simple-enough fix that performs well, we may
 as well incorporate it.

What about passing another parameter down eval_const_expressions_mutator
(which is static, so changing the API isn't a problem) that basically
tells us whether we actually should evaluate expressions or just perform
the transformation?
There's seems to be basically a couple of places where we call dangerous
things:
* simplify_function (via -evaluate_function-evaluate_expr) which is
  called in a bunch of places
* evaluate_expr which is directly called in T_ArrayExpr
  T_ArrayCoerceExpr

All places I've inspected so far need to deal with simplify_function
returning NULL anyway, so that seems like a viable fix.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Pavel Stehule
2013/6/21 Andres Freund and...@2ndquadrant.com:
 On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
 On Fri, Jun 21, 2013 at 09:20:21AM +, Albe Laurenz wrote:
  Andres Freund wrote:
   Yes, I think it's pretty clearly a bug - Tom doesn't seem think so
   though. If we can agree it is, the fix outlined over on -bugs seems to
   be easily enough implemented...

 If you refer to this:

 On Tue, Jun 18, 2013 at 03:31:32PM +0200, Andres Freund wrote:
  So it seems we need to stop processing after finding a single WHEN
  that's not const? Does anybody have a better idea?

 eval_const_expressions() is not just an optimization: it performs mandatory
 transformations such as the conversion of named-argument function calls to
 positional function calls.

 Ah yes. Forgot about that... Scrap that. Although it surely isn't nice
 that all that is done in a function calleed eval_const_expressions()...

 Even if you could skip it, queries with expensive
 constant expressions would notice the performance loss.  The situations 
 helped
 by a change like this are too marginal to accept that cost.

yes, I dislike it too - then we can have inconsistent behave of
constant between CASE and other statements.

We should to do without any performance lost, if we do some changes in
this area.

Regards

Pavel


 I have to say, that argument doesn't excite me mu8ch. It's not like we
 don't want to do the constant expression evaluation at all anymore. Just
 not inside CASE WHEN blocks which already are barring some optimizations
 anyway...

 Would it work to run eval_const_expressions() lazily on THEN clauses?  The
 plan-time eval_const_expressions() would not descend to them.  The first
 ExecEvalCase() to use a given THEN clause would run eval_const_expressions()
 before proceeding.

 Ugh. Doesn't sound nice. I don't have any better ideas than to actually
 split eval_const_expressions into one function that does the necessary
 things like canonicalization of AND/OR and one that actually evaluates
 expressions inside though.
 So maybe that's the way to go :/

  I think that it is surprising behaviour.

 That's about how I characterize it, too.

 I question whether real applications care.  It's important to have CASE 
 usable
 for avoiding data-dependent errors, but what's the use of including in your
 query an expression that can do nothing but throw an error?  Does anyone have
 a real-world example?  Perhaps some generated-query scenario.

 It doesn't need to be an actual constant. Something that evaluates to
 the value at plan time is enough:
 PREPARE foo AS SELECT CASE WHEN (SELECT $1::int)=0 THEN 0 ELSE 1/$1::int END;
 EXECUTE foo(0);

 That example will most likely only crashes in 9.2+ because it will
 replan it with the acutal parameter values in place. But you could have
 the same in earlier versions e.g. using PQExecParams(), but that's
 harder to demonstrate.

 Now, that example only crashes because one place uses (SELECT $1) and
 the other doesn't, but...

 Greetings,

 Andres Freund

 --
  Andres Freund http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Noah Misch
On Fri, Jun 21, 2013 at 04:12:32PM +0200, Andres Freund wrote:
 On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
  On Fri, Jun 21, 2013 at 09:20:21AM +, Albe Laurenz wrote:

  Even if you could skip it, queries with expensive
  constant expressions would notice the performance loss.  The situations 
  helped
  by a change like this are too marginal to accept that cost.
 
 I have to say, that argument doesn't excite me mu8ch. It's not like we
 don't want to do the constant expression evaluation at all anymore. Just
 not inside CASE WHEN blocks which already are barring some optimizations
 anyway...

Sure, it's a narrow loss.  Before introducing a new narrow loss to fix an
existing one, we should consider which loss hurts more.  Offhand, I sympathize
with the folks who would lose performance more than with the folks who want to
write the sorts of expressions under consideration.

  Would it work to run eval_const_expressions() lazily on THEN clauses?  The
  plan-time eval_const_expressions() would not descend to them.  The first
  ExecEvalCase() to use a given THEN clause would run eval_const_expressions()
  before proceeding.
 
 Ugh. Doesn't sound nice.

Would you elaborate?

  I question whether real applications care.  It's important to have CASE 
  usable
  for avoiding data-dependent errors, but what's the use of including in your
  query an expression that can do nothing but throw an error?  Does anyone 
  have
  a real-world example?  Perhaps some generated-query scenario.
 
 It doesn't need to be an actual constant. Something that evaluates to
 the value at plan time is enough:
 PREPARE foo AS SELECT CASE WHEN (SELECT $1::int)=0 THEN 0 ELSE 1/$1::int END;
 EXECUTE foo(0);

 Now, that example only crashes because one place uses (SELECT $1) and
 the other doesn't, but...

Not the real-world I was hoping for, but fair enough.

Crash in this context means raise an error, right?

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers