On Thu, Jul 18, 2013 at 1:54 PM, Gurjeet Singh <gurj...@singh.im> wrote:

> On Thu, Jul 18, 2013 at 10:19 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>
>>
>> > Because simpler code is less likely to have bugs and is easier to
>> > maintain.
>>
>> I agree with that point, but one should also remember Polya's Inventor's
>> Paradox: the more general problem may be easier to solve.  That is, if
>> done right, code that fully flattens an AND tree might actually be
>> simpler than code that does just a subset of the transformation.  The
>> current patch fails to meet this expectation,
>
>
> The current patch does completely flatten any type of tree
> (left/right-deep or bushy) without recursing, and right-deep and bushy tree
> processing is what Robert is recommending to defer to recursive processing.
> Maybe I haven't considered a case where it doesn't flatten the tree; do you
> have an example in mind.
>
>
>> but maybe you just haven't
>> thought about it the right way.
>>
>
I tried to eliminate the 'pending' list, but I don't see a way around it.
We need temporary storage somewhere to store the branches encountered on
the right; in recursion case the call stack was serving that purpose.


>
>> My concerns about this patch have little to do with that, though, and
>> much more to do with the likelihood that it breaks some other piece of
>> code that is expecting AND/OR to be strictly binary operators, which
>> is what they've always been in parsetrees that haven't reached the
>> planner.  It doesn't appear to me that you've done any research on that
>> point whatsoever
>
>
> No, I haven't, and I might not be able to research it for a few more weeks.
>

There are about 30 files (including optimizer and executor) that match
case-insensitive search for BoolExpr, and I scanned those for the usage of
the member 'args'. All the instances where BoolExpr.args is being accessed,
it's being treated as a null-terminated list. There's one exception that I
could find, and it was in context of NOT expression: not_clause() in
clauses.c.


>
>
>> you have not even updated the comment for BoolExpr
>> (in primnodes.h) that this patch falsifies.
>>
>
> I will fix that.
>

I think this line in that comment already covers the fact that in some
"special" cases a BoolExpr may have more than 2 arguments.

"There are also a few special cases where more arguments can appear before
optimization."

I have updated the comment nevertheless, and removed another comment in
parse_expr.c that claimed to be the only place where a BoolExpr with more
than 2 args is generated.

I have isolated the code for right-deep and bushy tree processing via the
macro PROCESS_BUSHY_TREES. Also, I have shortened some variable names while
retaining their meaning.

Please find the updated patch attached (based on master).

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com <http://www.enterprisedb.com>
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 81c9338..eb35d70 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -41,8 +41,7 @@ bool		Transform_null_equals = false;
 static Node *transformExprRecurse(ParseState *pstate, Node *expr);
 static Node *transformParamRef(ParseState *pstate, ParamRef *pref);
 static Node *transformAExprOp(ParseState *pstate, A_Expr *a);
-static Node *transformAExprAnd(ParseState *pstate, A_Expr *a);
-static Node *transformAExprOr(ParseState *pstate, A_Expr *a);
+static Node *transformAExprAndOr(ParseState *pstate, A_Expr *a);
 static Node *transformAExprNot(ParseState *pstate, A_Expr *a);
 static Node *transformAExprOpAny(ParseState *pstate, A_Expr *a);
 static Node *transformAExprOpAll(ParseState *pstate, A_Expr *a);
@@ -224,10 +223,10 @@ transformExprRecurse(ParseState *pstate, Node *expr)
 						result = transformAExprOp(pstate, a);
 						break;
 					case AEXPR_AND:
-						result = transformAExprAnd(pstate, a);
+						result = transformAExprAndOr(pstate, a);
 						break;
 					case AEXPR_OR:
-						result = transformAExprOr(pstate, a);
+						result = transformAExprAndOr(pstate, a);
 						break;
 					case AEXPR_NOT:
 						result = transformAExprNot(pstate, a);
@@ -918,32 +917,102 @@ transformAExprOp(ParseState *pstate, A_Expr *a)
 	return result;
 }
 
+/*
+ * Transform the AND/OR trees non-recursively.
+ *
+ * The parser turns a list of consecutive AND expressions into a left-deep tree.
+ *
+ * a AND b AND c
+ *
+ *      AND
+ *     /  \
+ *   AND   c
+ *  /  \
+ * a    b
+ *
+ * For very long lists, it gets deep enough that processing it recursively causes
+ * check_stack_depth() to raise error and abort the query. Hence, it is necessary
+ * that we process these trees iteratively.
+ */
 static Node *
-transformAExprAnd(ParseState *pstate, A_Expr *a)
+transformAExprAndOr(ParseState *pstate, A_Expr *a)
 {
-	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
-	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
+#define PROCESS_BUSHY_TREES 1
+	List		   *exprs = NIL;
+#if PROCESS_BUSHY_TREES
+	List		   *pending = NIL;
+#endif
+	Node		   *expr;
+	A_Expr		   *root = a;
+	A_Expr_Kind		root_kind = a->kind;
+	char		   *root_name = (root_kind == AEXPR_AND ? "AND" : "OR");
+	BoolExprType	root_expr_type = (root_kind == AEXPR_AND ? AND_EXPR : OR_EXPR);
+
+	do {
+		Node *tmp;
 
-	lexpr = coerce_to_boolean(pstate, lexpr, "AND");
-	rexpr = coerce_to_boolean(pstate, rexpr, "AND");
+		/*
+		 * Follow the left links to walk the left-deep tree, and process all the
+		 * right branches. We traverse down the left branch as long as we find
+		 * an unbroken chain of node types that match the root node. Then we
+		 * stop and process the rest of the tree in normal recursive manner.
+#if PROCESS_BUSHY_TREES
+		 *
+		 * If a right branch is also the same kind of tree as the root, then
+		 * append it to the 'pending' list. The pending list is also processed
+		 * in this function call iteratively rather than recursively.  This
+		 * allows us to process bushy and even right-deep trees, not just
+		 * left-deep trees.
+#endif
+		 */
+		tmp = (Node*)a;
+		do {
+			a = (A_Expr*) tmp;
 
-	return (Node *) makeBoolExpr(AND_EXPR,
-								 list_make2(lexpr, rexpr),
-								 a->location);
-}
+#if PROCESS_BUSHY_TREES
+			if (IsA(a->rexpr, A_Expr) && ((A_Expr*)a->rexpr)->kind == root_kind)
+			{
+				pending = lappend(pending, a->rexpr);
+			}
+			else
+#endif
+			{
+				expr = transformExprRecurse(pstate, a->rexpr);
+				expr = coerce_to_boolean(pstate, expr, root_name);
+				/*
+				 * Use lcons instead of lappend to retain the order of
+				 * processing of right branches close to the way it was in the
+				 * case of recursive processing.
+				 */
+				exprs = lcons(expr, exprs);
+			}
 
-static Node *
-transformAExprOr(ParseState *pstate, A_Expr *a)
-{
-	Node	   *lexpr = transformExprRecurse(pstate, a->lexpr);
-	Node	   *rexpr = transformExprRecurse(pstate, a->rexpr);
+			tmp = a->lexpr;
+		} while (IsA(tmp, A_Expr) && ((A_Expr*)tmp)->kind == root_kind);
 
-	lexpr = coerce_to_boolean(pstate, lexpr, "OR");
-	rexpr = coerce_to_boolean(pstate, rexpr, "OR");
+		/* Now process the node in left branch that broke our chain. */
+		expr = transformExprRecurse(pstate, a->lexpr);
+		expr = coerce_to_boolean(pstate, expr, root_name);
+		exprs = lcons(expr, exprs);
 
-	return (Node *) makeBoolExpr(OR_EXPR,
-								 list_make2(lexpr, rexpr),
-								 a->location);
+#if PROCESS_BUSHY_TREES
+		/*
+		 * Now that we're done processing the edge of the left-deep tree, pop
+		 * the first element from the front of the pending list and process any
+		 * interesting nodes we found earlier.
+		 */
+		if (list_length(pending) > 0)
+		{
+			a = (A_Expr*) linitial(pending);
+			pending = list_delete_first(pending);
+		}
+		else
+#endif
+			a = NULL;
+
+	} while(a != NULL);
+
+	return (Node *) makeBoolExpr(root_expr_type, exprs, root->location);
 }
 
 static Node *
@@ -2428,10 +2497,6 @@ make_row_comparison_op(ParseState *pstate, List *opname,
 	/*
 	 * For = and <> cases, we just combine the pairwise operators with AND or
 	 * OR respectively.
-	 *
-	 * Note: this is presently the only place where the parser generates
-	 * BoolExpr with more than two arguments.  Should be OK since the rest of
-	 * the system thinks BoolExpr is N-argument anyway.
 	 */
 	if (rctype == ROWCOMPARE_EQ)
 		return (Node *) makeBoolExpr(AND_EXPR, opexprs, location);
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 9cce60b..5b5c22b 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -458,12 +458,9 @@ typedef struct ScalarArrayOpExpr
  * BoolExpr - expression node for the basic Boolean operators AND, OR, NOT
  *
  * Notice the arguments are given as a List.  For NOT, of course the list
- * must always have exactly one element.  For AND and OR, the executor can
- * handle any number of arguments.	The parser generally treats AND and OR
- * as binary and so it typically only produces two-element lists, but the
- * optimizer will flatten trees of AND and OR nodes to produce longer lists
- * when possible.  There are also a few special cases where more arguments
- * can appear before optimization.
+ * must always have exactly one element.  For AND and OR, there are two or
+ * more arguments in the list. The optimizer and the executor are capable
+ * of processing these flat lists.
  */
 typedef enum BoolExprType
 {
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 6c51d0d..c3188a7 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2117,7 +2117,7 @@ shoe_ready| SELECT rsh.shoename,
     int4smaller(rsh.sh_avail, rsl.sl_avail) AS total_avail
    FROM shoe rsh,
     shoelace rsl
-  WHERE (((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
+  WHERE ((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
 shoelace| SELECT s.sl_name,
     s.sl_avail,
     s.sl_color,
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to