On Thu, Jul 18, 2013 at 1:54 PM, Gurjeet Singh <[email protected]> wrote:
> On Thu, Jul 18, 2013 at 10:19 AM, Tom Lane <[email protected]> wrote:
>
>>
>> > Because simpler code is less likely to have bugs and is easier to
>> > maintain.
>>
>> I agree with that point, but one should also remember Polya's Inventor's
>> Paradox: the more general problem may be easier to solve. That is, if
>> done right, code that fully flattens an AND tree might actually be
>> simpler than code that does just a subset of the transformation. The
>> current patch fails to meet this expectation,
>
>
> The current patch does completely flatten any type of tree
> (left/right-deep or bushy) without recursing, and right-deep and bushy tree
> processing is what Robert is recommending to defer to recursive processing.
> Maybe I haven't considered a case where it doesn't flatten the tree; do you
> have an example in mind.
>
>
>> but maybe you just haven't
>> thought about it the right way.
>>
>
I tried to eliminate the 'pending' list, but I don't see a way around it.
We need temporary storage somewhere to store the branches encountered on
the right; in recursion case the call stack was serving that purpose.
>
>> My concerns about this patch have little to do with that, though, and
>> much more to do with the likelihood that it breaks some other piece of
>> code that is expecting AND/OR to be strictly binary operators, which
>> is what they've always been in parsetrees that haven't reached the
>> planner. It doesn't appear to me that you've done any research on that
>> point whatsoever
>
>
> No, I haven't, and I might not be able to research it for a few more weeks.
>
There are about 30 files (including optimizer and executor) that match
case-insensitive search for BoolExpr, and I scanned those for the usage of
the member 'args'. All the instances where BoolExpr.args is being accessed,
it's being treated as a null-terminated list. There's one exception that I
could find, and it was in context of NOT expression: not_clause() in
clauses.c.
>
>
>> you have not even updated the comment for BoolExpr
>> (in primnodes.h) that this patch falsifies.
>>
>
> I will fix that.
>
I think this line in that comment already covers the fact that in some
"special" cases a BoolExpr may have more than 2 arguments.
"There are also a few special cases where more arguments can appear before
optimization."
I have updated the comment nevertheless, and removed another comment in
parse_expr.c that claimed to be the only place where a BoolExpr with more
than 2 args is generated.
I have isolated the code for right-deep and bushy tree processing via the
macro PROCESS_BUSHY_TREES. Also, I have shortened some variable names while
retaining their meaning.
Please find the updated patch attached (based on master).
Best regards,
--
Gurjeet Singh http://gurjeet.singh.im/
EDB www.EnterpriseDB.com <http://www.enterprisedb.com>
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 81c9338..eb35d70 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -41,8 +41,7 @@ bool Transform_null_equals = false;
static Node *transformExprRecurse(ParseState *pstate, Node *expr);
static Node *transformParamRef(ParseState *pstate, ParamRef *pref);
static Node *transformAExprOp(ParseState *pstate, A_Expr *a);
-static Node *transformAExprAnd(ParseState *pstate, A_Expr *a);
-static Node *transformAExprOr(ParseState *pstate, A_Expr *a);
+static Node *transformAExprAndOr(ParseState *pstate, A_Expr *a);
static Node *transformAExprNot(ParseState *pstate, A_Expr *a);
static Node *transformAExprOpAny(ParseState *pstate, A_Expr *a);
static Node *transformAExprOpAll(ParseState *pstate, A_Expr *a);
@@ -224,10 +223,10 @@ transformExprRecurse(ParseState *pstate, Node *expr)
result = transformAExprOp(pstate, a);
break;
case AEXPR_AND:
- result = transformAExprAnd(pstate, a);
+ result = transformAExprAndOr(pstate, a);
break;
case AEXPR_OR:
- result = transformAExprOr(pstate, a);
+ result = transformAExprAndOr(pstate, a);
break;
case AEXPR_NOT:
result = transformAExprNot(pstate, a);
@@ -918,32 +917,102 @@ transformAExprOp(ParseState *pstate, A_Expr *a)
return result;
}
+/*
+ * Transform the AND/OR trees non-recursively.
+ *
+ * The parser turns a list of consecutive AND expressions into a left-deep tree.
+ *
+ * a AND b AND c
+ *
+ * AND
+ * / \
+ * AND c
+ * / \
+ * a b
+ *
+ * For very long lists, it gets deep enough that processing it recursively causes
+ * check_stack_depth() to raise error and abort the query. Hence, it is necessary
+ * that we process these trees iteratively.
+ */
static Node *
-transformAExprAnd(ParseState *pstate, A_Expr *a)
+transformAExprAndOr(ParseState *pstate, A_Expr *a)
{
- Node *lexpr = transformExprRecurse(pstate, a->lexpr);
- Node *rexpr = transformExprRecurse(pstate, a->rexpr);
+#define PROCESS_BUSHY_TREES 1
+ List *exprs = NIL;
+#if PROCESS_BUSHY_TREES
+ List *pending = NIL;
+#endif
+ Node *expr;
+ A_Expr *root = a;
+ A_Expr_Kind root_kind = a->kind;
+ char *root_name = (root_kind == AEXPR_AND ? "AND" : "OR");
+ BoolExprType root_expr_type = (root_kind == AEXPR_AND ? AND_EXPR : OR_EXPR);
+
+ do {
+ Node *tmp;
- lexpr = coerce_to_boolean(pstate, lexpr, "AND");
- rexpr = coerce_to_boolean(pstate, rexpr, "AND");
+ /*
+ * Follow the left links to walk the left-deep tree, and process all the
+ * right branches. We traverse down the left branch as long as we find
+ * an unbroken chain of node types that match the root node. Then we
+ * stop and process the rest of the tree in normal recursive manner.
+#if PROCESS_BUSHY_TREES
+ *
+ * If a right branch is also the same kind of tree as the root, then
+ * append it to the 'pending' list. The pending list is also processed
+ * in this function call iteratively rather than recursively. This
+ * allows us to process bushy and even right-deep trees, not just
+ * left-deep trees.
+#endif
+ */
+ tmp = (Node*)a;
+ do {
+ a = (A_Expr*) tmp;
- return (Node *) makeBoolExpr(AND_EXPR,
- list_make2(lexpr, rexpr),
- a->location);
-}
+#if PROCESS_BUSHY_TREES
+ if (IsA(a->rexpr, A_Expr) && ((A_Expr*)a->rexpr)->kind == root_kind)
+ {
+ pending = lappend(pending, a->rexpr);
+ }
+ else
+#endif
+ {
+ expr = transformExprRecurse(pstate, a->rexpr);
+ expr = coerce_to_boolean(pstate, expr, root_name);
+ /*
+ * Use lcons instead of lappend to retain the order of
+ * processing of right branches close to the way it was in the
+ * case of recursive processing.
+ */
+ exprs = lcons(expr, exprs);
+ }
-static Node *
-transformAExprOr(ParseState *pstate, A_Expr *a)
-{
- Node *lexpr = transformExprRecurse(pstate, a->lexpr);
- Node *rexpr = transformExprRecurse(pstate, a->rexpr);
+ tmp = a->lexpr;
+ } while (IsA(tmp, A_Expr) && ((A_Expr*)tmp)->kind == root_kind);
- lexpr = coerce_to_boolean(pstate, lexpr, "OR");
- rexpr = coerce_to_boolean(pstate, rexpr, "OR");
+ /* Now process the node in left branch that broke our chain. */
+ expr = transformExprRecurse(pstate, a->lexpr);
+ expr = coerce_to_boolean(pstate, expr, root_name);
+ exprs = lcons(expr, exprs);
- return (Node *) makeBoolExpr(OR_EXPR,
- list_make2(lexpr, rexpr),
- a->location);
+#if PROCESS_BUSHY_TREES
+ /*
+ * Now that we're done processing the edge of the left-deep tree, pop
+ * the first element from the front of the pending list and process any
+ * interesting nodes we found earlier.
+ */
+ if (list_length(pending) > 0)
+ {
+ a = (A_Expr*) linitial(pending);
+ pending = list_delete_first(pending);
+ }
+ else
+#endif
+ a = NULL;
+
+ } while(a != NULL);
+
+ return (Node *) makeBoolExpr(root_expr_type, exprs, root->location);
}
static Node *
@@ -2428,10 +2497,6 @@ make_row_comparison_op(ParseState *pstate, List *opname,
/*
* For = and <> cases, we just combine the pairwise operators with AND or
* OR respectively.
- *
- * Note: this is presently the only place where the parser generates
- * BoolExpr with more than two arguments. Should be OK since the rest of
- * the system thinks BoolExpr is N-argument anyway.
*/
if (rctype == ROWCOMPARE_EQ)
return (Node *) makeBoolExpr(AND_EXPR, opexprs, location);
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 9cce60b..5b5c22b 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -458,12 +458,9 @@ typedef struct ScalarArrayOpExpr
* BoolExpr - expression node for the basic Boolean operators AND, OR, NOT
*
* Notice the arguments are given as a List. For NOT, of course the list
- * must always have exactly one element. For AND and OR, the executor can
- * handle any number of arguments. The parser generally treats AND and OR
- * as binary and so it typically only produces two-element lists, but the
- * optimizer will flatten trees of AND and OR nodes to produce longer lists
- * when possible. There are also a few special cases where more arguments
- * can appear before optimization.
+ * must always have exactly one element. For AND and OR, there are two or
+ * more arguments in the list. The optimizer and the executor are capable
+ * of processing these flat lists.
*/
typedef enum BoolExprType
{
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 6c51d0d..c3188a7 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2117,7 +2117,7 @@ shoe_ready| SELECT rsh.shoename,
int4smaller(rsh.sh_avail, rsl.sl_avail) AS total_avail
FROM shoe rsh,
shoelace rsl
- WHERE (((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
+ WHERE ((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
shoelace| SELECT s.sl_name,
s.sl_avail,
s.sl_color,
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers