Hi,

>I finished writing the code patch for transformation "Or" expressions to
>"Any" expressions. I didn't see any problems in regression tests, even
>when I changed the constant at which the minimum or expression is
>replaced by any at 0. I ran my patch on sqlancer and so far the code has
>never fallen.
Thanks for working on this.

I took the liberty of making some modifications to the patch.
I didn't compile or test it.
Please feel free to use them.

regards,
Ranier Vilela
From 56fba3befe4f6b041d097d8884815fe943fb21f9 Mon Sep 17 00:00:00 2001
From: Alena Rybakina <a.rybak...@postgrespro.ru>
Date: Mon, 26 Jun 2023 04:18:15 +0300
Subject: [PATCH] Replace clause (X=N1) OR (X=N2) ... with X = ANY(N1, N2) on 
 the stage of the optimiser when we are still working with a tree expression.

Firstly, we do not try to make a transformation for "non-or"
expressions or inequalities and the creation of a relation
with "or" expressions occurs according to the same scenario;
secondly, we do not make transformations if there are less
than 15 or expressions (here you can put another number, but
during testing, already starting with 3 expressions, the execution
time and planning time with transformed or were faster);
thirdly, it is worth considering that we consider "or" expressions
only at the current level.
The transformation takes place according to the following scheme:
first we define the groups on the left side, collect the constants
in a list for each group, then considering each group we make the
collected constants to one type, find a common type for array and
using the make_scalar_array_op function we form ScalarArrayOpExpr,
if possible. If it is not possible, then these constants both remain
in the same expr as before the function call. All successful attempts
(received ScalarArrayOpExpr together with unformulated Expr are combined
via OR operation).
---
 src/backend/parser/parse_expr.c  | 289 ++++++++++++++++++++++++++++++-
 src/tools/pgindent/typedefs.list |   1 +
 2 files changed, 289 insertions(+), 1 deletion(-)

diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 346fd272b6d..c5f58aee9ec 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -95,6 +95,293 @@ static Expr *make_distinct_op(ParseState *pstate, List 
*opname,
 static Node *make_nulltest_from_distinct(ParseState *pstate,
                                                                                
 A_Expr *distincta, Node *arg);
 
+typedef struct OrClauseGroupEntry
+{
+       Node               *node;
+       List               *consts;
+       Oid                             scalar_type;
+       Oid                             opno;
+       Expr               *expr;
+} OrClauseGroupEntry;
+
+static int const_transform_or_limit = 15;
+
+static Node *
+transformBoolExprOr(ParseState *pstate, Expr *expr_orig)
+{
+       List               *groups_list = NIL;
+       ListCell           *lc_eargs;
+       BoolExpr           *expr;
+       const char         *opname;
+       bool                    or_statement;
+
+       Assert(IsA(expr, BoolExpr));
+
+       if (list_length(expr_orig->args) < const_transform_or_limit)
+               return transformBoolExpr(pstate, (BoolExpr *)expr_orig);
+
+       /* If this is not expression "Or", then will do it the old way. */
+       expr = (BoolExpr *)copyObject(expr_orig);
+       switch (expr->boolop)
+       {
+               case AND_EXPR:
+                       opname = "AND";
+                       break;
+               case OR_EXPR:
+                       return transformBoolExpr(pstate, (BoolExpr *)expr_orig);
+                       break;
+               case NOT_EXPR:
+                       opname = "NOT";
+                       break;
+               default:
+                       elog(ERROR, "unrecognized boolop: %d", (int) 
expr->boolop);
+                       opname = NULL;          /* keep compiler quiet */
+                       break;
+       }
+
+       /*
+               * NOTE:
+               * It is an OR-clause. So, rinfo->orclause is a BoolExpr node, 
contains
+               * a list of sub-restrictinfo args, and rinfo->clause - which is 
the
+               * same expression, made from bare clauses. To not break 
selectivity
+               * caches and other optimizations, use both:
+               * - use rinfos from orclause if no transformation needed
+               * - use  bare quals from rinfo->clause in the case of 
transformation,
+               * to create new RestrictInfo: in this case we have no options 
to avoid
+               * selectivity estimation procedure.
+               */
+       or_statement = false;
+       foreach(lc_eargs, expr->args)
+       {
+               A_Expr                     *arg = (A_Expr *) lfirst(lc_eargs);
+               Node                       *bare_orarg;
+               Node                       *const_expr;
+               Node                       *non_const_expr;
+               ListCell                   *lc_groups;
+               OrClauseGroupEntry *gentry;
+               bool                            allow_transformation;
+
+               /*
+                * The first step: checking that the expression consists only 
of equality.
+                * We can only do this here, while arg is still row data type, 
namely A_Expr.
+                * After applying transformExprRecurce, we already know that it 
will be OpExr type,
+                * but checking the expression for equality is already becoming 
impossible for us.
+                * Sometimes we have the chance to devide expression into the 
groups on
+                * equality and inequality. This is possible if all list items 
are not at the
+                * same level of a single BoolExpr expression, otherwise all of 
them cannot be converted.
+                */
+
+               if (!arg)
+                       break;
+
+               allow_transformation = (
+                                       arg->type == T_A_Expr && (arg)->kind == 
AEXPR_OP &&
+                                                           
list_length((arg)->name) >= 1 && strchr(strVal(linitial((arg)->name)), '=') == 
NULL
+                                                          );
+
+
+               bare_orarg = transformExprRecurse(pstate, (Node *)arg);
+               bare_orarg = coerce_to_boolean(pstate, bare_orarg, opname);
+
+               /*
+                * The next step: transform all the inputs, and detect whether 
any contain
+                * Vars.
+                */
+               if (!allow_transformation || !bare_orarg || !IsA(bare_orarg, 
OpExpr) || !contain_vars_of_level(bare_orarg, 0))
+               {
+                       /* Again, it's not the expr we can transform */
+                       or_list = lappend(or_list, bare_orarg);
+                       continue;
+               }
+
+               /*
+                * Get pointers to constant and expression sides of the clause
+                */
+               non_const_expr = get_leftop(bare_orarg);
+               const_expr = get_rightop(bare_orarg);
+
+               /*
+                       * At this point we definitely have a transformable 
clause.
+                       * Classify it and add into specific group of clauses, 
or create new
+                       * group.
+                       * TODO: to manage complexity in the case of many 
different clauses
+                       * (X1=C1) OR (X2=C2 OR) ... (XN = CN) we could invent 
something
+                       * like a hash table (htab key ???).
+                       */
+               foreach(lc_groups, groups_list)
+               {
+                       OrClauseGroupEntry *v = (OrClauseGroupEntry *) 
lfirst(lc_groups);
+
+                       Assert(v->node != NULL);
+
+                       if (equal(v->node, non_const_expr))
+                       {
+                               v->consts = lappend(v->consts, const_expr);
+                               non_const_expr = NULL;
+                               break;
+                       }
+               }
+
+               if (non_const_expr == NULL)
+                       /*
+                               * The clause classified successfully and added 
into existed
+                               * clause group.
+                               */
+                       continue;
+
+               /* New clause group needed */
+               gentry = palloc(sizeof(OrClauseGroupEntry));
+               gentry->node = non_const_expr;
+               gentry->consts = list_make1(const_expr);
+               gentry->opno = ((OpExpr *)bare_orarg)->opno;
+               gentry->expr = (Expr *)bare_orarg;
+               groups_list = lappend(groups_list,  (void *) gentry);
+       }
+
+       if (groups_list == NIL)
+       {
+               /*
+               * No any transformations possible with this rinfo!
+               */
+               return transformBoolExpr(pstate, (BoolExpr *)expr_orig);
+       }
+       else
+       {
+               List               *or_list = NIL;
+               ListCell           *lc_args;
+               Node               *result;
+
+               /* Let's convert each group of clauses to an IN operation. */
+
+               /*
+               * Go through the list of groups and convert each, where number 
of
+               * consts more than 1. trivial groups move to OR-list again
+               */
+
+               foreach(lc_args, groups_list)
+               {
+                       OrClauseGroupEntry *gentry = (OrClauseGroupEntry *) 
lfirst(lc_args);
+                       List                       *allexprs;
+                       Oid                                 scalar_type;
+
+                       Assert(list_length(gentry->consts) > 0);
+
+                       if (list_length(gentry->consts) == 1)
+                       {
+                               /*
+                                * Only one element in the class. Return rinfo 
into the BoolExpr
+                                * args list unchanged.
+                                */
+                               or_list = lappend(or_list, gentry->expr);
+                               continue;
+                       }
+
+                       /*
+                        * Do the transformation. It's been a long way ;)
+                        *
+                        * First of all, try to select a common type for the 
array elements.  Note that
+                        * since the LHS' type is first in the list, it will be 
preferred when
+                        * there is doubt (eg, when all the RHS items are 
unknown literals).
+                        *
+                        * Note: use list_concat here not lcons, to avoid 
damaging rnonvars.
+                        *
+                        * As a source of insides, use make_scalar_array_op()
+                        */
+                       allexprs = list_concat(list_make1(gentry->node), 
gentry->consts);
+                       scalar_type = select_common_type(NULL, allexprs, NULL, 
NULL);
+
+                       if (OidIsValid(scalar_type) &&
+                               scalar_type != RECORDOID &&
+                               verify_common_type(scalar_type, allexprs))
+                       {
+                               /*
+                                * OK: coerce all the right-hand non-Var inputs 
to the common type
+                                * and build an ArrayExpr for them.
+                                */
+                               List       *aexprs;
+                               ArrayExpr  *newa;
+                               ScalarArrayOpExpr *saopexpr;
+                               ListCell *l;
+
+                               aexprs = NIL;
+
+                               foreach(l, gentry->consts)
+                               {
+                                       Node       *rexpr = (Node *) lfirst(l);
+
+                                       rexpr = coerce_to_common_type(pstate, 
rexpr,
+                                                                               
                scalar_type,
+                                                                               
                "IN");
+                                       aexprs = lappend(aexprs, rexpr);
+                               }
+
+                               newa = makeNode(ArrayExpr);
+                               /* array_collid will be set by parse_collate.c 
*/
+                               newa->element_typeid = scalar_type;
+                               newa->array_typeid = 
get_array_type(scalar_type);
+                               newa->multidims = false;
+                               newa->elements = aexprs;
+                               newa->location = -1; /* Position of the new 
clause is undefined */
+
+                               saopexpr = (ScalarArrayOpExpr 
*)make_scalar_array_op(pstate,
+                                                                               
                   list_make1(makeString((char *) "=")),
+                                                                               
                   true,
+                                                                               
                   gentry->node,
+                                                                               
                   (Node *) newa,
+                                                                               
                   -1); /* Position of the new clause is undefined */
+
+                               /*
+                               * TODO: here we can try to coerce the array to 
a Const and find
+                               * hash func instead of linear search (see 
50e17ad281b).
+                               * convert_saop_to_hashed_saop((Node *) 
saopexpr);
+                               * We don't have to do this anymore, do we?
+                               */
+
+                               or_list = lappend(or_list, (void *) saopexpr);
+                               or_statement = true;
+                       }
+                       else
+                       {
+                               or_list = lappend(or_list, gentry->expr);
+                               continue;
+                       }
+               }
+
+               if (or_statement)
+               {
+                       /* One more trick: assemble correct clause */
+                       expr = list_length(or_list) > 1 ? makeBoolExpr(OR_EXPR, 
list_copy(or_list), -1) : linitial(or_list);
+                       result = (Node *)expr;
+               }
+               else
+               {
+                       /*
+                        * There was no reasons to create a new expresion, so
+                       * run the original BoolExpr conversion with using
+                       * transformBoolExpr function
+                       */
+                       result = transformBoolExpr(pstate, (BoolExpr 
*)expr_orig);
+               }
+               list_free_deep(groups_list);
+               list_free(or_list);
+
+               return result;
+       }
+}
 
 /*
  * transformExpr -
@@ -208,7 +495,7 @@ transformExprRecurse(ParseState *pstate, Node *expr)
                        }
 
                case T_BoolExpr:
-                       result = transformBoolExpr(pstate, (BoolExpr *) expr);
+                       result = (Node *)transformBoolExprOr(pstate, (Expr 
*)expr);
                        break;
 
                case T_FuncCall:
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 260854747b4..01918e6aeac 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1630,6 +1630,7 @@ NumericSumAccum
 NumericVar
 OM_uint32
 OP
+OrClauseGroupEntry
 OSAPerGroupState
 OSAPerQueryState
 OSInfo
-- 
2.34.1

Reply via email to