Hi,
Now that I've committed [1] which allows us to use multiple extended
statistics per table, I'd like to start a thread discussing a couple of
additional improvements for extended statistics. I've considered
starting a separate patch for each, but that would be messy as those
changes will touch roughly the same places. So I've organized it into a
single patch series, with the simpler parts at the beginning.
There are three main improvements:
1) improve estimates of OR clauses
Until now, OR clauses pretty much ignored extended statistics, based on
the experience that they're less vulnerable to misestimates. But it's a
bit weird that AND clauses are handled while OR clauses are not, so this
extends the logic to OR clauses.
Status: I think this is fairly OK.
2) support estimating clauses (Var op Var)
Currently, we only support clauses with a single Var, i.e. clauses like
- Var op Const
- Var IS [NOT] NULL
- [NOT] Var
- ...
and AND/OR clauses built from those simple ones. This patch adds support
for clauses of the form (Var op Var), of course assuming both Vars come
from the same relation.
Status: This works, but it feels a bit hackish. Needs more work.
3) support extended statistics on expressions
Currently we only allow simple references to columns in extended stats,
so we can do
CREATE STATISTICS s ON a, b, c FROM t;
but not
CREATE STATISTICS s ON (a+b), (c + 1) FROM t;
This patch aims to allow this. At the moment it's a WIP - it does most
of the catalog changes and stats building, but with some hacks/bugs. And
it does not even try to use those statistics during estimation.
The first question is how to extend the current pg_statistic_ext catalog
to support expressions. I've been planning to do it the way we support
expressions for indexes, i.e. have two catalog fields - one for keys,
one for expressions.
One difference is that for statistics we don't care about order of the
keys, so that we don't need to bother with storing 0 keys in place for
expressions - we can simply assume keys are first, then expressions.
And this is what the patch does now.
I'm however wondering whether to keep this split - why not to just treat
everything as expressions, and be done with it? A key just represents a
Var expression, after all. And it would massively simplify a lot of code
that now has to care about both keys and expressions.
Of course, expressions are a bit more expensive, but I wonder how
noticeable that would be.
Opinions?
ragards
[1] https://commitfest.postgresql.org/26/2320/
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From e8714d7edbfbafd3203623680e290d00ec3f1f8c Mon Sep 17 00:00:00 2001
From: Tomas Vondra <to...@2ndquadrant.com>
Date: Mon, 2 Dec 2019 23:02:17 +0100
Subject: [PATCH 1/3] Support using extended stats for parts of OR clauses
---
src/backend/optimizer/path/clausesel.c | 88 +++++++++++++++----
src/backend/statistics/extended_stats.c | 56 +++++++++---
src/backend/statistics/mcv.c | 5 +-
.../statistics/extended_stats_internal.h | 3 +-
src/include/statistics/statistics.h | 3 +-
5 files changed, 120 insertions(+), 35 deletions(-)
diff --git a/src/backend/optimizer/path/clausesel.c
b/src/backend/optimizer/path/clausesel.c
index a3ebe10592..8ff756bb31 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -92,7 +92,7 @@ clauselist_selectivity(PlannerInfo *root,
*/
s1 *= statext_clauselist_selectivity(root, clauses, varRelid,
jointype, sjinfo, rel,
-
&estimatedclauses);
+
&estimatedclauses, false);
}
/*
@@ -104,6 +104,68 @@ clauselist_selectivity(PlannerInfo *root,
estimatedclauses);
}
+static Selectivity
+clauselist_selectivity_or(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo)
+{
+ ListCell *lc;
+ Selectivity s1 = 0.0;
+ RelOptInfo *rel;
+ Bitmapset *estimatedclauses = NULL;
+ int idx;
+
+ /*
+ * Determine if these clauses reference a single relation. If so, and
if
+ * it has extended statistics, try to apply those.
+ */
+ rel = find_single_rel_for_clauses(root, clauses);
+ if (rel && rel->rtekind == RTE_RELATION && rel->statlist != NIL)
+ {
+ /*
+ * Estimate as many clauses as possible using extended
statistics.
+ *
+ * 'estimatedclauses' tracks the 0-based list position index of
+ * clauses that we've estimated using extended statistics, and
that
+ * should be ignored.
+ *
+ * XXX We can't multiply with current value, because for OR
clauses
+ * we start with 0.0, so we simply assign to s1 directly.
+ */
+ s1 = statext_clauselist_selectivity(root, clauses, varRelid,
+
jointype, sjinfo, rel,
+
&estimatedclauses, true);
+ }
+
+ /*
+ * Selectivities of the remaining clauses for an OR clause are computed
+ * as s1+s2 - s1*s2 to account for the probable overlap of selected
tuple
+ * sets.
+ *
+ * XXX is this too conservative?
+ */
+ idx = 0;
+ foreach(lc, clauses)
+ {
+ Selectivity s2;
+
+ if (bms_is_member(idx, estimatedclauses))
+ continue;
+
+ s2 = clause_selectivity(root,
+ (Node *)
lfirst(lc),
+ varRelid,
+ jointype,
+ sjinfo);
+
+ s1 = s1 + s2 - s1 * s2;
+ }
+
+ return s1;
+}
+
/*
* clauselist_selectivity_simple -
* Compute the selectivity of an implicitly-ANDed list of boolean
@@ -735,24 +797,14 @@ clause_selectivity(PlannerInfo *root,
else if (is_orclause(clause))
{
/*
- * Selectivities for an OR clause are computed as s1+s2 - s1*s2
to
- * account for the probable overlap of selected tuple sets.
- *
- * XXX is this too conservative?
+ * Almost the same thing as clauselist_selectivity, but with
+ * the clauses connected by OR.
*/
- ListCell *arg;
-
- s1 = 0.0;
- foreach(arg, ((BoolExpr *) clause)->args)
- {
- Selectivity s2 = clause_selectivity(root,
-
(Node *) lfirst(arg),
-
varRelid,
-
jointype,
-
sjinfo);
-
- s1 = s1 + s2 - s1 * s2;
- }
+ s1 = clauselist_selectivity_or(root,
+
((BoolExpr *) clause)->args,
+
varRelid,
+
jointype,
+
sjinfo);
}
else if (is_opclause(clause) || IsA(clause, DistinctExpr))
{
diff --git a/src/backend/statistics/extended_stats.c
b/src/backend/statistics/extended_stats.c
index d17b8d9b1f..ccf9565c75 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1202,7 +1202,8 @@ statext_is_compatible_clause(PlannerInfo *root, Node
*clause, Index relid,
static Selectivity
statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int
varRelid,
JoinType
jointype, SpecialJoinInfo *sjinfo,
- RelOptInfo
*rel, Bitmapset **estimatedclauses)
+ RelOptInfo
*rel, Bitmapset **estimatedclauses,
+ bool is_or)
{
ListCell *l;
Bitmapset **list_attnums;
@@ -1289,13 +1290,36 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root,
List *clauses, int varReli
}
/*
- * First compute "simple" selectivity, i.e. without the extended
- * statistics, and essentially assuming independence of the
- * columns/clauses. We'll then use the various selectivities
computed from
- * MCV list to improve it.
+ * First compute "simple" selectivity, i.e. without the
extended stats,
+ * and essentially assuming independence of the
columns/clauses. We'll
+ * then use the selectivities computed from MCV list to improve
it.
*/
- simple_sel = clauselist_selectivity_simple(root, stat_clauses,
varRelid,
-
jointype, sjinfo, NULL);
+ if (is_or)
+ {
+ ListCell *lc;
+ Selectivity s1 = 0.0,
+ s2;
+
+ /*
+ * Selectivities of OR clauses are computed s1+s2 -
s1*s2 to account
+ * for the probable overlap of selected tuple sets.
+ */
+ foreach(lc, stat_clauses)
+ {
+ s2 = clause_selectivity(root,
+
(Node *) lfirst(lc),
+
varRelid,
+
jointype,
+
sjinfo);
+
+ s1 = s1 + s2 - s1 * s2;
+ }
+
+ simple_sel = s1;
+ }
+ else
+ simple_sel = clauselist_selectivity_simple(root,
stat_clauses, varRelid,
+
jointype, sjinfo, NULL);
/*
* Now compute the multi-column estimate from the MCV list,
along with the
@@ -1303,7 +1327,8 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root,
List *clauses, int varReli
*/
mcv_sel = mcv_clauselist_selectivity(root, stat, stat_clauses,
varRelid,
jointype, sjinfo, rel,
-
&mcv_basesel, &mcv_totalsel);
+
&mcv_basesel, &mcv_totalsel,
+
is_or);
/* Estimated selectivity of values not covered by MCV matches */
other_sel = simple_sel - mcv_basesel;
@@ -1331,13 +1356,14 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root,
List *clauses, int varReli
Selectivity
statext_clauselist_selectivity(PlannerInfo *root, List *clauses, int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo,
- RelOptInfo *rel,
Bitmapset **estimatedclauses)
+ RelOptInfo *rel,
Bitmapset **estimatedclauses,
+ bool is_or)
{
Selectivity sel;
/* First, try estimating clauses using a multivariate MCV list. */
sel = statext_mcv_clauselist_selectivity(root, clauses, varRelid,
jointype,
-
sjinfo, rel, estimatedclauses);
+
sjinfo, rel, estimatedclauses, is_or);
/*
* Then, apply functional dependencies on the remaining clauses by
calling
@@ -1351,10 +1377,14 @@ statext_clauselist_selectivity(PlannerInfo *root, List
*clauses, int varRelid,
* For example, MCV list can give us an exact selectivity for values in
* two columns, while functional dependencies can only provide
information
* about the overall strength of the dependency.
+ *
+ * Functional dependencies only work for clauses connected by AND, so
skip
+ * this for OR clauses.
*/
- sel *= dependencies_clauselist_selectivity(root, clauses, varRelid,
-
jointype, sjinfo, rel,
-
estimatedclauses);
+ if (!is_or)
+ sel *= dependencies_clauselist_selectivity(root, clauses,
varRelid,
+
jointype, sjinfo, rel,
+
estimatedclauses);
return sel;
}
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 87e232fdd4..3f42713aa2 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -1795,7 +1795,8 @@ mcv_clauselist_selectivity(PlannerInfo *root,
StatisticExtInfo *stat,
List *clauses, int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo,
RelOptInfo *rel,
- Selectivity *basesel,
Selectivity *totalsel)
+ Selectivity *basesel,
Selectivity *totalsel,
+ bool is_or)
{
int i;
MCVList *mcv;
@@ -1808,7 +1809,7 @@ mcv_clauselist_selectivity(PlannerInfo *root,
StatisticExtInfo *stat,
mcv = statext_mcv_load(stat->statOid);
/* build a match bitmap for the clauses */
- matches = mcv_get_match_bitmap(root, clauses, stat->keys, mcv, false);
+ matches = mcv_get_match_bitmap(root, clauses, stat->keys, mcv, is_or);
/* sum frequencies for all the matching MCV items */
*basesel = 0.0;
diff --git a/src/include/statistics/extended_stats_internal.h
b/src/include/statistics/extended_stats_internal.h
index b512ee908a..5171895bba 100644
--- a/src/include/statistics/extended_stats_internal.h
+++ b/src/include/statistics/extended_stats_internal.h
@@ -107,6 +107,7 @@ extern Selectivity mcv_clauselist_selectivity(PlannerInfo
*root,
SpecialJoinInfo *sjinfo,
RelOptInfo *rel,
Selectivity *basesel,
-
Selectivity *totalsel);
+
Selectivity *totalsel,
+
bool is_or);
#endif /*
EXTENDED_STATS_INTERNAL_H */
diff --git a/src/include/statistics/statistics.h
b/src/include/statistics/statistics.h
index f5d9b6c73a..e18c9a6539 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -116,7 +116,8 @@ extern Selectivity
statext_clauselist_selectivity(PlannerInfo *root,
JoinType jointype,
SpecialJoinInfo *sjinfo,
RelOptInfo *rel,
-
Bitmapset **estimatedclauses);
+
Bitmapset **estimatedclauses,
+
bool is_or);
extern bool has_stats_of_kind(List *stats, char requiredkind);
extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind,
Bitmapset **clause_attnums,
--
2.21.0
>From 4f6d8f7e1cd16ec2c0c022479524497f271f821a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <t...@fuzzy.cz>
Date: Mon, 11 Nov 2019 01:34:11 +0100
Subject: [PATCH 2/3] Support clauses of the form Var op Var
---
src/backend/statistics/extended_stats.c | 62 ++++++++++++++++++-
src/backend/statistics/mcv.c | 61 ++++++++++++++++++
.../statistics/extended_stats_internal.h | 2 +
3 files changed, 122 insertions(+), 3 deletions(-)
diff --git a/src/backend/statistics/extended_stats.c
b/src/backend/statistics/extended_stats.c
index ccf9565c75..d9e854228c 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -964,13 +964,15 @@ statext_is_compatible_clause_internal(PlannerInfo *root,
Node *clause,
RangeTblEntry *rte = root->simple_rte_array[relid];
OpExpr *expr = (OpExpr *) clause;
Var *var;
+ Var *var2 = NULL;
/* Only expressions with two arguments are considered
compatible. */
if (list_length(expr->args) != 2)
return false;
/* Check if the expression the right shape (one Var, one Const)
*/
- if (!examine_opclause_expression(expr, &var, NULL, NULL))
+ if ((!examine_opclause_expression(expr, &var, NULL, NULL)) &&
+ (!examine_opclause_expression2(expr, &var, &var2)))
return false;
/*
@@ -1010,8 +1012,16 @@ statext_is_compatible_clause_internal(PlannerInfo *root,
Node *clause,
!get_func_leakproof(get_opcode(expr->opno)))
return false;
- return statext_is_compatible_clause_internal(root, (Node *) var,
-
relid, attnums);
+ if (var2)
+ {
+ return statext_is_compatible_clause_internal(root,
(Node *) var,
+
relid, attnums) &&
+ statext_is_compatible_clause_internal(root,
(Node *) var2,
+
relid, attnums);
+ }
+ else
+ return statext_is_compatible_clause_internal(root,
(Node *) var,
+
relid, attnums);
}
/* AND/OR/NOT clause */
@@ -1450,3 +1460,49 @@ examine_opclause_expression(OpExpr *expr, Var **varp,
Const **cstp, bool *varonl
return true;
}
+
+bool
+examine_opclause_expression2(OpExpr *expr, Var **varap, Var **varbp)
+{
+ Var *vara;
+ Var *varb;
+ Node *leftop,
+ *rightop;
+
+ /* enforced by statext_is_compatible_clause_internal */
+ Assert(list_length(expr->args) == 2);
+
+ leftop = linitial(expr->args);
+ rightop = lsecond(expr->args);
+
+ /* strip RelabelType from either side of the expression */
+ if (IsA(leftop, RelabelType))
+ leftop = (Node *) ((RelabelType *) leftop)->arg;
+
+ if (IsA(rightop, RelabelType))
+ rightop = (Node *) ((RelabelType *) rightop)->arg;
+
+ if (IsA(leftop, Var) && IsA(rightop, Var))
+ {
+ vara = (Var *) leftop;
+ varb = (Var *) rightop;
+ }
+ else
+ return false;
+
+ /*
+ * Both variables have to be for the same relation (otherwise it's a
+ * join clause, and we don't deal with those yet.
+ */
+ if (vara->varno != varb->varno)
+ return false;
+
+ /* return pointers to the extracted parts if requested */
+ if (varap)
+ *varap = vara;
+
+ if (varbp)
+ *varbp = varb;
+
+ return true;
+}
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 3f42713aa2..4b51af287e 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -1581,6 +1581,7 @@ mcv_get_match_bitmap(PlannerInfo *root, List *clauses,
/* valid only after examine_opclause_expression returns
true */
Var *var;
+ Var *var2;
Const *cst;
bool varonleft;
@@ -1651,6 +1652,66 @@ mcv_get_match_bitmap(PlannerInfo *root, List *clauses,
matches[i] = RESULT_MERGE(matches[i],
is_or, match);
}
}
+ else if (examine_opclause_expression2(expr, &var,
&var2))
+ {
+ int idx;
+ int idx2;
+
+ /* match the attribute to a dimension of the
statistic */
+ idx = bms_member_index(keys, var->varattno);
+ idx2 = bms_member_index(keys, var2->varattno);
+
+ /*
+ * Walk through the MCV items and evaluate the
current clause.
+ * We can skip items that were already ruled
out, and
+ * terminate if there are no remaining MCV
items that might
+ * possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool match = true;
+ MCVItem *item = &mcvlist->items[i];
+
+ /*
+ * When either of the MCV items is NULL
we can treat this
+ * as a mismatch. We must not call the
operator because
+ * of strictness.
+ */
+ if (item->isnull[idx] ||
item->isnull[idx2])
+ {
+ matches[i] =
RESULT_MERGE(matches[i], is_or, false);
+ continue;
+ }
+
+ /*
+ * Skip MCV items that can't change
result in the bitmap.
+ * Once the value gets false for
AND-lists, or true for
+ * OR-lists, we don't need to look at
more clauses.
+ */
+ if (RESULT_IS_FINAL(matches[i], is_or))
+ continue;
+
+ /*
+ * First check whether the constant is
below the lower
+ * boundary (in that case we can skip
the bucket, because
+ * there's no overlap).
+ *
+ * We don't store collations used to
build the statistics,
+ * but we can use the collation for the
attribute itself,
+ * as stored in varcollid. We do reset
the statistics after
+ * a type change (including collation
change), so this is
+ * OK. We may need to relax this after
allowing extended
+ * statistics on expressions.
+ */
+ match =
DatumGetBool(FunctionCall2Coll(&opproc,
+
var->varcollid,
+
item->values[idx],
+
item->values[idx2]));
+
+ /* update the match bitmap with the
result */
+ matches[i] = RESULT_MERGE(matches[i],
is_or, match);
+ }
+ }
}
else if (IsA(clause, NullTest))
{
diff --git a/src/include/statistics/extended_stats_internal.h
b/src/include/statistics/extended_stats_internal.h
index 5171895bba..23217497bb 100644
--- a/src/include/statistics/extended_stats_internal.h
+++ b/src/include/statistics/extended_stats_internal.h
@@ -98,6 +98,8 @@ extern SortItem *build_sorted_items(int numrows, int *nitems,
HeapTuple *rows,
extern bool examine_opclause_expression(OpExpr *expr, Var **varp,
Const **cstp, bool *varonleftp);
+extern bool examine_opclause_expression2(OpExpr *expr,
+
Var **varap, Var **varbp);
extern Selectivity mcv_clauselist_selectivity(PlannerInfo *root,
StatisticExtInfo *stat,
--
2.21.0
>From 8e5f74c7c55b4e602ef5460a9fbd4cdf26e52f77 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <t...@fuzzy.cz>
Date: Mon, 11 Nov 2019 14:01:21 +0100
Subject: [PATCH 3/3] Support for extended statistics on expressions
---
src/backend/commands/statscmds.c | 190 +++++--
src/backend/nodes/copyfuncs.c | 14 +
src/backend/nodes/equalfuncs.c | 13 +
src/backend/nodes/outfuncs.c | 12 +
src/backend/optimizer/util/plancat.c | 40 ++
src/backend/parser/gram.y | 31 +-
src/backend/parser/parse_agg.c | 10 +
src/backend/parser/parse_expr.c | 6 +
src/backend/parser/parse_func.c | 3 +
src/backend/parser/parse_utilcmd.c | 89 ++-
src/backend/statistics/dependencies.c | 159 +++++-
src/backend/statistics/extended_stats.c | 532 +++++++++++++++++-
src/backend/statistics/mcv.c | 17 +-
src/backend/statistics/mvdistinct.c | 51 +-
src/backend/tcop/utility.c | 16 +-
src/backend/utils/adt/ruleutils.c | 59 ++
src/backend/utils/adt/selfuncs.c | 11 +
src/bin/psql/describe.c | 1 +
src/include/catalog/pg_statistic_ext.h | 3 +
src/include/nodes/nodes.h | 1 +
src/include/nodes/parsenodes.h | 16 +
src/include/nodes/pathnodes.h | 1 +
src/include/parser/parse_node.h | 1 +
src/include/parser/parse_utilcmd.h | 2 +
.../statistics/extended_stats_internal.h | 13 +-
25 files changed, 1191 insertions(+), 100 deletions(-)
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index fb608cf5cd..a8415463af 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -29,6 +29,8 @@
#include "commands/comment.h"
#include "commands/defrem.h"
#include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/optimizer.h"
#include "statistics/statistics.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@@ -42,6 +44,7 @@
static char *ChooseExtendedStatisticName(const char *name1, const char *name2,
const char *label, Oid namespaceid);
static char *ChooseExtendedStatisticNameAddition(List *exprs);
+static bool CheckMutability(Expr *expr);
/* qsort comparator for the attnums in CreateStatistics */
@@ -62,6 +65,7 @@ ObjectAddress
CreateStatistics(CreateStatsStmt *stmt)
{
int16 attnums[STATS_MAX_DIMENSIONS];
+ int nattnums = 0;
int numcols = 0;
char *namestr;
NameData stxname;
@@ -74,6 +78,8 @@ CreateStatistics(CreateStatsStmt *stmt)
Datum datavalues[Natts_pg_statistic_ext_data];
bool datanulls[Natts_pg_statistic_ext_data];
int2vector *stxkeys;
+ List *stxexprs = NIL;
+ Datum exprsDatum;
Relation statrel;
Relation datarel;
Relation rel = NULL;
@@ -192,56 +198,95 @@ CreateStatistics(CreateStatsStmt *stmt)
foreach(cell, stmt->exprs)
{
Node *expr = (Node *) lfirst(cell);
- ColumnRef *cref;
- char *attname;
+ StatsElem *selem;
HeapTuple atttuple;
Form_pg_attribute attForm;
TypeCacheEntry *type;
- if (!IsA(expr, ColumnRef))
+ if (!IsA(expr, StatsElem))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("only simple column references
are allowed in CREATE STATISTICS")));
- cref = (ColumnRef *) expr;
+ selem = (StatsElem *) expr;
- if (list_length(cref->fields) != 1)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("only simple column references
are allowed in CREATE STATISTICS")));
- attname = strVal((Value *) linitial(cref->fields));
-
- atttuple = SearchSysCacheAttName(relid, attname);
- if (!HeapTupleIsValid(atttuple))
- ereport(ERROR,
- (errcode(ERRCODE_UNDEFINED_COLUMN),
- errmsg("column \"%s\" does not exist",
- attname)));
- attForm = (Form_pg_attribute) GETSTRUCT(atttuple);
-
- /* Disallow use of system attributes in extended stats */
- if (attForm->attnum <= 0)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("statistics creation on system
columns is not supported")));
-
- /* Disallow data types without a less-than operator */
- type = lookup_type_cache(attForm->atttypid, TYPECACHE_LT_OPR);
- if (type->lt_opr == InvalidOid)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("column \"%s\" cannot be used
in statistics because its type %s has no default btree operator class",
- attname,
format_type_be(attForm->atttypid))));
-
- /* Make sure no more than STATS_MAX_DIMENSIONS columns are used
*/
- if (numcols >= STATS_MAX_DIMENSIONS)
- ereport(ERROR,
- (errcode(ERRCODE_TOO_MANY_COLUMNS),
- errmsg("cannot have more than %d
columns in statistics",
- STATS_MAX_DIMENSIONS)));
-
- attnums[numcols] = attForm->attnum;
- numcols++;
- ReleaseSysCache(atttuple);
+ if (selem->name) /* column reference */
+ {
+ char *attname;
+ attname = selem->name;
+
+ atttuple = SearchSysCacheAttName(relid, attname);
+ if (!HeapTupleIsValid(atttuple))
+ ereport(ERROR,
+
(errcode(ERRCODE_UNDEFINED_COLUMN),
+ errmsg("column \"%s\" does not
exist",
+ attname)));
+ attForm = (Form_pg_attribute) GETSTRUCT(atttuple);
+
+ /* Disallow use of system attributes in extended stats
*/
+ if (attForm->attnum <= 0)
+ ereport(ERROR,
+
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("statistics creation on
system columns is not supported")));
+
+ /* Disallow data types without a less-than operator */
+ type = lookup_type_cache(attForm->atttypid,
TYPECACHE_LT_OPR);
+ if (type->lt_opr == InvalidOid)
+ ereport(ERROR,
+
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("column \"%s\" cannot
be used in statistics because its type %s has no default btree operator class",
+ attname,
format_type_be(attForm->atttypid))));
+
+ /* Make sure no more than STATS_MAX_DIMENSIONS columns
are used */
+ if (numcols >= STATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+
(errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than
%d columns in statistics",
+
STATS_MAX_DIMENSIONS)));
+
+ attnums[nattnums] = attForm->attnum;
+ nattnums++;
+ numcols++;
+ ReleaseSysCache(atttuple);
+ }
+ else /* expression */
+ {
+ Node *expr = selem->expr;
+ TypeCacheEntry *type;
+ Oid atttype;
+
+ Assert(expr != NULL);
+
+ /*
+ * An expression using mutable functions is probably
wrong,
+ * since if you aren't going to get the same result for
the
+ * same data every time, it's not clear what the index
entries
+ * mean at all.
+ */
+ if (CheckMutability((Expr *) expr))
+ ereport(ERROR,
+
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("functions in
statistics expression must be marked IMMUTABLE")));
+
+ /* Disallow data types without a less-than operator */
+ atttype = exprType(expr);
+ type = lookup_type_cache(atttype, TYPECACHE_LT_OPR);
+ if (type->lt_opr == InvalidOid)
+ ereport(ERROR,
+
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("expression cannot be
used in statistics because its type %s has no default btree operator class",
+
format_type_be(atttype))));
+
+ /* Make sure no more than STATS_MAX_DIMENSIONS columns
are used */
+ if (numcols >= STATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+
(errcode(ERRCODE_TOO_MANY_COLUMNS),
+ errmsg("cannot have more than
%d columns in statistics",
+
STATS_MAX_DIMENSIONS)));
+
+ numcols++;
+
+ stxexprs = lappend(stxexprs, expr);
+ }
}
/*
@@ -258,13 +303,13 @@ CreateStatistics(CreateStatsStmt *stmt)
* it does not hurt (it does not affect the efficiency, unlike for
* indexes, for example).
*/
- qsort(attnums, numcols, sizeof(int16), compare_int16);
+ qsort(attnums, nattnums, sizeof(int16), compare_int16);
/*
* Check for duplicates in the list of columns. The attnums are sorted
so
* just check consecutive elements.
*/
- for (i = 1; i < numcols; i++)
+ for (i = 1; i < nattnums; i++)
{
if (attnums[i] == attnums[i - 1])
ereport(ERROR,
@@ -273,7 +318,7 @@ CreateStatistics(CreateStatsStmt *stmt)
}
/* Form an int2vector representation of the sorted column list */
- stxkeys = buildint2vector(attnums, numcols);
+ stxkeys = buildint2vector(attnums, nattnums);
/*
* Parse the statistics kinds.
@@ -325,6 +370,18 @@ CreateStatistics(CreateStatsStmt *stmt)
Assert(ntypes > 0 && ntypes <= lengthof(types));
stxkind = construct_array(types, ntypes, CHAROID, 1, true, 'c');
+ /* convert the expressions (if any) to a text datum */
+ if (stxexprs != NIL)
+ {
+ char *exprsString;
+
+ exprsString = nodeToString(stxexprs);
+ exprsDatum = CStringGetTextDatum(exprsString);
+ pfree(exprsString);
+ }
+ else
+ exprsDatum = (Datum) 0;
+
statrel = table_open(StatisticExtRelationId, RowExclusiveLock);
/*
@@ -344,6 +401,15 @@ CreateStatistics(CreateStatsStmt *stmt)
values[Anum_pg_statistic_ext_stxkeys - 1] = PointerGetDatum(stxkeys);
values[Anum_pg_statistic_ext_stxkind - 1] = PointerGetDatum(stxkind);
+ values[Anum_pg_statistic_ext_stxexprs - 1] = exprsDatum;
+ if (exprsDatum == (Datum) 0)
+ nulls[Anum_pg_statistic_ext_stxexprs - 1] = true;
+
+ /*
+ * FIXME add dependencies on anything mentioned in the expressions,
+ * see recordDependencyOnSingleRelExpr in index_create
+ */
+
/* insert it into pg_statistic_ext */
htup = heap_form_tuple(statrel->rd_att, values, nulls);
CatalogTupleInsert(statrel, htup);
@@ -387,7 +453,7 @@ CreateStatistics(CreateStatsStmt *stmt)
*/
ObjectAddressSet(myself, StatisticExtRelationId, statoid);
- for (i = 0; i < numcols; i++)
+ for (i = 0; i < nattnums; i++)
{
ObjectAddressSubSet(parentobject, RelationRelationId, relid,
attnums[i]);
recordDependencyOn(&myself, &parentobject, DEPENDENCY_AUTO);
@@ -722,14 +788,14 @@ ChooseExtendedStatisticNameAddition(List *exprs)
buf[0] = '\0';
foreach(lc, exprs)
{
- ColumnRef *cref = (ColumnRef *) lfirst(lc);
+ StatsElem *selem = (StatsElem *) lfirst(lc);
const char *name;
/* It should be one of these, but just skip if it happens not
to be */
- if (!IsA(cref, ColumnRef))
+ if (!IsA(selem, StatsElem))
continue;
- name = strVal((Value *) linitial(cref->fields));
+ name = selem->name;
if (buflen > 0)
buf[buflen++] = '_'; /* insert _ between names */
@@ -745,3 +811,29 @@ ChooseExtendedStatisticNameAddition(List *exprs)
}
return pstrdup(buf);
}
+
+/*
+ * CheckMutability
+ * Test whether given expression is mutable
+ */
+static bool
+CheckMutability(Expr *expr)
+{
+ /*
+ * First run the expression through the planner. This has a couple of
+ * important consequences. First, function default arguments will get
+ * inserted, which may affect volatility (consider "default now()").
+ * Second, inline-able functions will get inlined, which may allow us to
+ * conclude that the function is really less volatile than it's marked.
As
+ * an example, polymorphic functions must be marked with the most
volatile
+ * behavior that they have for any input type, but once we inline the
+ * function we may be able to conclude that it's not so volatile for the
+ * particular input type we're dealing with.
+ *
+ * We assume here that expression_planner() won't scribble on its input.
+ */
+ expr = expression_planner(expr);
+
+ /* Now we can search for non-immutable functions */
+ return contain_mutable_functions((Node *) expr);
+}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 54ad62bb7f..477f670862 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2882,6 +2882,17 @@ _copyIndexElem(const IndexElem *from)
return newnode;
}
+static StatsElem *
+_copyStatsElem(const StatsElem *from)
+{
+ StatsElem *newnode = makeNode(StatsElem);
+
+ COPY_STRING_FIELD(name);
+ COPY_NODE_FIELD(expr);
+
+ return newnode;
+}
+
static ColumnDef *
_copyColumnDef(const ColumnDef *from)
{
@@ -5565,6 +5576,9 @@ copyObjectImpl(const void *from)
case T_IndexElem:
retval = _copyIndexElem(from);
break;
+ case T_StatsElem:
+ retval = _copyStatsElem(from);
+ break;
case T_ColumnDef:
retval = _copyColumnDef(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 5b1ba143b1..956420cce9 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2569,6 +2569,16 @@ _equalIndexElem(const IndexElem *a, const IndexElem *b)
return true;
}
+
+static bool
+_equalStatsElem(const StatsElem *a, const StatsElem *b)
+{
+ COMPARE_STRING_FIELD(name);
+ COMPARE_NODE_FIELD(expr);
+
+ return true;
+}
+
static bool
_equalColumnDef(const ColumnDef *a, const ColumnDef *b)
{
@@ -3662,6 +3672,9 @@ equal(const void *a, const void *b)
case T_IndexElem:
retval = _equalIndexElem(a, b);
break;
+ case T_StatsElem:
+ retval = _equalStatsElem(a, b);
+ break;
case T_ColumnDef:
retval = _equalColumnDef(a, b);
break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index d76fae44b8..a333e95692 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2870,6 +2870,15 @@ _outIndexElem(StringInfo str, const IndexElem *node)
WRITE_ENUM_FIELD(nulls_ordering, SortByNulls);
}
+static void
+_outStatsElem(StringInfo str, const StatsElem *node)
+{
+ WRITE_NODE_TYPE("STATSELEM");
+
+ WRITE_STRING_FIELD(name);
+ WRITE_NODE_FIELD(expr);
+}
+
static void
_outQuery(StringInfo str, const Query *node)
{
@@ -4176,6 +4185,9 @@ outNode(StringInfo str, const void *obj)
case T_IndexElem:
_outIndexElem(str, obj);
break;
+ case T_StatsElem:
+ _outStatsElem(str, obj);
+ break;
case T_Query:
_outQuery(str, obj);
break;
diff --git a/src/backend/optimizer/util/plancat.c
b/src/backend/optimizer/util/plancat.c
index d82fc5ab8b..01130c5779 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -34,6 +34,7 @@
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
#include "nodes/supportnodes.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
@@ -1304,6 +1305,7 @@ get_relation_statistics(RelOptInfo *rel, Relation
relation)
HeapTuple dtup;
Bitmapset *keys = NULL;
int i;
+ List *exprs = NIL;
htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
if (!HeapTupleIsValid(htup))
@@ -1322,6 +1324,41 @@ get_relation_statistics(RelOptInfo *rel, Relation
relation)
for (i = 0; i < staForm->stxkeys.dim1; i++)
keys = bms_add_member(keys, staForm->stxkeys.values[i]);
+ /*
+ * preprocess expression (if any)
+ *
+ * FIXME we probably need to cache the result somewhere
+ */
+ {
+ bool isnull;
+ Datum datum;
+
+ /* decode expression (if any) */
+ datum = SysCacheGetAttr(STATEXTOID, htup,
+
Anum_pg_statistic_ext_stxexprs, &isnull);
+
+ if (!isnull)
+ {
+ char *exprsString;
+
+ exprsString = TextDatumGetCString(datum);
+ exprs = (List *) stringToNode(exprsString);
+ pfree(exprsString);
+
+ /*
+ * Run the expressions through
eval_const_expressions. This is not just an
+ * optimization, but is necessary, because the
planner will be comparing
+ * them to similarly-processed qual clauses,
and may fail to detect valid
+ * matches without this. We must not use
canonicalize_qual, however,
+ * since these aren't qual expressions.
+ */
+ exprs = (List *) eval_const_expressions(NULL,
(Node *) exprs);
+
+ /* May as well fix opfuncids too */
+ fix_opfuncids((Node *) exprs);
+ }
+ }
+
/* add one StatisticExtInfo for each kind built */
if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT))
{
@@ -1331,6 +1368,7 @@ get_relation_statistics(RelOptInfo *rel, Relation
relation)
info->rel = rel;
info->kind = STATS_EXT_NDISTINCT;
info->keys = bms_copy(keys);
+ info->exprs = exprs;
stainfos = lappend(stainfos, info);
}
@@ -1343,6 +1381,7 @@ get_relation_statistics(RelOptInfo *rel, Relation
relation)
info->rel = rel;
info->kind = STATS_EXT_DEPENDENCIES;
info->keys = bms_copy(keys);
+ info->exprs = exprs;
stainfos = lappend(stainfos, info);
}
@@ -1355,6 +1394,7 @@ get_relation_statistics(RelOptInfo *rel, Relation
relation)
info->rel = rel;
info->kind = STATS_EXT_MCV;
info->keys = bms_copy(keys);
+ info->exprs = exprs;
stainfos = lappend(stainfos, info);
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 3806687ae3..da87c60dc3 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -225,6 +225,7 @@ static Node *makeRecursiveViewSelect(char *relname, List
*aliases, Node *query);
WindowDef *windef;
JoinExpr *jexpr;
IndexElem *ielem;
+ StatsElem *selem;
Alias *alias;
RangeVar *range;
IntoClause *into;
@@ -386,7 +387,7 @@ static Node *makeRecursiveViewSelect(char *relname, List
*aliases, Node *query);
old_aggr_definition old_aggr_list
oper_argtypes RuleActionList RuleActionMulti
opt_column_list columnList opt_name_list
- sort_clause opt_sort_clause sortby_list
index_params
+ sort_clause opt_sort_clause sortby_list
index_params stats_params
opt_include opt_c_include index_including_params
name_list role_list from_clause from_list
opt_array_bounds
qualified_name_list any_name any_name_list
type_name_list
@@ -494,6 +495,7 @@ static Node *makeRecursiveViewSelect(char *relname, List
*aliases, Node *query);
%type <list> func_alias_clause
%type <sortby> sortby
%type <ielem> index_elem
+%type <selem> stats_param
%type <node> table_ref
%type <jexpr> joined_table
%type <range> relation_expr
@@ -3965,7 +3967,7 @@ ExistingIndex: USING INDEX index_name
{ $$ = $3; }
CreateStatsStmt:
CREATE STATISTICS any_name
- opt_name_list ON expr_list FROM from_list
+ opt_name_list ON stats_params FROM from_list
{
CreateStatsStmt *n =
makeNode(CreateStatsStmt);
n->defnames = $3;
@@ -3977,7 +3979,7 @@ CreateStatsStmt:
$$ = (Node *)n;
}
| CREATE STATISTICS IF_P NOT EXISTS any_name
- opt_name_list ON expr_list FROM from_list
+ opt_name_list ON stats_params FROM from_list
{
CreateStatsStmt *n =
makeNode(CreateStatsStmt);
n->defnames = $6;
@@ -3990,6 +3992,29 @@ CreateStatsStmt:
}
;
+stats_params: stats_param
{ $$ = list_make1($1); }
+ | stats_params ',' stats_param { $$ =
lappend($1, $3); }
+ ;
+
+stats_param: ColId
+ {
+ $$ = makeNode(StatsElem);
+ $$->name = $1;
+ $$->expr = NULL;
+ }
+ | func_expr_windowless
+ {
+ $$ = makeNode(StatsElem);
+ $$->name = NULL;
+ $$->expr = $1;
+ }
+ | '(' a_expr ')'
+ {
+ $$ = makeNode(StatsElem);
+ $$->name = NULL;
+ $$->expr = $2;
+ }
+ ;
/*****************************************************************************
*
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index f1cc5479e4..169a31bf37 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -484,6 +484,13 @@ check_agglevels_and_constraints(ParseState *pstate, Node
*expr)
else
err = _("grouping operations are not allowed in
index predicates");
+ break;
+ case EXPR_KIND_STATS_EXPRESSION:
+ if (isAgg)
+ err = _("aggregate functions are not allowed in
statistics expressions");
+ else
+ err = _("grouping operations are not allowed in
statistics expressions");
+
break;
case EXPR_KIND_ALTER_COL_TRANSFORM:
if (isAgg)
@@ -906,6 +913,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc
*wfunc,
case EXPR_KIND_INDEX_EXPRESSION:
err = _("window functions are not allowed in index
expressions");
break;
+ case EXPR_KIND_STATS_EXPRESSION:
+ err = _("window functions are not allowed in stats
expressions");
+ break;
case EXPR_KIND_INDEX_PREDICATE:
err = _("window functions are not allowed in index
predicates");
break;
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 831db4af95..6ddd839654 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -564,6 +564,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref)
case EXPR_KIND_FUNCTION_DEFAULT:
case EXPR_KIND_INDEX_EXPRESSION:
case EXPR_KIND_INDEX_PREDICATE:
+ case EXPR_KIND_STATS_EXPRESSION:
case EXPR_KIND_ALTER_COL_TRANSFORM:
case EXPR_KIND_EXECUTE_PARAMETER:
case EXPR_KIND_TRIGGER_WHEN:
@@ -1913,6 +1914,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
case EXPR_KIND_INDEX_PREDICATE:
err = _("cannot use subquery in index predicate");
break;
+ case EXPR_KIND_STATS_EXPRESSION:
+ err = _("cannot use subquery in statistics expression");
+ break;
case EXPR_KIND_ALTER_COL_TRANSFORM:
err = _("cannot use subquery in transform expression");
break;
@@ -3543,6 +3547,8 @@ ParseExprKindName(ParseExprKind exprKind)
return "index expression";
case EXPR_KIND_INDEX_PREDICATE:
return "index predicate";
+ case EXPR_KIND_STATS_EXPRESSION:
+ return "statistics expression";
case EXPR_KIND_ALTER_COL_TRANSFORM:
return "USING";
case EXPR_KIND_EXECUTE_PARAMETER:
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 9c3b6ad916..cffc276de0 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -2495,6 +2495,9 @@ check_srf_call_placement(ParseState *pstate, Node
*last_srf, int location)
case EXPR_KIND_INDEX_PREDICATE:
err = _("set-returning functions are not allowed in
index predicates");
break;
+ case EXPR_KIND_STATS_EXPRESSION:
+ err = _("set-returning functions are not allowed in
stats expressions");
+ break;
case EXPR_KIND_ALTER_COL_TRANSFORM:
err = _("set-returning functions are not allowed in
transform expressions");
break;
diff --git a/src/backend/parser/parse_utilcmd.c
b/src/backend/parser/parse_utilcmd.c
index 42095ab830..aeada0b396 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1736,14 +1736,15 @@ generateClonedExtStatsStmt(RangeVar *heapRel, Oid
heapRelid,
/* Determine which columns the statistics are on */
for (i = 0; i < statsrec->stxkeys.dim1; i++)
{
- ColumnRef *cref = makeNode(ColumnRef);
+ StatsElem *selem = makeNode(StatsElem);
AttrNumber attnum = statsrec->stxkeys.values[i];
- cref->fields = list_make1(makeString(get_attname(heapRelid,
-
attnum, false)));
- cref->location = -1;
+ selem->name = get_attname(heapRelid, attnum, false);
+ selem->expr = NULL;
- def_names = lappend(def_names, cref);
+ /* FIXME handle expressions properly */
+
+ def_names = lappend(def_names, selem);
}
/* finally, build the output node */
@@ -2688,6 +2689,84 @@ transformIndexStmt(Oid relid, IndexStmt *stmt, const
char *queryString)
return stmt;
}
+/*
+ * transformStatsStmt - parse analysis for CREATE STATISTICS
+ *
+ * To avoid race conditions, it's important that this function rely only on
+ * the passed-in relid (and not on stmt->relation) to determine the target
+ * relation.
+ */
+CreateStatsStmt *
+transformStatsStmt(Oid relid, CreateStatsStmt *stmt, const char *queryString)
+{
+ ParseState *pstate;
+ RangeTblEntry *rte;
+ ListCell *l;
+ Relation rel;
+
+ /* Nothing to do if statement already transformed. */
+ if (stmt->transformed)
+ return stmt;
+
+ /*
+ * We must not scribble on the passed-in CreateStatsStmt, so copy it.
(This is
+ * overkill, but easy.)
+ */
+ stmt = copyObject(stmt);
+
+ /* Set up pstate */
+ pstate = make_parsestate(NULL);
+ pstate->p_sourcetext = queryString;
+
+ /*
+ * Put the parent table into the rtable so that the expressions can
refer
+ * to its fields without qualification. Caller is responsible for
locking
+ * relation, but we still need to open it.
+ */
+ rel = relation_open(relid, NoLock);
+ rte = addRangeTableEntryForRelation(pstate, rel,
+
AccessShareLock,
+
NULL, false, true);
+
+ /* no to join list, yes to namespaces */
+ addRTEtoQuery(pstate, rte, false, true, true);
+
+ /* take care of any expressions */
+ foreach(l, stmt->exprs)
+ {
+ StatsElem *selem = (StatsElem *) lfirst(l);
+
+ if (selem->expr)
+ {
+ /* Now do parse transformation of the expression */
+ selem->expr = transformExpr(pstate, selem->expr,
+
EXPR_KIND_STATS_EXPRESSION);
+
+ /* We have to fix its collations too */
+ assign_expr_collations(pstate, selem->expr);
+ }
+ }
+
+ /*
+ * Check that only the base rel is mentioned. (This should be dead code
+ * now that add_missing_from is history.)
+ */
+ if (list_length(pstate->p_rtable) != 1)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("index expressions and predicates can
refer only to the table being indexed")));
+
+ free_parsestate(pstate);
+
+ /* Close relation */
+ table_close(rel, NoLock);
+
+ /* Mark statement as successfully transformed */
+ stmt->transformed = true;
+
+ return stmt;
+}
+
/*
* transformRuleStmt -
diff --git a/src/backend/statistics/dependencies.c
b/src/backend/statistics/dependencies.c
index e2f6c5bb97..76afb0ea2a 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -69,8 +69,10 @@ static void generate_dependencies(DependencyGenerator state);
static DependencyGenerator DependencyGenerator_init(int n, int k);
static void DependencyGenerator_free(DependencyGenerator state);
static AttrNumber *DependencyGenerator_next(DependencyGenerator state);
-static double dependency_degree(int numrows, HeapTuple *rows, int k,
- AttrNumber
*dependency, VacAttrStats **stats, Bitmapset *attrs);
+static double dependency_degree(int numrows, HeapTuple *rows,
+ Datum
*exprvals, bool *exprnulls, int nexprs, int k,
+ AttrNumber
*dependency, VacAttrStats **stats,
+ Bitmapset
*attrs);
static bool dependency_is_fully_matched(MVDependency *dependency,
Bitmapset *attnums);
static bool dependency_implies_attribute(MVDependency *dependency,
@@ -213,8 +215,8 @@ DependencyGenerator_next(DependencyGenerator state)
* the last one.
*/
static double
-dependency_degree(int numrows, HeapTuple *rows, int k, AttrNumber *dependency,
- VacAttrStats **stats, Bitmapset *attrs)
+dependency_degree(int numrows, HeapTuple *rows, Datum *exprvals, bool
*exprnulls,
+ int nexprs, int k, AttrNumber *dependency,
VacAttrStats **stats, Bitmapset *attrs)
{
int i,
nitems;
@@ -283,8 +285,8 @@ dependency_degree(int numrows, HeapTuple *rows, int k,
AttrNumber *dependency,
* descriptor. For now that assumption holds, but it might change in
the
* future for example if we support statistics on multiple tables.
*/
- items = build_sorted_items(numrows, &nitems, rows, stats[0]->tupDesc,
- mss, k, attnums_dep);
+ items = build_sorted_items(numrows, &nitems, rows, exprvals, exprnulls,
+ nexprs,
stats[0]->tupDesc, mss, k, attnums_dep);
/*
* Walk through the sorted array, split it into rows according to the
@@ -354,7 +356,9 @@ dependency_degree(int numrows, HeapTuple *rows, int k,
AttrNumber *dependency,
* (c) -> b
*/
MVDependencies *
-statext_dependencies_build(int numrows, HeapTuple *rows, Bitmapset *attrs,
+statext_dependencies_build(int numrows, HeapTuple *rows,
+ Datum *exprvals, bool
*exprnulls,
+ Bitmapset *attrs, List
*exprs,
VacAttrStats **stats)
{
int i,
@@ -365,6 +369,15 @@ statext_dependencies_build(int numrows, HeapTuple *rows,
Bitmapset *attrs,
/* result */
MVDependencies *dependencies = NULL;
+ /*
+ * Copy the bitmapset and add fake attnums representing expressions,
+ * starting above MaxHeapAttributeNumber.
+ */
+ attrs = bms_copy(attrs);
+
+ for (i = 1; i <= list_length(exprs); i++)
+ attrs = bms_add_member(attrs, MaxHeapAttributeNumber + i);
+
/*
* Transform the bms into an array, to make accessing i-th member
easier.
*/
@@ -392,7 +405,9 @@ statext_dependencies_build(int numrows, HeapTuple *rows,
Bitmapset *attrs,
MVDependency *d;
/* compute how valid the dependency seems */
- degree = dependency_degree(numrows, rows, k,
dependency, stats, attrs);
+ degree = dependency_degree(numrows, rows, exprvals,
exprnulls,
+
list_length(exprs), k, dependency,
+
stats, attrs);
/*
* if the dependency seems entirely invalid, don't
store it
@@ -435,6 +450,8 @@ statext_dependencies_build(int numrows, HeapTuple *rows,
Bitmapset *attrs,
DependencyGenerator_free(DependencyGenerator);
}
+ pfree(attrs);
+
return dependencies;
}
@@ -914,6 +931,128 @@ find_strongest_dependency(MVDependencies **dependencies,
int ndependencies,
return strongest;
}
+/*
+ * Similar to dependency_is_compatible_clause, but don't enforce that the
+ * expression is a simple Var.
+ */
+static bool
+dependency_clause_matches_expression(Node *clause, Index relid, List *statlist)
+{
+ List *vars;
+ ListCell *lc, *lc2;
+
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+ Node *clause_expr;
+
+ if (!IsA(rinfo, RestrictInfo))
+ return false;
+
+ /* Pseudoconstants are not interesting (they couldn't contain a Var) */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* Clauses referencing multiple, or no, varnos are incompatible */
+ if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
+ return false;
+
+ if (is_opclause(rinfo->clause))
+ {
+ /* If it's an opclause, check for Var = Const or Const = Var. */
+ OpExpr *expr = (OpExpr *) rinfo->clause;
+
+ /* Only expressions with two arguments are candidates. */
+ if (list_length(expr->args) != 2)
+ return false;
+
+ /* Make sure non-selected argument is a pseudoconstant. */
+ if (is_pseudo_constant_clause(lsecond(expr->args)))
+ clause_expr = linitial(expr->args);
+ else if (is_pseudo_constant_clause(linitial(expr->args)))
+ clause_expr = lsecond(expr->args);
+ else
+ return false;
+
+ /*
+ * If it's not an "=" operator, just ignore the clause, as it's
not
+ * compatible with functional dependencies.
+ *
+ * This uses the function for estimating selectivity, not the
operator
+ * directly (a bit awkward, but well ...).
+ *
+ * XXX this is pretty dubious; probably it'd be better to check
btree
+ * or hash opclass membership, so as not to be fooled by custom
+ * selectivity functions, and to be more consistent with
decisions
+ * elsewhere in the planner.
+ */
+ if (get_oprrest(expr->opno) != F_EQSEL)
+ return false;
+
+ /* OK to proceed with checking "var" */
+ }
+ else if (is_notclause(rinfo->clause))
+ {
+ /*
+ * "NOT x" can be interpreted as "x = false", so get the
argument and
+ * proceed with seeing if it's a suitable Var.
+ */
+ clause_expr = (Node *) get_notclausearg(rinfo->clause);
+ }
+ else
+ {
+ /*
+ * A boolean expression "x" can be interpreted as "x = true", so
+ * proceed with seeing if it's a suitable Var.
+ */
+ clause_expr = (Node *) rinfo->clause;
+ }
+
+ /*
+ * We may ignore any RelabelType node above the operand. (There won't
be
+ * more than one, since eval_const_expressions has been applied
already.)
+ */
+ if (IsA(clause_expr, RelabelType))
+ clause_expr = (Node *) ((RelabelType *) clause_expr)->arg;
+
+ vars = pull_var_clause(clause_expr, 0);
+
+ elog(WARNING, "nvars = %d", list_length(vars));
+
+ foreach (lc, vars)
+ {
+ Var *var = (Var *) lfirst(lc);
+
+ /* Ensure Var is from the correct relation */
+ if (var->varno != relid)
+ return false;
+
+ /* We also better ensure the Var is from the current level */
+ if (var->varlevelsup != 0)
+ return false;
+
+ /* Also ignore system attributes (we don't allow stats on
those) */
+ if (!AttrNumberIsForUserDefinedAttr(var->varattno))
+ return false;
+ }
+
+ foreach (lc, statlist)
+ {
+ StatisticExtInfo *info = (StatisticExtInfo *) lfirst(lc);
+
+ foreach (lc2, info->exprs)
+ {
+ Node *expr = (Node *) lfirst(lc2);
+
+ if (equal(clause_expr, expr))
+ {
+ elog(WARNING, "match");
+ return true;
+ }
+ }
+ }
+
+ return false;
+}
+
/*
* dependencies_clauselist_selectivity
* Return the estimated selectivity of (a subset of) the given
clauses
@@ -982,8 +1121,10 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
Node *clause = (Node *) lfirst(l);
AttrNumber attnum;
+ dependency_clause_matches_expression(clause, rel->relid,
rel->statlist);
+
if (!bms_is_member(listidx, *estimatedclauses) &&
- dependency_is_compatible_clause(clause, rel->relid,
&attnum))
+ dependency_is_compatible_clause(clause, rel->relid,
&attnum))
{
list_attnums[listidx] = bms_make_singleton(attnum);
clauses_attnums = bms_add_member(clauses_attnums,
attnum);
diff --git a/src/backend/statistics/extended_stats.c
b/src/backend/statistics/extended_stats.c
index d9e854228c..d9936ed684 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -24,6 +24,7 @@
#include "catalog/pg_collation.h"
#include "catalog/pg_statistic_ext.h"
#include "catalog/pg_statistic_ext_data.h"
+#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/clauses.h"
@@ -63,11 +64,12 @@ typedef struct StatExtEntry
Bitmapset *columns; /* attribute numbers covered by the
object */
List *types; /* 'char' list of enabled
statistic kinds */
int stattarget; /* statistics target
(-1 for default) */
+ List *exprs; /* expressions */
} StatExtEntry;
static List *fetch_statentries_for_relation(Relation pg_statext, Oid relid);
-static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs,
+static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs,
List *exprs,
int nvacatts, VacAttrStats **vacatts);
static void statext_store(Oid relid,
MVNDistinct *ndistinct,
MVDependencies *dependencies,
@@ -111,11 +113,15 @@ BuildRelationExtStatistics(Relation onerel, double
totalrows,
ListCell *lc2;
int stattarget;
+ /* evaluated expressions */
+ Datum *exprvals = NULL;
+ bool *exprnulls = NULL;
+
/*
* Check if we can build these stats based on the column
analyzed. If
* not, report this fact (except in autovacuum) and move on.
*/
- stats = lookup_var_attr_stats(onerel, stat->columns,
+ stats = lookup_var_attr_stats(onerel, stat->columns,
stat->exprs,
natts, vacattrstats);
if (!stats)
{
@@ -131,8 +137,8 @@ BuildRelationExtStatistics(Relation onerel, double
totalrows,
}
/* check allowed number of dimensions */
- Assert(bms_num_members(stat->columns) >= 2 &&
- bms_num_members(stat->columns) <=
STATS_MAX_DIMENSIONS);
+ Assert(bms_num_members(stat->columns) +
list_length(stat->exprs) >= 2 &&
+ bms_num_members(stat->columns) +
list_length(stat->exprs) <= STATS_MAX_DIMENSIONS);
/* compute statistics target for this statistics */
stattarget = statext_compute_stattarget(stat->stattarget,
@@ -147,6 +153,78 @@ BuildRelationExtStatistics(Relation onerel, double
totalrows,
if (stattarget == 0)
continue;
+ if (stat->exprs)
+ {
+ int i;
+ int idx;
+ TupleTableSlot *slot;
+ EState *estate;
+ ExprContext *econtext;
+ List *exprstates = NIL;
+
+ /*
+ * Need an EState for evaluation of index expressions
and
+ * partial-index predicates. Create it in the
per-index context to be
+ * sure it gets cleaned up at the bottom of the loop.
+ */
+ estate = CreateExecutorState();
+ econtext = GetPerTupleExprContext(estate);
+ /* Need a slot to hold the current heap tuple, too */
+ slot =
MakeSingleTupleTableSlot(RelationGetDescr(onerel),
+
&TTSOpsHeapTuple);
+
+ /* Arrange for econtext's scan tuple to be the tuple
under test */
+ econtext->ecxt_scantuple = slot;
+
+ /* Compute and save index expression values */
+ exprvals = (Datum *) palloc(numrows *
list_length(stat->exprs) * sizeof(Datum));
+ exprnulls = (bool *) palloc(numrows *
list_length(stat->exprs) * sizeof(bool));
+
+ /* Set up expression evaluation state */
+ exprstates = ExecPrepareExprList(stat->exprs, estate);
+
+ idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ /*
+ * Reset the per-tuple context each time, to
reclaim any cruft
+ * left behind by evaluating the predicate or
index expressions.
+ */
+ ResetExprContext(econtext);
+
+ /* Set up for predicate or expression
evaluation */
+ ExecStoreHeapTuple(rows[i], slot, false);
+
+ foreach (lc2, exprstates)
+ {
+ Datum datum;
+ bool isnull;
+ ExprState *exprstate = (ExprState *)
lfirst(lc2);
+
+ datum =
ExecEvalExprSwitchContext(exprstate,
+
GetPerTupleExprContext(estate),
+
&isnull);
+ if (isnull)
+ {
+ exprvals[idx] = (Datum) 0;
+ exprnulls[idx] = true;
+ }
+ else
+ {
+ exprvals[idx] = (Datum) datum;
+ exprnulls[idx] = false;
+ }
+
+ idx++;
+ }
+ }
+
+ ExecDropSingleTupleTableSlot(slot);
+ FreeExecutorState(estate);
+
+ elog(WARNING, "idx = %d", idx);
+ }
+
/* compute statistic of each requested type */
foreach(lc2, stat->types)
{
@@ -154,13 +232,19 @@ BuildRelationExtStatistics(Relation onerel, double
totalrows,
if (t == STATS_EXT_NDISTINCT)
ndistinct = statext_ndistinct_build(totalrows,
numrows, rows,
-
stat->columns, stats);
+
exprvals, exprnulls,
+
stat->columns, stat->exprs,
+
stats);
else if (t == STATS_EXT_DEPENDENCIES)
dependencies =
statext_dependencies_build(numrows, rows,
-
stat->columns, stats);
+
exprvals, exprnulls,
+
stat->columns,
+
stat->exprs, stats);
else if (t == STATS_EXT_MCV)
- mcv = statext_mcv_build(numrows, rows,
stat->columns, stats,
-
totalrows, stattarget);
+ mcv = statext_mcv_build(numrows, rows,
+
exprvals, exprnulls,
+
stat->columns, stat->exprs,
+
stats, totalrows, stattarget);
}
/* store the statistics in the catalog */
@@ -217,7 +301,7 @@ ComputeExtStatisticsRows(Relation onerel,
* analyzed. If not, ignore it (don't report anything, we'll do
that
* during the actual build BuildRelationExtStatistics).
*/
- stats = lookup_var_attr_stats(onerel, stat->columns,
+ stats = lookup_var_attr_stats(onerel, stat->columns,
stat->exprs,
natts, vacattrstats);
if (!stats)
@@ -364,6 +448,7 @@ fetch_statentries_for_relation(Relation pg_statext, Oid
relid)
ArrayType *arr;
char *enabled;
Form_pg_statistic_ext staForm;
+ List *exprs = NIL;
entry = palloc0(sizeof(StatExtEntry));
staForm = (Form_pg_statistic_ext) GETSTRUCT(htup);
@@ -395,6 +480,34 @@ fetch_statentries_for_relation(Relation pg_statext, Oid
relid)
entry->types = lappend_int(entry->types, (int)
enabled[i]);
}
+ /* decode expression (if any) */
+ datum = SysCacheGetAttr(STATEXTOID, htup,
+
Anum_pg_statistic_ext_stxexprs, &isnull);
+
+ if (!isnull)
+ {
+ char *exprsString;
+
+ exprsString = TextDatumGetCString(datum);
+ exprs = (List *) stringToNode(exprsString);
+
+ pfree(exprsString);
+
+ /*
+ * Run the expressions through eval_const_expressions.
This is not just an
+ * optimization, but is necessary, because the planner
will be comparing
+ * them to similarly-processed qual clauses, and may
fail to detect valid
+ * matches without this. We must not use
canonicalize_qual, however,
+ * since these aren't qual expressions.
+ */
+ exprs = (List *) eval_const_expressions(NULL, (Node *)
exprs);
+
+ /* May as well fix opfuncids too */
+ fix_opfuncids((Node *) exprs);
+ }
+
+ entry->exprs = exprs;
+
result = lappend(result, entry);
}
@@ -403,6 +516,89 @@ fetch_statentries_for_relation(Relation pg_statext, Oid
relid)
return result;
}
+
+/*
+ * examine_attribute -- pre-analysis of a single column
+ *
+ * Determine whether the column is analyzable; if so, create and initialize
+ * a VacAttrStats struct for it. If not, return NULL.
+ *
+ * If index_expr isn't NULL, then we're trying to analyze an expression index,
+ * and index_expr is the expression tree representing the column's data.
+ */
+static VacAttrStats *
+examine_attribute(Node *expr)
+{
+ HeapTuple typtuple;
+ VacAttrStats *stats;
+ int i;
+ bool ok;
+
+ /*
+ * Create the VacAttrStats struct. Note that we only have a copy of the
+ * fixed fields of the pg_attribute tuple.
+ */
+ stats = (VacAttrStats *) palloc0(sizeof(VacAttrStats));
+
+ /* fake the attribute */
+ stats->attr = (Form_pg_attribute) palloc0(ATTRIBUTE_FIXED_PART_SIZE);
+ stats->attr->attstattarget = -1;
+
+ /*
+ * When analyzing an expression index, believe the expression tree's
type
+ * not the column datatype --- the latter might be the opckeytype
storage
+ * type of the opclass, which is not interesting for our purposes.
(Note:
+ * if we did anything with non-expression index columns, we'd need to
+ * figure out where to get the correct type info from, but for now
that's
+ * not a problem.) It's not clear whether anyone will care about
the
+ * typmod, but we store that too just in case.
+ */
+ stats->attrtypid = exprType(expr);
+ stats->attrtypmod = exprTypmod(expr);
+ stats->attrcollid = exprCollation(expr);
+
+ typtuple = SearchSysCacheCopy1(TYPEOID,
+
ObjectIdGetDatum(stats->attrtypid));
+ if (!HeapTupleIsValid(typtuple))
+ elog(ERROR, "cache lookup failed for type %u",
stats->attrtypid);
+ stats->attrtype = (Form_pg_type) GETSTRUCT(typtuple);
+ // stats->anl_context = anl_context;
+ stats->tupattnum = InvalidAttrNumber;
+
+ /*
+ * The fields describing the stats->stavalues[n] element types default
to
+ * the type of the data being analyzed, but the type-specific typanalyze
+ * function can change them if it wants to store something else.
+ */
+ for (i = 0; i < STATISTIC_NUM_SLOTS; i++)
+ {
+ stats->statypid[i] = stats->attrtypid;
+ stats->statyplen[i] = stats->attrtype->typlen;
+ stats->statypbyval[i] = stats->attrtype->typbyval;
+ stats->statypalign[i] = stats->attrtype->typalign;
+ }
+
+ /*
+ * Call the type-specific typanalyze function. If none is specified,
use
+ * std_typanalyze().
+ */
+ if (OidIsValid(stats->attrtype->typanalyze))
+ ok = DatumGetBool(OidFunctionCall1(stats->attrtype->typanalyze,
+
PointerGetDatum(stats)));
+ else
+ ok = std_typanalyze(stats);
+
+ if (!ok || stats->compute_stats == NULL || stats->minrows <= 0)
+ {
+ heap_freetuple(typtuple);
+ pfree(stats->attr);
+ pfree(stats);
+ return NULL;
+ }
+
+ return stats;
+}
+
/*
* Using 'vacatts' of size 'nvacatts' as input data, return a newly built
* VacAttrStats array which includes only the items corresponding to
@@ -411,15 +607,18 @@ fetch_statentries_for_relation(Relation pg_statext, Oid
relid)
* to the caller that the stats should not be built.
*/
static VacAttrStats **
-lookup_var_attr_stats(Relation rel, Bitmapset *attrs,
+lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
int nvacatts, VacAttrStats **vacatts)
{
int i = 0;
int x = -1;
+ int natts;
VacAttrStats **stats;
+ ListCell *lc;
- stats = (VacAttrStats **)
- palloc(bms_num_members(attrs) * sizeof(VacAttrStats *));
+ natts = bms_num_members(attrs) + list_length(exprs);
+
+ stats = (VacAttrStats **) palloc(natts * sizeof(VacAttrStats *));
/* lookup VacAttrStats info for the requested columns (same attnum) */
while ((x = bms_next_member(attrs, x)) >= 0)
@@ -453,6 +652,19 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs,
*/
Assert(!stats[i]->attr->attisdropped);
+ elog(WARNING, "A: %d => %p", i, stats[i]);
+
+ i++;
+ }
+
+ foreach (lc, exprs)
+ {
+ Node *expr = (Node *) lfirst(lc);
+
+ stats[i] = examine_attribute(expr);
+
+ elog(WARNING, "B: %d => %p (%s)", i, stats[i],
nodeToString(expr));
+
i++;
}
@@ -717,8 +929,10 @@ build_attnums_array(Bitmapset *attrs, int *numattrs)
* can simply pfree the return value to release all of it.
*/
SortItem *
-build_sorted_items(int numrows, int *nitems, HeapTuple *rows, TupleDesc tdesc,
- MultiSortSupport mss, int numattrs,
AttrNumber *attnums)
+build_sorted_items(int numrows, int *nitems, HeapTuple *rows,
+ Datum *exprvals, bool *exprnulls, int nexprs,
+ TupleDesc tdesc, MultiSortSupport mss,
+ int numattrs, AttrNumber *attnums)
{
int i,
j,
@@ -766,7 +980,16 @@ build_sorted_items(int numrows, int *nitems, HeapTuple
*rows, TupleDesc tdesc,
Datum value;
bool isnull;
- value = heap_getattr(rows[i], attnums[j], tdesc,
&isnull);
+ if (attnums[j] <= MaxHeapAttributeNumber)
+ value = heap_getattr(rows[i], attnums[j],
tdesc, &isnull);
+ else
+ {
+ int expridx = (attnums[j] -
MaxHeapAttributeNumber - 1);
+ int idx = i * nexprs + expridx;
+
+ value = exprvals[idx];
+ isnull = exprnulls[idx];
+ }
/*
* If this is a varlena value, check if it's too wide
and if yes
@@ -1080,6 +1303,168 @@ statext_is_compatible_clause_internal(PlannerInfo
*root, Node *clause,
return false;
}
+
+
+/*
+ * statext_extract_clause_internal
+ * Determines if the clause is compatible with MCV lists.
+ *
+ * Does the heavy lifting of actually inspecting the clauses for
+ * statext_is_compatible_clause. It needs to be split like this because
+ * of recursion. The attnums bitmap is an input/output parameter collecting
+ * attribute numbers from all compatible clauses (recursively).
+ */
+static List *
+statext_extract_clause_internal(PlannerInfo *root, Node *clause, Index relid)
+{
+ List *result = NIL;
+
+ /* Look inside any binary-compatible relabeling (as in
examine_variable) */
+ if (IsA(clause, RelabelType))
+ clause = (Node *) ((RelabelType *) clause)->arg;
+
+ /* plain Var references (boolean Vars or recursive checks) */
+ if (IsA(clause, Var))
+ {
+ Var *var = (Var *) clause;
+
+ /* Ensure var is from the correct relation */
+ if (var->varno != relid)
+ return NIL;
+
+ /* we also better ensure the Var is from the current level */
+ if (var->varlevelsup > 0)
+ return NIL;
+
+ /* Also skip system attributes (we don't allow stats on those).
*/
+ if (!AttrNumberIsForUserDefinedAttr(var->varattno))
+ return NIL;
+
+ // *attnums = bms_add_member(*attnums, var->varattno);
+
+ result = lappend(result, clause);
+
+ return result;
+ }
+
+ /* (Var op Const) or (Const op Var) */
+ if (is_opclause(clause))
+ {
+ RangeTblEntry *rte = root->simple_rte_array[relid];
+ OpExpr *expr = (OpExpr *) clause;
+ Var *var;
+ Var *var2 = NULL;
+
+ /* Only expressions with two arguments are considered
compatible. */
+ if (list_length(expr->args) != 2)
+ return NIL;
+
+ /* Check if the expression the right shape (one Var, one Const)
*/
+ if ((!examine_opclause_expression(expr, &var, NULL, NULL)) &&
+ (!examine_opclause_expression2(expr, &var, &var2)))
+ return NIL;
+
+ /*
+ * If it's not one of the supported operators ("=", "<", ">",
etc.),
+ * just ignore the clause, as it's not compatible with MCV
lists.
+ *
+ * This uses the function for estimating selectivity, not the
operator
+ * directly (a bit awkward, but well ...).
+ */
+ switch (get_oprrest(expr->opno))
+ {
+ case F_EQSEL:
+ case F_NEQSEL:
+ case F_SCALARLTSEL:
+ case F_SCALARLESEL:
+ case F_SCALARGTSEL:
+ case F_SCALARGESEL:
+ /* supported, will continue with inspection of
the Var */
+ break;
+
+ default:
+ /* other estimators are considered
unknown/unsupported */
+ return NIL;
+ }
+
+ /*
+ * If there are any securityQuals on the RTE from security
barrier
+ * views or RLS policies, then the user may not have access to
all the
+ * table's data, and we must check that the operator is
leak-proof.
+ *
+ * If the operator is leaky, then we must ignore this clause
for the
+ * purposes of estimating with MCV lists, otherwise the
operator might
+ * reveal values from the MCV list that the user doesn't have
+ * permission to see.
+ */
+ if (rte->securityQuals != NIL &&
+ !get_func_leakproof(get_opcode(expr->opno)))
+ return NIL;
+
+ result = lappend(result, var);
+
+ if (var2)
+ result = lappend(result, var2);
+
+ return result;
+ }
+
+ /* AND/OR/NOT clause */
+ if (is_andclause(clause) ||
+ is_orclause(clause) ||
+ is_notclause(clause))
+ {
+ /*
+ * AND/OR/NOT-clauses are supported if all sub-clauses are
supported
+ *
+ * Perhaps we could improve this by handling mixed cases, when
some of
+ * the clauses are supported and some are not. Selectivity for
the
+ * supported subclauses would be computed using extended
statistics,
+ * and the remaining clauses would be estimated using the
traditional
+ * algorithm (product of selectivities).
+ *
+ * It however seems overly complex, and in a way we already do
that
+ * because if we reject the whole clause as unsupported here,
it will
+ * be eventually passed to clauselist_selectivity() which does
exactly
+ * this (split into supported/unsupported clauses etc).
+ */
+ BoolExpr *expr = (BoolExpr *) clause;
+ ListCell *lc;
+
+ foreach(lc, expr->args)
+ {
+ /*
+ * Had we found incompatible clause in the arguments,
treat the
+ * whole clause as incompatible.
+ */
+ if (!statext_extract_clause_internal(root,
+
(Node *) lfirst(lc),
+
relid))
+ return NIL;
+ }
+
+ return result;
+ }
+
+ /* Var IS NULL */
+ if (IsA(clause, NullTest))
+ {
+ NullTest *nt = (NullTest *) clause;
+
+ /*
+ * Only simple (Var IS NULL) expressions supported for now.
Maybe we
+ * could use examine_variable to fix this?
+ */
+ if (!IsA(nt->arg, Var))
+ return false;
+
+ return statext_extract_clause_internal(root, (Node *) (nt->arg),
+
relid);
+ }
+
+ return false;
+}
+
/*
* statext_is_compatible_clause
* Determines if the clause is compatible with MCV lists.
@@ -1154,6 +1539,51 @@ statext_is_compatible_clause(PlannerInfo *root, Node
*clause, Index relid,
return true;
}
+/*
+ * statext_extract_clause
+ * Determines if the clause is compatible with MCV lists.
+ *
+ * Currently, we only support three types of clauses:
+ *
+ * (a) OpExprs of the form (Var op Const), or (Const op Var), where the op
+ * is one of ("=", "<", ">", ">=", "<=")
+ *
+ * (b) (Var IS [NOT] NULL)
+ *
+ * (c) combinations using AND/OR/NOT
+ *
+ * In the future, the range of supported clauses may be expanded to more
+ * complex cases, for example (Var op Var).
+ */
+static List *
+statext_extract_clause(PlannerInfo *root, Node *clause, Index relid)
+{
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+ List *exprs;
+
+ if (!IsA(rinfo, RestrictInfo))
+ return false;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* clauses referencing multiple varnos are incompatible */
+ if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
+ return false;
+
+ /* Check the clause and determine what attributes it references. */
+ exprs = statext_extract_clause_internal(root, (Node *) rinfo->clause,
relid);
+
+ if (!exprs)
+ return NULL;
+
+ /* FIXME do the same ACL check as in statext_is_compatible_clause */
+
+ /* If we reach here, the clause is OK */
+ return exprs;
+}
+
/*
* statext_mcv_clauselist_selectivity
* Estimate clauses using the best multi-column statistics.
@@ -1216,7 +1646,8 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root,
List *clauses, int varReli
bool is_or)
{
ListCell *l;
- Bitmapset **list_attnums;
+ Bitmapset **list_attnums; /* attnums extracted from the clause */
+ bool *exact_clauses; /* covered as-is by at least one
statistic */
int listidx;
Selectivity sel = 1.0;
@@ -1227,6 +1658,8 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root,
List *clauses, int varReli
list_attnums = (Bitmapset **) palloc(sizeof(Bitmapset *) *
list_length(clauses));
+ exact_clauses = (bool *) palloc(sizeof(bool) * list_length(clauses));
+
/*
* Pre-process the clauses list to extract the attnums seen in each
item.
* We need to determine if there's any clauses which will be useful for
@@ -1244,11 +1677,76 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root,
List *clauses, int varReli
Node *clause = (Node *) lfirst(l);
Bitmapset *attnums = NULL;
+ /* the clause is considered incompatible by default */
+ list_attnums[listidx] = NULL;
+
+ /* and it's also not covered exactly by the statistic */
+ exact_clauses[listidx] = false;
+
+ /*
+ * First see if the clause is simple enough to be covered
directly
+ * by the attributes. If not, see if there's at least one
statistic
+ * object using the expression as-is.
+ */
if (!bms_is_member(listidx, *estimatedclauses) &&
statext_is_compatible_clause(root, clause, rel->relid,
&attnums))
+ /* simple expression, covered through attnum(s) */
list_attnums[listidx] = attnums;
else
- list_attnums[listidx] = NULL;
+ {
+ ListCell *lc;
+
+ List *exprs = statext_extract_clause(root, clause,
rel->relid);
+
+ /* complex expression, search for statistic */
+ foreach(lc, rel->statlist)
+ {
+ ListCell *lc2;
+ StatisticExtInfo *info = (StatisticExtInfo *)
lfirst(lc);
+ bool all_found =
true;
+
+ /* have we already found all expressions in a
statistic? */
+ Assert(!exact_clauses[listidx]);
+
+ /* no expressions */
+ if (!info->exprs)
+ continue;
+
+ foreach (lc2, exprs)
+ {
+ Node *expr = (Node *) lfirst(lc2);
+
+ /*
+ * Walk the expressions, see if all
expressions extracted from
+ * the clause are covered by the
extended statistic object.
+ */
+ foreach (lc2, info->exprs)
+ {
+ Node *stat_expr = (Node *)
lfirst(lc2);
+ bool expr_found = false;
+
+ if (equal(expr, stat_expr))
+ {
+ expr_found = true;
+ break;
+ }
+
+ if (!expr_found)
+ {
+ all_found = false;
+ break;
+ }
+ }
+ }
+
+ /* stop looking for another statistic */
+ if (all_found)
+ {
+ exact_clauses[listidx] = true;
+ break;
+ }
+ }
+ }
listidx++;
}
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 4b51af287e..c3c3ede7c5 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -180,7 +180,9 @@ get_mincount_for_mcv_list(int samplerows, double totalrows)
*
*/
MCVList *
-statext_mcv_build(int numrows, HeapTuple *rows, Bitmapset *attrs,
+statext_mcv_build(int numrows, HeapTuple *rows,
+ Datum *exprvals, bool *exprnulls,
+ Bitmapset *attrs, List *exprs,
VacAttrStats **stats, double totalrows, int
stattarget)
{
int i,
@@ -194,13 +196,23 @@ statext_mcv_build(int numrows, HeapTuple *rows, Bitmapset
*attrs,
MCVList *mcvlist = NULL;
MultiSortSupport mss;
+ /*
+ * Copy the bitmapset and add fake attnums representing expressions,
+ * starting above MaxHeapAttributeNumber.
+ */
+ attrs = bms_copy(attrs);
+
+ for (i = 1; i <= list_length(exprs); i++)
+ attrs = bms_add_member(attrs, MaxHeapAttributeNumber + i);
+
attnums = build_attnums_array(attrs, &numattrs);
/* comparator for all the columns */
mss = build_mss(stats, numattrs);
/* sort the rows */
- items = build_sorted_items(numrows, &nitems, rows, stats[0]->tupDesc,
+ items = build_sorted_items(numrows, &nitems, rows, exprvals, exprnulls,
+ list_length(exprs),
stats[0]->tupDesc,
mss, numattrs,
attnums);
if (!items)
@@ -337,6 +349,7 @@ statext_mcv_build(int numrows, HeapTuple *rows, Bitmapset
*attrs,
pfree(items);
pfree(groups);
+ pfree(attrs);
return mcvlist;
}
diff --git a/src/backend/statistics/mvdistinct.c
b/src/backend/statistics/mvdistinct.c
index 977d6f3e2e..dd874c7a04 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -37,8 +37,10 @@
#include "utils/typcache.h"
static double ndistinct_for_combination(double totalrows, int numrows,
-
HeapTuple *rows, VacAttrStats **stats,
-
int k, int *combination);
+
HeapTuple *rows, Datum *exprvals,
+
bool *exprnulls, int nexprs,
+
VacAttrStats **stats, int k,
+
int *combination);
static double estimate_ndistinct(double totalrows, int numrows, int d, int f1);
static int n_choose_k(int n, int k);
static int num_combinations(int n);
@@ -84,14 +86,26 @@ static void generate_combinations(CombinationGenerator
*state);
*/
MVNDistinct *
statext_ndistinct_build(double totalrows, int numrows, HeapTuple *rows,
- Bitmapset *attrs, VacAttrStats
**stats)
+ Datum *exprvals, bool
*exprnulls,
+ Bitmapset *attrs, List *exprs,
+ VacAttrStats **stats)
{
MVNDistinct *result;
+ int i;
int k;
int itemcnt;
- int numattrs = bms_num_members(attrs);
+ int numattrs = bms_num_members(attrs) +
list_length(exprs);
int numcombs = num_combinations(numattrs);
+ /*
+ * Copy the bitmapset and add fake attnums representing expressions,
+ * starting above MaxHeapAttributeNumber.
+ */
+ attrs = bms_copy(attrs);
+
+ for (i = 1; i <= list_length(exprs); i++)
+ attrs = bms_add_member(attrs, MaxHeapAttributeNumber + i);
+
result = palloc(offsetof(MVNDistinct, items) +
numcombs * sizeof(MVNDistinctItem));
result->magic = STATS_NDISTINCT_MAGIC;
@@ -114,10 +128,18 @@ statext_ndistinct_build(double totalrows, int numrows,
HeapTuple *rows,
item->attrs = NULL;
for (j = 0; j < k; j++)
- item->attrs = bms_add_member(item->attrs,
-
stats[combination[j]]->attr->attnum);
+ {
+ if (combination[j] <= MaxHeapAttributeNumber)
+ item->attrs =
bms_add_member(item->attrs,
+
stats[combination[j]]->attr->attnum);
+ else
+ item->attrs =
bms_add_member(item->attrs, combination[j]);
+ }
+
item->ndistinct =
ndistinct_for_combination(totalrows, numrows,
rows,
+
exprvals, exprnulls,
+
list_length(exprs),
stats, k, combination);
itemcnt++;
@@ -428,6 +450,7 @@ pg_ndistinct_send(PG_FUNCTION_ARGS)
*/
static double
ndistinct_for_combination(double totalrows, int numrows, HeapTuple *rows,
+ Datum *exprvals, bool
*exprnulls, int nexprs,
VacAttrStats **stats, int k,
int *combination)
{
int i,
@@ -481,11 +504,17 @@ ndistinct_for_combination(double totalrows, int numrows,
HeapTuple *rows,
/* accumulate all the data for this dimension into the arrays */
for (j = 0; j < numrows; j++)
{
- items[j].values[i] =
- heap_getattr(rows[j],
- colstat->attr->attnum,
- colstat->tupDesc,
- &items[j].isnull[i]);
+ if (combination[i] <= MaxHeapAttributeNumber)
+ items[j].values[i] =
+ heap_getattr(rows[j],
+
colstat->attr->attnum,
+
colstat->tupDesc,
+
&items[j].isnull[i]);
+ else
+ {
+ items[j].values[i] = exprvals[j * nexprs +
combination[i] - MaxHeapAttributeNumber - 1];
+ items[j].isnull[i] = exprnulls[j * nexprs +
combination[i] - MaxHeapAttributeNumber - 1];
+ }
}
}
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index b2c58bf862..701cff1693 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1678,7 +1678,21 @@ ProcessUtilitySlow(ParseState *pstate,
break;
case T_CreateStatsStmt:
- address = CreateStatistics((CreateStatsStmt *)
parsetree);
+ {
+ Oid relid;
+ CreateStatsStmt *stmt =
(CreateStatsStmt *) parsetree;
+ RangeVar *rel = (RangeVar *)
linitial(stmt->relations);
+
+ relid = RangeVarGetRelidExtended(rel,
ShareLock,
+
0,
+
RangeVarCallbackOwnsRelation,
+
NULL);
+
+ /* Run parse analysis ... */
+ stmt = transformStatsStmt(relid, stmt,
queryString);
+
+ address = CreateStatistics(stmt);
+ }
break;
case T_AlterStatsStmt:
diff --git a/src/backend/utils/adt/ruleutils.c
b/src/backend/utils/adt/ruleutils.c
index 116e00bce4..7e44afed16 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -1524,6 +1524,9 @@ pg_get_statisticsobj_worker(Oid statextid, bool
missing_ok)
bool dependencies_enabled;
bool mcv_enabled;
int i;
+ List *context;
+ ListCell *lc;
+ List *exprs = NIL;
statexttup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statextid));
@@ -1616,6 +1619,62 @@ pg_get_statisticsobj_worker(Oid statextid, bool
missing_ok)
appendStringInfoString(&buf, quote_identifier(attname));
}
+ /* deparse expressions */
+
+ {
+ bool isnull;
+ Datum datum;
+
+ /* decode expression (if any) */
+ datum = SysCacheGetAttr(STATEXTOID, statexttup,
+
Anum_pg_statistic_ext_stxexprs, &isnull);
+
+ if (!isnull)
+ {
+ char *exprsString;
+
+ exprsString = TextDatumGetCString(datum);
+ exprs = (List *) stringToNode(exprsString);
+ pfree(exprsString);
+
+ /*
+ * Run the expressions through
eval_const_expressions. This is not just an
+ * optimization, but is necessary, because the
planner will be comparing
+ * them to similarly-processed qual clauses,
and may fail to detect valid
+ * matches without this. We must not use
canonicalize_qual, however,
+ * since these aren't qual expressions.
+ */
+ exprs = (List *) eval_const_expressions(NULL,
(Node *) exprs);
+
+ /* May as well fix opfuncids too */
+ fix_opfuncids((Node *) exprs);
+ }
+ }
+
+ context = deparse_context_for(get_relation_name(statextrec->stxrelid),
+
statextrec->stxrelid);
+
+ foreach (lc, exprs)
+ {
+ Node *expr = (Node *) lfirst(lc);
+ char *str;
+ int prettyFlags = PRETTYFLAG_INDENT;
+
+ str = deparse_expression_pretty(expr, context, false, false,
+
prettyFlags, 0);
+
+ if (colno > 0)
+ appendStringInfoString(&buf, ", ");
+
+ /* Need parens if it's not a bare function call */
+ if (looks_like_function(expr))
+ appendStringInfoString(&buf, str);
+ else
+ appendStringInfo(&buf, "(%s)", str);
+
+ colno++;
+ }
+
appendStringInfo(&buf, " FROM %s",
generate_relation_name(statextrec->stxrelid, NIL));
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 18d77ac0b7..bfbe92a543 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3082,6 +3082,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs,
double input_rows,
double this_srf_multiplier;
VariableStatData vardata;
List *varshere;
+ Relids varnos;
ListCell *l2;
/* is expression in this grouping set? */
@@ -3149,6 +3150,16 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs,
double input_rows,
continue;
}
+ /*
+ * Are all the variables from the same relation? If yes, search
for
+ * an extended statistic matching this expression exactly.
+ */
+ varnos = pull_varnos((Node *) varshere);
+ if (bms_membership(varnos) == BMS_SINGLETON)
+ {
+ // FIXME try to match it to expressions in mvdistinct
stats
+ }
+
/*
* Else add variables to varinfos list
*/
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index f3c7eb96fa..92c2deb1ba 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2671,6 +2671,7 @@ describeOneTableDetails(const char *schemaname,
/* print any extended statistics */
if (pset.sversion >= 100000)
{
+ /* FIXME improve this with printing expressions the
statistics is defined on */
printfPQExpBuffer(&buf,
"SELECT oid, "
"stxrelid::pg_catalog.regclass, "
diff --git a/src/include/catalog/pg_statistic_ext.h
b/src/include/catalog/pg_statistic_ext.h
index e9491a0a87..dd0f41cd14 100644
--- a/src/include/catalog/pg_statistic_ext.h
+++ b/src/include/catalog/pg_statistic_ext.h
@@ -52,6 +52,9 @@ CATALOG(pg_statistic_ext,3381,StatisticExtRelationId)
#ifdef CATALOG_VARLEN
char stxkind[1] BKI_FORCE_NOT_NULL; /* statistics kinds
requested
* to build */
+ pg_node_tree stxexprs; /* expression trees for stats
attributes that
+ * are not
simple column references; one for
+ * each zero
entry in stxkeys[] */
#endif
} FormData_pg_statistic_ext;
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index baced7eec0..72f6534ceb 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -448,6 +448,7 @@ typedef enum NodeTag
T_TypeName,
T_ColumnDef,
T_IndexElem,
+ T_StatsElem,
T_Constraint,
T_DefElem,
T_RangeTblEntry,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index cdfa0568f7..def7e4fe3f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2808,8 +2808,24 @@ typedef struct CreateStatsStmt
List *relations; /* rels to build stats on (list of
RangeVar) */
char *stxcomment; /* comment to apply to stats, or NULL */
bool if_not_exists; /* do nothing if stats name already
exists */
+ bool transformed; /* true when transformIndexStmt is
finished */
} CreateStatsStmt;
+/*
+ * StatsElem - statistics parameters (used in CREATE STATISTICS)
+ *
+ * For a plain attribute, 'name' is the name of the referenced table column
+ * and 'expr' is NULL. For an expression, 'name' is NULL and 'expr' is the
+ * expression tree.
+ */
+typedef struct StatsElem
+{
+ NodeTag type;
+ char *name; /* name of attribute to index,
or NULL */
+ Node *expr; /* expression to index, or NULL
*/
+} StatsElem;
+
+
/* ----------------------
* Alter Statistics Statement
* ----------------------
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 3d3be197e0..f3ca603570 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -885,6 +885,7 @@ typedef struct StatisticExtInfo
RelOptInfo *rel; /* back-link to statistic's
table */
char kind; /* statistic kind of this entry
*/
Bitmapset *keys; /* attnums of the columns
covered */
+ List *exprs; /* expressions */
} StatisticExtInfo;
/*
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index d25819aa28..82e5190964 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -69,6 +69,7 @@ typedef enum ParseExprKind
EXPR_KIND_FUNCTION_DEFAULT, /* default parameter value for function */
EXPR_KIND_INDEX_EXPRESSION, /* index expression */
EXPR_KIND_INDEX_PREDICATE, /* index predicate */
+ EXPR_KIND_STATS_EXPRESSION, /* extended statistics expression */
EXPR_KIND_ALTER_COL_TRANSFORM, /* transform expr in ALTER COLUMN TYPE
*/
EXPR_KIND_EXECUTE_PARAMETER, /* parameter value in EXECUTE */
EXPR_KIND_TRIGGER_WHEN, /* WHEN condition in CREATE TRIGGER */
diff --git a/src/include/parser/parse_utilcmd.h
b/src/include/parser/parse_utilcmd.h
index eb73acdbd3..ca94cbd542 100644
--- a/src/include/parser/parse_utilcmd.h
+++ b/src/include/parser/parse_utilcmd.h
@@ -24,6 +24,8 @@ extern List *transformAlterTableStmt(Oid relid,
AlterTableStmt *stmt,
const
char *queryString);
extern IndexStmt *transformIndexStmt(Oid relid, IndexStmt *stmt,
const
char *queryString);
+extern CreateStatsStmt *transformStatsStmt(Oid relid, CreateStatsStmt *stmt,
+ const
char *queryString);
extern void transformRuleStmt(RuleStmt *stmt, const char *queryString,
List **actions, Node
**whereClause);
extern List *transformCreateSchemaStmt(CreateSchemaStmt *stmt);
diff --git a/src/include/statistics/extended_stats_internal.h
b/src/include/statistics/extended_stats_internal.h
index 23217497bb..96a54f8487 100644
--- a/src/include/statistics/extended_stats_internal.h
+++ b/src/include/statistics/extended_stats_internal.h
@@ -59,17 +59,23 @@ typedef struct SortItem
extern MVNDistinct *statext_ndistinct_build(double totalrows,
int numrows, HeapTuple *rows,
-
Bitmapset *attrs, VacAttrStats **stats);
+
Datum *exprvals, bool *exprnulls,
+
Bitmapset *attrs, List *exprs,
+
VacAttrStats **stats);
extern bytea *statext_ndistinct_serialize(MVNDistinct *ndistinct);
extern MVNDistinct *statext_ndistinct_deserialize(bytea *data);
extern MVDependencies *statext_dependencies_build(int numrows, HeapTuple *rows,
-
Bitmapset *attrs, VacAttrStats **stats);
+
Datum *exprvals, bool *exprnulls,
+
Bitmapset *attrs, List *exprs,
+
VacAttrStats **stats);
extern bytea *statext_dependencies_serialize(MVDependencies *dependencies);
extern MVDependencies *statext_dependencies_deserialize(bytea *data);
extern MCVList *statext_mcv_build(int numrows, HeapTuple *rows,
- Bitmapset
*attrs, VacAttrStats **stats,
+ Datum
*exprvals, bool *exprnulls,
+ Bitmapset
*attrs, List *exprs,
+ VacAttrStats
**stats,
double
totalrows, int stattarget);
extern bytea *statext_mcv_serialize(MCVList *mcv, VacAttrStats **stats);
extern MCVList *statext_mcv_deserialize(bytea *data);
@@ -93,6 +99,7 @@ extern void *bsearch_arg(const void *key, const void *base,
extern AttrNumber *build_attnums_array(Bitmapset *attrs, int *numattrs);
extern SortItem *build_sorted_items(int numrows, int *nitems, HeapTuple *rows,
+ Datum
*exprvals, bool *exprnulls, int nexprs,
TupleDesc tdesc, MultiSortSupport mss,
int
numattrs, AttrNumber *attnums);
--
2.21.0