Re: [PATCH] Add support function for containment operators

2024-01-26 Thread vignesh C
On Sun, 21 Jan 2024 at 00:31, Tom Lane  wrote:
>
> jian he  writes:
> > Now I see your point. If the transformed plan is right, the whole
> > added code should be fine.
> > but keeping the textrange_supp related test should be a good idea.
> > since we don't have SUBTYPE_OPCLASS related sql tests.
>
> Yeah, it's a little harder to make a table-less test for that case.
> I thought about using current_user or the like as a stable comparison
> value, but that introduces some doubt about what the collation would
> be.  That test seems cheap enough as-is, since it's handling only a
> tiny amount of data.
>
> Committed.

I have updated the commitfest entry to Committed as the patch is committed.

Regards,
Vignesh




Re: [PATCH] Add support function for containment operators

2024-01-20 Thread Tom Lane
jian he  writes:
> Now I see your point. If the transformed plan is right, the whole
> added code should be fine.
> but keeping the textrange_supp related test should be a good idea.
> since we don't have SUBTYPE_OPCLASS related sql tests.

Yeah, it's a little harder to make a table-less test for that case.
I thought about using current_user or the like as a stable comparison
value, but that introduces some doubt about what the collation would
be.  That test seems cheap enough as-is, since it's handling only a
tiny amount of data.

Committed.

regards, tom lane




Re: [PATCH] Add support function for containment operators

2024-01-16 Thread jian he
On Wed, Jan 17, 2024 at 5:46 AM Tom Lane  wrote:
>
> But perhaps someone has an argument for a different rule?
>
> Anyway, pending discussion of that point, I think the code is good
> to go.  I don't like the test cases much though: they expend many more
> cycles than necessary.  You could prove the same points just by
> looking at the expansion of expressions, eg.
>

your patch is far better!

IMHO, worried about the support function, the transformed plan
generates the wrong result,
so we add the tests to make it bullet proof.
Now I see your point. If the transformed plan is right, the whole
added code should be fine.
but keeping the textrange_supp related test should be a good idea.
since we don't have SUBTYPE_OPCLASS related sql tests.




Re: [PATCH] Add support function for containment operators

2024-01-16 Thread Tom Lane
jian he  writes:
> [ v5-0001-Simplify-containment-in-range-constants-with-supp.patch ]

I spent some time reviewing and cleaning up this code.  The major
problem I noted was that it doesn't spend any effort thinking about
cases where it'd be unsafe or undesirable to apply the transformation.
In particular, it's entirely uncool to produce a double-sided
comparison if the elemExpr is volatile.  These two expressions
do not have the same result:

select random() <@ float8range(0.1, 0.2);
select random() >= 0.1 and random() < 0.2;

(Yes, I'm aware that BETWEEN is broken in this respect.  All the
more reason why we mustn't break <@.)

Another problem is that even if the elemExpr isn't volatile,
it might be expensive enough that evaluating it twice is bad.
I am not sure where we ought to put the cutoff.  There are some
existing places where we set a 10X cpu_operator_cost limit on
comparable transformations, so I stole that logic in the attached.
But perhaps someone has an argument for a different rule?

Anyway, pending discussion of that point, I think the code is good
to go.  I don't like the test cases much though: they expend many more
cycles than necessary.  You could prove the same points just by
looking at the expansion of expressions, eg.

regression=# explain (verbose, costs off) select current_date <@ 
daterange(null,null);
   QUERY PLAN   

 Result
   Output: true
(2 rows)

regression=# explain (verbose, costs off) select current_date <@ 
daterange('-Infinity', '1997-04-10'::date, '[)');
   QUERY PLAN   
 
-
 Result
   Output: ((CURRENT_DATE >= '-infinity'::date) AND (CURRENT_DATE < 
'1997-04-10'::date))
(2 rows)

I'd suggest losing the temp table and just coding tests like these.

regards, tom lane

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d99b00b590..2d94a6b877 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -30,19 +30,20 @@
  */
 #include "postgres.h"
 
-#include "access/tupmacs.h"
 #include "common/hashfn.h"
-#include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/makefuncs.h"
 #include "nodes/miscnodes.h"
-#include "port/pg_bitutils.h"
+#include "nodes/supportnodes.h"
+#include "optimizer/clauses.h"
+#include "optimizer/cost.h"
+#include "optimizer/optimizer.h"
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
-#include "varatt.h"
 
 
 /* fn_extra cache entry for one of the range I/O functions */
@@ -69,6 +70,12 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval,
 			   char typalign, int16 typlen, char typstorage);
 static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval,
 		   char typalign, int16 typlen, char typstorage);
+static Node *find_simplified_clause(PlannerInfo *root,
+	Expr *rangeExpr, Expr *elemExpr);
+static Expr *build_bound_expr(Expr *elemExpr, Datum val,
+			  bool isLowerBound, bool isInclusive,
+			  TypeCacheEntry *typeCache,
+			  Oid opfamily, Oid rng_collation);
 
 
 /*
@@ -2173,6 +2180,58 @@ make_empty_range(TypeCacheEntry *typcache)
 	return make_range(typcache, &lower, &upper, true, NULL);
 }
 
+/*
+ * Planner support function for elem_contained_by_range (<@ operator).
+ */
+Datum
+elem_contained_by_range_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = NULL;
+
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr   *fexpr = req->fcall;
+		Expr	   *leftop,
+   *rightop;
+
+		Assert(list_length(fexpr->args) == 2);
+		leftop = linitial(fexpr->args);
+		rightop = lsecond(fexpr->args);
+
+		ret = find_simplified_clause(req->root, rightop, leftop);
+	}
+
+	PG_RETURN_POINTER(ret);
+}
+
+/*
+ * Planner support function for range_contains_elem (@> operator).
+ */
+Datum
+range_contains_elem_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = NULL;
+
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr   *fexpr = req->fcall;
+		Expr	   *leftop,
+   *rightop;
+
+		Assert(list_length(fexpr->args) == 2);
+		leftop = linitial(fexpr->args);
+		rightop = lsecond(fexpr->args);
+
+		ret = find_simplified_clause(req->root, leftop, rightop);
+	}
+
+	PG_RETURN_POINTER(ret);
+}
+
 
 /*
  *--
@@ -2715,3 +2774,180 @@ datum_write(Pointer ptr, Datum datum, bool typbyval, char typalign,
 
 	return ptr;
 }
+
+/*
+ * Common code for the elem_contained_by_range and range_contains_elem
+ * support functions.  The ca

Re: [PATCH] Add support function for containment operators

2023-12-16 Thread jian he
fix a typo and also did a minor change.

from
+ if (lowerExpr != NULL && upperExpr != NULL)
+ return (Node *) makeBoolExpr(AND_EXPR, list_make2(lowerExpr, upperExpr),
-1);
+ else if (lowerExpr != NULL)
+ return (Node *) lowerExpr;
+ else if (upperExpr != NULL)
+ return (Node *) upperExpr;

to

+ if (lowerExpr != NULL && upperExpr != NULL)
+ return (Node *) makeBoolExpr(AND_EXPR, list_make2(lowerExpr, upperExpr),
-1);
+ else if (lowerExpr != NULL)
+ return (Node *) lowerExpr;
+ else if (upperExpr != NULL)
+ return (Node *) upperExpr;
+ else
+ {
+ Assert(false);
+ return NULL;
+ }

because cfbot says:

15:04:38.116] make -s -j${BUILD_JOBS} clean
[15:04:38.510] time make -s -j${BUILD_JOBS} world-bin
[15:04:43.272] rangetypes.c:2908:1: error: non-void function does not
return a value in all control paths [-Werror,-Wreturn-type]
[15:04:43.272] }
[15:04:43.272] ^
[15:04:43.272] 1 error generated.

also add some commit messages, I hope it will be useful.
From 334566a62163469f53cfd941ed87e6d6e54d756b Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Mon, 11 Dec 2023 19:14:01 +0800
Subject: [PATCH v5 1/1] Simplify containment in range constants with support function

doc:
https://www.postgresql.org/docs/current/extend-type-system.html#EXTEND-TYPES-POLYMORPHIC
Similarly, if there are positions declared anyrange and others declared anyelement or anyarray,
the actual range type in the anyrange positions must be a range 
whose subtype is the same type appearing in the anyelement positions
and the same as the element type of the anyarray positions.

proname => 'range_contains_elem', prorettype => 'bool', proargtypes => 'anyrange anyelement',
proname => 'elem_contained_by_range', prorettype => 'bool', proargtypes => 'anyelement anyrange'

It will work, because the range constant base type will be the same as the element.
so we can safely rewrite "element within a range or not" 
to element comparison with range lower bound 
and (&&) element comparison with range upper bound.
and these simple same data type comparisons can be supported by btree index.

---
 src/backend/utils/adt/rangetypes.c  | 196 
 src/backend/utils/adt/rangetypes_selfuncs.c |   6 +-
 src/include/catalog/pg_proc.dat |  12 +-
 src/test/regress/expected/rangetypes.out| 193 +++
 src/test/regress/sql/rangetypes.sql |  92 +
 5 files changed, 494 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index 24bad529..0b3eab08 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -31,16 +31,24 @@
 #include "postgres.h"
 
 #include "access/tupmacs.h"
+#include "access/stratnum.h"
+#include "catalog/pg_range.h"
 #include "common/hashfn.h"
 #include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "nodes/miscnodes.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/pg_list.h"
+#include "nodes/supportnodes.h"
 #include "port/pg_bitutils.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/date.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
+#include "utils/syscache.h"
 #include "utils/timestamp.h"
 #include "varatt.h"
 
@@ -69,6 +77,10 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval,
 			   char typalign, int16 typlen, char typstorage);
 static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval,
 		   char typalign, int16 typlen, char typstorage);
+static Expr *build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache,
+			  bool isLowerBound, bool isInclusive,
+			  Datum val, Expr *otherExpr, Oid rng_collation);
+static Node *find_simplified_clause(Const *rangeConst, Expr *otherExpr);
 
 
 /*
@@ -2173,6 +2185,57 @@ make_empty_range(TypeCacheEntry *typcache)
 	return make_range(typcache, &lower, &upper, true, NULL);
 }
 
+/*
+ * Planner support function for the elem_contained_by_range (<@) operator.
+ */
+Datum
+elem_contained_by_range_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0),
+			   *leftop,
+			   *rightop;
+	FuncExpr   *fexpr;
+
+	if (!IsA(rawreq, SupportRequestSimplify))
+		PG_RETURN_POINTER(NULL);
+
+	fexpr = ((SupportRequestSimplify *) rawreq)->fcall;
+
+	Assert(list_length(fexpr->args) == 2);
+	leftop = linitial(fexpr->args);
+	rightop = lsecond(fexpr->args);
+
+	if (!IsA(rightop, Const) || ((Const *) rightop)->constisnull)
+		PG_RETURN_POINTER(NULL);
+
+	PG_RETURN_POINTER(find_simplified_clause((Const *) rightop, (Expr *) leftop));
+}
+
+/*
+ * Planner support function for the range_contains_elem (@>) operator.
+ */
+Datum
+range_contains_elem_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0),
+			   *leftop,
+			   *rightop;
+	FuncExpr   *fexpr;
+
+	if (!IsA(rawreq, SupportRequestSimplify))
+		PG_RETURN_POINTER(NULL);
+
+	fexpr = ((SupportRequestSimplify *) rawreq)->fc

Re: [PATCH] Add support function for containment operators

2023-11-12 Thread Kim Johan Andersson

On 12-11-2023 20:20, Laurenz Albe wrote:

On Sun, 2023-11-12 at 18:15 +0100, Laurenz Albe wrote:

I adjusted your patch according to my comments; what do you think?


I have added the patch to the January commitfest, with Jian and Kim as authors.
I hope that is OK with you.


Sounds great to me. Thanks to Jian for picking this up.

Regards,
Kim Johan Andersson






Re: [PATCH] Add support function for containment operators

2023-11-12 Thread Laurenz Albe
On Sun, 2023-11-12 at 18:15 +0100, Laurenz Albe wrote:
> I adjusted your patch according to my comments; what do you think?

I have added the patch to the January commitfest, with Jian and Kim as authors.
I hope that is OK with you.

Yours,
Laurenz Albe




Re: [PATCH] Add support function for containment operators

2023-11-12 Thread Laurenz Albe
On Fri, 2023-10-20 at 16:24 +0800, jian he wrote:
> [new patch]

Thanks, that patch works as expected and passes regression tests.

Some comments about the code:

> --- a/src/backend/utils/adt/rangetypes.c
> +++ b/src/backend/utils/adt/rangetypes.c
> @@ -558,7 +570,6 @@ elem_contained_by_range(PG_FUNCTION_ARGS)
>   PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val));
>  }
>  
> -
>  /* range, range -> bool functions */
>  
>  /* equality (internal version) */

Please don't change unrelated whitespace.

> +static Node *
> +find_simplified_clause(Const *rangeConst, Expr *otherExpr)
> +{
> + Form_pg_range pg_range;
> + HeapTuple   tup;
> + Oid opclassOid;
> + RangeBound  lower;
> + RangeBound  upper;
> + boolempty;
> + Oid rng_collation;
> + TypeCacheEntry *elemTypcache;
> + Oid opfamily =  InvalidOid;
> +
> + RangeType  *range = DatumGetRangeTypeP(rangeConst->constvalue);
> + TypeCacheEntry *rangetypcache = 
> lookup_type_cache(RangeTypeGetOid(range), TYPECACHE_RANGE_INFO);
> + {

This brace is unnecessary.  Perhaps a leftover from a removed conditional 
statement.

> + /* this part is get the range's SUBTYPE_OPCLASS from pg_range 
> catalog.
> +  * Refer load_rangetype_info function last line.
> +  * TypeCacheEntry->rngelemtype typcaheenetry either don't have 
> opclass entry or with default opclass.
> +  * Range's subtype opclass only in catalog table.
> + */

The comments in the patch need some more love.
Apart from the language, you should have a look at the style guide:

- single-line comments should start with lower case and have no period:

  /* example of a single-line comment */

- Multi-line comments should start with /* on its own line and end with */ on 
its
  own line.  They should use whole sentences:

  /*
   * In case a comment spans several lines, it should look like
   * this.  Try not to exceed 80 characters.
   */

> + tup = SearchSysCache1(RANGETYPE, 
> ObjectIdGetDatum(RangeTypeGetOid(range)));
> +
> + /* should not fail, since we already checked typtype ... */
> + if (!HeapTupleIsValid(tup))
> + elog(ERROR, "cache lookup failed for range type %u", 
> RangeTypeGetOid(range));

If this is a "can't happen" case, it should be an Assert.

> +
> + pg_range = (Form_pg_range) GETSTRUCT(tup);
> +
> + opclassOid = pg_range->rngsubopc;
> +
> + ReleaseSysCache(tup);
> +
> + /* get opclass properties and look up the comparison function */
> + opfamily = get_opclass_family(opclassOid);
> + }
> +
> + range_deserialize(rangetypcache, range, &lower, &upper, &empty);
> + rng_collation = rangetypcache->rng_collation;
> +
> + if (empty)
> + {
> + /* If the range is empty, then there can be no matches. */
> + return makeBoolConst(false, false);
> + }
> + else if (lower.infinite && upper.infinite)
> + {
> + /* The range has no bounds, so matches everything. */
> + return makeBoolConst(true, false);
> + }
> + else
> + {

Many of the variables declared at the beginning of the function are only used in
this branch.  You should declare them here.

> +static Node *
> +match_support_request(Node *rawreq)
> +{
> + if (IsA(rawreq, SupportRequestSimplify))
> + {

To keep the indentation shallow, the preferred style is:

  if (/* case we don't handle */)
  return NULL;
  /* proceed without indentation */

> + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
> + FuncExpr   *fexpr = req->fcall;
> + Node   *leftop;
> + Node   *rightop;
> + Const  *rangeConst;
> + Expr   *otherExpr;
> +
> + Assert(list_length(fexpr->args) == 2);
> +
> + leftop = linitial(fexpr->args);
> + rightop = lsecond(fexpr->args);
> +
> + switch (fexpr->funcid)
> + {
> + case F_ELEM_CONTAINED_BY_RANGE:
> + if (!IsA(rightop, Const) || ((Const *) 
> rightop)->constisnull)
> + return NULL;
> +
> + rangeConst = (Const *) rightop;
> + otherExpr = (Expr *) leftop;
> + break;
> +
> + case F_RANGE_CONTAINS_ELEM:
> + if (!IsA(leftop, Const) || ((Const *) 
> leftop)->constisnull)
> + return NULL;
> +
> + rangeConst = (Const *) leftop;
> + otherExpr = (Expr *) rightop;
> + break;
> +
> + default:
> +   

Re: [PATCH] Add support function for containment operators

2023-10-20 Thread jian he
On Fri, Oct 20, 2023 at 12:01 AM Laurenz Albe  wrote:
>
> On Fri, 2023-10-13 at 14:26 +0800, jian he wrote:
> > Collation problem seems solved.
>
> I didn't review your patch in detail, there is still a problem
> with my example:
>
>   CREATE TYPE textrange AS RANGE (
>  SUBTYPE = text,
>  SUBTYPE_OPCLASS = text_pattern_ops
>   );
>
>   CREATE TABLE tx (t text COLLATE "cs-CZ-x-icu");
>
>   INSERT INTO tx VALUES ('a'), ('c'), ('d'), ('ch');
>
>   SELECT * FROM tx WHERE t <@ textrange('a', 'd');
>
>t
>   
>a
>c
>ch
>   (3 rows)
>
> That was correct.
>
>   EXPLAIN SELECT * FROM tx WHERE t <@ textrange('a', 'd');
>
>QUERY PLAN
>   
>Seq Scan on tx  (cost=0.00..30.40 rows=7 width=32)
>  Filter: ((t >= 'a'::text) AND (t < 'd'::text))
>   (2 rows)
>
> But that was weird.  The operators seem wrong.  Look at that

Thanks for pointing this out!

The problem is that TypeCacheEntry->rngelemtype typcaheentry don't
have the range's SUBTYPE_OPCLASS info.
So in find_simplified_clause, we need to get the range's
SUBTYPE_OPCLASS from the pg_catalog table.
Also in pg_range, column rngsubopc  is not null. so this should be fine.
From 810208a42e99109bcaf56cddae6968efbcf969c5 Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Fri, 20 Oct 2023 16:20:38 +0800
Subject: [PATCH v3 1/1]  Optimize quals (Expr <@ RangeConst) and (RangeConst
 @> Expr)

Previously these two quals will be processed as is.
This patch will rewritten the expression to expose more info to optimizer by
adding prosupport function to range_contains_elem, elem_contained_by_range.

Expr <@ rangeConst will be rewritten to "expr opr range_lower_bound and expr opr
range_upper_bound". (range bound inclusiveness will be handled properly).

Added some tests to validate the generated plan.
---
 src/backend/utils/adt/rangetypes.c  | 229 +++-
 src/backend/utils/adt/rangetypes_selfuncs.c |   6 +-
 src/include/catalog/pg_proc.dat |  12 +-
 src/test/regress/expected/rangetypes.out| 194 +
 src/test/regress/sql/rangetypes.sql | 101 +
 5 files changed, 535 insertions(+), 7 deletions(-)

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d65e5625..d100e55e 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -31,16 +31,24 @@
 #include "postgres.h"
 
 #include "access/tupmacs.h"
+#include "access/stratnum.h"
+#include "catalog/pg_range.h"
 #include "common/hashfn.h"
 #include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "nodes/miscnodes.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/pg_list.h"
+#include "nodes/supportnodes.h"
 #include "port/pg_bitutils.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/date.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
+#include "utils/syscache.h"
 #include "utils/timestamp.h"
 #include "varatt.h"
 
@@ -69,7 +77,11 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval,
 			   char typalign, int16 typlen, char typstorage);
 static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval,
 		   char typalign, int16 typlen, char typstorage);
-
+static Expr *build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache,
+			  bool isLowerBound, bool isInclusive,
+			  Datum val, Expr *otherExpr, Oid rng_collation);
+static Node *find_simplified_clause(Const *rangeConst, Expr *otherExpr);
+static Node *match_support_request(Node *rawreq);
 
 /*
  *--
@@ -558,7 +570,6 @@ elem_contained_by_range(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val));
 }
 
-
 /* range, range -> bool functions */
 
 /* equality (internal version) */
@@ -2173,6 +2184,29 @@ make_empty_range(TypeCacheEntry *typcache)
 	return make_range(typcache, &lower, &upper, true, NULL);
 }
 
+/*
+ * Planner support function for elem_contained_by_range operator
+ */
+Datum
+elem_contained_by_range_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = match_support_request(rawreq);
+
+	PG_RETURN_POINTER(ret);
+}
+
+/*
+ * Planner support function for range_contains_elem operator
+ */
+Datum
+range_contains_elem_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = match_support_request(rawreq);
+
+	PG_RETURN_POINTER(ret);
+}
 
 /*
  *--
@@ -2714,3 +2748,194 @@ datum_write(Pointer ptr, Datum datum, bool typbyval, char typalign,
 
 	return ptr;
 }
+/*
+ * build (Expr Operator Range_bound) expression. something like: expr >= lower(range)
+ * if operator both sides have collation, operator should use (right) range's collation
+ *
+*/
+static Expr *
+build_bound_expr(O

Re: [PATCH] Add support function for containment operators

2023-10-19 Thread Laurenz Albe
On Fri, 2023-10-13 at 14:26 +0800, jian he wrote:
> Collation problem seems solved.

I didn't review your patch in detail, there is still a problem
with my example:

  CREATE TYPE textrange AS RANGE (
 SUBTYPE = text,
 SUBTYPE_OPCLASS = text_pattern_ops
  );

  CREATE TABLE tx (t text COLLATE "cs-CZ-x-icu");

  INSERT INTO tx VALUES ('a'), ('c'), ('d'), ('ch');

  SELECT * FROM tx WHERE t <@ textrange('a', 'd');

   t  
  
   a
   c
   ch
  (3 rows)

That was correct.

  EXPLAIN SELECT * FROM tx WHERE t <@ textrange('a', 'd');

   QUERY PLAN 
  
   Seq Scan on tx  (cost=0.00..30.40 rows=7 width=32)
 Filter: ((t >= 'a'::text) AND (t < 'd'::text))
  (2 rows)

But that was weird.  The operators seem wrong.  Look at that
query:

  SELECT * FROM tx WHERE t >= 'a' AND t < 'd';

   t 
  ═══
   a
   c
  (2 rows)

But the execution plan is identical...

I am not sure what is the problem here, but in my opinion the
operators shown in the execution plan should be like this:

  SELECT * FROM tx WHERE t ~>=~ 'a' AND t ~<~ 'd';

   t  
  
   a
   c
   ch
  (3 rows)

Yours,
Laurenz Albe




Re: [PATCH] Add support function for containment operators

2023-10-12 Thread jian he
On Tue, Aug 1, 2023 at 10:07 AM Laurenz Albe  wrote:
> >
> >
> > > I had an idea about this:
> > > So far, you only consider constant ranges.  But if we have a STABLE range
> > > expression, you could use an index scan for "expr <@ range", for example
> > > Index Cond (expr >= lower(range) AND expr < upper(range)).
> > >

The above part, not sure how to implement it, not sure it is doable.

Refactor:
drop SupportRequestIndexCondition and related code, since mentioned in
upthread, it's dead code.
refactor the regression test. (since data types with infinity cover
more cases than int4range, so I deleted some tests).

now there are 3 helper functions (build_bound_expr,
find_simplified_clause, match_support_request), 2 entry functions
(elem_contained_by_range_support, range_contains_elem_support)

Collation problem seems solved. Putting the following test on the
src/test/regress/sql/rangetypes.sql will not work. Maybe because of
the order of the regression test, in SQL-ASCII encoding, I cannot
create collation="cs-CZ-x-icu".

drop type if EXISTS textrange1, textrange2;
drop table if EXISTS collate_test1, collate_test2;
CREATE TYPE textrange1 AS RANGE (SUBTYPE = text, collation="C");
create type textrange2 as range(subtype=text, collation="cs-CZ-x-icu");
CREATE TABLE collate_test1 (b text COLLATE "en-x-icu" NOT NULL);
INSERT INTO collate_test1(b) VALUES ('a'), ('c'), ('d'), ('ch');
CREATE TABLE collate_test2 (b text NOT NULL);
INSERT INTO collate_test2(b) VALUES ('a'), ('c'), ('d'), ('ch');

--should include 'ch'
SELECT * FROM collate_test1 WHERE b <@ textrange1('a', 'd');
--should not include 'ch'
SELECT * FROM collate_test1 WHERE b <@ textrange2('a', 'd');
--should include 'ch'
SELECT * FROM collate_test2 WHERE b <@ textrange1('a', 'd');
--should not include 'ch'
SELECT * FROM collate_test2 WHERE b <@ textrange2('a', 'd');
-
From bcae7f8a0640b48b04f243660539e8670de43d41 Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Fri, 13 Oct 2023 12:43:41 +0800
Subject: [PATCH v2 1/1] Optimize qual cases like Expr <@ RangeConst and
 RangeConst @> Expr

Previously these two quals will be processed as is.
With this patch, adding prosupport
function to range_contains_elem, elem_contained_by_range.

So cases like Expr <@ rangeCOnst, rangeConst @> expr
will be rewritten to "expr opr range_lower_bound and expr opr
range_upper_bound".

Expr <@ rangeConst will be logically equivalent to rangeConst @> Expr,
if rangeConst is the same. added some tests to validate the generated
plan.
---
 src/backend/utils/adt/rangetypes.c  | 199 +++-
 src/backend/utils/adt/rangetypes_selfuncs.c |   6 +-
 src/include/catalog/pg_proc.dat |  12 +-
 src/test/regress/expected/rangetypes.out| 194 +++
 src/test/regress/sql/rangetypes.sql | 101 ++
 5 files changed, 505 insertions(+), 7 deletions(-)

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d65e5625..647bd5e4 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -31,13 +31,19 @@
 #include "postgres.h"
 
 #include "access/tupmacs.h"
+#include "access/stratnum.h"
 #include "common/hashfn.h"
 #include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/supportnodes.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/pg_list.h"
 #include "nodes/miscnodes.h"
 #include "port/pg_bitutils.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/date.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
@@ -69,7 +75,11 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval,
 			   char typalign, int16 typlen, char typstorage);
 static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval,
 		   char typalign, int16 typlen, char typstorage);
-
+static Expr *build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache,
+			  bool isLowerBound, bool isInclusive,
+			  Datum val, Expr *otherExpr, Oid rng_collation);
+static Node *find_simplified_clause(Const *rangeConst, Expr *otherExpr);
+static Node *match_support_request(Node *rawreq);
 
 /*
  *--
@@ -558,7 +568,6 @@ elem_contained_by_range(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val));
 }
 
-
 /* range, range -> bool functions */
 
 /* equality (internal version) */
@@ -2173,6 +2182,29 @@ make_empty_range(TypeCacheEntry *typcache)
 	return make_range(typcache, &lower, &upper, true, NULL);
 }
 
+/*
+ * Planner support function for elem_contained_by_range operator
+ */
+Datum
+elem_contained_by_range_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = match_support_request(rawreq);
+
+	PG_RETURN_POINTER(ret);
+}
+
+/*
+ * Planner support function for range_contains_elem operator
+ */
+Datum
+range_contains_elem_support(PG_FUNCT

Re: [PATCH] Add support function for containment operators

2023-07-31 Thread Laurenz Albe
On Sat, 2023-07-08 at 08:11 +0200, Kim Johan Andersson wrote:
> On 07-07-2023 13:20, Laurenz Albe wrote:
> > I wrote:
> > > You implement both "SupportRequestIndexCondition" and 
> > > "SupportRequestSimplify",
> > > but when I experimented, the former was never called.  That does not
> > > surprise me, since any expression of the shape "expr <@ range constant"
> > > can be simplified.  Is the "SupportRequestIndexCondition" branch dead 
> > > code?
> > > If not, do you have an example that triggers it?
> 
> I would think it is dead code, I came to the same conclusion. So we can 
> drop SupportRequestIndexCondition, since the simplification happens to 
> take care of everything.
> 
> 
> > I had an idea about this:
> > So far, you only consider constant ranges.  But if we have a STABLE range
> > expression, you could use an index scan for "expr <@ range", for example
> > Index Cond (expr >= lower(range) AND expr < upper(range)).
> > 
> 
> I will try to look into this. Originally that was what I was hoping for, 
> but didn't see way of going about it.
> 
> Thanks for your comments, I will also look at the locale-related 
> breakage you spotted.

I have marked the patch as "returned with feedback".

I encourage you to submit an improved version in a future commitfest!

Yours,
Laurenz Albe




Re: [PATCH] Add support function for containment operators

2023-07-07 Thread Kim Johan Andersson

On 07-07-2023 13:20, Laurenz Albe wrote:

I wrote:

You implement both "SupportRequestIndexCondition" and "SupportRequestSimplify",
but when I experimented, the former was never called.  That does not
surprise me, since any expression of the shape "expr <@ range constant"
can be simplified.  Is the "SupportRequestIndexCondition" branch dead code?
If not, do you have an example that triggers it?


I would think it is dead code, I came to the same conclusion. So we can 
drop SupportRequestIndexCondition, since the simplification happens to 
take care of everything.




I had an idea about this:
So far, you only consider constant ranges.  But if we have a STABLE range
expression, you could use an index scan for "expr <@ range", for example
Index Cond (expr >= lower(range) AND expr < upper(range)).



I will try to look into this. Originally that was what I was hoping for, 
but didn't see way of going about it.


Thanks for your comments, I will also look at the locale-related 
breakage you spotted.


Regards,
Kimjand




Re: [PATCH] Add support function for containment operators

2023-07-07 Thread Laurenz Albe
I wrote:
> You implement both "SupportRequestIndexCondition" and 
> "SupportRequestSimplify",
> but when I experimented, the former was never called.  That does not
> surprise me, since any expression of the shape "expr <@ range constant"
> can be simplified.  Is the "SupportRequestIndexCondition" branch dead code?
> If not, do you have an example that triggers it?

I had an idea about this:
So far, you only consider constant ranges.  But if we have a STABLE range
expression, you could use an index scan for "expr <@ range", for example
Index Cond (expr >= lower(range) AND expr < upper(range)).

Yours,
Laurenz Albe




Re: [PATCH] Add support function for containment operators

2023-07-06 Thread Laurenz Albe
> On Sat, 2023-04-29 at 17:07 +0200, Kim Johan Andersson wrote:
> > I had noticed that performance wasn't great when using the @> or <@ 
> > operators when examining if an element is contained in a range.
> > Based on the discussion in [1] I would like to suggest the following 
> > changes:
> > 
> > This patch attempts to improve the row estimation, as well as opening 
> > the possibility of using a btree index scan when using the containment 
> > operators.

I managed to break the patch:

CREATE DATABASE czech ICU_LOCALE "cs-CZ" LOCALE "cs_CZ.utf8" TEMPLATE template0;

\c czech

CREATE TYPE textrange AS RANGE (SUBTYPE = text, SUBTYPE_OPCLASS = 
text_pattern_ops);

CREATE TABLE tx (t text);

INSERT INTO tx VALUES ('a'), ('c'), ('d'), ('ch');

EXPLAIN SELECT * FROM tx WHERE t <@ textrange('a', 'd');

 QUERY PLAN 

 Seq Scan on tx  (cost=0.00..30.40 rows=7 width=32)
   Filter: ((t >= 'a'::text) AND (t < 'd'::text))
(2 rows)

SELECT * FROM tx WHERE t <@ textrange('a', 'd');
ERROR:  could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.


The replacement operators are wrong; it should be ~>=~ and ~<~ .

Also, there should be no error message.
The result should be 'a', 'c' and 'ch'.

Yours,
Laurenz Albe




Re: [PATCH] Add support function for containment operators

2023-07-06 Thread Laurenz Albe
On Sat, 2023-04-29 at 17:07 +0200, Kim Johan Andersson wrote:
> I had noticed that performance wasn't great when using the @> or <@ 
> operators when examining if an element is contained in a range.
> Based on the discussion in [1] I would like to suggest the following 
> changes:
> 
> This patch attempts to improve the row estimation, as well as opening 
> the possibility of using a btree index scan when using the containment 
> operators.
> 
> This is done via a new support function handling the following 2 requests:
> 
> * SupportRequestIndexCondition
> find_index_quals will build an operator clause, given at least one 
> finite RangeBound.
> 
> * SupportRequestSimplify
> find_simplified_clause will rewrite the containment operator into a 
> clause using inequality operators from the btree family (if available 
> for the element type).
> 
> A boolean constant is returned if the range is either empty or has no 
> bounds.
> 
> Performing the rewrite here lets the clausesel machinery provide the 
> same estimates as for normal scalar inequalities.
> 
> In both cases build_bound_expr is used to build the operator clauses 
> from RangeBounds.

I think that this is a small, but useful improvement.

The patch applies and builds without warning and passes "make 
installcheck-world"
with the (ample) new regression tests.

Some comments:

- About the regression tests:
  You are using EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF, COSTS OFF).
  While that returns stable results, I don't see the added value.
  I think you should use EXPLAIN (COSTS OFF).  You don't need to test the
  actual number of result rows; we can trust that the executor  processes
  >= and < correctly.
  Plain EXPLAIN would speed up the regression tests, which is a good thing.

- About the implementation:
  You implement both "SupportRequestIndexCondition" and 
"SupportRequestSimplify",
  but when I experimented, the former was never called.  That does not
  surprise me, since any expression of the shape "expr <@ range constant"
  can be simplified.  Is the "SupportRequestIndexCondition" branch dead code?
  If not, do you have an example that triggers it?

- About the code:

  +static Node *
  +find_index_quals(Const *rangeConst, Expr *otherExpr, Oid opfamily)
  +{
  [...]
  +
  +   if (!(lower.infinite && upper.infinite))
  +   {
 [...]
  +   }
  +
  +   return NULL;

  To avoid deep indentation and to make the code more readable, I think
  it would be better to write

  if (!(lower.infinite && upper.infinite))
  return NULL;

  and unindent the rest of the code


  +static Node *
  +match_support_request(Node *rawreq)
  +{
  [...]
  +   switch (req->funcid)
  +   {
  +   case F_ELEM_CONTAINED_BY_RANGE:
  [...]
  +   case F_RANGE_CONTAINS_ELEM:
  [...]
  +   default:
  +   return NULL;
  +   }

  (This code appears twice.)

  The default clause should not be reachable, right?
  I think that there should be an Assert() to verify that.
  Perhaps something like

  Assert(req->funcid == F_ELEM_CONTAINED_BY_RANGE ||
 req->funcid == F_RANGE_CONTAINS_ELEM);

  if (req->funcid == F_ELEM_CONTAINED_BY_RANGE)
  {
  [...]
  }
  else if (req->funcid == F_RANGE_CONTAINS_ELEM)
  {
  [...]
  }

Yours,
Laurenz Albe




Re: [PATCH] Add support function for containment operators

2023-07-06 Thread Kim Johan Andersson
Any thoughts on this change?

[PATCH] Add support function for containment operators

2023-04-29 Thread Kim Johan Andersson
I had noticed that performance wasn't great when using the @> or <@ 
operators when examining if an element is contained in a range.
Based on the discussion in [1] I would like to suggest the following 
changes:


This patch attempts to improve the row estimation, as well as opening 
the possibility of using a btree index scan when using the containment 
operators.


This is done via a new support function handling the following 2 requests:

* SupportRequestIndexCondition
find_index_quals will build an operator clause, given at least one 
finite RangeBound.


* SupportRequestSimplify
find_simplified_clause will rewrite the containment operator into a 
clause using inequality operators from the btree family (if available 
for the element type).


A boolean constant is returned if the range is either empty or has no 
bounds.


Performing the rewrite here lets the clausesel machinery provide the 
same estimates as for normal scalar inequalities.


In both cases build_bound_expr is used to build the operator clauses 
from RangeBounds.


Thanks to Laurenz Albe for giving the patch a look before submission.

[1] 
https://www.postgresql.org/message-id/222c75fd-43b8-db3e-74a6-bb4fe22f7...@kimmet.dk


Regards,
Kim Johan Anderssondiff --git a/src/backend/utils/adt/rangetypes.c 
b/src/backend/utils/adt/rangetypes.c
index d65e5625c7..f20ca76d87 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -31,13 +31,19 @@
 #include "postgres.h"
 
 #include "access/tupmacs.h"
+#include "access/stratnum.h"
 #include "common/hashfn.h"
 #include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/supportnodes.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/pg_list.h"
 #include "nodes/miscnodes.h"
 #include "port/pg_bitutils.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/date.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
@@ -69,7 +75,13 @@ static Size datum_compute_size(Size data_length, Datum val, 
bool typbyval,
   char typalign, int16 
typlen, char typstorage);
 static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval,
   char typalign, int16 typlen, 
char typstorage);
-
+static Expr *build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache,
+ bool isLowerBound, 
bool isInclusive,
+ Datum val, Expr 
*otherExpr);
+static Node *find_index_quals(Const *rangeConst, Expr *otherExpr,
+ Oid opfamily);
+static Node *find_simplified_clause(Const *rangeConst, Expr *otherExpr);
+static Node *match_support_request(Node *rawreq);
 
 /*
  *--
@@ -558,7 +570,6 @@ elem_contained_by_range(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val));
 }
 
-
 /* range, range -> bool functions */
 
 /* equality (internal version) */
@@ -2173,6 +2184,29 @@ make_empty_range(TypeCacheEntry *typcache)
return make_range(typcache, &lower, &upper, true, NULL);
 }
 
+/*
+ * Planner support function for elem_contained_by_range operator
+ */
+Datum
+elem_contained_by_range_support(PG_FUNCTION_ARGS)
+{
+   Node   *rawreq = (Node *) PG_GETARG_POINTER(0);
+   Node   *ret = match_support_request(rawreq);
+
+   PG_RETURN_POINTER(ret);
+}
+
+/*
+ * Planner support function for range_contains_elem operator
+ */
+Datum
+range_contains_elem_support(PG_FUNCTION_ARGS)
+{
+   Node   *rawreq = (Node *) PG_GETARG_POINTER(0);
+   Node   *ret = match_support_request(rawreq);
+
+   PG_RETURN_POINTER(ret);
+}
 
 /*
  *--
@@ -2714,3 +2748,277 @@ datum_write(Pointer ptr, Datum datum, bool typbyval, 
char typalign,
 
return ptr;
 }
+
+static Expr *
+build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache, bool isLowerBound, 
bool isInclusive, Datum val, Expr *otherExpr)
+{
+   Oid elemType = typeCache->type_id;
+   int16   elemTypeLen = typeCache->typlen;
+   boolelemByValue = typeCache->typbyval;
+   Oid elemCollation = typeCache->typcollation;
+   int16   strategy;
+   Oid oproid;
+   Expr   *constExpr;
+
+   if (isLowerBound)
+   strategy = isInclusive ? BTGreaterEqualStrategyNumber : 
BTGreaterStrategyNumber;
+   else
+   strategy = isInclusive ? BTLessEqualStrategyNumber : 
BTLessStrategyNumber;
+
+   oproid = get_opfamily_member(opfamily, elemType, elemType, strategy);
+
+   if (!OidIsValid(oproid))
+   return NULL;
+
+   constExpr = (Expr *) makeConst(