Re: Prefix operator for text and spgist support

2018-04-16 Thread Ildus Kurbangaliev
On Mon, 16 Apr 2018 12:45:23 +0200
Emre Hasegeli  wrote:

> > Thank you, pushed with some editorization and renaming
> > text_startswith to starts_with  
> 
> I am sorry for not noticing this before, but what is the point of this
> operator?  It seems to me we are only making the prefix searching
> business, which is already complicated, more complicated.

Hi.

> 
> Also, the new operator is not documented on SQL String Functions and
> Operators table.  It is not supported by btree text_pattern_ops or
> btree indexes with COLLATE "C".  It is not defined for "citext", so
> people would get wrong results.  It doesn't use pg_trgm indexes
> whereas LIKE can.

It is mentioned in documentation, look for "starts_with" function.
Currently it's working with spgist indexes which fact is pointed out in
the documentation too. I was going to add btree support but it would
require a new strategy so it will be matter of another patch. I think
this operator could be used in LIKE instead of current weird comparison
operators.

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: Prefix operator for text and spgist support

2018-04-16 Thread Emre Hasegeli
> Thank you, pushed with some editorization and renaming text_startswith to
> starts_with

I am sorry for not noticing this before, but what is the point of this
operator?  It seems to me we are only making the prefix searching
business, which is already complicated, more complicated.

Also, the new operator is not documented on SQL String Functions and
Operators table.  It is not supported by btree text_pattern_ops or
btree indexes with COLLATE "C".  It is not defined for "citext", so
people would get wrong results.  It doesn't use pg_trgm indexes
whereas LIKE can.



Re: Prefix operator for text and spgist support

2018-04-03 Thread Teodor Sigaev
Thank you, pushed with some editorization and renaming text_startswith to 
starts_with


Ildus Kurbangaliev wrote:

On Fri, 23 Mar 2018 11:45:33 +0300
Alexander Korotkov  wrote:


On Thu, Mar 22, 2018 at 7:09 PM, Teodor Sigaev 
wrote:


Patch looks resonable, but I see some place to improvement:
spg_text_leaf_consistent() only needs to check with
text_startswith() if reconstucted value came to leaf consistent is
shorter than given prefix. For example, if level >= length of
prefix then we guarantee that fully reconstracted is matched too.
But do not miss that you may need to return value for index only
scan, consult returnData field

In attachment rebased and minorly edited version of your patch.



I took a look at this patch.  In addition to Teodor's comments I can
note following.

* This line looks strange for me.

-   if (strategy > 10)
+   if (strategy > 10 && strategy !=
RTPrefixStrategyNumber)

It's not because we added strategy != RTPrefixStrategyNumber condition
there.
It's because we already used magic number here and now have a mix of
magic number and macro constant in one line.  Once we anyway touch
this place, could we get rid of magic numbers here?

* I'm a little concern about operator name.  We're going to choose @^
operator for
prefix search without any preliminary discussion.  However,
personally I don't
have better ideas :)


Teodor, Alexander, thanks for review. In new version I have added the
optimization in spgist using level variable and also got rid of magic
numbers.

About the operator it's actually ^@ (not @^ :)), I thought about it and
don't really have any idea what operator can be used instead.

Attached version 5 of the patch.



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/



Re: Prefix operator for text and spgist support

2018-03-29 Thread Ildus Kurbangaliev
On Fri, 23 Mar 2018 11:45:33 +0300
Alexander Korotkov  wrote:

> On Thu, Mar 22, 2018 at 7:09 PM, Teodor Sigaev 
> wrote:
> 
> > Patch looks resonable, but I see some place to improvement:
> > spg_text_leaf_consistent() only needs to check with
> > text_startswith() if reconstucted value came to leaf consistent is
> > shorter than given prefix. For example, if level >= length of
> > prefix then we guarantee that fully reconstracted is matched too.
> > But do not miss that you may need to return value for index only
> > scan, consult returnData field
> >
> > In attachment rebased and minorly edited version of your patch.  
> 
> 
> I took a look at this patch.  In addition to Teodor's comments I can
> note following.
> 
> * This line looks strange for me.
> 
> - if (strategy > 10)
> + if (strategy > 10 && strategy !=
> RTPrefixStrategyNumber)
> 
> It's not because we added strategy != RTPrefixStrategyNumber condition
> there.
> It's because we already used magic number here and now have a mix of
> magic number and macro constant in one line.  Once we anyway touch
> this place, could we get rid of magic numbers here?
> 
> * I'm a little concern about operator name.  We're going to choose @^
> operator for
> prefix search without any preliminary discussion.  However,
> personally I don't
> have better ideas :)

Teodor, Alexander, thanks for review. In new version I have added the
optimization in spgist using level variable and also got rid of magic
numbers.

About the operator it's actually ^@ (not @^ :)), I thought about it and
don't really have any idea what operator can be used instead.

Attached version 5 of the patch.

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 9d1772f349..a1b4724a88 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -2274,6 +2274,21 @@
ph
   
 
+  
+   
+
+ text_startswith
+
+text_startswith(string, prefix)
+   
+   bool
+   
+Returns true if string starts with prefix.
+   
+   text_startswith('alphabet', 'alph')
+   t
+  
+
   

 
@@ -4033,6 +4048,12 @@ cast(-44 as bit(12))   11010100
 ILIKE, respectively.  All of these operators are
 PostgreSQL-specific.

+
+   
+There is also the prefix operator ^@ and corresponding
+text_startswith function which covers cases when only
+searching by beginning of the string is needed.
+   
   
 
 
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index e47f70be89..06b7519052 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -161,6 +161,7 @@
~<~
~>=~
~>~
+   ^@
   
  
  
diff --git a/src/backend/access/spgist/spgtextproc.c b/src/backend/access/spgist/spgtextproc.c
index f156b2166e..c5e27dd7aa 100644
--- a/src/backend/access/spgist/spgtextproc.c
+++ b/src/backend/access/spgist/spgtextproc.c
@@ -67,6 +67,20 @@
  */
 #define SPGIST_MAX_PREFIX_LENGTH	Max((int) (BLCKSZ - 258 * 16 - 100), 32)
 
+/*
+ * Strategy for collation aware operator on text is equal to btree strategy
+ * plus value of 10.
+ *
+ * Current collation aware strategies and their corresponding btree strategies:
+ * 11 BTLessStrategyNumber
+ * 12 BTLessEqualStrategyNumber
+ * 14 BTGreaterEqualStrategyNumber
+ * 15 BTGreaterStrategyNumber
+ */
+#define SPG_STRATEGY_ADDITION	(10)
+#define SPG_IS_COLLATION_AWARE_STRATEGY(s) ((s) > SPG_STRATEGY_ADDITION \
+		 && (s) != RTPrefixStrategyNumber)
+
 /* Struct for sorting values in picksplit */
 typedef struct spgNodePtr
 {
@@ -496,10 +510,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS)
 			 * well end with a partial multibyte character, so that applying
 			 * any encoding-sensitive test to it would be risky anyhow.)
 			 */
-			if (strategy > 10)
+			if (SPG_IS_COLLATION_AWARE_STRATEGY(strategy))
 			{
 if (collate_is_c)
-	strategy -= 10;
+	strategy -= SPG_STRATEGY_ADDITION;
 else
 	continue;
 			}
@@ -526,6 +540,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS)
 	if (r < 0)
 		res = false;
 	break;
+case RTPrefixStrategyNumber:
+	if (r != 0)
+		res = false;
+	break;
 default:
 	elog(ERROR, "unrecognized strategy number: %d",
 		 in->scankeys[j].sk_strategy);
@@ -605,10 +623,31 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS)
 		int			queryLen = VARSIZE_ANY_EXHDR(query);
 		int			r;
 
-		if (strategy > 10)
+		if (strategy == RTPrefixStrategyNumber)
+		{
+			if (!in->returnData && level >= queryLen)
+			{
+/*
+ * If we got to the leaf when level is equal or longer than
+ * query then it is a prefix. We skip cases when returnData is
+ * true, it would mean then leaf could be visited anyway.
+ */
+continue;
+			}
+
+			res = DatumGetBool(DirectFunctionCall2(text_sta

Re: Prefix operator for text and spgist support

2018-03-23 Thread Alexander Korotkov
On Thu, Mar 22, 2018 at 7:09 PM, Teodor Sigaev  wrote:

> Patch looks resonable, but I see some place to improvement:
> spg_text_leaf_consistent() only needs to check with text_startswith() if
> reconstucted value came to leaf consistent is shorter than given prefix.
> For example, if level >= length of prefix then we guarantee that fully
> reconstracted is matched too. But do not miss that you may need to return
> value for index only scan, consult returnData field
>
> In attachment rebased and minorly edited version of your patch.


I took a look at this patch.  In addition to Teodor's comments I can note
following.

* This line looks strange for me.

-   if (strategy > 10)
+   if (strategy > 10 && strategy != RTPrefixStrategyNumber)

It's not because we added strategy != RTPrefixStrategyNumber condition
there.
It's because we already used magic number here and now have a mix of magic
number and macro constant in one line.  Once we anyway touch this place,
could we get rid of magic numbers here?

* I'm a little concern about operator name.  We're going to choose @^
operator for
prefix search without any preliminary discussion.  However, personally I
don't
have better ideas :)

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: Prefix operator for text and spgist support

2018-03-22 Thread Teodor Sigaev

Hi!

Patch looks resonable, but I see some place to improvement:
spg_text_leaf_consistent() only needs to check with text_startswith() if 
reconstucted value came to leaf consistent is shorter than given prefix. For 
example, if level >= length of prefix then we guarantee that fully reconstracted 
is matched too. But do not miss that you may need to return value for index only 
scan, consult returnData field


In attachment rebased and minorly edited version of your patch.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2f59af25a6..4dc11d8df2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -2274,6 +2274,21 @@
ph
   
 
+  
+   
+
+ text_startswith
+
+text_startswith(string, prefix)
+   
+   bool
+   
+Returns true if string starts with prefix.
+   
+   text_startswith('alphabet', 'alph')
+   t
+  
+
   

 
@@ -4033,6 +4048,12 @@ cast(-44 as bit(12))   11010100
 ILIKE, respectively.  All of these operators are
 PostgreSQL-specific.

+
+   
+There is also the prefix operator ^@ and corresponding
+text_startswith function which covers cases when only
+searching by beginning of the string is needed.
+   
   
 
 
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index e47f70be89..06b7519052 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -161,6 +161,7 @@
~<~
~>=~
~>~
+   ^@
   
  
  
diff --git a/src/backend/access/spgist/spgtextproc.c b/src/backend/access/spgist/spgtextproc.c
index f156b2166e..f92ff68a5f 100644
--- a/src/backend/access/spgist/spgtextproc.c
+++ b/src/backend/access/spgist/spgtextproc.c
@@ -496,7 +496,7 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS)
 			 * well end with a partial multibyte character, so that applying
 			 * any encoding-sensitive test to it would be risky anyhow.)
 			 */
-			if (strategy > 10)
+			if (strategy > 10 && strategy != RTPrefixStrategyNumber)
 			{
 if (collate_is_c)
 	strategy -= 10;
@@ -526,6 +526,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS)
 	if (r < 0)
 		res = false;
 	break;
+case RTPrefixStrategyNumber:
+	if (r != 0)
+		res = false;
+	break;
 default:
 	elog(ERROR, "unrecognized strategy number: %d",
 		 in->scankeys[j].sk_strategy);
@@ -601,10 +605,25 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS)
 	for (j = 0; j < in->nkeys; j++)
 	{
 		StrategyNumber strategy = in->scankeys[j].sk_strategy;
-		text	   *query = DatumGetTextPP(in->scankeys[j].sk_argument);
-		int			queryLen = VARSIZE_ANY_EXHDR(query);
+		text	   *query;
+		int			queryLen;
 		int			r;
 
+		/* for this strategy collation is not important */
+		if (strategy == RTPrefixStrategyNumber)
+		{
+			res = DatumGetBool(DirectFunctionCall2(text_startswith,
+out->leafValue, in->scankeys[j].sk_argument));
+
+			if (res) /* no need to consider remaining conditions */
+break;
+
+			continue;
+		}
+
+		query = DatumGetTextPP(in->scankeys[j].sk_argument);
+		queryLen = VARSIZE_ANY_EXHDR(query);
+
 		if (strategy > 10)
 		{
 			/* Collation-aware comparison */
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index bf240aa9c5..de3ace442a 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1204,7 +1204,6 @@ patternsel(PG_FUNCTION_ARGS, Pattern_Type ptype, bool negate)
 	Oid			vartype;
 	Oid			opfamily;
 	Pattern_Prefix_Status pstatus;
-	Const	   *patt;
 	Const	   *prefix = NULL;
 	Selectivity rest_selec = 0;
 	double		nullfrac = 0.0;
@@ -1316,9 +1315,21 @@ patternsel(PG_FUNCTION_ARGS, Pattern_Type ptype, bool negate)
 	 * but because we want to be sure we cache compiled regexps under the
 	 * right cache key, so that they can be re-used at runtime.
 	 */
-	patt = (Const *) other;
-	pstatus = pattern_fixed_prefix(patt, ptype, collation,
-   &prefix, &rest_selec);
+	if (ptype == Pattern_Type_Prefix)
+	{
+		char	*s = TextDatumGetCString(constval);
+		prefix = string_to_const(s, vartype);
+		pstatus = Pattern_Prefix_Partial;
+		rest_selec = 1.0;	/* all */
+		pfree(s);
+	}
+	else
+	{
+		Const *patt = (Const *) other;
+
+		pstatus = pattern_fixed_prefix(patt, ptype, collation,
+	   &prefix, &rest_selec);
+	}
 
 	/*
 	 * If necessary, coerce the prefix constant to the right type.
@@ -1488,6 +1499,16 @@ likesel(PG_FUNCTION_ARGS)
 }
 
 /*
+ *		prefixsel			- selectivity of prefix operator
+ */
+Datum
+prefixsel(PG_FUNCTION_ARGS)
+{
+	PG_RETURN_FLOAT8(patternsel(fcinfo, Pattern_Type_Prefix, false));
+}
+
+/*
+ *
  *		iclikesel			- Selectivity of ILIKE pattern match.
  */
 Datum
@@ -2906,6 +2927,15 @@ likejoinsel(PG_FUNCTION_ARGS)
 	PG_RETURN_FLOAT8(patternjoinsel(fcinfo, P

Re: Prefix operator for text and spgist support

2018-03-12 Thread Ildus Kurbangaliev
On Tue, 6 Mar 2018 19:27:21 +0300
Arthur Zakirov  wrote:

> On Mon, Feb 19, 2018 at 05:19:15PM +0300, Ildus Kurbangaliev wrote:
> > At brief look at this place seems better to move this block into
> > pattern_fixed_prefix function. But there is also `vartype` variable
> > which used to in prefix construction, and it would require pass this
> > variable too. And since pattern_fixed_prefix called in 10 other
> > places and vartype is required only for this ptype it seems better
> > just keep this block outside of this function.  
> 
> Understood.
> 
> > I've added documentation in current version of the patch.  
> 
> Thank you.
> 
> Can you rebase the patch due to changes within pg_proc.h?
> 
> Also here
> 
> +   
> +There is also the prefix operator ^@ and
> corresponding
> +text_startswith function which covers cases
> when only
> +searching by beginning of the string is needed.
> +   
> 
> I think text_startswith should be enclosed with the  tag.
> I'm not sure, but I think  used for operators, keywords,
> etc. I haven't found a manual which describes how to use tags, but
> after looking at the documentation where  is used, I think
> that for function  should be used.
> 

Hi, thanks for the review. I've fixed documentation as you said and
also rebased to current master.

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2f59af25a6..4dc11d8df2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -2274,6 +2274,21 @@
ph
   
 
+  
+   
+
+ text_startswith
+
+text_startswith(string, prefix)
+   
+   bool
+   
+Returns true if string starts with prefix.
+   
+   text_startswith('alphabet', 'alph')
+   t
+  
+
   

 
@@ -4033,6 +4048,12 @@ cast(-44 as bit(12))   11010100
 ILIKE, respectively.  All of these operators are
 PostgreSQL-specific.

+
+   
+There is also the prefix operator ^@ and corresponding
+text_startswith function which covers cases when only
+searching by beginning of the string is needed.
+   
   
 
 
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index e47f70be89..06b7519052 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -161,6 +161,7 @@
~<~
~>=~
~>~
+   ^@
   
  
  
diff --git a/src/backend/access/spgist/spgtextproc.c b/src/backend/access/spgist/spgtextproc.c
index f156b2166e..5beb9e33a1 100644
--- a/src/backend/access/spgist/spgtextproc.c
+++ b/src/backend/access/spgist/spgtextproc.c
@@ -496,7 +496,7 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS)
 			 * well end with a partial multibyte character, so that applying
 			 * any encoding-sensitive test to it would be risky anyhow.)
 			 */
-			if (strategy > 10)
+			if (strategy > 10 && strategy != RTPrefixStrategyNumber)
 			{
 if (collate_is_c)
 	strategy -= 10;
@@ -526,6 +526,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS)
 	if (r < 0)
 		res = false;
 	break;
+case RTPrefixStrategyNumber:
+	if (r != 0)
+		res = false;
+	break;
 default:
 	elog(ERROR, "unrecognized strategy number: %d",
 		 in->scankeys[j].sk_strategy);
@@ -601,10 +605,21 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS)
 	for (j = 0; j < in->nkeys; j++)
 	{
 		StrategyNumber strategy = in->scankeys[j].sk_strategy;
-		text	   *query = DatumGetTextPP(in->scankeys[j].sk_argument);
-		int			queryLen = VARSIZE_ANY_EXHDR(query);
+		text	   *query;
+		int			queryLen;
 		int			r;
 
+		/* for this strategy collation is not important */
+		if (strategy == RTPrefixStrategyNumber)
+		{
+			res = DatumGetBool(DirectFunctionCall2(text_startswith,
+out->leafValue, in->scankeys[j].sk_argument));
+			goto next;
+		}
+
+		query = DatumGetTextPP(in->scankeys[j].sk_argument);
+		queryLen = VARSIZE_ANY_EXHDR(query);
+
 		if (strategy > 10)
 		{
 			/* Collation-aware comparison */
@@ -655,6 +670,7 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS)
 break;
 		}
 
+next:
 		if (!res)
 			break;/* no need to consider remaining conditions */
 	}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index bf240aa9c5..50b2583ca0 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1317,8 +1317,18 @@ patternsel(PG_FUNCTION_ARGS, Pattern_Type ptype, bool negate)
 	 * right cache key, so that they can be re-used at runtime.
 	 */
 	patt = (Const *) other;
-	pstatus = pattern_fixed_prefix(patt, ptype, collation,
-   &prefix, &rest_selec);
+
+	if (ptype == Pattern_Type_Prefix)
+	{
+		char	*s = TextDatumGetCString(constval);
+		prefix = string_to_const(s, vartype);
+		pstatus = Pattern_Prefix_Partial;
+		rest_selec = 1.0;	/* all */
+		pfree(s);
+	}
+	else
+		pstatus = pattern_fixed_prefix(patt, ptype, collation,
+		

Re: Prefix operator for text and spgist support

2018-03-06 Thread Arthur Zakirov
On Mon, Feb 19, 2018 at 05:19:15PM +0300, Ildus Kurbangaliev wrote:
> At brief look at this place seems better to move this block into
> pattern_fixed_prefix function. But there is also `vartype` variable
> which used to in prefix construction, and it would require pass this
> variable too. And since pattern_fixed_prefix called in 10 other places
> and vartype is required only for this ptype it seems better just keep
> this block outside of this function.

Understood.

> I've added documentation in current version of the patch.

Thank you.

Can you rebase the patch due to changes within pg_proc.h?

Also here

+   
+There is also the prefix operator ^@ and corresponding
+text_startswith function which covers cases when only
+searching by beginning of the string is needed.
+   

I think text_startswith should be enclosed with the  tag. I'm
not sure, but I think  used for operators, keywords, etc. I
haven't found a manual which describes how to use tags, but after looking
at the documentation where  is used, I think that for function
 should be used.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: Prefix operator for text and spgist support

2018-02-19 Thread Ildus Kurbangaliev
On Mon, 19 Feb 2018 15:06:51 +0300
Arthur Zakirov  wrote:

> Hello,
> 
> On Fri, Feb 02, 2018 at 06:03:27PM +0300, Ildus Kurbangaliev wrote:
> > Hi,
> > 
> > Attached patch introduces prefix operator ^@ for text type. For 'a
> > ^@ b' it returns true if 'a' starts with 'b'. Also there is spgist
> > index support for this operator.
> > 
> > It could be useful as an alternative for LIKE for 'something%'
> > templates. Or even used in LIKE queries instead of ~>=~ and ~<~ in
> > the future. But it would require new strategy for btree.  
> 
> I've looked at the patch. It is applied and tests pass.

Hi, thanks for the review.

> 
> I have a couple comments:
> 
> > +   if (ptype == Pattern_Type_Prefix)
> > +   {
> > +   char*s = TextDatumGetCString(constval);
> > +   prefix = string_to_const(s, vartype);
> > +   pstatus = Pattern_Prefix_Partial;
> > +   rest_selec = 1.0;   /* all */
> > +   pfree(s);
> > +   }
> > +   else
> > +   pstatus = pattern_fixed_prefix(patt, ptype,
> > collation,
> > +
> > &prefix, &rest_selec);  
> 
> I think it is better to put Pattern_Type_Prefix processing into
> pattern_fixed_prefix() as another case entry.

At brief look at this place seems better to move this block into
pattern_fixed_prefix function. But there is also `vartype` variable
which used to in prefix construction, and it would require pass this
variable too. And since pattern_fixed_prefix called in 10 other places
and vartype is required only for this ptype it seems better just keep
this block outside of this function.

> 
> Secondly, it is worth to fix the documentation. At least here [1].
> Maybe there are another places where documentation should be fixed.
> 
> 
> 1 -
> https://www.postgresql.org/docs/current/static/spgist-builtin-opclasses.html
> 

I've added documentation in current version of the patch.

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1e535cf215..403d0193ce 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -2274,6 +2274,21 @@
ph
   
 
+  
+   
+
+ text_startswith
+
+text_startswith(string, prefix)
+   
+   bool
+   
+Returns true if string starts with prefix.
+   
+   text_startswith('alphabet', 'alph')
+   t
+  
+
   

 
@@ -3964,6 +3979,12 @@ cast(-44 as bit(12))   11010100
 ILIKE, respectively.  All of these operators are
 PostgreSQL-specific.

+
+   
+There is also the prefix operator ^@ and corresponding
+text_startswith function which covers cases when only
+searching by beginning of the string is needed.
+   
   
 
 
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index 51bb60c92a..00ef9a92b0 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -179,6 +179,7 @@
~<~
~>=~
~>~
+   ^@
   
  
  
diff --git a/src/backend/access/spgist/spgtextproc.c b/src/backend/access/spgist/spgtextproc.c
index f156b2166e..5beb9e33a1 100644
--- a/src/backend/access/spgist/spgtextproc.c
+++ b/src/backend/access/spgist/spgtextproc.c
@@ -496,7 +496,7 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS)
 			 * well end with a partial multibyte character, so that applying
 			 * any encoding-sensitive test to it would be risky anyhow.)
 			 */
-			if (strategy > 10)
+			if (strategy > 10 && strategy != RTPrefixStrategyNumber)
 			{
 if (collate_is_c)
 	strategy -= 10;
@@ -526,6 +526,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS)
 	if (r < 0)
 		res = false;
 	break;
+case RTPrefixStrategyNumber:
+	if (r != 0)
+		res = false;
+	break;
 default:
 	elog(ERROR, "unrecognized strategy number: %d",
 		 in->scankeys[j].sk_strategy);
@@ -601,10 +605,21 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS)
 	for (j = 0; j < in->nkeys; j++)
 	{
 		StrategyNumber strategy = in->scankeys[j].sk_strategy;
-		text	   *query = DatumGetTextPP(in->scankeys[j].sk_argument);
-		int			queryLen = VARSIZE_ANY_EXHDR(query);
+		text	   *query;
+		int			queryLen;
 		int			r;
 
+		/* for this strategy collation is not important */
+		if (strategy == RTPrefixStrategyNumber)
+		{
+			res = DatumGetBool(DirectFunctionCall2(text_startswith,
+out->leafValue, in->scankeys[j].sk_argument));
+			goto next;
+		}
+
+		query = DatumGetTextPP(in->scankeys[j].sk_argument);
+		queryLen = VARSIZE_ANY_EXHDR(query);
+
 		if (strategy > 10)
 		{
 			/* Collation-aware comparison */
@@ -655,6 +670,7 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS)
 break;
 		}
 
+next:
 		if (!res)
 			break;/* no need to consider remaining conditions */
 	}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fcc8323f62..be91082b56 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backen

Re: Prefix operator for text and spgist support

2018-02-19 Thread Arthur Zakirov
Hello,

On Fri, Feb 02, 2018 at 06:03:27PM +0300, Ildus Kurbangaliev wrote:
> Hi,
> 
> Attached patch introduces prefix operator ^@ for text type. For 'a ^@ b'
> it returns true if 'a' starts with 'b'. Also there is spgist index
> support for this operator.
> 
> It could be useful as an alternative for LIKE for 'something%'
> templates. Or even used in LIKE queries instead of ~>=~ and ~<~ in the
> future. But it would require new strategy for btree.

I've looked at the patch. It is applied and tests pass.

I have a couple comments:

> + if (ptype == Pattern_Type_Prefix)
> + {
> + char*s = TextDatumGetCString(constval);
> + prefix = string_to_const(s, vartype);
> + pstatus = Pattern_Prefix_Partial;
> + rest_selec = 1.0;   /* all */
> + pfree(s);
> + }
> + else
> + pstatus = pattern_fixed_prefix(patt, ptype, collation,
> +
> &prefix, &rest_selec);

I think it is better to put Pattern_Type_Prefix processing into
pattern_fixed_prefix() as another case entry.

Secondly, it is worth to fix the documentation. At least here [1]. Maybe
there are another places where documentation should be fixed.


1 - https://www.postgresql.org/docs/current/static/spgist-builtin-opclasses.html

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Prefix operator for text and spgist support

2018-02-02 Thread Ildus Kurbangaliev
Hi,

Attached patch introduces prefix operator ^@ for text type. For 'a ^@ b'
it returns true if 'a' starts with 'b'. Also there is spgist index
support for this operator.

It could be useful as an alternative for LIKE for 'something%'
templates. Or even used in LIKE queries instead of ~>=~ and ~<~ in the
future. But it would require new strategy for btree.

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/access/spgist/spgtextproc.c b/src/backend/access/spgist/spgtextproc.c
index f156b2166e..5beb9e33a1 100644
--- a/src/backend/access/spgist/spgtextproc.c
+++ b/src/backend/access/spgist/spgtextproc.c
@@ -496,7 +496,7 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS)
 			 * well end with a partial multibyte character, so that applying
 			 * any encoding-sensitive test to it would be risky anyhow.)
 			 */
-			if (strategy > 10)
+			if (strategy > 10 && strategy != RTPrefixStrategyNumber)
 			{
 if (collate_is_c)
 	strategy -= 10;
@@ -526,6 +526,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS)
 	if (r < 0)
 		res = false;
 	break;
+case RTPrefixStrategyNumber:
+	if (r != 0)
+		res = false;
+	break;
 default:
 	elog(ERROR, "unrecognized strategy number: %d",
 		 in->scankeys[j].sk_strategy);
@@ -601,10 +605,21 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS)
 	for (j = 0; j < in->nkeys; j++)
 	{
 		StrategyNumber strategy = in->scankeys[j].sk_strategy;
-		text	   *query = DatumGetTextPP(in->scankeys[j].sk_argument);
-		int			queryLen = VARSIZE_ANY_EXHDR(query);
+		text	   *query;
+		int			queryLen;
 		int			r;
 
+		/* for this strategy collation is not important */
+		if (strategy == RTPrefixStrategyNumber)
+		{
+			res = DatumGetBool(DirectFunctionCall2(text_startswith,
+out->leafValue, in->scankeys[j].sk_argument));
+			goto next;
+		}
+
+		query = DatumGetTextPP(in->scankeys[j].sk_argument);
+		queryLen = VARSIZE_ANY_EXHDR(query);
+
 		if (strategy > 10)
 		{
 			/* Collation-aware comparison */
@@ -655,6 +670,7 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS)
 break;
 		}
 
+next:
 		if (!res)
 			break;/* no need to consider remaining conditions */
 	}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fcc8323f62..be91082b56 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1315,8 +1315,18 @@ patternsel(PG_FUNCTION_ARGS, Pattern_Type ptype, bool negate)
 	 * right cache key, so that they can be re-used at runtime.
 	 */
 	patt = (Const *) other;
-	pstatus = pattern_fixed_prefix(patt, ptype, collation,
-   &prefix, &rest_selec);
+
+	if (ptype == Pattern_Type_Prefix)
+	{
+		char	*s = TextDatumGetCString(constval);
+		prefix = string_to_const(s, vartype);
+		pstatus = Pattern_Prefix_Partial;
+		rest_selec = 1.0;	/* all */
+		pfree(s);
+	}
+	else
+		pstatus = pattern_fixed_prefix(patt, ptype, collation,
+	   &prefix, &rest_selec);
 
 	/*
 	 * If necessary, coerce the prefix constant to the right type.
@@ -1486,6 +1496,16 @@ likesel(PG_FUNCTION_ARGS)
 }
 
 /*
+ *		prefixsel			- selectivity of prefix operator
+ */
+Datum
+prefixsel(PG_FUNCTION_ARGS)
+{
+	PG_RETURN_FLOAT8(patternsel(fcinfo, Pattern_Type_Prefix, false));
+}
+
+/*
+ *
  *		iclikesel			- Selectivity of ILIKE pattern match.
  */
 Datum
@@ -2904,6 +2924,15 @@ likejoinsel(PG_FUNCTION_ARGS)
 	PG_RETURN_FLOAT8(patternjoinsel(fcinfo, Pattern_Type_Like, false));
 }
 
+/*
+ *		prefixjoinsel			- Join selectivity of prefix operator
+ */
+Datum
+prefixjoinsel(PG_FUNCTION_ARGS)
+{
+	PG_RETURN_FLOAT8(patternjoinsel(fcinfo, Pattern_Type_Prefix, false));
+}
+
 /*
  *		iclikejoinsel			- Join selectivity of ILIKE pattern match.
  */
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 304cb26691..050875e0bb 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1762,6 +1762,34 @@ text_ge(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(result);
 }
 
+Datum
+text_startswith(PG_FUNCTION_ARGS)
+{
+	Datum		arg1 = PG_GETARG_DATUM(0);
+	Datum		arg2 = PG_GETARG_DATUM(1);
+	bool		result;
+	Size		len1,
+len2;
+
+	len1 = toast_raw_datum_size(arg1);
+	len2 = toast_raw_datum_size(arg2);
+	if (len2 > len1)
+		result = false;
+	else
+	{
+		text	   *targ1 = DatumGetTextPP(arg1);
+		text	   *targ2 = DatumGetTextPP(arg2);
+
+		result = (memcmp(VARDATA_ANY(targ1), VARDATA_ANY(targ2),
+		 len2 - VARHDRSZ) == 0);
+
+		PG_FREE_IF_COPY(targ1, 0);
+		PG_FREE_IF_COPY(targ2, 1);
+	}
+
+	PG_RETURN_BOOL(result);
+}
+
 Datum
 bttextcmp(PG_FUNCTION_ARGS)
 {
diff --git a/src/include/access/stratnum.h b/src/include/access/stratnum.h
index bddfac4c10..0db11a1117 100644
--- a/src/include/access/stratnum.h
+++ b/src/include/access/stratnum.h
@@ -68,8 +68,9 @@ typedef uint16 StrategyNumber;
 #define RTSubEqualStrategyNumber		25	/* for inet <<= */
 #define RTSuperStrategyNumber			26	/* for inet << */
 #define RTSuperEqualStr