Re: Prefix operator for text and spgist support
On Mon, 16 Apr 2018 12:45:23 +0200 Emre Hasegeli wrote: > > Thank you, pushed with some editorization and renaming > > text_startswith to starts_with > > I am sorry for not noticing this before, but what is the point of this > operator? It seems to me we are only making the prefix searching > business, which is already complicated, more complicated. Hi. > > Also, the new operator is not documented on SQL String Functions and > Operators table. It is not supported by btree text_pattern_ops or > btree indexes with COLLATE "C". It is not defined for "citext", so > people would get wrong results. It doesn't use pg_trgm indexes > whereas LIKE can. It is mentioned in documentation, look for "starts_with" function. Currently it's working with spgist indexes which fact is pointed out in the documentation too. I was going to add btree support but it would require a new strategy so it will be matter of another patch. I think this operator could be used in LIKE instead of current weird comparison operators. -- --- Ildus Kurbangaliev Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Re: Prefix operator for text and spgist support
> Thank you, pushed with some editorization and renaming text_startswith to > starts_with I am sorry for not noticing this before, but what is the point of this operator? It seems to me we are only making the prefix searching business, which is already complicated, more complicated. Also, the new operator is not documented on SQL String Functions and Operators table. It is not supported by btree text_pattern_ops or btree indexes with COLLATE "C". It is not defined for "citext", so people would get wrong results. It doesn't use pg_trgm indexes whereas LIKE can.
Re: Prefix operator for text and spgist support
Thank you, pushed with some editorization and renaming text_startswith to starts_with Ildus Kurbangaliev wrote: On Fri, 23 Mar 2018 11:45:33 +0300 Alexander Korotkov wrote: On Thu, Mar 22, 2018 at 7:09 PM, Teodor Sigaev wrote: Patch looks resonable, but I see some place to improvement: spg_text_leaf_consistent() only needs to check with text_startswith() if reconstucted value came to leaf consistent is shorter than given prefix. For example, if level >= length of prefix then we guarantee that fully reconstracted is matched too. But do not miss that you may need to return value for index only scan, consult returnData field In attachment rebased and minorly edited version of your patch. I took a look at this patch. In addition to Teodor's comments I can note following. * This line looks strange for me. - if (strategy > 10) + if (strategy > 10 && strategy != RTPrefixStrategyNumber) It's not because we added strategy != RTPrefixStrategyNumber condition there. It's because we already used magic number here and now have a mix of magic number and macro constant in one line. Once we anyway touch this place, could we get rid of magic numbers here? * I'm a little concern about operator name. We're going to choose @^ operator for prefix search without any preliminary discussion. However, personally I don't have better ideas :) Teodor, Alexander, thanks for review. In new version I have added the optimization in spgist using level variable and also got rid of magic numbers. About the operator it's actually ^@ (not @^ :)), I thought about it and don't really have any idea what operator can be used instead. Attached version 5 of the patch. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: Prefix operator for text and spgist support
On Fri, 23 Mar 2018 11:45:33 +0300 Alexander Korotkov wrote: > On Thu, Mar 22, 2018 at 7:09 PM, Teodor Sigaev > wrote: > > > Patch looks resonable, but I see some place to improvement: > > spg_text_leaf_consistent() only needs to check with > > text_startswith() if reconstucted value came to leaf consistent is > > shorter than given prefix. For example, if level >= length of > > prefix then we guarantee that fully reconstracted is matched too. > > But do not miss that you may need to return value for index only > > scan, consult returnData field > > > > In attachment rebased and minorly edited version of your patch. > > > I took a look at this patch. In addition to Teodor's comments I can > note following. > > * This line looks strange for me. > > - if (strategy > 10) > + if (strategy > 10 && strategy != > RTPrefixStrategyNumber) > > It's not because we added strategy != RTPrefixStrategyNumber condition > there. > It's because we already used magic number here and now have a mix of > magic number and macro constant in one line. Once we anyway touch > this place, could we get rid of magic numbers here? > > * I'm a little concern about operator name. We're going to choose @^ > operator for > prefix search without any preliminary discussion. However, > personally I don't > have better ideas :) Teodor, Alexander, thanks for review. In new version I have added the optimization in spgist using level variable and also got rid of magic numbers. About the operator it's actually ^@ (not @^ :)), I thought about it and don't really have any idea what operator can be used instead. Attached version 5 of the patch. -- --- Ildus Kurbangaliev Postgres Professional: http://www.postgrespro.com Russian Postgres Company diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 9d1772f349..a1b4724a88 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -2274,6 +2274,21 @@ ph + + + + text_startswith + +text_startswith(string, prefix) + + bool + +Returns true if string starts with prefix. + + text_startswith('alphabet', 'alph') + t + + @@ -4033,6 +4048,12 @@ cast(-44 as bit(12)) 11010100 ILIKE, respectively. All of these operators are PostgreSQL-specific. + + +There is also the prefix operator ^@ and corresponding +text_startswith function which covers cases when only +searching by beginning of the string is needed. + diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml index e47f70be89..06b7519052 100644 --- a/doc/src/sgml/spgist.sgml +++ b/doc/src/sgml/spgist.sgml @@ -161,6 +161,7 @@ ~<~ ~>=~ ~>~ + ^@ diff --git a/src/backend/access/spgist/spgtextproc.c b/src/backend/access/spgist/spgtextproc.c index f156b2166e..c5e27dd7aa 100644 --- a/src/backend/access/spgist/spgtextproc.c +++ b/src/backend/access/spgist/spgtextproc.c @@ -67,6 +67,20 @@ */ #define SPGIST_MAX_PREFIX_LENGTH Max((int) (BLCKSZ - 258 * 16 - 100), 32) +/* + * Strategy for collation aware operator on text is equal to btree strategy + * plus value of 10. + * + * Current collation aware strategies and their corresponding btree strategies: + * 11 BTLessStrategyNumber + * 12 BTLessEqualStrategyNumber + * 14 BTGreaterEqualStrategyNumber + * 15 BTGreaterStrategyNumber + */ +#define SPG_STRATEGY_ADDITION (10) +#define SPG_IS_COLLATION_AWARE_STRATEGY(s) ((s) > SPG_STRATEGY_ADDITION \ + && (s) != RTPrefixStrategyNumber) + /* Struct for sorting values in picksplit */ typedef struct spgNodePtr { @@ -496,10 +510,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS) * well end with a partial multibyte character, so that applying * any encoding-sensitive test to it would be risky anyhow.) */ - if (strategy > 10) + if (SPG_IS_COLLATION_AWARE_STRATEGY(strategy)) { if (collate_is_c) - strategy -= 10; + strategy -= SPG_STRATEGY_ADDITION; else continue; } @@ -526,6 +540,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS) if (r < 0) res = false; break; +case RTPrefixStrategyNumber: + if (r != 0) + res = false; + break; default: elog(ERROR, "unrecognized strategy number: %d", in->scankeys[j].sk_strategy); @@ -605,10 +623,31 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS) int queryLen = VARSIZE_ANY_EXHDR(query); int r; - if (strategy > 10) + if (strategy == RTPrefixStrategyNumber) + { + if (!in->returnData && level >= queryLen) + { +/* + * If we got to the leaf when level is equal or longer than + * query then it is a prefix. We skip cases when returnData is + * true, it would mean then leaf could be visited anyway. + */ +continue; + } + + res = DatumGetBool(DirectFunctionCall2(text_sta
Re: Prefix operator for text and spgist support
On Thu, Mar 22, 2018 at 7:09 PM, Teodor Sigaev wrote: > Patch looks resonable, but I see some place to improvement: > spg_text_leaf_consistent() only needs to check with text_startswith() if > reconstucted value came to leaf consistent is shorter than given prefix. > For example, if level >= length of prefix then we guarantee that fully > reconstracted is matched too. But do not miss that you may need to return > value for index only scan, consult returnData field > > In attachment rebased and minorly edited version of your patch. I took a look at this patch. In addition to Teodor's comments I can note following. * This line looks strange for me. - if (strategy > 10) + if (strategy > 10 && strategy != RTPrefixStrategyNumber) It's not because we added strategy != RTPrefixStrategyNumber condition there. It's because we already used magic number here and now have a mix of magic number and macro constant in one line. Once we anyway touch this place, could we get rid of magic numbers here? * I'm a little concern about operator name. We're going to choose @^ operator for prefix search without any preliminary discussion. However, personally I don't have better ideas :) -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Prefix operator for text and spgist support
Hi! Patch looks resonable, but I see some place to improvement: spg_text_leaf_consistent() only needs to check with text_startswith() if reconstucted value came to leaf consistent is shorter than given prefix. For example, if level >= length of prefix then we guarantee that fully reconstracted is matched too. But do not miss that you may need to return value for index only scan, consult returnData field In attachment rebased and minorly edited version of your patch. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 2f59af25a6..4dc11d8df2 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -2274,6 +2274,21 @@ ph + + + + text_startswith + +text_startswith(string, prefix) + + bool + +Returns true if string starts with prefix. + + text_startswith('alphabet', 'alph') + t + + @@ -4033,6 +4048,12 @@ cast(-44 as bit(12)) 11010100 ILIKE, respectively. All of these operators are PostgreSQL-specific. + + +There is also the prefix operator ^@ and corresponding +text_startswith function which covers cases when only +searching by beginning of the string is needed. + diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml index e47f70be89..06b7519052 100644 --- a/doc/src/sgml/spgist.sgml +++ b/doc/src/sgml/spgist.sgml @@ -161,6 +161,7 @@ ~<~ ~>=~ ~>~ + ^@ diff --git a/src/backend/access/spgist/spgtextproc.c b/src/backend/access/spgist/spgtextproc.c index f156b2166e..f92ff68a5f 100644 --- a/src/backend/access/spgist/spgtextproc.c +++ b/src/backend/access/spgist/spgtextproc.c @@ -496,7 +496,7 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS) * well end with a partial multibyte character, so that applying * any encoding-sensitive test to it would be risky anyhow.) */ - if (strategy > 10) + if (strategy > 10 && strategy != RTPrefixStrategyNumber) { if (collate_is_c) strategy -= 10; @@ -526,6 +526,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS) if (r < 0) res = false; break; +case RTPrefixStrategyNumber: + if (r != 0) + res = false; + break; default: elog(ERROR, "unrecognized strategy number: %d", in->scankeys[j].sk_strategy); @@ -601,10 +605,25 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS) for (j = 0; j < in->nkeys; j++) { StrategyNumber strategy = in->scankeys[j].sk_strategy; - text *query = DatumGetTextPP(in->scankeys[j].sk_argument); - int queryLen = VARSIZE_ANY_EXHDR(query); + text *query; + int queryLen; int r; + /* for this strategy collation is not important */ + if (strategy == RTPrefixStrategyNumber) + { + res = DatumGetBool(DirectFunctionCall2(text_startswith, +out->leafValue, in->scankeys[j].sk_argument)); + + if (res) /* no need to consider remaining conditions */ +break; + + continue; + } + + query = DatumGetTextPP(in->scankeys[j].sk_argument); + queryLen = VARSIZE_ANY_EXHDR(query); + if (strategy > 10) { /* Collation-aware comparison */ diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index bf240aa9c5..de3ace442a 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -1204,7 +1204,6 @@ patternsel(PG_FUNCTION_ARGS, Pattern_Type ptype, bool negate) Oid vartype; Oid opfamily; Pattern_Prefix_Status pstatus; - Const *patt; Const *prefix = NULL; Selectivity rest_selec = 0; double nullfrac = 0.0; @@ -1316,9 +1315,21 @@ patternsel(PG_FUNCTION_ARGS, Pattern_Type ptype, bool negate) * but because we want to be sure we cache compiled regexps under the * right cache key, so that they can be re-used at runtime. */ - patt = (Const *) other; - pstatus = pattern_fixed_prefix(patt, ptype, collation, - &prefix, &rest_selec); + if (ptype == Pattern_Type_Prefix) + { + char *s = TextDatumGetCString(constval); + prefix = string_to_const(s, vartype); + pstatus = Pattern_Prefix_Partial; + rest_selec = 1.0; /* all */ + pfree(s); + } + else + { + Const *patt = (Const *) other; + + pstatus = pattern_fixed_prefix(patt, ptype, collation, + &prefix, &rest_selec); + } /* * If necessary, coerce the prefix constant to the right type. @@ -1488,6 +1499,16 @@ likesel(PG_FUNCTION_ARGS) } /* + * prefixsel - selectivity of prefix operator + */ +Datum +prefixsel(PG_FUNCTION_ARGS) +{ + PG_RETURN_FLOAT8(patternsel(fcinfo, Pattern_Type_Prefix, false)); +} + +/* + * * iclikesel - Selectivity of ILIKE pattern match. */ Datum @@ -2906,6 +2927,15 @@ likejoinsel(PG_FUNCTION_ARGS) PG_RETURN_FLOAT8(patternjoinsel(fcinfo, P
Re: Prefix operator for text and spgist support
On Tue, 6 Mar 2018 19:27:21 +0300 Arthur Zakirov wrote: > On Mon, Feb 19, 2018 at 05:19:15PM +0300, Ildus Kurbangaliev wrote: > > At brief look at this place seems better to move this block into > > pattern_fixed_prefix function. But there is also `vartype` variable > > which used to in prefix construction, and it would require pass this > > variable too. And since pattern_fixed_prefix called in 10 other > > places and vartype is required only for this ptype it seems better > > just keep this block outside of this function. > > Understood. > > > I've added documentation in current version of the patch. > > Thank you. > > Can you rebase the patch due to changes within pg_proc.h? > > Also here > > + > +There is also the prefix operator ^@ and > corresponding > +text_startswith function which covers cases > when only > +searching by beginning of the string is needed. > + > > I think text_startswith should be enclosed with the tag. > I'm not sure, but I think used for operators, keywords, > etc. I haven't found a manual which describes how to use tags, but > after looking at the documentation where is used, I think > that for function should be used. > Hi, thanks for the review. I've fixed documentation as you said and also rebased to current master. -- --- Ildus Kurbangaliev Postgres Professional: http://www.postgrespro.com Russian Postgres Company diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 2f59af25a6..4dc11d8df2 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -2274,6 +2274,21 @@ ph + + + + text_startswith + +text_startswith(string, prefix) + + bool + +Returns true if string starts with prefix. + + text_startswith('alphabet', 'alph') + t + + @@ -4033,6 +4048,12 @@ cast(-44 as bit(12)) 11010100 ILIKE, respectively. All of these operators are PostgreSQL-specific. + + +There is also the prefix operator ^@ and corresponding +text_startswith function which covers cases when only +searching by beginning of the string is needed. + diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml index e47f70be89..06b7519052 100644 --- a/doc/src/sgml/spgist.sgml +++ b/doc/src/sgml/spgist.sgml @@ -161,6 +161,7 @@ ~<~ ~>=~ ~>~ + ^@ diff --git a/src/backend/access/spgist/spgtextproc.c b/src/backend/access/spgist/spgtextproc.c index f156b2166e..5beb9e33a1 100644 --- a/src/backend/access/spgist/spgtextproc.c +++ b/src/backend/access/spgist/spgtextproc.c @@ -496,7 +496,7 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS) * well end with a partial multibyte character, so that applying * any encoding-sensitive test to it would be risky anyhow.) */ - if (strategy > 10) + if (strategy > 10 && strategy != RTPrefixStrategyNumber) { if (collate_is_c) strategy -= 10; @@ -526,6 +526,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS) if (r < 0) res = false; break; +case RTPrefixStrategyNumber: + if (r != 0) + res = false; + break; default: elog(ERROR, "unrecognized strategy number: %d", in->scankeys[j].sk_strategy); @@ -601,10 +605,21 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS) for (j = 0; j < in->nkeys; j++) { StrategyNumber strategy = in->scankeys[j].sk_strategy; - text *query = DatumGetTextPP(in->scankeys[j].sk_argument); - int queryLen = VARSIZE_ANY_EXHDR(query); + text *query; + int queryLen; int r; + /* for this strategy collation is not important */ + if (strategy == RTPrefixStrategyNumber) + { + res = DatumGetBool(DirectFunctionCall2(text_startswith, +out->leafValue, in->scankeys[j].sk_argument)); + goto next; + } + + query = DatumGetTextPP(in->scankeys[j].sk_argument); + queryLen = VARSIZE_ANY_EXHDR(query); + if (strategy > 10) { /* Collation-aware comparison */ @@ -655,6 +670,7 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS) break; } +next: if (!res) break;/* no need to consider remaining conditions */ } diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index bf240aa9c5..50b2583ca0 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -1317,8 +1317,18 @@ patternsel(PG_FUNCTION_ARGS, Pattern_Type ptype, bool negate) * right cache key, so that they can be re-used at runtime. */ patt = (Const *) other; - pstatus = pattern_fixed_prefix(patt, ptype, collation, - &prefix, &rest_selec); + + if (ptype == Pattern_Type_Prefix) + { + char *s = TextDatumGetCString(constval); + prefix = string_to_const(s, vartype); + pstatus = Pattern_Prefix_Partial; + rest_selec = 1.0; /* all */ + pfree(s); + } + else + pstatus = pattern_fixed_prefix(patt, ptype, collation, +
Re: Prefix operator for text and spgist support
On Mon, Feb 19, 2018 at 05:19:15PM +0300, Ildus Kurbangaliev wrote: > At brief look at this place seems better to move this block into > pattern_fixed_prefix function. But there is also `vartype` variable > which used to in prefix construction, and it would require pass this > variable too. And since pattern_fixed_prefix called in 10 other places > and vartype is required only for this ptype it seems better just keep > this block outside of this function. Understood. > I've added documentation in current version of the patch. Thank you. Can you rebase the patch due to changes within pg_proc.h? Also here + +There is also the prefix operator ^@ and corresponding +text_startswith function which covers cases when only +searching by beginning of the string is needed. + I think text_startswith should be enclosed with the tag. I'm not sure, but I think used for operators, keywords, etc. I haven't found a manual which describes how to use tags, but after looking at the documentation where is used, I think that for function should be used. -- Arthur Zakirov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Re: Prefix operator for text and spgist support
On Mon, 19 Feb 2018 15:06:51 +0300 Arthur Zakirov wrote: > Hello, > > On Fri, Feb 02, 2018 at 06:03:27PM +0300, Ildus Kurbangaliev wrote: > > Hi, > > > > Attached patch introduces prefix operator ^@ for text type. For 'a > > ^@ b' it returns true if 'a' starts with 'b'. Also there is spgist > > index support for this operator. > > > > It could be useful as an alternative for LIKE for 'something%' > > templates. Or even used in LIKE queries instead of ~>=~ and ~<~ in > > the future. But it would require new strategy for btree. > > I've looked at the patch. It is applied and tests pass. Hi, thanks for the review. > > I have a couple comments: > > > + if (ptype == Pattern_Type_Prefix) > > + { > > + char*s = TextDatumGetCString(constval); > > + prefix = string_to_const(s, vartype); > > + pstatus = Pattern_Prefix_Partial; > > + rest_selec = 1.0; /* all */ > > + pfree(s); > > + } > > + else > > + pstatus = pattern_fixed_prefix(patt, ptype, > > collation, > > + > > &prefix, &rest_selec); > > I think it is better to put Pattern_Type_Prefix processing into > pattern_fixed_prefix() as another case entry. At brief look at this place seems better to move this block into pattern_fixed_prefix function. But there is also `vartype` variable which used to in prefix construction, and it would require pass this variable too. And since pattern_fixed_prefix called in 10 other places and vartype is required only for this ptype it seems better just keep this block outside of this function. > > Secondly, it is worth to fix the documentation. At least here [1]. > Maybe there are another places where documentation should be fixed. > > > 1 - > https://www.postgresql.org/docs/current/static/spgist-builtin-opclasses.html > I've added documentation in current version of the patch. -- --- Ildus Kurbangaliev Postgres Professional: http://www.postgrespro.com Russian Postgres Company diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 1e535cf215..403d0193ce 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -2274,6 +2274,21 @@ ph + + + + text_startswith + +text_startswith(string, prefix) + + bool + +Returns true if string starts with prefix. + + text_startswith('alphabet', 'alph') + t + + @@ -3964,6 +3979,12 @@ cast(-44 as bit(12)) 11010100 ILIKE, respectively. All of these operators are PostgreSQL-specific. + + +There is also the prefix operator ^@ and corresponding +text_startswith function which covers cases when only +searching by beginning of the string is needed. + diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml index 51bb60c92a..00ef9a92b0 100644 --- a/doc/src/sgml/spgist.sgml +++ b/doc/src/sgml/spgist.sgml @@ -179,6 +179,7 @@ ~<~ ~>=~ ~>~ + ^@ diff --git a/src/backend/access/spgist/spgtextproc.c b/src/backend/access/spgist/spgtextproc.c index f156b2166e..5beb9e33a1 100644 --- a/src/backend/access/spgist/spgtextproc.c +++ b/src/backend/access/spgist/spgtextproc.c @@ -496,7 +496,7 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS) * well end with a partial multibyte character, so that applying * any encoding-sensitive test to it would be risky anyhow.) */ - if (strategy > 10) + if (strategy > 10 && strategy != RTPrefixStrategyNumber) { if (collate_is_c) strategy -= 10; @@ -526,6 +526,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS) if (r < 0) res = false; break; +case RTPrefixStrategyNumber: + if (r != 0) + res = false; + break; default: elog(ERROR, "unrecognized strategy number: %d", in->scankeys[j].sk_strategy); @@ -601,10 +605,21 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS) for (j = 0; j < in->nkeys; j++) { StrategyNumber strategy = in->scankeys[j].sk_strategy; - text *query = DatumGetTextPP(in->scankeys[j].sk_argument); - int queryLen = VARSIZE_ANY_EXHDR(query); + text *query; + int queryLen; int r; + /* for this strategy collation is not important */ + if (strategy == RTPrefixStrategyNumber) + { + res = DatumGetBool(DirectFunctionCall2(text_startswith, +out->leafValue, in->scankeys[j].sk_argument)); + goto next; + } + + query = DatumGetTextPP(in->scankeys[j].sk_argument); + queryLen = VARSIZE_ANY_EXHDR(query); + if (strategy > 10) { /* Collation-aware comparison */ @@ -655,6 +670,7 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS) break; } +next: if (!res) break;/* no need to consider remaining conditions */ } diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index fcc8323f62..be91082b56 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backen
Re: Prefix operator for text and spgist support
Hello, On Fri, Feb 02, 2018 at 06:03:27PM +0300, Ildus Kurbangaliev wrote: > Hi, > > Attached patch introduces prefix operator ^@ for text type. For 'a ^@ b' > it returns true if 'a' starts with 'b'. Also there is spgist index > support for this operator. > > It could be useful as an alternative for LIKE for 'something%' > templates. Or even used in LIKE queries instead of ~>=~ and ~<~ in the > future. But it would require new strategy for btree. I've looked at the patch. It is applied and tests pass. I have a couple comments: > + if (ptype == Pattern_Type_Prefix) > + { > + char*s = TextDatumGetCString(constval); > + prefix = string_to_const(s, vartype); > + pstatus = Pattern_Prefix_Partial; > + rest_selec = 1.0; /* all */ > + pfree(s); > + } > + else > + pstatus = pattern_fixed_prefix(patt, ptype, collation, > + > &prefix, &rest_selec); I think it is better to put Pattern_Type_Prefix processing into pattern_fixed_prefix() as another case entry. Secondly, it is worth to fix the documentation. At least here [1]. Maybe there are another places where documentation should be fixed. 1 - https://www.postgresql.org/docs/current/static/spgist-builtin-opclasses.html -- Arthur Zakirov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Prefix operator for text and spgist support
Hi, Attached patch introduces prefix operator ^@ for text type. For 'a ^@ b' it returns true if 'a' starts with 'b'. Also there is spgist index support for this operator. It could be useful as an alternative for LIKE for 'something%' templates. Or even used in LIKE queries instead of ~>=~ and ~<~ in the future. But it would require new strategy for btree. -- --- Ildus Kurbangaliev Postgres Professional: http://www.postgrespro.com Russian Postgres Company diff --git a/src/backend/access/spgist/spgtextproc.c b/src/backend/access/spgist/spgtextproc.c index f156b2166e..5beb9e33a1 100644 --- a/src/backend/access/spgist/spgtextproc.c +++ b/src/backend/access/spgist/spgtextproc.c @@ -496,7 +496,7 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS) * well end with a partial multibyte character, so that applying * any encoding-sensitive test to it would be risky anyhow.) */ - if (strategy > 10) + if (strategy > 10 && strategy != RTPrefixStrategyNumber) { if (collate_is_c) strategy -= 10; @@ -526,6 +526,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS) if (r < 0) res = false; break; +case RTPrefixStrategyNumber: + if (r != 0) + res = false; + break; default: elog(ERROR, "unrecognized strategy number: %d", in->scankeys[j].sk_strategy); @@ -601,10 +605,21 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS) for (j = 0; j < in->nkeys; j++) { StrategyNumber strategy = in->scankeys[j].sk_strategy; - text *query = DatumGetTextPP(in->scankeys[j].sk_argument); - int queryLen = VARSIZE_ANY_EXHDR(query); + text *query; + int queryLen; int r; + /* for this strategy collation is not important */ + if (strategy == RTPrefixStrategyNumber) + { + res = DatumGetBool(DirectFunctionCall2(text_startswith, +out->leafValue, in->scankeys[j].sk_argument)); + goto next; + } + + query = DatumGetTextPP(in->scankeys[j].sk_argument); + queryLen = VARSIZE_ANY_EXHDR(query); + if (strategy > 10) { /* Collation-aware comparison */ @@ -655,6 +670,7 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS) break; } +next: if (!res) break;/* no need to consider remaining conditions */ } diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index fcc8323f62..be91082b56 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -1315,8 +1315,18 @@ patternsel(PG_FUNCTION_ARGS, Pattern_Type ptype, bool negate) * right cache key, so that they can be re-used at runtime. */ patt = (Const *) other; - pstatus = pattern_fixed_prefix(patt, ptype, collation, - &prefix, &rest_selec); + + if (ptype == Pattern_Type_Prefix) + { + char *s = TextDatumGetCString(constval); + prefix = string_to_const(s, vartype); + pstatus = Pattern_Prefix_Partial; + rest_selec = 1.0; /* all */ + pfree(s); + } + else + pstatus = pattern_fixed_prefix(patt, ptype, collation, + &prefix, &rest_selec); /* * If necessary, coerce the prefix constant to the right type. @@ -1486,6 +1496,16 @@ likesel(PG_FUNCTION_ARGS) } /* + * prefixsel - selectivity of prefix operator + */ +Datum +prefixsel(PG_FUNCTION_ARGS) +{ + PG_RETURN_FLOAT8(patternsel(fcinfo, Pattern_Type_Prefix, false)); +} + +/* + * * iclikesel - Selectivity of ILIKE pattern match. */ Datum @@ -2904,6 +2924,15 @@ likejoinsel(PG_FUNCTION_ARGS) PG_RETURN_FLOAT8(patternjoinsel(fcinfo, Pattern_Type_Like, false)); } +/* + * prefixjoinsel - Join selectivity of prefix operator + */ +Datum +prefixjoinsel(PG_FUNCTION_ARGS) +{ + PG_RETURN_FLOAT8(patternjoinsel(fcinfo, Pattern_Type_Prefix, false)); +} + /* * iclikejoinsel - Join selectivity of ILIKE pattern match. */ diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index 304cb26691..050875e0bb 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -1762,6 +1762,34 @@ text_ge(PG_FUNCTION_ARGS) PG_RETURN_BOOL(result); } +Datum +text_startswith(PG_FUNCTION_ARGS) +{ + Datum arg1 = PG_GETARG_DATUM(0); + Datum arg2 = PG_GETARG_DATUM(1); + bool result; + Size len1, +len2; + + len1 = toast_raw_datum_size(arg1); + len2 = toast_raw_datum_size(arg2); + if (len2 > len1) + result = false; + else + { + text *targ1 = DatumGetTextPP(arg1); + text *targ2 = DatumGetTextPP(arg2); + + result = (memcmp(VARDATA_ANY(targ1), VARDATA_ANY(targ2), + len2 - VARHDRSZ) == 0); + + PG_FREE_IF_COPY(targ1, 0); + PG_FREE_IF_COPY(targ2, 1); + } + + PG_RETURN_BOOL(result); +} + Datum bttextcmp(PG_FUNCTION_ARGS) { diff --git a/src/include/access/stratnum.h b/src/include/access/stratnum.h index bddfac4c10..0db11a1117 100644 --- a/src/include/access/stratnum.h +++ b/src/include/access/stratnum.h @@ -68,8 +68,9 @@ typedef uint16 StrategyNumber; #define RTSubEqualStrategyNumber 25 /* for inet <<= */ #define RTSuperStrategyNumber 26 /* for inet << */ #define RTSuperEqualStr