On Wed, May 12, 2021 at 12:19:37AM +0300, Alexander Korotkov wrote:
> This is not just about quotes. The original problem concerned quotes
> in websearch_to_tsquery() and the phrase operator in to_tsquery(), but
> the fix changes the output for all query operands containing
> discarded tokens.
>
> Could we try this?
>
> Make to_tsquery() and websearch_to_tsquery() produce stricter
> output for query parts containing discarded tokens. In particular,
> this makes to_tsquery() and websearch_to_tsquery() properly parse
> discarded tokens in phrase search operands and in quotes, respectively.
> > <para>
> > Certain discarded tokens, like underscore, caused the output
> > of these functions to produce incorrect tsquery output, e.g.,
> > websearch_to_tsquery('"pg_class pg"') used to output '( pg &
> > class ) <-> pg', but now outputs 'pg <-> class <-> pg'.
> > </para>
> > </listitem>
>
> This part looks good to me. I'd just suggest extending the example to
> cover to_tsquery() as well.
>
> Certain discarded tokens, like underscore, caused the output of these
> functions to produce incorrect tsquery output, e.g., both
> websearch_to_tsquery('"pg_class pg"') and to_tsquery('pg_class <->
> pg') used to output '( pg & class ) <-> pg', but now both output 'pg
> <-> class <-> pg'.
OK, I went with this:
<listitem>
<!--
Author: Alexander Korotkov <[email protected]>
2021-01-31 [0c4f355c6] Fix parsing of complex morphs to tsquery
-->
<para>
Fix to_tsquery() and websearch_to_tsquery() to properly parse
query text containing discarded tokens (Alexander Korotkov)
</para>
<para>
Certain discarded tokens, like underscore, caused the output of
these functions to produce incorrect tsquery output, e.g., both
websearch_to_tsquery('"pg_class pg"') and to_tsquery('pg_class
<-> pg') used to output '( pg & class ) <-> pg',
but now both output 'pg <-> class <-> pg'.
</para>
</listitem>
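
For anyone who wants to check the behavior described in this item, here is
a quick psql sketch of the two queries it mentions. Passing the 'simple'
configuration explicitly is my assumption, added only so the result does
not depend on default_text_search_config. Per the wording above, both
queries should now yield a chain of <-> operators rather than
'( pg & class ) <-> pg'.

```sql
-- Assumption: 'simple' is passed explicitly so the result does not
-- depend on the session's default_text_search_config.

-- Quoted phrase containing a discarded token (the underscore).
SELECT websearch_to_tsquery('simple', '"pg_class pg"');

-- Same operand used with the phrase operator in to_tsquery().
SELECT to_tsquery('simple', 'pg_class <-> pg');
```
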
> > <listitem>
> > <!--
> > Author: Alexander Korotkov <[email protected]>
> > 2021-05-03 [eb086056f] Make websearch_to_tsquery() parse text in
> > quotes as a si
> > -->
> >
> > <para>
> > Fix websearch_to_tsquery() to properly parse multiple adjacent
> > discarded tokens in quotes (Alexander Korotkov)
> > </para>
> >
> > <para>
> > Previously, quoted text that contained multiple adjacent discarded
> > tokens was treated as multiple tokens, causing incorrect tsquery
> > output, e.g., websearch_to_tsquery('"aaa: bbb"') used to output
> > 'aaa <2> bbb', but now outputs 'aaa <-> bbb'.
> > </para>
> > </listitem>
>
> This item looks good to me.
Good, thanks.
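
The second item can be checked the same way. This is only a sketch, and
the explicit 'simple' configuration is again my assumption rather than
part of the item. Per the item quoted above, the distance between the two
lexemes should now be <-> instead of <2>.

```sql
-- Assumption: 'simple' is passed explicitly so the result does not
-- depend on the session's default_text_search_config.

-- The colon and the space are adjacent discarded tokens inside quotes.
SELECT websearch_to_tsquery('simple', '"aaa: bbb"');
```
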
--
Bruce Momjian <[email protected]> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.