On Wed, May 12, 2021 at 12:19:37AM +0300, Alexander Korotkov wrote:
> This relates not just to quotes. Original problem relates to quotes
> in websearch_to_tsquery() and phrase operator in to_tsquery(). But
> the solution changes output for all query operands containing
> discarded tokens.
>
> Could we try this?
>
> Make to_tsquery() and websearch_to_tsquery() produce more strict
> output for query parts containing discarded tokens. In particular,
> this makes to_tsquery() and websearch_to_tsquery() properly parse the
> discarded tokens in phrase search operands and quotes correspondingly.
>
> > <para>
> > Certain discarded tokens, like underscore, caused the output
> > of these functions to produce incorrect tsquery output, e.g.,
> > websearch_to_tsquery('"pg_class pg"') used to output '( pg &
> > class ) <-> pg', but now outputs 'pg <-> class <-> pg'.
> > </para>
> > </listitem>
>
> This part looks good to me. I'd just suggest to extend the example to
> to_tsquery() as well.
>
> Certain discarded tokens, like underscore, caused the output of these
> functions to produce incorrect tsquery output, e.g., both
> websearch_to_tsquery('"pg_class pg"') and to_tsquery('pg_class <->
> pg') used to output '( pg & class ) <-> pg', but now both output 'pg
> <-> class <-> pg'.

OK, I went with this:

	<listitem>
	<!--
	Author: Alexander Korotkov <akorot...@postgresql.org>
	2021-01-31 [0c4f355c6] Fix parsing of complex morphs to tsquery
	-->

	<para>
	Fix to_tsquery() and websearch_to_tsquery() to properly parse
	query text containing discarded tokens (Alexander Korotkov)
	</para>

	<para>
	Certain discarded tokens, like underscore, caused the output of
	these functions to produce incorrect tsquery output, e.g., both
	websearch_to_tsquery('"pg_class pg"') and to_tsquery('pg_class
	<-> pg') used to output '( pg & class ) <-> pg', but now both
	output 'pg <-> class <-> pg'.
	</para>
	</listitem>

> > <listitem>
> > <!--
> > Author: Alexander Korotkov <akorot...@postgresql.org>
> > 2021-05-03 [eb086056f] Make websearch_to_tsquery() parse text in
> > quotes as a si
> > -->
> >
> > <para>
> > Fix websearch_to_tsquery() to properly parse multiple adjacent
> > discarded tokens in quotes (Alexander Korotkov)
> > </para>
> >
> > <para>
> > Previously, quoted text that contained multiple adjacent discarded
> > tokens were treated as multiple tokens, causing incorrect tsquery
> > output, e.g., websearch_to_tsquery('"aaa: bbb"') used to output
> > 'aaa <2> bbb', but now outputs 'aaa <-> bbb'.
> > </para>
> > </listitem>
>
> This item looks good to me.

Good, thanks.
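
For anyone who wants to check the examples in the two items above, they can be run as-is in psql. This is only a sketch: it assumes a server that already contains both commits, and the results noted in the comments are the ones quoted in the item text, not freshly captured output. The exact quoting and stemming of the printed lexemes will depend on default_text_search_config.

```sql
-- Item 1: discarded tokens (here the underscore) inside a phrase operand.
-- The item text says both calls used to give '( pg & class ) <-> pg'
-- and now give 'pg <-> class <-> pg'.
SELECT websearch_to_tsquery('"pg_class pg"');
SELECT to_tsquery('pg_class <-> pg');

-- Item 2: multiple adjacent discarded tokens (colon plus space) in quotes.
-- The item text says this used to give 'aaa <2> bbb'
-- and now gives 'aaa <-> bbb'.
SELECT websearch_to_tsquery('"aaa: bbb"');
```

Running the same three statements on a server without the fixes should show the older forms quoted above, which makes the before/after difference easy to compare.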

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.