On Sat, Nov 23, 2019 at 10:42 AM Christoph Gößmann <m...@goessmann.io> wrote:
> Hi Jeff,
>
> You're right about that point. Let me redefine. I would like to drop all
> tokens which neither are the stemmed or unstemmed version of a known word.
> Would there be the possibility of putting a wordlist as a filter ahead of
> the stemming? Or do you know about a good English lexeme list that could be
> used to filter after stemming?
>

I think what you describe is the opposite of what snowball was designed to
do. You want an ispell-based dictionary instead.

PostgreSQL doesn't ship with real ispell dictionaries, so you have to
retrieve the files yourself and install them into $SHAREDIR/tsearch_data as
described in the docs for
https://www.postgresql.org/docs/12/textsearch-dictionaries.html#TEXTSEARCH-ISPELL-DICTIONARY

Cheers,

Jeff
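
P.S. A rough sketch of how this could look once the files are in place. It
assumes you have installed en_us.dict and en_us.affix into
$SHAREDIR/tsearch_data; the file base name en_us, the dictionary name
english_hunspell, and the configuration name known_words_only are only
placeholders, so adjust them to whatever you actually install. Because the
ispell dictionary is the only one mapped to the word token types here, any
word it does not recognize should be discarded rather than passed on to a
stemmer.

```sql
-- Assumption: en_us.dict and en_us.affix exist in $SHAREDIR/tsearch_data.
CREATE TEXT SEARCH DICTIONARY english_hunspell (
    TEMPLATE  = ispell,
    DictFile  = en_us,
    AffFile   = en_us,
    StopWords = english
);

-- Hypothetical configuration name; start from the built-in english config.
CREATE TEXT SEARCH CONFIGURATION known_words_only (COPY = pg_catalog.english);

-- Map the word-like token types to the ispell dictionary alone, with no
-- snowball fallback, so unrecognized words produce no lexeme at all.
ALTER TEXT SEARCH CONFIGURATION known_words_only
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH english_hunspell;

-- Try it out: tokens the dictionary does not know should be absent.
SELECT to_tsvector('known_words_only', 'The cats were sitting on the xyzzyq');
```

Note that other token types copied from the english configuration (numbers,
URLs, email addresses and so on) keep their existing mappings, so you may
want to drop those mappings as well if you only want dictionary words.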