On Tuesday, October 20, 2015 6:05 AM, Geoff Winkless <[email protected]> wrote:
> On 20 October 2015 at 11:35, Tim van der Linden <[email protected]> wrote:
>> Of course, I can simply go ahead and create my own synonym
>> dictionary with a jargon specific synonym file to feed it. However,
>> most of the synonyms are comprised out of more than a single word.
>
> Does the Thesaurus dictionary not do what you want?
>
> http://www.postgresql.org/docs/current/static/textsearch-dictionaries.html#TEXTSEARCH-THESAURUS

+1

I had a very similar need for legal terms (e.g., "power of attorney")
and the thesaurus fit that need exactly.

I don't know whether you'll run into the other need I had that
required some special handling for full text search with legal
documents: things like dates, case numbers, and statute cites were
not handled well by default. What I did there was to pick those out
with regular expression searches, put them into a space-separated
string, cast that to tsvector, assign a higher weight to such key
elements, and concatenate that tsvector with the one generated from
the standard text parser and dictionaries.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
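
For illustration, here is a minimal sketch of the regular-expression approach described in the message above. It is not the code actually used. The three patterns, the function names, the weight 'A', the 'english' configuration, and the psycopg-style `%(name)s` placeholders are all assumptions made for the example. One further assumption: PostgreSQL stores weights on lexeme positions, so the sketch numbers the extracted terms to give `setweight` something to label.

```python
import re

# Hypothetical patterns, for illustration only. Real documents will need
# patterns tuned to the date, case-number, and citation formats they use.
KEY_PATTERNS = [
    re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),   # dates such as 2015-10-20
    re.compile(r"\b\d{4}[A-Z]{2}\d{6}\b"),  # case numbers such as 2015CF000123
    re.compile(r"\b\d+\.\d+\(\d+\)"),       # statute cites such as 939.50(3)
]


def quote_lexeme(term: str) -> str:
    """Quote a term for tsvector input, doubling backslashes and quotes."""
    escaped = term.replace("\\", "\\\\").replace("'", "''")
    return f"'{escaped}'"


def key_terms_tsvector_text(text: str) -> str:
    """Return the key terms as a space-separated string castable to tsvector.

    Each distinct term gets a position (1, 2, 3, ...). The positions are an
    assumption of this sketch so that a weight can be attached later.
    """
    terms = []
    for pattern in KEY_PATTERNS:
        for match in pattern.finditer(text):
            term = match.group(0)
            if term not in terms:  # keep the first occurrence only
                terms.append(term)
    return " ".join(
        f"{quote_lexeme(term)}:{position}"
        for position, term in enumerate(terms, start=1)
    )


# Assumed query shape: weight the key terms, then concatenate them with the
# tsvector produced by the standard parser and dictionaries.
COMBINED_TSVECTOR_SQL = """
SELECT setweight(%(key_terms)s::tsvector, 'A')
       || to_tsvector('english', %(body)s) AS document_tsv
"""

if __name__ == "__main__":
    sample = "Filed 2015-10-20 in case 2015CF000123 under 939.50(3)."
    print(key_terms_tsvector_text(sample))
```

With a driver such as psycopg, the string returned by `key_terms_tsvector_text` would be passed as the `key_terms` parameter and the full document text as `body`. Queries for these terms would have to use the same lexeme spelling, since the extracted terms bypass the normal dictionaries.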
