Re: [GENERAL] ts_headline

Richard Huxton Fri, 22 Feb 2008 02:18:22 -0800

Stephen Davies wrote:

Unfortunately, my link to the box with the test database is down due to lack of maintenance by our local telco (Telstra), but I think that I also missed the optional config arg to ts_headline.
The lack of link also means that I cannot confirm your findings, but your logic looks good.

Looks like ALTER DATABASE <dbname> SET default_text_search_config = 'english' is what you need.
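
As a rough sketch of the two options mentioned so far (the database name "mydb" and the sample sentence are placeholders, not taken from your setup), you can either set the default configuration for the database or pass the configuration to ts_headline on each call:

```sql
-- Option 1: set the default text search configuration for one database.
-- "mydb" is a placeholder name; the setting applies to new sessions.
ALTER DATABASE mydb SET default_text_search_config = 'pg_catalog.english';

-- Option 2: name the configuration explicitly in the call, so the result
-- does not depend on the session default.
SELECT ts_headline('english',
                   'The quick brown foxes jumped over the lazy dogs.',
                   to_tsquery('english', 'fox & dog'));
```

Passing the configuration explicitly is the safer choice if the same queries may run against databases with different defaults.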

It begs the question, however, as to why ts_headline needs to reparse the raw text.

It needs to line up tsvector lexemes with actual characters in the text. The tsvector is missing punctuation and any stopwords (the, it, a), and its words are stemmed (if your dictionary does that).

Also, it's looking for a short span of words that provide the best match. That might not be a complete match of course, and is different to how you'd normally look to use a tsvector.
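
A quick way to see what gets lost is to look at a tsvector directly. This is only an illustration with a made-up sentence, assuming the stock 'english' configuration:

```sql
-- With the stock 'english' configuration the result holds only stemmed
-- lexemes and their word positions. Stopwords such as "the" and "on",
-- the capitalisation and the full stop are all gone, so the original
-- sentence cannot be rebuilt from the tsvector alone.
SELECT to_tsvector('english', 'The cats sat on the mat.');
```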

At least in my case, I am using a trigger to parse the combination of Title and Abstract to a tsvector field in the table row (as suggested in 12.2.2 and 12.4.3 in the doco) so that the tsvector is already available to ts_headline.
If ts_headline had the ability to use that pre-parsed tsvector, my problem would never have arisen - and the performance of ts_headline would be improved.

Maybe. It would still have to parse the text to some degree though, just to get the original words & punctuation into the headline.
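
For reference, the arrangement being discussed might look something like the sketch below. The table "papers" and its columns "title", "abstract" and "tsv" are hypothetical names, not your actual schema. The stored tsvector does the matching, while ts_headline still works from the raw text:

```sql
-- Hypothetical table "papers" with text columns "title" and "abstract".
ALTER TABLE papers ADD COLUMN tsv tsvector;

-- Keep the tsvector column up to date with the built-in trigger function.
CREATE TRIGGER papers_tsv_update
    BEFORE INSERT OR UPDATE ON papers
    FOR EACH ROW
    EXECUTE PROCEDURE tsvector_update_trigger(tsv, 'pg_catalog.english', title, abstract);

-- Match against the pre-parsed tsvector, but build the headline from the
-- raw abstract, naming the configuration explicitly in both places.
SELECT title,
       ts_headline('english', abstract, q) AS headline
FROM papers, to_tsquery('english', 'search & term') AS q
WHERE tsv @@ q;
```

Since the headline is only computed for rows that pass the WHERE clause, adding a LIMIT keeps the reparsing cost down.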


--
  Richard Huxton
  Archonet Ltd
