Re: [HACKERS] [GENERAL] Fragments in tsearch2 headline

Teodor Sigaev Mon, 30 Jun 2008 12:06:09 -0700

1. Respects ShortWord and MinWords
2. Uses hlCover instead of Cover
3. Does not store norm (or lexeme) for headline marking
4. Removes ts_rank.h
5. Previously it counted even NONWORDTOKEN tokens in the headline. Now it
only counts actual words and excludes spaces, etc.
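
Point 5 can be illustrated with a minimal sketch. It is written in Python rather than the patch's C, and the token representation (a list of (text, is_word) pairs) and the names count_words and truncate_to_max_words are assumptions made for illustration only, not the structures or functions the patch uses.

```python
from typing import List, Tuple

# Hypothetical token representation: (text, is_word).
# The real headline code works on parser token structures in C.
Token = Tuple[str, bool]


def count_words(tokens: List[Token]) -> int:
    """Count only word tokens; spaces and punctuation are not counted."""
    return sum(1 for _text, is_word in tokens if is_word)


def truncate_to_max_words(tokens: List[Token], max_words: int) -> List[Token]:
    """Keep tokens until max_words word tokens have been taken.

    Non-word tokens between kept words are retained but never counted.
    """
    kept: List[Token] = []
    words = 0
    for text, is_word in tokens:
        if is_word:
            if words == max_words:
                break
            words += 1
        kept.append((text, is_word))
    return kept
```

With this accounting, a limit on the number of words is measured in actual words, so whitespace and punctuation no longer use up the headline's word budget.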


I have also changed NumFragments option to MaxFragments as there may not
be enough covers to display NumFragments.


Nice. But it would be good to resolve the following issues:

1) The patch contains mistakes; I didn't investigate or carefully read it. Get http://www.sai.msu.su/~megera/postgres/fts/apod.dump.gz and load it into a db.


Queries

# select ts_headline(body, plainto_tsquery('black hole'), 'MaxFragments=1') from apod where to_tsvector(body) @@ plainto_tsquery('black hole');

and

# select ts_headline(body, plainto_tsquery('black hole'), 'MaxFragments=1') from apod;


crash postgresql :(

2) Please include documentation and regression tests in your patch.


Another change that I was thinking:

Right now if cover size > max_words then I just cut the trailing words.
Instead I was thinking that we should split the cover into more
fragments such that each fragment contains a few query words. Then each
fragment will not contain all query words but will show more occurrences
of query words in the headline. I would like to know your opinion on
this.


Agreed.
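
One possible reading of the splitting idea above, as a rough sketch only. It is in Python rather than the patch's C, the function name split_cover is hypothetical, and it matches query words by plain lowercase comparison instead of lexeme normalization. It also ignores ShortWord and MinWords, which a real implementation would have to respect.

```python
from typing import List, Set


def split_cover(words: List[str], query: Set[str], max_words: int) -> List[List[str]]:
    """Split an oversized cover into fragments of at most max_words words.

    Each fragment starts at a query word that the previous fragment did
    not already include, so every fragment has at least one query word.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    fragments: List[List[str]] = []
    i = 0
    while i < len(words):
        if words[i].lower() in query:
            fragments.append(words[i:i + max_words])
            i += max_words  # skip past the words this fragment used
        else:
            i += 1
    return fragments
```

For a cover such as "a black star near a very distant hole" with query words {black, hole} and max_words = 3, this yields the fragments "black star near" and "hole" instead of a single cover with its tail cut off. No single fragment contains all the query words, but both occurrences are shown.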

--
Teodor Sigaev                                   E-mail: [EMAIL PROTECTED]
                                                   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
