Why do we need norms?

We don't need norms at all: every matched HeadlineWordEntry is already marked via HeadlineWordEntry->item! If item is NULL, then the word is not contained in the tsquery.
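
A minimal sketch of that check. SketchWord is a simplified, hypothetical stand-in for HeadlineWordEntry (the real struct has more fields), and count_matched is an illustrative helper, not a function from the PostgreSQL tree:

```c
#include <stddef.h>

/* Hypothetical stand-in for HeadlineWordEntry: only the field discussed here. */
typedef struct SketchWord
{
    const char *word;   /* token text */
    const void *item;   /* non-NULL when the word matched a query operand */
} SketchWord;

/* Count the words that matched the query by looking only at ->item. */
int
count_matched(const SketchWord *words, int nwords)
{
    int i;
    int n = 0;

    for (i = 0; i < nwords; i++)
    {
        if (words[i].item != NULL)
            n++;
    }
    return n;
}
```

No normalized form of the word is consulted; whether a word belongs to the query is decided by the pointer alone.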

hlCover does exactly what Cover in tsrank does, which is to find
the cover that contains the query. However, hlCover has to go through
words that do not match the query. Cover on the other hand operates on
position indexes for just the query words and so it should be faster.
A cover, by definition, is a minimal contiguous piece of text matched by the query. There may be several covers in a text, and hlCover will find all of them. Next, prsd_headline() (for now) tries to pick the best one. "Best" means: the cover contains many words from the query, has no fewer than MinWords and no more than MaxWords words, has no words shorter than ShortWord at its beginning or end, etc.
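
The "best" criteria above can be sketched roughly as follows. This is only an illustration under stated assumptions: CoverWord and score_cover are hypothetical names, a word counts as short when its length is at most short_word, and a cover that violates a limit is simply rejected, whereas prsd_headline() may instead adjust the cover's boundaries.

```c
#include <string.h>

/* Hypothetical word record: text plus a flag saying it matched the query. */
typedef struct CoverWord
{
    const char *word;
    int         matched;    /* 1 if the word is a query word, else 0 */
} CoverWord;

/*
 * Score the candidate cover words[p..q] (inclusive).
 * Returns the number of query words inside the cover, or -1 when the
 * cover breaks the MinWords / MaxWords / ShortWord constraints.
 */
int
score_cover(const CoverWord *words, int p, int q,
            int min_words, int max_words, int short_word)
{
    int len = q - p + 1;
    int hits = 0;
    int i;

    /* Length must lie within [min_words, max_words]. */
    if (len < min_words || len > max_words)
        return -1;

    /* Neither edge of the cover may be a short word (assumed: len <= short_word). */
    if ((int) strlen(words[p].word) <= short_word ||
        (int) strlen(words[q].word) <= short_word)
        return -1;

    /* More query words inside the cover means a better cover. */
    for (i = p; i <= q; i++)
    {
        if (words[i].matched)
            hits++;
    }
    return hits;
}
```

A caller would score every cover and keep the one with the highest result.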

The main reason why I would like it to be fast is that I want to
generate all covers for a given query. Then choose covers with smallest
hlCover generates all covers.

Let me know what you think on this patch and I will update the patch to
respect other options like MinWords and ShortWord.

As I understand it, you would very much like to call the Cover() function instead of hlCover(). By design they should be identical, but they accept different representations of the document. So the best way is to generalize them: develop a new function that can be called with some kind of callback and/or opaque structure, so that it can be used in both rank and headline.
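
One possible shape for such a generalized function, as a rough sketch only. All names here (match_fn, find_cover, int_doc_match) are hypothetical, the query is assumed to be a plain AND of at most 32 operands, and start is assumed to be non-negative. The document is passed as an opaque pointer and is only inspected through the callback, so rank and headline could each supply their own representation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Callback: does position pos of the opaque document match operand op? */
typedef bool (*match_fn) (const void *doc, int pos, int op);

/* Bitmask of the operands matched at one position. */
static unsigned
operands_at(const void *doc, int pos, int noperands, match_fn matches)
{
    unsigned mask = 0;
    int      op;

    for (op = 0; op < noperands; op++)
    {
        if (matches(doc, pos, op))
            mask |= 1u << op;
    }
    return mask;
}

/*
 * Find the first cover at or after start: the span [*p, *q] that contains
 * every operand and cannot be shortened from either side.
 * Returns false when no cover exists.
 */
bool
find_cover(const void *doc, int doclen, int noperands,
           match_fn matches, int start, int *p, int *q)
{
    unsigned all;
    unsigned seen = 0;
    int      i;

    if (noperands <= 0 || noperands > 32)
        return false;
    all = (noperands == 32) ? ~0u : ((1u << noperands) - 1u);

    /* Extend to the right until every operand has been seen. */
    for (i = start; i < doclen; i++)
    {
        seen |= operands_at(doc, i, noperands, matches);
        if (seen == all)
            break;
    }
    if (i >= doclen)
        return false;
    *q = i;

    /* Walk back from q to find the closest left edge that still covers all. */
    seen = 0;
    for (i = *q; i >= start; i--)
    {
        seen |= operands_at(doc, i, noperands, matches);
        if (seen == all)
            break;
    }
    *p = i;
    return true;
}

/* Example callback: the document is an int array holding an operand index or -1. */
bool
int_doc_match(const void *doc, int pos, int op)
{
    return ((const int *) doc)[pos] == op;
}
```

Calling find_cover again with start = *p + 1 yields the next cover, so a caller can enumerate all of them. A headline caller would pass its word array with a callback that tests each word, while a rank caller would pass its position data.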


NumFragments < 2:
I wanted people to use the new headline marker if they specify
NumFragments >= 1. If they do not specify the NumFragments or put it to
OK, but if you unify cover generation and NumFragments == 1, then the results of the old and new algorithms should be the same...


On another note, I found that make_tsvector crashes if it receives a
ParsedText with curwords = 0. Specifically uniqueWORD returns curwords
as 1 even when it gets 0 words. I am not sure if this is the desired
behavior.
Everywhere it is called, there is a check before the call to make_tsvector.

--
Teodor Sigaev                                   E-mail: [EMAIL PROTECTED]
                                                   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
