[EMAIL PROTECTED] skrev:
If you want something from an index it has to be IN the index. So, store a
summary field in each document and make sure that field is part of the
query.

And how could one create automatically such a summary?
Taking the first 2 lines of a document makes not always much sense.
How does google this?

Google don't summarize, they highlight parts that match the query. See previous reponses.

If you really want to summarize there are a number of more and less scientific ways to figure out what's important and what's not.

Very simple algorithmic solutions usually involve ranking top senstances by looking at distribution of terms in sentances, paragraphs and the whole document. I implemented something like this a couple of years back that worked fairly well.

Citeseer is a great source for papers on pretty much any IR related subject: <http://citeseer.ist.psu.edu/cs?cs=1&q=text+summarization>


   karl

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to