Re: MoreLikeThis over a subset of documents

Karl Wettin Wed, 23 Apr 2008 06:57:29 -0700

Jonathan Ariel wrote:

> Yes, it will be too much to do in real time, but it is a good idea though.
>
> I don't know if a vector of term frequencies is stored with the document.
> If it is, I could search the index to get the subset of documents and then
> take the term frequencies from there.
> In that case I could change MoreLikeThis to receive a set of term
> frequencies, instead of an IndexReader, and use that for the whole process.


That would probably not be too speedy.


> Does anyone know whether a document stores the term frequencies for its fields?

When adding a field to a document, you can specify whether a term vector is stored and how detailed it should be, for easy retrieval.


http://lucene.apache.org/java/2_3_1/api/org/apache/lucene/document/Field.html
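
If you do go the term-frequency route, the aggregation step could look roughly like the sketch below. It is plain Java and does not touch Lucene. The parallel arrays stand in for what a stored term vector hands back (TermFreqVector.getTerms() and getTermFrequencies()), and the class and method names are made up for illustration. How the summed frequencies would be fed into MoreLikeThis is left open.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: sums term frequencies
// over the documents matched by the subset query.
public class SubsetTermFrequencies {

    /**
     * terms[i] and freqs[i] are parallel arrays for document i,
     * the same shape a stored term vector exposes per field.
     */
    public static Map<String, Integer> sum(String[][] terms, int[][] freqs) {
        Map<String, Integer> total = new HashMap<String, Integer>();
        for (int doc = 0; doc < terms.length; doc++) {
            for (int t = 0; t < terms[doc].length; t++) {
                Integer seen = total.get(terms[doc][t]);
                // Start at 0 the first time a term is seen, then accumulate.
                total.put(terms[doc][t], (seen == null ? 0 : seen) + freqs[doc][t]);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Two toy documents standing in for the subset returned by a search.
        String[][] terms = { { "index", "lucene" }, { "lucene", "vector" } };
        int[][] freqs = { { 2, 1 }, { 3, 4 } };
        System.out.println(sum(terms, freqs));
    }
}
```

Note that this visits every term of every matching document, which is why I would not expect it to be fast on a large subset.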


I really think you should consider one index per more-like-this filter.


       karl

