Jonathan Ariel wrote:
Yes, it would be too much to do in real time, but it is a good idea though.
I don't know whether a vector of term frequencies is stored with the document.
If it is, I could search the index to get the subset of documents and then
take the term frequencies from there.
In that case I could change MoreLikeThis to receive a set of term
frequencies, instead of an IndexReader, and use that to do all the process.
That would probably not be too speedy.
Does anyone know whether a document stores the term frequencies for its fields?
When adding a field to a document you can specify whether a term vector
should be stored for easy retrieval, and in how much detail.
http://lucene.apache.org/java/2_3_1/api/org/apache/lucene/document/Field.html
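As a rough illustration of the idea of feeding term frequencies to the similarity step instead of an IndexReader, here is a minimal sketch in plain Java. It assumes each document in the subset has already been reduced to a term-to-frequency map, for example from a term vector stored with Field.TermVector.YES and read back through IndexReader.getTermFreqVector. The class SubsetTermScorer and its methods are hypothetical and not part of Lucene or MoreLikeThis, and the smoothed idf formula is only one common choice.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: scores terms using only the
// term frequencies of a pre-selected subset of documents.
final class SubsetTermScorer {

    /** Counts, for each term, how many documents in the subset contain it. */
    static Map<String, Integer> documentFrequencies(List<Map<String, Integer>> subset) {
        Map<String, Integer> df = new HashMap<>();
        for (Map<String, Integer> termFreqs : subset) {
            for (String term : termFreqs.keySet()) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    /**
     * Scores every term of the source document as tf * idf, where idf is
     * computed over the subset only (assumed form: ln(n / (df + 1)) + 1).
     */
    static Map<String, Double> score(Map<String, Integer> source,
                                     List<Map<String, Integer>> subset) {
        Map<String, Integer> df = documentFrequencies(subset);
        int n = subset.size();
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Integer> e : source.entrySet()) {
            int docFreq = df.getOrDefault(e.getKey(), 0);
            double idf = Math.log((double) n / (docFreq + 1)) + 1.0;
            scores.put(e.getKey(), e.getValue() * idf);
        }
        return scores;
    }
}
```

The highest-scoring terms would then be the candidates for the "more like this" query. Whether this is fast enough depends on how large the subset is, since every stored term vector in it has to be read.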
I really think you should consider one index per more-like-this filter.
karl