On Feb 1, 2007, at 7:13 PM, Brian Whitman wrote:
I'm looking for a way to search by a field's internal TF vector
representation.
MoreLikeThis does not seem to be what I want: it constructs a text
query based on the top-scoring TF-IDF terms. I want to query by TF
vector directly, bypassing the tokens.
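To make "query by TF vector directly" concrete, here is a toy sketch in plain Python. It is not Lucene code; the names (cosine, search_by_tf_vector, the sample docs) are made up for illustration, and the TF vectors are assumed to be simple term -> count maps. It ranks every document by cosine similarity to a query vector with a linear scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse TF vectors (term -> count)."""
    dot = sum(tf * b.get(term, 0) for term, tf in a.items())
    norm_a = math.sqrt(sum(tf * tf for tf in a.values()))
    norm_b = math.sqrt(sum(tf * tf for tf in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search_by_tf_vector(query_tf, doc_tfs, top_n=10):
    """Rank every document against the query vector (linear scan)."""
    scored = [(cosine(query_tf, tf), doc_id) for doc_id, tf in doc_tfs.items()]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:top_n]]

# Hypothetical sample data: doc id -> TF vector.
docs = {
    "d1": {"lucene": 3, "index": 1},
    "d2": {"lucene": 1, "vector": 2},
    "d3": {"music": 4},
}
hits = search_by_tf_vector({"lucene": 2, "vector": 1}, docs)
```

The query here is itself a TF vector, so no text query is ever built from tokens.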
After looking through the archives and Lucene's code, my
assumption is:
Lucene does not use the entire TF vector space in any search. There
is no tree search or other O(log n) search mechanism built into
Lucene. TF cosine distance is used only for scoring: the inverted
index first reduces the search space to the documents that contain
the query terms, and scoring is then a per-document loop over those
candidates.
If this is incorrect, please let me know.
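Here is a toy sketch of the flow I am assuming, in plain Python rather than Lucene's API. The names build_index and search are hypothetical, and the scoring is plain TF cosine, which is a simplification of whatever Lucene's Similarity really computes. The inverted index restricts the candidates to documents sharing at least one query term, and only those get a score.

```python
import math
from collections import defaultdict

def build_index(doc_tfs):
    """Build term -> [(doc_id, tf)] postings plus per-document vector norms."""
    postings = defaultdict(list)
    norms = {}
    for doc_id, tf in doc_tfs.items():
        for term, count in tf.items():
            postings[term].append((doc_id, count))
        norms[doc_id] = math.sqrt(sum(c * c for c in tf.values()))
    return postings, norms

def search(query_tf, postings, norms):
    """Score only documents that share at least one term with the query."""
    dots = defaultdict(float)
    for term, q_count in query_tf.items():
        # Walk the postings list for this term; other documents are never touched.
        for doc_id, d_count in postings.get(term, []):
            dots[doc_id] += q_count * d_count
    q_norm = math.sqrt(sum(c * c for c in query_tf.values()))
    # Per-candidate loop: normalize the accumulated dot products into cosines.
    scores = {d: dot / (q_norm * norms[d]) for d, dot in dots.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical sample data: doc id -> TF vector.
docs = {
    "d1": {"lucene": 3, "index": 1},
    "d2": {"lucene": 1, "vector": 2},
    "d3": {"music": 4},
}
postings, norms = build_index(docs)
results = search({"lucene": 2, "vector": 1}, postings, norms)
```

In this sketch "d3" is never scored at all, because it shares no term with the query. The cost is linear in the length of the matching postings lists, not logarithmic in the collection size.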
-Brian