On Feb 1, 2007, at 7:13 PM, Brian Whitman wrote:

I'm looking for a way to search by a field's internal TF vector representation.

MoreLikeThis does not seem to be what I want: it constructs a text query from the top-scoring TF-IDF terms. I want to query by the TF vector directly, bypassing the tokens.
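
To make the distinction concrete, here is a rough sketch of the "pick the top TF-IDF terms and use them as a text query" idea. It is a toy model in plain Python, not MoreLikeThis's actual code: the function name `top_tfidf_terms`, the sample corpus, and the plain log(N/df) IDF weighting are all assumptions made for illustration.

```python
import math
from collections import Counter

def top_tfidf_terms(doc_tokens, corpus, k=3):
    """Return the k terms of doc_tokens with the highest TF-IDF weight."""
    n_docs = len(corpus)
    tf = Counter(doc_tokens)

    def idf(term):
        # Document frequency: number of corpus documents containing the term.
        df = sum(1 for d in corpus if term in d)
        # Assumed weighting: plain log(N / df), chosen only for simplicity.
        return math.log(n_docs / df) if df else 0.0

    scored = {t: tf[t] * idf(t) for t in tf}
    # Highest weight first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda t: (-scored[t], t))[:k]

# Hypothetical three-document corpus, already tokenized.
corpus = [
    ["lucene", "search", "index"],
    ["lucene", "vector", "vector", "cosine"],
    ["search", "engine", "index"],
]

# The selected terms would then be issued as an ordinary term query.
query_terms = top_tfidf_terms(corpus[1], corpus, k=2)
```

Only the selected terms survive into the query, so the rest of the document's TF vector is discarded, which is the loss the message above is trying to avoid.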


After looking through the archives and Lucene's code, my assumption is:

Lucene does not use the entire TF vector space in any search. There is no tree search or other log-n search mechanism built into Lucene. TF cosine distance is used for scoring once the inverted index has reduced the search space to the documents containing the query terms, and from there it is a foreach-document operation.
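
The assumption above can be modeled in a few lines. This is a minimal sketch of the described behavior, not Lucene's implementation: the sample documents, the `inverted` map, and the `cosine` and `search` helpers are hypothetical, and raw term counts stand in for whatever weighting Lucene actually applies.

```python
import math
from collections import Counter, defaultdict

# Hypothetical tokenized documents.
docs = {
    "d1": ["lucene", "search", "index"],
    "d2": ["vector", "vector", "cosine"],
    "d3": ["search", "engine"],
}

# Per-document TF vectors and a term -> document-ids inverted index.
tf_vectors = {doc_id: Counter(tokens) for doc_id, tokens in docs.items()}
inverted = defaultdict(set)
for doc_id, tf in tf_vectors.items():
    for term in tf:
        inverted[term].add(doc_id)

def cosine(a, b):
    """Cosine similarity between two sparse TF vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query_tokens):
    q = Counter(query_tokens)
    # Step 1: the inverted index narrows the search space to documents
    # that contain at least one query term.
    candidates = set()
    for term in q:
        candidates |= inverted.get(term, set())
    # Step 2: linear pass over the candidates only, scoring each one.
    scores = [(doc_id, cosine(q, tf_vectors[doc_id])) for doc_id in candidates]
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))

results = search(["search", "index"])
```

In this model `d2` is never scored because it shares no term with the query, and nothing resembling a tree or nearest-neighbor structure over the vectors is consulted.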

If this is incorrect, please let me know.

-Brian


