Re: Get the total term frequency vector of a specific field from the hit results

karl wettin Tue, 10 Apr 2007 08:21:29 -0700


On 10 Apr 2007, at 16:58, Sengly Heng wrote:

I wanted to do it this way as well, but I am a bit worried about computational
time, as I have many documents and each document is rather large.

I am looking for more solutions.

We don't really know what your problem is. Explaining that, rather than the solution you have thought of, might yield a couple of alternative solutions. Perhaps something could be precalculated and stored in the documents. Perhaps feature selection (reduction) of the terms might do the trick for you. And so on.

Let me pull some questions out of nowhere that might help: How slow is it, and how fast did you expect it to be? How many documents do your queries normally yield? Can you limit the evaluation to the top n documents?
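
To make the "top n documents" idea concrete, here is a rough sketch in plain Java, not Lucene code. Each hit's term frequency vector is modelled as a Map<String, Integer>, and the names TopHitTermTotals and sumTopN are made up for illustration. In Lucene 2.x the per-document vector would typically come from IndexReader.getTermFreqVector(docNumber, field), assuming the field was indexed with term vectors.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopHitTermTotals {

    // Sums the term frequencies of the first n hits only.
    // hitVectors is assumed to be in rank order (best hit first);
    // each map stands in for one document's term frequency vector.
    public static Map<String, Integer> sumTopN(List<Map<String, Integer>> hitVectors, int n) {
        Map<String, Integer> totals = new HashMap<>();
        int limit = Math.min(n, hitVectors.size());
        for (int i = 0; i < limit; i++) {
            for (Map.Entry<String, Integer> e : hitVectors.get(i).entrySet()) {
                // Add this document's frequency to the running total for the term.
                totals.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Hypothetical vectors for three hits of some query.
        List<Map<String, Integer>> hits = new ArrayList<>();
        hits.add(Map.of("lucene", 2, "index", 1));
        hits.add(Map.of("lucene", 1, "vector", 4));
        hits.add(Map.of("vector", 5));

        // Only the two best hits contribute to the totals.
        System.out.println(sumTopN(hits, 2));
    }
}
```

The cost grows with n and with the number of distinct terms per document, not with the total number of hits, which is the point of capping the evaluation.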

Please do contribute if you have any. Your help is highly appreciated.

As Lucene is primarily an inverted index, the document vector space model is not available in any fashion other than the term frequency vectors, or by building the vectors from scratch by enumerating the whole index. The latter is of course horribly slow in most cases.
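
To illustrate why building the vectors from scratch is expensive, here is a toy sketch in plain Java, again not the Lucene API. The inverted index is modelled as a map from term to postings (document id to frequency), and the names InvertPostings and documentVectors are hypothetical. Recovering the document vectors means visiting every posting of every term, no matter how few documents you actually care about.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class InvertPostings {

    // Rebuilds per-document term frequency vectors from an inverted index.
    // postings maps term -> (document id -> frequency of the term in that document).
    // Every posting in the whole index is visited exactly once.
    public static Map<Integer, Map<String, Integer>> documentVectors(
            Map<String, Map<Integer, Integer>> postings) {
        Map<Integer, Map<String, Integer>> vectors = new TreeMap<>();
        for (Map.Entry<String, Map<Integer, Integer>> termEntry : postings.entrySet()) {
            String term = termEntry.getKey();
            for (Map.Entry<Integer, Integer> posting : termEntry.getValue().entrySet()) {
                // Create the document's vector on first sight, then record the term.
                vectors.computeIfAbsent(posting.getKey(), id -> new TreeMap<>())
                       .put(term, posting.getValue());
            }
        }
        return vectors;
    }

    public static void main(String[] args) {
        // Hypothetical index with three terms spread over three documents.
        Map<String, Map<Integer, Integer>> postings = new HashMap<>();
        postings.put("lucene", Map.of(0, 2, 1, 1));
        postings.put("index", Map.of(0, 1));
        postings.put("vector", Map.of(1, 4, 2, 5));

        System.out.println(documentVectors(postings));
    }
}
```

With stored term frequency vectors the per-document lookup is direct instead, at the price of a larger index.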


--
karl
