On 10 Apr 2007, at 16:58, Sengly Heng wrote:

I wanted to do it this way as well, but I am a bit worried about the
computational time, as I have many documents and each document is fairly large.

I am looking for more solutions.

We don't really know what your problem is. Explaining that, rather than the solution you have thought of, might yield a couple of alternative solutions. Perhaps something could be precalculated and stored in the documents. Perhaps feature selection (reduction) of the terms might do the trick for you. And so on.

Let me pull some questions out of nowhere that might help: How slow is it, and how fast did you expect it to be? How many documents do your queries normally yield? Can you limit the evaluation to the top n documents?

Please do contribute if you have any. Your help is highly appreciated.

Because Lucene is primarily an inverted index, the document vector space model is only available through the term frequency vectors, or by building the vectors from scratch by enumerating the whole index. The latter is of course horribly slow in most cases.
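
To make the difference between those two options concrete, here is a rough toy sketch in plain Python. It is not the Lucene API; the function names and the sample documents are made up for illustration only. It shows that an inverted index maps terms to documents, so getting one document's vector back means visiting every term, whereas a vector stored at indexing time is a single lookup.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency} (toy inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def vector_by_enumeration(index, doc_id):
    """Rebuild one document's term vector by scanning every term in the index.

    The cost grows with the number of distinct terms in the whole index,
    not with the size of the document.
    """
    return {
        term: postings[doc_id]
        for term, postings in index.items()
        if doc_id in postings
    }


def build_stored_vectors(docs):
    """Precompute a term frequency vector per document at indexing time."""
    vectors = {}
    for doc_id, text in docs.items():
        vector = {}
        for term in text.lower().split():
            vector[term] = vector.get(term, 0) + 1
        vectors[doc_id] = vector
    return vectors


# Hypothetical sample data, for illustration only.
docs = {
    1: "lucene is an inverted index",
    2: "an index maps terms to documents",
    3: "term vectors store terms per document",
}

index = build_inverted_index(docs)
stored = build_stored_vectors(docs)

# Both routes give the same vector; only the amount of work differs.
print(vector_by_enumeration(index, 1) == stored[1])
```

With three tiny documents the difference is invisible, but the enumeration route touches every distinct term in the index for each document you ask about, which is why it gets slow on a large index.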

--
karl
