Well, I'm planning to take the term weights (say, in a matrix) and then use an adaptive learning system to transform them into new weights, in such a way that the index formed from them is optimized. It's just a test to see whether this hypothesis works or not.
--- On Thu, 4/9/09, Grant Ingersoll <gsing...@apache.org> wrote:

From: Grant Ingersoll <gsing...@apache.org>
Subject: Re: Vector space implementation
To: java-user@lucene.apache.org
Date: Thursday, April 9, 2009, 6:29 PM

Assuming you want to handle the vectors yourself, as opposed to relying on the fact that Lucene itself implements the VSM, you should index your documents with TermVector.YES. That will give you the term frequency on a per-document basis, but you will have to use the TermEnum to get the doc freq. All in all, this is not going to be very efficient for you, but you should be able to build up a matrix from it.

What is the problem you are trying to solve?

On Apr 9, 2009, at 2:33 AM, Andy wrote:

> Hello all,
>
> I'm trying to implement a vector space model using Lucene. I need to have a
> file (or in-memory structure) with the TF-IDF weight of each term in each
> document. (In fact, that is a matrix with documents represented as vectors,
> in which the elements of each vector are the TF weights ...)
>
> Please help me with this.
> Contact me via andykan1...@yahoo.com if you need any further info.
> Many thanks

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
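For what it's worth, here is a minimal, self-contained sketch of the matrix Grant describes: per-document term frequencies (what getTermFreqVector would give you from an index built with TermVector.YES) combined with per-term document frequencies (what TermEnum / IndexReader.docFreq would give you). It uses toy hard-coded counts instead of a real Lucene index, and assumes the classic tf × log(N/df) weighting; note Lucene's own DefaultSimilarity scores differently (sqrt(tf) and idf = 1 + log(N/(df+1))), so treat this only as an illustration of the matrix structure.

```java
import java.util.Arrays;

public class TfIdfMatrix {
    // Classic textbook TF-IDF weight: tf * log(N / df).
    // (Illustrative choice; Lucene's DefaultSimilarity uses a different formula.)
    static double tfidf(int tf, int df, int numDocs) {
        if (tf == 0 || df == 0) return 0.0;
        return tf * Math.log((double) numDocs / df);
    }

    public static void main(String[] args) {
        // Toy corpus: rows = documents, columns = terms.
        // With a real index, each row would come from IndexReader.getTermFreqVector.
        String[] terms = {"lucene", "vector", "index"};
        int[][] tf = {
            {2, 1, 0},   // doc 0
            {0, 3, 1},   // doc 1
            {1, 0, 1},   // doc 2
        };
        int numDocs = tf.length;

        // Document frequency of each term (how many docs contain it);
        // with a real index this is what TermEnum / IndexReader.docFreq supplies.
        int[] df = new int[terms.length];
        for (int t = 0; t < terms.length; t++)
            for (int d = 0; d < numDocs; d++)
                if (tf[d][t] > 0) df[t]++;

        // Build the weight matrix: each row is one document vector.
        double[][] weights = new double[numDocs][terms.length];
        for (int d = 0; d < numDocs; d++)
            for (int t = 0; t < terms.length; t++)
                weights[d][t] = tfidf(tf[d][t], df[t], numDocs);

        for (double[] row : weights)
            System.out.println(Arrays.toString(row));
    }
}
```

An adaptive re-weighting scheme like the one described above would then operate on the `weights` matrix directly, producing a transformed matrix before it is turned back into an index.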