Well, I'm planning to take the term weights (assume they're in a matrix) and then use an adaptive learning system to transform them into new weights, in such a way that the index formed from them is optimized. It's just a test to see whether this hypothesis works or not.


--- On Thu, 4/9/09, Grant Ingersoll <gsing...@apache.org> wrote:

From: Grant Ingersoll <gsing...@apache.org>
Subject: Re: Vector space implementation
To: java-user@lucene.apache.org
Date: Thursday, April 9, 2009, 6:29 PM

Assuming you want to handle the vectors yourself, as opposed to relying on the 
fact that Lucene itself implements the VSM, you should index your documents 
with TermVector.YES.  That will give you the term frequency on a per-document 
basis, but you will have to use the TermEnum to get the document frequency.  
All in all, this is not going to be very efficient, but you should be able to 
build up a matrix from it.
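Once you have the per-document term frequencies (from the TermVector) and the document frequencies (from the TermEnum), assembling the matrix is just arithmetic. A minimal self-contained sketch in plain Java — the toy corpus and whitespace tokenizer are hypothetical stand-ins for what Lucene would give you, and the weight formula used here is the textbook tf * log(N/df), not Lucene's exact Similarity scoring:

```java
import java.util.*;

public class TfIdfMatrix {
    public static void main(String[] args) {
        // Toy corpus standing in for what Lucene's term vectors would return per doc
        String[] docs = {
            "lucene index search",
            "lucene vector space model",
            "vector space search"
        };
        int n = docs.length;

        // Per-document term frequencies (what TermVector.YES gives you per doc)
        List<Map<String, Integer>> tf = new ArrayList<Map<String, Integer>>();
        // Document frequency per term (what you would pull from the TermEnum)
        Map<String, Integer> df = new TreeMap<String, Integer>();
        for (String doc : docs) {
            Map<String, Integer> counts = new HashMap<String, Integer>();
            for (String term : doc.split("\\s+")) {
                Integer c = counts.get(term);
                counts.put(term, c == null ? 1 : c + 1);
            }
            for (String term : counts.keySet()) {
                Integer c = df.get(term);
                df.put(term, c == null ? 1 : c + 1);
            }
            tf.add(counts);
        }

        // Build the doc x term TF-IDF matrix: weight = tf * log(N / df)
        List<String> vocab = new ArrayList<String>(df.keySet());
        double[][] matrix = new double[n][vocab.size()];
        for (int d = 0; d < n; d++) {
            for (int t = 0; t < vocab.size(); t++) {
                String term = vocab.get(t);
                Integer f = tf.get(d).get(term);
                int freq = f == null ? 0 : f;
                matrix[d][t] = freq * Math.log((double) n / df.get(term));
            }
        }

        // Each row is one document's vector over the sorted vocabulary
        for (int d = 0; d < n; d++) {
            System.out.println("doc" + d + " -> " + Arrays.toString(matrix[d]));
        }
    }
}
```

With Lucene in the loop you would replace the counting loops with IndexReader calls per document and term, but the matrix-building step stays the same.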

What is the problem you are trying to solve?



On Apr 9, 2009, at 2:33 AM, Andy wrote:

> Hello all,
> 
> I'm trying to implement a vector space model using Lucene. I need a 
> file (or an in-memory structure) with the TF/IDF weight of each term in each 
> document (in effect a matrix in which the documents are represented as 
> vectors, and the elements of each vector are the term weights ...)
> 
> Please help me with this.
> Contact me if you need any further info via andykan1...@yahoo.com
> Many thanks
> 
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using 
Solr/Lucene:
http://www.lucidimagination.com/search


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

