A simple Vector Space Model and TFIDF usage

Amir Hossein Jadidinejad Mon, 29 Jun 2009 12:10:41 -0700

Hi,
This is my first experiment with Lucene, and I would appreciate some help.
I'm going to index a set of documents and create a feature vector for each of 
them. Each vector should contain all the terms that occur in the document, 
weighted using TF-IDF.
After that I want to compute the cosine similarity between all documents and 
produce a doc-doc similarity matrix. My document set is large and it's 
important to have a scalable implementation.
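
To make the goal concrete, here is a minimal sketch of the computation I have in mind. It is plain Python rather than Lucene, and the function names, the raw-count tf, and the log(N/df) idf are my own assumptions for illustration, not Lucene's scoring formula.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns one sparse {term: weight} dict per doc."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        # Assumed weighting: raw term count times log(N / df).
        vec = {t: count * math.log(n / df[t]) for t, count in tf.items()}
        # L2-normalise so that cosine similarity reduces to a dot product.
        norm = math.sqrt(sum(w * w for w in vec.values()))
        if norm > 0:
            vec = {t: w / norm for t, w in vec.items()}
        vectors.append(vec)
    return vectors


def cosine(u, v):
    """Dot product of two L2-normalised sparse vectors."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def similarity_matrix(vectors):
    """Symmetric doc-doc cosine similarity matrix as a list of lists."""
    n = len(vectors)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sim[i][j] = sim[j][i] = cosine(vectors[i], vectors[j])
    return sim
```

With this weighting, a term that occurs in every document gets weight 0. The dense matrix needs O(N^2) comparisons and O(N^2) memory, which is exactly the part I am worried about for a large collection.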
Could you please give me some guidelines or a to-do list?
Thank you and kind regards.
