Hi Junaid,

Have a look at the SparseVectorsFromSequenceFiles class, which already does this when combined with SequenceFilesFromDirectory, which converts text files to SequenceFiles.
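
To make the two-step pipeline above concrete (read a directory of text files, then turn each document into a weighted term vector), here is a minimal sketch in plain Python. It is not Mahout code and does not call Mahout: the function names `read_directory`, `tokenize`, and `tf_idf` are hypothetical, the whitespace tokenizer is a stand-in for a real analyzer, and the weighting used is the textbook tf * log(N / df), which may differ in detail from what SparseVectorsFromSequenceFiles computes. In the Mahout launcher script the two classes are, as far as I know, exposed as the `seqdirectory` and `seq2sparse` commands.

```python
import math
from collections import Counter
from pathlib import Path


def read_directory(path):
    """Read every regular file in `path` into a {file name: text} dict.

    Roughly the role SequenceFilesFromDirectory plays in Mahout, minus the
    SequenceFile format (assumption: files are UTF-8 text).
    """
    return {
        p.name: p.read_text(encoding="utf-8")
        for p in sorted(Path(path).iterdir())
        if p.is_file()
    }


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for a real analyzer."""
    return text.lower().split()


def tf_idf(docs):
    """Return {doc name: {term: weight}} using tf * log(N / df).

    tf is the raw count of the term in the document, df is the number of
    documents containing the term, and N is the number of documents.
    """
    term_counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(term_counts)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    return {
        name: {
            term: tf * math.log(n_docs / doc_freq[term])
            for term, tf in counts.items()
        }
        for name, counts in term_counts.items()
    }
```

With a directory of text files, `tf_idf(read_directory("docs"))` (where `docs` is a placeholder path) gives one sparse term-to-weight mapping per file. A term that appears in every document gets weight 0 under this formula, which is the usual reason for preferring TF-IDF over raw counts.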
-Grant

On Jan 4, 2012, at 8:30 AM, Junaid Surve wrote:

> Hi
>
> I want to develop a Prototype to calculate the TF IDF from the documents
> present in a directory.
>
> Can you please help me with the Steps to go about it using Apache Mahout?
> Thank you.
>
> --
> Regards
> Junaid

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com