On Oct 12, 2011, at 9:26 AM, beneo_7 wrote:

> // sequenceFile -> vector
> mahout seq2sparse -i ../temp/input -o ../temp/vector/ -chunk 100 -wt TFIDF -ow

I think you need the --namedVector option to get/keep named vectors. You might try using the SequenceFile dumper (seqdumper) to examine the output of this.

(Also, in the future, this question is best asked on u...@mahout.apache.org)

> // vector -> canopy
> mahout canopy -i /home/hduser/temp/vector/vector -o /home/hduser/temp/canopy/ -dm org.apache.mahout.common.distance.CosineDistanceMeasure -t1 0.032 -t2 0.008 -ow
>
> // canopy -> kmeans
> KMeansDriver.run(
>     conf,                        // configuration
>     vectorPath,                  // the directory pathname for input points
>     canopyClusterPath,           // the directory pathname for initial & computed clusters
>     kmeansPath,                  // the directory pathname for output points
>     new CosineDistanceMeasure(), // cos
>     0.1d,                        // the convergence delta value
>     10,                          // the maximum number of iterations
>     true,                        // run clustering
>     false                        // execute map reduce
> );
>
> no exception thrown and thx in advance
>
> At 2011-10-12 20:27:19,"Grant Ingersoll" <gsing...@apache.org> wrote:
>> Can you share your actual commands?
>>
>> On Oct 12, 2011, at 6:21 AM, beneo_7 wrote:
>>
>>> hi all
>>> i create vector using lucene index, and the mahout will use NamedVector,
>>> but how about create vector from sequenceFile???
>>>
>>> now, i create vector from text with the follow steps:
>>>
>>> step #1
>>> text -> sequenceFile
>>> key = text, value = text
>>> i do not use seqdirectory, cuz i want to put the String key into
>>> the sequenceFile, not the doc Id
>>>
>>> step #2
>>> seq2sparse using TFIDF
>>> the output i use tfidf-vectors/
>>>
>>> step #3 #4
>>> canopy -> kmeans
>>>
>>> step #4
>>> clusterDump
>>>
>>> i found the vector is
>>> org.apache.mahout.math.RandomAccessSparseVector, and where i can found the
>>> sequenceFile key??
>>>
>>> thx in advance
>>
>> --------------------------------------------
>> Grant Ingersoll
>> http://www.lucidimagination.com
>> Lucene Eurocon 2011: http://www.lucene-eurocon.com
>>

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com
Lucene Eurocon 2011: http://www.lucene-eurocon.com