On Oct 12, 2011, at 9:26 AM, beneo_7 wrote:

> 
> 
> // sequenceFile -> vector
> mahout seq2sparse -i ../temp/input -o ../temp/vector/ -chunk 100 -wt TFIDF -ow

I think you need the --namedVector option to get/keep named vectors.  You might 
try using the SequenceFile dumper (seqdumper) to examine the output of this.
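
A rough sketch of both suggestions, reusing the paths from your seq2sparse command. The tfidf-vectors/part-r-00000 file name and the seqdumper -s flag are assumptions on my part (the input flag has differed between Mahout releases, so check `mahout seqdumper --help`), and the `run` helper is only a hypothetical wrapper that prints the command when mahout is not on the PATH:

```shell
#!/bin/sh
# Paths are taken from the seq2sparse command quoted in this thread.
INPUT=../temp/input
OUTPUT=../temp/vector

# Assumption: seq2sparse writes its TF-IDF vectors under tfidf-vectors/ and
# the first output file is named part-r-00000; adjust to what is on disk.
VECTOR_FILE="$OUTPUT/tfidf-vectors/part-r-00000"

# Hypothetical helper: run the command if mahout is on the PATH,
# otherwise just print what would have been run.
run() {
  if command -v mahout >/dev/null 2>&1; then
    "$@"
  else
    echo "would run: $*"
  fi
}

# Same vectorization as before, plus --namedVector so the vectors keep a name.
run mahout seq2sparse -i "$INPUT" -o "$OUTPUT" -chunk 100 -wt TFIDF -ow --namedVector

# Dump the vectors to check whether each one now carries a name.
# Assumption: -s (--seqFile) is the input flag in this release; some releases
# use -i instead.
run mahout seqdumper -s "$VECTOR_FILE"
```

If the dump still shows unnamed vectors, the names are being lost before clustering rather than during it.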

(Also, in the future, this question is best asked on u...@mahout.apache.org)

> 
> 
> // vector -> canopy
> mahout canopy -i /home/hduser/temp/vector/vector -o /home/hduser/temp/canopy/ -dm org.apache.mahout.common.distance.CosineDistanceMeasure -t1 0.032 -t2 0.008 -ow
> 
> 
> 
> 
> // canopy -> kmeans
> KMeansDriver.run(
>     conf,                        // configuration
>     vectorPath,                  // the directory pathname for input points
>     canopyClusterPath,           // the directory pathname for initial & computed clusters
>     kmeansPath,                  // the directory pathname for output points
>     new CosineDistanceMeasure(), // cos
>     0.1d,                        // the convergence delta value
>     10,                          // the maximum number of iterations
>     true,                        // run clustering
>     false                        // execute map reduce
> );
> 
> 
> 
> 
> No exception is thrown. Thanks in advance.
> 
> 
> 
> 
> At 2011-10-12 20:27:19,"Grant Ingersoll" <gsing...@apache.org> wrote:
>> Can you share your actual commands?
>> 
>> On Oct 12, 2011, at 6:21 AM, beneo_7 wrote:
>> 
>>> Hi all,
>>>   When I create vectors from a Lucene index, Mahout uses NamedVector. But
>>> what about creating vectors from a SequenceFile?
>>> 
>>>   Currently I create vectors from text with the following steps:
>>> 
>>>   step #1
>>>       text -> SequenceFile
>>>           key = text, value = text
>>>           I do not use seqdirectory, because I want to put the String key
>>> into the SequenceFile, not the doc ID
>>> 
>>>   step #2
>>>       seq2sparse using TFIDF
>>>           for the output I use tfidf-vectors/
>>> 
>>>   step #3 #4
>>>       canopy -> kmeans
>>> 
>>>   step #5
>>>       clusterDump
>>> 
>>>       I found that the vector is an
>>> org.apache.mahout.math.RandomAccessSparseVector. Where can I find the
>>> SequenceFile key?
>>> 
>>>   Thanks in advance
>> 
>> --------------------------------------------
>> Grant Ingersoll
>> http://www.lucidimagination.com
>> Lucene Eurocon 2011: http://www.lucene-eurocon.com
>> 

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com
Lucene Eurocon 2011: http://www.lucene-eurocon.com
