Hi all,
    When I create vectors from a Lucene index, Mahout uses NamedVector.
    But what about creating vectors from a SequenceFile?

    Right now I create vectors from text with the following steps:

    step #1
        text -> SequenceFile
            key = Text, value = Text
            I do not use seqdirectory, because I want to put my own String
            key into the SequenceFile, not the doc id
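
For reference, step #1 looks roughly like the sketch below. It assumes the Hadoop client libraries are on the classpath and uses Hadoop's SequenceFile.Writer with Text keys and values. The class name TextToSequenceFile, the sample keys and texts, and the output path docs.seq are made up for illustration and are not my real data.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Hypothetical helper: writes (String key -> document text) pairs
// into a SequenceFile<Text, Text> that seq2sparse can read.
public class TextToSequenceFile {

    public static void write(Map<String, String> docs, String output) throws IOException {
        Configuration conf = new Configuration();
        Path path = new Path(output);
        // Resolves to the local file system unless conf points at HDFS.
        FileSystem fs = path.getFileSystem(conf);

        SequenceFile.Writer writer =
                SequenceFile.createWriter(fs, conf, path, Text.class, Text.class);
        try {
            for (Map.Entry<String, String> doc : docs.entrySet()) {
                // The key is my own String identifier, not a generated doc id.
                writer.append(new Text(doc.getKey()), new Text(doc.getValue()));
            }
        } finally {
            writer.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample data, assumed for illustration only.
        Map<String, String> docs = new LinkedHashMap<String, String>();
        docs.put("my-key-1", "the quick brown fox");
        docs.put("my-key-2", "jumps over the lazy dog");
        write(docs, "docs.seq");
    }
}
```

The resulting file is what I pass to seq2sparse in step #2.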

    step #2
        seq2sparse with TFIDF weighting
            as output I use tfidf-vectors/

    step #3 #4
        canopy -> kmeans

    step #5
        clusterdump
       
        In the output I found that the vector is an
        org.apache.mahout.math.RandomAccessSparseVector (not a NamedVector),
        so where can I find the SequenceFile key?

    Thanks in advance.
