[ 
https://issues.apache.org/jira/browse/MAHOUT-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13679076#comment-13679076
 ] 

Grant Ingersoll commented on MAHOUT-1247:
-----------------------------------------

After you run cluster-reuters.sh, you can run:
{code}
bin/mahout org.apache.mahout.vectorizer.DictionaryVectorizer -i /tmp/mahout-work-grantingersoll/reuters-out-seqdir-sparse-kmeans/tokenized-documents -o ./dicVec
{code}

Make sure HADOOP_HOME is set, and replace /tmp/mahout-work-grantingersoll with 
your own work directory.
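
A minimal sketch of the same steps as a script, assuming the work directory follows the /tmp/mahout-work-<user> pattern from the path above. WORK_DIR, OUT_DIR, and the function names are illustrative only and are not part of Mahout:

```shell
#!/bin/sh
# Assumption: the work directory follows the /tmp/mahout-work-<user> pattern
# seen in the comment; override WORK_DIR if your run used another location.

# Print the tokenized-documents path under the given work directory.
tokenized_dir() {
  echo "$1/reuters-out-seqdir-sparse-kmeans/tokenized-documents"
}

# Succeed only if HADOOP_HOME is set to a non-empty value.
hadoop_home_set() {
  [ -n "${HADOOP_HOME:-}" ]
}

# Run the vectorizer. WORK_DIR and OUT_DIR are hypothetical override variables.
run_dictionary_vectorizer() {
  work_dir="${WORK_DIR:-/tmp/mahout-work-${USER:-$(id -un)}}"
  out_dir="${OUT_DIR:-./dicVec}"
  if ! hadoop_home_set; then
    echo "HADOOP_HOME is not set" >&2
    return 1
  fi
  bin/mahout org.apache.mahout.vectorizer.DictionaryVectorizer \
    -i "$(tokenized_dir "$work_dir")" -o "$out_dir"
}

# Call from the Mahout install directory after cluster-reuters.sh has finished:
# run_dictionary_vectorizer
```

The HADOOP_HOME check fails early with a message instead of letting bin/mahout start without a Hadoop installation configured.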
                
> cluster-reuters doesn't work on Hadoop
> --------------------------------------
>
>                 Key: MAHOUT-1247
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1247
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>             Fix For: 0.8
>
>
> At least two issues:
> 1. MAHOUT-992 messed up the Distributed Cache stuff somehow
> 2. The ExtractReuters data is not being moved to HDFS.

