Thank you Suneel, I appreciate the pointer. I am using Mahout 0.8 but I was 
following the wiki and not the examples/*.
I've gotten CVB to run successfully but now vectordump is giving me trouble. 
The call:

  bin/mahout vectordump \
    -i /opt/mahout/cvb-output-topic \
    -o /opt/mahout/output \
    -p true \
    -c /opt/mahout/output/vectors.csv \
    -dt sequencefile

The error returned is either this (when I specify -c):

  Exception in thread "main" java.io.FileNotFoundException: /opt/mahout/output/ (No such file or directory)

or this (when no -c param is specified):

  Exception in thread "main" java.io.FileNotFoundException: /opt/mahout/output (Permission denied)

This is odd for two reasons. First, that's an HDFS directory, and the 
utilities have been creating directories and writing to that location just fine 
through the prior steps. Second, the output directory does exist in HDFS. 
I've tried various combinations (referencing a directory that does/doesn't 
exist, appending an actual file name to the path, and others) with no success. 
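
In case it helps anyone reproduce this, here is a minimal sketch of how the state of the output path can be compared on the local filesystem and on HDFS. The helper names describe_local and describe_hdfs are made up for illustration and are not part of Mahout. The sketch assumes the hadoop CLI is on the PATH, and it makes no claim about how vectordump resolves -o. The java.io.FileNotFoundException wording hints that the path might be opened as a local file, but that is only a guess on my part.

```shell
#!/bin/sh
# Diagnostic sketch: report how a path looks locally and, if possible, on HDFS.
# describe_local / describe_hdfs are hypothetical helper names for illustration.

# Classify a path on the local filesystem.
describe_local() {
  path=$1
  if [ -d "$path" ]; then
    echo "local: directory"
  elif [ -e "$path" ]; then
    echo "local: file"
  else
    echo "local: missing"
  fi
}

# Check whether a path exists on HDFS; skipped if the hadoop CLI is absent.
describe_hdfs() {
  path=$1
  if ! command -v hadoop >/dev/null 2>&1; then
    echo "hdfs: hadoop CLI not found"
  elif hadoop fs -test -e "$path"; then
    echo "hdfs: exists"
  else
    echo "hdfs: missing"
  fi
}

# Default to the output path from the vectordump call above.
OUT=${1:-/opt/mahout/output}
describe_local "$OUT"
describe_hdfs "$OUT"
```

If the two reports disagree (for example, the path exists on HDFS but is missing locally), that would at least narrow down which filesystem the failing call is looking at.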
Any insight?
Cheers!
Chris

> Date: Wed, 7 Aug 2013 01:58:52 -0700
> From: suneel_mar...@yahoo.com
> Subject: Re: Using CVB; LdaTopics confusion
> To: user@mahout.apache.org
> 
> If you are using Mahout 0.8, I suggest you look at the CVB invocation in 
> examples/bin/cluster-reuters.sh as a reference for the sequence of steps (and 
> the other command-line options for each step).
> 
> ldatopics has been deprecated (in 0.8) and removed completely (in 0.9).
> 
> Anyway, the input vectors directory in your case would be 
> '/opt/mahout/cvb-output/topic_dist.out', but I would refrain from using 
> ldatopics, as it has been deprecated.
> 
> ________________________________
>  From: Christopher Schindler <ideab...@hotmail.com>
> To: "user@mahout.apache.org" <user@mahout.apache.org> 
> Sent: Wednesday, August 7, 2013 2:34 AM
> Subject: Using CVB; LdaTopics confusion
>  
> 
> Hi all,
> A noob question I'm sure but I'm stuck. I'm using CVB to cluster a text index 
> of articles. 
> Here's the CVB call:
> bin/mahout cvb \
>   -i /opt/mahout/lucene-sparse-vectors-cvb/matrix \
>   -dict /opt/mahout/cvb-output/dict.file-* \
>   -o /opt/mahout/cvb-output/topic_terms.out \
>   -dt /opt/mahout/cvb-output/topic_dist.out \
>   -k 200 \
>   -mt /opt/mahout/output/iterations/ \
>   -x 20 -a .25 -ow
> I'm trying to access the topics using ldatopics per 
> https://cwiki.apache.org/confluence/display/MAHOUT/Latent+Dirichlet+Allocation.
>  
> My latest combination was:
> bin/mahout ldatopics -i opt/mahout/cvb-output/ -d /opt/mahout/cvb-output/dict.file-*
> However, it returns an error stating:
> ERROR driver.MahoutDriver: : Try the new Collapsed Variation Bayes LDA, try bin/mahout cvb or bin/mahout cvb0_local
> The spec is:
> bin/mahout ldatopics \
>   -i <input vectors directory> \
>   -d <input dictionary file> \
> What is the vectors directory supposed to be? Many thanks in advance.
> Cheers!
> Chris 
                                          
