Christopher -

I had the same confusion with vectordump output on a Hadoop cluster. The
explanation is that vectordump is not trying to write a file to your HDFS:
the -o path is on the local filesystem. When I just named a file (it did
not want to create a local directory), the file wound up in the /bin
directory I was working out of.
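
A minimal sketch of the idea, not a tested recipe: it assumes -o is resolved
on the local filesystem (as described above) and that -i should name a single
part file, as Suneel suggests below. The helper check_local_output and the
file name ./topics.txt are made up for illustration, part-xxxx is the
placeholder from the thread, and the script only prints the vectordump
command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: report whether a LOCAL path looks usable as the
# vectordump -o target. It never touches HDFS.
check_local_output() {
  out=$1
  parent=$(dirname "$out")
  if [ -d "$out" ]; then
    echo "is-a-directory"       # -o should name a file, not a directory
  elif [ ! -d "$parent" ]; then
    echo "missing-parent"       # the local parent directory does not exist
  elif [ ! -w "$parent" ]; then
    echo "parent-not-writable"  # likely to surface as "Permission denied"
  else
    echo "ok"
  fi
}

OUT="./topics.txt"  # hypothetical local file name

if [ "$(check_local_output "$OUT")" = "ok" ]; then
  # -i: one part file in HDFS (part-xxxx is a placeholder, not a real name)
  # -o: a plain file on the local filesystem
  echo bin/mahout vectordump \
    -i /opt/mahout/cvb-output-topic/part-xxxx \
    -o "$OUT"
fi
```

If the check prints anything other than "ok", the two FileNotFoundException
variants quoted below are probably what you would see from the local side.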

Best,
Liz


On Thu, Aug 8, 2013 at 9:47 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:

> Seems like you are specifying a directory as input to vectordump.
> It should be a 'file' something like
> /opt/mahout/cvb-output-topic/part-xxxx in your case.
>
> Give that a try.
>
>
>
>
> ________________________________
>  From: Christopher Schindler <ideab...@hotmail.com>
> To: "user@mahout.apache.org" <user@mahout.apache.org>
> Sent: Thursday, August 8, 2013 8:35 PM
> Subject: RE: Using CVB; LdaTopics confusion
>
>
> Thank you Suneel, I appreciate the pointer. I am using Mahout 0.8 but I
> was following the wiki and not the examples/*.
> I've gotten CVB to run successfully but now vectordump is giving me
> trouble. The call:
>
>   bin/mahout vectordump -i /opt/mahout/cvb-output-topic \
>     -o /opt/mahout/output -p true \
>     -c /opt/mahout/output/vectors.csv -dt sequencefile
>
> The error returned is either:
>
>   Exception in thread "main" java.io.FileNotFoundException:
>   /opt/mahout/output/ (No such file or directory)
>   [variant triggered if I specify -c]
>
> OR
>
>   Exception in thread "main" java.io.FileNotFoundException:
>   /opt/mahout/output (Permission denied)
>   [no -c param specified]
>
> Which is odd for several reasons. First, that's an HDFS directory and the
> utilities have been writing and creating directories in that location just
> fine through the prior steps. Second, the output directory does exist in
> HDFS. I've tried various combinations (referencing a directory that
> does/doesn't exist, appending an actual file to the path and others) with
> no success.
> Any insight?
> Cheers!
> Chris
>
> > Date: Wed, 7 Aug 2013 01:58:52 -0700
> > From: suneel_mar...@yahoo.com
> > Subject: Re: Using CVB; LdaTopics confusion
> > To: user@mahout.apache.org
> >
> > If you are using Mahout 0.8, I suggest that you look at the CVB
> > invocation in examples/bin/cluster-reuters.sh as a reference for the
> > sequence of steps (and the other command line options for each step).
> >
> > ldatopics has been deprecated (in 0.8) and removed completely (in 0.9).
> >
> > Anyway, the input vectors directory in your case would be
> > '/opt/mahout/cvb-output/topic_dist.out', but I would avoid using
> > ldatopics as it's been deprecated.
> >
> >
> >
> >
> >
> > ________________________________
> >  From: Christopher Schindler <ideab...@hotmail.com>
> > To: "user@mahout.apache.org" <user@mahout.apache.org>
> > Sent: Wednesday, August 7, 2013 2:34 AM
> > Subject: Using CVB; LdaTopics confusion
> >
> >
> > Hi all,
> > A noob question I'm sure but I'm stuck. I'm using CVB to cluster a text
> index of articles.
> > Here's the CVB call:
> >
> >   bin/mahout cvb \
> >     -i /opt/mahout/lucene-sparse-vectors-cvb/matrix \
> >     -dict /opt/mahout/cvb-output/dict.file-* \
> >     -o /opt/mahout/cvb-output/topic_terms.out \
> >     -dt /opt/mahout/cvb-output/topic_dist.out \
> >     -k 200 \
> >     -mt /opt/mahout/output/iterations/ \
> >     -x 20 -a .25 -ow
> >
> > I'm trying to access the topics using ldatopics per
> https://cwiki.apache.org/confluence/display/MAHOUT/Latent+Dirichlet+Allocation
> .
> > My latest combination was:
> >
> >   bin/mahout ldatopics -i opt/mahout/cvb-output/ \
> >     -d /opt/mahout/cvb-output/dict.file-*
> >
> > However, it returns an error stating:
> >
> >   ERROR driver.MahoutDriver: : Try the new Collapsed Variation Bayes LDA,
> >   try bin/mahout cvb or bin/mahout cvb0_local
> >
> > The spec is:
> >
> >   bin/mahout ldatopics \
> >     -i <input vectors directory> \
> >     -d <input dictionary file> \
> >
> > What is the vectors directory supposed to be? Many thanks in advance.
> > Cheers!
> > Chris
>
