Does the dictionary have a Key 'zero'?

On Monday, January 13, 2014 7:37 PM, Yang <teddyyyy...@gmail.com> wrote:

Suneel:

Thanks for the reply (sorry, my gmail somehow put the reply into the archive, so it didn't show up in my inbox).

The dictionary seems OK, or at least it is not empty.

-sh-3.2$ ls -l sparse/
total 464
drwxr-xr-x 2 yyang15 gid-yyang15  32768 Jan 8 15:17 df-count
-rw-r--r-- 1 yyang15 gid-yyang15 203369 Jan 8 15:17 dictionary.file-0
-rw-r--r-- 1 yyang15 gid-yyang15 186893 Jan 8 15:17 frequency.file-0
drwxr-xr-x 2 yyang15 gid-yyang15   4096 Jan 8 15:17 tf-vectors
drwxr-xr-x 2 yyang15 gid-yyang15   4096 Jan 8 15:17 tokenized-documents
drwxr-xr-x 2 yyang15 gid-yyang15  32768 Jan 8 15:18 wordcount

-sh-3.2$ bin/mahout seqdumper -i MAHOUT/sparse/dictionary.file-0
Key: containing: Value: 9229
Key: craft: Value: 9230
Key: e33494add68d3d0138c45300f0aa361a: Value: 9231
Key: elizabeth: Value: 9232
Key: extra: Value: 9233
Key: joe: Value: 9234
Key: juice: Value: 9235
Key: mario's: Value: 9236
Key: musical: Value: 9237
Key: nicest: Value: 9238
Key: petit_ermitage.html: Value: 9239
Key: rebeccabarker: Value: 9240
Key: spa's: Value: 9241
Key: steam: Value: 9242
Key: stylesheet: Value: 9243
Key: tim46679: Value: 9244
Key: topnav.search_where: Value: 9245
Key: www.expedia.com: Value: 9246
Key: xv: Value: 9247
Count: 9248
14/01/13 17:35:39 INFO driver.MahoutDriver: Program took 54565 ms (Minutes: 0.9094166666666667)

On Thu, Jan 9, 2014 at 4:12 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:

The issue seems to be with your dictionary. What is the length of the dictionary?
>
>On Thursday, January 9, 2014 6:49 PM, Yang <teddyyyy...@gmail.com> wrote:
>
>I am trying to run the lda (now called cvb) function. I followed the steps
>listed in many online sources. The final step after getting the lda result,
>to show the result in a human-readable form, is this vectordump, but
>it gave me the following exception.
>
>I also listed the first few bytes of my cvb output file; it looks to be at
>least not empty.
>
>Thanks!
>yang
>
>sh-3.2$ bin/mahout vectordump -i MAHOUT/cvb/part-m-00000 --dictionary sparse/dictionary.file-0 --dictionaryType sequencefile --vectorSize 10 -o cvbout
>Running on hadoop, using /apache/hadoop/bin/hadoop and HADOOP_CONF_DIR=
>MAHOUT-JOB: /home/yyang15/mahout/mahout-distribution-0.8/mahout-examples-0.8-job.jar
>14/01/08 16:37:03 INFO common.AbstractJob: Command line arguments:
>{--dictionary=[sparse/dictionary.file-0], --dictionaryType=[sequencefile],
>--endPhase=[2147483647], --input=[MAHOUT/cvb/part-m-00000],
>--output=[cvbout], --startPhase=[0], --tempDir=[temp], --vectorSize=[10]}
>14/01/08 16:37:04 INFO vectors.VectorDumper: Sort? false
>Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
>        at org.apache.mahout.utils.vectors.VectorHelper$2.apply(VectorHelper.java:132)
>        at org.apache.mahout.utils.vectors.VectorHelper$2.apply(VectorHelper.java:129)
>        at com.google.common.collect.Iterators$8.next(Iterators.java:812)
>        at java.util.AbstractCollection.toArray(AbstractCollection.java:124)
>        at java.util.ArrayList.<init>(ArrayList.java:131)
>        at com.google.common.collect.Lists.newArrayList(Lists.java:119)
>        at org.apache.mahout.utils.vectors.VectorHelper.toWeightedTerms(VectorHelper.java:128)
>        at org.apache.mahout.utils.vectors.VectorHelper.vectorToJson(VectorHelper.java:147)
>        at org.apache.mahout.utils.vectors.VectorDumper.run(VectorDumper.java:240)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>        at org.apache.mahout.utils.vectors.VectorDumper.main(VectorDumper.java:260)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:194)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
>
>
>SEQ^F
>org.apache.hadoop.io.IntWritable%org.apache.mahout.math.VectorWritable^@^@^@^@^@^@%<D3>NX<97><A9><FD><BB>a;H<98>KȪ<82>^@^A!^G^@^@^@^D^@^@^@^@^C<A0>H=<B6>g<9C>O
><EF>^?<D8>=ˍ<8A><F1>-<AC>8=ɪA+<E0><F1>^R=<AC>-^Ck<BE>^Cm=<F4>p-<E0>ul<D3>=<BA><FE>H7T<F6>^B=<D7>E<EC><95>RH<A7>=<BB>U<DE>^B^Y"<F1>=<D9>WV^F"^P^Q=հ`^?8^N<F1>=<D6>b^YJ
><91><A0>$=<BB><94><F1><C6>^S?c=<B1><BA><88>^G<EB>i^P=<9B>^N>R<92><D2>q=<BA>^H,<9E>^_<B3><91>=<CE><ED>
>i<C1>^FA=<F4>6<9F><A6><BF>^V[=<9F><E8>IN<A4>L<D5>=<B4><E5><F4>j
><83><A0>I=<F4>p<AB>֣%<80>=<A3>'^A<AB><8B>=<A9>=<A4>^V<DB>3<80>^M<B7>=<B5>A^SV^Eͺ?4
> ^K0^\<9D><BA>=<AA><86>l<8B><F4><E8>m^@^@^@^@^@^@^@^@=<C4>w^NjK<BF>
> =<AB>"^O;!<E0><F7>=<AF><BC>R<DC>-