On Fri, Feb 12, 2010 at 2:00 PM, Ovidiu Dan <[email protected]> wrote:
> Hi, thanks for the fix. I guess it's one step closer to a workable
> solution. I am now getting this error:
>
> 10/02/12 16:59:25 INFO mapred.JobClient: Task Id :
> attempt_201001192218_0234_m_000004_1, Status : FAILED
> java.lang.ArrayIndexOutOfBoundsException: 104017
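[The exception above is an out-of-bounds read into LDA's topic-word matrix: the corpus contains a word id (104017) at least as large as the vocabulary bound the job was given. A minimal pure-Java sketch of the failure mode — the class and field names here are hypothetical stand-ins, not Mahout's actual code:]

```java
// Hypothetical stand-in for an LDA topic-word matrix; not Mahout code.
// The matrix is allocated with numWords columns, so any word id >= numWords
// read during inference throws ArrayIndexOutOfBoundsException.
class TopicWordMatrixSketch {
    final double[][] logProbWordGivenTopic; // numTopics x numWords

    TopicWordMatrixSketch(int numTopics, int numWords) {
        logProbWordGivenTopic = new double[numTopics][numWords];
    }

    double logProb(int topic, int word) {
        // Blows up exactly like the trace above when word >= numWords.
        return logProbWordGivenTopic[topic][word];
    }

    boolean wordInBounds(int word) {
        return word >= 0 && word < logProbWordGivenTopic[0].length;
    }
}
```

[With --numWords 100000 as used later in this thread, wordInBounds(104017) is false, which matches the offending index; raising numWords past the true vocabulary size makes it fit.]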
You probably have more words than you allotted at the beginning. I know
it's less than ideal, but at the moment you need to specify an upper
bound on the number of words: that's the numWords parameter. Try upping
it by a factor of two or so.

  -- David

>         at org.apache.mahout.math.DenseMatrix.getQuick(DenseMatrix.java:75)
>         at org.apache.mahout.clustering.lda.LDAState.logProbWordGivenTopic(LDAState.java:40)
>         at org.apache.mahout.clustering.lda.LDAInference.eStepForWord(LDAInference.java:204)
>         at org.apache.mahout.clustering.lda.LDAInference.infer(LDAInference.java:117)
>         at org.apache.mahout.clustering.lda.LDAMapper.map(LDAMapper.java:47)
>         at org.apache.mahout.clustering.lda.LDAMapper.map(LDAMapper.java:37)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
> Ovi
>
> ---
> Ovidiu Dan - http://www.ovidiudan.com/
>
> Please do not print this e-mail unless it is mandatory
>
> My public key can be downloaded from subkeys.pgp.net, or
> http://www.ovidiudan.com/public.pgp

On Fri, Feb 12, 2010 at 4:29 PM, Ovidiu Dan <[email protected]> wrote:

Will do in a bit, thanks!

On Fri, Feb 12, 2010 at 4:27 PM, Robin Anil <[email protected]> wrote:

Ovidiu, we just committed the fix. Just recreate the vectors using the
-seq option added to it.

Remember to svn up and recompile.

Robin

On Sat, Feb 13, 2010 at 2:39 AM, Jake Mannix <[email protected]> wrote:

Robin and I are trying out a fix. I already ran into this when hooking
the Vectorizer up to my SVD code.

-jake

On Fri, Feb 12, 2010 at 1:08 PM, Ovidiu Dan <[email protected]> wrote:

Well, I tried it. The previous problem is fixed, but now I have a new
and shiny one :(

Meet line 95 of LDAInference.java:

    DenseMatrix phi = new DenseMatrix(state.numTopics, docLength);

docLength is calculated above, on line 88:

    int docLength = wordCounts.size();

My problem is that docLength is always 2147483647. Since DenseMatrix
allocates an array based (also) on this value, I get multiple "Requested
array size exceeds VM limit" messages (an array with 2147483647 columns
would be quite large).

I added a trivial toString function that displays vector.size() in
org/apache/mahout/math/VectorWritable.java, recompiled the project, then
ran:

    ./hadoop fs -text /user/MY_USERNAME/projects/lda/mahout_vectors/vectors/part-00000

All lines were DOCUMENT_ID (tab) 2147483647, so all vectors report size
2147483647.

I checked the output of vector.zSum() as well; that one looks fine.

I can confirm that my input SequenceFile is correct; it has the
following format:
- key: Text with the unique id of the document
- value: Text with the contents of the document

Ovi

On Fri, Feb 12, 2010 at 3:30 PM, Ovidiu Dan <[email protected]> wrote:

Thanks, I also patched it myself, but now I have some other problems.
I'll run it and let you know how it goes.
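[The docLength symptom above comes down to the difference between a sparse vector's declared cardinality and its number of stored entries. This toy sketch — not Mahout's Vector API; all names here are hypothetical — shows why calling size() on a vector of cardinality Integer.MAX_VALUE is the wrong quantity for a per-document word count:]

```java
import java.util.HashMap;
import java.util.Map;

// Toy sparse vector illustrating the distinction at play (not Mahout code):
// size() reports the declared dimensionality -- here Integer.MAX_VALUE --
// while the count of stored (non-zero) entries is what a per-document
// docLength actually needs.
class SparseVectorSketch {
    final int cardinality;                        // declared dimensionality
    final Map<Integer, Double> entries = new HashMap<>();

    SparseVectorSketch(int cardinality) { this.cardinality = cardinality; }

    void set(int index, double value) { entries.put(index, value); }

    int size() { return cardinality; }            // always 2147483647 here
    int getNumNondefaultElements() { return entries.size(); } // distinct words
}
```

[A document with two distinct words reports size() of Integer.MAX_VALUE but only two stored entries; sizing a dense matrix by the former is what triggers the "Requested array size exceeds VM limit" messages.]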
Ovi

On Fri, Feb 12, 2010 at 3:25 PM, Robin Anil <[email protected]> wrote:

I fixed the bug here: https://issues.apache.org/jira/browse/MAHOUT-289

Try running now.

Robin

On Sat, Feb 13, 2010 at 1:09 AM, Ovidiu Dan <[email protected]> wrote:

I checked the code.

Line 36 of LDAMapper.java references Vector:

    public class LDAMapper extends
        Mapper<WritableComparable<?>, *Vector*, IntPairWritable, DoubleWritable> {

Aren't all elements in Mapper<...> supposed to be Writable? Do you need
to do a conversion to VectorWritable?

Ovi

On Fri, Feb 12, 2010 at 2:32 PM, Ovidiu Dan <[email protected]> wrote:

OK, I did a clean checkout & install and ran everything again, then
pointed LDA to mahout_vectors/vectors.

Now I get this error:

    10/02/12 14:28:35 INFO mapred.JobClient: Task Id :
    attempt_201001192218_0216_m_000005_0, Status : FAILED
    java.lang.ClassCastException: *org.apache.mahout.math.VectorWritable
    cannot be cast to org.apache.mahout.math.Vector*
        at org.apache.mahout.clustering.lda.LDAMapper.map(LDAMapper.java:36)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)

Ovi

On Fri, Feb 12, 2010 at 1:50 PM, Robin Anil <[email protected]> wrote:

Hi Ovidiu,

If you choose tf, the vectors are generated in outputfolder/vectors, and
if you choose tfidf the vectors are generated in
outputfolder/tfidf/vectors. I am in the process of changing the code to
move the output of the map/reduce to a fixed destination, and that
exception would have caused the folders not to move. That's the reason
for the last error.

For the first error, I am not sure what is happening.
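[For the ClassCastException above, the usual shape of the fix is to declare the mapper's value type as the Writable wrapper and unwrap it inside map(). A pure-Java sketch of the wrapper pattern — simplified stand-ins, not Hadoop's or Mahout's actual classes:]

```java
// Simplified stand-ins, not Hadoop/Mahout classes: the framework hands the
// mapper the serializable wrapper, so the mapper's value type must be the
// wrapper, and the wrapped vector is retrieved with get() instead of a cast.
class VectorLike {
    final double[] values;
    VectorLike(double... values) { this.values = values; }
}

class VectorWritableLike {
    private final VectorLike vector;
    VectorWritableLike(VectorLike vector) { this.vector = vector; }
    VectorLike get() { return vector; }
}

class LdaMapperSketch {
    // Declared over the wrapper type, mirroring a signature like
    // Mapper<WritableComparable<?>, VectorWritable, ...> rather than Vector.
    double map(VectorWritableLike value) {
        VectorLike wordCounts = value.get(); // unwrap; no ClassCastException
        double sum = 0.0;
        for (double v : wordCounts.values) sum += v;
        return sum;
    }
}
```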
Could you do a clean compile of Mahout:

    mvn clean install -DskipTests=true

and make sure you svn up the trunk before doing that. Then point your
LDA to mahout_vectors/vectors.

Robin

On Sat, Feb 13, 2010 at 12:14 AM, Ovidiu Dan <[email protected]> wrote:

Hi again,

Is there any workaround for my problem(s)? Or is there any other way
that would allow me to transform many, many small messages (they're
Tweets) into Mahout vectors, and then run LDA on them, without getting
these errors? Converting them to txt files would be a bit of a pain
because I would get millions of very small files. And a Lucene index
would be a bit overkill, I think.

Thanks,
Ovi

On Fri, Feb 12, 2010 at 3:51 AM, Robin Anil <[email protected]> wrote:

Was meant for the dev list. I am looking into the first error.

-bcc mahout-user

---------- Forwarded message ----------
From: Robin Anil <[email protected]>
Date: Fri, Feb 12, 2010 at 2:20 PM
Subject: Re: Problem converting SequenceFile to vectors, then running LDA
To: [email protected]

Hi,

This confusion arises from the fact that we use intermediate folders as
subfolders under the output folder. How about we standardize on all the
jobs taking input, intermediate, and output folders? If not this, then
for the next release?

Robin

On Fri, Feb 12, 2010 at 10:46 AM, Ovidiu Dan <[email protected]> wrote:

Hello Mahout developers / users,

I am trying to convert a properly formatted SequenceFile to Mahout
vectors to run LDA on them. As reference I am using these two documents:

http://cwiki.apache.org/MAHOUT/creating-vectors-from-text.html
http://cwiki.apache.org/MAHOUT/latent-dirichlet-allocation.html

I got the Mahout code from SVN on February 11th, 2010.
Below I am listing the steps I took and the problems I encountered:

    export HADOOP_HOME=/home/hadoop/hadoop/hadoop_install/
    export MAHOUT_HOME=/home/hadoop/hadoop/hadoop_install/bin/ovi/lda/trunk/

    $HADOOP_HOME/bin/hadoop jar \
        $MAHOUT_HOME/examples/target/mahout-examples-0.3-SNAPSHOT.job \
        org.apache.mahout.text.SparseVectorsFromSequenceFiles \
        -i /user/MY_USERNAME/projects/lda/twitter_sequence_files/ \
        -o /user/MY_USERNAME/projects/lda/mahout_vectors/ \
        -wt tf -chunk 300 \
        -a org.apache.lucene.analysis.standard.StandardAnalyzer \
        --minSupport 2 --minDF 1 --maxDFPercent 50 --norm 2

*Problem #1:* Got this error at the end, but I think everything finished
more or less correctly:

    Exception in thread "main" java.lang.NoSuchMethodError:
    org.apache.mahout.common.HadoopUtil.deletePath(Ljava/lang/String;Lorg/apache/hadoop/fs/FileSystem;)V
        at org.apache.mahout.utils.vectors.text.DictionaryVectorizer.createTermFrequencyVectors(DictionaryVectorizer.java:173)
        at org.apache.mahout.text.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:254)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

    $HADOOP_HOME/bin/hadoop jar \
        $MAHOUT_HOME/examples/target/mahout-examples-0.3-SNAPSHOT.job \
        org.apache.mahout.clustering.lda.LDADriver \
        -i /user/MY_USERNAME/projects/lda/mahout_vectors/ \
        -o /user/MY_USERNAME/projects/lda/lda_out/ \
        -k 20 --numWords 100000 --numReducers 33

*Problem #2:*

    Exception in thread "main" java.io.FileNotFoundException: File does not exist:
    hdfs://SOME_SERVER:8003/user/MY_USERNAME/projects/lda/mahout_vectors/partial-vectors-0/data

*Tried to fix:*

    ../../hadoop fs -mv \
        /user/MY_USERNAME/projects/lda/mahout_vectors/partial-vectors-0/part-00000 \
        /user/MY_USERNAME/projects/lda/mahout_vectors/partial-vectors-0/data

*Ran again:*

    $HADOOP_HOME/bin/hadoop jar \
        $MAHOUT_HOME/examples/target/mahout-examples-0.3-SNAPSHOT.job \
        org.apache.mahout.clustering.lda.LDADriver \
        -i /user/MY_USERNAME/projects/lda/mahout_vectors/ \
        -o /user/MY_USERNAME/projects/lda/lda_out/ \
        -k 20 --numWords 100000 --numReducers 33

*Problem #3:*

    Exception in thread "main" java.io.FileNotFoundException: File does not exist:
    hdfs://SOME_SERVER:8003/user/MY_USERNAME/projects/lda/mahout_vectors/tokenized-documents/data

    [had...@some_server retweets]$ ../../hadoop fs -ls /user/MY_USERNAME/projects/lda/mahout_vectors/tokenized-documents/
    Found 3 items
    -rw-r--r--   3 hadoop supergroup  129721338 2010-02-11 23:54 /user/MY_USERNAME/projects/lda/mahout_vectors/tokenized-documents/part-00000
    -rw-r--r--   3 hadoop supergroup  128256085 2010-02-11 23:54 /user/MY_USERNAME/projects/lda/mahout_vectors/tokenized-documents/part-00001
    -rw-r--r--   3 hadoop supergroup   24160265 2010-02-11 23:54 /user/MY_USERNAME/projects/lda/mahout_vectors/tokenized-documents/part-00002

Also, as a *bonus problem*: if the input folder
/user/MY_USERNAME/projects/lda/twitter_sequence_files contains more than
one file (for example, if I run only the maps without a final reducer),
this whole chain doesn't work.

Thanks,
Ovi
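[On the workaround question earlier in the thread — millions of tiny tweet files versus one container — the packing step can be sketched in plain Java, writing every (document id, text) pair as a record of a single file. Against a real cluster the same loop would instead target Hadoop's SequenceFile.Writer with Text keys and values, which is the format the vectorizer reads; everything below is a local, hypothetical sketch, not Mahout code.]

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Local stand-in for packing many small documents into one keyed container:
// one tab-separated record per (id, text) pair, mirroring the SequenceFile
// layout the thread describes (Text id key, Text contents value).
class TweetPacker {
    static Path pack(Map<String, String> docs, Path out) throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(out)) {
            for (Map.Entry<String, String> e : docs.entrySet()) {
                w.write(e.getKey() + "\t" + e.getValue());
                w.newLine();
            }
        }
        return out;
    }
}
```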
