term and should be the same size as the
> vocabulary (so if the vocabulary, or dictionary has 10 words, each vector
> should have a size of 10). This probably means that there will be some
> elements with zero counts, and a sparse vector might be a good way to handle
> that.
>
> On Wed, Jan 13, 2016 at 6:40 PM, Li
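To illustrate the advice above, here is a minimal, Spark-free sketch of a vocabulary-sized count vector stored sparsely: the conceptual vector length equals the vocabulary size, but only non-zero counts are kept. The vocabulary, document, and the `SparseTermCounts` helper are made up for illustration, not part of MLlib:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SparseTermCounts {
    // Build a sparse term-count map for one document: only non-zero
    // counts are stored, but the conceptual vector size is vocab.size().
    static Map<Integer, Integer> termCounts(List<String> vocab, String[] doc) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < vocab.size(); i++) {
            index.put(vocab.get(i), i);
        }
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String token : doc) {
            Integer i = index.get(token);  // out-of-vocabulary tokens are skipped
            if (i != null) {
                counts.merge(i, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> vocab = Arrays.asList("spark", "lda", "topic", "vector");
        Map<Integer, Integer> counts =
            termCounts(vocab, new String[]{"spark", "lda", "spark"});
        // Conceptual vector size is 4 (the vocabulary size); only two
        // entries are non-zero, so only two are stored.
        System.out.println(vocab.size() + " " + counts);  // prints: 4 {0=2, 1=1}
    }
}
```

In MLlib the same idea is what `Vectors.sparse(vocabSize, indices, values)` expresses: the declared size is the vocabulary size even when most entries are zero.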
I will try Spark 1.6.0 to see whether it is a bug in 1.5.2.
On Wed, Jan 13, 2016 at 3:58 PM, Li Li <fancye...@gmail.com> wrote:
> I have set up a standalone Spark cluster and used the same code. It
> still failed with the same exception.
> I also preprocessed the data to lines of i
> improper vector size, it will not throw an exception but the term indices
> will start to be incorrect. For a small number of iterations, it is ok, but
> increasing iterations causes the indices to get larger also. Maybe that is
> what is going on in the JIRA you linked to?
>
> On W
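Given the silent index corruption described above (a wrong vector size does not throw, it just skews the term indices), one cheap guard is to validate every document vector's size against the vocabulary before running LDA. A minimal sketch, assuming plain `double[]` documents for simplicity (in a real job these would be MLlib `Vector`s, and `VectorSizeCheck` is a hypothetical helper):

```java
import java.util.Arrays;
import java.util.List;

public class VectorSizeCheck {
    // Pre-flight check: every document vector must have exactly
    // vocabSize elements, since a mismatch can corrupt LDA's term
    // indices silently instead of failing fast.
    static void checkSizes(List<double[]> docs, int vocabSize) {
        for (int d = 0; d < docs.size(); d++) {
            if (docs.get(d).length != vocabSize) {
                throw new IllegalArgumentException(
                    "doc " + d + " has size " + docs.get(d).length
                    + ", expected " + vocabSize);
            }
        }
    }

    public static void main(String[] args) {
        List<double[]> docs = Arrays.asList(
            new double[]{1, 0, 2},
            new double[]{0, 3, 1});
        checkSizes(docs, 3);  // all sizes match the vocabulary: no exception
        System.out.println("all vectors sized correctly");
    }
}
```

Failing fast like this turns the "indices get larger with more iterations" symptom into an immediate, diagnosable error at corpus-construction time.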
d it ran correctly with output that looks good. Sorry, I don't
> have a YARN cluster setup right now, so maybe the error you are seeing is
> specific to that. Btw, I am running the latest Spark code from the master
> branch. Hope that helps some!
>
> Bryan
>
> On Mon, Jan 4, 2016
Can anyone help? The problem is very easy to reproduce. What's wrong?
On Wed, Dec 30, 2015 at 8:59 PM, Li Li <fancye...@gmail.com> wrote:
> I used a small dataset and reproduced the problem.
> But I don't know whether my code is correct, because I am not familiar
> with Spark.
>
        return new Tuple2(id2doc._1, id2doc._2.vec);
    }
}));
corpus.cache();
// Cluster the documents into topicNumber topics using LDA
DistributedLDAModel ldaModel = (DistributedLDAModel) new LDA()
    .setMaxIterations(iterNumber)
    .setK(topicNumber)
    .run(corpus);
> https://issues.apache.org/jira/browse/SPARK-12488
>
> I haven't figured out yet what is causing it. Do you have a small corpus
> which reproduces this error, and which you can share on the JIRA? If so,
> that would help a lot in debugging this failure.
>
> Thanks!
> Joseph
>
> O
I ran my LDA example on a YARN 2.6.2 cluster with Spark 1.5.2.
It throws an exception at the line: Matrix topics = ldaModel.topicsMatrix();
But in the YARN job history UI the job shows as successful. What's wrong with it?
I submit the job with
./bin/spark-submit --class Myclass \
  --master yarn-client \