running lda in spark throws exception

2015-12-27 Thread Li Li
I ran my LDA example on a YARN 2.6.2 cluster with Spark 1.5.2. It throws an exception at the line: Matrix topics = ldaModel.topicsMatrix(); But in the YARN job history UI the job shows as successful. What's wrong with it? I submit the job with ./bin/spark-submit --class Myclass \ --master yarn-client \ --num-execut
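The submit command above is cut off in the archive, but a yarn-client submission for a job like this usually takes the following shape. The class name is the one from the message; the resource flags and jar path are illustrative placeholders, not values recovered from the original command:

```shell
# Hypothetical spark-submit invocation (Spark 1.5.x, yarn-client mode).
# Resource sizes and the jar path below are placeholders, not the
# poster's original values.
./bin/spark-submit \
  --class Myclass \
  --master yarn-client \
  --num-executors 4 \
  --executor-memory 4g \
  --executor-cores 2 \
  /path/to/lda-example.jar
```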

Re: [VOTE] Release Apache Spark 1.6.0 (RC4)

2015-12-27 Thread Michael Armbrust
Thanks for testing and voting, everyone. The vote passes unanimously with 21 +1 votes and no -1 votes. I'll start finalizing the release now. +1: Michael Armbrust* Reynold Xin* Andrew Or* Benjamin Fradet Mark Hamstra* Jeff Zhang Josh Rosen* Aaron Davidson* Denny Lee Yin Huai Jean-Baptiste Onofré Kous

Re: Spark Streaming Kafka - DirectKafkaInputDStream: Using the new Kafka Consumer API

2015-12-27 Thread Cody Koeninger
Should probably get everyone on the same page at https://issues.apache.org/jira/browse/SPARK-12177 On Mon, Dec 21, 2015 at 5:33 AM, Mario Ds Briggs wrote: > Hi Cody, > > I took a shot and here's what it looks like > > https://github.com/mariobriggs/spark/tree/kafka0.9-streaming/external/kafka-

Re: Kafka consumer: Upgrading to use the new Java Consumer

2015-12-27 Thread Cody Koeninger
Have you seen SPARK-12177? On Wed, Dec 23, 2015 at 3:27 PM, eugene miretsky wrote: > Hi, > > The Kafka connector currently uses the older Kafka Scala consumer. Kafka > 0.9 came out with a new Java Kafka consumer. > > One of the main differences is that the Scala consumer uses > a Decoder (kafka.s

Re: what is the best way to debug spark / mllib?

2015-12-27 Thread Fathi Salmi, Meisam
If you are modifying only mllib, you can use the "-am" and "-pl" options with mvn to cut the build time even further. Thanks, Meisam On 12/27/2015 11:45 AM, salexln wrote: Thanks for the response, I have several more questions: *1) you should run zinc incremental compiler* I run "./build/zinc-0.3
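Meisam's tip maps to Maven's reactor options: -pl selects a module to build and -am ("also make") builds the modules it depends on first. A minimal sketch, assuming the standard module directory name for MLlib in the Spark source tree:

```shell
# Rebuild only the MLlib module (-pl) plus its upstream dependencies (-am),
# skipping tests. The module name "mllib" is assumed from Spark's layout.
./build/mvn -pl mllib -am -DskipTests package
```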

Re: what is the best way to debug spark / mllib?

2015-12-27 Thread Ted Yu
For #1, 9 minutes seems normal. Here was the duration for a recent build on the master branch: [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 10:44 mi

Re: what is the best way to debug spark / mllib?

2015-12-27 Thread salexln
Thanks for the response; I have several more questions: *1) you should run zinc incremental compiler* I ran "./build/zinc-0.3.9/bin/zinc -scala-home $SCALA_HOME -nailed -start", but the compilation time of "build/mvn -DskipTests package" is still about 9 minutes. Is this normal? *2) if you want brea

Re: Akka with Spark

2015-12-27 Thread Ted Yu
Disha: Please consider these resources: https://groups.google.com/forum/#!forum/akka-user https://groups.google.com/forum/#!forum/akka-dev On Sun, Dec 27, 2015 at 5:15 AM, Dean Wampler wrote: > As Reynold said, you can still use Akka with Spark, but now it's more like > using any third-party li

Re: Akka with Spark

2015-12-27 Thread Dean Wampler
As Reynold said, you can still use Akka with Spark, but now it's more like using any third-party library that isn't already a Spark dependency (at least once the current Akka dependency is fully removed). Dean Wampler, Ph.D. Author: Programming Scala, 2nd Edition

Re: SparkML algos limitations question.

2015-12-27 Thread Yanbo Liang
Hi Eugene, AFAIK, the current implementation of MultilayerPerceptronClassifier has some scalability problems if the model is very huge (such as >10M), although I think the limitation can cover many use cases already. Yanbo 2015-12-16 6:00 GMT+08:00 Joseph Bradley : > Hi Eugene, > > The maxDept

Re: Akka with Spark

2015-12-27 Thread Disha Shrivastava
Hi All, I need an Akka-like framework to implement model parallelism in neural networks, with an architecture similar to the one described at http://alexminnaar.com/implementing-the-distbelief-deep-neural-network-training-framework-with-akka.html. I need to divide a big neural network (which can't f

Re: what is the best way to debug spark / mllib?

2015-12-27 Thread Stephen Boesch
1) you should run the zinc incremental compiler 2) if you want breakpoints, that should likely be done in local mode 3) adjust the log4j.properties settings and you can start to see the logInfo output 2015-12-27 0:20 GMT-08:00 salexln : > Hi guys, > > I'm debugging my code in mllib/clustering but I'm not sur
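Stephen's third point amounts to lowering the log threshold in conf/log4j.properties so that logInfo output reaches the console. A minimal sketch, assuming the stock log4j 1.x template that ships with Spark as conf/log4j.properties.template:

```properties
# conf/log4j.properties -- show INFO-level messages (e.g. from logInfo calls)
log4j.rootCategory=INFO, console
# Optionally turn up detail only for the code being debugged:
log4j.logger.org.apache.spark.mllib=DEBUG
```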

what is the best way to debug spark / mllib?

2015-12-27 Thread salexln
Hi guys, I'm debugging my code in mllib/clustering but I'm not sure I'm doing it the best way: I build my changes in mllib using "build/mvn -DskipTests package" and then invoke my code using "./bin/spark-shell". My two main issues: 1) After each change the build (build/mvn -DskipTests p
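The edit-build-test loop described in this thread is usually run with zinc started once in the background, so that repeat Maven builds become incremental. A sketch assembled from the commands quoted above (the zinc version and paths are the ones salexln mentions):

```shell
# One-time: start the zinc incremental-compile server
./build/zinc-0.3.9/bin/zinc -scala-home $SCALA_HOME -nailed -start

# After each change under mllib/: rebuild (incremental once zinc is running)
./build/mvn -DskipTests package

# ...then relaunch the REPL against the rebuilt assembly
./bin/spark-shell
```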