Could you share the error log? What do you mean by "500 instead of
200"? If this is the number of files, try calling `repartition` before
training naive Bayes, which works best when the number of partitions
matches the number of cores or is even smaller. -Xiangrui
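
A minimal sketch of the suggestion above, assuming the MLlib RDD API that
the quoted code uses. The toy data, the object name, and the choice of
`sc.defaultParallelism` as the partition count are illustrative
assumptions, not taken from the original job:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.classification.NaiveBayes
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

// Hypothetical driver object, for illustration only.
object RepartitionBeforeNaiveBayes {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("nb-repartition-sketch")
      .setMaster("local[*]") // assumption: local run; drop this on a cluster
    val sc = new SparkContext(conf)
    try {
      // Toy stand-in for labeledPointRDD. Reading many small input files
      // typically yields at least one partition per file, which is
      // mimicked here by asking for more slices than needed.
      val labeledPointRDD = sc.parallelize(Seq(
        LabeledPoint(0.0, Vectors.dense(1.0, 0.0)),
        LabeledPoint(0.0, Vectors.dense(2.0, 0.0)),
        LabeledPoint(1.0, Vectors.dense(0.0, 1.0)),
        LabeledPoint(1.0, Vectors.dense(0.0, 2.0))
      ), numSlices = 8)

      // Assumption: defaultParallelism approximates the number of cores
      // available to the application.
      val numPartitions = sc.defaultParallelism

      // Shuffle the data into roughly one partition per core before training.
      val training = labeledPointRDD.repartition(numPartitions).cache()
      println(s"Partitions: ${labeledPointRDD.partitions.length} -> ${training.partitions.length}")

      val model = NaiveBayes.train(training, lambda = 1.0)
      println(s"Predicted label: ${model.predict(Vectors.dense(1.0, 0.0))}")
    } finally {
      sc.stop()
    }
  }
}
```

When the goal is only to reduce the number of partitions,
`coalesce(numPartitions)` may be enough and avoids a full shuffle;
`repartition` always shuffles.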

On Tue, Feb 10, 2015 at 10:34 PM, rkgurram <rkgur...@gmail.com> wrote:
> Further, I have tried HttpBroadcast, but that does not work either.
>
> It is almost like there is a memory leak, because if I increase the input
> files to "500" instead of "200" the system crashes early.
>
>
> The code is as follows
> ========================
>
>     logger.info("Training the model Fold:[" + fold + "]")
>     logger.info("Step 1: Split the input into Training and Testing sets")
>     val splits = labeledPointRDD.randomSplit(Array(0.6, 0.4), seed = 11L)
>     logger.info("Step 1: splits successful...")
>
>     val training = splits(0)
>     val test = splits(1)
>     status = ModelStatus.IN_TRAINING
>     //logger.info("Fold:[" + fold + "] Training count: " + training.count() + " Testing/Verification count:" + test.count())
>
>     logger.info("Step 2: Train the NB classifier")
>     model = NaiveBayes.train(training, lambda = 1.0)
>     logger.info("Step 2: NB model training complete Fold:[" + fold + "]")
>
>     logger.info("Step 3: Testing/Verification of the model")
>     status = ModelStatus.IN_VERIFICATION
>     val predictionAndLabel = test.map(p => (model.predict(p.features), p.label))
>     val arry = predictionAndLabel.filter(x => x._1 == x._2)
>     val accuracy = 1.0 * predictionAndLabel.filter(x => x._1 == x._2).count() / test.count()
>     logger.info("Step 3: Testing complete")
>     status = ModelStatus.INITIALIZED
>     logger.info("Fold[" + fold + "] Accuracy:[" + accuracy + "] Model Status:[" + status + "]")
>
>
>
>
> -Ravi
>
>
>
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Naive-Bayes-model-fails-after-a-few-predictions-tp21592p21593.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>

