Further, I have tried HttpBroadcast, but that does not work either. It looks almost like a memory leak: if I increase the number of input files from 200 to 500, the system crashes earlier.
The code is as follows:
========================
logger.info("Training the model Fold:[" + fold + "]")
logger.info("Step 1: Split the input into Training and Testing sets")
val splits = labeledPointRDD.randomSplit(Array(0.6, 0.4), seed = 11L)
logger.info("Step 1: splits successful...")
val training = splits(0)
val test = splits(1)
status = ModelStatus.IN_TRAINING
//logger.info("Fold:[" + fold + "] Training count: " + training.count() + " Testing/Verification count:" + test.count())
logger.info("Step 2: Train the NB classifier")
model = NaiveBayes.train(training, lambda = 1.0)
logger.info("Step 2: NB model training complete Fold:[" + fold + "]")
logger.info("Step 3: Testing/Verification of the model")
status = ModelStatus.IN_VERIFICATION
val predictionAndLabel = test.map(p => (model.predict(p.features), p.label))
val arry = predictionAndLabel.filter(x => x._1 == x._2)
val accuracy = 1.0 * predictionAndLabel.filter(x => x._1 == x._2).count() / test.count()
logger.info("Step 3: Testing complete")
status = ModelStatus.INITIALIZED
logger.info("Fold[" + fold + "] Accuracy:[" + accuracy + "] Model Status:[" + status + "]")

-Ravi