Further, I have tried HttpBroadcast, but that does not work either.

It almost looks like a memory leak, because if I increase the number of
input files from 200 to 500 the system crashes earlier.


The code is as follows
========================

    logger.info("Training the model Fold:[" + fold + "]")
    logger.info("Step 1: Split the input into Training and Testing sets")
    val splits = labeledPointRDD.randomSplit(Array(0.6, 0.4), seed = 11L)
    logger.info("Step 1: splits successful...")

    val training = splits(0)
    val test = splits(1)
    status = ModelStatus.IN_TRAINING
    //logger.info("Fold:[" + fold + "] Training count: " + training.count() + " Testing/Verification count:" + test.count())

    logger.info("Step 2: Train the NB classifier")
    model = NaiveBayes.train(training, lambda = 1.0)
    logger.info("Step 2: NB model training complete Fold:[" + fold + "]")

    logger.info("Step 3: Testing/Verification of the model")
    status = ModelStatus.IN_VERIFICATION
    val predictionAndLabel = test.map(p => (model.predict(p.features), p.label))
    // Keep only the predictions that match the true label, then reuse that RDD for the accuracy
    val arry = predictionAndLabel.filter(x => x._1 == x._2)
    val accuracy = 1.0 * arry.count() / test.count()
    logger.info("Step 3: Testing complete")
    status = ModelStatus.INITIALIZED
    logger.info("Fold[" + fold + "] Accuracy:[" + accuracy + "] Model Status:[" + status + "]")




-Ravi
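
For reference, the three steps in the snippet above (split, train, verify) can be condensed into one self-contained function. This is only a sketch under stated assumptions: the object name FoldEvaluation and the method trainAndEvaluate are hypothetical, and the cache()/unpersist() calls are an addition that is not in the original code. They are there so that the splits are not recomputed for each count(). Whether this has any effect on the crash described above is not established.

```scala
import org.apache.spark.mllib.classification.{NaiveBayes, NaiveBayesModel}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Hypothetical helper, not part of the original post.
object FoldEvaluation {

  // Splits the data 60/40, trains Naive Bayes on the first part and
  // returns the model together with its accuracy on the second part.
  def trainAndEvaluate(data: RDD[LabeledPoint],
                       seed: Long = 11L): (NaiveBayesModel, Double) = {
    val Array(training, test) = data.randomSplit(Array(0.6, 0.4), seed)

    // Assumption: caching is added here; the original snippet does not cache.
    training.cache()
    test.cache()
    try {
      val model = NaiveBayes.train(training, lambda = 1.0)

      // Pair each prediction with the true label and count the matches.
      val predictionAndLabel =
        test.map(p => (model.predict(p.features), p.label))
      val correct =
        predictionAndLabel.filter { case (pred, label) => pred == label }.count()
      val total = test.count()

      // Guard against an empty test split.
      val accuracy = if (total == 0) 0.0 else correct.toDouble / total
      (model, accuracy)
    } finally {
      // Release the cached blocks once the fold is finished.
      training.unpersist()
      test.unpersist()
    }
  }
}
```

It could be called once per fold, for example as val (model, accuracy) = FoldEvaluation.trainAndEvaluate(labeledPointRDD), with the logging and status updates kept in the caller.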



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Naive-Bayes-model-fails-after-a-few-predictions-tp21592p21593.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
