Sean,

My last sentence didn't come out right. Let me try to explain my question again.
For instance, I have two categories, C1 and C2. I have trained the model on 100 samples of C1 and 10 samples of C2. Now I predict two samples, one from each category: S1 (from C1) and S2 (from C2). I get the following prediction results:

S1 => Category: C1, Probability: 0.7
S2 => Category: C2, Probability: 0.04

Both predictions are correct, but their probabilities are far apart. Can I improve the prediction probability by taking the 10 samples I have of C2 and replicating each of them 10 times, so that the total count becomes 100, the same as C1? Can I expect this to increase the probability of sample S2 after training on the new set? Is this a viable approach?

Thanks,
Jatin

-----
Novice Big Data Programmer
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Naive-Baye-s-classification-confidence-tp19341p19366.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
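
To make the replication proposal concrete, here is a minimal sketch in plain Python (not Spark MLlib code). The helper names `replicate_class` and `class_priors` and the placeholder feature vectors are assumptions made up for illustration; only the 100/10 class counts and the factor of 10 come from the question. It shows the replication step and how the class frequencies, which a Naive Bayes model typically uses as priors, change as a result.

```python
from collections import Counter


def replicate_class(samples, target_label, factor):
    """Return a new list where each sample of target_label appears `factor` times."""
    out = []
    for features, label in samples:
        copies = factor if label == target_label else 1
        out.extend([(features, label)] * copies)
    return out


def class_priors(samples):
    """Estimate class priors as relative class frequencies."""
    counts = Counter(label for _, label in samples)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Placeholder training set (hypothetical features): 100 samples of C1, 10 of C2.
training = [((1.0, 0.0), "C1")] * 100 + [((0.0, 1.0), "C2")] * 10

# Replicate each C2 sample 10 times so both classes have 100 samples.
balanced = replicate_class(training, "C2", 10)

print(class_priors(training))   # priors before replication
print(class_priors(balanced))   # priors after replication
```

With frequency-based priors, the C2 prior moves from 10/110 to 100/200. Replicating every C2 sample by the same factor leaves the within-class feature frequencies unchanged apart from smoothing effects, so any change in the probability reported for S2 would come mainly from the shifted prior.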