Sean,

My last sentence didn't come out right. Let me try to explain my question
again.

For instance, say I have two categories, C1 and C2, and I have trained the
model on 100 samples of C1 and 10 samples of C2.

Now, I predict two samples, one from each category: S1 from C1 and S2 from
C2. I get the following prediction results:

S1 => Category: C1, Probability: 0.7
S2 => Category: C2, Probability: 0.04

Both predictions are correct, but their probabilities are far apart. Can I
improve the prediction probability for C2 by taking the 10 samples I have of
C2 and replicating each of them 10 times, so that the total count becomes
100, the same as for C1?

Can I expect this to increase the probability of sample S2 after training
on the new set? Is this a viable approach?
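
To make the replication idea concrete, here is a minimal sketch in plain
Python. It is not MLlib code: it assumes a toy multinomial Naive Bayes with
add-one smoothing, and the feature vectors and the names train and
predict_proba are hypothetical, made up only for illustration. It trains once
on the 100/10 split and once with the C2 samples repeated 10 times, then
compares the probability assigned to C2 for an S2-like sample.

```python
import math
from collections import Counter


def train(samples, smoothing=1.0):
    """Fit a toy multinomial Naive Bayes on (label, feature_counts) pairs."""
    labels = sorted({label for label, _ in samples})
    n_features = len(samples[0][1])
    class_counts = Counter(label for label, _ in samples)
    # Class prior: fraction of training samples carrying each label.
    log_prior = {c: math.log(class_counts[c] / len(samples)) for c in labels}
    log_lik = {}
    for c in labels:
        # Smoothed per-feature totals for this class.
        totals = [smoothing] * n_features
        for label, feats in samples:
            if label == c:
                for j, value in enumerate(feats):
                    totals[j] += value
        denom = sum(totals)
        log_lik[c] = [math.log(t / denom) for t in totals]
    return log_prior, log_lik


def predict_proba(model, feats):
    """Return normalized class probabilities for one feature vector."""
    log_prior, log_lik = model
    scores = {
        c: log_prior[c] + sum(v * w for v, w in zip(feats, log_lik[c]))
        for c in log_prior
    }
    top = max(scores.values())  # subtract the max for numerical stability
    unnorm = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(unnorm.values())
    return {c: u / z for c, u in unnorm.items()}


# Hypothetical toy data: every C1 sample is [2, 1], every C2 sample is [1, 2].
c1_samples = [("C1", [2, 1])] * 100
c2_samples = [("C2", [1, 2])] * 10

imbalanced = c1_samples + c2_samples        # 100 vs 10
replicated = c1_samples + c2_samples * 10   # 100 vs 100 (each C2 sample x10)

s2 = [1, 2]  # a sample that looks like C2
before = predict_proba(train(imbalanced), s2)
after = predict_proba(train(replicated), s2)

print("P(C2 | S2), imbalanced training set:", before["C2"])
print("P(C2 | S2), replicated training set:", after["C2"])
```

In this toy model, replication adds no new information about C2. It raises
the C2 class prior from 10/110 to 100/200 and makes the smoothing term
relatively smaller, which is why the C2 probability goes up here. Whether the
same holds for real data is exactly what I am asking.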

Thanks,
Jatin



-----
Novice Big Data Programmer