Re: MLlib Naive Bayes classifier confidence

2014-12-04 Thread MariusFS
That was it, thanks. (Posting here so people know it's the right answer in case they have the same need :) ). sowen wrote: Probabilities won't sum to 1 since this expression doesn't incorporate the probability of the evidence, I imagine? It's constant across classes so is usually excluded. It

Re: MLlib Naive Bayes classifier confidence

2014-12-03 Thread Sean Owen
Probabilities won't sum to 1 since this expression doesn't incorporate the probability of the evidence, I imagine? It's constant across classes so is usually excluded. It would appear as a - log(P(evidence)) term. On Tue, Dec 2, 2014 at 10:44 AM, MariusFS marius.fete...@sien.com wrote: Are we
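
The normalization described in this message can be illustrated with a minimal sketch in plain Scala. It assumes the per-class log scores (log prior plus log likelihood) have already been computed and that the array is non-empty. The object and method names are hypothetical and are not part of MLlib. The log-sum-exp of the scores plays the role of log(P(evidence)); subtracting it before exponentiating should give values that sum to 1.

```scala
object NormalizeSketch {
  // Hypothetical helper, not an MLlib API.
  // logScores(c) is assumed to be log P(class c) + log P(features | class c).
  def normalizeLogScores(logScores: Array[Double]): Array[Double] = {
    // Shift by the maximum so that exponentiating very negative scores does not underflow to 0.
    val maxScore = logScores.max
    // log P(evidence), computed with the log-sum-exp trick.
    val logEvidence = maxScore + math.log(logScores.map(s => math.exp(s - maxScore)).sum)
    // Subtract the evidence term in log space, then exponentiate.
    logScores.map(s => math.exp(s - logEvidence))
  }
}
```

Working in log space until the last step matters here because the raw scores for long documents can be far too negative to exponentiate directly.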

Re: MLlib Naive Bayes classifier confidence

2014-12-02 Thread MariusFS
Are we sure that exponentiating will give us the probabilities? I did some tests by cloning the MLlib class and adding the required code, but the calculated probabilities do not add up to 1. I tried something like: def predictProbs(testData: Vector): (BDV[Double], BDV[Double]) = { val

Re: MLlib Naive Bayes classifier confidence

2014-11-10 Thread Sean Owen
Not directly. If you could access brzPi and brzTheta in the NaiveBayesModel, you could repeat the same computation as in predict() and exponentiate the result to get back class probabilities, since input and internal values are in log space. Hm, I wonder how people feel about exposing those fields or a

Re: MLlib Naive Bayes classifier confidence

2014-11-10 Thread jatinpreet
Thanks for the answer. The variables brzPi and brzTheta are declared private. I am writing my code in Java; otherwise I could have replicated the Scala class and performed the desired computation, which, as I observed, is a multiplication of brzTheta with the test vector, with the result added to brzPi.
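
The computation described in this message can be sketched in plain Scala without Breeze or Spark. This is only an illustration under assumptions: `pi` is taken to hold the log class priors and `theta` the log conditional probabilities (one row per class), mirroring the roles of brzPi and brzTheta; the object and method names are hypothetical and do not exist in MLlib.

```scala
object LogScoreSketch {
  // Hypothetical stand-in for the model's internal computation, not MLlib code.
  // pi(c):       assumed log prior of class c
  // theta(c)(j): assumed log P(feature j | class c)
  // x(j):        feature value (for example a term count) of the test vector
  def logScores(pi: Array[Double], theta: Array[Array[Double]], x: Array[Double]): Array[Double] =
    pi.indices.map { c =>
      // Dot product of the class's row of theta with x, added to the log prior.
      pi(c) + theta(c).zip(x).map { case (t, v) => t * v }.sum
    }.toArray
}
```

Exponentiating each returned score gives the unnormalized joint probability of the class and the features, so the values will not sum to 1 on their own.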

Re: MLlib Naive Bayes classifier confidence

2014-11-10 Thread Sean Owen
It's hacky, but you could access these fields via reflection. It'd be better to propose opening them up in a PR. On Mon, Nov 10, 2014 at 9:25 AM, jatinpreet jatinpr...@gmail.com wrote: Thanks for the answer. The variables brzPi and brzTheta are declared private. I am writing my code with Java

Re: MLlib Naive Bayes classifier confidence

2014-11-10 Thread jatinpreet
Thanks, I will try it out and raise a request for making the variables accessible. An unrelated question: do you think the probability value thus calculated will be a good measure of confidence in the prediction? I have been reading mixed opinions about this. Jatin - Novice Big Data