Re: MLlib Naive Bayes classifier confidence

2014-12-04 Thread MariusFS
That was it, thanks. (Posting here so people know it's the right answer in
case they have the same need. :) )



sowen wrote
 Probabilities won't sum to 1 since this expression doesn't incorporate
 the probability of the evidence, I imagine? It's constant across
 classes, so it is usually excluded. It would appear as a
 -log(P(evidence)) term.





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/MLlib-Naive-Bayes-classifier-confidence-tp18456p20361.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: MLlib Naive Bayes classifier confidence

2014-12-03 Thread Sean Owen
Probabilities won't sum to 1 since this expression doesn't incorporate
the probability of the evidence, I imagine? It's constant across
classes, so it is usually excluded. It would appear as a
-log(P(evidence)) term.
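
For readers following along, the missing term can be written out explicitly. The notation here is an assumption added for illustration (c for a class, x for the feature vector); the expression in the thread is taken to compute the first two log terms, possibly up to a constant that is the same for every class.

```latex
\begin{align*}
P(c \mid x) &= \frac{P(c)\,P(x \mid c)}{P(x)},
\qquad P(x) = \sum_{c'} P(c')\,P(x \mid c') \\
\log P(c \mid x) &= \log P(c) + \log P(x \mid c) - \log P(x)
\end{align*}
```

Since log P(x) does not depend on c, dropping it leaves the ranking of classes unchanged, but the exponentiated values then sum to P(x) rather than to 1.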

On Tue, Dec 2, 2014 at 10:44 AM, MariusFS marius.fete...@sien.com wrote:
 Are we sure that exponentiating will give us the probabilities? I did some
 tests by cloning the MLlib class and adding the required code, but the
 calculated probabilities do not add up to 1.

 I tried something like:

   def predictProbs(testData: Vector): (BDV[Double], BDV[Double]) = {
     // Per-class log scores: brzPi + brzTheta * x
     val logProbs = brzPi + brzTheta * new BDV[Double](testData.toArray)
     // Exponentiate each log score
     val probs = logProbs.map(x => math.exp(x))
     (logProbs, probs)
   }

 This was because I need the actual probs to process downstream from this...







Re: MLlib Naive Bayes classifier confidence

2014-12-02 Thread MariusFS
Are we sure that exponentiating will give us the probabilities? I did some
tests by cloning the MLlib class and adding the required code, but the
calculated probabilities do not add up to 1.

I tried something like:

  def predictProbs(testData: Vector): (BDV[Double], BDV[Double]) = {
    // Per-class log scores: brzPi + brzTheta * x
    val logProbs = brzPi + brzTheta * new BDV[Double](testData.toArray)
    // Exponentiate each log score
    val probs = logProbs.map(x => math.exp(x))
    (logProbs, probs)
  }

This was because I need the actual probs to process downstream from this...
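
A minimal sketch of one way to turn such log scores into values that sum to 1, assuming the per-class log scores are already available as a plain array. It is written in Java because the original poster in this thread works in Java. The class and method names are hypothetical and not part of MLlib, and the max-subtraction (log-sum-exp) step is a standard numerical precaution rather than something taken from the thread.

```java
import java.util.Arrays;

final class LogSumExpNormalizer {
    // Converts unnormalized per-class log scores into probabilities that sum to 1.
    // Assumes a non-empty array with at least one finite entry.
    static double[] normalize(double[] logScores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : logScores) {
            max = Math.max(max, s);
        }
        // Subtracting the maximum before exponentiating avoids underflow
        // when all scores are very negative.
        double sum = 0.0;
        for (double s : logScores) {
            sum += Math.exp(s - max);
        }
        // This plays the role of the log(P(evidence)) term.
        double logEvidence = max + Math.log(sum);
        double[] probs = new double[logScores.length];
        for (int i = 0; i < logScores.length; i++) {
            probs[i] = Math.exp(logScores[i] - logEvidence);
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] logScores = {Math.log(0.064), Math.log(0.0315)};
        System.out.println(Arrays.toString(normalize(logScores)));
    }
}
```

Exponentiating the raw scores directly, as in the snippet above, would give 0.064 and 0.0315 for this input; dividing by their sum is what the subtraction of the evidence term does in log space.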




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/MLlib-Naive-Bayes-classifier-confidence-tp18456p20175.html



Re: MLlib Naive Bayes classifier confidence

2014-11-10 Thread Sean Owen
Not directly. If you could access brzPi and brzTheta in the
NaiveBayesModel, you could repeat the same computation it performs in
predict() and exponentiate the result to get back class probabilities,
since input and internal values are in log space.

Hm, I wonder how people feel about exposing those fields or a different
method to expose class probabilities? Seems useful since it is conceptually
directly available.
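
To illustrate the computation described here, a rough Java sketch follows. It assumes the log priors and log likelihoods have somehow been obtained as plain arrays (playing the roles of brzPi and brzTheta); the class, method, and parameter names are hypothetical, and this is not MLlib's own code.

```java
import java.util.Arrays;

final class NaiveBayesLogScores {
    // logPrior[c]         : log P(class c), the role brzPi plays in the thread
    // logLikelihood[c][j] : log P(feature j | class c), the role of brzTheta
    // features[j]         : value of feature j in the sample being scored
    static double[] logScores(double[] logPrior, double[][] logLikelihood, double[] features) {
        double[] scores = new double[logPrior.length];
        for (int c = 0; c < logPrior.length; c++) {
            double s = logPrior[c];
            for (int j = 0; j < features.length; j++) {
                s += logLikelihood[c][j] * features[j];
            }
            scores[c] = s;
        }
        return scores;
    }

    // Index of the largest score, i.e. the predicted class position.
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] logPrior = {Math.log(0.5), Math.log(0.5)};
        double[][] logLikelihood = {
            {Math.log(0.8), Math.log(0.2)},
            {Math.log(0.3), Math.log(0.7)}
        };
        double[] features = {2.0, 1.0};
        double[] scores = logScores(logPrior, logLikelihood, features);
        System.out.println(Arrays.toString(scores) + " -> class index " + argMax(scores));
    }
}
```

The scores are in log space and unnormalized, so they are suitable for picking the most likely class but are not yet probabilities.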
On Nov 10, 2014 5:46 AM, jatinpreet jatinpr...@gmail.com wrote:

 Hi,

 Is there a way to get the confidence value of a prediction with MLlib's
 implementation of Naive Bayes classification? I wish to eliminate the
 samples that were classified with low confidence.

 Thanks,
 Jatin



 -
 Novice Big Data Programmer
 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/MLlib-Naive-Bayes-classifier-confidence-tp18456.html




Re: MLlib Naive Bayes classifier confidence

2014-11-10 Thread jatinpreet
Thanks for the answer. The variables brzPi and brzTheta are declared private.
I am writing my code in Java; otherwise I could have replicated the Scala
class and performed the desired computation, which, as I observed, is a
multiplication of brzTheta with the test vector, with the result added to
brzPi.

Any suggestions for a way out other than replicating the whole functionality
of the Naive Bayes model in Java? That would be a time-consuming process.



-
Novice Big Data Programmer
--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/MLlib-Naive-Bayes-classifier-confidence-tp18456p18472.html



Re: MLlib Naive Bayes classifier confidence

2014-11-10 Thread Sean Owen
It's hacky, but you could access these fields via reflection. It'd be
better to propose opening them up in a PR.
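
A hedged sketch of the reflection approach in Java. ToyModel is a hypothetical stand-in so the example runs without Spark; with the real NaiveBayesModel the compiled field names and types (Breeze vectors and matrices) may differ from what is shown, so they would need to be checked against the Spark version in use.

```java
import java.lang.reflect.Field;

final class PrivateFieldReader {
    // Hypothetical stand-in for a model class that keeps its state private.
    static final class ToyModel {
        private final double[] logPrior = {Math.log(0.25), Math.log(0.75)};
    }

    // Reads a private field declared directly on the target's class.
    static Object readPrivateField(Object target, String fieldName) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true); // bypass the private modifier
            return field.get(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not read field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        double[] logPrior = (double[]) readPrivateField(new ToyModel(), "logPrior");
        System.out.println("Read " + logPrior.length + " log priors via reflection");
    }
}
```

This kind of access breaks silently if the field is renamed in a later release, which is one reason a public accessor is the better long-term route.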

On Mon, Nov 10, 2014 at 9:25 AM, jatinpreet jatinpr...@gmail.com wrote:
 Thanks for the answer. The variables brzPi and brzTheta are declared private.
 I am writing my code in Java; otherwise I could have replicated the Scala
 class and performed the desired computation, which, as I observed, is a
 multiplication of brzTheta with the test vector, with the result added to
 brzPi.

 Any suggestions for a way out other than replicating the whole functionality
 of the Naive Bayes model in Java? That would be a time-consuming process.





Re: MLlib Naive Bayes classifier confidence

2014-11-10 Thread jatinpreet
Thanks, I will try it out and raise a request for making the variables
accessible.

An unrelated question, do you think the probability value thus calculated
will be a good measure of confidence in prediction? I have been reading
mixed opinions about the same.
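
For the filtering use case itself, a minimal Java sketch, assuming normalized per-class probabilities are already available for each sample. The names and the threshold value are hypothetical. Naive Bayes probabilities tend to be pushed toward 0 or 1 by the independence assumption, so any cutoff would likely need tuning on held-out data.

```java
import java.util.ArrayList;
import java.util.List;

final class ConfidenceFilter {
    // Returns the indices of samples whose highest class probability
    // is at least the given threshold.
    static List<Integer> confidentIndices(double[][] classProbs, double threshold) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < classProbs.length; i++) {
            double top = 0.0;
            for (double p : classProbs[i]) {
                top = Math.max(top, p);
            }
            if (top >= threshold) {
                kept.add(i);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[][] classProbs = {
            {0.90, 0.10},
            {0.55, 0.45},
            {0.20, 0.80}
        };
        System.out.println(confidentIndices(classProbs, 0.75));
    }
}
```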

Jatin



-
Novice Big Data Programmer
--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/MLlib-Naive-Bayes-classifier-confidence-tp18456p18497.html