Thanks Sean. As far as I can see, the probabilities are NOT normalized: the denominator (by which I mean the probability of the feature vector X) isn't implemented in either v1.1.0 or v1.5.0. So, for a given lambda, how do I compute the denominator?

FYI: https://github.com/apache/spark/blob/v1.5.0/mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala

*// Adamantios*

On Thu, Sep 10, 2015 at 7:03 PM, Sean Owen <so...@cloudera.com> wrote:

> The log probabilities are unlikely to be very large, though the
> probabilities may be very small. The direct answer is to exponentiate
> brzPi + brzTheta * testData.toBreeze -- apply exp(x).
>
> I have forgotten whether the probabilities are normalized already
> though. If not you'll have to normalize to get them to sum to 1 and be
> real class probabilities. This is better done in log space though.
>
> On Thu, Sep 10, 2015 at 5:12 PM, Adamantios Corais
> <adamantios.cor...@gmail.com> wrote:
> > great. so, provided that model.theta represents the log-probabilities and
> > (hence the result of brzPi + brzTheta * testData.toBreeze is a big number
> > too), how can I get back the non-log-probabilities which - apparently - are
> > bounded between 0.0 and 1.0?
> >
> > // Adamantios
> >
> > On Tue, Sep 1, 2015 at 12:57 PM, Sean Owen <so...@cloudera.com> wrote:
> >>
> >> (pedantic: it's the log-probabilities)
> >>
> >> On Tue, Sep 1, 2015 at 10:48 AM, Yanbo Liang <yblia...@gmail.com> wrote:
> >> > Actually
> >> > brzPi + brzTheta * testData.toBreeze
> >> > is the probabilities of the input Vector on each class, however it's a
> >> > Breeze Vector.
> >> > Pay attention the index of this Vector need to map to the corresponding
> >> > label index.
> >> >
> >> > 2015-08-28 20:38 GMT+08:00 Adamantios Corais
> >> > <adamantios.cor...@gmail.com>:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I am trying to change the following code so as to get the probabilities of
> >> >> the input Vector on each class (instead of the class itself with the highest
> >> >> probability). I know that this is already available as part of the most
> >> >> recent release of Spark but I have to use Spark 1.1.0.
> >> >>
> >> >> Any help is appreciated.
> >> >>
> >> >>> override def predict(testData: Vector): Double = {
> >> >>>   labels(brzArgmax(brzPi + brzTheta * testData.toBreeze))
> >> >>> }
> >> >>
> >> >>> https://github.com/apache/spark/blob/v1.1.0/mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala
> >> >>
> >> >> // Adamantios
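
For anyone following the thread, here is a minimal sketch of the log-space normalization Sean suggests, in plain Scala with no Spark or Breeze dependency. The denominator P(X) is the sum over classes of exp(log prior + log likelihood), so it can be computed from the per-class scores alone. As far as I can tell, lambda is already folded into model.theta at training time, so it does not appear again here. The names PosteriorSketch and posterior are hypothetical, and the input is assumed to be the per-class scores from brzPi + brzTheta * testData.toBreeze copied into an Array[Double].

```scala
object PosteriorSketch {
  // logJoint(i) is assumed to be log P(class i) + log P(x | class i),
  // i.e. the unnormalized per-class score discussed in the thread.
  def posterior(logJoint: Array[Double]): Array[Double] = {
    // Subtract the maximum before exponentiating so exp() cannot underflow
    // to 0.0 for every class (the log-sum-exp trick).
    val maxLog = logJoint.max
    // log P(x) = max + log(sum_i exp(logJoint(i) - max))
    val logEvidence = maxLog + math.log(logJoint.map(s => math.exp(s - maxLog)).sum)
    // P(class i | x) = exp(logJoint(i) - log P(x))
    logJoint.map(s => math.exp(s - logEvidence))
  }
}
```

The returned array is in the same order as the scores, so index i still has to be mapped to labels(i), as Yanbo points out above.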