Thanks Sean. As far as I can see, the probabilities are NOT normalized; the
denominator isn't implemented in either v1.1.0 or v1.5.0 (by denominator,
I refer to the probability of the feature vector X, i.e. P(X)). So, for a
given lambda, how can I compute the denominator? FYI:
https://github.com/apache/spark/blob/v1.5.0/mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala
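
Is something like the following what you have in mind? This is just my own
untested sketch of the log-space normalization you describe below; the
logNormalize helper is mine, not anything that exists in MLlib:

// Untested sketch: rawScores(i) = brzPi(i) + (brzTheta * testData.toBreeze)(i),
// i.e. log P(class i) + log P(x | class i). The denominator log P(x) is the
// log-sum-exp of these scores; subtracting it and exponentiating yields
// class probabilities that sum to 1.
def logNormalize(rawScores: Array[Double]): Array[Double] = {
  val maxScore = rawScores.max  // shift by the max for numerical stability
  val logSumExp = maxScore + math.log(rawScores.map(s => math.exp(s - maxScore)).sum)
  rawScores.map(s => math.exp(s - logSumExp))
}

// e.g.: val probs = logNormalize((brzPi + brzTheta * testData.toBreeze).toArray)

If I understand correctly, for a fixed trained model the denominator is just
the logSumExp value above; lambda only enters through brzPi and brzTheta at
training time. Please correct me if I'm wrong.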


*// Adamantios*



On Thu, Sep 10, 2015 at 7:03 PM, Sean Owen <so...@cloudera.com> wrote:

> The log probabilities are unlikely to be very large, though the
> probabilities may be very small. The direct answer is to exponentiate
> brzPi + brzTheta * testData.toBreeze -- apply exp(x).
>
> I have forgotten whether the probabilities are normalized already
> though. If not, you'll have to normalize to get them to sum to 1 and be
> real class probabilities. This is better done in log space though.
>
> On Thu, Sep 10, 2015 at 5:12 PM, Adamantios Corais
> <adamantios.cor...@gmail.com> wrote:
> > great. so, provided that model.theta represents the log-probabilities (and
> > hence the result of brzPi + brzTheta * testData.toBreeze is a big number
> > too), how can I get back the non-log probabilities, which - apparently -
> > are bounded between 0.0 and 1.0?
> >
> >
> > // Adamantios
> >
> >
> >
> > On Tue, Sep 1, 2015 at 12:57 PM, Sean Owen <so...@cloudera.com> wrote:
> >>
> >> (pedantic: it's the log-probabilities)
> >>
> >> On Tue, Sep 1, 2015 at 10:48 AM, Yanbo Liang <yblia...@gmail.com> wrote:
> >> > Actually
> >> > brzPi + brzTheta * testData.toBreeze
> >> > gives the probabilities of the input Vector on each class; however, it is
> >> > a Breeze Vector.
> >> > Pay attention that the index of this Vector needs to map to the
> >> > corresponding label index.
> >> >
> >> > 2015-08-28 20:38 GMT+08:00 Adamantios Corais
> >> > <adamantios.cor...@gmail.com>:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I am trying to change the following code so as to get the probabilities
> >> >> of the input Vector on each class (instead of the class itself with the
> >> >> highest probability). I know that this is already available as part of
> >> >> the most recent release of Spark but I have to use Spark 1.1.0.
> >> >>
> >> >> Any help is appreciated.
> >> >>
> >> >>> override def predict(testData: Vector): Double = {
> >> >>>     labels(brzArgmax(brzPi + brzTheta * testData.toBreeze))
> >> >>>   }
> >> >>
> >> >>
> >> >>> https://github.com/apache/spark/blob/v1.1.0/mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala
> >> >>
> >> >>
> >> >> // Adamantios
> >> >>
> >> >>
> >> >
> >
> >
>
