Re: [Scikit-learn-general] is predict_proba output valid for a stopping criterion in EM?

josef . pktd Sun, 25 Sep 2011 05:39:07 -0700

On Sun, Sep 25, 2011 at 7:57 AM, Lars Buitinck <[email protected]> wrote:


> 2011/9/25 Mathieu Blondel <[email protected]>:
> > On Sun, Sep 25, 2011 at 7:05 PM, Lars Buitinck <[email protected]> wrote:
> >
> > That seems very similar to Kamal Nigam's semi-supervised Naive-Bayes.
>
> That's right. The first difference is the initialization, where Nigam
> starts from a labeled set containing all classes, while Liu initially
> assumes the unlabeled set contains the negative examples. The second
> difference is convergence, see below.
>
> > In theory, I think that EM guarantees convergence in likelihood but
> > not in parameters or probabilities. In practice, I don't know
> > (monitoring likelihood is slow though...). Here's a related post on
> > the LingPipe blog (especially the comments at the end):
> >
> > http://lingpipe-blog.com/2011/01/04/monitoring-convergence-of-em-for-map-estimates-with-priors/
>
> This is news for me. However, Liu (and I believe Nigam, in an earlier
> paper) checks convergence based on the parameters, so apparently this
> is good enough. And [thinking out loud] the prediction probabilities
> of NB would converge iff the parameters converge, right?
>
> I could of course restrict the algo to linear classifiers, if need be.
>

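In statsmodels we only have Logit and Probit estimated by maximum
likelihood, so I can't say how this works in general.

For those models the predict_proba values are just a nonlinear monotonic
transformation of the linear predictor, and so a smooth function of the
parameters. Checking convergence on one or the other then differs mainly
in how the convergence tolerance has to be specified.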
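
A minimal sketch of the tolerance point, assuming a plain logistic link and
made-up data (X, beta_old and beta_new are illustrative, not taken from
statsmodels or scikit-learn). Since the logistic function has slope at most
1/4, a small step in the parameters gives a step in the probabilities that is
bounded by a quarter of the step in the linear predictor, so the two
criteria need differently scaled tolerances:

```python
import numpy as np


def sigmoid(z):
    """Logistic link: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical design matrix and two successive parameter estimates.
rng = np.random.RandomState(0)
X = rng.randn(200, 3)
beta_old = np.array([0.5, -1.0, 2.0])
beta_new = beta_old + 1e-4 * np.array([1.0, -1.0, 1.0])

# Convergence measure on the parameters.
param_change = np.max(np.abs(beta_new - beta_old))

# Convergence measure on the predicted probabilities.
proba_change = np.max(np.abs(sigmoid(X @ beta_new) - sigmoid(X @ beta_old)))

# The logistic function is Lipschitz with constant 1/4, which bounds the
# change in probabilities by the change in the linear predictor.
bound = 0.25 * np.max(np.abs(X @ (beta_new - beta_old)))

print("max parameter change:  ", param_change)
print("max probability change:", proba_change)
print("Lipschitz bound:       ", bound)
```

The bound depends on the scale of X, which is why a tolerance that is
sensible for the parameters is not automatically sensible for the
probabilities.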

However, the problem that we just ran into is the case of complete (or
quasi-) separation. In that case the predict_proba values converge to 0
and 1 while the parameters go off to infinity, so the boundary behavior
might be messy.
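
A toy illustration of the separation case, as a sketch only: a one-parameter
logistic model without intercept, fit by plain gradient ascent on four
perfectly separated points. The data, the step size and the helper
run_gradient_ascent are all assumptions made for this example; this is not
how statsmodels fits Logit.

```python
import numpy as np


def sigmoid(z):
    """Logistic link: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


# Perfectly separated toy data: negative x has y = 0, positive x has y = 1.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0])


def run_gradient_ascent(n_iter, step=1.0):
    """Maximize the logistic log-likelihood in a single slope parameter.

    Returns one tuple per iteration:
    (beta, |change in beta|, max |change in predicted probabilities|).
    """
    beta = 0.0
    history = []
    for _ in range(n_iter):
        p_old = sigmoid(beta * x)
        # Gradient of the log-likelihood with respect to beta.
        beta_new = beta + step * np.sum((y - p_old) * x)
        p_new = sigmoid(beta_new * x)
        history.append(
            (beta_new, abs(beta_new - beta), np.max(np.abs(p_new - p_old)))
        )
        beta = beta_new
    return history


history = run_gradient_ascent(1000)
for it in (10, 100, 1000):
    beta, d_beta, d_proba = history[it - 1]
    print(it, beta, d_beta, d_proba)
```

In this sketch the slope keeps increasing at every iteration, while the
per-iteration change in the probabilities shrinks much faster than the change
in the parameter. A stopping rule on predict_proba would therefore trigger
long before one on the parameters, which never settle.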

Josef



>
> --
> Lars Buitinck
> Scientific programmer, ILPS
> University of Amsterdam
>
