On Sun, Sep 25, 2011 at 11:12 AM, Lars Buitinck <[email protected]> wrote:

> 2011/9/25  <[email protected]>:
> > The predict_proba are just nonlinear monotonic transformations of the
> > parameters. So the difference is only in specifying the convergence
> > tolerance.
>
> That's what I thought, and I'd be lazy and let the client determine
> the tolerance parameter ;)
>
> > However, the problem that we just had is the complete (quasi-) separation
> > case. In this case the predict_proba converge to 0 and 1, while the
> > parameters will go off to infinity.
> > So the boundary behavior might be messy.
>
> Right, so unless I map the parameters back from log-space to [0,1]
> (which is exactly what NB's predict_proba does), predict_proba would
> actually be a safer bet than coef_ + intercept_?
>

I guess it depends on what you want. In the separation case, the maximum
likelihood estimator doesn't exist and the parameters might not be
identified; see the weird results that Alex had with the Iris example halfway
down this ticket:
https://github.com/statsmodels/statsmodels/issues/66

For prediction, the results might be OK, and I have no idea whether you will
run into these problems in machine learning.
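To make the separation behavior concrete, here is a hypothetical sketch (not from the thread; the data and settings are made up) using scikit-learn's LogisticRegression. As the L2 penalty is relaxed (larger C), the coefficient on completely separated data keeps growing, approximating the nonexistent unpenalized MLE, while predict_proba just saturates at 0 and 1:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy, completely separated data: every x < 1.5 is class 0 and every
# x > 1.5 is class 1, so the unpenalized logistic MLE does not exist.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

for C in (1.0, 1e3, 1e6):  # weaker and weaker regularization
    clf = LogisticRegression(C=C, max_iter=10_000).fit(X, y)
    # coef_ grows with C, while predict_proba only creeps toward 0 and 1
    print(f"C={C:g}  coef={clf.coef_[0, 0]:.2f}  "
          f"proba={clf.predict_proba(X)[:, 1].round(3)}")
```

So the predicted probabilities look stable long before the parameters do, which is why a check on coef_ is the stricter one.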

In the readings on complete (quasi-)separation, the recommendation was to
check convergence on the parameters, so that the non-convergence case can be
detected.
If you just use predict_proba, or in our case the defaults of the generalized
linear models, we get convergence in the criterion that is used, but that
doesn't indicate that the parameters didn't converge and make only very
limited sense. Our maximum likelihood models for Logit and Probit hit the
maximum number of iterations and stop without converging.
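As an illustration of why a criterion-based check is not enough, here is a hypothetical sketch (plain gradient ascent on the unpenalized log-likelihood, made-up data, not our actual solver): the log-likelihood converges to its supremum of 0, so a likelihood-based stopping rule would report success, while the slope parameter is still drifting off to infinity:

```python
import numpy as np

# Completely separated toy data; first column is the intercept.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

def loglike(w):
    z = X @ w
    return np.sum(y * z - np.log1p(np.exp(z)))

w = np.zeros(2)
prev = loglike(w)
for step in range(5000):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w += 0.5 * X.T @ (y - p)  # plain gradient ascent step
cur = loglike(w)

# The likelihood looks converged: it has climbed to essentially 0 and
# barely changes per step any more ...
print("loglike:", cur)
# ... but the slope has grown the whole time and would grow forever;
# only a parameter-based convergence check would flag this.
print("slope:", w[1])
```

A parameter-based rule (e.g. stop only when the change in w is small relative to its size) never fires here, which is exactly the signal that separation should be reported instead of a "converged" fit.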

Josef

>
> --
> Lars Buitinck
> Scientific programmer, ILPS
> University of Amsterdam
>
>
> ------------------------------------------------------------------------------
> All of the data generated in your IT infrastructure is seriously valuable.
> Why? It contains a definitive record of application performance, security
> threats, fraudulent activity, and more. Splunk takes this data and makes
> sense of it. IT sense. And common sense.
> http://p.sf.net/sfu/splunk-d2dcopy2
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>