On Tue, Oct 04, 2011 at 12:23:59AM +0200, Gael Varoquaux wrote:
> On Mon, Oct 03, 2011 at 06:16:37PM -0400, Satrajit Ghosh wrote:
> >    when i used mean_square_error as score_func, it gave me p=.98, when i was
> >    pretty positive i had a significant result. but that's because the lower
> >    the value is in the distribution the better it is. this obviously reversed
> >    when i used explained_variance, where things closer to 1 are better.
> >    do you think stating that score_func should return a float between 0 and 1
> >    would be better or to state that if you have a score_func that ranges from
> >    0 to inf and whose lower bound is a better score, then interpret
> >    significance as 1-p_value?
> 
> In the scikit, there is a convention that everything that is a 'score' is
> 'bigger is better'. The reason is that it enables black box optimizers to
> tune parameters or select models based on this score. I wouldn't like to
> enforce that it is bounded between 0 and 1, because many scores used in real
> life are not bounded. Also, in general, you cannot interpret a score (like
> explained_variance) as related to a p-value. I wouldn't try to oversimplify
> the message by fitting all the metrics into one framework. I don't think
> that it can work: they test for different things.
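
To make the 'bigger is better' convention above concrete, here is a minimal
sketch using only NumPy. It assumes a permutation p-value that counts permuted
scores at least as large as the observed one; `neg_mse` and
`permutation_p_value` are hypothetical helper names for illustration, not
scikit-learn functions.

```python
import numpy as np

def neg_mse(y_true, y_pred):
    """Negated mean squared error, so that higher (closer to 0) is better."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return -np.mean((y_true - y_pred) ** 2)

def permutation_p_value(score, permuted_scores):
    """Fraction of scores at least as large as the observed one.

    Assumes 'bigger is better'; the +1 terms count the observed score itself.
    """
    permuted_scores = np.asarray(permuted_scores, dtype=float)
    return (np.sum(permuted_scores >= score) + 1.0) / (permuted_scores.size + 1.0)

rng = np.random.RandomState(0)
y = rng.normal(size=200)
y_pred = y + 0.1 * rng.normal(size=200)  # predictions close to the targets

observed = neg_mse(y, y_pred)
permuted = np.array([neg_mse(rng.permutation(y), y_pred) for _ in range(99)])

# Negated MSE follows the convention, so a good fit gives a small p-value.
p_score = permutation_p_value(observed, permuted)

# Raw MSE fed in as if it were a score: the same fit now looks insignificant.
p_loss = permutation_p_value(-observed, -permuted)
```

With a raw loss, the good fit ends up at the wrong end of the permutation
distribution, which would explain a p-value near 1. Negating the loss keeps the
test itself unchanged and only fixes the direction.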

A more common transformation I've seen is -log(pval). This has the benefit
that it looks nothing like a p-value, and it may have some numerical stability
advantages in certain situations (in particular, computing
1 - A_VERY_SMALL_VALUE in floating point is very error-prone).
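
A small sketch of the numerical point, assuming IEEE double precision (the
default Python float). The two p-values below are made-up illustrative numbers,
not results from any real test.

```python
import math

p_small = 1e-20    # hypothetical, very small p-value
p_smaller = 1e-30  # hypothetical, even smaller p-value

# 1 - p: both results round to exactly 1.0 in double precision,
# so the difference between the two p-values is lost.
one_minus_a = 1.0 - p_small
one_minus_b = 1.0 - p_smaller

# -log(p): bigger is better, and the two values stay clearly distinct.
neg_log_a = -math.log(p_small)    # about 46.05
neg_log_b = -math.log(p_smaller)  # about 69.08
```

So -log(pval) preserves the ordering of very small p-values, whereas 1 - pval
collapses them to the same number.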

David

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
