Perhaps we should future-proof this a little and simply have a single classify
method that returns a typed object containing whatever info the implementation
produces? Something like:

    ClassifierResult classify()

and then ClassifierResult has an enum or something that indicates whether one
should grab the Vector, Matrix or double. Just brainstorming...
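
To make that a bit more concrete, here is a rough sketch of what such a result
type might look like. Everything in it is an assumption for illustration only:
the Kind enum, the factory and accessor names, and the use of plain double[]
and double[][] as stand-ins for Vector and Matrix so the snippet compiles on
its own. It is not an existing API.

```java
// Hypothetical sketch of the ClassifierResult idea; all names are illustrative.
// double[] and double[][] stand in for the real Vector and Matrix types.
final class ClassifierResult {
    // Tells the caller which accessor is valid for this result.
    enum Kind { SCALAR, VECTOR, MATRIX }

    private final Kind kind;
    private final double scalar;
    private final double[] vector;
    private final double[][] matrix;

    private ClassifierResult(Kind kind, double scalar, double[] vector, double[][] matrix) {
        this.kind = kind;
        this.scalar = scalar;
        this.vector = vector;
        this.matrix = matrix;
    }

    static ClassifierResult ofScalar(double value) {
        return new ClassifierResult(Kind.SCALAR, value, null, null);
    }

    static ClassifierResult ofVector(double[] values) {
        // Copy so later changes to the caller's array do not leak in.
        return new ClassifierResult(Kind.VECTOR, 0.0, values.clone(), null);
    }

    static ClassifierResult ofMatrix(double[][] values) {
        // Copy row by row for the same reason.
        double[][] copy = new double[values.length][];
        for (int i = 0; i < values.length; i++) {
            copy[i] = values[i].clone();
        }
        return new ClassifierResult(Kind.MATRIX, 0.0, null, copy);
    }

    Kind kind() {
        return kind;
    }

    double asScalar() {
        require(Kind.SCALAR);
        return scalar;
    }

    // Returns the stored array (not copied again).
    double[] asVector() {
        require(Kind.VECTOR);
        return vector;
    }

    // Returns the stored array (not copied again).
    double[][] asMatrix() {
        require(Kind.MATRIX);
        return matrix;
    }

    // Fail loudly if the caller asks for a form this result does not hold.
    private void require(Kind expected) {
        if (kind != expected) {
            throw new IllegalStateException("Result holds " + kind + ", not " + expected);
        }
    }
}
```

A caller would check kind() (or switch on it) before grabbing the value, and
asking for the wrong form fails with an exception rather than silently
returning a meaningless number.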


On May 20, 2011, at 2:21 PM, Hector Yee wrote:

> Hi,
> 
> I noticed that the classifier has three functions to call to get the score:
> classify - returns probabilities
> classifyNoLink - returns the raw score (optional)
> classifyScalar - returns the binary probability
> 
> I'm working on a few classifiers for which it doesn't make sense to return a
> probability. In fact, the probability is just the raw score exponentiated.
> This distorts the scores a bit compared with using the raw score directly.
> Also, if users assume that the scores are really probabilities, they may be
> tempted to use them to compare two classifiers without first calibrating on
> a test set.
> 
> I wonder if we can add classifyScalarNoLink and make the NoLink methods
> non-optional. They would just return probabilities if you're using a
> classifier whose output is already in the 0-1 range.
> This way people can choose to use either interface primarily, rather than
> calling classify and assuming all classifiers support probabilities.
> 
> Finally, there are some algorithms that can return regression, ranking or
> classification scores depending on the training data. I was just planning to
> return the same value via classifyScalarNoLink, but that seems a poor name
> for the proposed function. I could just name the function 'score', but that
> would break the naming convention already set down.
> 
> Thoughts?
> 
> -- 
> Yee Yang Li Hector
> http://hectorgon.blogspot.com/ (tech + travel)
> http://hectorgon.com (book reviews)

