My point is just that a method named 'score' may describe what
'classifyFull' does better than the name 'classify' does, and it could be
added to AbstractVectorClassifier with no code changes
beyond:

public Vector score(Vector instance){
     return classifyFull(instance);
}
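
To make the relationship between the three methods concrete, here is a
rough, self-contained sketch. It is not Mahout code: SketchClassifier and
NormalizingClassifier are hypothetical stand-ins, double[] replaces
Mahout's Vector, and the choice to drop category 0 in classify is an
assumption made for illustration. The argmax helper shows how a category
can be recovered as the index of the largest score.

```java
import java.util.Arrays;

// Hypothetical stand-in for AbstractVectorClassifier (not the Mahout class).
abstract class SketchClassifier {
    // One score per category: n values for n categories.
    abstract double[] classifyFull(double[] instance);

    // n-1 scores. Assumption for this sketch: the score for category 0 is
    // dropped, which only loses nothing when the scores sum to one.
    double[] classify(double[] instance) {
        double[] full = classifyFull(instance);
        return Arrays.copyOfRange(full, 1, full.length);
    }

    // The proposed alias: same result as classifyFull, clearer name.
    double[] score(double[] instance) {
        return classifyFull(instance);
    }

    // Classification in the narrow sense: index of the largest score.
    static int argmax(double[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("no scores");
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }
}

// Toy model (hypothetical): treats the instance itself as unnormalized,
// non-negative category weights and scales them to sum to one.
class NormalizingClassifier extends SketchClassifier {
    @Override
    double[] classifyFull(double[] instance) {
        double sum = Arrays.stream(instance).sum();
        return Arrays.stream(instance).map(x -> x / sum).toArray();
    }
}

class ScoreSketch {
    public static void main(String[] args) {
        SketchClassifier c = new NormalizingClassifier();
        double[] instance = {1.0, 3.0, 4.0};
        System.out.println("score:    " + Arrays.toString(c.score(instance)));
        System.out.println("classify: " + Arrays.toString(c.classify(instance)));
        System.out.println("category: " + SketchClassifier.argmax(c.score(instance)));
    }
}
```

In this sketch the caller who wants a single category calls argmax on the
result of score, while classify stays available for code that relies on
the shorter n-1 vector.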

I'm not necessarily pushing for this; I'm just trying to generate discussion.

-Timothy Mann

On Tue, Oct 23, 2012 at 12:33 PM, Ted Dunning <[email protected]> wrote:

> Classification *is* regression.  You can always ask the result for the
> index of the largest score.
>
> On Tue, Oct 23, 2012 at 7:02 AM, Timothy Mann <[email protected]> wrote:
>
> > It also seems strange that the classify method is being used for
> > regression. To me classification is the act of selecting a category
> > according to some rule. Here what classification does is calculate scores
> > for an instance in each category. It may make sense to add a method, for
> > example,
> >
> > public Vector scores(Vector); or maybe public Vector evaluate(Vector);,
> > etc.
> >
> > Adding a method wouldn't break older code, but it also wouldn't resolve
> > the strange use of classify.
> >
> > -Timothy Mann
> >
> > On Tue, Oct 23, 2012 at 5:32 AM, Grant Ingersoll <[email protected]> wrote:
> >
> > >
> > > On Oct 22, 2012, at 12:20 AM, Ted Dunning wrote:
> > >
> > > > Yes.
> > > >
> > > > > It seems stupid in retrospect.  Changing these things is very painful,
> > > > > however, because we have no idea how many people will be affected.
> > >
> > > > That being said, we are still pre-1.0.  Better to change now than to
> > > > bake it into 1.0?
> > >
> > > >
> > > > > On Sun, Oct 21, 2012 at 9:16 PM, Timothy Mann <[email protected]> wrote:
> > > >
> > > > >> It seems strange to me that the classify method declared in
> > > > >> AbstractVectorClassifier returns a vector with n-1 scores, where n is the
> > > > >> number of categories. I understand that this decision was made for
> > > > >> efficiency reasons, but it seems like classify is the first place where
> > > > >> people will look in the API. Instead classifyFull provides the
> > > > >> implementation that a user may find more intuitive. Furthermore,
> > > > >> classifyFull does not require the assumption that the scores over all
> > > > >> categories represent probabilities that sum to one, and is therefore more
> > > > >> general. In fact, classify is not even implemented for the Naive Bayes
> > > > >> implementations but classifyFull is, which was initially confusing until I
> > > > >> understood what classify actually does. Any thoughts on this?
> > > >>
> > > >> -Timothy Mann
> > > >>
> > >
> > >
> >
>
