Re: Help with Classifier

2013-02-15 Thread Ted Dunning
You win the prize. Order is very important in stochastic gradient descent. Randomizing once should be fine. It should also be fine to do a random merge of the two classes. Or an alternating join. On Thu, Feb 14, 2013 at 1:33 PM, Brian McCallister bri...@skife.org wrote: So to answer my own
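For readers of the archive, here is a minimal sketch of the three orderings mentioned above: a one-time shuffle, a random merge of the two classes, and an alternating join. It is plain Python rather than Mahout code. The function names (`shuffle_once`, `random_merge`, `alternating_join`) and the `(document, label)` tuples are hypothetical and used only for illustration.

```python
import random


def shuffle_once(examples, seed=42):
    """Return a shuffled copy of the full training set (randomize once)."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    return shuffled


def random_merge(class_a, class_b, seed=42):
    """Randomly interleave two class-sorted lists.

    Order within each class is preserved. The next item is drawn from a
    class with probability proportional to how many items it has left.
    """
    rng = random.Random(seed)
    a, b = list(class_a), list(class_b)
    i = j = 0
    merged = []
    while i < len(a) and j < len(b):
        remaining_a = len(a) - i
        remaining_b = len(b) - j
        if rng.random() < remaining_a / (remaining_a + remaining_b):
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # One list is exhausted; append whatever is left of the other.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged


def alternating_join(class_a, class_b):
    """Alternate a, b, a, b, ... until the shorter class runs out.

    With unbalanced classes the tail contains only the larger class,
    so a shuffle or random merge may be the better choice there.
    """
    a, b = list(class_a), list(class_b)
    merged = []
    for x, y in zip(a, b):
        merged.append(x)
        merged.append(y)
    shorter = min(len(a), len(b))
    merged.extend(a[shorter:])
    merged.extend(b[shorter:])
    return merged


if __name__ == "__main__":
    # Hypothetical toy data: (document, label) pairs grouped by class.
    positives = [("pos_doc_%d" % n, 1) for n in range(3)]
    negatives = [("neg_doc_%d" % n, 0) for n in range(3)]
    for doc, label in random_merge(positives, negatives):
        print(label, doc)
```

Whichever ordering is used, the resulting list would then be fed to the learner one example at a time, so that the model never sees a long run of a single class.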

Re: Help with Classifier

2013-02-14 Thread Brian McCallister
So to answer my own question, the order of training matters. I had been training on all of category 1, then all of category 0. Apparently this breaks things badly. On Wed, Feb 13, 2013 at 4:29 PM, Brian McCallister bri...@skife.org wrote: I'm trying to do a basic two category classifier on textual data, I

Help with Classifier

2013-02-13 Thread Brian McCallister
I'm trying to do a basic two-category classifier on textual data. I am working with a training set of only about 100,000 documents, and am using an AdaptiveLogisticRegression with default settings. When I build the trainer it reports: % correct: 0.9996315789473774 AUC: 0.75