On Jan 4, 2012, at 7:59 PM, Lance Norskog wrote:
>
> Is it worthwhile to clamp the training data so that there are similar
> numbers of documents for each label? Or does Naive Bayes work well
> with a bell curve?

It's possible to do this already with the --maxItemsPerLabel option. I think I use it somewhere in the NB setup.
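
For anyone who wants to see what clamping per label amounts to, here is a rough sketch in Python. It is not the Mahout code behind --maxItemsPerLabel; the function name cap_items_per_label and the (label, document) tuple layout are made up for illustration, and it simply keeps the first N items seen for each label.

```python
from collections import defaultdict


def cap_items_per_label(examples, max_items_per_label):
    """Keep at most max_items_per_label examples per label, in input order.

    Hypothetical helper for illustration only; not part of Mahout.
    `examples` is an iterable of (label, document) pairs.
    """
    counts = defaultdict(int)  # number of examples kept so far for each label
    kept = []
    for label, doc in examples:
        if counts[label] < max_items_per_label:
            kept.append((label, doc))
            counts[label] += 1
    return kept


# Toy data: three "spam" documents and one "ham" document.
examples = [("spam", "a"), ("spam", "b"), ("spam", "c"), ("ham", "d")]
capped = cap_items_per_label(examples, 2)
```

Note that a cap only trims the over-represented labels; labels with fewer items than the cap stay as small as they were.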
