Also, you should think about what your performance measure should be,
and whether it should be accuracy (usually it is not).
AUC is often a good choice, but you still need to pick an operating point
(a decision threshold) in the end.
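As a rough illustration of that last point (toy scores, and Youden's J statistic as just one possible heuristic for picking the threshold):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical ground truth and classifier scores.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.2, 0.3, 0.35, 0.4, 0.8, 0.9, 0.5, 0.7, 0.15])

# AUC summarizes ranking quality over all thresholds...
auc = roc_auc_score(y_true, y_score)

# ...but deployment needs one threshold. One heuristic is to maximize
# TPR - FPR (Youden's J) along the ROC curve.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
best = np.argmax(tpr - fpr)
operating_threshold = thresholds[best]
```

With an imbalanced problem you would more likely pick the threshold from the costs of false positives vs. false negatives, but the mechanics are the same.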
On 06/23/2015 10:58 AM, Trevor Stephens wrote:
Many of the scikit-learn classifiers are equipped with a parameter
`class_weight` that can be helpful in situations such as this. Depending on
whether you are on the development branch or a public release, the preset
"auto" or "balanced" will re-weight samples by their inverse class
frequencies.
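For a concrete sketch of what that looks like (toy imbalanced data invented here; logistic regression is just one classifier that accepts `class_weight`):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
# Imbalanced toy data: 1000 samples of class 0, 100 of class 1.
X = np.vstack([rng.normal(0.0, 1.0, size=(1000, 2)),
               rng.normal(2.0, 1.0, size=(100, 2))])
y = np.hstack([np.zeros(1000), np.ones(100)])

# "balanced" weights each class by n_samples / (n_classes * count(class)),
# so the minority class contributes as much total weight as the majority.
clf = LogisticRegression(class_weight="balanced").fit(X, y)
minority_recall = (clf.predict(X)[y == 1] == 1).mean()
```

Without the re-weighting, a classifier on data this skewed tends to buy overall accuracy by sacrificing recall on the rare class.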
I had a similar situation, so I created a larger training set with roughly
equal class membership by randomly sampling with replacement from the
training set. Results were much better during CV (against the inflated
training set) and also against the held-out test set (from the original
training set).
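The sampling-with-replacement approach described above might be sketched like this (numpy only; the data here is made up):

```python
import numpy as np

rng = np.random.RandomState(42)
# Hypothetical imbalanced training set: 1000 of class "A", 100 of class "B".
y = np.array(["A"] * 1000 + ["B"] * 100)
X = rng.normal(size=(1100, 3))

classes, counts = np.unique(y, return_counts=True)
n_max = counts.max()

# Draw each class with replacement up to the majority-class count,
# producing a larger training set with equal class membership.
idx = np.hstack([rng.choice(np.where(y == c)[0], size=n_max, replace=True)
                 for c in classes])
X_bal, y_bal = X[idx], y[idx]
```

One caveat with CV against the inflated set: duplicated minority samples can land in both the train and validation folds, so those CV numbers can be optimistic; the held-out test score is the more trustworthy one.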
Neal Becker wrote:
> I am interested in supervised learning for classification where I have
> multiple classes, but training data is highly unequal. There may be 1000s
> of training examples for class A, but maybe 100s for class B. What are
> suggested algorithms/approaches? Any suggestions?
>