Re: [Scikit-learn-general] 2nd try: Re: suggestions for unequal group training

2015-06-23 Thread Andreas Mueller
Also, you should think about what your performance measure should be, and whether it should be accuracy (usually it should not). AUC is often a good choice, but you need to choose an operating point in the end.
On 06/23/2015 10:58 AM, Trevor Stephens wrote: Many of the scikit-learn classifiers are equipped with a
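To illustrate the point about AUC and operating points, here is a minimal sketch. The synthetic data, the logistic regression model, and the 5% false-positive budget (`max_fpr`) are all assumptions made for the example and are not from the thread. AUC summarizes the ranking quality across all thresholds, while the operating point is the single threshold actually used to make predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data, roughly 90% class 0 and 10% class 1 (illustrative only).
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]  # predicted probability of class 1

# Threshold-free summary of how well the scores rank positives above negatives.
auc = roc_auc_score(y_test, scores)

# Choosing an operating point: the threshold with the highest true-positive
# rate whose false-positive rate stays within a (hypothetical) 5% budget.
# In practice the threshold would be picked on a validation split, not the test set.
fpr, tpr, thresholds = roc_curve(y_test, scores)
max_fpr = 0.05
within_budget = fpr <= max_fpr
best = int(np.argmax(np.where(within_budget, tpr, -1.0)))
operating_threshold = thresholds[best]

y_pred = (scores >= operating_threshold).astype(int)
print(f"AUC={auc:.3f}  threshold={operating_threshold:.3f}  "
      f"TPR={tpr[best]:.3f}  FPR={fpr[best]:.3f}")
```

Other criteria for the operating point (for example, a minimum recall or a cost-weighted trade-off) fit the same pattern: only the rule that selects `best` changes.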

Re: [Scikit-learn-general] 2nd try: Re: suggestions for unequal group training

2015-06-23 Thread Trevor Stephens
Many of the scikit-learn classifiers are equipped with a parameter `class_weight` that can be helpful in situations such as this. Depending on whether you are on the development branch or a public release, the preset "auto" or "balanced" will re-weight samples by their inverse class frequencies. You m
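As a rough sketch of what the `class_weight` preset does, the example below uses `"balanced"`, the spelling accepted by current scikit-learn releases (the older `"auto"` preset mentioned above may not be available in your version). The 900/100 class split, the random features, and the choice of `LogisticRegression` are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_class_weight

# Toy labels: 900 examples of class 0 and 100 of class 1 (illustrative only).
y = np.array([0] * 900 + [1] * 100)
classes = np.unique(y)

# "balanced" weights are n_samples / (n_classes * count_per_class),
# i.e. inversely proportional to class frequency.
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)
manual = len(y) / (len(classes) * np.bincount(y))
print(dict(zip(classes.tolist(), weights.tolist())))

# The same preset can be passed directly to classifiers that accept class_weight.
rng = np.random.default_rng(0)
X = rng.normal(size=(len(y), 3)) + y[:, None]  # made-up features, shifted by class
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
```

With this split the rare class receives a weight nine times that of the common class, so each of its examples counts for more in the loss.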

Re: [Scikit-learn-general] 2nd try: Re: suggestions for unequal group training

2015-06-23 Thread Sujit Pal
I had a similar situation, so I created a larger training set with roughly equal class membership by randomly sampling with replacement from the training set. Results were much better during CV (against the inflated training set) and also against the held-out test set (from the original training se
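The resampling approach described above might look something like the following sketch. The helper name `oversample_to_balance` is hypothetical, and drawing every class up to the size of the largest class is an assumption; the message only says the resulting classes were roughly equal.

```python
import numpy as np

def oversample_to_balance(X, y, random_state=0):
    """Draw rows with replacement so that every class has as many
    examples as the largest class (hypothetical helper for illustration)."""
    rng = np.random.default_rng(random_state)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    # For each class, sample `target` row indices with replacement.
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=target, replace=True)
        for c in classes
    ])
    rng.shuffle(idx)  # avoid leaving the rows grouped by class
    return X[idx], y[idx]

# Toy training set: 1000 rows of class 0 and 100 rows of class 1.
y_train = np.array([0] * 1000 + [1] * 100)
X_train = np.arange(len(y_train) * 2).reshape(len(y_train), 2)

X_bal, y_bal = oversample_to_balance(X_train, y_train)
print(np.bincount(y_bal))
```

Only the training portion is resampled here; a held-out test set would keep its original class proportions. One caveat worth checking in your own setup: if cross-validation is run on the inflated set, copies of the same row can land in both the training and validation folds, which tends to make CV scores look better than they are.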

[Scikit-learn-general] 2nd try: Re: suggestions for unequal group training

2015-06-23 Thread Neal Becker
Any suggestions?

Neal Becker wrote:
> I am interested in supervised learning for classification where I have
> multiple classes, but training data is highly unequal. There may be 1000s
> of training examples for class A, but maybe 100s for class B. What are
> suggested algorithms/approaches?
>

[Scikit-learn-general] barnabas

2015-06-23 Thread kouami barnabas
kouadibarna...@gmail.com