[Scikit-learn-general] Feature selection using Information Gain and Information Gain Ratio

2016-03-30 Thread Viktor Pekar
Dear all, I've created a pull request with new functions to perform feature selection using IG and GR: https://github.com/scikit-learn/scikit-learn/pull/6534 These are two popular feature selection methods, and it would be great if scikit-learn implemented them. The implementation closely follows th
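
For readers unfamiliar with the two measures, here is a minimal sketch of how information gain (IG) and gain ratio (GR) are commonly defined for a single discrete feature: IG(Y; X) = H(Y) - H(Y | X), and GR = IG / H(X). This is not the code from the pull request; the function names `entropy`, `information_gain`, and `gain_ratio` are hypothetical and chosen only for illustration, and the sketch assumes both the feature and the target are discrete.

```python
import numpy as np


def entropy(values):
    """Shannon entropy (in bits) of a discrete 1-D array."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


def information_gain(feature, target):
    """IG(Y; X) = H(Y) - H(Y | X) for a discrete feature X and target Y."""
    feature = np.asarray(feature)
    target = np.asarray(target)
    h_cond = 0.0
    for v in np.unique(feature):
        mask = feature == v
        # Weight each group's target entropy by the group's relative size.
        h_cond += mask.mean() * entropy(target[mask])
    return entropy(target) - h_cond


def gain_ratio(feature, target):
    """GR = IG / H(X); defined here as 0 when the feature is constant."""
    split_info = entropy(feature)
    if split_info == 0.0:
        return 0.0
    return information_gain(feature, target) / split_info


if __name__ == "__main__":
    y = [0, 0, 1, 1]
    print(information_gain([0, 0, 1, 1], y))  # feature determines the target
    print(information_gain([0, 1, 0, 1], y))  # feature independent of the target
    print(gain_ratio([0, 0, 1, 1], y))
```

Features could then be ranked by either score. Dividing by H(X) in the gain ratio penalizes features with many distinct values, which plain information gain tends to favor.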

[Scikit-learn-general] Scikit-learn for distributed processing?

2012-07-19 Thread Viktor Pekar
Dear all, I am trying to find information on the use of scikit-learn on very large datasets, i.e., whether and how it can be used in a distributed processing setup. I saw that PiCloud has scikit-learn installed in their environment, and this post suggests it can be deployed on PiCloud: http://stackoverf