>> Do you believe that it is a major tool that is very useful in general?
I'm not sure it's the best option, but my main motive for sending this
proposal is that I would like to add new features to the ensemble package
of scikit-learn.
>> Have you had a lot of success using it?
I've tried it on the twenty newsgroups dataset (gist:
https://gist.github.com/Moh-Yakoub/7861747) with a library containing an
SGD classifier, an SVC, 3 Bernoulli naive Bayes models and 3 multinomial
naive Bayes models. Scoring each model with the f1-score and forming an
ensemble of the top 3 models, I got the following results:
-----------------------------------------------------------------------------
training time: 8.302s
prediction time: 0.050s
             precision    recall  f1-score   support

          0       0.95      0.95      0.95        37
          1       0.83      0.83      0.83        65
          2       0.80      0.87      0.83        54
          3       0.84      0.87      0.85        76
          4       0.98      0.83      0.90        66
          5       0.91      0.88      0.90        59
          6       0.75      0.88      0.81        50
          7       0.96      0.85      0.90        53
          8       0.95      0.97      0.96        63
          9       0.93      0.96      0.95        57
         10       0.98      0.97      0.98        65
         11       0.98      0.96      0.97        53
         12       0.83      0.86      0.84        57
         13       0.96      0.96      0.96        53
         14       0.97      0.97      0.97        65
         15       0.96      0.93      0.94        73
         16       0.94      0.93      0.93        54
         17       0.91      1.00      0.95        63
         18       0.87      0.92      0.89        37
         19       0.88      0.69      0.77        32

avg / total       0.91      0.91      0.91      1132
------------------------------------------------------------------------------
This is a `minor` improvement over the benchmarks that use each of those
classifiers alone (
http://scikit-learn.org/stable/auto_examples/document_classification_20newsgroups.html).
I agree that it's not a `major tool`, and I would appreciate it if you could
point me to any new `valuable` paper about forming an ensemble from a library
of models, or more generally to any `valuable` paper on ensemble methods
that I could contribute to scikit-learn. Thanks a lot for your
consideration.
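
As a rough illustration of the experiment described above (score every model
in a library, keep the top 3 by f1-score, combine them), here is a minimal
sketch. It is not the code from the gist. The function names `macro_f1`,
`select_top_k` and `majority_vote` are hypothetical, the macro-averaged F1
and the majority vote are assumptions (the message does not say which F1
average was used or how the three models were combined), and this simple
top-k ranking is not the greedy forward selection of the Caruana et al. paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = []
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def select_top_k(library, X_train, y_train, X_val, y_val, k=3):
    """Fit each model, rank by validation macro-F1, keep the best k.

    `library` maps a name to any object exposing fit(X, y) and predict(X).
    """
    scored = []
    for name, model in library.items():
        model.fit(X_train, y_train)
        score = macro_f1(y_val, list(model.predict(X_val)))
        scored.append((score, name, model))
    # Stable sort: models with equal scores keep their library order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, model) for _, name, model in scored[:k]]


def majority_vote(selected, X):
    """Combine the selected models by plurality vote (an assumption).

    Ties are broken in favour of the higher-ranked model's prediction.
    """
    all_preds = [list(model.predict(X)) for _, model in selected]
    combined = []
    for votes in zip(*all_preds):
        counts = Counter(votes)
        best = max(counts.values())
        combined.append(next(v for v in votes if counts[v] == best))
    return combined
```

Since scikit-learn estimators expose `fit` and `predict`, a dict such as
`{"sgd": SGDClassifier(), "svc": SVC(), ...}` could be passed as `library`,
with a held-out validation split used for the ranking.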
Respectfully
Yakoub
On Sun, Dec 8, 2013 at 7:57 PM, Gael Varoquaux <
[email protected]> wrote:
> Hi Magellane,
>
> > I would like to provide an implementation for the Ensemble selection
> > technique as described by the following paper: Ensemble selection from
> > libraries of models by Rich Caruana, Alexandru Niculescu-Mizil, Geoff
> > Crew, Alex Ksikes (
> > www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icml04.icdm06long.pdf)
>
> This paper has 200 citations on Google scholar, which is somewhat on the
> low end of what we include in scikit-learn.
>
> Do you believe that it is a major tool that is very useful in general?
> Have you had a lot of success using it?
>
> Thanks a lot for the proposal,
>
> Gaël
>
>
> ------------------------------------------------------------------------------
> Sponsored by Intel(R) XDK
> Develop, test and display web and hybrid apps with a single code base.
> Download it for free now!
>
> http://pubads.g.doubleclick.net/gampad/clk?id=111408631&iu=/4140/ostg.clktrk
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>