Re: [Scikit-learn-general] Questions for plot_forest_iris.py and AdaBoost

Ian Ozsvald Sun, 07 Jul 2013 13:37:07 -0700

Hi Peter. Thanks for the issue. I mentioned in my email that this
brings up a second problem. The demo uses some of AdaBoost's defaults
- notably the default "algorithm='SAMME.R'".


1) If the algorithm were SAMME (the only other choice) then we would
have model.estimator_weights_ and could scale the alpha parameter
using these weights.
2) By default we use SAMME.R - I haven't (yet) figured out how the
weights are used. The estimator_weights_ are not used (they're all set
to 1.0); _samme_proba() is used but it doesn't seem to take any
weights. Maybe SAMME.R has no weights (and I'm missing something
else)?
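
For point 1, the scaling I have in mind could be sketched roughly as
below. This is only an illustration: scaled_alphas and max_alpha are
hypothetical names of mine (not part of scikit-learn), and normalising
by the largest weight is just one possible choice.

```python
def scaled_alphas(estimator_weights, max_alpha=0.1):
    """Map per-estimator weights to plot alpha values in [0, max_alpha].

    Hypothetical helper, not a scikit-learn function. The estimator with
    the largest weight is drawn at max_alpha; the others are scaled down
    in proportion to their weight.
    """
    weights = list(estimator_weights)
    if not weights:
        return []
    largest = max(weights)
    if largest <= 0:
        # No positive weight to normalise against: fall back to a flat alpha.
        return [max_alpha] * len(weights)
    # Negative weights are clamped to zero so alpha never goes below 0.
    return [max_alpha * max(w, 0.0) / largest for w in weights]


# Made-up weights standing in for model.estimator_weights_ under SAMME.
example_weights = [2.0, 1.0, 0.5]
for weight, alpha in zip(example_weights, scaled_alphas(example_weights)):
    print("weight=%.2f -> alpha=%.3f" % (weight, alpha))
```

Each estimator's decision surface would then be drawn with its own
alpha instead of the fixed 0.1 used for the forests.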

Do you know if the SAMME.R algorithm removes the need for weights?
Maybe I'm missing something obvious?
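
For reference, my tentative reading of the SAMME.R description in the
multi-class AdaBoost paper is that each estimator contributes a
per-class score built from its class probabilities, rather than a vote
multiplied by a scalar weight. I haven't confirmed that _samme_proba()
does exactly this, so treat the sketch below as an assumption. The
function name and the eps clipping value are mine.

```python
import math


def samme_r_contribution(proba, eps=1e-12):
    """Per-class score one estimator adds for a single sample.

    Sketch of the SAMME.R rule as I understand it:
        h_k = (K - 1) * (log p_k - mean_j(log p_j))
    proba is the estimator's class-probability vector for the sample.
    eps is an assumed clipping value that keeps log() finite when a
    probability is exactly 0 (as happens with fully grown trees).
    """
    n_classes = len(proba)
    log_p = [math.log(max(p, eps)) for p in proba]
    mean_log_p = sum(log_p) / n_classes
    return [(n_classes - 1) * (lp - mean_log_p) for lp in log_p]


# An undecided estimator versus a confident one, for three classes.
print(samme_r_contribution([1.0 / 3, 1.0 / 3, 1.0 / 3]))
print(samme_r_contribution([0.90, 0.05, 0.05]))
```

If that reading is right, an undecided estimator adds roughly nothing
and a confident one adds a lot, so the probabilities would be acting as
an implicit per-sample weight. That might explain why
estimator_weights_ stays at 1.0, but I'd like someone to confirm.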

i.

On 7 July 2013 19:55, Peter Prettenhofer <[email protected]> wrote:
> Issue is here https://github.com/scikit-learn/scikit-learn/issues/2133
>
>
> 2013/7/7 Peter Prettenhofer <[email protected]>
>>
>>
>>
>>
>> 2013/7/7 Ian Ozsvald <[email protected]>
>>>
>>> Hi all. I have a couple of questions about the demo image for the
>>> AdaBoost classifier in the dev branch:
>>> http://scikit-learn.org/dev/auto_examples/ensemble/plot_forest_iris.html
>>>
>>> I've worked through the underlying code and I understand what's being
>>> plotted. I think the AdaBoost example (final column) is in error. I
>>> figured checking my reasoning made sense before filing a bug report (I
>>> have some possible patches too).
>>>
>>> The first column is for a DecisionTree (with no limits on tree depth),
>>> the plot makes sense.
>>>
>>> The second and third columns are for a RandomForest and ExtraTrees
>>> classifier (with DecisionTrees with no depth limit). The plots for
>>> columns 2 and 3 are made by iterating over the 30 classifiers and
>>> plotting each decision surface with an alpha of 0.1.
>>>
>>> The fourth column is for an AdaBoost classifier using a DecisionTree
>>> with no limit on max depth. The plots in this column don't look right
>>> - the red regions clearly encompass where the yellow dots are drawn
>>> (this is particularly obvious in the bottom-right plot).
>>>
>>> The problem is that the weights for the ensemble of classifiers in
>>> AdaBoost aren't taken into account. I believe the alpha value for each
>>> estimator's plot should be scaled by these weights. This raises
>>> another problem but let me check first - does my logic (weights being
>>> required for the plot to make sense) sound ok?
>>
>>
>> I think you are correct - we should definitely fix that - let's create an
>> issue for that.
>>
>>>
>>>
>>> Checking clf.score (and calling clf.predict in the yellow regions)
>>> shows that the underlying classifications are correct (in the yellow
>>> regions with AdaBoost the yellow class is chosen). I'm pretty
>>> confident it is just the display that's in error.
>>>
>>> I guess possibly the display is meant to force the user to question
>>> why the classifications look wrong and to reason about the weights in
>>> AdaBoost, but I'm probably overthinking this!
>>>
>>> Regards,
>>> Ian.
>>>
>>>
>>> --
>>> Ian Ozsvald (A.I. researcher)
>>> [email protected]
>>>
>>> http://IanOzsvald.com
>>> http://MorConsulting.com/
>>> http://Annotate.IO
>>> http://SocialTiesApp.com/
>>> http://TheScreencastingHandbook.com
>>> http://FivePoundApp.com/
>>> http://twitter.com/IanOzsvald
>>> http://ShowMeDo.com
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> This SF.net email is sponsored by Windows:
>>>
>>> Build for Windows Store.
>>>
>>> http://p.sf.net/sfu/windows-dev2dev
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>>
>>
>> --
>> Peter Prettenhofer
>
>
>
>
> --
> Peter Prettenhofer



