That looks good to me (though I think labs might need to be (1000,) --
I'm not entirely sure).
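If the column-vector shape is indeed the culprit, here is a minimal sketch of the fix -- flattening the labels to 1-d with ravel before fitting (the array names are illustrative, not from the original code):

```python
import numpy as np

# Labels stored as a column vector, shape (1000, 1), as in the report below.
labs = np.zeros((1000, 1), dtype=np.int8)

# scikit-learn estimators expect a 1-d target array of shape (n_samples,).
labs_1d = labs.ravel()  # shape (1000,)

print(labs_1d.shape)  # (1000,)
```

ravel returns a flattened view where possible, so this is cheap; labs.flatten() or labs[:, 0] would work as well.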
Can you reproduce your error on random / generated data?
A gist to reproduce the problem would be great.
Cheers,
Andy
On 07/22/2013 03:02 PM, Arslan, Ali wrote:
Hi Andy,
ipdb> feats.dtype
dtype('float64')
ipdb> type(feats)
<type 'numpy.ndarray'>
ipdb> feats.shape
(1000, 20)
ipdb> labs.dtype
dtype('int8')
ipdb> type(labs)
<type 'numpy.ndarray'>
ipdb> labs.shape
(1000, 1)
I think it could also be related to the values inside the feats matrix,
but I don't know what would cause these errors. I made sure that it's
not full of zeros, but that's the only thing I could think of.
Any ideas?
Thanks,
A
On Mon, Jul 22, 2013 at 4:43 AM, Andreas Mueller
<[email protected] <mailto:[email protected]>> wrote:
Hi Ali.
What is the type and size of your input and output vectors?
(type, dtype, shape)
Cheers,
Andy
On 07/22/2013 01:24 AM, Arslan, Ali wrote:
Hi,
I'm trying to use AdaBoostClassifier with a decision tree stump
as the base classifier. I noticed that the weight adjustment done
by AdaBoostClassifier has been giving me errors with both the
SAMME.R and SAMME options.
Here's a brief overview of what I'm doing:
def train_adaboost(features, labels):
    uniqLabels = np.unique(labels)
    allLearners = []
    for targetLab in uniqLabels:
        runs = []
        for rrr in xrange(10):
            feats, labs = get_binary_sets(features, labels, targetLab)
            baseClf = DecisionTreeClassifier(max_depth=1,
                                             min_samples_leaf=1)
            baseClf.fit(feats, labs)
            ada_real = AdaBoostClassifier(base_estimator=baseClf,
                                          learning_rate=1,
                                          n_estimators=20,
                                          algorithm="SAMME")
            runs.append(ada_real.fit(feats, labs))
        allLearners.append(runs)
    return allLearners
I checked the fit of every individual decision tree classifier,
and each is able to predict some labels. When I use the
AdaBoostClassifier with this base classifier, however, I get
errors from the weight boosting algorithm.
def compute_confidence(allLearners, dada, labbo):
    for ii, thisLab in enumerate(allLearners):
        for jj, thisLearner in enumerate(thisLab):
            # accessing thisLearner's methods here
The methods give errors like these:
ipdb> thisLearner.predict_proba(myData)
PATHTOPACKAGE/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py:727:
RuntimeWarning: invalid value encountered in double_scalars
  proba /= self.estimator_weights_.sum()
*** ValueError: 'axis' entry is out of bounds

ipdb> thisLearner.predict(myData)
PATHTOPACKAGE/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py:639:
RuntimeWarning: invalid value encountered in double_scalars
  pred /= self.estimator_weights_.sum()
*** IndexError: 0-d arrays can only use a single () or a list of
newaxes (and a single ...) as an index
I tried the SAMME.R algorithm for AdaBoost, but in that case I
can't even fit because of this error[...]

  File "PATH/sklearn/ensemble/weight_boosting.py", line 388, in fit
    return super(AdaBoostClassifier, self).fit(X, y, sample_weight)
  File "PATH/sklearn/ensemble/weight_boosting.py", line 124, in fit
    X_argsorted=X_argsorted)
  File "PATH/sklearn/ensemble/weight_boosting.py", line 435, in _boost
    X_argsorted=X_argsorted)
  File "PATH/sklearn/ensemble/weight_boosting.py", line 498, in _boost_real
    (estimator_weight < 0)))
ValueError: non-broadcastable output operand with shape (1000)
doesn't match the broadcast shape (1000,1000)
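For what it's worth, a (1000) vs. (1000,1000) mismatch is exactly what NumPy produces when a 1-d array is compared element-wise against a column vector, which is consistent with labs having shape (1000, 1). A minimal reproduction of that broadcast (variable names are illustrative):

```python
import numpy as np

n = 1000
y_pred = np.zeros(n)       # 1-d predictions, shape (1000,)
y_col = np.zeros((n, 1))   # labels kept as a column vector, shape (1000, 1)

# Broadcasting expands (1000,) against (1000, 1) to (1000, 1000),
# matching the shape reported in the ValueError above.
mismatch = (y_pred != y_col)
print(mismatch.shape)  # (1000, 1000)
```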
The data's dimensions are actually compatible with the format the
classifier expects, both before using AdaBoost and when I test the
trained classifiers. What could these errors indicate?
Thanks,
Ali
------------------------------------------------------------------------------
See everything from the browser to the database with AppDynamics
Get end-to-end visibility with application monitoring from AppDynamics
Isolate bottlenecks and diagnose root cause in seconds.
Start your free trial of AppDynamics Pro today!
http://pubads.g.doubleclick.net/gampad/clk?id=48808831&iu=/4140/ostg.clktrk
_______________________________________________
Scikit-learn-general mailing list
[email protected]
<mailto:[email protected]>
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
--
Ali B Arslan, M.Sc.
Cognitive, Linguistic and Psychological Sciences
Brown University