Hi Andy,

ipdb> feats.dtype
dtype('float64')

ipdb> type(feats)
<type 'numpy.ndarray'>

ipdb> feats.shape
(1000, 20)

ipdb> labs.dtype
dtype('int8')

ipdb> type(labs)
<type 'numpy.ndarray'>

ipdb> labs.shape
(1000, 1)


I thought it might also be related to the values inside the feats matrix,
but I don't know what would cause these errors. I made sure it isn't all
zeros, but that's the only thing I could think of.
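
One more guess, since the broadcast error mentions shapes (1000) and
(1000,1000): labs here is a (1000, 1) column rather than a flat (1000,)
vector. Here's a minimal sketch (dummy data standing in for mine) where
flattening the label column with ravel() lets AdaBoost fit cleanly:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Dummy data mimicking the shapes above: float64 (1000, 20) features
# and an int8 (1000, 1) label *column* (not a flat (1000,) vector).
rng = np.random.RandomState(0)
feats = rng.rand(1000, 20)
labs = (feats[:, 0] > 0.5).astype(np.int8).reshape(-1, 1)

# AdaBoost keeps per-sample weights in a 1-d (n_samples,) array; a
# (n, 1) label column can broadcast against it into (n, n), which
# would match the "(1000) ... (1000,1000)" error. Flatten first:
clf = AdaBoostClassifier(n_estimators=20)
clf.fit(feats, labs.ravel())
print(clf.predict(feats).shape)  # (1000,)
```

(I left base_estimator and algorithm at their defaults just to keep the
repro minimal; the default base estimator is already a depth-1 stump.)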
Any ideas?
Thanks,
A


On Mon, Jul 22, 2013 at 4:43 AM, Andreas Mueller
<[email protected]> wrote:

>  Hi Ali.
> What is the type and size of your input and output vectors?
> (type, dtype, shape)
>
> Cheers,
> Andy
>
>
> On 07/22/2013 01:24 AM, Arslan, Ali wrote:
>
>  Hi,
> I'm trying to use AdaBoostClassifier with a decision tree stump as the
> base classifier. I noticed that the weight adjustment done by
> AdaBoostClassifier has been giving me errors for both the SAMME.R and
> SAMME options.
>
> Here's a brief overview of what I'm doing:
>
> def train_adaboost(features, labels):
>     uniqLabels = np.unique(labels)
>     allLearners = []
>     for targetLab in uniqLabels:
>         runs = []
>         for rrr in xrange(10):
>             feats, labs = get_binary_sets(features, labels, targetLab)
>             baseClf = DecisionTreeClassifier(max_depth=1, min_samples_leaf=1)
>             baseClf.fit(feats, labs)
>
>             ada_real = AdaBoostClassifier(base_estimator=baseClf,
>                                           learning_rate=1,
>                                           n_estimators=20,
>                                           algorithm="SAMME")
>             runs.append(ada_real.fit(feats, labs))
>         allLearners.append(runs)
>
>     return allLearners
>
> I checked the fit of every individual decision tree classifier, and they
> are able to predict some labels. When I use the AdaBoostClassifier built
> on this base classifier, however, I get errors from the weight boosting
> algorithm.
>
> def compute_confidence(allLearners, dada, labbo):
>     for ii,thisLab in enumerate(allLearners):
>         for jj, thisLearner in enumerate(thisLab):
>             #accessing thisLearner's methods here
>
> The methods give errors like these:
>
> ipdb> thisLearner.predict_proba(myData)
> PATHTOPACKAGE/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py:727:
> RuntimeWarning: invalid value encountered in double_scalars
>   proba /= self.estimator_weights_.sum()
> *** ValueError: 'axis' entry is out of bounds
>
> ipdb> thisLearner.predict(myData)
> PATHTOPACKAGE/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py:639:
> RuntimeWarning: invalid value encountered in double_scalars
>   pred /= self.estimator_weights_.sum()
> *** IndexError: 0-d arrays can only use a single () or a list of newaxes
> (and a single ...) as an index
>
> I tried the SAMME.R algorithm for AdaBoost, but in that case I can't even
> fit the classifier because of this error [...]
>
> File "PATH/sklearn/ensemble/weight_boosting.py", line 388, in fit
>     return super(AdaBoostClassifier, self).fit(X, y, sample_weight)
> File "PATH/sklearn/ensemble/weight_boosting.py", line 124, in fit
>     X_argsorted=X_argsorted)
> File "PATH/sklearn/ensemble/weight_boosting.py", line 435, in _boost
>     X_argsorted=X_argsorted)
> File "PATH/sklearn/ensemble/weight_boosting.py", line 498, in _boost_real
>     (estimator_weight < 0)))
> ValueError: non-broadcastable output operand with shape (1000) doesn't
> match the broadcast shape (1000,1000)
>
> The data's dimensions are compatible with the format the classifier
> expects, both before using AdaBoost and when I try to test the trained
> classifiers. What might these errors indicate?
>
>  Thanks,
> Ali
>
>
> ------------------------------------------------------------------------------
> See everything from the browser to the database with AppDynamics
> Get end-to-end visibility with application monitoring from AppDynamics
> Isolate bottlenecks and diagnose root cause in seconds.
> Start your free trial of AppDynamics Pro today!
> http://pubads.g.doubleclick.net/gampad/clk?id=48808831&iu=/4140/ostg.clktrk
>
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>
>
>
>
>


-- 
Ali B Arslan, M.Sc.
Cognitive, Linguistic and Psychological Sciences
Brown University