Hi,
I'm trying to use AdaBoostClassifier with a decision stump as the base
classifier. The weight adjustment done by AdaBoostClassifier gives me
errors with both the SAMME.R and SAMME options.
Here's a brief overview of what I'm doing:
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

def train_adaboost(features, labels):
    uniqLabels = np.unique(labels)
    allLearners = []
    for targetLab in uniqLabels:
        runs = []
        for rrr in xrange(10):
            # get_binary_sets is my own helper (not shown here)
            feats, labs = get_binary_sets(features, labels, targetLab)
            baseClf = DecisionTreeClassifier(max_depth=1,
                                             min_samples_leaf=1)
            baseClf.fit(feats, labs)
            ada_real = AdaBoostClassifier(base_estimator=baseClf,
                                          learning_rate=1,
                                          n_estimators=20,
                                          algorithm="SAMME")
            runs.append(ada_real.fit(feats, labs))
        allLearners.append(runs)
    return allLearners
I looked at the fit for every single decision tree classifier and they are
able to predict some labels. When I look at the AdaBoostClassifier using
this base classifier, however, I get errors about the weight boosting
algorithm.
def compute_confidence(allLearners, dada, labbo):
    for ii, thisLab in enumerate(allLearners):
        for jj, thisLearner in enumerate(thisLab):
            # accessing thisLearner's methods here
The methods give errors like these:
ipdb> thisLearner.predict_proba(myData)
PATHTOPACKAGE/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py:727: RuntimeWarning: invalid value encountered in double_scalars
  proba /= self.estimator_weights_.sum()
*** ValueError: 'axis' entry is out of bounds

ipdb> thisLearner.predict(myData)
PATHTOPACKAGE/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py:639: RuntimeWarning: invalid value encountered in double_scalars
  pred /= self.estimator_weights_.sum()
*** IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index
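
One reading of the two warnings above, offered as an assumption rather than a confirmed diagnosis: both come from a division by `self.estimator_weights_.sum()`, and an "invalid value" warning on that division is what a 0/0 would produce, for example if boosting stopped before keeping any estimator. The sketch below is a hypothetical helper (`summarize_boosting_state` is not part of scikit-learn) that only reads the `estimators_` and `estimator_weights_` attributes of a fitted ensemble. `_EmptyEnsemble` is a made-up stand-in so the snippet runs without scikit-learn.

```python
def summarize_boosting_state(learner):
    """Describe a fitted AdaBoost-style ensemble by reading its attributes.

    Works with any object exposing `estimators_` and `estimator_weights_`.
    """
    estimators = list(getattr(learner, "estimators_", []))
    weights = [float(w) for w in getattr(learner, "estimator_weights_", [])]
    weight_sum = sum(weights)
    return {
        "n_estimators_kept": len(estimators),
        "weight_sum": weight_sum,
        # With no estimators or a zero weight sum, a division such as
        # `pred /= weights.sum()` has nothing meaningful to normalise.
        "votes_undefined": len(estimators) == 0 or weight_sum == 0.0,
    }


class _EmptyEnsemble(object):
    """Hypothetical stand-in: boosting kept no estimators, weights stay zero."""
    estimators_ = []
    estimator_weights_ = [0.0] * 20


report = summarize_boosting_state(_EmptyEnsemble())
```

Calling `summarize_boosting_state(thisLearner)` inside `compute_confidence` would show whether any of the fitted ensembles ended up in that state.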
I also tried the SAMME.R algorithm, but in that case I can't even fit
AdaBoost because of this error: [...]
  File "PATH/sklearn/ensemble/weight_boosting.py", line 388, in fit
    return super(AdaBoostClassifier, self).fit(X, y, sample_weight)
  File "PATH/sklearn/ensemble/weight_boosting.py", line 124, in fit
    X_argsorted=X_argsorted)
  File "PATH/sklearn/ensemble/weight_boosting.py", line 435, in _boost
    X_argsorted=X_argsorted)
  File "PATH/sklearn/ensemble/weight_boosting.py", line 498, in _boost_real
    (estimator_weight < 0)))
ValueError: non-broadcastable output operand with shape (1000) doesn't match the broadcast shape (1000,1000)
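
The shapes in this message fit NumPy's broadcasting rules: combining a 1-D array of shape (1000,) with a 2-D column of shape (1000, 1) gives a (1000, 1000) result, which cannot be written back in place into the 1-D array. Whether a column-shaped array (for instance labels shaped (n, 1)) is actually involved here is an assumption. The pure-Python sketch below only reproduces the shape arithmetic, and `broadcast_shape` and `can_update_in_place` are illustrative names, not NumPy functions.

```python
def broadcast_shape(shape_a, shape_b):
    """Compute the NumPy-style broadcast shape of two shapes (pure Python)."""
    # Left-pad the shorter shape with 1s so both have the same length.
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for dim_a, dim_b in zip(a, b):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError("shapes %r and %r do not broadcast"
                             % (shape_a, shape_b))
    return tuple(result)


def can_update_in_place(target_shape, other_shape):
    """An in-place op like `target *= other` needs the result to fit target."""
    return broadcast_shape(target_shape, other_shape) == tuple(target_shape)


flat_weights = (1000,)     # 1-D array, as in the error message
column_factor = (1000, 1)  # hypothetical 2-D column, e.g. labels shaped (n, 1)

combined = broadcast_shape(flat_weights, column_factor)
fits = can_update_in_place(flat_weights, column_factor)
```

Under that assumption, checking `labs.shape` before calling `fit` would show whether the labels are 1-D or a 2-D column.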
The dimensions of the data are compatible with the format the classifier
expects, both before using AdaBoost and when I test the trained
classifiers. What could these errors indicate?
Thanks,
Ali
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general