That looks good to me (though I think labs might need to be (1000,) --
I'm not entirely sure).
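If the column-vector shape is indeed the culprit, here is a minimal sketch of the fix -- flattening the labels to 1-d with ravel before fitting (the array names are illustrative, not from the original code):

```python
import numpy as np

# Labels stored as a column vector, shape (1000, 1), as in the report below.
labs = np.zeros((1000, 1), dtype=np.int8)

# scikit-learn estimators expect a 1-d target array of shape (n_samples,).
labs_1d = labs.ravel()  # shape (1000,)

print(labs_1d.shape)  # (1000,)
```

ravel returns a flattened view where possible, so this is cheap; labs.flatten() or labs[:, 0] would work as well.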
Can you reproduce your error on random / generated data?
A gist to reproduce the problem would be great.
Cheers,
Andy
On 07/22/2013 03:02 PM, Arslan, Ali wrote:
Hi Andy,
ipdb> feats.dtype
dtype('float64')
ipdb> type(feats)
<type 'numpy.ndarray'>
ipdb> feats.shape
(1000, 20)
ipdb> labs.dtype
dtype('int8')
ipdb> type(labs)
<type 'numpy.ndarray'>
ipdb> labs.shape
(1000, 1)
I think it could also be related to the values inside the feats matrix,
but I don't know what would cause these errors. I made sure that it's
not full of zeros, but that's the only thing I could think of.
Any ideas?
Thanks,
A
On Mon, Jul 22, 2013 at 4:43 AM, Andreas Mueller
<[email protected] <mailto:[email protected]>> wrote:
Hi Ali.
What is the type and size of your input and output vectors?
(type, dtype, shape)
Cheers,
Andy
On 07/22/2013 01:24 AM, Arslan, Ali wrote:
Hi,
I'm trying to use AdaBoostClassifier with a decision tree stump
as the base classifier. I noticed that the weight adjustment done
by AdaBoostClassifier has been giving me errors with both the
SAMME.R and SAMME options.
Here's a brief overview of what I'm doing:
def train_adaboost(features, labels):
    uniqLabels = np.unique(labels)
    allLearners = []
    for targetLab in uniqLabels:
        runs = []
        for rrr in xrange(10):
            feats, labs = get_binary_sets(features, labels, targetLab)
            baseClf = DecisionTreeClassifier(max_depth=1,
                                             min_samples_leaf=1)
            baseClf.fit(feats, labs)
            ada_real = AdaBoostClassifier(base_estimator=baseClf,
                                          learning_rate=1,
                                          n_estimators=20,
                                          algorithm="SAMME")
            runs.append(ada_real.fit(feats, labs))
        allLearners.append(runs)
    return allLearners
I checked the fit of every individual decision tree classifier,
and each is able to predict some labels. When I use the
AdaBoostClassifier with this base classifier, however, I get
errors from the weight boosting algorithm.
def compute_confidence(allLearners, dada, labbo):
    for ii, thisLab in enumerate(allLearners):
        for jj, thisLearner in enumerate(thisLab):
            # accessing thisLearner's methods here
The methods give errors like these:
ipdb> thisLearner.predict_proba(myData)
PATHTOPACKAGE/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py:727:
RuntimeWarning: invalid value encountered in double_scalars
  proba /= self.estimator_weights_.sum()
*** ValueError: 'axis' entry is out of bounds

ipdb> thisLearner.predict(myData)
PATHTOPACKAGE/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py:639:
RuntimeWarning: invalid value encountered in double_scalars
  pred /= self.estimator_weights_.sum()
*** IndexError: 0-d arrays can only use a single () or a list of
newaxes (and a single ...) as an index
I tried the SAMME.R algorithm for AdaBoost, but in that case I
can't even fit because of this error[...]

  File "PATH/sklearn/ensemble/weight_boosting.py", line 388, in fit
    return super(AdaBoostClassifier, self).fit(X, y, sample_weight)
  File "PATH/sklearn/ensemble/weight_boosting.py", line 124, in fit
    X_argsorted=X_argsorted)
  File "PATH/sklearn/ensemble/weight_boosting.py", line 435, in _boost
    X_argsorted=X_argsorted)
  File "PATH/sklearn/ensemble/weight_boosting.py", line 498, in _boost_real
    (estimator_weight < 0)))
ValueError: non-broadcastable output operand with shape (1000)
doesn't match the broadcast shape (1000,1000)
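For what it's worth, a (1000) vs. (1000,1000) mismatch is exactly what NumPy produces when a 1-d array is compared element-wise against a column vector, which is consistent with labs having shape (1000, 1). A minimal reproduction of that broadcast (variable names are illustrative):

```python
import numpy as np

n = 1000
y_pred = np.zeros(n)       # 1-d predictions, shape (1000,)
y_col = np.zeros((n, 1))   # labels kept as a column vector, shape (1000, 1)

# Broadcasting expands (1000,) against (1000, 1) to (1000, 1000),
# matching the shape reported in the ValueError above.
mismatch = (y_pred != y_col)
print(mismatch.shape)  # (1000, 1000)
```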
The data's dimensions are actually compatible with the format the
classifier expects, both before using AdaBoost and when I test the
trained classifiers. What could these errors indicate?
Thanks,
Ali
------------------------------------------------------------------------------
See everything from the browser to the database with AppDynamics
Get end-to-end visibility with application monitoring from AppDynamics
Isolate bottlenecks and diagnose root cause in seconds.
Start your free trial of AppDynamics Pro today!
http://pubads.g.doubleclick.net/gampad/clk?id=48808831&iu=/4140/ostg.clktrk
_______________________________________________
Scikit-learn-general mailing list
[email protected]
<mailto:[email protected]>
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
--
Ali B Arslan, M.Sc.
Cognitive, Linguistic and Psychological Sciences
Brown University