Hi Joel

XData in the MATLAB code is n_features x n_samples.

On Wed, Jul 24, 2013 at 2:04 AM, Joel Nothman <[email protected]> wrote:

> Hi Ali,
>
> Can you describe the shapes/contents of those structures?
>

tree_node.parent returns another tree object, if one exists.
tree_node.dim is a scalar (13 in this case).
tree_node.right_constrain and tree_node.left_constrain are float values that
may or may not exist for each node.
I don't know much about trees yet, so unfortunately I can't explain further.
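
As a rough illustration of how these fields seem to fit together, here is a
NumPy port of the toolbox's calc_output. The Node class is a hypothetical
stand-in for the MATLAB tree_node struct, not anything from scikit-learn, and
dim is 0-based here (MATLAB indexing is 1-based). It is a sketch of my reading
of the MATLAB code, not a verified reimplementation.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np


@dataclass
class Node:
    """Hypothetical stand-in for the toolbox's tree_node struct."""
    dim: int                                 # 0-based feature (row) index into X
    parent: Optional["Node"] = None          # enclosing node, if any
    left_constrain: Optional[float] = None   # lower bound: x > left_constrain
    right_constrain: Optional[float] = None  # upper bound: x < right_constrain


def calc_output(node: Node, X: np.ndarray) -> np.ndarray:
    """Return a 0/1 mask over samples; X is (n_features, n_samples)."""
    y = np.ones(X.shape[1])
    if node.parent is not None:
        # A sample must also satisfy every ancestor's constraints.
        y *= calc_output(node.parent, X)
    if node.right_constrain is not None:
        y *= X[node.dim] < node.right_constrain
    if node.left_constrain is not None:
        y *= X[node.dim] > node.left_constrain
    return y


# Toy data: 2 features x 4 samples.
X = np.array([[0.2, 0.8, 0.5, 0.9],
              [1.0, 3.0, 5.0, 7.0]])
root = Node(dim=0, left_constrain=0.4)                # feature 0 > 0.4
leaf = Node(dim=1, parent=root, right_constrain=6.0)  # and feature 1 < 6.0
mask = calc_output(leaf, X)
```

Each call does a handful of whole-array comparisons, so the cost depends on
the depth of the node rather than on a Python-level loop over samples.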


>
> Am I right in thinking that this evaluates the entire tree for every
> sample, rather than just the path from root to a single leaf? I can see
> that as bringing speed gains if the process is vectorised over samples...?
>

XData (the transpose of "dada" in my code) is n_features x n_samples;
specifically, it's 20 x 133895.
I think you're right that the tree is evaluated for every sample
(tree_node.dim seems to determine which feature is tested). I guess the
recursion over tree_node.parent is an indicator of this.
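
To make the "vectorised over samples" idea concrete, here is a minimal NumPy
sketch that routes all samples through one tree at once. The arrays are named
after the attributes of a fitted scikit-learn tree (tree_.feature,
tree_.threshold, tree_.children_left, tree_.children_right), but they are
built by hand here so the example does not depend on a fitted model.
predict_leaves is a hypothetical helper, and the conventions assumed (child
index -1 marks a leaf, samples with value <= threshold go left) are stated in
the comments. Note that X is n_samples x n_features here, the transpose of
XData.

```python
import numpy as np


def predict_leaves(X, feature, threshold, children_left, children_right):
    """Return the leaf index reached by each row of X (n_samples, n_features).

    Assumes a child index of -1 marks a leaf and that samples with
    X[:, feature] <= threshold go to the left child.
    """
    node = np.zeros(X.shape[0], dtype=np.intp)  # every sample starts at the root
    active = children_left[node] != -1          # samples not yet at a leaf
    while active.any():
        idx = np.flatnonzero(active)
        cur = node[idx]
        go_left = X[idx, feature[cur]] <= threshold[cur]
        node[idx] = np.where(go_left, children_left[cur], children_right[cur])
        active = children_left[node] != -1
    return node


# Hand-built tree: node 0 splits on feature 0 at 0.5, node 2 on feature 1 at 4.0.
feature = np.array([0, -2, 1, -2, -2])
threshold = np.array([0.5, -2.0, 4.0, -2.0, -2.0])
children_left = np.array([1, -1, 3, -1, -1])
children_right = np.array([2, -1, 4, -1, -1])

X = np.array([[0.2, 1.0],
              [0.8, 3.0],
              [0.5, 5.0],
              [0.9, 7.0]])
leaves = predict_leaves(X, feature, threshold, children_left, children_right)
```

The while loop runs once per tree level rather than once per sample, which is
where any speed-up over per-sample traversal in pure Python would come from.
Whether this beats the compiled predict method is something to measure rather
than assume.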

Any ideas on how to implement a similar procedure with the trees in scikit-learn?
Thanks,
A



>
> - Joel
>
>
> On Wed, Jul 24, 2013 at 12:40 PM, Arslan, Ali <[email protected]> wrote:
>
>> Hi,
>> I've been running AdaBoost with DecisionTreeClassifier for a
>> multiclass detection problem (composed of multiple one-vs-all problems).
>> The prediction method I'm using looks like this:
>>
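>>     import numpy as np
>>
>>     # allLearners: one list of fitted AdaBoost models per label (class)
>>     for ii, thisLab in enumerate(allLearners):
>>         # one accumulated score per sample (dada is n_samples x n_features)
>>         res = np.zeros(dada.shape[0], dtype='float16')
>>         for jj, thisLearner in enumerate(thisLab):
>>             my_weights = thisLearner.estimator_weights_
>>             #tic = time.time()
>>             # iterating over a fitted AdaBoost model yields its trees
>>             for hh, thisEstimator in enumerate(thisLearner):
>>                 # "DATA" in the original post is assumed to be dada
>>                 res = res + thisEstimator.predict(dada) * my_weights[hh]
>>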
>> I don't know how clear this looks, but basically I'm iterating over
>> labels (classes), then over the estimators in each AdaBoost model, and
>> collecting their predictions into one result array (after scaling each
>> tree's prediction by its weight).
>>
>> The innermost part of the loop is taking a bit too long (~1 sec),
>> considering it's run about 2600 times for my data.
>>
>> I was looking for faster or alternative ways of making a prediction and
>> came across this toolbox for MATLAB:
>>
>> http://graphics.cs.msu.ru/en/science/research/machinelearning/adaboosttoolbox
>>
>> This toolbox's prediction method seems pretty succinct and it runs very
>> fast (0.0015 sec). The function is something like this:
>>
>>
>> function y = calc_output(tree_node, XData)
>> y = XData(tree_node.dim, :) * 0 + 1;
>>
>> for i = 1 : length(tree_node.parent)
>>   % recursively split based on its parents' constrain
>>   y = y .* calc_output(tree_node.parent, XData);
>> end
>>
>> if( length(tree_node.right_constrain) > 0)
>>   y = y .* ((XData(tree_node.dim, :) < tree_node.right_constrain));
>> end
>> if( length(tree_node.left_constrain) > 0)
>>   y = y .* ((XData(tree_node.dim, :) > tree_node.left_constrain));
>> end
>>
>>
>>
>> I tried to find the analogues of these structures (i.e. tree_node.dim,
>> tree_node.parent, tree_node.right_constrain) in the "tree object" in
>> Python, but I couldn't find them.
>>
>> I was wondering if it's possible to speed up the prediction as in this
>> MATLAB example?
>> Thanks!
>>
>> --
>> Ali B Arslan, M.Sc.
>> Cognitive, Linguistic and Psychological Sciences
>> Brown University
>>
>>
>> ------------------------------------------------------------------------------
>> See everything from the browser to the database with AppDynamics
>> Get end-to-end visibility with application monitoring from AppDynamics
>> Isolate bottlenecks and diagnose root cause in seconds.
>> Start your free trial of AppDynamics Pro today!
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=48808831&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Scikit-learn-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>
>


-- 
Ali B Arslan, M.Sc.
Cognitive, Linguistic and Psychological Sciences
Brown University