Hi Gilles,

thanks!

I'll invest some more time into the array repr (tests, visitor,
benchmarks) and ask Brian for feedback. If that's ok, I'd suggest we
merge the array repr into master and then rebase ensemble (and gradient
boosting). I can do the rebasing; it shouldn't be a huge problem for
random forests since the modifications are pretty localized
(`apply_tree` plus fewer than 10 lines of code in `build_tree`).
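
To make the array representation concrete, here is a minimal sketch in
plain Python/NumPy. It is not the code from the branch: the array names
(`children_left`, `children_right`, `feature`, `threshold`, `value`), the
`LEAF` sentinel, and the body of `apply_tree` are assumptions made for
illustration only. The idea is that each node is an index into a set of
parallel arrays, so traversal touches only integers and floats rather
than Python objects.

```python
import numpy as np

LEAF = -1  # hypothetical sentinel meaning "this node has no child"

# A tiny hand-built tree stored as parallel arrays; node 0 is the root.
#   node 0: x[0] <= 0.5 ? go to node 1 : go to node 2
#   node 2: x[1] <= 2.0 ? go to node 3 : go to node 4
#   nodes 1, 3 and 4 are leaves
children_left = np.array([1, LEAF, 3, LEAF, LEAF], dtype=np.int32)
children_right = np.array([2, LEAF, 4, LEAF, LEAF], dtype=np.int32)
feature = np.array([0, -1, 1, -1, -1], dtype=np.int32)
threshold = np.array([0.5, 0.0, 2.0, 0.0, 0.0], dtype=np.float64)
value = np.array([0.0, 10.0, 0.0, 20.0, 30.0], dtype=np.float64)


def apply_tree(X):
    """Return the leaf value reached by each row of X."""
    out = np.empty(X.shape[0], dtype=np.float64)
    for i in range(X.shape[0]):
        node = 0
        # Descend until we hit a node without children.
        while children_left[node] != LEAF:
            if X[i, feature[node]] <= threshold[node]:
                node = children_left[node]
            else:
                node = children_right[node]
        out[i] = value[node]
    return out


X = np.array([[0.2, 5.0], [0.9, 1.0], [0.9, 3.0]])
predictions = apply_tree(X)
```

In Cython the same loop could run over typed buffers without touching
reference counts, which is where the prediction-time speedup would be
expected to come from.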

best,
 Peter

2011/11/4 Gilles Louppe <[email protected]>:
> Peter,
>
> This looks very good! I will definitely have a look at it later.
>
> However, as I warned in pull request #385, I have been making changes
> [1] to the tree code and to the ensemble branch. I guess our future
> patches are in conflict :( How should we proceed?
>
> [1] https://github.com/glouppe/scikit-learn/tree/ensemble
>
> Best,
>
> Gilles
>
>
> On 3 November 2011 23:40, Peter Prettenhofer
> <[email protected]> wrote:
>> Hi everybody,
>>
>> I created an experimental branch [1] which uses numpy arrays (as Gael
>> suggested) instead of the composite structure to represent the tree.
>>
>> The reason for this was two-fold: first, storage is more compact (no
>> structure padding) and writing to and reading from disk is more
>> efficient; second, traversing the composite structure in Cython is
>> inefficient compared to pure C. I assume that the reason for the latter
>> is the reference counting overhead when we traverse the structure (look
>> at the generated C code of the `apply_tree` function in `_tree.c`). I
>> ran into this performance problem when I benchmarked my gradient
>> boosting code [2] against its R counterpart, gbm.
>>
>> According to our covertype benchmark, the new representation is a bit
>> slower at training time due to the array resizing operations, but it is
>> about a factor of 4-5 faster at prediction time - competitive with
>> liblinear on our benchmark! The graphviz exporter has not been updated
>> yet, so one test fails.
>>
>> [1] https://github.com/pprett/scikit-learn/tree/tree-array-repr
>> [2] https://github.com/pprett/scikit-learn/tree/gradient_boosting
>>
>> best,
>>  Peter
>>
>> 2011/10/28 Olivier Grisel <[email protected]>:
>>> Victor replied to me in a private message: it might be caused by this
>>> bug http://bugs.python.org/issue12775 .
>>>
>>> Brian, can you disable the gc and re-run your scripts to check
>>> whether this is the case?
>>>
>>>  import gc
>>>  gc.disable()
>>>
>>> Also, Victor would like to know whether the situation is better in
>>> Python 3.2+.
>>>
>>> --
>>> Olivier
>>>
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>
>>
>>
>>
>> --
>> Peter Prettenhofer
>>
>>
>
>



-- 
Peter Prettenhofer
