On 23 Jan 2014, at 07:18, Maheshakya Wijewardena <[email protected]> wrote:

> Arnaud, 
> I've gone through those messages and I've already started working on patches. 
> Last year I did a project for a university module: implementing Bagging in 
> scikit-learn. As Gilles had already begun that work, I was not able to get 
> my code merged. I also did not implement feature bootstrapping, as it was 
> beyond the scope of my original project proposal. 
> https://github.com/maheshakya/scikit-learn/blob/bagging2/sklearn/ensemble/bagging.py
> 
> I would appreciate it if you could review my implementation and give me 
> some feedback on it and on what I can do next. 
> 

Sorry, but I am currently short of time. If you sync your master branch with
the upstream repository, you can use GitHub to show the differences between
the current implementation and your implementation:
http://stackoverflow.com/questions/3792989/how-to-view-diff-of-a-forked-github-project
http://stackoverflow.com/questions/14500240/how-can-i-generate-a-diff-for-a-single-file-between-two-branches-in-github
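
For a quick look, GitHub's compare view can also be opened directly from a
URL. The helper below is only a sketch: the function name is made up for
illustration, the `base...owner:branch` pattern is GitHub's cross-fork compare
form as I understand it, and the `bagging2` branch name is taken from the link
quoted above.

```python
def github_compare_url(base_owner, repo, base_branch, head_owner, head_branch):
    """Build a GitHub compare URL between an upstream branch and a fork branch.

    Hypothetical helper for illustration; it only formats a string and does
    not contact GitHub.
    """
    return (
        f"https://github.com/{base_owner}/{repo}/compare/"
        f"{base_branch}...{head_owner}:{head_branch}"
    )


# Upstream master against the bagging2 branch of the fork (names assumed
# from the links in this thread).
url = github_compare_url(
    "scikit-learn", "scikit-learn", "master", "maheshakya", "bagging2"
)
print(url)
```

Opening the printed URL in a browser should list the commits and file changes
that exist on the fork branch but not on upstream master.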


(Here, by "you" I mean Maheshakya Wijewardena, but also any prospective
Google Summer of Code applicants.)

Everybody will appreciate it if you take care of some issues in the tracker.
You’ll get to know more about scikit-learn: its goals, its vision, its
development practices, its contributors, ...
Start with small pull requests such as documentation fixes, small bugs or
issues tagged as easy.
Then progressively move on to bigger projects.

This is an opportunity for you to learn more about what you are signing up
for with a Google Summer of Code.
It will allow you to think about the subject of your proposal. Are you sure
that you want to work full time on this subject for several months?
If your proposal involves several specific modules, it’s a good idea to become
more familiar with that part of the code (algorithms, software design,
problems, bugs, drawbacks, possible improvements, related scientific
literature, …) and start contributing.

Last but not least, getting involved early in the project is important,
because it lets us get to know you better in the process.

Best,
Arnaud



> Thank you.
> 
> 
> On Wed, Jan 22, 2014 at 2:51 PM, Caleb <[email protected]> wrote:
> Hi all,
> 
> I am using random forests to do deep learning/feature learning with the 
> RandomTreesEmbedding in scikit-learn. It would be cool to apply 
> the random forest to the learned features and induce a higher-level 
> representation.
> 
> I have actually tried the naive approach of densifying the output from 
> RandomTreesEmbedding and feeding it into another one to get a second level 
> of representation of the same data, and then applying an SVM to it. Not only 
> is it extremely slow, but the results also get worse. 
> 
> However, I think sparse matrix support for decision trees is a worthwhile 
> effort, as it would let me investigate more easily why the results get worse.
> 
> Just my 2 cents.
> 
> Caleb 
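
The naive two-level approach Caleb describes might look roughly like the
sketch below. It uses `RandomTreesEmbedding`, the scikit-learn transformer for
this kind of forest embedding. The synthetic data, the small tree sizes and the
choice of `LinearSVC` as the SVM are assumptions made only to keep the example
short; they are not taken from his experiment.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomTreesEmbedding
from sklearn.svm import LinearSVC

# Synthetic data standing in for the real dataset (assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# First level: each sample is encoded by the leaves it reaches in each tree.
# The result is a sparse one-hot matrix.
level1 = RandomTreesEmbedding(n_estimators=10, max_depth=3, random_state=0)
X1 = level1.fit_transform(X)

# Second level: densify the first embedding and feed it to another embedding.
# The densification step is what becomes slow and memory-hungry at scale.
level2 = RandomTreesEmbedding(n_estimators=10, max_depth=3, random_state=1)
X2 = level2.fit_transform(X1.toarray())

# Final classifier on the second-level representation.
clf = LinearSVC().fit(X2, y)
print(X1.shape, X2.shape)
```

With realistic forest sizes the first-level matrix has one column per leaf, so
the dense copy passed to the second level grows very quickly.
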
> 
> 
> On Wednesday, January 22, 2014 1:15 PM, Maheshakya Wijewardena 
> <[email protected]> wrote:
> Hi, 
> 
> I have been using scikit-learn's OneHotEncoder for data encoding, and the 
> resulting sparse matrix is supported by only a few models such as logistic 
> regression, SVC, etc. When I convert those sparse matrices to dense arrays 
> with a list comprehension or the toarray() function, the resulting arrays 
> become too large for classifiers such as decision trees or any other 
> tree-based classifier. 
> I saw a GSoC project idea for implementing this, as mentioned here:
> https://github.com/scikit-learn/scikit-learn/wiki/Google-summer-of-code-(GSOC)-2014
> I'm looking forward to applying for GSoC this year as well, so I would like 
> to start working on this. Where can I get support for this? (No possible 
> mentors are assigned to it yet.) 
> 
> Regards,
> Maheshakya
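
The size problem described above can be illustrated with a small sketch. The
data is synthetic and the column counts are arbitrary assumptions; the point is
only to compare the memory held by the sparse one-hot output with that of its
dense copy.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

rng = np.random.RandomState(0)
# 1,000 samples with 3 categorical columns, each taking values in [0, 500).
X = rng.randint(0, 500, size=(1000, 3))

# OneHotEncoder returns a scipy sparse matrix by default.
encoded = OneHotEncoder().fit_transform(X)

# Memory used by the sparse (CSR) representation: values plus index arrays.
sparse_bytes = (
    encoded.data.nbytes + encoded.indices.nbytes + encoded.indptr.nbytes
)
# Memory used once every zero is stored explicitly.
dense_bytes = encoded.toarray().nbytes

print(encoded.shape, sparse_bytes, dense_bytes)
```

Each row holds exactly one non-zero per original column, so the sparse form
grows with the number of samples while the dense form grows with the number of
samples times the number of distinct categories.
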
> 
> ------------------------------------------------------------------------------
> CenturyLink Cloud: The Leader in Enterprise Cloud Services.
> Learn Why More Businesses Are Choosing CenturyLink Cloud For
> Critical Workloads, Development Environments & Everything In Between.
> Get a Quote or Start a Free Trial Today. 
> http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> 
