2014/1/23 Maheshakya Wijewardena <[email protected]>:
> Hi
>
> As I think, using sparse data we can enhance the descriptiveness of the data
> while keeping it smaller compared to the dense data without losing
> information.

I don't understand what you mean by "sparse data we can enhance the
descriptiveness of the data".

> I will try using sparse data on 20newsgroups data and let you know the
> results.

What do you mean? The 20newsgroups data is inherently sparse, in the sense
that the extracted BoW features are mostly zero-valued. The problem is that
the current implementation of Decision Trees requires a dense
*representation* of that sparse data to work. Making Decision Trees work on
a sparse representation (e.g. a CSC sparse matrix) would require
re-implementing a lot of the code.

> Arnaud,
> I've gone through those messages and I've already started working on
> patches. Last year I've done a project of a module in our university. It was
> to implement Bagging in Scikit-learn. As Gilles had already begun that, I
> was not able to get my code merged. Moreover I have not implemented feature
> bootstrapping as it was beyond the scope of my original proposal to the
> project.
> https://github.com/maheshakya/scikit-learn/blob/bagging2/sklearn/ensemble/bagging.py
>
> I would appreciate if you can review and give some feedback on my
> implementation and what can I do further.

I don't really see the point in spending time reviewing past alternative
implementations of existing features. There are already 129 pull requests
that need reviewers' time:

https://github.com/scikit-learn/scikit-learn/pulls

In my opinion it would be much more productive to fix bugs in the current
code base.

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
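
As a rough illustration of the dense-versus-sparse *representation* point made in the reply above, here is a minimal sketch that compares the memory held by a CSC matrix with its densified copy. The matrix is a hypothetical random stand-in for BoW features, not the actual 20newsgroups data, and the shape and 1% density are assumptions chosen only for the example.

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-in for a bag-of-words matrix (assumed sizes):
# 1000 documents, 5000 vocabulary terms, roughly 1% non-zero entries.
X_csc = sparse.random(1000, 5000, density=0.01, format="csc",
                      dtype=np.float64, random_state=0)

# Memory held by the CSC representation:
# non-zero values + their row indices + the column pointers.
sparse_bytes = X_csc.data.nbytes + X_csc.indices.nbytes + X_csc.indptr.nbytes

# A dense representation stores every zero explicitly.
X_dense = X_csc.toarray()
dense_bytes = X_dense.nbytes

fraction_nonzero = X_csc.nnz / (X_csc.shape[0] * X_csc.shape[1])
print(f"non-zero fraction: {fraction_nonzero:.4f}")
print(f"CSC: {sparse_bytes} bytes, dense: {dense_bytes} bytes")
```

The information content is the same in both arrays; only the storage layout differs, which is why an estimator that expects a dense array pays the full cost of the zeros.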
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
