On 03/02/2012 02:33 PM, Olivier Grisel wrote:
> 2012/3/2 Andreas<[email protected]>:
>
>> Hi everybody!
>> As I am in charge of the next release, here is a quick pep talk and
>> wish list:
>>
>> We are two months into 0.10 now and I feel the spirit of Granada
>> is still with us, with lots of activity going on all the time.
>> Loads of great work has happened since the last
>> release, even though much of it was "behind the scenes" work and fixes.
>>
>> There are some things that I would really love to have in
>> 0.11 and that I think there is still enough time to do.
>>
>> I think one of the great things about sklearn is the documentation,
>> but as was pointed out recently, there is still room for improvement.
>>
>> In particular, it really bugs me that there is no documentation for
>> LDA and the text features.
>> I will try to do LDA but I really cannot do documentation for text
>> features. So it would be *really* great if someone with some
>> NLP knowledge could sit down and write some documentation
>> for this part of the scikit.
>>
> I am currently tinkering with a full flattening of the
> sklearn.feature_extraction.text module (no more preprocessor and
> analyzers, just vectorizers + TfIdf transformer). I will finish and
> submit a PR this weekend along with the narrative doc.
>
> This will be much simpler than the current state and will make it
> easier to implement the murmurhash3 hashing vectorizer afterwards.
>

Awesome :)
I don't have much insight into the module but it felt like it needed
some love. Thumbs up!
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
