2012/3/2 Andreas <[email protected]>:
> Hi everybody!
> As I am in charge of the next release, here is a quick pep talk and
> wish list:
>
> We are two months into 0.10 now and I feel the spirit of Granada
> is still with us with lots of activity going on all the time.
> Loads of great work has happened since the last
> release, even though much was "behind the scenes" and fixes.
>
> There are some things that I would really love to have in
> 0.11 and that I think there is still enough time to do.
>
> I think one of the great things about sklearn is the documentation,
> but as was pointed out recently, there is still room for improvement.
>
> In particular, it really bugs me that there is no documentation for
> LDA and the text features.
> I will try to do LDA but I really cannot do documentation for text
> features. So it would be *really* great if someone with some
> NLP knowledge could sit down and write some documentation
> for this part of the scikit.

I am currently tinkering with a full flattening of the
sklearn.feature_extraction.text module (no more preprocessors and
analyzers, just vectorizers + a TfIdf transformer). I will finish it and
submit a PR this weekend along with the narrative doc.

This will be much simpler than the current state and will make it
easier to implement the murmurhash3 hashing vectorizer afterwards.
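
For readers unfamiliar with the idea, here is a rough sketch of what a
hashing vectorizer does: each token is hashed straight to a column index,
so no vocabulary has to be stored. This is only an illustration under
stated assumptions, not the planned implementation: the function name
hashed_counts and the n_features parameter are hypothetical, and MD5 from
the standard library stands in for murmurhash3.

```python
import hashlib
import re


def hashed_counts(text, n_features=16):
    """Map a document to a fixed-size list of token counts via hashing.

    Hypothetical helper for illustration only; not part of scikit-learn.
    """
    vector = [0] * n_features
    # Lowercase the text and split it into word tokens.
    for token in re.findall(r"\w+", text.lower()):
        # Stand-in hash: MD5 from the standard library, NOT murmurhash3.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        # Use the first 4 bytes as an integer and fold it into the
        # fixed number of columns; distinct tokens may collide.
        index = int.from_bytes(digest[:4], "little") % n_features
        vector[index] += 1
    return vector


if __name__ == "__main__":
    print(hashed_counts("the cat sat on the mat"))
```

The output size depends only on n_features, never on the corpus, which is
what makes the approach attractive for large or streaming text data. The
price is that colliding tokens share a column and the mapping cannot be
inverted back to words.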

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
