It sounds like a bug. How many tokens do you have in your corpus?
If you have the vectorized corpus in a variable X (e.g.
`X = CountVectorizer().fit_transform(list_of_documents)`), you can do:
>>> print(repr(X))
to get the dimensions and the number of non-zero entries of the sparse matrix.
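
If you want those numbers as plain values rather than reading them off the repr, something like the following rough sketch should work. The toy `list_of_documents` is made up for illustration, and the variable names are mine. Note that the total only counts tokens kept by the default `CountVectorizer` tokenizer, which ignores single-character tokens.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus standing in for your own list_of_documents.
list_of_documents = [
    "the quick brown fox",
    "the lazy dog",
    "the quick dog barks",
]

X = CountVectorizer().fit_transform(list_of_documents)

# Rows are documents, columns are vocabulary terms.
n_documents, n_features = X.shape
# Number of stored (document, term) pairs with a non-zero count.
n_nonzero = X.nnz
# Total token occurrences counted across the whole corpus.
n_tokens = int(X.sum())

print(repr(X))
print(n_documents, n_features, n_nonzero, n_tokens)
```

Comparing `n_tokens` and `n_features` against what you expect for your corpus is a quick way to check whether the vectorization step itself looks sane.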
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general