On 01/16/2013 09:30 AM, JAGANADH G wrote:
> Hi All,
> I trained a LinearSVC() classifier for text classification. The total 
> training set is about 1 lakh (100,000) documents (mostly single-line). The saved 
> model is about 3.5 MB in size.
> When I use the model in my Python script, it takes too much time to 
> perform the prediction (close to one minute to predict; it took 20 minutes to 
> classify 3000 documents).
> My pipeline is
> classifier = Pipeline([('vect', vectorizer), ('tfidf', transformer), 
> ('clf', LinearSVC())])
> Is there any way to make it faster?
Probably.
I'm a bit surprised that it took so long. I would guess it is the 
vectorizer, though that should be O(tokens) afaik.
Can you find out which of the steps in the pipeline takes so long?
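One way to check this is to run the fitted steps one at a time and time each of them. What follows is only a rough sketch, not code from the original script: the helper name time_pipeline_steps, the default CountVectorizer/TfidfTransformer settings and the toy documents are all assumptions made for illustration. It relies on the pipeline already being fitted and on every step except the last one exposing transform().

```python
import time

from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC


def time_pipeline_steps(pipeline, docs):
    """Time each step of an already fitted pipeline on `docs`.

    Hypothetical helper: returns (timings, predictions), where timings
    maps each step name to the seconds spent in that step.
    """
    timings = {}
    data = docs
    # Every step except the last is a transformer.
    for name, step in pipeline.steps[:-1]:
        start = time.perf_counter()
        data = step.transform(data)
        timings[name] = time.perf_counter() - start
    # The last step is the classifier.
    final_name, final_step = pipeline.steps[-1]
    start = time.perf_counter()
    predictions = final_step.predict(data)
    timings[final_name] = time.perf_counter() - start
    return timings, predictions


# Toy data, assumed for illustration only.
train_docs = [
    "cheap pills online",
    "meeting at noon",
    "win money now",
    "lunch with the team",
]
train_labels = [1, 0, 1, 0]

classifier = Pipeline([
    ("vect", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("clf", LinearSVC()),
])
classifier.fit(train_docs, train_labels)

# 3000 short documents, the batch size mentioned in the question.
test_docs = train_docs * 750
timings, predictions = time_pipeline_steps(classifier, test_docs)
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

If one step dominates the total, that is the place to look. Note also that the helper passes the whole batch through each step in a single call; calling predict() once per document in a Python loop repeats the per-call overhead 3000 times.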
Cheers,
Andy


_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general