Re: [Scikit-learn-general] Countvectorizer vocabulary

2013-11-14 Thread Lars Buitinck
Please don't send duplicate messages. Sometimes it takes a while for people to respond.
2013/11/13 Srevastan Muralidharan:
> Usefulness: It will help in cases when one has to add additional documents
> to the training data, and one should not have to start from the beginning.
I'm not sure I unde
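For readers of the archive, the use case quoted above (adding documents without starting over) can be approximated with the existing `vocabulary` parameter. The following is only a sketch under assumptions: the documents and the helper variables (`old_docs`, `new_docs`, `vocab`) are made up for illustration, and this is a workaround, not a built-in incremental mode of CountVectorizer. It keeps the column indices learned from the first corpus and appends unseen terms at the end.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical data, for illustration only.
old_docs = ["the cat sat", "the dog ran"]
new_docs = ["the bird sang"]

# Learn a vocabulary from the original training documents.
first = CountVectorizer().fit(old_docs)

# Copy the learned term -> column mapping so the old indices are preserved.
vocab = {term: int(idx) for term, idx in first.vocabulary_.items()}

# Append terms that only occur in the new documents, using the same
# tokenization as the first vectorizer.
analyze = first.build_analyzer()
for doc in new_docs:
    for term in analyze(doc):
        if term not in vocab:
            vocab[term] = len(vocab)

# A vectorizer with a fixed vocabulary; old columns keep their positions.
extended = CountVectorizer(vocabulary=vocab)
X_all = extended.fit_transform(old_docs + new_docs)
```

Note that the matrix gets wider, so any model trained on the old feature matrix would still have to be retrained or otherwise adapted to the new columns.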

[Scikit-learn-general] CountVectorizer vocabulary

2013-11-14 Thread Srevastan Muralidharan
Hi, scikit-learn's CountVectorizer for the bag-of-words approach currently gives two sub-options: (a) use a custom vocabulary; (b) if a custom vocabulary is unavailable, build a vocabulary from all the words present in the corpus. My question: Can we specify a custom vocabulary to begin wit
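The two options described in this message might look roughly like the following. This is a minimal sketch, assuming a toy corpus and a made-up custom vocabulary; only the `vocabulary` parameter and the `vocabulary_` attribute come from scikit-learn itself.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus, for illustration only.
corpus = ["the cat sat on the mat", "the dog chased the cat"]

# (a) Custom vocabulary: only the listed terms become columns,
# in the order given; all other words are ignored.
custom = CountVectorizer(vocabulary=["cat", "dog", "bird"])
X_custom = custom.fit_transform(corpus)

# (b) No custom vocabulary: the vocabulary is built from every
# token found in the corpus (default tokenizer and settings).
learned = CountVectorizer()
X_learned = learned.fit_transform(corpus)

print(X_custom.toarray())
print(sorted(learned.vocabulary_))
```

With option (a), a term such as "bird" that never occurs simply yields an all-zero column, while words outside the list are dropped.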

[Scikit-learn-general] Countvectorizer vocabulary

2013-11-14 Thread Srevastan Muralidharan
Hi, scikit-learn's CountVectorizer for the bag-of-words approach currently gives two sub-options: (a) use a custom vocabulary; (b) if a custom vocabulary is unavailable, build a vocabulary from all the words present in the corpus. My question: Can we specify a custom vocabulary to begin wit