> We don't want generators or lists of functions as parameters, though, as
> that would break the ability to do cross-validation and picklability.
Agreed, but this does seem to fit the general use case of on-line
learning, so hopefully we should be able to address this use case in
the long run.
G
I plan to work on this during the sprint to simplify the vectorizer
and make it easier to override the default implementation. We don't
want generators or lists of functions as parameters, though, as that
would break the ability to do cross-validation and picklability.
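
A minimal sketch of the generator half of that concern, using only the
standard library (token_stream is a hypothetical helper made up for this
illustration, not part of the vectorizer): a generator cannot be pickled,
and it is exhausted after one pass, while cross-validation has to go over
the data several times.

```python
import pickle


def token_stream(docs):
    """Hypothetical helper: lazily yield the whitespace tokens of each document."""
    for doc in docs:
        yield doc.split()


docs = ["a b", "c d"]
gen = token_stream(docs)

# Problem 1: generator objects cannot be serialized with pickle.
try:
    pickle.dumps(gen)
    picklable = True
except (TypeError, pickle.PicklingError):
    picklable = False

# Problem 2: a generator can be consumed only once, so a second pass
# (for example a second cross-validation fold) sees no data at all.
first_pass = list(gen)
second_pass = list(gen)

print(picklable, first_pass, second_pass)
```

A plain list of documents has neither problem, at the cost of holding
everything in memory.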
--
Olivier
On 25 November 2011 08:58, Nelle Varoquaux wrote:
> On 24 November 2011 22:51, Lars Buitinck wrote:
> > 2011/11/22 SK Sn :
> >> I looked into WordNGramAnalyzer in feature_extraction/text.py.
> >>
> >> It occurred to me that in case of nGram n>1, 'handle token n-grams' happens
> >> before 'handl
On 24 November 2011 22:51, Lars Buitinck wrote:
> 2011/11/22 SK Sn :
>> I looked into WordNGramAnalyzer in feature_extraction/text.py.
>>
>> It occurred to me that in case of nGram n>1, 'handle token n-grams' happens
>> before 'handle stop words', as shown in the following snippet:
>
>
>
>> At least
2011/11/22 SK Sn :
> I looked into WordNGramAnalyzer in feature_extraction/text.py.
>
> It occurred to me that in case of nGram n>1, 'handle token n-grams' happens
> before 'handle stop words', as shown in the following snippet:
> At least it is strange to me that, especially when I define my own
>
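
For readers following the thread, the ordering question quoted above can
be sketched as follows. This is a simplified illustration written for
this note, not the code from feature_extraction/text.py; word_ngrams,
ngrams_then_stop and stop_then_ngrams are hypothetical names, and the
tokens and stop words are made up.

```python
def word_ngrams(tokens, min_n, max_n):
    """Return all space-joined n-grams with min_n <= n <= max_n."""
    grams = []
    for n in range(min_n, min(max_n, len(tokens)) + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def ngrams_then_stop(tokens, stop_words, min_n=1, max_n=2):
    """Build n-grams first, then drop any n-gram that is itself a stop word."""
    return [g for g in word_ngrams(tokens, min_n, max_n) if g not in stop_words]


def stop_then_ngrams(tokens, stop_words, min_n=1, max_n=2):
    """Drop stop-word tokens first, then build n-grams from what is left."""
    kept = [t for t in tokens if t not in stop_words]
    return word_ngrams(kept, min_n, max_n)


tokens = ["the", "cat", "sat", "on", "the", "mat"]
stop_words = {"the", "on"}

a = ngrams_then_stop(tokens, stop_words)
b = stop_then_ngrams(tokens, stop_words)

print(a)
print(b)
```

With n-grams built first, a bigram such as "on the" survives the filter
because only whole n-grams are compared against the stop list. Filtering
first removes it, but creates bigrams such as "sat mat" from words that
were not adjacent in the text.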
Hi there,
I looked into WordNGramAnalyzer in feature_extraction/text.py.
It occurred to me that in case of nGram n>1, 'handle token n-grams' happens
before 'handle stop words', as shown in the following snippet:
# handle token n-grams
if self.min_n != 1 or self.max_n != 1: