[Scikit-learn-general] Order of processes in WordNGramAnalyzer

2011-11-22 Thread SK Sn
Hi there, I looked into WordNGramAnalyzer in feature_extraction/text.py. It occurred to me that in the case of n-grams with n > 1, 'handle token n-grams' happens before 'handle stop words', as shown in the following snippet:

    # handle token n-grams
    if self.min_n != 1 or self.max_n != 1:
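To illustrate why the order matters, here is a minimal standalone sketch (the STOP_WORDS set, TOKEN_PATTERN and the ngrams helper are made up for this example, not the scikit-learn internals):

    import re

    # illustrative only -- these names are not from scikit-learn
    STOP_WORDS = {"the", "of"}
    TOKEN_PATTERN = re.compile(r"\b\w\w+\b")

    def ngrams(tokens, min_n, max_n):
        # build contiguous word n-grams from a token list
        out = []
        for n in range(min_n, min(max_n, len(tokens)) + 1):
            for i in range(len(tokens) - n + 1):
                out.append(" ".join(tokens[i:i + n]))
        return out

    tokens = TOKEN_PATTERN.findall("the quick fox of the north")

    # ordering as in the snippet above: n-grams first, stop words second --
    # only n-grams that are exactly a stop word get removed, so bigrams such as
    # "fox of", "of the" and "the north" all survive
    print([t for t in ngrams(tokens, 1, 2) if t not in STOP_WORDS])

    # the alternative ordering: stop words removed before building n-grams
    print(ngrams([t for t in tokens if t not in STOP_WORDS], 1, 2))

With n-grams built first, a stop word is only dropped when it forms an n-gram all by itself, so bigrams like "of the" survive; removing stop words before building the n-grams is presumably what one would expect when passing a stop word list together with min_n/max_n > 1.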

Re: [Scikit-learn-general] SGD learning rate heuristic

2011-11-22 Thread Andreas Müller
- Original Message -
From: "Peter Prettenhofer"
To: scikit-learn-general@lists.sourceforge.net
Sent: Tuesday, 22 November 2011 13:44:25
Subject: Re: [Scikit-learn-general] SGD learning rate heuristic

2011/11/22 Andreas Müller:
> Hi Peter.
> Thanks for the quick answer.
>
>
> On

Re: [Scikit-learn-general] SGD learning rate heuristic

2011-11-22 Thread Peter Prettenhofer
2011/11/22 Andreas Müller:
> Hi Peter.
> Thanks for the quick answer.
>
>
> On 11/22/2011 12:33 PM, Peter Prettenhofer wrote:
>> Hi Andy,
>>
>> I adopted the heuristic from Leon Bottou's sgd implementation (version
>> 1.3). He explains the heuristic in [1] - search for "Choosing the Gain
>> Schedu

Re: [Scikit-learn-general] SGD learning rate heuristic

2011-11-22 Thread Andreas Müller
Hi Peter. Thanks for the quick answer.

On 11/22/2011 12:33 PM, Peter Prettenhofer wrote:
> Hi Andy,
>
> I adopted the heuristic from Leon Bottou's sgd implementation (version
> 1.3). He explains the heuristic in [1] - search for "Choosing the Gain
> Schedule". I'm not aware of any paper which des

Re: [Scikit-learn-general] SGD learning rate heuristic

2011-11-22 Thread Peter Prettenhofer
Hi Andy,

I adopted the heuristic from Leon Bottou's sgd implementation (version 1.3). He explains the heuristic in [1] - search for "Choosing the Gain Schedule". I'm not aware of any paper which describes the rationale in more depth. Here's the quote from the slide: "Choose t_0 to make sure that t
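For what it's worth, here is a minimal standalone sketch of the heuristic as I read it from Bottou's code (the function names and the hinge_dloss helper below are my own for illustration, not the actual implementation):

    import math

    def hinge_dloss(p, y=1.0):
        # derivative magnitude of the hinge loss at prediction p, target y
        return y if p * y < 1.0 else 0.0

    def init_eta0_and_t0(alpha, dloss=hinge_dloss):
        # pick a "typical" weight scale from the regularization strength alpha,
        # then choose eta0 so that the very first update moves a weight by
        # roughly that scale (illustrative version of the heuristic)
        typw = math.sqrt(1.0 / math.sqrt(alpha))
        eta0 = typw / max(1.0, dloss(-typw, 1.0))
        t0 = 1.0 / (eta0 * alpha)
        return eta0, t0

    def eta(t, alpha, t0):
        # the schedule eta_t = 1 / (alpha * (t + t0))
        return 1.0 / (alpha * (t + t0))

    alpha = 1e-4
    eta0, t0 = init_eta0_and_t0(alpha)
    print(eta0, t0, eta(0, alpha, t0))  # eta(0) matches eta0 up to rounding

As far as I can tell, the idea is to pick eta_0 so that the very first update is on the same order as a "typical" weight for the given regularization strength alpha, and then to set t_0 so that the schedule eta_t = 1 / (alpha * (t + t_0)) starts exactly at eta_0.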

[Scikit-learn-general] SGD learning rate heuristic

2011-11-22 Thread Andreas Müller
Hi everybody. Could someone please explain to me the learning rate heuristic in SGD? Why is \eta_0 initialized the way it is? Peter Prettenhofer mentioned it is taken from Leon Bottou's sgd code. I found it there, but with no further explanation. Is it explained in any of the papers? I could not find it.