> Reasonable approach. With the sgd code, I avoid an IDF computation by
> using an annealed per term feature learning rate.

Ok. I am going ahead with this. Could you add logNormalization per document as an option in SGD? Jrennie's paper mentions that it improves accuracy for text. I don't know how it affects SGD-style learning.

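
A minimal sketch of per-document log normalization, assuming it means damping each raw term count with log(1 + tf) and then scaling the document vector to unit L2 length. That reading of Jrennie's paper, the function name `log_normalize`, and the sample counts are all assumptions for illustration, not anything taken from the SGD code:

```python
import math

def log_normalize(term_counts):
    """Hypothetical helper: log-damp raw term counts, then scale to unit L2 length."""
    # Damp each raw count so repeated terms contribute sub-linearly.
    damped = {t: math.log(1.0 + c) for t, c in term_counts.items() if c > 0}
    # Length-normalize so long and short documents get comparable weight.
    norm = math.sqrt(sum(w * w for w in damped.values()))
    if norm == 0.0:
        return {}
    return {t: w / norm for t, w in damped.items()}

# Made-up term counts for a single document.
doc = {"sgd": 3, "idf": 1, "rate": 1}
weights = log_normalize(doc)
```

Making this an option would let the SGD learner be run with and without it on the same data, since its effect on that kind of learning is the open question here.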
Robin
