Hi list,

The following NIPS 2009 paper by John Langford et al. may interest those working with online learners (e.g. using SGD):
http://books.nips.cc/papers/files/nips22/NIPS2009_0942.pdf

Here is a video recording of the workshop talk on the same subject (although it does not use exactly the same approach as the paper):

http://videolectures.net/nipsworkshops09_langford_pol/

The main takeaway is that averaging is possible for linear models, but it is not as interesting as horizontal feature sharding, which works experimentally for both linear and nonlinear models.

The second takeaway is that vowpal wabbit looks more and more unbeatable :)

--
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name