On Thu, Mar 29, 2012 at 06:19:48PM +0200, David Marek wrote:
> Hi,
>
> I have created a first draft of my application for gsoc. I summarized
> all ideas from last thread so I hope it makes sense. You can read it at
> https://docs.google.com/document/d/11zxSbsGwevd49JIqAiNz4Qb6cFYzHdJgH9ROegY9-qo/edit
> I would like to ask Andreas and David to have a look. Every feedback is
> welcome.
I'd emphasize that "SGD" is a class of algorithms, and the implementations
that exist are purely for the linear classifier setting. I'm not sure how
much use they will be in an SGD-for-MLP (they can maybe be reused for
certain kinds of output layers), but there is definitely more work in
efficiently computing the gradient.

If you're as familiar with backpropagation as you say, this doesn't seem
like that much actual code for the 2.5-month stretch you've projected. If
possible, I wouldn't limit yourself to vanilla SGD as the only avenue for
optimization. For small problems/model sizes, other avenues are worth
exploring, e.g.:

- Batch gradient descent with delta-bar-delta adaptation
  (http://www.bcs.rochester.edu/people/robbie/jacobs.nn88.pdf). Once you
  have the gradient computation taken care of, this is a few relatively
  simple lines of NumPy.

- The miscellaneous numerical optimizers in scipy.optimize, in particular
  the "minimize" function, which provides a unified interface to all the
  different optimization strategies.

In addition, beyond basic SGD, make sure you *at least* implement support
for a momentum term; this can help enormously with rapidly traversing and
escaping plateaus in the error surface, and it is trivial to implement once
you are already computing the gradient. Polyak averaging may be another
useful avenue, given any spare time.

I presume that when you mention "Levenberg-Marquardt" you mean the
stochastic diagonal version referenced in the "Efficient BackProp" paper?
This is very different from regular Levenberg-Marquardt, and you should
make that distinction explicit.

Other comments:

- Ideally, some amount of testing should be done in parallel with
  development. You will inevitably be ad-hoc testing your implementation
  as you go; don't throw that code away, but put it in a unit test.

- I'd like to see more than simply "add tests". Specifically, you should
  give some thought to how exactly you are going to go about unit-testing
  your implementation.
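To make the delta-bar-delta suggestion above concrete, here is a rough,
untested NumPy sketch of the Jacobs-style per-parameter rate adaptation on
a full-batch gradient. The function name and constants are my own choices,
not anything from the proposal:

```python
import numpy as np

def delta_bar_delta(grad, w0, lr0=0.01, kappa=0.001, phi=0.5,
                    theta=0.7, n_steps=200):
    """Batch gradient descent with delta-bar-delta rate adaptation.

    Each parameter keeps its own learning rate: it is increased
    additively by `kappa` while the current gradient agrees in sign
    with a decayed average of past gradients (delta_bar), and scaled
    down by `phi` when the signs disagree.
    """
    w = w0.astype(float).copy()
    lr = np.full_like(w, lr0)
    delta_bar = np.zeros_like(w)
    for _ in range(n_steps):
        g = grad(w)
        same_sign = g * delta_bar
        lr = np.where(same_sign > 0, lr + kappa, lr)  # additive increase
        lr = np.where(same_sign < 0, lr * phi, lr)    # multiplicative decrease
        delta_bar = (1 - theta) * g + theta * delta_bar
        w -= lr * g
    return w

# Toy check on a quadratic bowl f(w) = 0.5 * ||w||^2, whose gradient is w.
w = delta_bar_delta(lambda w: w, np.array([5.0, -3.0]))
```

The point is really just that, as said above, this is a handful of NumPy
lines once the gradient is available.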
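For the scipy.optimize route, `minimize` only needs a callable loss (and
ideally its gradient), so hooking an MLP objective into it is cheap. A toy
sketch, where the quadratic objective is just a stand-in for whatever loss
the MLP ends up exposing:

```python
import numpy as np
from scipy.optimize import minimize

def loss(w):
    # Stand-in objective with minimum at w = 3; an MLP would return
    # its training loss here instead.
    return np.sum((w - 3.0) ** 2)

def loss_grad(w):
    # Analytic gradient of the stand-in objective; for an MLP this
    # would be the backprop gradient.
    return 2.0 * (w - 3.0)

res = minimize(loss, x0=np.zeros(4), jac=loss_grad, method="L-BFGS-B")
```

Swapping `method=` then gives you the other strategies (CG, BFGS, etc.)
for free through the same interface.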
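And to illustrate how little the momentum term mentioned above adds once
the gradient is computed, a minimal sketch (again, names and constants are
mine):

```python
import numpy as np

def sgd_momentum(grad, w0, lr=0.01, mu=0.9, n_steps=100):
    """Gradient descent with a classical momentum term.

    `grad` is any callable returning the gradient at w; here it is a
    full-batch gradient for simplicity, but a minibatch gradient plugs
    in the same way.
    """
    w = w0.astype(float).copy()
    v = np.zeros_like(w)  # velocity: decaying sum of past gradients
    for _ in range(n_steps):
        v = mu * v - lr * grad(w)
        w += v
    return w

# Toy check on the quadratic f(w) = 0.5 * ||w||^2, gradient = w.
w = sgd_momentum(lambda w: w, np.array([5.0, -3.0]), lr=0.1, mu=0.9,
                 n_steps=500)
```

The whole change over vanilla SGD is the `v` buffer and two lines in the
loop, which is why I'd consider it a minimum requirement.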
This will require some careful thought about the nature of MLPs themselves
and about how to go about verifying correctness. Regression tests (e.g.
making sure you get the same output as before given the same random seed)
are good for catching bugs introduced by refactoring, but they are not the
whole story.

Otherwise, it seems like a good proposal. As I said, it seems like a
rather small amount of actual implementation, even if you are only
budgeting the first half of the work period for it. I would look for some
additional features to flesh out the implementation side of the proposal.

David

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
