[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794661#action_12794661 ]
Ted Dunning commented on MAHOUT-228:
------------------------------------

The original code was very nearly correct, as it turns out. The problem is that lambda in the batch learner weights the prior against all of the training examples at once, whereas the on-line algorithm applies the prior gradient on every update. In the example I used, this inflated the effective value of lambda by a factor of 60 (the number of training examples). After adjusting the value of lambda, the on-line algorithm gives values very similar to those from the batch algorithm (after lots of iterations). I will post a new patch shortly for review. (An illustrative sketch of the rescaling follows the quoted issue below.)

> Need sequential logistic regression implementation using SGD techniques
> -----------------------------------------------------------------------
>
>                 Key: MAHOUT-228
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-228
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>            Reporter: Ted Dunning
>             Fix For: 0.3
>
>         Attachments: logP.csv, MAHOUT-228-3.patch, r.csv, sgd-derivation.pdf, sgd-derivation.tex, sgd.csv
>
>
> Stochastic gradient descent (SGD) is often fast enough for highly scalable learning (see Vowpal Wabbit, http://hunch.net/~vw/).
> I often need to have a logistic regression in Java as well, so that is a reasonable place to start.
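For illustration, here is a minimal sketch of the rescaling described in the comment above: the batch objective charges the L2 prior once per pass over all n examples, so when the prior gradient is instead applied on every SGD update, lambda must be divided by n to keep the two objectives consistent. This is not Mahout's actual API or the patch under review; the class and method names (SgdLambdaSketch, sgdEpoch) are hypothetical.

{code:java}
/**
 * Hypothetical sketch of per-update L2 regularization in SGD logistic
 * regression. The batch objective is
 *   sum_i logLoss(w, x_i, y_i) + (lambda / 2) * ||w||^2,
 * so a prior gradient applied on every update (n times per pass) must be
 * divided by n, or the prior is effectively overweighted by a factor of n.
 */
public class SgdLambdaSketch {

    /** One SGD pass over the training set, labels y in {0, 1}. */
    static void sgdEpoch(double[] w, double[][] x, int[] y,
                         double lambda, double learningRate) {
        int n = x.length;
        double perUpdateLambda = lambda / n;   // the correction discussed above
        for (int i = 0; i < n; i++) {
            double p = sigmoid(dot(w, x[i]));
            double err = y[i] - p;             // gradient of the log-likelihood
            for (int j = 0; j < w.length; j++) {
                // likelihood term plus the rescaled prior term
                w[j] += learningRate * (err * x[i][j] - perUpdateLambda * w[j]);
            }
        }
    }

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i] * b[i];
        }
        return s;
    }

    public static void main(String[] args) {
        // Toy data: a bias term plus one feature, two points per class.
        double[][] x = {{1, 0.1}, {1, 0.3}, {1, 0.8}, {1, 0.9}};
        int[] y = {0, 0, 1, 1};
        double[] w = new double[2];
        for (int epoch = 0; epoch < 1000; epoch++) {
            sgdEpoch(w, x, y, 1.0, 0.1);
        }
        System.out.println(java.util.Arrays.toString(w));
    }
}
{code}

With the 60-example training set mentioned above, dropping the division by n overweights the prior by exactly the factor of 60 that the comment reports.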