[ 
https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794661#action_12794661
 ] 

Ted Dunning commented on MAHOUT-228:
------------------------------------


The original code was very nearly correct, as it turns out.  The problem is that 
in batch learning, lambda weights the prior against all of the training 
examples, whereas in the on-line algorithm the prior gradient is applied at 
each update.  

In the example I used, this effectively multiplied lambda by a factor of 60 
(the number of training examples).
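
For concreteness, here is the arithmetic behind that factor (the notation is 
mine, roughly following the attached sgd-derivation, with a Gaussian prior):

    batch MAP objective:  \sum_{i=1}^{N} \log p(y_i \mid x_i, \beta) - \frac{\lambda}{2} \|\beta\|^2

    on-line update:       \beta \leftarrow \beta + \eta \left( \nabla_\beta \log p(y_i \mid x_i, \beta) - \lambda \beta \right)

One pass over the data applies the prior gradient N times instead of once, so 
the on-line lambda acts like N \lambda in the batch objective.  Matching the 
two means taking \lambda_{online} = \lambda_{batch} / N, i.e. dividing by 60 
here.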

After adjusting the value of lambda, I get values from the on-line algorithm 
very similar to those obtained by the batch algorithm (after lots of 
iterations).
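
The fix amounts to rescaling the prior term in each update.  A minimal sketch 
of one such step (hypothetical names, not the code in the patch):

    // Sketch of a single SGD step for logistic regression with a Gaussian
    // (L2) prior.  Hypothetical names; the actual patch differs in detail.
    class SgdSketch {
      static void step(double[] beta, double[] x, int y,
                       double eta, double lambdaBatch, int numExamples) {
        double margin = 0;
        for (int j = 0; j < beta.length; j++) {
          margin += beta[j] * x[j];
        }
        double p = 1 / (1 + Math.exp(-margin));     // predicted P(y = 1 | x)
        double lambda = lambdaBatch / numExamples;  // rescale the prior per update
        for (int j = 0; j < beta.length; j++) {
          // gradient of the log likelihood plus the (rescaled) prior gradient
          beta[j] += eta * ((y - p) * x[j] - lambda * beta[j]);
        }
      }
    }

Dividing by numExamples (60 here) is what brings the on-line results in line 
with the batch ones.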

I will post a new patch shortly for review.



> Need sequential logistic regression implementation using SGD techniques
> -----------------------------------------------------------------------
>
>                 Key: MAHOUT-228
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-228
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>            Reporter: Ted Dunning
>             Fix For: 0.3
>
>         Attachments: logP.csv, MAHOUT-228-3.patch, r.csv, sgd-derivation.pdf, 
> sgd-derivation.tex, sgd.csv
>
>
> Stochastic gradient descent (SGD) is often fast enough for highly scalable 
> learning (see Vowpal Wabbit, http://hunch.net/~vw/).
> I often need to have a logistic regression in Java as well, so that is a 
> reasonable place to start.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
