[jira] Commented: (MAHOUT-228) Need sequential logistic regression implementation using SGD techniques

Olivier Grisel (JIRA) Mon, 18 Jan 2010 18:01:28 -0800

    [ 
https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802032#action_12802032
 ]


Olivier Grisel commented on MAHOUT-228:
---------------------------------------

For the record: I am working on adding more tests and debugging in the 
following branch (kept in sync with trunk), hosted on GitHub:

  http://github.com/ogrisel/mahout/commits/MAHOUT-228

Fixed so far:
 - convergence issues (inconsistency in the index of the 'missing' beta row)
 - make sure that L1 is sparsity-inducing by applying eager post-update 
regularization
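
To illustrate the second fix, here is a minimal sketch of one common reading 
of "eager post-update regularization": after each gradient step, every weight 
is shrunk toward zero by learningRate * lambda and clipped at zero so that it 
cannot change sign. This is not the code from the branch; the class name, 
method names, and the parameter values in main are assumptions made for 
illustration only.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch, not the MAHOUT-228 implementation.
class EagerL1Sketch {

    // Shrink w toward zero by 'shrink', clipping at zero so the sign never flips.
    static double softThreshold(double w, double shrink) {
        if (w > 0) {
            return Math.max(0.0, w - shrink);
        }
        if (w < 0) {
            return Math.min(0.0, w + shrink);
        }
        return 0.0;
    }

    // One SGD step for binary logistic regression (y is 0 or 1), followed by
    // an eager L1 shrink applied to every weight right after its update.
    static void step(double[] beta, double[] x, int y,
                     double learningRate, double lambda) {
        double dot = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += beta[i] * x[i];
        }
        double p = 1.0 / (1.0 + Math.exp(-dot));
        double err = y - p;
        for (int i = 0; i < beta.length; i++) {
            beta[i] += learningRate * err * x[i];
            beta[i] = softThreshold(beta[i], learningRate * lambda);
        }
    }

    public static void main(String[] args) {
        // Synthetic data (assumed): feature 1 duplicates feature 0,
        // features 2 and 3 are pure noise.
        Random rnd = new Random(42);
        double[] beta = new double[4];
        for (int t = 0; t < 5000; t++) {
            double signal = rnd.nextGaussian();
            double[] x = {signal, signal, rnd.nextGaussian(), rnd.nextGaussian()};
            int y = signal > 0 ? 1 : 0;
            step(beta, x, y, 0.05, 0.1);
        }
        System.out.println(Arrays.toString(beta));
    }
}
```

Because the clipping produces exact zeros rather than merely small values, 
weights whose gradient signal is weaker than the penalty can be dropped from 
the model entirely. A lazy variant would defer the shrink for features absent 
from the current example; the sketch above shrinks all weights at every step.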

Still TODO (independently of Ted's TODOs), might be split into specific JIRA 
issues:
 - test that a highly redundant dataset can lead to very sparse models with 
an L1 prior
 - a Hadoop driver to do parallel extraction of feature vectors from 
documents using the Randomizer classes
 - a Hadoop driver to do parallel cross-validation and confusion matrix 
evaluation (along with confidence intervals)
 - a Hadoop driver to perform hyperparameter grid search (lambda, priorfunc, 
learning rate, ...)
 - a sample Hadoop driver to categorize Wikipedia articles by country
 - profile it a bit
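
For the evaluation item in the list above, the following is a rough, 
single-machine sketch of what a confusion matrix with a confidence interval 
on accuracy could look like. The class and method names are hypothetical, and 
the choice of a 95% normal-approximation (Wald) interval is an assumption; 
the comment does not say which interval is intended, and no Hadoop code is 
shown here.

```java
// Hypothetical sketch, not part of Mahout.
class ConfusionMatrixSketch {
    // counts[actual][predicted]
    private final int[][] counts;

    ConfusionMatrixSketch(int numLabels) {
        counts = new int[numLabels][numLabels];
    }

    void add(int actual, int predicted) {
        counts[actual][predicted]++;
    }

    int total() {
        int n = 0;
        for (int[] row : counts) {
            for (int c : row) {
                n += c;
            }
        }
        return n;
    }

    // Fraction of examples on the diagonal (correctly classified).
    double accuracy() {
        int n = total();
        if (n == 0) {
            throw new IllegalStateException("no examples recorded");
        }
        int correct = 0;
        for (int i = 0; i < counts.length; i++) {
            correct += counts[i][i];
        }
        return correct / (double) n;
    }

    // Assumed: 95% Wald interval, p +/- 1.96 * sqrt(p * (1 - p) / n),
    // clamped to [0, 1].
    double[] accuracyInterval95() {
        double p = accuracy();
        double half = 1.96 * Math.sqrt(p * (1.0 - p) / total());
        return new double[] {Math.max(0.0, p - half), Math.min(1.0, p + half)};
    }
}
```

In a parallel setting, each fold or mapper could fill its own matrix and the 
counts could be summed afterwards, since the matrix is just a table of 
counters.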


> Need sequential logistic regression implementation using SGD techniques
> -----------------------------------------------------------------------
>
>                 Key: MAHOUT-228
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-228
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>            Reporter: Ted Dunning
>             Fix For: 0.3
>
>         Attachments: logP.csv, MAHOUT-228-3.patch, r.csv, sgd-derivation.pdf, 
> sgd-derivation.tex, sgd.csv
>
>
> Stochastic gradient descent (SGD) is often fast enough for highly scalable 
> learning (see Vowpal Wabbit, http://hunch.net/~vw/).
> I often need to have a logistic regression in Java as well, so that is a 
> reasonable place to start.

