[ 
https://issues.apache.org/jira/browse/MAHOUT-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13714803#comment-13714803
 ] 

Ted Dunning commented on MAHOUT-1273:
-------------------------------------



Should the document be updated to describe what you intend to do?

                
> Single Pass Algorithm for Penalized Linear Regression on MapReduce
> ------------------------------------------------------------------
>
>                 Key: MAHOUT-1273
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1273
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Kun Yang
>         Attachments: PenalizedLinear.pdf
>
>   Original Estimate: 720h
>  Remaining Estimate: 720h
>
> Penalized linear regression methods such as Lasso and Elastic-net are widely 
> used in machine learning, but there are no efficient, scalable 
> implementations on MapReduce.
> The published distributed algorithms for solving this problem are either 
> iterative (which is not a good fit for MapReduce; see Stephen Boyd's paper) 
> or approximate (which is a problem when exact solutions are needed; see 
> Parallelized Stochastic Gradient Descent). Another disadvantage of these 
> algorithms is that they cannot do cross validation in the training phase, 
> so the user must specify the penalty parameter in advance. 
> My approach can train the model with cross validation in a single pass. It 
> is based on some simple observations.
> I have implemented a basic version of this algorithm at Alpine Data Labs. 
> Advanced features such as the in-mapper combiner are employed to reduce 
> the network traffic in the shuffle phase.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
