[ 
https://issues.apache.org/jira/browse/SPARK-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504596#comment-14504596
 ] 

zhengruifeng commented on SPARK-7008:
-------------------------------------

I had not considered the size of the model, because the problems I usually 
encounter have a dimensionality below 10 million. For higher dimensionality, 
I think feature hashing may help to limit the number of features (not sure).
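A minimal sketch of the feature hashing idea mentioned above, in Python for brevity. It is not part of the patch; the names `hash_features` and `fm_param_count` are hypothetical, and it assumes a factorization machine with k latent factors stores one bias, one linear weight per feature and one k-dimensional factor vector per feature.

```python
import zlib
from collections import defaultdict


def hash_features(features, num_buckets):
    """Map (name, value) pairs into a sparse vector with indices < num_buckets.

    Features that collide in the same bucket have their values summed.
    """
    vec = defaultdict(float)
    for name, value in features:
        # crc32 is stable across processes, unlike Python's built-in hash().
        index = zlib.crc32(name.encode("utf-8")) % num_buckets
        vec[index] += value
    return dict(vec)


def fm_param_count(num_features, k):
    """Parameters of an FM model: bias + linear weights + factor matrix."""
    return 1 + num_features + num_features * k


# Hashing caps the feature dimension, and therefore the model size,
# regardless of how many distinct raw feature names exist.
example = [("user=42", 1.0), ("item=7", 1.0), ("price", 3.5)]
hashed = hash_features(example, 1 << 20)
```

The trade-off is that colliding features share parameters, so the bucket count bounds the model size at the cost of some accuracy.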
LibFM implements four training algorithms: SGD, adaptive SGD, ALS and MCMC. 
So far I have only implemented SGD for regression, and I am going to implement 
SGD for binary classification next.
In my opinion, SGD is sensitive to the learning rate: large values cause 
divergence, while small values make training slow.
When coding, I followed LibFM closely. There are only two differences: 
LibFM uses strict (per-example) SGD, whereas I use the mini-batch SGD provided 
by MLlib; and LibFM keeps the learning rate constant, whereas I decrease it 
with the square root of the iteration counter. So I think its convergence 
should be similar to that of LibFM's SGD.
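The two differences can be illustrated with a rough, self-contained sketch. This is not the patch itself: it is plain Python rather than Scala/MLlib, all names (`fm_predict`, `train_fm`, `mse`) are hypothetical, and it assumes squared loss, sparse examples given as `{index: value}` dicts, and a step size of `step0 / sqrt(iteration)`.

```python
import math
import random


def fm_predict(x, w0, w, v):
    """Second-order FM prediction for a sparse example x = {index: value}."""
    k = len(v[0])
    result = w0 + sum(w[i] * xi for i, xi in x.items())
    for f in range(k):
        s = sum(v[i][f] * xi for i, xi in x.items())
        s2 = sum((v[i][f] * xi) ** 2 for i, xi in x.items())
        result += 0.5 * (s * s - s2)
    return result


def train_fm(data, n, k, step0, num_iters, batch_size, seed=0):
    """Mini-batch SGD for FM regression with a 1/sqrt(t) step size decay."""
    rng = random.Random(seed)
    w0, w = 0.0, [0.0] * n
    v = [[rng.gauss(0.0, 0.01) for _ in range(k)] for _ in range(n)]
    for it in range(1, num_iters + 1):
        step = step0 / math.sqrt(it)  # decays with the iteration counter
        batch = rng.sample(data, min(batch_size, len(data)))
        g0, gw, gv = 0.0, {}, {}
        for x, target in batch:
            err = fm_predict(x, w0, w, v) - target  # d(0.5*err^2)/d(pred)
            sums = [sum(v[i][f] * xi for i, xi in x.items()) for f in range(k)]
            g0 += err
            for i, xi in x.items():
                gw[i] = gw.get(i, 0.0) + err * xi
                for f in range(k):
                    grad = err * xi * (sums[f] - v[i][f] * xi)
                    gv[(i, f)] = gv.get((i, f), 0.0) + grad
        scale = step / len(batch)  # average the gradient over the mini-batch
        w0 -= scale * g0
        for i, g in gw.items():
            w[i] -= scale * g
        for (i, f), g in gv.items():
            v[i][f] -= scale * g
    return w0, w, v


def mse(data, model):
    """Mean squared error of a (w0, w, v) model on the data."""
    w0, w, v = model
    return sum((fm_predict(x, w0, w, v) - t) ** 2 for x, t in data) / len(data)


# Toy data: features 0-1 are one-hot "users", features 2-3 are one-hot "items".
data = [({0: 1.0, 2: 1.0}, 1.0), ({0: 1.0, 3: 1.0}, 2.0),
        ({1: 1.0, 2: 1.0}, 3.0), ({1: 1.0, 3: 1.0}, 4.0)]
before = mse(data, train_fm(data, n=4, k=2, step0=0.2, num_iters=0, batch_size=4))
after = mse(data, train_fm(data, n=4, k=2, step0=0.2, num_iters=500, batch_size=4))
```

Setting `batch_size=1` and replacing `step` with the constant `step0` gives the strict, constant-rate variant for comparison.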
I'm testing the library, and the results will be posted in a few days.
Thanks.

> An implementation of Factorization Machine (LibFM)
> --------------------------------------------------
>
>                 Key: SPARK-7008
>                 URL: https://issues.apache.org/jira/browse/SPARK-7008
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>    Affects Versions: 1.3.0, 1.3.1, 1.3.2
>            Reporter: zhengruifeng
>              Labels: features, patch
>         Attachments: FM_convergence_rate.xlsx, QQ20150421-1.png, 
> QQ20150421-2.png
>
>
> An implementation of Factorization Machines based on Scala and Spark MLlib.
> A Factorization Machine is a kind of machine learning algorithm for 
> multi-linear regression, and is widely used for recommendation.
> Factorization Machines have performed well in recent recommendation 
> competitions.
> Ref:
> http://libfm.org/
> http://doi.acm.org/10.1145/2168752.2168771
> http://www.inf.uni-konstanz.de/~rendle/pdf/Rendle2010FM.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
