[ https://issues.apache.org/jira/browse/SPARK-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241535#comment-14241535 ]
Debasish Das commented on SPARK-4675:
-------------------------------------

There are a few issues:

1. A batch API for topK similar users and topK similar products
2. Comparing the product x product similarities generated with columnSimilarities against the topK similar products

I added batch APIs for topK product recommendation for each user and topK user recommendation for each product in SPARK-4231. A similar batch API would be very helpful for topK similar users and topK similar products.

I agree with cosine similarity. You should be able to re-use the column similarity calculations. I think a better idea is to add rowMatrix.similarRows and re-use that code to generate both product similarities and user similarities.

But my question is more about validation. We can compute product similarities on the raw features, and we can compute product similarities on the product factor matrix. Which one is better?

> Find similar products and similar users in MatrixFactorizationModel
> -------------------------------------------------------------------
>
>                 Key: SPARK-4675
>                 URL: https://issues.apache.org/jira/browse/SPARK-4675
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Steven Bourke
>            Priority: Trivial
>              Labels: mllib, recommender
>
> Using the latent feature space that is learnt in MatrixFactorizationModel, I
> have added 2 new functions to find similar products and similar users. A user
> of the API can for example pass a product ID, and get the closest products.
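
To make the validation question in the comment concrete, here is a minimal sketch in plain Python, not Spark code. It ranks topK similar products by cosine similarity twice, once on raw feature vectors and once on latent factor vectors, and reports how much the two neighbour lists agree. The names cosine, top_k_similar, and top_k_overlap, and the toy vectors, are hypothetical and are not part of MLlib. The overlap only measures agreement between the two representations; it does not say which one is better.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_similar(item_id, vectors, k):
    """Return up to k (other_id, score) pairs most similar to item_id.

    `vectors` is a hypothetical dict mapping an id to its vector; it stands in
    for either raw product features or learnt product factors.
    """
    query = vectors[item_id]
    scored = (
        (cosine(query, vec), other)
        for other, vec in vectors.items()
        if other != item_id
    )
    return [(other, score) for score, other in heapq.nlargest(k, scored)]


def top_k_overlap(raw, factors, k):
    """Mean fraction of top-k neighbours shared by the two representations.

    Assumes both dicts hold at least k + 1 common ids, so every product
    has k neighbours in each representation.
    """
    ids = [i for i in raw if i in factors]
    total = 0.0
    for i in ids:
        from_raw = {j for j, _ in top_k_similar(i, raw, k)}
        from_factors = {j for j, _ in top_k_similar(i, factors, k)}
        total += len(from_raw & from_factors) / k
    return total / len(ids)


# Toy data, invented for illustration only.
raw_features = {1: [5.0, 4.0, 0.0], 2: [4.0, 5.0, 0.0], 3: [0.0, 1.0, 5.0]}
product_factors = {1: [0.9, 0.1], 2: [0.8, 0.2], 3: [0.1, 0.9]}

neighbours = top_k_similar(1, product_factors, k=1)
agreement = top_k_overlap(raw_features, product_factors, k=1)
```

A batch version would call top_k_similar for every id, which is quadratic in the number of products. That cost is the reason the comment suggests re-using the existing column similarity code rather than a per-item loop.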