[ https://issues.apache.org/jira/browse/SPARK-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907669#comment-14907669 ]
Tomasz Bartczak commented on SPARK-10802:
-----------------------------------------

Hmm, you are probably referring to the method

    predict(usersProducts: RDD[(Int, Int)]): RDD[Rating]

but what I am referring to are top-K recommendations for a subset of users.

- Using recommendProducts(user: Int, num: Int): Array[Rating] is quite slow when called in a loop for many users.
- recommendProductsForUsers(num: Int): RDD[(Int, Array[Rating])] computes recommendations for all users, which is unnecessary overhead when I need them for only a subset.

I imagine a method

    recommendProductsForUsers(users: RDD[Int], num: Int): RDD[(Int, Array[Rating])]

that would first retain the user features for the given users and only then do a cartesian join with the product features.

> Let ALS recommend for subset of data
> ------------------------------------
>
>                 Key: SPARK-10802
>                 URL: https://issues.apache.org/jira/browse/SPARK-10802
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.5.0
>            Reporter: Tomasz Bartczak
>
> Currently MatrixFactorizationModel allows getting recommendations for
> - a single user
> - a single product
> - all users
> - all products
> Recommendations for all users/products do a cartesian join internally.
> It would be useful in some cases to get recommendations for a subset of
> users/products by providing an RDD with which MatrixFactorizationModel could
> do an intersection before doing the cartesian join. This would make it much
> faster in situations where recommendations are needed only for a subset of
> users/products, and where the subset is still too large to make it feasible to
> recommend one by one.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
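
For illustration, the proposed subset method could be approximated outside of MLlib along the following lines. This is only a rough sketch of the "filter the user features first, then do the cartesian join" idea, not how MatrixFactorizationModel is implemented internally. The object name SubsetRecommender and the helper recommendProductsForUserSubset are hypothetical; the sketch assumes Spark 1.5-era MLlib, where userFeatures and productFeatures are exposed as RDD[(Int, Array[Double])] and topByKey comes from MLPairRDDFunctions.

```scala
import org.apache.spark.mllib.recommendation.{MatrixFactorizationModel, Rating}
import org.apache.spark.mllib.rdd.MLPairRDDFunctions._
import org.apache.spark.rdd.RDD

// Hypothetical helper, not part of MLlib.
object SubsetRecommender {

  def recommendProductsForUserSubset(
      model: MatrixFactorizationModel,
      users: RDD[Int],
      num: Int): RDD[(Int, Array[Rating])] = {

    // Retain only the feature vectors of the requested users.
    // Users unknown to the model are silently dropped by the join.
    val wanted: RDD[(Int, Unit)] = users.distinct().map(u => (u, ()))
    val subsetFeatures: RDD[(Int, Array[Double])] =
      model.userFeatures.join(wanted).mapValues { case (features, _) => features }

    // Score every (requested user, product) pair with a plain dot product.
    val scored: RDD[(Int, Rating)] =
      subsetFeatures.cartesian(model.productFeatures).map {
        case ((user, uf), (product, pf)) =>
          var dot = 0.0
          var i = 0
          while (i < uf.length) {
            dot += uf(i) * pf(i)
            i += 1
          }
          (user, Rating(user, product, dot))
      }

    // Keep the num highest-rated products per user.
    scored.topByKey(num)(Ordering.by[Rating, Double](_.rating))
  }
}
```

The cartesian join here still touches every product, but only for the retained users, so the work scales with the size of the subset rather than with the full user base. A real implementation would probably block the feature vectors and use BLAS rather than scoring pair by pair as this naive version does.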