Re: MatrixFactorizationModel predict(Int, Int) API

2014-11-06 Thread Debasish Das
I reproduced the problem in the MLlib test suite ALSSuite.scala using the following functions: val arrayPredict = userProductsRDD.map{case(user,product) => val recommendedProducts = model.recommendProducts(user, products) val productScore = recommendedProducts.find{x=>x.product

Re: MatrixFactorizationModel predict(Int, Int) API

2014-11-06 Thread Debasish Das
So model.recommendProducts can only be called from the master, then? I have a set of 20% of the users on whom I am performing the test. Those users are in an RDD; if I have to collect them all to the master node and then call model.recommendProducts, that's an issue... Any idea how to optimize this so that

Re: MatrixFactorizationModel predict(Int, Int) API

2014-11-06 Thread Xiangrui Meng
There is a JIRA for it: https://issues.apache.org/jira/browse/SPARK-3066 The easiest case is when one side is small. If both sides are large, this is a super-expensive operation. We can do a block-wise cross product and then find the top-k for each user. Best, Xiangrui On Thu, Nov 6, 2014 at 4:51 PM,
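
To make the block-wise idea above concrete, here is a minimal sketch in plain Scala, with no Spark dependency. It is not the MLlib implementation and not the SPARK-3066 patch. The names BlockwiseTopK, topKPerUser, dot and blockSize are hypothetical, and the factor vectors are assumed to fit in local collections. In a Spark job the blocks would presumably correspond to partitions or grouped chunks of the user and product feature RDDs.

```scala
// Sketch only: block-wise cross product of user and product factors,
// followed by a per-user top-k. All names are hypothetical, not MLlib API.
object BlockwiseTopK {
  type Factors = Array[Double]

  // Dot product of two factor vectors of equal length.
  def dot(a: Factors, b: Factors): Double = {
    var s = 0.0
    var i = 0
    while (i < a.length) { s += a(i) * b(i); i += 1 }
    s
  }

  def topKPerUser(
      users: Seq[(Int, Factors)],
      products: Seq[(Int, Factors)],
      k: Int,
      blockSize: Int): Map[Int, Seq[(Int, Double)]] = {
    // Split both sides into blocks so that only one block pair is scored at a time.
    val userBlocks = users.grouped(blockSize).toList
    val productBlocks = products.grouped(blockSize).toList

    // For every (user block, product block) pair, keep each user's local top-k.
    val partial: Seq[(Int, Seq[(Int, Double)])] = for {
      ub <- userBlocks
      pb <- productBlocks
      (u, uf) <- ub
    } yield {
      val scored = pb.map { case (p, pf) => (p, dot(uf, pf)) }
      (u, scored.sortBy(-_._2).take(k))
    }

    // Merge the per-block candidates and reduce them to a global top-k per user.
    partial.groupBy(_._1).map { case (u, parts) =>
      u -> parts.flatMap(_._2).sortBy(-_._2).take(k)
    }
  }
}
```

Because each block pair contributes at most k candidates per user, the merge step handles roughly k times the number of product blocks per user rather than the full product set. Every user-product pair is still scored once, which is why this remains expensive when both sides are large.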