The ALS model contains RDDs, so you cannot call `model.recommendProducts`
inside an RDD closure such as `userProductsRDD.map`. -Xiangrui
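
A minimal sketch of two driver-side alternatives, assuming either that the
bulk API covers the use case or that the set of distinct users is small enough
to collect. The object and helper names (`DriverSidePredictions`,
`predictPairs`, `recommendOnDriver`) are hypothetical and not part of MLlib;
`predict(RDD[(Int, Int)])` and `recommendProducts(user, num)` are the existing
model methods mentioned in this thread.

```scala
import org.apache.spark.mllib.recommendation.{MatrixFactorizationModel, Rating}
import org.apache.spark.rdd.RDD

// Hypothetical helpers: both call the model from the driver only,
// never from inside a closure that runs on the executors.
object DriverSidePredictions {

  // Bulk API: the model is invoked once on the driver and Spark runs the
  // scoring of all (user, product) pairs as an ordinary distributed job.
  def predictPairs(model: MatrixFactorizationModel,
                   userProducts: RDD[(Int, Int)]): Array[Rating] =
    model.predict(userProducts).collect()

  // Per-user API: collect the distinct users to the driver first, then call
  // recommendProducts in a local loop. Each call launches its own Spark jobs,
  // so this is only reasonable for a small number of users.
  def recommendOnDriver(model: MatrixFactorizationModel,
                        userProducts: RDD[(Int, Int)],
                        num: Int): Map[Int, Array[Rating]] = {
    val users = userProducts.map(_._1).distinct().collect()
    users.map(user => user -> model.recommendProducts(user, num)).toMap
  }
}
```

In the test quoted below, replacing the `userProductsRDD.map { ... }` block
with something like `predictPairs(model, userProductsRDD)` would keep all model
access on the driver.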

On Thu, Nov 6, 2014 at 4:39 PM, Debasish Das <debasish.da...@gmail.com> wrote:
> I reproduced the problem in the MLlib test suite ALSSuite.scala using the
> following code:
>
>         val arrayPredict = userProductsRDD.map { case (user, product) =>
>           val recommendedProducts = model.recommendProducts(user, products)
>           val productScore = recommendedProducts.find { x => x.product == product }
>           require(productScore != None)
>           productScore.get
>         }.collect
>
>         arrayPredict.foreach { elem =>
>           if (allRatings.get(elem.user, elem.product) != elem.rating)
>             fail("Prediction APIs don't match")
>         }
>
> If the usage of model.recommendProducts is correct, the test fails with the
> same error I sent before...
>
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
> stage 316.0 failed 1 times, most recent failure: Lost task 0.0 in stage
> 316.0 (TID 79, localhost): scala.MatchError: null
>
> org.apache.spark.rdd.PairRDDFunctions.lookup(PairRDDFunctions.scala:825)
> org.apache.spark.mllib.recommendation.MatrixFactorizationModel.recommendProducts(MatrixFactorizationModel.scala:81)
>
> It is a blocker for me and I am debugging it. I will open up a JIRA if this
> is indeed a bug...
>
> Do I have to cache the model to make userFeatures.lookup(user).head work?
>
>
> On Mon, Nov 3, 2014 at 9:24 PM, Xiangrui Meng <men...@gmail.com> wrote:
>>
>> Was "user" present in the training data? We can put a check there and
>> return NaN if the user is not included in the model. -Xiangrui
>>
>> On Mon, Nov 3, 2014 at 5:25 PM, Debasish Das <debasish.da...@gmail.com>
>> wrote:
>> > Hi,
>> >
>> > I am testing MatrixFactorizationModel.predict(user: Int, product: Int),
>> > but the code fails on userFeatures.lookup(user).head
>> >
>> > computeRmse calls MatrixFactorizationModel.predict(RDD[(Int, Int)]), and
>> > all the test cases use that API...
>> >
>> > I can perhaps refactor my code to do the same, but I was wondering
>> > whether people test the lookup(user) version of the code.
>> >
>> > Do I need to cache the model to make it work? I think the default
>> > storage level right now is MEMORY_AND_DISK...
>> >
>> > Thanks.
>> > Deb
>
>
