Hello Jake,
my first piece of advice would be to use the RecommenderJob from the
current trunk; unfortunately, the 0.4 version has a serious bug.
Your toy data is too small to produce any output; let me explain why.
The first thing RecommenderJob does is compute all pairs of similar
items (all pairs of items that co-occurred within the preferences of a
single user):
10,20
10,30
10,40
30,40
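To make the origin of those pairs concrete: user 1's preferences
{10, 20} contribute the pair 10,20, user 2's {10, 30, 40} contribute
10,30, 10,40 and 30,40, and user 3's {10, 20} contribute 10,20 again.
A tiny standalone Java sketch (not the actual Mahout code, just an
illustration of the per-user pair generation) would look like this:

  import java.util.*;

  public class CooccurrencePairs {
    public static void main(String[] args) {
      // toy data: userID -> items the user expressed a preference for
      Map<Integer, List<Integer>> prefs =
          new LinkedHashMap<Integer, List<Integer>>();
      prefs.put(1, Arrays.asList(10, 20));
      prefs.put(2, Arrays.asList(10, 30, 40));
      prefs.put(3, Arrays.asList(10, 20));

      // every pair of items co-occurring within a single user's preferences
      Set<String> pairs = new LinkedHashSet<String>();
      for (List<Integer> items : prefs.values()) {
        for (int a = 0; a < items.size(); a++) {
          for (int b = a + 1; b < items.size(); b++) {
            pairs.add(items.get(a) + "," + items.get(b));
          }
        }
      }
      // prints 10,20 / 10,30 / 10,40 / 30,40
      for (String pair : pairs) {
        System.out.println(pair);
      }
    }
  }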
The next thing that happens is that RecommenderJob tries to predict how
much the users would like the items that might possibly be recommended
to them. To do this for a single (user, item) pair, we need to look at
all items similar to the "candidate" item that have also been liked by
the user. The formula used is a weighted sum defined like this:
u = a user
i = an item not yet rated by u
N = all items similar to i

Prediction(u,i) = sum(all n from N: similarity(i,n) * rating(u,n)) /
                  sum(all n from N: abs(similarity(i,n)))
This formula has one drawback: if we only know a single similar item,
the prediction will just be the "rating" value for that single similar
item. To avoid this, we throw out all predictions that were based on a
single item only.
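Roughly, the per-item computation (including that rule) looks like the
following simplified sketch. This is not the actual Mahout
implementation, and the names are made up for illustration:

  // simplified sketch of the weighted sum described above;
  // 'similarities' maps each item n similar to i to similarity(i,n),
  // 'ratings' holds the ratings of user u for the items u has rated
  static Double predict(Map<Long, Double> similarities,
                        Map<Long, Double> ratings) {
    double numerator = 0.0;
    double denominator = 0.0;
    int usedItems = 0;
    for (Map.Entry<Long, Double> entry : similarities.entrySet()) {
      Double rating = ratings.get(entry.getKey());
      if (rating != null) { // u has rated this similar item
        numerator += entry.getValue() * rating;
        denominator += Math.abs(entry.getValue());
        usedItems++;
      }
    }
    // predictions based on a single similar item only are thrown away
    return usedItems > 1 ? numerator / denominator : null;
  }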
Unfortunately, your toy data is so small that no prediction can be
based on more than one item, so everything is thrown away and the
output is empty.
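Take user 1 and candidate item 30 as an example: the items similar to
30 are 10 and 40 (from the pairs above), but user 1 has only expressed
a preference for 10, so that prediction rests on a single item and is
discarded. The same happens for every other (user, item) candidate in
your data.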
As you only have boolean data in your example (no ratings), you could
use --booleanData to make RecommenderJob treat the input as boolean (it
does not do this automatically). In that case you should see output,
because the problem described above doesn't exist for boolean data.
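If I remember the option parsing correctly, the flag takes a true/false
value, so the call would look something like this (your original
command with the flag appended):

  hadoop jar mahout-core-0.4-job.jar \
    org.apache.mahout.cf.taste.hadoop.item.RecommenderJob \
    -Dmapred.input.dir=/input/toy/toydata.txt \
    -Dmapred.output.dir=/output/toy01 \
    --booleanData true

(With the trunk version, use the job jar built from trunk instead of
mahout-core-0.4-job.jar.)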
--sebastian
On 12.03.2011 06:03, Jake Vang wrote:
Hi,
I am testing the RecommenderJob. According to the v0.4 javadocs, it
requires the format userID,itemID[,preferencevalue]. I have a very
simple input I want to test before running it on the real dataset. My
toy data set is as simple as the following lines.
1,10
1,20
2,10
2,30
2,40
3,10
3,20
(user 1 likes item 10, user 1 likes item 20, and so on).
I then run the job.
hadoop jar mahout-core-0.4-job.jar \
  org.apache.mahout.cf.taste.hadoop.item.RecommenderJob \
  -Dmapred.input.dir=/input/toy/toydata.txt \
  -Dmapred.output.dir=/output/toy01
However, when I look at the results (in part-r-0000), I see nothing.
The file is blank. Why is this happening?
I am running this on Cygwin. I can run the Hadoop examples correctly.
Is there something that I am doing wrong?