Hello everyone,
I'm new to Mahout, to recommender systems, and to this mailing list.

I'm trying to find a (fast) way to write preferences back to a file. I tried a 
few methods, but I'm sure there must be a better approach.
Here's the deal (the same question is posted on Stack Overflow [1]).
I have a training dataset of 800,000 records from 6,000 users rating 3,900 
movies. These are stored in a comma-separated file with lines of the form 
userId,movieId,preference. I have another dataset (200,000 records) in the 
format userId,movieId. My goal is to use the first dataset as a training set 
in order to estimate the missing preferences of the second set.

So far, I managed to load the training dataset and I generated user-based 
recommendations. This is pretty smooth and doesn't take too much time. But I'm 
struggling when it comes to writing back the recommendations.

The first method I tried is:

 * read a line from the file and get the userId,movieId tuple.
 * retrieve the calculated preference with estimatePreference(userId, movieId)
 * append the preference to the line and save it in a new file
This works, but it's incredibly slow. I added a counter that prints every 
10,000th iteration: after a couple of minutes it had printed only once. I have 
8 GB of RAM and a Core i7... how long can it take to process 200,000 lines?!
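
For reference, the loop described above looks roughly like the sketch below. 
It is not my exact code: the class name PreferenceWriter and the Estimator 
interface are hypothetical stand-ins so that the example runs without Mahout 
on the classpath. In the real program the estimator would be the recommender's 
estimatePreference(userId, movieId) call.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class PreferenceWriter {

    /** Hypothetical stand-in for recommender.estimatePreference(userId, movieId). */
    @FunctionalInterface
    public interface Estimator {
        float estimate(long userId, long movieId) throws Exception;
    }

    /**
     * Reads "userId,movieId" lines from input, appends the estimated
     * preference to each one, and writes the result to output.
     * Blank lines are skipped. Returns the number of lines written.
     */
    public static int appendPreferences(Path input, Path output, Estimator estimator)
            throws Exception {
        int count = 0;
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue;
                }
                String[] fields = line.split(",");
                long userId = Long.parseLong(fields[0].trim());
                long movieId = Long.parseLong(fields[1].trim());

                // One estimate per line; this is the call that the real
                // recommender would answer.
                float preference = estimator.estimate(userId, movieId);

                writer.write(line + "," + preference);
                writer.newLine();

                count++;
                if (count % 10_000 == 0) {
                    System.out.println(count + " lines processed");
                }
            }
        }
        return count;
    }
}
```

With Mahout, the call would presumably be something like 
PreferenceWriter.appendPreferences(in, out, recommender::estimatePreference). 
Both the reader and the writer are buffered, so my assumption is that the time 
is spent inside estimatePreference rather than in the file I/O, but I have not 
profiled it.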

My second attempt was:

 * create a new FileDataModel with the second dataset
 * do something like this: newDataModel.setPreference(userId, movieId, 
recommender.estimatePreference(userId, movieId));

Here I run into several problems:
 * At runtime I get java.lang.UnsupportedOperationException. As I found out 
in [2], FileDataModel actually can't be updated. I don't understand why the 
method setPreference exists in the first place...
 * The API documentation of FileDataModel#setPreference states "This method 
should also be considered relatively slow."

I have read that one solution would be to use delta files, but I couldn't find 
out what that actually means. Any suggestions on how I could speed up the 
process of writing the preferences?
Thank you!

Pier Lorenzo


[1] http://stackoverflow.com/questions/29423824/mahout-fast-performance-how-to-write-preferences-to-file
[2] http://comments.gmane.org/gmane.comp.apache.mahout.user/11330
