Hi,

We are developing a system that issues recommendations in real time based on 
data from a main data file (say, /tmp/data.lst) together with daily update 
files (/tmp/data.1.lst, /tmp/data.2.lst, etc.).  We call refresh() on the 
SlopeOne recommender when the daily files are updated.  We are concerned about 
recommendation performance while the daily update files are being loaded, and 
would appreciate any feedback on what to expect.

I've been looking through the Mahout code to determine whether Mahout can make 
recommendations while the (SlopeOne) recommender is being refreshed.

From what I can tell, the call to refresh() ends up in 
MemoryDiffStorage.buildAverageDiffs(), where the system acquires a write lock.
This would stall any calls to MemoryDiffStorage.getDiffs(), where the system 
acquires a read lock.
So, it looks to me like the MemoryDiffStorage is taking a locking-based 
approach, rather than a fill-and-swap approach.
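
To make the concern concrete, here is a minimal sketch of the locking pattern 
as I understand it.  LockedDiffStore and its methods are hypothetical stand-ins 
that I wrote for illustration; this is not Mahout's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a lock-guarded diff store; NOT Mahout's class.
class LockedDiffStore {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Double> diffs = new HashMap<>();

    // Rebuilds the map in place. The write lock is held for the whole
    // rebuild, so every reader waits until the rebuild has finished.
    void rebuild(Map<String, Double> fresh) {
        lock.writeLock().lock();
        try {
            diffs.clear();
            diffs.putAll(fresh);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers share the read lock, but block while a writer holds the lock.
    Double getDiff(String key) {
        lock.readLock().lock();
        try {
            return diffs.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Holds the write lock on the calling thread and reports whether a
    // reader on another thread could take the read lock at that moment.
    boolean readerCanEnterWhileWriteLockHeld() {
        lock.writeLock().lock();
        try {
            boolean[] acquired = new boolean[1];
            Thread reader = new Thread(() -> {
                acquired[0] = lock.readLock().tryLock();
                if (acquired[0]) {
                    lock.readLock().unlock();
                }
            });
            reader.start();
            reader.join();
            return acquired[0];
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

In this sketch the reader thread cannot take the read lock while the write 
lock is held, which is the stall I am worried about for the duration of a 
rebuild.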

On the other hand, FileDataModel has a reload() method containing:
                delegate = buildModel()
This looks like a fill-and-swap approach: the new model is built off to the 
side and then assigned in one step, which would allow the system to continue 
serving recommendations from the old model while the new one is being built.
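
For comparison, this is roughly what I mean by fill-and-swap.  It is only a 
sketch under my own assumptions: SwappingDiffStore, rebuild(), getDiff() and 
snapshot() are hypothetical names, not Mahout's API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration of fill-and-swap; NOT Mahout's actual code.
class SwappingDiffStore {
    // Readers always see one complete, immutable map through this reference.
    private final AtomicReference<Map<String, Double>> current =
        new AtomicReference<>(Collections.<String, Double>emptyMap());

    // Fill: build the new map off to the side, without touching the one
    // that readers are using. Swap: publish it with a single reference write.
    void rebuild(Map<String, Double> source) {
        Map<String, Double> fresh = new HashMap<>(source);
        current.set(Collections.unmodifiableMap(fresh));
    }

    // No lock is taken: a reader uses whichever map is current at the call.
    Double getDiff(String key) {
        return current.get().get(key);
    }

    // Returns the map that is current right now; later rebuilds do not
    // modify it.
    Map<String, Double> snapshot() {
        return current.get();
    }
}
```

The trade-off in this sketch is memory: the old and the new map both exist 
until the last reader lets go of the old one.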

Is this correct?  If so, should we be concerned about the locking of the 
MemoryDiffStorage?  Are there any workarounds?

Thanks in advance!

Regards,

Eric


