Hello - 

If this isn't the best forum to ask, please let me know.

TL;DR:
Is there a way to stream preference/user data to an EMR recommender workflow 
without re-uploading all of the preference data and starting a brand new job 
on every run?

I am trying to process large volumes of preference data using Amazon EMR. 
Uploading our entire preference set every time we run a job scales very 
poorly, since the vast majority of the preferences will never change.  The 
append files that Mahout can process seem perfect for this, but it doesn't 
appear that EMR supports them.
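
To make concrete what I mean by an append file, here is a rough Python sketch 
of the delta I would like to upload instead of the full set.  The function 
names are my own, and I am assuming the userID,itemID,value layout and that 
an empty value marks a removed preference, which is how I understand Mahout's 
FileDataModel update files to work.  Whether an EMR job would pick such a 
file up is exactly what I am unsure about.

```python
import csv
import io


def preference_delta(old, new):
    """Compare two {(user_id, item_id): value} snapshots.

    Returns sorted rows for added or changed preferences, plus rows with an
    empty value for preferences that were removed (assumed convention).
    """
    rows = []
    for (user_id, item_id), value in new.items():
        if old.get((user_id, item_id)) != value:
            rows.append((user_id, item_id, str(value)))
    for user_id, item_id in old.keys() - new.keys():
        rows.append((user_id, item_id, ""))
    return sorted(rows)


def to_csv(rows):
    """Serialize delta rows as userID,itemID,value lines."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    yesterday = {(1, 10): 3.0, (1, 11): 4.0, (2, 10): 5.0}
    today = {(1, 10): 3.0, (1, 11): 2.0, (3, 12): 1.0}
    print(to_csv(preference_delta(yesterday, today)), end="")
```

Only the changed, added, and removed rows would need to be uploaded, rather 
than the whole preference set.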

The brute-force method appears to be:
1) Upload the full preference set
2) Run the Recommender job
3) Download and process the results
4) Go to step 1

Does anyone have some general advice for processing recommendations in as 
real-time a manner as possible using EMR?

Thank you for any help or references you could provide.

Bryan Marble
