Hello - If this isn't the best forum to ask, please let me know.
TL;DR: Is there a way to stream preference/user data to an EMR recommender workflow without having to go through the pain of re-uploading all preference data, starting brand new jobs over and over, etc.?

I am trying to process large volumes of preference data using Amazon EMR. It seems extremely unscalable to upload our entire preference set every time we run a job, as the vast majority of the preferences will never change. It seems like the append files that Mahout can process would be perfect for this, but it doesn't appear that EMR supports them.

The brute force method appears to be:

1) Upload preference set
2) Run Recommender job
3) Download and process results
4) Go to step 1

Does anyone have some general advice for processing recommendations in as real-time a manner as possible using EMR?

Thank you for any help or references you could provide.

Bryan Marble