Hi,
I need a quick clarification that pertains to the RecommenderJob (RJ) and the 
ItemSimilarityJob (ISJ).
We are using the RJ and the ISJ to compute item-based recommendations and item 
similarities. We have an input dataset of about 100 million rows and would 
like to share the computed output of the intermediate steps between the two 
jobs to make the entire process faster. In short, the first two 
phases of the RJ and the ISJ are doing the same thing - running the 
PreparePreferenceMatrixJob and the RowSimilarityJob. So we want to run the RJ 
with endPhase = 1, then fork off to run the RJ (with startPhase = 2) and the 
ISJ (with startPhase = 2), both pointing at the same temp directory.
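For concreteness, here is a rough sketch of the three invocations we have in mind. It only builds the argument lists and does not call Mahout itself. The paths and the similarity measure are placeholders, and the class and method names are made up for illustration. We are assuming the usual --input, --output, --tempDir, --startPhase, --endPhase and --similarityClassname options.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SharedPhaseArgs {

    // Placeholder locations (assumptions); substitute the real HDFS paths.
    static final String INPUT = "/data/preferences";
    static final String TEMP_DIR = "/tmp/mahout-shared";

    /** Builds the arguments for one job run limited to a phase range (null = no limit). */
    static List<String> args(String output, Integer startPhase, Integer endPhase) {
        List<String> a = new ArrayList<>(Arrays.asList(
            "--input", INPUT,
            "--output", output,
            "--tempDir", TEMP_DIR,                          // same temp dir for every run
            "--similarityClassname", "SIMILARITY_COSINE")); // placeholder measure
        if (startPhase != null) {
            a.add("--startPhase");
            a.add(String.valueOf(startPhase));
        }
        if (endPhase != null) {
            a.add("--endPhase");
            a.add(String.valueOf(endPhase));
        }
        return a;
    }

    public static void main(String[] argv) {
        // Step 1: RJ, stopping after phase 1 (preparation and row similarity).
        System.out.println("RJ  prep: " + args("/out/rj", null, 1));
        // Step 2a: RJ again, resuming at phase 2.
        System.out.println("RJ  rest: " + args("/out/rj", 2, null));
        // Step 2b: ISJ, resuming at phase 2 against the same temp dir.
        System.out.println("ISJ rest: " + args("/out/isj", 2, null));
        // Each list would then be handed to the respective job, e.g. via
        // ToolRunner.run(new RecommenderJob(), list.toArray(new String[0])).
    }
}
```

Step 2b is the run that fails to find the shared data, because of the directory naming described next.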
While doing this, we noticed that there is a difference in the name of the 
"prepPath" temp directory used in RJ and ISJ. RJ calls it 
"preparePreferenceMatrix" and ISJ calls it "prepareRatingMatrix". Is there a 
reason why this is different? This makes it impossible to share the computed 
information between the two jobs. Are we overlooking something?
Any help would be appreciated.
Thanks,
Bala Rajagopal
