RE: JIRA issues 1248/1249

Saikat Kanjilal Thu, 09 Jan 2014 21:48:09 -0800

Some more clarifications:
http://www.hpl.hp.com/personal/Robert_Schreiber/papers/2008%20AAIM%20Netflix/netflix_aaim08(submitted).pdf
It seems like we can pretty much follow the strategy below:
Step 1 Initialize matrix M by assigning the average rating as the first row, and
small random numbers for the remaining entries.
Step 2 Fix M, solve U by minimizing the objective function (the sum of squared
errors);
Step 3 Fix U, solve M by minimizing the objective function similarly;
Step 4 Repeat Steps 2 and 3 until a stopping criterion is satisfied.
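
To make Steps 1 to 4 concrete, here is a minimal rank-1 sketch in plain Java. It is not Mahout code: the class and method names (Rank1AlsSketch, initItemFactors, solveUsers, solveItems, rmse) are made up for illustration, a rating of 0 is assumed to mean "missing", and the plain lambda penalty is a simplification that may differ from the regularization used in the paper. With a single latent feature, M has only one row, so Step 1 reduces to the per-item average rating and each least-squares solve becomes a scalar division.

```java
final class Rank1AlsSketch {

    /** Step 1 (rank 1): M's only row is the average observed rating per item. */
    static double[] initItemFactors(double[][] r) {
        double[] m = new double[r[0].length];
        for (int j = 0; j < m.length; j++) {
            double sum = 0.0;
            int count = 0;
            for (double[] row : r) {
                if (row[j] > 0) { sum += row[j]; count++; }
            }
            m[j] = count == 0 ? 0.0 : sum / count;
        }
        return m;
    }

    /** Step 2: fix M, solve every u_i by (ridge) least squares over observed ratings. */
    static double[] solveUsers(double[][] r, double[] m, double lambda) {
        double[] u = new double[r.length];
        for (int i = 0; i < u.length; i++) {
            double num = 0.0;
            double den = lambda;
            for (int j = 0; j < m.length; j++) {
                if (r[i][j] > 0) { num += m[j] * r[i][j]; den += m[j] * m[j]; }
            }
            u[i] = den == 0.0 ? 0.0 : num / den;
        }
        return u;
    }

    /** Step 3: fix U, solve every m_j the same way. */
    static double[] solveItems(double[][] r, double[] u, double lambda) {
        double[] m = new double[r[0].length];
        for (int j = 0; j < m.length; j++) {
            double num = 0.0;
            double den = lambda;
            for (int i = 0; i < u.length; i++) {
                if (r[i][j] > 0) { num += u[i] * r[i][j]; den += u[i] * u[i]; }
            }
            m[j] = den == 0.0 ? 0.0 : num / den;
        }
        return m;
    }

    /** Root mean squared error over the observed (non-zero) ratings only. */
    static double rmse(double[][] r, double[] u, double[] m) {
        double squared = 0.0;
        int n = 0;
        for (int i = 0; i < u.length; i++) {
            for (int j = 0; j < m.length; j++) {
                if (r[i][j] > 0) {
                    double err = r[i][j] - u[i] * m[j];
                    squared += err * err;
                    n++;
                }
            }
        }
        return n == 0 ? 0.0 : Math.sqrt(squared / n);
    }

    public static void main(String[] args) {
        // Toy ratings (hypothetical data): rows are users, columns are items.
        double[][] r = {{5, 3, 0}, {4, 0, 1}, {0, 2, 1}};
        double lambda = 0.05; // assumed value, not taken from the paper
        double[] m = initItemFactors(r);
        // Step 4: here simply a fixed number of sweeps.
        for (int sweep = 0; sweep < 5; sweep++) {
            double[] u = solveUsers(r, m, lambda);
            m = solveItems(r, u, lambda);
            System.out.println("sweep " + sweep + " training RMSE " + rmse(r, u, m));
        }
    }
}
```

In the real job, U and M have more than one feature, so each of these scalar divisions would become a small linear solve per user or item.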


The stopping criterion in this case is that the training RMSE, after 
minimizing the objective function, falls within the specified limit. 
Again, the RMSE limit would be specified as a configuration parameter, replacing 
the number of iterations.
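
As a rough sketch of that stopping rule, the loop below keeps running sweeps while the RMSE is still above the limit. Everything here is hypothetical (RmseStoppingSketch, runUntilConverged, rmseThreshold, maxIterations are illustration names, not Mahout API), and the sweep is passed in as a function that performs Steps 2 and 3 and returns the resulting RMSE. The extra maxIterations cap is my own assumption: a given RMSE limit may never be reached, so without a cap the loop might not terminate.

```java
import java.util.function.DoubleSupplier;

final class RmseStoppingSketch {

    /** Outcome of the loop: sweeps executed, last RMSE seen, and whether the limit was met. */
    static final class Result {
        final int iterations;
        final double rmse;
        final boolean converged;

        Result(int iterations, double rmse, boolean converged) {
            this.iterations = iterations;
            this.rmse = rmse;
            this.converged = converged;
        }
    }

    /**
     * Runs sweeps while the RMSE is above rmseThreshold.
     * maxIterations is a safety cap (an assumption of this sketch).
     */
    static Result runUntilConverged(DoubleSupplier sweep, double rmseThreshold, int maxIterations) {
        int iterations = 0;
        double rmse = Double.POSITIVE_INFINITY;
        while (rmse > rmseThreshold && iterations < maxIterations) {
            rmse = sweep.getAsDouble(); // one pass of "recompute U, recompute M"
            iterations++;
        }
        return new Result(iterations, rmse, rmse <= rmseThreshold);
    }

    public static void main(String[] args) {
        // Stand-in for a real ALS sweep: the error simply halves on every call.
        double[] error = {1.0};
        Result result = runUntilConverged(() -> error[0] /= 2, 0.1, 50);
        System.out.println(result.iterations + " sweeps, RMSE " + result.rmse
            + ", converged=" + result.converged);
    }
}
```

Note that the loop condition is "error greater than the limit", since we want to continue only while we have not yet converged.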
Sebastian et al., I would love to get some feedback on my approach.
> From: [email protected]
> To: [email protected]
> Subject: RE: JIRA issues 1248/1249
> Date: Wed, 8 Jan 2014 20:17:04 -0800
> 
> I read through 1249 and had some initial questions before coming up with a 
> plan. I was looking through ParallelALSFactorizationJob.java and am 
> assuming this is the right place to make all the changes. To this end:
> 1) I was thinking of introducing the convergence training error as a 
> configuration parameter to replace the number of iterations.
> 2) For the chunk of code below:
> for (int currentIteration = 0; currentIteration < numIterations; currentIteration++) {
>   /* broadcast M, read A row-wise, recompute U row-wise */
>   log.info("Recomputing U (iteration {}/{})", currentIteration, numIterations);
>   runSolver(pathToUserRatings(), pathToU(currentIteration), pathToM(currentIteration - 1),
>       currentIteration, "U", numItems);
>   /* broadcast U, read A' row-wise, recompute M row-wise */
>   log.info("Recomputing M (iteration {}/{})", currentIteration, numIterations);
>   runSolver(pathToItemRatings(), pathToM(currentIteration), pathToU(currentIteration),
>       currentIteration, "M", numUsers);
> }
> 
> I am proposing we have a while loop similar to the following:
> int currentIteration = 0;
> /* keep iterating while the training error is still above the convergence threshold */
> while (currentTrainingError > specifiedTrainingErrorForConvergence) {
>   /* broadcast M, read A row-wise, recompute U row-wise */
>   log.info("Recomputing U (iteration {})", currentIteration);
>   runSolver(pathToUserRatings(), pathToU(currentIteration), pathToM(currentIteration - 1),
>       currentIteration, "U", numItems);
>   /* broadcast U, read A' row-wise, recompute M row-wise */
>   log.info("Recomputing M (iteration {})", currentIteration);
>   runSolver(pathToItemRatings(), pathToM(currentIteration), pathToU(currentIteration),
>       currentIteration, "M", numUsers);
>   /* currentTrainingError still has to be recomputed here; how to do that is the open question */
>   currentIteration++;
> }
> However, I am wondering where or how I would compute the training error each 
> time. Would that happen inside runSolver, or be an artifact of performing the 
> solver computation? Pardon my ignorance on this. I also wanted to get deeper 
> insight into ALS. Is the following the best paper to read?
> http://www.hpl.hp.com/personal/Robert_Schreiber/papers/2008%20AAIM%20Netflix/netflix_aaim08(submitted).pdf
> Specifically, I am trying to understand where the training error comes into 
> play within the SVD computation.
> I would really appreciate some more insight as I explore and dig through the 
> code.
> Regards
> 
> > Date: Tue, 7 Jan 2014 09:11:17 +0100
> > From: [email protected]
> > To: [email protected]
> > Subject: Re: JIRA issues 1248/1249
> > 
> > Hi Saikat,
> > 
> > I suggest starting with 1249, which is the easier task. The best way to
> > proceed is by discussing on the mailing list. Have a look at the issue,
> > propose a solution here and wait for our feedback.
> > 
> > Best,
> > Sebastian
> > 
> > On 07.01.2014 04:27, Saikat Kanjilal wrote:
> > > Sebastian et al,
> > > After months of not having bandwidth to help out with coding tasks, I am 
> > > finally ready to help with the implementation of the above JIRA issues. 
> > > Before I begin, I wanted to make sure these improvements are still needed 
> > > for ALS; I am targeting to finish these by the 1.0 release. Also, if these 
> > > are relevant, should I just present a design/plan of implementation? I'd 
> > > love some initial guidance and thoughts around these tasks; feel free to 
> > > add them to the tickets themselves.
> > > Thanks in advance.
> > > 
> > 
>
