Re: Mathematical background of ALS recommenders

2013-03-25 Thread Sean Owen
Points from across several e-mails -- The initial item-feature matrix can be just random unit vectors too. I have slightly better results with that. You are finding the least-squares solution of A = U M' for U given A and M. Yes you can derive that analytically as the zero of the derivative of
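The least-squares update Sean describes can be sketched in a few lines of numpy: setting the derivative of the regularized squared error to zero gives the normal equations U = A M (MᵀM + λI)⁻¹. This is a minimal illustrative sketch, not Mahout's implementation; the matrix sizes, the dense ratings matrix, and the λ value are all hypothetical, and M is initialized with random unit rows as suggested above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 50, 40, 5

A = rng.random((n_users, n_items))             # toy ratings matrix (dense for simplicity)
M = rng.normal(size=(n_items, k))
M /= np.linalg.norm(M, axis=1, keepdims=True)  # random unit-vector initialization of item features

lam = 0.1  # hypothetical regularization strength

# Least-squares solution of A ~= U M' for U, obtained by setting the
# gradient of 0.5*||A - U M'||^2 + 0.5*lam*||U||^2 to zero:
#   U = A M (M'M + lam I)^{-1}
U = A @ M @ np.linalg.inv(M.T @ M + lam * np.eye(k))
```

The gradient (U Mᵀ − A) M + λU is exactly zero at this U, which is the "zero of the derivative" being referred to.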

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Ted Dunning
Even more in-line. On Mon, Mar 25, 2013 at 11:46 AM, Sean Owen sro...@gmail.com wrote: Points from across several e-mails -- The initial item-feature matrix can be just random unit vectors too. I have slightly better results with that. You are finding the least-squares solution of A = U M'

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Ted Dunning
Well, actually, you can. LSI does exactly that. The effect of doing this is not clear to me. Do you know what happens if you assume missing values are 0? On Mon, Mar 25, 2013 at 12:10 PM, Sebastian Schelter s...@apache.org wrote: I think one crucial point is missing from this

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Sean Owen
OK, the 'k iterations' happen inline in one job? I thought the Lanczos algorithm found the k eigenvalues/vectors one after the other. Yeah I suppose that doesn't literally mean k map/reduce jobs. Yes the broader idea was whether or not you might get something useful out of ALS earlier. On Mon,

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Sebastian Schelter
Well, in LSI it is OK to do that, as a missing entry means that the document contains zero occurrences of a given term, which is totally fine. In Collaborative Filtering with explicit feedback, a missing rating is not automatically a rating of zero, it is simply unknown what the user would give as

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Sean Owen
On Mon, Mar 25, 2013 at 11:25 AM, Sebastian Schelter s...@apache.org wrote: Well in LSI it is ok to do that, as a missing entry means that the document contains zero occurrences of a given term which is totally fine. In Collaborative Filtering with explicit feedback, a missing rating is not

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Koobas
On Mon, Mar 25, 2013 at 7:48 AM, Sean Owen sro...@gmail.com wrote: On Mon, Mar 25, 2013 at 11:25 AM, Sebastian Schelter s...@apache.org wrote: Well in LSI it is ok to do that, as a missing entry means that the document contains zero occurrences of a given term which is totally fine. In

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Sean Owen
(The unobserved entries are still in the loss function, just with low weight. They are also in the system of equations you are solving for.) On Mon, Mar 25, 2013 at 1:38 PM, Dmitriy Lyubimov dlie...@gmail.com wrote: Classic ALS-WR is bypassing the underlearning problem by cutting out unrated
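The weighting Sean describes is the implicit-feedback formulation (Hu/Koren/Volinsky style): every cell, observed or not, enters the per-user normal equations, but observed cells get a much larger confidence weight. A minimal sketch for a single user's solve follows; the interaction counts, α, and λ are made-up values for illustration, not anything from Mahout's code.

```python
import numpy as np

rng = np.random.default_rng(1)
n_items, k = 6, 3
M = rng.normal(size=(n_items, k))                # current item-feature matrix

r_u = np.array([3.0, 0.0, 1.0, 0.0, 0.0, 2.0])   # one user's implicit counts (hypothetical)
alpha, lam = 40.0, 0.1                           # hypothetical confidence scale / regularization

c = 1.0 + alpha * r_u          # confidence: unobserved cells keep weight 1, observed cells much more
p = (r_u > 0).astype(float)    # binary preference: reconstruct 1 where observed, 0 elsewhere

# Every cell appears in the weighted normal equations for this user:
#   (M' C M + lam I) u = M' C p,  with C = diag(c)
A_u = M.T @ (c[:, None] * M) + lam * np.eye(k)
b_u = M.T @ (c * p)
u = np.linalg.solve(A_u, b_u)
```

At the solution, the gradient of Σᵢ cᵢ(pᵢ − mᵢᵀu)² + λ‖u‖² vanishes, which is what "still in the loss function, just with low weight" means concretely.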

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Dmitriy Lyubimov
On Mar 25, 2013 6:38 AM, Dmitriy Lyubimov dlie...@gmail.com wrote: On Mar 25, 2013 4:15 AM, Ted Dunning ted.dunn...@gmail.com wrote: Well, actually, you can. LSI does exactly that. What the effect is of doing this is not clear to me. Do you know what happens if you assume missing

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Dmitriy Lyubimov
On Mar 25, 2013 6:44 AM, Sean Owen sro...@gmail.com wrote: (The unobserved entries are still in the loss function, just with low weight. They are also in the system of equations you are solving for.) Not in the classic ALS-WR paper I was specifically referring to. It actually uses minors of

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Sean Owen
On Mon, Mar 25, 2013 at 1:41 PM, Koobas koo...@gmail.com wrote: But the assumption works nicely for click-like data. Better still when you can weakly prefer to reconstruct the 0 for missing observations and much more strongly prefer to reconstruct the 1 for observed data. This does seem

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Sebastian Schelter
As clarification, here are the relevant papers. The approach for explicit feedback [1] does not use unobserved cells; only the approach for handling implicit feedback [2] does, but weighs them down. /s [1] Large-scale Parallel Collaborative Filtering for the Netflix Prize

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Koobas
On Mon, Mar 25, 2013 at 9:52 AM, Sean Owen sro...@gmail.com wrote: On Mon, Mar 25, 2013 at 1:41 PM, Koobas koo...@gmail.com wrote: But the assumption works nicely for click-like data. Better still when you can weakly prefer to reconstruct the 0 for missing observations and much more

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Sean Owen
If your input is clicks, carts, etc. yes you ought to get generally good results from something meant to consume implicit feedback, like ALS (for implicit feedback, yes there are at least two main variants). I think you are talking about the implicit version since you mention 0/1. lambda is the

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Koobas
On Mon, Mar 25, 2013 at 10:43 AM, Sean Owen sro...@gmail.com wrote: If your input is clicks, carts, etc. yes you ought to get generally good results from something meant to consume implicit feedback, like ALS (for implicit feedback, yes there are at least two main variants). I think you are

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Ted Dunning
Yes. But SSVD != Lanczos. Lanczos is vector-at-a-time sequential, like you said. SSVD does all the vectors in one go. That one go requires a few steps, but does not require O(k) iterations. On Mon, Mar 25, 2013 at 12:16 PM, Sean Owen sro...@gmail.com wrote: OK, the 'k iterations' happen
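The "all the vectors in one go" structure of SSVD (stochastic/randomized SVD) can be sketched compactly: one random projection captures the range of A, a QR factorization orthonormalizes it, and a small dense SVD finishes the job. This is a generic randomized-SVD sketch, not Mahout's SSVD implementation; the oversampling `p` and power-iteration count `q` are typical but arbitrary choices.

```python
import numpy as np

def ssvd_sketch(A, k, p=5, q=1, seed=2):
    """Randomized SVD: recover all k singular vectors together from one
    random sketch plus a few passes, unlike Lanczos, which builds its
    Krylov basis one vector at a time."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Y = A @ rng.normal(size=(n, k + p))   # sketch the range of A
    for _ in range(q):                    # optional power iterations sharpen the sketch
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                # orthonormal basis for the sketched range
    B = Q.T @ A                           # small (k+p) x n problem
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]
```

On an exactly rank-k matrix the sketch captures the range almost surely, so the reconstruction is exact up to round-off.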

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Abhijith CHandraprabhu
Sorry, I actually meant svds (sparse SVD). I think in Mahout they use Lanczos also. On Mon, Mar 25, 2013 at 4:25 PM, Ted Dunning ted.dunn...@gmail.com wrote: Yes. But SSVD != Lanczos. Lanczos is vector at at time sequential like you said. SSVD does all the vectors in one go. That one go

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Ted Dunning
No. We don't. We used to use Lanczos, but that has improved. On Mon, Mar 25, 2013 at 4:43 PM, Abhijith CHandraprabhu abhiji...@gmail.com wrote: Sorry, I actually meant svds(sparse SVD). I think in mahout they use Lanczos also. On Mon, Mar 25, 2013 at 4:25 PM, Ted Dunning

Mathematical background of ALS recommenders

2013-03-24 Thread Dominik Huebner
It's quite hard for me to get the mathematical concepts of the ALS recommenders. It would be great if someone could help me to figure out the details. This is my current status: 1. The item-feature (M) matrix is initialized using the average ratings and random values (explicit case) 2. The
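Step 1 as described, initializing the item-feature matrix M from average ratings plus random values, matches the ALS-WR recipe: the first feature of each item is its average rating, and the remaining entries are small random numbers. A toy sketch under those assumptions (the dense toy ratings matrix and sizes are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n_users, n_items, k = 20, 10, 4
R = rng.integers(1, 6, size=(n_users, n_items)).astype(float)  # toy explicit ratings, 1..5

# Explicit-feedback initialization: first feature of each item is its
# average rating; remaining features are small random values.
M = np.empty((n_items, k))
M[:, 0] = R.mean(axis=0)
M[:, 1:] = rng.random((n_items, k - 1))
```

As noted elsewhere in the thread, pure random unit vectors work as well, and reportedly slightly better in some experiments.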

Re: Mathematical background of ALS recommenders

2013-03-24 Thread Koobas
You managed to ask the question in a convoluted way. I am not sure if you don't understand the principles or the intricacies. The high level answer is the following. ALS is just like SVD, only it is not SVD. It produces a low rank approximation of the user-item matrix. I.e., represents the

Re: Mathematical background of ALS recommenders

2013-03-24 Thread Ted Dunning
If you have an observation matrix that actually is low rank and you start with the exact value of U, then one iteration on M will suffice to solve the system. Likewise, if you have M, one iteration on U will suffice. This holds regardless of the number of features of M or U (which must match,
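Ted's point is easy to verify numerically: if A genuinely has rank k and you already hold the exact U, a single least-squares solve for M reproduces A exactly. A small sketch with made-up sizes (an unregularized solve, so the one-step recovery is exact):

```python
import numpy as np

rng = np.random.default_rng(4)
n_users, n_items, k = 30, 20, 4
U_true = rng.normal(size=(n_users, k))
M_true = rng.normal(size=(n_items, k))
A = U_true @ M_true.T                  # observation matrix of exact rank k

# Starting from the exact U, one least-squares solve of U X = A
# for X = M' recovers the factorization in a single iteration.
M = np.linalg.lstsq(U_true, A, rcond=None)[0].T
```

With noisy or higher-rank observations this no longer closes in one step, which is why ALS alternates.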