A is the user x item history matrix.  Each row is a user history.
>
> A' is the transposed user x item matrix which is of the shape item x user.
>
> A' A is the user-level item cooccurrence matrix and has the shape item x
> item.
>

Then (A' A)ij is a similarity weight between ith and jth items.
if Aij   is the "rating of ith user for jth item",  the highest value of
"ith row of A' A" is the most similar item for "ith item".
if the values in A are binary, then (A' A)ij   is  number of users who have
rated/clicked/viewed  both item i and item j.

For this multiplication technique, (A' A)ij is computed by :
(A' i) x (A' j)'  ( A' i   and  A' j  are row vectors ).

So we have 3 algorithms to compute the similarity between two items
X : row vector A'i
Y : row vector A'j



1. basic matrix approach:

    weight:=0;
    for each user u
        if   u rated both X and Y weight+= rating(u,X)*rating(u,Y);


2. Taste's implementation was:

    weight:=0;
    for each user u
        if   rating(u,X) != 0 and rating(u,Y) != 0
updateWeightWithChosenSimilarityMetric( );


3. After the change:
   a. Matrix approach becomes:

       userSet=Intersect ( nonZeroElements(A' i), nonZeroElements(A' j) );
       weight:=0;
       for each user u in userSet
           weight+= rating(u,X)*rating(u,Y);


   b: Taste's implementation becomes:

       userSet=Intersect ( nonZeroElements(A' i), nonZeroElements(A' j) );
       weight:=0;
       for each user u in userSet
           updateWeightWithChosenSimilarityMetric( );


I think 3rd version is already benefiting from sparsity of data; obviously,
as the data set gets sparser and/or # of users increases, that trick will
speed up the algorithm more. So, doesn't it mean 3rd is an algorithm that
uses sparse data structures that you mentioned?


"Love spurned effect":
I totally agree, people don't give negative feedback to items that are very
dissimilar to items they like, they don't even look at such items.
I think the reason is, they give negative feedback to "false positive"
recommendations.


-- 
Gökhan Çapan

Reply via email to