[ 
https://issues.apache.org/jira/browse/MAHOUT-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964476#comment-13964476
 ] 

Pat Ferrel commented on MAHOUT-1422:
------------------------------------

I understand running XRSJ on two matrices with identical row and column 
spaces, and I also understand running a self-RSJ on the adjoint of A. XRSJ 
makes plenty of sense for multiple actions on the same items, which is the 
case it's used for in the solr-recommender example.

But how do you interpret an [A_1' A_2] calculated using XRSJ when the column 
spaces are different? The multiply works only because the ordinal values of 
the columns are used, but what those columns identify in the real world 
doesn't match. Maybe cosine is a bad example, but what does it mean to compare 
the angle between row 1 of A_1 and row 1 of A_2? The ordinal Mahout IDs will 
give a result, but the actual dimensions are non-intersecting; the spaces are 
completely different. I guess I can see that LLR and cooccurrence would make 
some sense, but do the other metrics work?
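
To make the concern concrete, here is a minimal sketch in plain Python (not 
Mahout code; the matrices, the helper names, and the relabeling order are all 
made up for illustration). It assumes rows are shared users and that the 
columns of A_1 and A_2 come from unrelated ID spaces, so the ordinal assigned 
to each A_2 column is arbitrary. Relabeling those columns changes the 
row-to-row cosine, while the cooccurrence product A_1' A_2 only has its 
columns relabeled in the same way.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def permute_columns(matrix, order):
    """Relabel columns: new column k is old column order[k]."""
    return [[row[j] for j in order] for row in matrix]

def cross_cooccurrence(a1, a2):
    """A_1' A_2: entry (i, j) sums a1[r][i] * a2[r][j] over shared rows r."""
    n_rows = len(a1)
    return [[sum(a1[r][i] * a2[r][j] for r in range(n_rows))
             for j in range(len(a2[0]))]
            for i in range(len(a1[0]))]

# Hypothetical data: two users (rows); the columns of A1 and A2 identify
# different kinds of things, so their ordinals have nothing in common.
A1 = [[1, 0, 1],
      [0, 1, 0]]
A2 = [[1, 0, 0],
      [0, 1, 1]]

# An equally valid (arbitrary) assignment of ordinals to A2's columns.
order = [2, 0, 1]
A2_relabeled = permute_columns(A2, order)

# Row-vs-row cosine depends on which ordinals happen to line up.
before = cosine(A1[0], A2[0])
after = cosine(A1[0], A2_relabeled[0])
print(before, after)

# Cooccurrence is stable: relabeling A2's columns only relabels the
# columns of the result in the same way.
C = cross_cooccurrence(A1, A2)
C_relabeled = cross_cooccurrence(A1, A2_relabeled)
print(C_relabeled == permute_columns(C, order))
```

In this toy case the cosine between row 1 of A_1 and row 1 of A_2 depends 
entirely on the arbitrary ordinal assignment, which is the interpretation 
problem raised above. The sketch says nothing about how LLR behaves.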

> Make a version of RSJ that uses two inputs
> ------------------------------------------
>
>                 Key: MAHOUT-1422
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1422
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 1.0
>         Environment: mapreduce
>            Reporter: Pat Ferrel
>              Labels: recommender, similarity
>             Fix For: 1.0
>
>
> Currently the RowSimilarityJob uses a similarity measure to pairwise compare 
> all rows in a DistributedRowMatrix.
> For many applications including a cross-action recommender we need something 
> like RSJ that takes two DRMs and compares matching rows of each.  The output 
> would be the same form as RSJ, and ideally would allow the use of any 
> similarity type already defined--especially LLR.
> There are two implementations of a Cross-Recommender, one based on the Mahout 
> RecommenderJob and another based on Solr, that can immediately benefit from 
> a Cross-RSJ. 
> A modification of the matrix multiply job may be a place to start, since the 
> current RSJ seems to rely heavily on self-similarity.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
