[jira] [Commented] (SPARK-4675) Find similar products and similar users in MatrixFactorizationModel

2015-05-16 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546995#comment-14546995
 ] 

Apache Spark commented on SPARK-4675:
-

User 'debasish83' has created a pull request for this issue:
https://github.com/apache/spark/pull/6213

 Find similar products and similar users in MatrixFactorizationModel
 ---

 Key: SPARK-4675
 URL: https://issues.apache.org/jira/browse/SPARK-4675
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: Steven Bourke
Priority: Trivial
  Labels: mllib, recommender

 Using the latent feature space that is learnt in MatrixFactorizationModel, I 
 have added 2 new functions to find similar products and similar users. A user 
 of the API can for example pass a product ID, and get the closest products. 
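A minimal sketch of the lookup described above, assuming the latent factors
are available as a plain mapping from product ID to a dense vector and that
cosine similarity is the closeness measure. The names `similar_products` and
`product_features`, and the sample factors, are hypothetical and are not the
API added in the pull request.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length dense vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_products(product_features, product_id, k):
    """Return the k products closest to product_id as (id, similarity) pairs."""
    query = product_features[product_id]
    scored = [
        (other_id, cosine(query, vec))
        for other_id, vec in product_features.items()
        if other_id != product_id  # a product is trivially similar to itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical rank-2 latent factors keyed by product ID.
product_features = {
    1: [1.0, 0.0],
    2: [0.9, 0.1],
    3: [0.0, 1.0],
    4: [-1.0, 0.0],
}

top = similar_products(product_features, 1, 2)
for other_id, score in top:
    print(other_id, round(score, 4))
```

The same function would serve for similar users if it were given the user
factors instead of the product factors.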



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4675) Find similar products and similar users in MatrixFactorizationModel

2014-12-11 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242808#comment-14242808
 ] 

Sean Owen commented on SPARK-4675:
--

The lower dimensional space is of course smaller. This makes it faster and more 
efficient to work with, which is an advantage to be sure at scale. But the real 
reason is that the original high-dimensional space is extremely sparse. 
Standard similarity measures are undefined for most pairs, or are 0. It's sort 
of a symptom of the curse of dimensionality. 
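The sparsity point can be illustrated with a small sketch. It assumes each
product is represented by a sparse vector of the ratings it received, stored
as a `{user: rating}` dict; the function name and the sample ratings are
hypothetical.

```python
import math


def sparse_cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {index: value} dicts.

    Returns None when either vector has zero norm, where the measure is undefined.
    """
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return None
    # Only indices present in both vectors contribute to the dot product.
    dot = sum(value * b[i] for i, value in a.items() if i in b)
    return dot / (norm_a * norm_b)


# Hypothetical ratings: product -> {user: rating}.
ratings = {
    "p1": {"u1": 5.0, "u2": 3.0},
    "p2": {"u3": 4.0, "u4": 5.0},  # shares no raters with p1
    "p3": {"u2": 4.0, "u5": 1.0},  # shares one rater with p1
    "p4": {},                      # never rated
}

print(sparse_cosine(ratings["p1"], ratings["p2"]))  # no shared raters
print(sparse_cosine(ratings["p1"], ratings["p3"]))  # one shared rater
print(sparse_cosine(ratings["p1"], ratings["p4"]))  # undefined
```

With many products and few ratings per user, most product pairs look like
`p1` and `p2`: they share no raters, so the similarity carries no information.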




[jira] [Commented] (SPARK-4675) Find similar products and similar users in MatrixFactorizationModel

2014-12-11 Thread Debasish Das (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243026#comment-14243026
 ] 

Debasish Das commented on SPARK-4675:
-

Is there a metric, such as MAP or AUC, that can help us validate 
similarUsers and similarProducts?

Right now, if I run column similarities with sparse vectors on matrix 
factorization datasets to get product similarities, it treats every unvisited 
entry (which should be ?, i.e. unknown) as 0 and computes the column 
similarities on that basis. If the sparse vector should hold ? in place of 0, 
then basically every similarity calculation is incorrect, so in that sense it 
makes more sense to compute the similarities on the matrix factors.

But then we are back to a map-reduce calculation of rowSimilarities.
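The zero-versus-unknown concern can be made concrete with a small sketch. It
compares cosine similarity with missing ratings treated as 0 against cosine
restricted to users who rated both products. Both functions and the sample
ratings are hypothetical, and neither is claimed to be the right measure.

```python
import math


def cosine_zero_filled(a, b):
    """Cosine over sparse {user: rating} dicts, treating missing ratings as 0."""
    dot = sum(v * b[u] for u, v in a.items() if u in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine_co_rated(a, b):
    """Cosine restricted to users who rated both products; None if there are none."""
    shared = [u for u in a if u in b]
    if not shared:
        return None
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = math.sqrt(sum(a[u] ** 2 for u in shared))
    norm_b = math.sqrt(sum(b[u] ** 2 for u in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else None


# Two products rated identically by u1 and u2; p2 also has three other raters.
p1 = {"u1": 5.0, "u2": 4.0}
p2 = {"u1": 5.0, "u2": 4.0, "u3": 5.0, "u4": 5.0, "u5": 5.0}

zero_filled = cosine_zero_filled(p1, p2)
co_rated = cosine_co_rated(p1, p2)
print(round(zero_filled, 4), round(co_rated, 4))
```

Here the zero-filled score is pulled down only because `p1` has fewer raters,
not because anyone rated the two products differently.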




[jira] [Commented] (SPARK-4675) Find similar products and similar users in MatrixFactorizationModel

2014-12-10 Thread Debasish Das (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241535#comment-14241535
 ] 

Debasish Das commented on SPARK-4675:
-

There are a few issues:

1. Batch API for topK similar users and topK similar products
2. Comparison of the product x product similarities generated with 
columnSimilarities against the topK similar products

I added batch APIs for topK product recommendation for each user and topK user 
recommendation for each product in SPARK-4231. A similar batch API would be 
very helpful for topK similar users and topK similar products.

I agree with cosine similarity. You should be able to re-use the column 
similarity calculations. I think a better idea is to add 
rowMatrix.similarRows and re-use that code to generate both product 
similarities and user similarities.
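A rough sketch of what a batch top-k over rows could look like, assuming the
factors fit in memory as a mapping from row ID to a dense vector. This is a
brute-force, single-machine illustration with a hypothetical function name; it
is not the proposed rowMatrix.similarRows and says nothing about how a
distributed version would be partitioned.

```python
import heapq
import math


def normalize(vec):
    """Scale a dense vector to unit length (all-zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k_similar_rows(rows, k):
    """For every row ID, return its k most cosine-similar other rows.

    Brute force: every pair of rows is compared, so the cost is O(n^2).
    """
    unit = {row_id: normalize(vec) for row_id, vec in rows.items()}
    result = {}
    for row_id, vec in unit.items():
        # On unit vectors the dot product equals the cosine similarity.
        candidates = (
            (sum(x * y for x, y in zip(vec, other)), other_id)
            for other_id, other in unit.items()
            if other_id != row_id
        )
        best = heapq.nlargest(k, candidates)
        result[row_id] = [(other_id, score) for score, other_id in best]
    return result


# Hypothetical rank-2 factors; the same call works for user or product factors.
factors = {1: [1.0, 0.0], 2: [0.9, 0.1], 3: [0.0, 1.0]}

neighbours = top_k_similar_rows(factors, 1)
for row_id, similar in neighbours.items():
    print(row_id, similar)
```

Because the function only sees rows of a matrix, the same code path covers
both similar users and similar products, which is the re-use argued for above.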

But my question is more about validation. We can compute product similarities 
on the raw features, and we can compute product similarities on the product 
factors from the matrix factorization. Which one is better?




[jira] [Commented] (SPARK-4675) Find similar products and similar users in MatrixFactorizationModel

2014-12-10 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241952#comment-14241952
 ] 

Joseph K. Bradley commented on SPARK-4675:
--

Just to make sure I understand your last question, are you asking, "Why 
compute product similarities using the low-dimensional space when we could do 
it in the high-dimensional space?" If so, then my understanding is that the 
low-dimensional space will give more meaningful similarities in general.




[jira] [Commented] (SPARK-4675) Find similar products and similar users in MatrixFactorizationModel

2014-12-10 Thread Debasish Das (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242034#comment-14242034
 ] 

Debasish Das commented on SPARK-4675:
-

[~josephkb] How do we validate that the low-dimensional space gives more 
meaningful similarities than the original feature space (which is sparse)?




[jira] [Commented] (SPARK-4675) Find similar products and similar users in MatrixFactorizationModel

2014-12-01 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229610#comment-14229610
 ] 

Apache Spark commented on SPARK-4675:
-

User 'sbourke' has created a pull request for this issue:
https://github.com/apache/spark/pull/3536
