How to add all combinations of items rated by user and difference between the ratings?

2015-03-28 Thread anishm
The input file has the format: userid, movieid, rating
From this file, I want to extract all possible combinations of movies and
the difference between their ratings for each user.

(movie1, movie2),(rating(movie1)-rating(movie2))

This should be computed for each user in the dataset. Finally, I would
like to find the average disagreement for each pair of movies.

(movie1, movie2), average difference between ratings

How do I do this in Python?

I wrote code for this with Hadoop Streaming, but I am having a really hard
time converting it to Spark-compatible code.
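
One possible way to express this with the RDD API is sketched below. It is
only an illustration under some assumptions: the fields are comma-separated,
each user rates a given movie at most once, and a pair is always ordered by
movie id so that (a, b) and (b, a) are never counted separately. The function
names are made up for this example. average_differences expects an RDD of raw
text lines (for instance from sc.textFile) and uses only standard RDD
operations.

```python
from itertools import combinations


def parse_line(line):
    """Parse 'userid,movieid,rating' into (userid, (movieid, rating))."""
    user, movie, rating = [field.strip() for field in line.split(",")]
    return user, (movie, float(rating))


def pair_differences(movie_ratings):
    """Yield ((movie1, movie2), rating1 - rating2) for each pair one user rated."""
    # Sorting fixes the order inside each pair across all users.
    for (m1, r1), (m2, r2) in combinations(sorted(movie_ratings), 2):
        yield (m1, m2), r1 - r2


def average_differences(lines):
    """lines: an RDD of text lines. Returns an RDD of ((movie1, movie2), mean diff)."""
    return (lines.map(parse_line)
                 .groupByKey()                                  # user -> its (movie, rating) pairs
                 .flatMap(lambda kv: pair_differences(kv[1]))   # one record per movie pair per user
                 .mapValues(lambda diff: (diff, 1))
                 .reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1]))
                 .mapValues(lambda total: total[0] / total[1]))
```

Note that groupByKey pulls all of one user's ratings onto a single worker, and
the number of pairs grows quadratically with the number of movies a user has
rated, so very active users may need to be capped or sampled.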



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-add-all-combinations-of-items-rated-by-user-and-difference-between-the-ratings-tp22268.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



How to augment data to existing MatrixFactorizationModel?

2015-02-26 Thread anishm
I am new to machine learning and to Apache Spark.
I have followed the tutorial at
https://databricks-training.s3.amazonaws.com/movie-recommendation-with-mllib.html#augmenting-matrix-factors
and was able to develop the application successfully. Today's web
applications need to be powered by real-time recommendations, so I would
like my model to be ready for new data that keeps arriving on the server.
The tutorial states:
*A better way to get the recommendations for you is training a matrix
factorization model first and then augmenting the model using your ratings.*

How do I do that? I am using Python to develop my application. Also, please
tell me how to persist the model so I can use it again, or give me an idea
of how to interface this with a web service.

Thanking you,
Anish Mashankar
A Data Science Enthusiast
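
One common reading of "augmenting" is to keep the trained item factors fixed
and solve a small regularized least-squares problem for the new user's factor
vector. The sketch below shows that idea with NumPy only. It is not an MLlib
API: fold_in_user and predict are hypothetical helper names, item_factors is
assumed to be a plain dict of movie id to factor vector (for example collected
from model.productFeatures()), and the reg value is an assumption that may not
match the scaling MLlib's ALS uses internally.

```python
import numpy as np


def fold_in_user(item_factors, user_ratings, reg=0.1):
    """Estimate a factor vector for one user while item factors stay fixed.

    item_factors: dict of movie id -> sequence of floats (length = rank)
    user_ratings: list of (movie id, rating) for the new user
    """
    # Ignore movies the trained model has never seen.
    known = [(m, r) for m, r in user_ratings if m in item_factors]
    Y = np.array([item_factors[m] for m, _ in known], dtype=float)
    r = np.array([rating for _, rating in known], dtype=float)
    # Ridge regression normal equations: (Y^T Y + reg * I) x = Y^T r
    A = Y.T.dot(Y) + reg * np.eye(Y.shape[1])
    return np.linalg.solve(A, Y.T.dot(r))


def predict(user_vector, item_vector):
    """Predicted rating is the dot product of user and item factors."""
    return float(np.dot(user_vector, item_vector))
```

This updates nothing inside the model itself, so the full model would still
need periodic retraining on the accumulated ratings. For persistence, it is
worth checking whether the PySpark release in use provides save and load on
MatrixFactorizationModel; otherwise the user and product feature RDDs can be
written out and read back separately.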



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-augment-data-to-existing-MatrixFactorizationModel-tp21831.html


