How to add all combinations of items rated by user and difference between the ratings?
The input file has the format:

userid, movieid, rating

From this data, I want to extract all possible combinations of movies rated by each user, together with the difference between the two ratings:

(movie1, movie2), (rating(movie1) - rating(movie2))

This should be computed for each user in the dataset. Finally, I would like to find the average disagreement between each pair of movies:

(movie1, movie2), average difference between ratings

How do I do this in Python? I wrote code for Hadoop Streaming, but I am having a hard time converting it to Spark-compatible code.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-add-all-combinations-of-items-rated-by-user-and-difference-between-the-ratings-tp22268.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
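One possible way to approach the question above is sketched below. It is only a sketch under assumptions the post does not confirm: lines look like `userid,movieid,rating`, each user rates a given movie at most once, and the "average difference" is taken per movie pair across all users. The function names are made up for illustration. `average_differences_local` is a plain-Python reference, and `average_differences_rdd` chains the same steps with standard RDD operations (`map`, `groupByKey`, `flatMap`, `mapValues`, `reduceByKey`) on an RDD of text lines that you would create yourself, for example with `sc.textFile(...)`.

```python
from collections import defaultdict
from itertools import combinations


def parse_line(line):
    """Parse 'userid,movieid,rating' into (userid, (movieid, rating))."""
    user, movie, rating = [field.strip() for field in line.split(",")]
    return user, (movie, float(rating))


def pair_differences(movie_ratings):
    """Yield ((movie1, movie2), rating1 - rating2) for every pair one user rated.

    Sorting by movie id fixes the order inside each pair, so the same two
    movies always produce the same key no matter how the input is ordered.
    """
    for (m1, r1), (m2, r2) in combinations(sorted(movie_ratings), 2):
        yield (m1, m2), r1 - r2


def average_differences_local(lines):
    """Plain-Python reference: average difference per movie pair over all users."""
    by_user = defaultdict(list)
    for line in lines:
        user, movie_rating = parse_line(line)
        by_user[user].append(movie_rating)

    totals = defaultdict(lambda: [0.0, 0])  # pair -> [sum of differences, count]
    for movie_ratings in by_user.values():
        for pair, diff in pair_differences(movie_ratings):
            totals[pair][0] += diff
            totals[pair][1] += 1
    return {pair: total / count for pair, (total, count) in totals.items()}


def average_differences_rdd(lines_rdd):
    """Same computation on an RDD of text lines (assumes an existing SparkContext)."""
    return (lines_rdd
            .map(parse_line)                                    # (user, (movie, rating))
            .groupByKey()                                       # all ratings of one user
            .flatMap(lambda kv: pair_differences(list(kv[1])))  # ((m1, m2), diff)
            .mapValues(lambda diff: (diff, 1))
            .reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1]))
            .mapValues(lambda total: total[0] / total[1]))      # average per pair


if __name__ == "__main__":
    sample = ["u1,A,5", "u1,B,3", "u2,A,4", "u2,B,4", "u2,C,1"]
    for pair, avg in sorted(average_differences_local(sample).items()):
        print(pair, avg)
```

Note that `groupByKey` collects all ratings of one user in memory on a single executor, and the number of pairs grows quadratically with the number of movies a user rated, so users with very long rating histories may need to be capped or sampled.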
How to augment data to existing MatrixFactorizationModel?
I am a beginner in the world of Machine Learning and Apache Spark. I followed the tutorial at https://databricks-training.s3.amazonaws.com/movie-recommendation-with-mllib.html#augmenting-matrix-factors and was successfully able to develop the application.

Today's web applications need to be powered by real-time recommendations, so I would like my model to be ready for new data that keeps arriving on the server. The site says:

*A better way to get the recommendations for you is training a matrix factorization model first and then augmenting the model using your ratings.*

How do I do that? I am using Python to develop my application. Also, please tell me how to persist the model so that I can use it again, or give me an idea of how to interface it with a web service.

Thanking you,
Anish Mashankar
A Data Science Enthusiast

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-augment-data-to-existing-MatrixFactorizationModel-tp21831.html
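One common reading of "augmenting the model" is a fold-in step: keep the trained item factors fixed and solve a small regularized least-squares problem for the new user's factor vector. The sketch below illustrates that idea in plain Python. It is an assumption about what the tutorial means, not its confirmed method, and the regularization here is plain ridge regression, which may differ from what ALS does internally. The names `fold_in_user`, `solve_linear_system` and `predict` are hypothetical helpers. With MLlib, the item factors would presumably come from the trained model (for example via `productFeatures()`), collected into a dict.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


def fold_in_user(item_factors, new_ratings, reg=0.1):
    """Estimate a factor vector for a new user while item factors stay fixed.

    item_factors: dict movie_id -> list of floats (length = rank)
    new_ratings:  dict movie_id -> rating given by the new user
    Solves (Y^T Y + reg * I) x = Y^T r over the movies the user rated.
    """
    rank = len(next(iter(item_factors.values())))
    gram = [[reg if i == j else 0.0 for j in range(rank)] for i in range(rank)]
    rhs = [0.0] * rank
    for movie, rating in new_ratings.items():
        if movie not in item_factors:
            continue  # movies unknown to the model cannot contribute
        y = item_factors[movie]
        for i in range(rank):
            rhs[i] += y[i] * rating
            for j in range(rank):
                gram[i][j] += y[i] * y[j]
    return solve_linear_system(gram, rhs)


def predict(user_factor, item_factor):
    """Predicted rating is the dot product of user and item factors."""
    return sum(u * v for u, v in zip(user_factor, item_factor))


if __name__ == "__main__":
    # Toy rank-2 item factors, invented purely for illustration.
    items = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
    user = fold_in_user(items, {"A": 4.0, "B": 2.0}, reg=0.1)
    print(predict(user, items["C"]))
```

Because only a rank-by-rank system is solved, this step is cheap enough to run inside a web request, while the full model is retrained periodically in the background. For persistence, Spark versions from around 1.3 onward expose `save(sc, path)` on the model and `MatrixFactorizationModel.load(sc, path)`; check the API docs for the version you are running.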