Yes, this is a very important point. We have found that the % of a video viewed is indeed an important factor, but rather than sending a fraction to indicate how much was viewed, we first determine the % that indicates the user liked the video.
We do this by triggering events such as “view-10”, “view-25”, “view-95”, etc. for different viewing times. We found that the % of viewing that best predicts what the user will like differs by content type. For “newsy” videos, “view-10” was the best indicator. This makes sense because people often do not need all the details to understand a video's content. But for movies a “view-10” indicated a dislike: the user started a movie, hated it, and stopped it. For movies we used “view-95” as the best indicator.

1) You know your content. Do you think you have multiple types of content, like “newsy” and “stories/movies”? You may need different indicators of a user “like”, corresponding to different % watched, based on the type.
2) Gather the viewing experience as a % and create categories like “view-10”, “view-25”, “view-95”, etc. Ingest each event for any given user. Run cross-validation tests to see which gives the best results for each type of content you have. If you have only one type, you will find the best % to gather.
3) The problem with simply sending in the % is that for one type of content 10% is a like (newsy) and for another type 10% alone is a dislike (long-form movies).

This leads us to use the categorical method for defining indicators to give the best result, instead of using the raw % of video, which may yield confusing or wrong results. The extra step of testing the indicators in #2 can make a significant difference in performance.

BTW, if you are able to find an indicator of dislike, this may be useful to predict likes: https://developer.ibm.com/dwblog/2017/mahout-spark-correlated-cross-occurences/

On Oct 9, 2017, at 10:23 AM, Daniel Tirdea <dan.tir...@gmail.com> wrote:

Hi,

I know there were a lot of questions on this matter; I've looked everywhere but didn't find a good answer. I'm using the Universal Recommender to make a recommendation system for a video sharing website.
I have a lot of details in terms of user behavior, but the most important one (at least that's what I think now) is the number of seconds consumed by a visitor: the ratio between the seconds the visitor has actually seen and the video length in seconds.

Let's say that a visitor reached a landing page with a video with a total length of 60 seconds. If the user actually sees 60 seconds (the video player reports that the video played the entire 60 seconds), I think I can assume that the visitor gave an implicit score of 10 out of 10 for this video.

Is there a way I can include this value in the prediction system? Or order the returned items by this value?

Thanks for reading this; any thoughts will be greatly appreciated.

Thanks,
Dan
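
For the categorical indicators described in the reply at the top of this thread, here is a minimal Python sketch of how a watch ratio could be turned into “view-N” events. The threshold list, the function names, and the user and item ids are hypothetical. The dictionary layout follows the general shape of a PredictionIO event as I understand it, and should be checked against your own setup before ingesting anything.

```python
from typing import Dict, List

# Hypothetical thresholds in percent; the thread mentions view-10, view-25, view-95.
THRESHOLDS = (10, 25, 95)


def view_events(seconds_watched: float, video_length: float) -> List[str]:
    """Return the name of every "view-N" indicator whose threshold was reached."""
    if video_length <= 0:
        return []
    # Cap at the video length so replays do not push the ratio above 100%.
    pct = 100.0 * min(seconds_watched, video_length) / video_length
    return [f"view-{t}" for t in THRESHOLDS if pct >= t]


def to_event(user_id: str, item_id: str, name: str) -> Dict[str, str]:
    """Build one event record (assumed layout, not confirmed by the thread)."""
    return {
        "event": name,
        "entityType": "user",
        "entityId": user_id,
        "targetEntityType": "item",
        "targetEntityId": item_id,
    }


# Example: a visitor watched 20 seconds of a 60-second video (about 33%).
events = [to_event("u-1", "video-42", n) for n in view_events(20, 60)]
```

Each threshold becomes its own indicator, so the cross-validation step in #2 can compare them per content type instead of committing to a single raw percentage.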