Re: Thoughts and Questions on a e-Commerce Recommender System and Expired Items
Hi,

Could this be due to a small number of items viewed/liked/purchased per user? Correct me if I'm wrong, but this would make the total recommendation space sparse, making it hard to find good recommendations (i.e. recommendations that are relevant and not obsolete). If so, it might be worth considering simpler approaches than CF.

Sigbjørn

On 22 October 2013 12:43, Cassio Melo melo.cas...@gmail.com wrote:

> Hi all,
>
> I have a product recommendation use case for an e-commerce site and I've been playing around with Mahout's CF capabilities lately. I've reached a point where I should ask the community for feedback on my approach. I'm struggling to get ANY recommendation. Here's what I've done so far:
>
> 1) Collected data in this format:
>
>     USER_ID,PRODUCT_ID,VIEW,LIKE,FAVORITE,PURCHASE
>     0001,,1,0,1,1
>
> 2) Built preferences for each user based on a computed score [0,1] for the attributes (view, like, favorite, purchase)
>
> 3) Implemented a custom ItemSimilarity class to boost products of the same type, from the same manufacturer, with similar prices, etc.
>
> 4) Implemented a custom IDRescorer to filter out products that are no longer available.
>
> 5) Implemented a Recommender subclass that simply instantiates a GenericItemBasedRecommender with my custom similarity class:
>
>     recommender = new GenericItemBasedRecommender(myModel, new ProductSimilarity());
>
> It happens that *most* of the product/preference data is historical and most products are no longer available. Because of this I get zero recommendations when the IDRescorer's filter is activated, even though there are lots of valid products. (Yes, I checked the isFiltered method: it returns true for expired products and false otherwise, as expected.)
>
> Here's the overridden method in the IDRescorer class:
>
>     public boolean isFiltered(long id) {
>         Product a = PreferencesDataModel.lookupProduct(id);
>         return !a.isActive(); // filter expired product
>     }
>
> I see it as beneficial to feed the engine with historical data on expired products and filter them out at recommendation time, but getting zero recommendations made me rethink this approach (I also tried different similarity metrics including UserSim).
>
> What do you guys think?

--
Sigbjørn DYBDAHL | 55 | fifty-five.com http://www.fifty-five.com/ | 4, place de l'Opéra, 75002 Paris | 01 76 21 91 32
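For readers following along, step 2 of the quoted post (a computed score in [0,1] from the view/like/favorite/purchase flags) could look roughly like the sketch below. The weights and the class name are hypothetical, since the thread does not say how the score was actually computed, and this is plain Java with no Mahout dependency.

```java
/** Sketch only: combines the four implicit-feedback flags into one score in [0,1]. */
final class PreferenceScoreSketch {
    // Hypothetical weights (not from the thread). They sum to 1.0,
    // so a row with every flag set scores 1.0 and an empty row scores 0.0.
    static final double VIEW = 0.1;
    static final double LIKE = 0.2;
    static final double FAVORITE = 0.3;
    static final double PURCHASE = 0.4;

    /** Each argument is 0 or 1, as in the USER_ID,PRODUCT_ID,VIEW,LIKE,FAVORITE,PURCHASE rows. */
    static double score(int view, int like, int favorite, int purchase) {
        return VIEW * view + LIKE * like + FAVORITE * favorite + PURCHASE * purchase;
    }
}
```

With these assumed weights, the sample row 1,0,1,1 from the post scores about 0.8. Stronger signals such as purchases dominate, which is one common choice but by no means the only one.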
Re: Thoughts and Questions on a e-Commerce Recommender System and Expired Items
Cassio,

I would implement a CandidateItemsStrategy that returns only the products that are available now. A neighborhood-based recommender would iterate over those products and rank them based on the similarity measure you provide.

If the DataModel of your recommender does not contain most of your available items, you may want to overcome this cold-start challenge by making your similarity take content features into account.

Hope that helps,

Best
Gokhan

On Wed, Nov 6, 2013 at 1:40 PM, Sigbjørn Dybdahl sigbj...@fifty-five.com wrote:

> [...]
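The idea behind the CandidateItemsStrategy suggestion in Gokhan's reply, restricting the candidate set to available products before scoring instead of filtering afterwards, can be sketched as follows. This is a minimal illustration in plain Java and does not use Mahout's actual CandidateItemsStrategy interface (check its signature for the Mahout version in use); the class name, the method name and the weighted-average scoring are assumptions made for the example.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.ToDoubleBiFunction;

/** Sketch only: score currently active items against a user's (possibly expired) history. */
final class ActiveCandidateSketch {

    static List<Long> recommend(Map<Long, Double> userPrefs,   // itemID -> preference in [0,1]
                                Set<Long> activeItems,         // items that can be shown today
                                ToDoubleBiFunction<Long, Long> similarity,
                                int howMany) {
        List<Map.Entry<Long, Double>> scored = new ArrayList<>();
        for (long candidate : activeItems) {
            if (userPrefs.containsKey(candidate)) {
                continue; // skip items the user already interacted with
            }
            double weighted = 0.0;
            double totalSimilarity = 0.0;
            // History may consist entirely of expired items; they still carry signal.
            for (Map.Entry<Long, Double> pref : userPrefs.entrySet()) {
                double sim = similarity.applyAsDouble(candidate, pref.getKey());
                weighted += sim * pref.getValue();
                totalSimilarity += Math.abs(sim);
            }
            if (totalSimilarity > 0.0) {
                scored.add(new AbstractMap.SimpleEntry<>(candidate, weighted / totalSimilarity));
            }
        }
        // Highest estimated preference first.
        scored.sort(Map.Entry.<Long, Double>comparingByValue().reversed());

        List<Long> result = new ArrayList<>();
        for (Map.Entry<Long, Double> entry : scored) {
            if (result.size() == howMany) {
                break;
            }
            result.add(entry.getKey());
        }
        return result;
    }
}
```

Because expired products never enter the candidate set, they cannot crowd out valid ones, while the user's preferences for expired products still contribute to the scores.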
Re: Thoughts and Questions on a e-Commerce Recommender System and Expired Items
Good idea Gokhan, thanks!

@Sigbjørn: Thanks for the feedback. In fact we have plenty of [implicit] preference data for each user, e.g. product views. What I found out is that data from before a certain point in time was very noisy and inconsistent; when I started fetching only from that point on, I got much better results.

On Wed, Nov 6, 2013 at 2:24 PM, Gokhan Capan gkhn...@gmail.com wrote:

> [...]
Re: Thoughts and Questions on a e-Commerce Recommender System and Expired Items
If you have a lot of old historical data for products that no longer exist, you may be getting recommendations from that set. Using the old data is, in principle, fine and should make recs better. However, you may be running into a limit on the default number of recs returned, which is something like 100. If all 100 are old, out-of-stock products, you'll filter all results out. Have you checked the results before the filter? You could try increasing the number returned and see if more recs get through the filter.

From experience with a retail product recommender, we found that item-item similarities often produced better results than individual user preferences. You can use the API to get recommendations for an item rather than a user if you know an item context. So when a user is viewing an item detail page, you can display similar items without using user preferences.

As to decaying user preferences, I'd not do that at first, or at least I'd be aware of the ramifications. Decaying based on time and then using Mahout has the effect of removing training data that may be very useful. Unfortunately, Mahout uses the preference collection as both training data and query data. It would be nice if the query held some number of the most recent user actions and the training data contained many more actions from all users. You can do this with the Solr/Mahout recommender, since training and query data can be separate.

On Nov 6, 2013, at 9:00 AM, Cassio Melo melo.cas...@gmail.com wrote:

> [...]
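The "increase the number returned and see if more recs get through the filter" check from the last reply can be sketched as below. This is plain Java under stated assumptions: `topN` stands in for whatever call returns the top-n unfiltered recommendations (for example a thin wrapper around the recommender), `isActive` plays the role of the inverse of `isFiltered`, and the class name, method name and doubling policy are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;
import java.util.function.LongPredicate;

/** Sketch only: ask for more and more unfiltered results until enough active items survive. */
final class OverFetchSketch {

    static List<Long> topActive(IntFunction<List<Long>> topN,  // n -> top-n item IDs, best first
                                LongPredicate isActive,        // true if the item can be shown
                                int wanted,
                                int maxFetch) {
        int fetch = wanted;
        while (true) {
            List<Long> ranked = topN.apply(fetch);
            List<Long> kept = new ArrayList<>();
            for (long id : ranked) {
                if (kept.size() < wanted && isActive.test(id)) {
                    kept.add(id);
                }
            }
            boolean exhausted = ranked.size() < fetch; // the source has nothing more to give
            if (kept.size() >= wanted || exhausted || fetch >= maxFetch) {
                return kept;
            }
            fetch = Math.min(fetch * 2, maxFetch); // widen the window and try again
        }
    }
}
```

If this returns items where the fixed-size call returned none, the top of the unfiltered ranking was simply dominated by expired products, which is the situation described above. Re-running the recommender on each retry is wasteful, so in practice a single large request may be preferable.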