Good idea, Gokhan, thanks!

@Sigbjørn: Thanks for the feedback. In fact we have plenty of [implicit] preference data for each user, e.g. product views. What I found is that the data from before a certain point in time were very noisy and inconsistent; once I started fetching only from that point on, I got much better results.
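For anyone finding this thread later, here is a minimal plain-Java sketch of the idea Gokhan describes in the quoted message: build the candidate set from products that are available now, then rank those candidates by similarity to what the user already interacted with. It does not use Mahout at all. The class `ActiveCandidateSketch`, the `Similarity` interface, and the `recommend` method are hypothetical names made up for illustration, and the scoring (a preference-weighted sum of similarities) is an assumption, not what Mahout's item-based recommender computes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongPredicate;

// Plain-Java sketch; none of these names are Mahout classes.
public class ActiveCandidateSketch {

    // Hypothetical stand-in for a pairwise item similarity such as ProductSimilarity.
    public interface Similarity {
        double between(long itemA, long itemB);
    }

    // Ranks currently available items the user has not interacted with yet.
    // Expired items are excluded from the candidate set up front, but they
    // still contribute as evidence through the user's historical preferences.
    public static List<Long> recommend(Map<Long, Double> userPrefs,
                                       List<Long> catalog,
                                       LongPredicate isActive,
                                       Similarity sim,
                                       int howMany) {
        Map<Long, Double> scores = new HashMap<>();
        for (long candidate : catalog) {
            // Skip expired products and products the user already has a preference for.
            if (!isActive.test(candidate) || userPrefs.containsKey(candidate)) {
                continue;
            }
            // Assumed scoring: sum of similarity to each preferred item, weighted by the preference.
            double score = 0.0;
            for (Map.Entry<Long, Double> pref : userPrefs.entrySet()) {
                score += sim.between(candidate, pref.getKey()) * pref.getValue();
            }
            scores.put(candidate, score);
        }
        // Sort candidates by descending score and keep the top howMany.
        List<Long> ranked = new ArrayList<>(scores.keySet());
        ranked.sort(Comparator.comparingDouble((Long id) -> scores.get(id)).reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(howMany, ranked.size())));
    }
}
```

The difference from filtering in a rescorer is where the expired products get dropped: here they never enter the candidate set, so the top-N list is filled from available products only, while the historical preferences on expired products are still used for scoring.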
On Wed, Nov 6, 2013 at 2:24 PM, Gokhan Capan <gkhn...@gmail.com> wrote:

> Cassio,
>
> I would implement a CandidateItemsStrategy that returns products that are
> available now. A neighborhood-based recommender would iterate over those
> products and rank them based on the similarity measure you provide.
>
> If the DataModel of your recommender does not contain most of your
> available items, you may want to overcome this "cold-start" challenge by
> making your similarity take content features into account.
>
> Hope that helps,
> Best
>
> Gokhan
>
> On Wed, Nov 6, 2013 at 1:40 PM, Sigbjørn Dybdahl <sigbj...@fifty-five.com> wrote:
>
> > Hi,
> >
> > Could this be due to a small number of items viewed/liked/purchased per
> > user?
> >
> > Correct me if I'm wrong, but this would make the total recommendation
> > space sparse, thus making it hard to find good recommendations (i.e.
> > recommendations which are relevant and not obsolete). If so, it might be
> > worth considering simpler approaches than CF.
> >
> > Sigbjørn
> >
> > On 22 October 2013 12:43, Cassio Melo <melo.cas...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I have a product recommendation use case for an e-commerce site and I've
> > > been playing around with Mahout's CF capabilities lately. I reached a
> > > point where I should ask for feedback from the community on my approach.
> > > I'm struggling to get ANY recommendation.
> > >
> > > Here's what I've done so far:
> > >
> > > 1) Collected data in this format:
> > >
> > >    USER_ID,PRODUCT_ID,VIEW,LIKE,FAVORITE,PURCHASE
> > >    0001,3333,1,0,1,1
> > >
> > > 2) Built preferences for each user based on a computed score in [0,1]
> > >    for the attributes (view, like, favorite, purchase)
> > >
> > > 3) Implemented a custom ItemSimilarity class to boost products of the
> > >    same type, from the same manufacturer, with similar prices, etc.
> > >
> > > 4) Implemented a custom IDRescorer to filter out products that are no
> > >    longer available.
> > >
> > > 5) Implemented a Recommender subclass that simply instantiates a
> > >    GenericItemBasedRecommender with my custom similarity class:
> > >
> > >    recommender = new GenericItemBasedRecommender(myModel, new ProductSimilarity());
> > >
> > > It happens that most of the product/preference data is historical and
> > > most products are no longer available. Because of this I get zero
> > > recommendations when the IDRescorer's filter is activated (yes, I checked
> > > the isFiltered method; it returns true for expired products and false
> > > otherwise, as expected), even though there are lots of valid products.
> > >
> > > Here's the overridden method in the IDRescorer class:
> > >
> > >    public boolean isFiltered(long id) {
> > >        Product a = PreferencesDataModel.lookupProduct(id);
> > >        return !a.isActive(); // filter expired product
> > >    }
> > >
> > > I see it as beneficial to feed the engine with historical data on expired
> > > products and filter them out at recommendation time, but getting zero
> > > recommendations made me rethink this approach (I also tried different
> > > similarity metrics, including UserSim). What do you guys think?
> >
> > --
> > Sigbjørn DYBDAHL | 55 | fifty-five.com <http://www.fifty-five.com/> | 4,
> > place de l'Opéra, 75002 Paris | 01 76 21 91 32