Hi all,

I have a product recommendation use case for an e-commerce site and I've
been playing around with Mahout's CF capabilities lately. I've reached a
point where I'd like to ask the community for feedback on my approach,
because I'm struggling to get ANY recommendations.

Here's what I've done so far:

1) Collected data in this format:

USER_ID,PRODUCT_ID,VIEW,LIKE,FAVORITE,PURCHASE
0001,3333,1,0,1,1

2) Built preferences for each user based on a computed score in [0,1]
derived from the attributes (view, like, favorite, purchase)

3) Implemented a custom ItemSimilarity class to boost products of the same
type, from the same manufacturer, with similar prices, etc.

4) Implemented a custom IDRescorer to filter out products that are no
longer available.

5) Implemented a Recommender subclass that simply instantiates a
GenericItemBasedRecommender with my custom similarity class:

recommender = new GenericItemBasedRecommender(myModel, new ProductSimilarity());
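To make steps 2 and 3 more concrete, here is a simplified, standalone
sketch of the idea. It is not my production code and it does not use
Mahout: the weights, the Item class and the method names are made up for
illustration, and prices are assumed to be non-negative.

```java
final class RecommenderInputsSketch {
    // Hypothetical weights for step 2; they sum to 1 so the score stays in [0,1].
    static final double W_VIEW = 0.1;
    static final double W_LIKE = 0.2;
    static final double W_FAVORITE = 0.3;
    static final double W_PURCHASE = 0.4;

    /** Step 2: combine the 0/1 flags of one row into a preference score in [0,1]. */
    static double preferenceScore(int view, int like, int favorite, int purchase) {
        double s = W_VIEW * view + W_LIKE * like + W_FAVORITE * favorite + W_PURCHASE * purchase;
        return Math.max(0.0, Math.min(1.0, s)); // clamp against rounding drift
    }

    /** Parses one line in the format USER_ID,PRODUCT_ID,VIEW,LIKE,FAVORITE,PURCHASE. */
    static double preferenceScoreFromCsv(String line) {
        String[] f = line.split(",");
        return preferenceScore(
                Integer.parseInt(f[2].trim()),
                Integer.parseInt(f[3].trim()),
                Integer.parseInt(f[4].trim()),
                Integer.parseInt(f[5].trim()));
    }

    /** Hypothetical product attributes used by the similarity in step 3. */
    static final class Item {
        final String type;
        final String manufacturer;
        final double price;

        Item(String type, String manufacturer, double price) {
            this.type = type;
            this.manufacturer = manufacturer;
            this.price = price;
        }
    }

    /** Step 3: attribute-based similarity in [0,1]; higher means more alike. */
    static double similarity(Item a, Item b) {
        double s = 0.0;
        if (a.type.equals(b.type)) {
            s += 0.4; // same product type
        }
        if (a.manufacturer.equals(b.manufacturer)) {
            s += 0.3; // same manufacturer
        }
        // Price closeness: 1.0 for equal prices, approaching 0.0 as they diverge.
        double maxPrice = Math.max(a.price, b.price);
        double closeness = maxPrice > 0 ? 1.0 - Math.abs(a.price - b.price) / maxPrice : 1.0;
        s += 0.3 * closeness;
        return Math.min(1.0, s);
    }

    public static void main(String[] args) {
        System.out.println(preferenceScoreFromCsv("0001,3333,1,0,1,1"));
        Item shoe = new Item("shoes", "acme", 100.0);
        Item hat = new Item("hats", "other", 50.0);
        System.out.println(similarity(shoe, hat));
    }
}
```

In the real setup the similarity sits behind Mahout's ItemSimilarity
interface, whose itemSimilarity(long, long) is documented to return a value
in [-1,1] or NaN when the similarity is unknown, so a score like the one
above fits that range.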

It turns out that most of the product/preference data is historical, and
most products are no longer available. Because of this I get zero
recommendations when the IDRescorer's filter is activated, even though
there are lots of valid products. (Yes, I checked the isFiltered method:
it returns true for expired products and false otherwise, as expected.)

Here's the overridden method in my IDRescorer implementation:

@Override
public boolean isFiltered(long id) {
    Product a = PreferencesDataModel.lookupProduct(id);
    return !a.isActive(); // filter out expired products
}
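For anyone who wants to check the filter logic in isolation, here is a
minimal standalone sketch. The Rescorer interface is a stand-in that
mirrors the two methods of Mahout's IDRescorer, and the Product class and
the in-memory catalog map are hypothetical replacements for my
PreferencesDataModel lookup. Treating an unknown id as filtered is an
extra assumption of this sketch, not something my method above does.

```java
import java.util.HashMap;
import java.util.Map;

final class AvailabilityFilterSketch {
    /** Stand-in mirroring the two methods of Mahout's IDRescorer. */
    interface Rescorer {
        double rescore(long id, double originalScore);
        boolean isFiltered(long id);
    }

    /** Hypothetical product with only the availability flag. */
    static final class Product {
        final boolean active;

        Product(boolean active) {
            this.active = active;
        }

        boolean isActive() {
            return active;
        }
    }

    /** Builds a rescorer that leaves scores untouched and filters expired products. */
    static Rescorer availabilityRescorer(final Map<Long, Product> catalog) {
        return new Rescorer() {
            @Override
            public double rescore(long id, double originalScore) {
                return originalScore; // no boosting, filtering only
            }

            @Override
            public boolean isFiltered(long id) {
                Product p = catalog.get(id);
                // Assumption: ids missing from the catalog are treated as unavailable.
                return p == null || !p.isActive();
            }
        };
    }

    public static void main(String[] args) {
        Map<Long, Product> catalog = new HashMap<Long, Product>();
        catalog.put(3333L, new Product(true));
        catalog.put(4444L, new Product(false));
        Rescorer rescorer = availabilityRescorer(catalog);
        System.out.println("3333 filtered? " + rescorer.isFiltered(3333L));
        System.out.println("4444 filtered? " + rescorer.isFiltered(4444L));
    }
}
```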

I see it as beneficial to feed the engine with historical data on expired
products and then filter those products out of the recommendations, but
getting zero recommendations made me rethink this approach (I also tried
different similarity metrics, including user-based similarity). What do you
guys think?
