If you have a lot of old historical data for products that no longer exist, you
may be getting recommendations from that set. Using the old data is, in
principle, fine and should make recs better. However, you may be running into
the limit on the default number of recs returned, which is something like 100.
If all 100 are old, out-of-stock products, the filter will remove every result.
Have you checked what comes back before the filter? You could try increasing
the number returned and see if more recs get through the filter.
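
To make the failure mode concrete, here is a rough sketch in plain Java. It is
not Mahout code: OverFetchSketch, topNThenFilter and overFetch are made-up
names, and "ranked" stands in for whatever ordered candidate list the
recommender produces. In Mahout the number returned is the howMany argument to
Recommender.recommend(userID, howMany, rescorer).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

// Hypothetical stand-in for "rank, cut to N, then filter". Not Mahout API.
class OverFetchSketch {

    // Cut the ranked list to howMany first, then drop filtered ids.
    // If the first howMany ids are all expired, nothing survives.
    static List<Long> topNThenFilter(List<Long> ranked, int howMany,
                                     LongPredicate isFiltered) {
        List<Long> kept = new ArrayList<>();
        int limit = Math.min(howMany, ranked.size());
        for (Long id : ranked.subList(0, limit)) {
            if (!isFiltered.test(id)) {
                kept.add(id);
            }
        }
        return kept;
    }

    // Double the fetch size until enough ids survive the filter
    // or the candidate list is exhausted.
    static List<Long> overFetch(List<Long> ranked, int wanted,
                                LongPredicate isFiltered) {
        int howMany = Math.max(wanted, 1);
        while (true) {
            List<Long> kept = topNThenFilter(ranked, howMany, isFiltered);
            if (kept.size() >= wanted || howMany >= ranked.size()) {
                return new ArrayList<>(
                        kept.subList(0, Math.min(wanted, kept.size())));
            }
            howMany *= 2;
        }
    }
}
```

If topNThenFilter comes back empty for 100 but overFetch finds items, the
problem is the cutoff and not the similarity or the filter itself.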

From experience with a retail product recommender, we found that item-item
similarities often produced better results than individual user preferences.
If you know the item context, you can use the API to get recommendations for
an item rather than for a user. So on an item detail page you can display
similar items without using user preferences at all.
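
As an illustration of the item-context idea, here is a small self-contained
sketch. It is plain Java rather than Mahout, and the names (SimilarItemsSketch,
usersByItem, active) are invented for the example; cosine similarity over the
sets of users who touched each item is just one possible measure. With Mahout's
GenericItemBasedRecommender the equivalent call would be
mostSimilarItems(itemID, howMany).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical item-to-item lookup. Expired items still contribute
// history, but only active items are returned.
class SimilarItemsSketch {

    // Cosine similarity between two items, each represented by the
    // set of users who interacted with it.
    static double cosine(Set<Long> usersA, Set<Long> usersB) {
        if (usersA.isEmpty() || usersB.isEmpty()) {
            return 0.0;
        }
        long common = usersA.stream().filter(usersB::contains).count();
        return common / Math.sqrt((double) usersA.size() * usersB.size());
    }

    // Rank the active items by similarity to itemId, best first.
    static List<Long> mostSimilar(long itemId,
                                  Map<Long, Set<Long>> usersByItem,
                                  Set<Long> active, int howMany) {
        Set<Long> seed =
                usersByItem.getOrDefault(itemId, Collections.<Long>emptySet());
        Map<Long, Double> scores = new HashMap<>();
        for (Map.Entry<Long, Set<Long>> e : usersByItem.entrySet()) {
            Long other = e.getKey();
            if (other == itemId || !active.contains(other)) {
                continue; // skip the item itself and expired products
            }
            double s = cosine(seed, e.getValue());
            if (s > 0) {
                scores.put(other, s);
            }
        }
        List<Long> ranked = new ArrayList<>(scores.keySet());
        // Highest score first; ties broken by item id for stable output.
        ranked.sort((a, b) -> {
            int c = Double.compare(scores.get(b), scores.get(a));
            return c != 0 ? c : Long.compare(a, b);
        });
        return new ArrayList<>(
                ranked.subList(0, Math.min(howMany, ranked.size())));
    }
}
```

Note that no user id appears anywhere in the lookup: the item being viewed is
the whole query.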

As to decaying user preferences, I’d not do that at first, or at least I’d be
aware of the ramifications. Decaying by time and then training Mahout on the
result has the effect of removing training data that may be very useful.
Unfortunately Mahout uses the preference collection as both training data and
query data. It would be nice if the query used only some number of the most
recent user actions while the training data contained many more actions from
all users. You can do this with the Solr/Mahout-recommender, since training
and query data can be separate.
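
Here is a minimal sketch of what keeping the two apart could look like. It is
plain Java, not the Solr/Mahout-recommender itself, and every name in it
(RecentQuerySketch, similarity, recentN) is an assumption made for the
example. The similarity map is assumed to have been trained on the full
history of all users; only the query is limited to the user's latest actions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical: query with recent actions against a model trained on everything.
class RecentQuerySketch {

    // similarity.get(a).get(b) is a precomputed item-item score.
    // historyOldestFirst is one user's item ids in chronological order.
    static List<Long> recommend(List<Long> historyOldestFirst, int recentN,
                                Map<Long, Map<Long, Double>> similarity,
                                int howMany) {
        int size = historyOldestFirst.size();
        // The query is only the last recentN actions.
        List<Long> query =
                historyOldestFirst.subList(Math.max(0, size - recentN), size);
        Set<Long> alreadySeen = new HashSet<>(historyOldestFirst);

        Map<Long, Double> scores = new HashMap<>();
        for (Long seen : query) {
            Map<Long, Double> row = similarity.getOrDefault(
                    seen, Collections.<Long, Double>emptyMap());
            for (Map.Entry<Long, Double> e : row.entrySet()) {
                if (!alreadySeen.contains(e.getKey())) {
                    // Sum the similarity to each recent item.
                    scores.merge(e.getKey(), e.getValue(), Double::sum);
                }
            }
        }
        List<Long> ranked = new ArrayList<>(scores.keySet());
        // Highest score first; ties broken by item id.
        ranked.sort((a, b) -> {
            int c = Double.compare(scores.get(b), scores.get(a));
            return c != 0 ? c : Long.compare(a, b);
        });
        return new ArrayList<>(
                ranked.subList(0, Math.min(howMany, ranked.size())));
    }
}
```

Older actions never enter the query, yet nothing is thrown away from the
data the similarities were computed from, which is the effect time decay was
meant to approximate.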

On Nov 6, 2013, at 9:00 AM, Cassio Melo <melo.cas...@gmail.com> wrote:

Good idea Gokhan, thanks!

@ Sigbjørn: Thanks for the feedback. In fact we have plenty of [implicit]
preference data for each user, e.g. product views. What I found out is that
data from a certain point in time were very noisy and inconsistent. When I
started fetching from that point on I got much better results.


On Wed, Nov 6, 2013 at 2:24 PM, Gokhan Capan <gkhn...@gmail.com> wrote:

> Cassio,
> 
> I would implement a CandidateItemsStrategy that returns products that are
> available now. A neighborhood based recommender would iterate over those
> products, and rank them based on the similarity measure you provide.
> 
> If the DataModel of your recommender does not contain most of your
> available items, you may want to overcome this "cold-start" challenge by
> making your similarity take content features into account.
> 
> Hope that helps,
> Best
> 
> 
> 
> Gokhan
> 
> 
> On Wed, Nov 6, 2013 at 1:40 PM, Sigbjørn Dybdahl <sigbj...@fifty-five.com> wrote:
> 
>> Hi,
>> 
>> Could this be due to a small number of items viewed/liked/purchased per
>> user?
>> 
>> Correct me if I'm wrong, but this would make the total recommendation
>> space sparse, thus making it hard to find good recommendations (i.e.
>> recommendations which are relevant and not obsolete). If so, it might be
>> worth considering simpler approaches than CF.
>> 
>> 
>> Sigbjørn
>> 
>> 
>> On 22 October 2013 12:43, Cassio Melo <melo.cas...@gmail.com> wrote:
>> 
>>> Hi all,
>>> 
>>> I have a product recommendation use case for an e-commerce site and I've
>>> been playing around with Mahout's CF capabilities lately. I reached a
>>> point where I should ask the community for feedback on my approach. I'm
>>> struggling to get ANY recommendation.
>>> 
>>> Here's what I've done so far:
>>> 
>>> 1) Collected data in this format:
>>> 
>>> USER_ID,PRODUCT_ID,VIEW,LIKE,FAVORITE,PURCHASE
>>> 0001,3333,1,0,1,1
>>> 
>>> 2) Built preferences for each user based on a computed score [0,1] for
>>> the attributes (view,like,favorite,purchase)
>>> 
>>> 3) Implemented a custom ItemSimilarity class to boost products of the
>>> same type, from the same manufacturer, with similar prices, etc
>>> 
>>> 4) Implemented a custom IDRescorer to filter out products that are no
>>> longer available.
>>> 
>>> 5) Implemented a Recommender subclass that simply instantiates a
>>> GenericItemBasedRecommender with my custom similarity class:
>>> recommender = new GenericItemBasedRecommender(myModel, new ProductSimilarity());
>>> 
>>> It happens that *most* of the product/preference data is historical and
>>> most products are no longer available. Because of this I get zero
>>> recommendations when the IDRescorer's filter is activated (yes, I checked
>>> the isFiltered method, it is returning true for expired products and
>>> false otherwise, as expected) even though there are lots of valid
>>> products.
>>> 
>>> Here's the overridden method in the IDRescorer's class:
>>> 
>>> public boolean isFiltered(long id) {
>>>     Product a = PreferencesDataModel.lookupProduct(id);
>>>     return !a.isActive(); // filter out expired products
>>> }
>>> 
>>> I see it as beneficial to feed the engine with historical data on
>>> expired products and filter them out at recommendation time, but getting
>>> zero recommendations made me rethink this approach (I also tried
>>> different similarity metrics, including UserSim). What do you guys think?
>>> 
>> 
>> 
>> 
>> --
>> Sigbjørn DYBDAHL | 55 | fifty-five.com <http://www.fifty-five.com/> | 4,
>> place de l'Opéra, 75002 Paris | 01 76 21 91 32
>> 
> 
