Ok, you guys got me convinced :)

From a technical point of view, two ways to implement that filter come to
mind:

1) Load the user/item pairs to filter into memory in the
AggregateAndRecommendReducer, as Han Hui suggested (easy, but might not
scale)
2) Have the AggregateAndRecommendReducer write all predicted preferences
to disk instead of picking only the top-K recommendations, then add
another M/R step that joins the recommendations with the user/item filter
pairs to allow for custom rescoring/filtering
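
To make the difference between the two options concrete, here is a rough
sketch in plain Python rather than Hadoop code. The function names, the
(user, item, score) tuple layout and the in-memory "shuffle" are all
assumptions made for illustration; none of this is the actual
AggregateAndRecommendReducer or RecommenderJob API.

```python
import heapq
from collections import defaultdict


def top_k_with_in_memory_filter(predictions, banned_pairs, k):
    """Option 1 (hypothetical sketch): drop banned pairs while selecting top-K.

    predictions:  iterable of (user_id, item_id, score) tuples
    banned_pairs: set of (user_id, item_id) pairs held fully in memory
    """
    per_user = defaultdict(list)
    for user_id, item_id, score in predictions:
        # Filtering happens before the top-K cut, so a banned item never
        # takes up one of the K slots.
        if (user_id, item_id) in banned_pairs:
            continue
        per_user[user_id].append((score, item_id))
    return {
        user_id: [item for _, item in
                  heapq.nlargest(k, scored, key=lambda pair: pair[0])]
        for user_id, scored in per_user.items()
    }


def join_then_top_k(all_predictions, filter_pairs, k):
    """Option 2 (hypothetical sketch): imitate a reduce-side join on (user, item).

    all_predictions stands in for the full set of predicted preferences
    written to disk; filter_pairs stands in for the second join input.
    """
    # "Shuffle": group records from both inputs by the join key.
    grouped = defaultdict(lambda: {"score": None, "filtered": False})
    for user_id, item_id, score in all_predictions:
        grouped[(user_id, item_id)]["score"] = score
    for user_id, item_id in filter_pairs:
        grouped[(user_id, item_id)]["filtered"] = True

    # "Reduce": keep predictions that have no matching filter record.
    # A custom rescoring rule could be applied here instead of a hard drop.
    survivors = [
        (user_id, item_id, record["score"])
        for (user_id, item_id), record in grouped.items()
        if record["score"] is not None and not record["filtered"]
    ]
    return top_k_with_in_memory_filter(survivors, frozenset(), k)


predictions = [("u1", "a", 0.9), ("u1", "b", 0.8),
               ("u1", "c", 0.7), ("u2", "a", 0.5)]
banned = {("u1", "a")}

option_1 = top_k_with_in_memory_filter(predictions, banned, k=2)
option_2 = join_then_top_k(predictions, banned, k=2)
```

Both variants return the same recommendations for this toy input. The
difference is where the filter pairs live: in the reducer's memory for
1), or as a second input to an extra join step for 2).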

--sebastian

On 24.08.2010 06:07, Ted Dunning wrote:
> Sorry to chime in late, but removing items after recommendation isn't such a
> crazy thing to do.
>
> In particular, it is common to remove previously viewed items (for a period
> of time).  Likewise, if the user says "don't show this again", it makes
> sense to backstop the actual recommendation system with a UI limitation that
> does a post-recommendation elimination.
>
> Moreover, this approach has the great benefit that the results are very
> predictable.  Exactly the requested/seen items will be eliminated and no
> surprising effect on recommendations will occur.
>
> That predictability is exactly the problem, though.  Generally you want a
> bit more systemic effect for negative recommendations.  This is a really
> sticky area, however, because negative recommendations often impart
> information about positive preferences in addition to some level of negative
> information.
>
> I used an explicit filter at both Musicmatch and at Veoh.  Both systems
> worked well.  Especially at Veoh, there was a lot of additional machinery
> required to handle the related problem of anti-flooding.  That was done at
> the UI level as well.
>
> On Mon, Aug 23, 2010 at 8:16 PM, Sean Owen <[email protected]> wrote:
>
>   
>> (Uncanny: just minutes earlier I was researching Grooveshark for
>> unrelated reasons... Good to hear from any company that does
>> recommendations and is willing to talk about it. I know of a number
>> that can't or won't, unfortunately.)
>>
>> Yeah, sounds like we're all on the same page. One key point in what I
>> think everyone is talking about is that this is not simply removing
>> items *after* recommendations are computed. This risks removing most
>> or all recommended items. It needs to be done during the process of
>> selecting recommendations.
>>
>> But beyond that, it's a simple idea and just a question of
>> implementation. It's "Rescorer" in the non-Hadoop code, which does
>> more than provide a way to remove items: it can generally rearrange
>> recommendations according to some logic. I think it's likely easy and
>> useful to imitate this with a simple optional Mapper/Reducer phase in
>> this nascent "RecommenderJob" pipeline that Sebastian is now helping
>> expand into something more configurable and general purpose.
>>
>> Sean
>>
>> On Mon, Aug 23, 2010 at 8:25 PM, Chris Bates
>> <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I'm new to this forum and haven't seen the code you are talking about, so
>>> take this with a grain of salt.  The way we handle "banned items" at
>>> Grooveshark is to post-process the itemID pairs in Hive.  If a user dislikes
>>> a recommended song/artist, an item pair is stored in HDFS and then when the
>>> recs are computed, those banned user-item pairs are taken into account.
>>> Here is an example query:
>>>
>>> SELECT DISTINCT st.uid, st.simuid, IF(b.uid=st.uid,1,0) as banned  FROM
>>> streams_u2u st LEFT OUTER JOIN bannedsimusers b ON (b.simuid=st.simuid);
>>>
>>> That query will print out a 1 or a 0 if the recommended item pair is banned
>>> or not.  Hive also supports case statements (I think), so you can make a
>>> range of "banned-ness" I guess.  Just another solution to the "dislike"
>>> problem.
>>>
>>> Chris
