A velocity measure of sorts makes a lot of sense for a “what’s hot” list.

The particular thing I’m looking at now is how to rank a list of items by
some measure of popularity when you don’t have a velocity. There is an
introduction date though, so another way to look at popularity might be to
decay it with something like e^-t, where t is the item’s age. You can see
the decay in the views histogram.
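
A minimal sketch of what that decayed ranking might look like (the half-life
constant and the view_count/introduced_at fields are assumptions for
illustration, not anything from our data model):

    import math
    import time

    HALF_LIFE_DAYS = 30.0                    # assumed tuning knob
    LAMBDA = math.log(2) / HALF_LIFE_DAYS    # decay rate for e^-(lambda * t)

    def decayed_popularity(view_count, introduced_at, now=None):
        """Rank score: raw views discounted by exp(-lambda * age_in_days)."""
        now = now if now is not None else time.time()
        age_days = (now - introduced_at) / 86400.0
        return view_count * math.exp(-LAMBDA * age_days)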

On Feb 6, 2014, at 4:35 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

Rising popularity is often a better match to what people want to see on a
"most popular" page.

The best measure for that in my experience is log((new_count + offset) /
(old_count + offset)), where new and old counts are the number of views
during the periods in question and offset is used partly to avoid log(0) or
x/0 problems, but also to give a Bayesian grounding to the measure.
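
(A quick sketch of that score; the offset value below is an arbitrary choice,
not a recommendation:)

    import math

    def rising_score(new_count, old_count, offset=10.0):
        """log((new + offset) / (old + offset)); > 0 means rising."""
        return math.log((new_count + offset) / (old_count + offset))

    # An item jumping from 5 to 50 views outranks one steady at 1000:
    # rising_score(50, 5)      -> ~1.39
    # rising_score(1000, 1000) ->  0.0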




On Thu, Feb 6, 2014 at 5:33 PM, Sean Owen <sro...@gmail.com> wrote:

> Agree - I thought by asking for most popular you meant to look for apple
> pie.
> 
> Agree with you and Ted that the sum of similarity says something
> interesting even if it is not popularity exactly.
> On Feb 6, 2014 11:16 AM, "Pat Ferrel" <p...@occamsmachete.com> wrote:
> 
>> The problem with the usual preference count is that big hit items can be
>> overwhelmingly popular. If you want to know which ones the most people saw
>> and are likely to have an opinion about then this seems a good measure. But
>> these hugely popular items may not differentiate taste.
>> 
>> So we calculate the “important” taste indicators with LLR. The benefit of
>> the similarity matrix is that it attempts to model the “important”
>> cooccurrences.
>> 
>> There is an effect with hugely popular items where they really say nothing
>> about similarity of taste. Everyone likes motherhood and Apple pie so it
>> doesn’t say much about us if we both do too. This is usually accounted for
>> with something like TFIDF, so I suppose another weighted popularity measure
>> would be to run the preference matrix through TFIDF to de-weight
>> non-differentiating preferences.
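>> 
>> (A toy illustration of that TFIDF re-weighting, not Mahout code; the small
>> boolean preference matrix is made up:)
>> 
>>     import numpy as np
>> 
>>     # rows = users, cols = items; 1 = user expressed a preference
>>     prefs = np.array([[1, 1, 0],
>>                       [1, 0, 1],
>>                       [1, 1, 0]], dtype=float)
>> 
>>     n_users = prefs.shape[0]
>>     df = prefs.sum(axis=0)                      # users per item ("document frequency")
>>     idf = np.log((n_users + 1) / (df + 1)) + 1  # smoothed IDF: very popular items get low weight
>>     weighted = prefs * idf                      # de-weighted preference matrix
>>     popularity = weighted.sum(axis=0)           # one possible weighted popularity per item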
>> 
>> On Feb 6, 2014, at 7:14 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>> 
>> If you look at the indicator matrix (cooccurrence reduced by LLR), you will
>> usually have asymmetry due to limitations on the number of indicators per
>> row.
>> 
>> This will give you some interesting results when you look at the column
>> sums.  I wouldn't call it popularity, but it is an interesting measure.
>> 
>> 
>> 
>> On Thu, Feb 6, 2014 at 2:15 PM, Sean Owen <sro...@gmail.com> wrote:
>> 
>>> I have always defined popularity as just the number of ratings/prefs,
>>> yes. You could rank on some kind of 'net promoter score' -- good
>>> ratings minus bad ratings -- though that becomes more like 'most
>>> liked'.
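>>> 
>>> (A tiny sketch of that 'good minus bad' ranking; the 4-star threshold for
>>> a "good" rating on a 1-5 scale is just an assumption:)
>>> 
>>>     from collections import defaultdict
>>> 
>>>     def net_promoter_rank(ratings):
>>>         """ratings: iterable of (item_id, rating) pairs on a 1-5 scale."""
>>>         score = defaultdict(int)
>>>         for item, r in ratings:
>>>             score[item] += 1 if r >= 4 else -1   # good ratings minus bad ratings
>>>         return sorted(score, key=score.get, reverse=True)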
>>> 
>>> How do you get popularity from similarity -- similarity to what?
>>> Ranking by sum of similarities seems more like a measure of how much
>>> the item is the 'centroid' of all items. Not necessarily most popular
>>> but 'least eccentric'.
>>> 
>>> 
>>> On Thu, Feb 6, 2014 at 7:41 AM, Tevfik Aytekin <tevfik.ayte...@gmail.com>
>>> wrote:
>>>> Well, I think what you are suggesting is to define popularity as being
>>>> similar to other items. So in this way most popular items will be
>>>> those which are most similar to all other items, like the centroids in
>>>> K-means.
>>>> 
>>>> I would first check the correlation between this definition and the
>>>> standard one (that is, the definition of popularity as having the
>>>> highest number of ratings). But my intuition is that they are
>>>> different things. For example, an item might lie at the center in the
>>>> similarity space but it might not be a popular item. However, there
>>>> might still be some correlation; it would be interesting to check.
>>>> 
>>>> hope it helps
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Wed, Feb 5, 2014 at 3:27 AM, Pat Ferrel <p...@occamsmachete.com> wrote:
>>>>> Trying to come up with a relative measure of popularity for items in a
>>>>> recommender. Something that could be used to rank items.
>>>>> 
>>>>> The user-item preference matrix would be the obvious thought. Just add
>>>>> the number of preferences per item. Maybe transpose the preference
>>>>> matrix (the temp DRM created by the recommender), then for each row
>>>>> vector (now that a row = item) grab the number of non-zero preferences.
>>>>> This corresponds to the number of preferences, and would give one
>>>>> measure of popularity. In the case where the items are not boolean
>>>>> you'd sum the weights.
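>>>>> 
>>>>> (A toy version of that count, with a dense matrix standing in for the
>>>>> DRM; the numbers are made up:)
>>>>> 
>>>>>     import numpy as np
>>>>> 
>>>>>     # rows = users, cols = items; non-zero = preference (weight if not boolean)
>>>>>     prefs = np.array([[1, 0, 2],
>>>>>                       [3, 1, 0],
>>>>>                       [1, 1, 0]], dtype=float)
>>>>> 
>>>>>     count_popularity = (prefs != 0).sum(axis=0)   # number of preferences per item
>>>>>     weight_popularity = prefs.sum(axis=0)         # or sum the weights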
>>>>> 
>>>>> However it might be a better idea to look at the item-item similarity
>>>>> matrix. It doesn't need to be transposed and contains the "important"
>>>>> similarities--as calculated by LLR for example. Here similarity means
>>>>> similarity in which users preferred an item. So summing the non-zero
>>>>> weights would give perhaps an even better relative "popularity" measure.
>>>>> For the same reason clustering the similarity matrix would yield
>>>>> "important" clusters.
>>>>> 
>>>>> Anyone have intuition about this?
>>>>> 
>>>>> I started to think about this because transposing the user-item matrix
>>>>> seems to yield a format that cannot be sent directly into clustering.
>>> 
>> 
>> 
> 
