One way to deal with that is to build a model that predicts the ultimate number
of views/plays/purchases for the item based on history so far.
If this model can be made Bayesian enough to sample from the posterior
distribution of total popularity, then you can use the Thompson sampling trick
a
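A minimal sketch of that idea, under assumptions that are not in the message: views are modeled as Poisson with a Gamma prior on each item's daily rate, so the posterior is Gamma(prior_shape + views, prior_rate + days). The function name, the prior values, and the (views, days) input format are all hypothetical. Each call draws one sample per item and ranks by the draw, so items with little history get shuffled upward now and then.

```python
import random

def thompson_rank(items, prior_shape=1.0, prior_rate=1.0, rng=None):
    """Rank items by one posterior draw of their view rate.

    items maps name -> (views_so_far, days_observed).
    Assumed model (not from the thread): Poisson views with a Gamma prior.
    """
    rng = rng or random.Random()
    draws = {}
    for name, (views, days) in items.items():
        shape = prior_shape + views
        rate = prior_rate + days
        # random.gammavariate takes a scale parameter, so pass 1 / rate.
        draws[name] = rng.gammavariate(shape, 1.0 / rate)
    return sorted(draws, key=draws.get, reverse=True)

history = {"old_hit": (10000, 10), "new_item": (3, 1), "dud": (1, 10)}
ranking = thompson_rank(history, rng=random.Random(0))
```

Repeated calls give different orderings for items whose posteriors overlap, which is the point of the trick.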
A velocity measure of sorts makes a lot of sense for a “what’s hot” list.
The particular thing I’m looking at now is how to rank a list of items by some
measure of popularity when you don’t have a velocity. There is an introduction
date, though, so another way to look at popularity might be to de
Oops... Didn't see Ted's response before I replied.
Sent from my iPhone
> On Feb 6, 2014, at 7:31 PM, Ted Dunning wrote:
>
> OK. Cool.
>
> That probably means that the problem is much smaller and more likely to be
> logistics. Your suggestion of an off-by-one issue is quite plausible.
>
>
Sent from my iPhone
> On Feb 6, 2014, at 10:08 AM, Ted Dunning wrote:
>
> I can't comment on the specific question that you ask, but it should not
> necessarily be expected that LDA will reconstruct the categories that you
> have in mind. It will develop categories that explain the data as we
Rising popularity is often a better match to what people want to see on a
"most popular" page.
The best measure for that in my experience is
log((new_count + offset) / (old_count + offset)), where new and old counts
are the number of views during the periods in question and offset is used
partly to
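The rising-popularity score described above, read as the log of the ratio of the two offset counts, could be sketched as follows. The function name and the default offset of 10 are assumptions for illustration. One effect of the offset that is easy to check is that the score stays finite when old_count is zero, and that small counts move the score less than large ones.

```python
import math

def trend_score(new_count, old_count, offset=10.0):
    """Log ratio of recent views to earlier views, both shifted by offset."""
    return math.log((new_count + offset) / (old_count + offset))

flat = trend_score(50, 50)      # no change between periods
rising = trend_score(100, 10)   # clear growth on real volume
noisy = trend_score(2, 0)       # tiny counts, damped by the offset
```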
OK. Cool.
That probably means that the problem is much smaller and more likely to be
logistics. Your suggestion of an off-by-one issue is quite plausible.
On Thu, Feb 6, 2014 at 4:46 PM, Stamatis Rapanakis
wrote:
> That is correct. My problem is not the categories developed (which are
> meaningfu
Agree - I thought by asking for most popular you meant to look for apple
pie.
Agree with you and Ted that the sum of similarity says something
interesting even if it is not popularity exactly.
On Feb 6, 2014 11:16 AM, "Pat Ferrel" wrote:
> The problem with the usual preference count is that big
The problem with the usual preference count is that big hit items can be
overwhelmingly popular. If you want to know which ones the most people saw and
are likely to have an opinion about then this seems a good measure. But these
hugely popular items may not differentiate taste.
So we calculate
That is correct. My problem is not the categories developed (which are
meaningful by the way) but the fact that a certain document is not assigned
to the proper (LDA generated) category. The document to topics assignment
is really bad...
On Thu, Feb 6, 2014 at 5:08 PM, Ted Dunning wrote:
> I ca
If you look at the indicator matrix (cooccurrence reduced by LLR), you will
usually have asymmetry due to limitations on the number of indicators per
row.
This will give you some interesting results when you look at the column
sums. I wouldn't call it popularity, but it is an interesting measure.
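As a rough illustration of the column-sum idea, assuming the LLR scores are already computed and held in a plain dict of dicts (a hypothetical layout, not Mahout's), the sketch below keeps only the top few indicators per row and then counts how often each item survives as an indicator for other items. The per-row limit is what makes the result asymmetric.

```python
from collections import Counter

def column_sums(llr_scores, max_per_row=2):
    """Count how many rows keep each item among their top indicators."""
    sums = Counter()
    for row, scores in llr_scores.items():
        # Keep only the strongest indicators for this row.
        top = sorted(scores, key=scores.get, reverse=True)[:max_per_row]
        for col in top:
            sums[col] += 1
    return sums

# Symmetric toy scores; the truncation breaks the symmetry.
scores = {
    "a": {"b": 5.0, "c": 4.0, "d": 1.0},
    "b": {"a": 5.0, "c": 3.0, "d": 2.0},
    "c": {"a": 4.0, "b": 3.0, "d": 0.5},
    "d": {"a": 1.0, "b": 2.0, "c": 0.5},
}
sums = column_sums(scores)
```

Here "d" keeps "a" and "b" as indicators, but no row keeps "d", so its column sum is zero.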
I can't comment on the specific question that you ask, but it should not
necessarily be expected that LDA will reconstruct the categories that you
have in mind. It will develop categories that explain the data as well as
it can, but that won't necessarily match the categories you intend.
It is li
I have always defined popularity as just the number of ratings/prefs,
yes. You could rank on some kind of 'net promoter score' -- good
ratings minus bad ratings -- though that becomes more like 'most
liked'.
How do you get popularity from similarity -- similarity to what?
Ranking by sum of similar
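The two definitions at the top of this message, number of ratings versus good ratings minus bad ratings, can be compared with a small sketch. The (item, rating) tuple format and the threshold separating good from bad ratings are assumptions made for the example.

```python
def popularity_rankings(ratings, like_threshold=3):
    """Return (ranking by rating count, ranking by good minus bad ratings)."""
    counts, net = {}, {}
    for item, value in ratings:
        counts[item] = counts.get(item, 0) + 1
        # Assumed convention: ratings at or above the threshold are "good".
        net[item] = net.get(item, 0) + (1 if value >= like_threshold else -1)
    by_count = sorted(counts, key=lambda i: (-counts[i], i))
    by_net = sorted(net, key=lambda i: (-net[i], i))
    return by_count, by_net

ratings = [("x", 5), ("x", 1), ("x", 1), ("x", 1), ("y", 5), ("y", 4)]
by_count, by_net = popularity_rankings(ratings)
```

Item "x" leads on count but trails on net score, which is the "most popular" versus "most liked" distinction.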
Yeah, that's the version that's bundled with 4.x. 5.x has basically 0.8
plus patches to work on MR2.
Mahout is not really something you have to install, even though it
does get packaged and dumped onto the cluster nodes. Just use it
against your cluster -- it can be from a machine that isn't part o
Well, I think what you are suggesting is to define popularity as being
similar to other items. So in this way most popular items will be
those which are most similar to all other items, like the centroids in
K-means.
I would first check the correlation between this definition and the
standard one
Hi everyone,
Is there a simple way to install Mahout 0.9 on a cluster running Cloudera's
CDH 4.5?
When I try what they advise in their docs (yum install mahout on my CentOS 6
node), it wants to install mahout version 0.7+22-1.cdh4.5.0.p0.14.el6.
Thanks in advance!
--
Kévin Moulart
GSM France