Two things will help in addition to what Josh suggested:

a) when looking for items that are trending hot, use the change in log
rank from one period to the next as a score.  For most internetly things,
rank is roughly proportional to 1/rate, so log rank is -log rate up to a
constant.  Refining this slightly to -log (epsilon + 1/rank) makes things a
little less jumpy, because it caps the value for items far down in the tail.
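
A rough sketch of (a) in Python, in case it helps.  The function names, the
epsilon value of 0.001 and the choice to treat an item missing from the
earlier period as having infinite rank are all my own assumptions for
illustration, not anything Mahout provides.

```python
import math

def smoothed_log_rank(rank, epsilon=0.001):
    """Return -log(epsilon + 1/rank) for a 1-based rank (1 = most popular).

    An infinite rank gives -log(epsilon), which is the cap on this value.
    """
    return -math.log(epsilon + 1.0 / rank)

def ranks_from_counts(counts):
    """Turn {item: count} into {item: rank}, with rank 1 for the largest count."""
    ordered = sorted(counts, key=lambda item: (-counts[item], item))
    return {item: position + 1 for position, item in enumerate(ordered)}

def trend_scores(old_counts, new_counts, epsilon=0.001):
    """Score each item in the new period by how far its smoothed log rank moved.

    Positive scores mean the item climbed toward rank 1.  Items absent from
    the old period are assumed to have infinite rank there (an assumption).
    """
    old_ranks = ranks_from_counts(old_counts)
    new_ranks = ranks_from_counts(new_counts)
    scores = {}
    for item, new_rank in new_ranks.items():
        old_rank = old_ranks.get(item, float("inf"))
        scores[item] = (smoothed_log_rank(old_rank, epsilon)
                        - smoothed_log_rank(new_rank, epsilon))
    return scores

# Hypothetical daily search counts.
yesterday = {"ipad": 900, "kindle": 500, "lego": 20}
today = {"ipad": 950, "lego": 700, "kindle": 400, "tron": 300}
scores = trend_scores(yesterday, today)
hottest = sorted(scores, key=scores.get, reverse=True)
```

Sorting by score descending puts the fastest climbers first.  A larger
epsilon flattens the tail more, at the cost of ignoring real movement among
rare items.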

b) use various forms of coalescence.  If you are trending queries, normalize
the queries by sorting terms.  If you have a category handy, try that.
Always invent a display name, of course.  Usually, I just use the most
common input that maps to a coalesced group.
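
A minimal sketch of the query case in (b), again with made-up names.
Lowercasing before sorting the terms is an extra assumption on my part; the
display name is just the most common raw input in each coalesced group.

```python
from collections import Counter, defaultdict

def normalize_query(query):
    """Coalescing key: lowercase the query (an assumption) and sort its terms."""
    return " ".join(sorted(query.lower().split()))

def coalesce_queries(queries):
    """Group raw queries by normalized key.

    Returns {key: (display_name, total_count)}, where display_name is the
    most common raw input that maps to the key.
    """
    groups = defaultdict(Counter)
    for query in queries:
        groups[normalize_query(query)][query] += 1
    result = {}
    for key, variants in groups.items():
        display_name, _ = variants.most_common(1)[0]
        result[key] = (display_name, sum(variants.values()))
    return result

# Hypothetical raw searches for one day.
raw = ["red shoes", "shoes red", "red shoes", "Red Shoes", "ipad 2"]
groups = coalesce_queries(raw)
```

The total counts per group are what you would rank and feed into the trend
score, and the display names are what you would show.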

Item (b) may involve clustering or it may not.  Depends on the data you have
and the exact results you want.

On Sat, Jun 18, 2011 at 7:52 PM, Mark <[email protected]> wrote:

> Sorry if this isn't the right place to ask but how would I go about finding
> trending data over a certain period of time.
>
> For example: http://www.ebay.com has a section "Trends on eBay" that is
> updated daily. I was wondering how this can be accomplished using Mahout (if
> possible)
>
> For input I have:
>    - user searches by day
>    - titles of products purchased by day
>
> Would this require some sort of clustering? classification?
>
> Thanks in advance
>
