This is a brain dump (LONG) as I work through this problem.  Hope it
helps others or others can help me by correcting mistakes or
suggesting alternatives.

First, look at this article on Digg-like functionality:
http://code.google.com/appengine/articles/overheard.html

That article suggests this formula for a decaying vote ranking in
models.set_vote():

rank = quote.created * DAY_SCALE + quote.votesum

Each new vote causes this rank to be recalculated.  The keys to this
approach are (1) you only compute on write and (2) the passage of time
increases rank values so all untouched (i.e., not recently voted on)
articles don't have to be updated.  In the example, DAY_SCALE = 4 and
quote.created is # of days, so an article a day older would need 4
votes more than a new article to rank the same.

In your case, you want an "ordered list of URLs with most votes in the
last hour."    This means a new vote is worth infinitely more than a
vote more than an hour ago.  (And you want a step function -- all
votes within an hour are treated equally.)  Any solution, to be
manageable, should not require periodic updates to all articles, even
untouched ones.

First, put a property last_vote_time for the most recent vote in the
model.  So we can simply ignore every link that has last_vote_time
more than an hour ago.

Now we're left with all links that received a vote in the last hour,
but rank scores will contain stale votes.  This is where I run into
trouble figuring out how to adapt the previous method to this
problem.  So I'm going to have to think of another approach.

Let's start with a simpler problem where we only use non-overlapping
hour windows, i.e., the rankings get reset every hour on the hour.
In this case, the first vote after an hour boundary gets crossed
resets the rank score.  Rank is just votesum for this hour window.

Now what happens if we use more of these hour windows, but stagger
them across an hour:

votesum_0_min = db.IntegerProperty()
votesum_5_min = ...
votesum_10_min = ...
...
votesum_55_min = ...

In the above, votesum_5_min covers the future hour span that starts 5
minutes into the hour, e.g. 6:05 to 7:05.

On new vote, we look at the current time and the last_vote_time, then
clear every votesum that straddles its boundary.
Example:  Last vote time was 8:16, current time of new vote is 8:33.
We clear votesum_20_min, votesum_25_min, votesum_30_min, and set all
of them to 1 vote for this new vote.  All other votesums += 1.

Let's say we it's now 8:38 pm and we want to see all the links with
most votes for the last hour.  Call our model URLModel.

q = URLModel.gql('ORDER by votesum_40_min DESC')
urls = q.fetch(...)

We order on the window that has been active the longest for the
current time.

Looking at the query above, you notice that we have to remove all
URLModel entities with last_vote_time more than an hour ago, but we
want to order on votesum_40_min.  We could do:

q = URLModel.gql('WHERE last_vote_time > :hour_ago ORDER BY
last_vote_time, votesum_40_min DESC')

but we'll have to do the real votesum_40_min ordering after the fetch.

We could use an equality filter that pretty much removes all entities
with last_vote_time more than an hour ago.  One solution would be to
add a StringListProperty, call it last_vote_hours, that contains
strings corresponding to the last vote's hour in 00-23 format and its
hour + 1.  So a vote at 7:45 pm would store ['19', '20] in
last_vote_hours.

Back to our query at 8:38 pm:

q = URLModel.gql('WHERE last_vote_hours = '20' ORDER by votesum_40_min
DESC')
urls = q.fetch(...)

Then just remove any urls with last_vote_hours more than an hour ago.
I think that may be the most efficient way to do it but look forward
to hearing from others if they've actually read this far :)

-Bill


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to