[google-appengine] Help with an accurate counter with no contention...?

2009-11-28 Thread peterk
Hey all,

I've been looking at the Task Queue API and counter example. In my
app, each user will have a couple of counters maintained for them,
counting various things.

Thing is, these counters need to be accurate. So I'm not sure if the
example given for the Task Queue API using memcache would be
appropriate for me - it would not be good, really, if my counters were
to be inaccurate. My users would expect accurate counts here.

So I was thinking about a sort of modified version whereby each change
to the counter would be stored in the DS in its own entity. E.g. an
entity called 'counter_delta' or some such, which holds the delta to
apply to a counter, and the key to the counter that the delta is to be
applied to.

Then, using the Task Queue I guess I could hoover up all these delta
entities, aggregate them, and apply them in one go (or in batches) to
the counter. And then delete the delta entries.

Thus the task queue is the only thing accessing the counter entity,
and it does so in a controllable fashion - so no real contention. Each
change to the counter gets written to the store in its own
counter_delta entity, so no contention there either. And because the
deltas are stored in DS and not in memcache, it should be much more
reliable.
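The aggregation step that task would perform can be sketched in plain Python (the datastore fetch/put/delete calls are omitted, and the counter key names here are purely illustrative):

```python
from collections import defaultdict

def aggregate_deltas(deltas):
    """Collapse a batch of (counter_key, delta) pairs -- one pair per
    counter_delta entity -- into a single net adjustment per counter,
    so each counter entity needs only one write per task run."""
    totals = defaultdict(int)
    for counter_key, delta in deltas:
        totals[counter_key] += delta
    return dict(totals)

# Three pending counter_delta entities touching two counters:
pending = [("user1-posts", 1), ("user1-posts", 1), ("user1-likes", -1)]
print(aggregate_deltas(pending))  # {'user1-posts': 2, 'user1-likes': -1}
```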

However, I'm not entirely sure how I should actually go about
implementing this, or specifically, the task queue end of things.

I'm thinking if there is a change to a counter to be made, I should
check if there's a task already running for this counter, and if so,
not insert a new task, and let the currently running task take
care of it. If there is no running task for this counter, I would
instead create one, and set it to run in - say - 60 seconds, allowing
time for further deltas for this counter to accumulate so the task can
take care of more than just one delta. This would mean the counter
might be inaccurate for up to 60 seconds, but I can live with that.

But what I'm wondering is, how can I implement this 'don't insert a
new task if one for this counter is already in the queue or running'
behaviour?

I was thinking initially that I could give the task a name based on
the counter, so that only one such task can exist at any one time.
However, I believe we have no control over when that name is freed up
- it isn't necessarily released when the task ends; names can stay
reserved for up to 7 days (?). So that wouldn't work. If a name could
be freed up as soon as a task actually finished, then this could
work, I think.
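One possible way around the tombstone problem (a common workaround, not something from the docs) is to fold a coarse time window into the task name: within a window the name collides, so duplicate adds are rejected, and the next window yields a brand-new name that was never tombstoned. A sketch, with an illustrative name format (task names may only contain letters, digits, hyphens and underscores):

```python
def counter_task_name(counter_key, now, window_secs=60):
    """Name an aggregation task uniquely per counter per time window.
    The task queue rejects a second add() with the same name, so at
    most one task per counter is enqueued per window; the next window's
    name is fresh, so the 7-day tombstone never blocks future work."""
    window = int(now // window_secs)
    return "agg-%s-%d" % (counter_key, window)

# Two adds in the same 60s window collide (so the task is deduplicated)...
print(counter_task_name("user1-posts", 960) == counter_task_name("user1-posts", 1019))  # True
# ...but the next window gets a never-before-used name.
print(counter_task_name("user1-posts", 960) == counter_task_name("user1-posts", 1020))  # False
```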

I was thinking also I could store a flag so that when a counter_delta
is created, I'd look to see if a flag for this counter was present,
and if so, do nothing. If not, create the task, and create the flag.
Then when the task was all done and didn't see any more
counter_deltas, it'd delete the flag. But I'm worried that there could
be race conditions here, and some deltas might get overlooked as a
result? And if I were to use transactions on such a flag, would I not
fall into the same contention trap I'm trying to avoid in the first
place?
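For what it's worth, memcache.add() in the App Engine API only stores a value if the key is absent, which gives an atomic test-and-set for exactly this kind of flag without touching the datastore - though a memcache eviction can silently drop the flag, so it narrows the race rather than eliminating it. A dict-backed sketch of those semantics:

```python
def try_acquire(cache, key):
    """memcache.add()-style semantics: store the key only if it is
    absent, and tell the caller whether it won the race.  Only the
    winning request should enqueue the aggregation task."""
    if key in cache:
        return False
    cache[key] = True
    return True

cache = {}
print(try_acquire(cache, "agg-pending-user1"))  # True: enqueue the task
print(try_acquire(cache, "agg-pending-user1"))  # False: task already pending
```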

Help? :| Thanks for any advice/insight...

--

You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appeng...@googlegroups.com.
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en.




Re: [google-appengine] Help with an accurate counter with no contention...?

2009-11-28 Thread Eli Jones
I think there would have to be some divergence between what the counter
should be and what the user will actually see at any given time.. since if
you have a high rate of counts happening for a user.. you'd get contention
when trying to update the count each time the event being counted
happened.

Of course, you know that part.. since they have all these sharding examples.

So, you gotta decide how stale the count can be...

Once you decide that, since you don't seem to want any potential loss of
counts... you'd probably need two Models to do counting for each user.
(memcache would be out since that allows potential lost counts)

One for each individual count inserted (call it UserCounter) and one for the
count that the user sees (UserTotalCount).

So, if a count-event happens you insert a new row into UserCounter.

Then you should have a task that selects __key__ from UserCounter,
counts how many entities were returned, updates the UserTotalCount model
with the additional counts, and once that update is successful, deletes
the entities for the counts it selected.  AND then, once all of
that is successful, have the task essentially schedule itself to run again
in N seconds or however long you've decided to give it.

Presumably, doing it this way would allow you to make sure that the
counter-update task runs one at a time for each user (since it can only
start itself up again after it is done counting).. and you would avoid write
contention since the task is the only thing updating the user's counter.
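The loop above - count the pending events, fold them into the total, delete exactly what was counted, reschedule - maintains a simple invariant: every event is counted once, and only the task ever writes the total. A plain-Python sketch of that invariant, with the datastore and task queue calls stripped out:

```python
def drain(pending, total):
    """One run of the aggregation task: count the pending UserCounter
    events, fold them into the stored total, then delete exactly the
    events that were counted."""
    counted = len(pending)
    return total + counted, pending[counted:]

# Events keep arriving between runs; each run picks up the backlog.
pending, total = [], 0
for batch in (["e1", "e2", "e3"], ["e4"], [], ["e5", "e6"]):
    pending = pending + batch               # inserts done by user requests
    total, pending = drain(pending, total)  # the rescheduled task fires

print(total)    # 6 -- no event lost, no event double-counted
print(pending)  # []
```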

Probably, you could set up two Expando models to do this for all users.. and
each time a new user was created, you'd add a new Column to the Expando
models for that user.

so, you'd have these initial definitions:

class UserCounter(db.Expando):
    BobCountEvent = db.BooleanProperty(required=True)

class UserTotalCount(db.Expando):
    BobTotalCount = db.IntegerProperty(required=True)


Then, each time user Bob has a count event you do:

bobCount = UserCounter(BobCountEvent = True)
bobCount.put()

And when you want to update Bob's Total Count, you do something like
this (the pseudo-SQL from before, written out with GqlQuery):

results = db.GqlQuery("SELECT __key__ FROM UserCounter "
                      "WHERE BobCountEvent = TRUE").fetch(1000)
if results:
    totals = db.GqlQuery("SELECT * FROM UserTotalCount "
                         "WHERE BobTotalCount >= 0").fetch(1)
    if totals:
        totals[0].BobTotalCount += len(results)
        totals[0].put()
    else:
        UserTotalCount(BobTotalCount=len(results)).put()
    db.delete(results)

Now, you might wonder... how do I do puts for variable user names? (You
can't just create new put functions for each new user)..  In Python, you
can use exec to do that..
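An alternative to exec: since Expando properties are just attributes, getattr/setattr can read and write a property whose name is built at runtime. A sketch using a stand-in class (not a real db.Expando, just something with dynamic attributes):

```python
class Expandoish(object):
    """Stand-in for a db.Expando entity: attributes can be set
    dynamically, one per user, e.g. BobTotalCount, AliceTotalCount."""
    pass

def bump_total(entity, username, n):
    """Increment <username>TotalCount on the entity, creating it at 0
    if this is the user's first count."""
    prop = "%sTotalCount" % username
    setattr(entity, prop, getattr(entity, prop, 0) + n)

e = Expandoish()
bump_total(e, "Bob", 3)
bump_total(e, "Bob", 2)
print(e.BobTotalCount)  # 5
```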

I have not tested how any of this performs... having an expando model may
hurt performance.. but, I don't think so, and I know the method works for
other things (not sure how it'd do on this counter method).

See here for Google's sharded counts example:
http://code.google.com/appengine/articles/sharding_counters.html


Re: [google-appengine] Help with an accurate counter with no contention...?

2009-11-28 Thread Eli Jones
To be fair, this method I described may be overkill.

I developed it when I was thinking about how to lighten up insert costs to
the datastore.

I figured that, if one could store some of the relevant information in the
column name (specifically, string info like "whose count is this?"), that
would reduce the total size of the entity.. and thus speed up writes.

It was suggested that the performance wouldn't be much different than just
having models like this:

class UserCounter(db.Model):
  Username = db.StringProperty(required = True)

class UserTotalCount(db.Model):
  Username = db.StringProperty(required = True)
  Count = db.IntegerProperty(required = True)

Then, you'd just
SELECT __key__ FROM UserCounter WHERE Username = 'Bob'
and
SELECT * FROM UserTotalCount WHERE Username = 'Bob'
to do your counting and updating..
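The two queries above are just equality matches on the Username column (in the db API they'd be UserCounter.all().filter('Username =', 'Bob') and the like). A trivial stand-in for the keys-only select:

```python
def pending_keys_for(events, username):
    """Stand-in for: SELECT __key__ FROM UserCounter WHERE Username = 'Bob'.
    `events` maps entity key -> Username property value."""
    return [key for key, user in events.items() if user == username]

events = {101: "Bob", 102: "Alice", 103: "Bob"}
print(sorted(pending_keys_for(events, "Bob")))  # [101, 103]
```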

Though, my intuition is that doing it this way would take more processing
power (and maybe lead to some contention) since you're inserting
StringProperties into one big column when putting UserCounter events.

Here is the initial thread covering what I was trying to figure out:
"Expando and Index Partitioning"



Re: [google-appengine] Help with an accurate counter with no contention...?

2009-11-29 Thread Eli Jones
I think that, since you require exactness on the counters, your
primary option is the sharded counter option without memcache.

You could/should modify the code so that, if you get two timeouts in
a row when trying to update a counter, it just creates a new
shard for the user.  This way, "busy" users would get more shards than
non-busy users.

Now, I am presuming that you will not end up with more than 1,000
counter shards per user (I don't see how one user could have so much
going on every second that that many shards would be needed).  If
that happens, then, like you say, you'll have to page through the
shards.

When it comes to counters, the examples usually discuss them in the
context of counting total visits to a site or something like that..
The task queue method, from my recollection, is used to aggregate
memcached counters on a timed interval.

The sub-case of having a sharded counter per user should work fine, I think.

If you create your counter so it creates a new shard for a user after
2 or 3 timeouts on a write, you should be fine (in my estimation).  In
that case, you are guaranteed to get all counts written to the
datastore (barring code or system failure).
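A plain-Python sketch of that growth policy. The real version would wrap each shard write in a transaction and catch db.TransactionFailedError or a timeout, as in Google's sharding-counters article; here a `timeouts` argument simulates how much contention the write hit:

```python
import random

class AdaptiveShardedCounter(object):
    """Sharded counter that grows itself: two consecutive timeouts on a
    write add a fresh shard and the count is retried there, so busy
    users accumulate shards and no count is ever dropped."""
    def __init__(self):
        self.shards = [0]  # one slot per shard entity in the datastore

    def increment(self, timeouts=0):
        if timeouts >= 2:          # contention twice in a row:
            self.shards.append(0)  # give this user another shard
        shard = random.randrange(len(self.shards))
        self.shards[shard] += 1

    def total(self):
        return sum(self.shards)   # aggregate across shards on read

c = AdaptiveShardedCounter()
for t in (0, 0, 2, 0, 3):  # two of the five increments hit repeated timeouts
    c.increment(timeouts=t)

print(c.total())      # 5 -- every count landed
print(len(c.shards))  # 3 -- the shard pool grew under contention
```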




On 11/29/09, peterk  wrote:
> I am slightly leaning toward going back to good old sharded
> counters...
>
> The reason I was looking for alternatives was because of some things
> Brett Slatkin said in his Google IO presentation on task queues - that
> really sharded counters were just a band-aid at the time, and that
> they still have problems with time-outs.
>
> But I guess as long as you keep enough shards for a given counter, it
> shouldn't time out? We can presumably catch contention warnings and
> within a number of retries create a new shard if necessary for an
> update to write to so we never lose one.
>
> I do agree with Brett that it's a bit heavy-handed, but if it
> guarantees accuracy...
>
> As far as I see it, the options seem to be this:
>
> Write-Behind Cache:
> + Control over DB write rate - 100 counter updates can be one DB write
> with this etc.
> + No redundancy in counter DB storage
> + Lower latency in the update request
> - Busy counters could keep a task in the queue constantly. Lots of
> busy counters (e.g. one for every user) = lots of task entries =
> problem with daily limit and lagging db update due to 10 task/sec
> invocation limit (?)
> - Lots of busy counters = thrashing of the cache and higher potential
> for a counter to disappear from cache before its saved (?)
>
> Sharded Counters:
> + No impact on task queue quota
> + More reliable, no dependency on a cache
> + Can maybe scale better with larger number of counters since you're
> not dealing with a limited cache size and task quota (?)
> + Can still deal with contention...
> -...as long as you manage the number of shards. Contention (and thus a
> dropped update) possible if number of shards is too low for a counter
> - More storage space required (n entities for n shards vs one counter
> entity)
> - DB write on every update; more latency in the request
> - Problem with reading counter total if number of shards goes beyond
> 1000? Thus limited to 1000 shards = ~5000 counter updates/sec? Maybe
> this could be overcome during such busy times with a single
> total_count entity that aggregates all shards every now and then via a
> task (i.e. a task to read the shard values, up to n shards >1000, and
> write the aggregate to a single total_count entity). Count total would
> lag for a bit, but probably OK if it was being updated this much.
>
> The examples we were thinking of where we write deltas to the
> datastore would be more reliable than the memcache approach I think,
> but would still suffer from the problem of using an awful lot of tasks
> with many counters (and thus also being limited to updating ten
> counters per second because of the 10 task/sec run limit on the
> queues?). If you had a hundred thousand active counters, that means
> potentially tens of thousands of counters with tasks in queues at a
> given time, which means a counter could be waiting...maybe half an
> hour or more between updates... (if you have 2 counters being
> updated within a given time - feasible with a large active site with
> counters per user).
>
> Any thoughts? Agreement, disagreement? I'm thinking sharded counters
> with careful safeguards for concurrency exceptions may be the way to
> go, but I'm wondering if there's any more monsters waiting down the
> line with that choice...
>
> On Nov 29, 1:37 am, Eli Jones  wrote:
>> oh, and duh.. the first part of that article does the sharded counting..
>> using transactions without memcached.
>>
>> So, presumably, it should do exactly what you want.
>>
>> you just need to modify that to allow counting for an arbitrary number of
>> users.
>>
>> On Sat, Nov 28, 2009 at 8:30 PM, Eli Jones  wrote:
>> > Though, let me re-iterate... all the round-about stuff I'm talking about
>> > is
>> > something to consider if