Why do you say that with memcache I would only write to the datastore once
per second, or every 10 seconds?  That doesn't make sense to me...  If I
only attempt to write 1/10 of my counter increments to the datastore, I
will still be writing a few times per second at least: at 30 QPS, 1/10 of
the increments is still roughly 3 datastore writes per second per counter
(I am building this to expect >30 QPS, and it will likely be more).

Also, my issue/question is not about transaction collisions.  Of
course those are avoidable by using many shards, and adding more
shards until collisions become extremely infrequent.  My issue is
total request/response time.  Attempting to write anything
to the datastore takes time.  It just does.  The CPU cycles may not
count against us (in $ cost), but the time does (or seems to, at
least).  If a request ends up needing to write to multiple counters
(and all of mine do), the total response time quickly climbs into the
many hundreds of milliseconds.  I don't want that; I want it faster.

My testing shows memcache to be fairly reliable.  I am not worried
about the very occasional (I hope) memcache issue.  I am worried about
total time per request.  I believe I can drastically shorten this time
by writing to the datastore less often, and having the counter handle
most increments in memcache alone (something like the sketch below).  I
can't imagine I am the only person to have thought of this?
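
Roughly what I have in mind (just a sketch, untested; the model, key
names, shard count, and flush threshold below are all made up for
illustration, not taken from the posted counter code):

from google.appengine.api import memcache
from google.appengine.ext import db
import random

FLUSH_EVERY = 10   # touch the datastore only once per FLUSH_EVERY increments
NUM_SHARDS = 20    # illustrative shard count

class CounterShard(db.Model):
    # One entity per shard; the true total is the sum over all shards
    # plus whatever is still pending in memcache.
    name = db.StringProperty(required=True)
    count = db.IntegerProperty(default=0)

def increment(name):
    key = 'pending_' + name
    pending = memcache.incr(key)
    if pending is None:
        # Key missing (first use or evicted): create it, then count this hit.
        memcache.add(key, 0)
        pending = memcache.incr(key)
    if pending is None or pending % FLUSH_EVERY != 0:
        # Memcache unavailable (rare, and I can live with the occasional
        # loss) or we haven't finished a full batch yet -- skip the
        # datastore entirely for this request.
        return
    # This request owns exactly one full batch: subtract it from the
    # pending count and write it to one randomly chosen shard.
    memcache.decr(key, delta=FLUSH_EVERY)
    def txn():
        shard_key = '%s_%d' % (name, random.randint(0, NUM_SHARDS - 1))
        shard = CounterShard.get_by_key_name(shard_key)
        if shard is None:
            shard = CounterShard(key_name=shard_key, name=name)
        shard.count += FLUSH_EVERY
        shard.put()
    db.run_in_transaction(txn)

That way only about 1 request in 10 pays for a datastore write, and a
reader gets the current total from sum-of-shards plus whatever is still
pending in memcache.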

  -Josh



On Nov 3, 1:56 pm, yejun <[EMAIL PROTECTED]> wrote:
> The reason not to use a sharded counter together with memcache-cached writes
> is that with memcache you would only write to the datastore once per second,
> or every 10 seconds.  I see no reason why you would need a sharded counter at
> that kind of frequency.
> And when a memcache reset happens, losing 1/10 of a second of data or a full
> second seems like similar reliability to me.
>
> On Nov 3, 4:32 pm, josh l <[EMAIL PROTECTED]> wrote:
>
> > Yejun,
>
> > I've been told that the memcache framework on GAE uses a first-
> > created, first-deleted algorithm, and that we have a finite amount (a few
> > hundred MB, but maybe even up to a GB) of memcache for any specific
> > app.  This means that once you hit your limit, older objects WILL get
> > deleted.  And my app will definitely be going over this max limit.
> > This is not a huge deal to me (my counter probably won't get deleted
> > that often, and it's OK if it's a little bit off), but I figured my
> > counter might as well handle that small case anyway.
>
> > Again, the much bigger issue:  not writing to the datastore each time
> > you want to increment.  And yes, I am aware of why to use a sharded
> > counter.  The point is, what if you have about 50 QPS coming in (so you
> > certainly need a sharded counter with a lot of shards), and every
> > single request writes to ~3 different counters?  Now each request
> > takes a while because of the datastore writes it attempts every
> > time, even with no transaction collisions on shard writes.  And I
> > also believe there is a GAE watchdog checking whether, over a period
> > of time, your average request takes >300 ms.
>
> > So I am simply saying: why not cut down on the total datastore
> > writes, and write to a shard only 1 time in 10, while still getting the
> > correct totals?  That is the reasoning behind my arguments above.  Now
> > you have an order of magnitude fewer datastore writes, and the average
> > response time is way down.  This sounds good to me, and I am sure
> > others who plan to write apps that do several sharded counter
> > increments per request might feel similarly.  Am I missing
> > something obvious here?
>
> >   -Josh
>
> > On Nov 3, 12:54 pm, yejun <[EMAIL PROTECTED]> wrote:
>
> > > I believe the memcached eviction algorithm keeps the most-used objects,
> > > not just the newest by creation time.
> > > The reason you use a sharded counter is that you want every increment
> > > operation to hit the disk.  Otherwise you don't need shards, because you
> > > don't hit the datastore very often if writes are cached in memcache.
> > > Sharding is used here to improve write concurrency.
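
[For reference, the standard shard-per-write increment being described is
roughly the sketch below; the model and names are illustrative, not the
actual posted code.  Every call pays for a datastore transaction, which is
exactly the latency I am trying to avoid.]

import random
from google.appengine.ext import db

NUM_SHARDS = 20  # illustrative shard count

class CounterShard(db.Model):
    name = db.StringProperty(required=True)
    count = db.IntegerProperty(default=0)

def increment(name):
    # Every increment runs a transaction against one randomly chosen
    # shard, so every request pays the datastore write latency.
    def txn():
        shard_key = '%s_%d' % (name, random.randint(0, NUM_SHARDS - 1))
        shard = CounterShard.get_by_key_name(shard_key)
        if shard is None:
            shard = CounterShard(key_name=shard_key, name=name)
        shard.count += 1
        shard.put()
    db.run_in_transaction(txn)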
>
> > > On Nov 3, 3:43 pm, josh l <[EMAIL PROTECTED]> wrote:
>
> > > > Yejun,
>
> > > > Thanks for the updated code example.  Since a counter is such a common
> > > > need, I think it might be helpful if we all worked together on the
> > > > same codebase, rather than forking each time, or at least if we could
> > > > be specific about the changes we made (I know I can do a diff, but if
> > > > you noted what exactly you changed, and why, that would be awesome for
> > > > all future users who are curious).
>
> > > > Moving on, I think I didn't explain myself well regarding the
> > > > destruction of the memcache object.  Imagine an app where
> > > >  1) There will definitely be at least 10 counter requests/sec for the
> > > > same named counter (let's call it the TotalRequests counter, referred
> > > > to by some Stats model)
> > > >  2) Lots and lots of other entities get written to memcache (millions
> > > > of entities in the system, and each gets cached upon initial request)
>
> > > > In this situation, it is guaranteed that objects in our memcache will
> > > > disappear after some use, since we have less total memcache available
> > > > than the size of the items that will be cached over a few days of use.
> > > > Now, which items get removed?  In this case, our counter is the first
> > > > created item in memcache, and definitely one of the first items to be
> > > > nuked from memcache when we hit the max storage limit for our
> > > > app's memcache.  To ensure it never gets nuked for being the
> > > > 'oldest object in memcache', we could 'occasionally' destroy/
> > > > recreate it.  Maybe, for example, I could also put a time_created on
> > > > it, and if it's older than a few hours, nuke/recreate it upon
> > > > resetting it.  I figured we might as well do this every time, but anyway
> > > > hopefully you see my point as to why I was thinking about the need to
> > > > destroy/recreate.
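
[A minimal sketch of the time_created idea above; the key names and age
threshold are hypothetical, not from the posted counter code:]

import time
from google.appengine.api import memcache

MAX_AGE = 4 * 60 * 60  # recreate the counter keys every few hours (arbitrary)

def refresh_if_old(name):
    created = memcache.get('created_' + name)
    if created is None or time.time() - created > MAX_AGE:
        # Re-set both keys so the counter stops being one of the oldest
        # entries in memcache; the pending count is carried over as-is.
        # (The read-then-set is not atomic, but this counter already
        # tolerates small losses.)
        pending = memcache.get('pending_' + name) or 0
        memcache.set_multi({'pending_' + name: pending,
                            'created_' + name: time.time()})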
>
> > > > Much more important than this very occasional miscount from a
> > > > destroyed memcache item, though, is my general idea of not even
> > > > attempting to write to a shard entity until we've accumulated a few
> > > > (10? 50?) counter increments.  I am averaging ~350 ms/request due to
> > > > the time spent writing to the shards (multiple counters per request),
> > > > and this is my main concern with the current code.
>
> > > > I will diff your code (thanks again) and check it out this afternoon.
>
> > > >   -Josh
>
> > > > On Nov 3, 12:22 pm, yejun <[EMAIL PROTECTED]> wrote:
>
> > > > > > To solve this, I'm also
> > > > > > planning to destroy and recreate the memcache object upon successful
> > > > > > datastore write (and associated memcache delay being reset to zero).
>
> > > > > You shouldn't do this.  It completely negates the reason why you
> > > > > use this counter class.
> > > > > If you just need a counter for a local object, you should just save
> > > > > the counter with the object itself.
>
> > > > > This counter class should only be used in situations where the same
> > > > > named counter needs to be incremented more than 10 times per second
> > > > > by different requests/users concurrently.