Bump - it's still not clear whether the same task can be executing multiple times concurrently. I noticed that failed tasks seem to back off for significantly longer recently - perhaps this has helped the situation? Appreciate any clarification - cheers,
Colin

On May 1, 1:08 am, hawkett <hawk...@gmail.com> wrote:
> My use case is as follows -
>
> 1. Tasks which do not support idempotence inherently (such as deletes,
> and some puts) carry a unique identifier, which is written as a
> receipt in an attribute of an entity that is updated in the
> transaction.
> 2. When a task arrives carrying a receipt, I check that it does not
> already exist - so receipted tasks incur an additional, key-only, db
> read.
>
> This is essentially my algorithm for ensuring idempotence (in
> situations where it is not inherent) - ignore subsequent executions.
>
> If the same task *cannot* be running in parallel, then the check for
> the receipt can be done outside the transaction that writes the
> receipt - which has a couple of advantages -
>
> a. It can be done up front in the task handler, so I don't have to go
> all the way through to the transactional write before discovering it
> already executed.
> b. More importantly, I can reduce the work done inside the transaction -
> every extra millisecond spent in the transaction locks the entity
> group, and at scale, those milliseconds can add up - especially on
> entity groups that are somewhat write-intensive.
>
> If the same task *can* be running in parallel, then I need to do the
> receipt read inside the transaction that writes it. It would be a pity
> to do that extra work in every transaction for a very rare scenario.
>
> As stated earlier, it seems that it might be possible for GAE to
> guarantee that it does not execute the same task in parallel - by
> ensuring that, for error scenarios like those above (408, client
> crash, perhaps others), the 2nd execution waits 30 seconds.
> That has some obvious downsides, but given how rarely it occurs, and
> given that an app shouldn't be relying on the speed with which a task
> is executed, it seems like a reasonable trade-off to get a reduction
> in transactional work for the vast majority of the time - less
> contention, less CPU, less datastore activity.
>
> A simple example is a task which increments a counter - we don't want
> to increment the counter twice.
>
> The problem is the same whether one or many entities are being updated
> during handling of the task.
>
> Do you have many situations where you perform a read that does not
> result in some sort of update - db update, another task raised, email
> sent, external system notified etc.? There's a subset of most of
> these that we want to avoid doing twice. It's the multiple writes,
> rather than multiple reads, causing issues.
>
> Anyone from Google able to end the speculation? :)
>
> On Apr 30, 2:31 am, Eli Jones <eli.jo...@gmail.com> wrote:
> > In my opinion, the case you are asking about is pretty much the
> > reason they state that tasks must be idempotent - even with named
> > tasks.
> >
> > They cannot guarantee 100% that some transient error will not occur
> > when a scheduled task is executed (even if you are naming tasks and
> > are guaranteed 100% that your task will not be added to the queue
> > more than once).
> >
> > So, it is possible to have more than one version of the "same" task
> > executing at the same time. You just need to construct your tasks so
> > they aren't doing too much at once (e.g. reading some data, then
> > updating or inserting, then reading other data and updating some
> > more), or you need to make sure to do all that inside a big
> > transaction - and, even then, you still need to ensure idempotence.
> >
> > I sort of prefer a poor man's version of idempotence for my chained
> > tasks. Mainly, if the "same" task runs more than once,
> > each version will have a potentially different result, but I am
> > perfectly happy getting the result from the task that ran last. And
> > I can easily accept this since my tasks are not doing multiple
> > updates at once, and they are not reading from the same entities
> > that they are updating.
> >
> > What is your exact use case?
> >
> > On Thu, Apr 29, 2010 at 7:28 PM, hawkett <hawk...@gmail.com> wrote:
> > > Thanks for the response - it's good to know that the multiple
> > > executions cannot occur in parallel, although I'm not sure I
> > > completely understand the reasons. Take the following example -
> > >
> > > 1. task queue executes a task for the first time (T1E1)
> > > 2. application receives task, and begins processing
> > > 3. the http connection is lost soon after, and the task queue
> > >    receives an HTTP response code
> > > 4. task queue backs off (e.g. waits 2s)
> > > 5. task queue executes the task a second time (T1E2)
> > > 6. application receives task and begins processing
> > >
> > > Why is it that T1E1 cannot still be running at step 5/6? Are there
> > > no conditions at step 3 where a response (of any status) is
> > > received while the processing at step 2 is still underway?
> > >
> > > There is also another situation, where the HTTP client crashes,
> > > which is also unclear -
> > >
> > > 1. task queue executes a task for the first time (T1E1)
> > > 2. application receives task, and begins processing
> > > 3. the task queue crashes (i.e. the HTTP client), so no response
> > >    can be received
> > > 4. task queue recovers, or another node takes over (how does it
> > >    determine the state of T1E1?)
> > > 5. task queue executes the task a second time, since it cannot
> > >    know whether T1E1 completed successfully? (T1E2)
> > > 6. application receives task and begins processing
> > >
> > > Is it possible in this scenario that it will re-execute the task
> > > (T1E2) prior to the completion of the first (T1E1)?
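The receipt scheme under discussion in this thread can be sketched as follows. This is a minimal stand-in, not App Engine code: an in-memory dict plus a lock take the place of the datastore and its entity-group transactions, and all names (`apply_task`, `receipts`, `counters`) are illustrative.

```python
# Sketch of receipt-based idempotence: the receipt check and the
# non-idempotent work happen inside the same "transaction", so a
# duplicate delivery of the same task is detected and ignored even
# if the two executions overlap.
import threading

receipts = {}                 # receipt_id -> True; stand-in for receipt attributes
counters = {"applied": 0}     # example non-idempotent state a task updates
_txn = threading.Lock()       # stand-in for an entity-group transaction

def apply_task(receipt_id):
    """Apply the task's update exactly once per receipt_id."""
    with _txn:
        if receipt_id in receipts:
            return False          # duplicate execution: ignore
        counters["applied"] += 1  # the non-idempotent work (e.g. increment)
        receipts[receipt_id] = True
        return True
```

If duplicate executions were guaranteed to be serial, the `receipt_id in receipts` check could move outside the lock, as the thread discusses, at the cost of a race if that guarantee ever fails.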
> > > Thanks,
> > >
> > > Colin
> > >
> > > On Apr 29, 5:36 pm, djidjadji <djidja...@gmail.com> wrote:
> > > > The decision to rerun a task is made based on the HTTP response
> > > > code. There is always a response code, even when the connection
> > > > is lost.
> > > >
> > > > When the code is 200 the task is considered complete and will
> > > > not be rerun. Any other code means the task needs a rerun. The
> > > > time between the reruns is increased with each retry.
> > > >
> > > > This means a certain task is never retried in parallel.
> > > >
> > > > But it could be that a task created later will finish first
> > > > because it did not need to retry.
> > > >
> > > > 2010/4/25 hawkett <hawk...@gmail.com>:
> > > > > Wondering if I haven't asked the question clearly enough.
> > > > > Regarding the statement that we need to assume tasks may be
> > > > > executed multiple times (i.e. ensure idempotence): is that
> > > > > multiple times serially, or possibly multiple times
> > > > > concurrently?
> > > > >
> > > > > I've gone ahead and coded my idempotence solution to assume
> > > > > that they cannot be running concurrently, just because it's a
> > > > > bit easier, and a bit less work inside a transaction. I'm
> > > > > guessing that the reason they may be run multiple times is
> > > > > that GAE won't know what to do if it doesn't get a response
> > > > > from a task it executes - it can't be sure that the task was
> > > > > received by the application, or that the application was given
> > > > > the opportunity to correctly react to the task - in fact it
> > > > > has to assume that it didn't, and therefore runs it again to
> > > > > be sure. I'm assuming that GAE always knows for certain that a
> > > > > task has been fired, just not whether it was fired
> > > > > successfully - and it will only fire again if it hasn't
> > > > > correctly processed a response from the previous execution.
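The retry behaviour described above - rerun on any non-200 response, with an increasing delay between attempts - can be modelled roughly as follows. The doubling schedule, the 2-second base delay, and the one-hour cap are illustrative assumptions, not documented App Engine values.

```python
# Toy model of the task-queue retry decision: 200 means done,
# anything else schedules a rerun after a growing delay.
def next_retry_delay(retry_count, base_seconds=2.0, cap_seconds=3600.0):
    """Exponential backoff: 2s, 4s, 8s, ... capped at one hour (assumed)."""
    return min(base_seconds * (2 ** retry_count), cap_seconds)

def handle_response(status_code, retry_count):
    """Return None if the task is complete, else the delay before rerunning."""
    if status_code == 200:
        return None                     # success: the task is never rerun
    return next_retry_delay(retry_count)
```

Note that this models the "never retried in parallel" claim only in the sense that a rerun is scheduled after a response (or timeout) has been observed for the previous attempt.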
> > > > > If this were true, then it seems that as long as GAE
> > > > > guarantees that it waits > 30s before firing the task a
> > > > > second time (rather than just reacting to the loss of the
> > > > > http connection, for example), then we can know it is not
> > > > > executing in parallel, because the first execution cannot
> > > > > still be running due to the request limit.
> > > > >
> > > > > Am I looking at this correctly? Is it fair to assume that the
> > > > > same task cannot be running in parallel? Cheers,
> > > > >
> > > > > Colin
> > > > >
> > > > > On Apr 23, 3:14 pm, hawkett <hawk...@gmail.com> wrote:
> > > > >> Hi Tim - there's a couple of reasons why this won't work -
> > > > >> firstly, it is my understanding that named tasks are also
> > > > >> subject to the possibility of being executed twice (the name
> > > > >> only prevents the same name being added to the queue twice),
> > > > >> and secondly, tasks raised transactionally cannot have a
> > > > >> task name.
> > > > >>
> > > > >> On Apr 23, 11:45 am, Tim Hoffman <zutes...@gmail.com> wrote:
> > > > >> > Probably the best way to guard would be to have the task
> > > > >> > name specific to the operation. You can't have another
> > > > >> > task with the same name for about a week.
> > > > >> >
> > > > >> > T
> > > > >> >
> > > > >> > On Apr 23, 3:51 pm, hawkett <hawk...@gmail.com> wrote:
> > > > >> > > Hi,
> > > > >> > >
> > > > >> > > I understand that it is possible for a single task to be
> > > > >> > > executed more than once, but is it safe to assume that
> > > > >> > > only one instance of a specific task will be executing
> > > > >> > > at any one time? It makes it much more difficult (time
> > > > >> > > consuming) to implement idempotent behaviour if it is
> > > > >> > > possible for subsequent executions of a task to begin
> > > > >> > > before the first has completed - i.e. for the same task
> > > > >> > > to be executing concurrently.
> > > > >> > > I can think of ways of using db locking (memcache is
> > > > >> > > not reliable - especially when this scenario is most
> > > > >> > > likely to occur during system failures) to recognise
> > > > >> > > the multiple concurrent executions, but it would be
> > > > >> > > great to know that this scenario cannot occur. Thanks,
> > > > >> > >
> > > > >> > > Colin

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-appeng...@googlegroups.com.
To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
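The db-locking idea mentioned in the thread - taking a short lease keyed by task name before doing the work, so a second concurrent execution of the same task can detect the overlap and back off - might look roughly like this. The 30-second lease length is an illustrative choice, and the in-memory dict guarded by a lock is a stand-in for a datastore entity read and written inside a transaction; none of these names are App Engine APIs.

```python
# Sketch of a time-based lease for detecting concurrent executions
# of the same task. The lease expiry bounds how long a crashed
# handler can block a legitimate retry.
import threading
import time

_leases = {}                    # task_name -> lease expiry timestamp
_guard = threading.Lock()       # stand-in for a transactional read-modify-write

def try_acquire(task_name, lease_seconds=30.0, now=None):
    """Return True if this execution may proceed; False if another
    execution of the same task holds an unexpired lease."""
    now = time.time() if now is None else now
    with _guard:
        expiry = _leases.get(task_name)
        if expiry is not None and expiry > now:
            return False            # concurrent execution detected
        _leases[task_name] = now + lease_seconds
        return True

def release(task_name):
    """Drop the lease once the task's work is committed."""
    with _guard:
        _leases.pop(task_name, None)
```

A handler would call `try_acquire` up front and return a non-200 status (forcing a later retry) when it fails, which trades a little extra latency in the rare-overlap case for a cheaper transaction in the common one.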