Thanks Eli - couple more comments below

On Sep 9, 5:41 pm, Eli Jones <eli.jo...@gmail.com> wrote:
> How did I determine concurrent execution?
>
> I determined that I had concurrent task execution because you can see the
> task_name in the logs, and a named task successfully ran twice.  And, the
> one that ran last threw a TaskAlreadyExists error when trying to add the
> next chained task to the queue since each named task has a specifically
> defined name for the next task in the chain and the version that finished
> first had already added the next named task to the queue. (This is why it is
> absolutely important to use named tasks when chaining.. some sort of random
> error can fork your tasks).
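
The chaining pattern described in the quote above can be sketched roughly as follows. This is a stand-alone simulation, not App Engine code: `FakeQueue`, `next_task_name`, `run_step` and the local `TaskAlreadyExistsError` class are hypothetical stand-ins for the real task queue, used only to show why the run that finishes second hits the error.

```python
class TaskAlreadyExistsError(Exception):
    """Local stand-in for the error a real queue raises on a duplicate name."""


class FakeQueue:
    """Hypothetical in-memory queue that rejects a task name it has already seen."""

    def __init__(self):
        self.names = set()

    def add(self, name):
        if name in self.names:
            raise TaskAlreadyExistsError(name)
        self.names.add(name)


def next_task_name(chain, step):
    # Deterministic name: every run of step N computes the same name for step N+1.
    return "%s-step-%d" % (chain, step + 1)


def run_step(queue, chain, step):
    """Pretend to do the work for one step, then try to enqueue the next link."""
    try:
        queue.add(next_task_name(chain, step))
        return "enqueued"
    except TaskAlreadyExistsError:
        # Another run of this same step already added the next task,
        # so the chain does not fork.
        return "duplicate"


queue = FakeQueue()
first = run_step(queue, "report", 1)
second = run_step(queue, "report", 1)  # the same task executed a second time
```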

I might have misunderstood something, but this example seems to only
show multiple execution, not concurrent?
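
To make the distinction concrete: two runs are concurrent only if their time intervals overlap. Below is a rough sketch of that check, fed with the start times and durations from the Feb 22nd request log lines quoted further down. It assumes the timestamp on each request line is the start of the request; the function name and the seconds-past-12:00 encoding are illustrative only.

```python
def intervals_overlap(start_a, dur_a, start_b, dur_b):
    """Return True if [start_a, start_a + dur_a) and [start_b, start_b + dur_b) overlap."""
    return start_a < start_b + dur_b and start_b < start_a + dur_a


# Seconds past 12:00 PM, taken from the two request log lines.
first_start, first_dur = 20 * 60 + 0.026, 29.255   # 12:20:00.026, 29255ms
retry_start, retry_dur = 21 * 60 + 7.596, 28.548   # 12:21:07.596, 28548ms

concurrent = intervals_overlap(first_start, first_dur, retry_start, retry_dur)
gap = retry_start - (first_start + first_dur)  # idle seconds between the two runs
```

That logged pair is a plain retry, so it comes out sequential. The same check applied to the start times and durations of the two runs of the named task would settle the question.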

> Why do I suggest tasks do not just retry immediately (or in less than 30
> seconds after failure).. and have done so in the time before your April 23rd
> e-mail.

Well I guess our experience differs then. For me, it sometimes backed
off for longer, but generally retried immediately, and frequently,
until the backoff time reached a decent level. Way too much energy
spent on it to forget it :)
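
For reference, the "1s, 2s, 4s, 8s type behaviour" mentioned further down the thread amounts to a doubling backoff. Here is a small sketch of that schedule, assuming pure doubling from a 1s first delay (a recollection from this thread, not documented behaviour), showing which retries would land inside a 30s window. The function name is illustrative.

```python
def doubling_delays(first_delay, count):
    """Return `count` retry delays in seconds, each double the previous one."""
    return [first_delay * 2 ** i for i in range(count)]


delays = doubling_delays(1, 7)
under_30s = [d for d in delays if d < 30]
```

Under that assumption the first five retries (1s to 16s) each come back less than 30s after the previous failure.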

>
> Here are some logs showing a task retry on Feb 22nd (it's hard to find many
> examples since Appengine Logs only keep error logs after a few months.. so I
> need to find two errors in a row for a task to see the retry).
>
> The task's first run was at 12:20:00.026 PM.  It ran for 29 seconds and
> failed at 12:20:29.275 PM with Deadline Exceeded.. then it retried at
> 12:21:07.596 PM (about 38 seconds after failure):
>
> 02-22 12:21PM 07.596 /myTask 500 28548ms 306cpu_ms 160api_cpu_ms 2kb
> AppEngine-Google; (+http://code.google.com/appengine)
> E 02-22 12:21PM 36.140 <class
> 'google.appengine.runtime.DeadlineExceededError'>: Traceback (most recent
> call last): File "/base/data/home/apps/myApp/1.34005759049070
>
> 02-22 12:20PM 00.026 /myTask 500 29255ms 2777cpu_ms 193api_cpu_ms 2kb
> AppEngine-Google; (+http://code.google.com/appengine)
> E 02-22 12:20PM 29.275 <class
> 'google.appengine.runtime.DeadlineExceededError'>: Traceback (most recent
> call last): File "/base/data/home/apps/myApp/1.34005759049070
>
> The general behaviour for my app is more like.. the task will fail, and then
> it will retry in 120 seconds (I have error logs showing this occurring back
> in February as well.)
>
> Maybe non-named tasks that are set to run immediately have retried on a
> different timeframe in the past.. but the retry time has not just been some
> generic sub-30 second time.
>
> As for Ikai's comment, it says what it says: "The same task should not be
> executed multiple times concurrently."
>
> It does not say that the same task cannot be executed multiple times
> concurrently.
>
> Again, my money is on the reality that one cannot guarantee 100% that an
> error will never occur that could lead to concurrent task execution... you
> would cripple the task queue subsystem if you put in a bunch of preventative
> checks.  Though, one can state with reasonable confidence that it is highly
> improbable that a task will execute concurrently.  But, good luck getting a
> literal answer to your question.
>
>
>
> On Thu, Sep 9, 2010 at 5:26 AM, hawkett <hawk...@gmail.com> wrote:
> > Hi Eli, notes below -
>
> > On Sep 8, 4:14 pm, Eli Jones <eli.jo...@gmail.com> wrote:
> > > Well, I've been doing named, chained tasks since November 2009, and I can
> > > point out three things:
>
> > Task names aren't especially relevant to the question - names stop the
> > same task being raised twice, not executed twice. I have been using
> > the task queue since it was released, and definitely noticed tasks
> > being executed more than once, but never concurrently.
>
> > > 1.  I've had concurrent tasks execute at least once (that I noticed) when
> > > only one was supposed to run.. And, this appeared to happen when the
> > > subsystem first fired off the task (after it had already been added to the
> > > queue.. since TombstonedTaskError and TaskAlreadyExistsError seem to work
> > > nicely.).
>
> > Well, from Ikai's comment it would sound like google does not expect
> > this behaviour. I raised this thread through hypothetical analysis of
> > the technology, but if you have seen it happen, then that is
> > especially interesting. I personally can't see how it could
> > legitimately happen if it backs off for more than 30s - it would be a
> > bug in the system for the task to fire duplicates when it is first
> > raised, IMO. How did you determine the execution was concurrent?
>
> > > 2.  The GAE doc that I linked to explicitly states "it is possible in
> > > exceptional circumstances that a Task may execute multiple times".  I
> > > believe that this covers both cases of the same task running concurrently or
> > > sequentially.
>
> > I don't think it does, but this is specifically the point of this
> > thread - it is not clear. I don't want to engineer significant
> > overhead into my application based on interpretation of unclear
> > documentation. To me, the same task id executing at the same time in
> > app engine, if it is possible, is something that needs to be
> > explicitly documented, because it has significant impact on app
> > architecture. Again, Ikai's comment above seems to imply Google does
> > not expect this to happen. So if the documentation is unclear, and
> > google seems to suggest the opposite of your interpretation, that's a
> > good reason to be wary of the assumption you are making.
>
> > > 3.  For my failed tasks, I'm pretty sure the backoff has always been more
> > > than 30 seconds (if the task failed in the middle of running). Generally,
> > > if a task failed in the middle of running, it would run again 60 seconds -
> > > 120 seconds later.
>
> > It hasn't. Absolutely, definitely used to retry immediately and back
> > off at incrementally larger intervals that were initially < 30s.
> > Worked like this for quite a long while. Indeed, people other than me
> > suggested this behaviour should be changed to 30s plus to deal with
> > the issue in this thread. I had many, many situations where I had a
> > bug in a task, and the work it generated straight after failure would
> > fill up the error logs almost instantly. It was a real hassle for a
> > while there, and one of the reasons why I raised this issue in June
> > last year - http://code.google.com/p/googleappengine/issues/detail?id=1771
> > (among a bunch of others). I wouldn't have suggested backoff should be
> > changed to > 30s if it was already the case.
>
> > > I can see how one would like the doc to explicitly address the potential for
> > > concurrent execution.. but you should presume that it is possible since the
> > > doc implies it.. and the doc doesn't say it can't happen.. and (less
> > > importantly) some guy on an internet news group is telling you that it has
> > > occurred in the past.
>
> > I don't think the docs imply it. I think it is ambiguous, especially
> > in relation to Ikai's comment.
>
> > > I personally cannot imagine how one could guarantee that this would never
> > > happen without bogging down the entire taskqueue subsystem with triple and
> > > quadruple checks and adding in random (1-3 second) wait times for exactly
> > > when any task would execute.. (but, I have a limited imagination).. and it
> > > seems like even then.. you cannot guarantee 100% that a task would not
> > > execute twice at once if a drastic system error occurred.
>
> > Executing twice is fine, I get that. Executing the same task id
> > concurrently seems to be something that can be avoided - I don't see
> > anything other than the 30s+ backoff being required to achieve this.
> > Maybe that's wrong, but it's sufficient for me, and was the suggestion
> > I made to address it. Unless someone highlights another reason why it
> > could occur, I'm glad to avoid the additional architecture.
>
> > > On Wed, Sep 8, 2010 at 4:18 AM, hawkett <hawk...@gmail.com> wrote:
> > > > Hi Eli,
>
> > > > Thanks for the info - the question was definitely trying to get a
> > > > specific statement about whether app engine could run the same task id
> > > > at the same time. Ikai's post seems to suggest that google did not
> > > > think this is possible, but did not seem to address the failure
> > > > scenarios I outlined.
>
> > > > It was about the time that I queried Ikai's response that re-executed
> > > > tasks started backing off for a significant period (over 30s) - they
> > > > used to go immediately, and then get slower and slower. e.g. 1s, 2s,
> > > > 4s, 8s type behaviour. Probably co-incidence, but the fact it started
> > > > happening meant that I chose to assume that concurrent tasks with the
> > > > same id could not occur. As you can see in the above thread, I had
> > > > suggested backing off for more than 30s as a solution.
>
> > > > I agree that the problem is making sure you know how idempotent your
> > > > operations need to be, which is specifically why it is important to
> > > > have a definitive statement from google as to whether this
> > > > concurrent execution can occur or not. Without that information, I
> > > > don't know how idempotent my operations need to be. Without this
> > > > information, I should probably be assuming concurrent execution *can*
> > > > occur, but I'm taking a risk because the overhead is so high (in my
> > > > application).
>
> > > > So from my perspective, it would be a reasonable courtesy for google
> > > > to comment on this thread - it is a reasonable question with some fair
> > > > effort spent on articulating it, and it appears they may have fixed it
> > > > in response to this thread without taking the time to say so.
>
> > > > Thanks,
>
> > > > Colin
>
> > > > On Sep 7, 5:04 pm, Eli Jones <eli.jo...@gmail.com> wrote:
> > > > > Just in case anyone comes across this thread and is wondering about the
> > > > > potential for concurrent execution of a named task.
>
> > > > > This is documented:
>
> > > > > http://code.google.com/appengine/docs/python/taskqueue/overview.html
>
> > > > > The important quote is:
>
> > > > > "When implementing the code for Tasks (as worker URLs within your app), it
> > > > > is important that you consider whether the task is idempotent. App Engine's
> > > > > Task Queue API is designed to only invoke a given task once, however it is
> > > > > possible in exceptional circumstances that a Task may execute multiple times
> > > > > (e.g. in the unlikely case of major system
>
> ...

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appeng...@googlegroups.com.
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en.
