Thanks Eli - couple more comments below

On Sep 9, 5:41 pm, Eli Jones <eli.jo...@gmail.com> wrote:
> How did I determine concurrent execution?
>
> I determined that I had concurrent task execution because you can see the
> task_name in the logs, and a named task successfully ran twice. And, the
> one that ran last threw a TaskAlreadyExists error when trying to add the
> next chained task to the queue since each named task has a specifically
> defined name for the next task in the chain and the version that finished
> first had already added the next named task to the queue. (This is why it is
> absolutely important to use named tasks when chaining.. some sort of random
> error can fork your tasks).
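For anyone following along, the chaining pattern described above can be sketched roughly as follows. This is only an illustration: FakeQueue, run_step and the local TaskAlreadyExistsError class are hypothetical in-memory stand-ins, not the App Engine taskqueue API, and the task names are made up.

```python
class TaskAlreadyExistsError(Exception):
    """Hypothetical stand-in for the queue's duplicate-name error."""


class FakeQueue(object):
    """In-memory stand-in for a queue that rejects reused task names."""

    def __init__(self):
        self.names = set()

    def add(self, name):
        # A name that was already used is refused, like a named task.
        if name in self.names:
            raise TaskAlreadyExistsError(name)
        self.names.add(name)


def run_step(queue, step):
    """Run one chained step, then try to enqueue the next named step."""
    next_name = "chain-step-%d" % (step + 1)
    try:
        queue.add(next_name)
        return "enqueued " + next_name
    except TaskAlreadyExistsError:
        # Another run of this same step already enqueued the successor,
        # so the chain does not fork into two branches.
        return "duplicate run detected for step %d" % step


queue = FakeQueue()
first = run_step(queue, 1)
second = run_step(queue, 1)  # the same named task executing a second time
```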
I might have misunderstood something, but this example seems to only show
multiple execution, not concurrent?

> Why do I suggest tasks do not just retry immediately (or in less than 30
> seconds after failure).. and have done so in the time before your April 23rd
> e-mail.

Well I guess our experience differs then. For me, it sometimes backed off for
longer, but generally retried immediately, and frequently, until the backoff
time reached a decent level. Way too much energy spent on it to forget it :)

>
> Here are some logs showing a task retry on Feb 22nd (it's hard to find many
> examples since Appengine Logs only keep error logs after a few months.. so I
> need to find two errors in a row for a task to see the retry).
>
> The task's first run was at 12:20:00.026 PM. It ran for 29 seconds and
> failed at 12:20:29.275 PM with Deadline Exceeded.. then it retried at
> 12:21:07.596 PM (37 seconds after failure):
>
> 02-22 12:21PM 07.596 /myTask 500 28548ms 306cpu_ms 160api_cpu_ms 2kb
> AppEngine-Google; (+http://code.google.com/appengine)
> E 02-22 12:21PM 36.140 <class
> 'google.appengine.runtime.DeadlineExceededError'>: Traceback (most recent
> call last): File "/base/data/home/apps/myApp/1.34005759049070
>
> 02-22 12:20PM 00.026 /myTask 500 29255ms 2777cpu_ms 193api_cpu_ms 2kb
> AppEngine-Google; (+http://code.google.com/appengine)
> E 02-22 12:20PM 29.275 <class
> 'google.appengine.runtime.DeadlineExceededError'>: Traceback (most recent
> call last): File "/base/data/home/apps/myApp/1.34005759049070
>
> The general behaviour for my app is more like.. the task will fail, and then
> it will retry in 120 seconds (I have error logs showing this occurring back
> in February as well.)
>
> Maybe non-named tasks that are set to run immediately have retried on a
> different timeframe in the past.. but the retry time has not just been some
> generic sub-30 second time.
>
> As for Ikai's comment, it says what it says: "The same task should not be
> executed multiple times concurrently."
>
> It does not say that the same task cannot be executed multiple times
> concurrently.
>
> Again, my money is on the reality that one cannot guarantee 100% that an
> error will never occur that could lead to concurrent task execution... you
> would cripple the task queue subsystem if you put in a bunch of preventative
> checks. Though, one can state with reasonable confidence that it is highly
> improbable that a task will execute concurrently. But, good luck getting a
> literal answer to your question.
>
>
>
> On Thu, Sep 9, 2010 at 5:26 AM, hawkett <hawk...@gmail.com> wrote:
> > Hi Eli, notes below -
>
> > On Sep 8, 4:14 pm, Eli Jones <eli.jo...@gmail.com> wrote:
> > > Well, I've been doing named, chained tasks since November 2009, and I can
> > > point out three things:
>
> > Task names aren't especially relevant to the question - names stop the
> > same task being raised twice, not executed twice. I have been using
> > the task queue since it was released, and definitely noticed tasks
> > being executed more than once, but never concurrently.
>
> > > 1. I've had concurrent tasks execute at least once (that I noticed) when
> > > only one was supposed to run.. And, this appeared to happen when the
> > > subsystem first fired off the task (after it had already been added to the
> > > queue.. since TombstonedTaskError and TaskAlreadyExistsError seem to work
> > > nicely.).
>
> > Well, from Ikai's comment it would sound like google does not expect
> > this behaviour. I raised this thread through hypothetical analysis of
> > the technology, but if you have seen it happen, then that is
> > especially interesting. I personally can't see how it could
> > legitimately happen if it backs off for more than 30s - it would be a
> > bug in the system for the task to fire duplicates when it is first
> > raised, IMO.
> > How did you determine the execution was concurrent?
>
> > > 2. The GAE doc that I linked to explicitly states "it is possible in
> > > exceptional circumstances that a Task may execute multiple times". I
> > > believe that this covers both cases of the same task running concurrently or
> > > sequentially.
>
> > I don't think it does, but this is specifically the point of this
> > thread - it is not clear. I don't want to engineer significant
> > overhead into my application based on interpretation of unclear
> > documentation. To me, the same task id executing at the same time in
> > app engine, if it is possible, is something that needs to be
> > explicitly documented, because it has significant impact on app
> > architecture. Again, Ikai's comment above seems to imply Google does
> > not expect this to happen. So if the documentation is unclear, and
> > google seems to suggest the opposite of your interpretation, that's a
> > good reason to be wary of the assumption you are making.
>
> > > 3. For my failed tasks, I'm pretty sure the backoff has always been more
> > > than 30 seconds (if the task failed in the middle of running). Generally,
> > > if a task failed in the middle of running, it would run again 60 seconds -
> > > 120 seconds later.
>
> > It hasn't. Absolutely, definitely used to retry immediately and back
> > off at incrementally larger intervals that were initially < 30s.
> > Worked like this for quite a long while. Indeed, people other than me
> > suggested this behaviour should be changed to 30s plus to deal with
> > the issue in this thread. I had many, many situations where I had a
> > bug in a task, and the work it generated straight after failure would
> > fill up the error logs almost instantly. It was a real hassle for a
> > while there, and one of the reasons why I raised this issue in June
> > last year - http://code.google.com/p/googleappengine/issues/detail?id=1771
> > (among a bunch of others).
> > I wouldn't have suggested backoff should be changed to > 30s if it was
> > already the case.
>
> > > I can see how one would like the doc to explicitly address the potential for
> > > concurrent execution.. but you should presume that it is possible since the
> > > doc implies it.. and the doc doesn't say it can't happen.. and (less
> > > importantly) some guy on an internet news group is telling you that it has
> > > occurred in the past.
>
> > I don't think the docs imply it. I think it is ambiguous, especially
> > in relation to Ikai's comment.
>
> > > I personally cannot imagine how one could guarantee that this would never
> > > happen without bogging down the entire taskqueue subsystem with triple and
> > > quadruple checks and adding in random (1-3 second) wait times for exactly
> > > when any task would execute.. (but, I have a limited imagination).. and it
> > > seems like even then.. you cannot guarantee 100% that a task would not
> > > execute twice at once if a drastic system error occurred.
>
> > Executing twice is fine, I get that. Executing the same task id
> > concurrently seems to be something that can be avoided - I don't see
> > anything other than the 30s+ backoff being required to achieve this.
> > Maybe that's wrong, but it's sufficient for me, and was the suggestion
> > I made to address it. Unless someone highlights another reason why it
> > could occur, I'm glad to avoid the additional architecture.
>
> > > On Wed, Sep 8, 2010 at 4:18 AM, hawkett <hawk...@gmail.com> wrote:
> > > > Hi Eli,
>
> > > > Thanks for the info - the question was definitely trying to get a
> > > > specific statement about whether app engine could run the same task id
> > > > at the same time. Ikai's post seems to suggest that google did not
> > > > think this is possible, but did not seem to address the failure
> > > > scenarios I outlined.
>
> > > > It was about the time that I queried Ikai's response that re-executed
> > > > tasks started backing off for a significant period (over 30s) - they
> > > > used to go immediately, and then get slower and slower. e.g. 1s, 2s,
> > > > 4s, 8s type behaviour. Probably coincidence, but the fact it started
> > > > happening meant that I chose to assume that concurrent tasks with the
> > > > same id could not occur. As you can see in the above thread, I had
> > > > suggested backing off for more than 30s as a solution.
>
> > > > I agree that the problem is making sure you know how idempotent your
> > > > operations need to be, which is specifically why it is important to
> > > > have a definitive statement from google as to whether this
> > > > concurrent execution can occur or not. Without that information, I
> > > > don't know how idempotent my operations need to be. Without this
> > > > information, I should probably be assuming concurrent execution *can*
> > > > occur, but I'm taking a risk because the overhead is so high (in my
> > > > application).
>
> > > > So from my perspective, it would be a reasonable courtesy for google
> > > > to comment on this thread - it is a reasonable question with some fair
> > > > effort spent on articulating it, and it appears they may have fixed it
> > > > in response to this thread without taking the time to say so.
>
> > > > Thanks,
>
> > > > Colin
>
> > > > On Sep 7, 5:04 pm, Eli Jones <eli.jo...@gmail.com> wrote:
> > > > > Just in case anyone comes across this thread and is wondering about the
> > > > > potential for concurrent execution of a named task.
>
> > > > > This is documented:
>
> > > > > http://code.google.com/appengine/docs/python/taskqueue/overview.html
>
> > > > > The important quote is:
>
> > > > > "When implementing the code for Tasks (as worker URLs within your app), it
> > > > > is important that you consider whether the task is idempotent. App Engine's
> > > > > Task Queue API is designed to only invoke a given task once, however it is
> > > > > possible in exceptional circumstances that a Task may execute multiple times
> > > > > (e.g. in the unlikely case of major system
>
> ...
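
As a rough illustration of the backoff argument in this thread, here is a small sketch. It rests on two assumptions that are not confirmed anywhere above: a 30 second request limit, and a worst case in which the queue gives up on a run the moment it starts while the handler keeps going until that limit. Under those assumptions a retry can only overlap the earlier run if it starts less than 30 seconds later. The function names are hypothetical, and the doubling schedule starting at 1s is taken from the "1s, 2s, 4s, 8s type behaviour" described earlier.

```python
REQUEST_DEADLINE_S = 30  # assumed request limit, in seconds


def doubling_backoff(first_delay_s, retries):
    """Return retry delays that double each time: 1, 2, 4, 8, ..."""
    return [first_delay_s * 2 ** i for i in range(retries)]


def could_overlap(delay_s, deadline_s=REQUEST_DEADLINE_S):
    """Worst case (assumed): the earlier run lasts until the deadline,
    so any retry that starts sooner than that could run alongside it."""
    return delay_s < deadline_s


delays = doubling_backoff(1, 7)
risky = [d for d in delays if could_overlap(d)]
safe = [d for d in delays if not could_overlap(d)]
```

With these assumptions, only the delays in `safe` rule out an overlap, which is the reasoning behind asking for a backoff of more than 30s.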