Re: [google-appengine] Why are several production issues related to DeadlineExceededErrors being ignored?

Karl Rosaen Sat, 14 Jan 2012 06:26:10 -0800

Thanks, Brandon.  Many of the DeadlineExceededErrors were occurring during
warmup requests and, according to the stack traces, inside Python import
statements.  I upped the number of idle instances in an attempt to mitigate
this sort of thrashing, and your advice makes sense for this case.  Our
pending latency is set to 'Automatic' on both ends.

I'm attaching some graphs from the period when this was at its worst.

Instances:

<https://lh4.googleusercontent.com/--AtYMbWJ4ek/TxGNT3nfp0I/AAAAAAAAUuE/hTlZm78Mc08/s1600/Screen%252520Shot%2525202012-01-14%252520at%2525209.08.59%252520AM.png>

Requests per second:

<https://lh6.googleusercontent.com/-LoIlwGhvLrA/TxGOnvzGmSI/AAAAAAAAUuc/Sg07YssPK_4/s1600/Screen%252520Shot%2525202012-01-14%252520at%2525209.17.39%252520AM.png>

Milliseconds per request:

<https://lh5.googleusercontent.com/-A76zVs8CCEo/TxGNZ9kcpfI/AAAAAAAAUuQ/w20AuPvgw50/s1600/Screen%252520Shot%2525202012-01-14%252520at%2525209.09.41%252520AM.png>

This suggests that some higher-latency handlers were hit (some people were
editing content) and tied up the existing front-end instances, after which
GAE tried to spin up dynamic instances to serve the other requests.
But during warmup there were DeadlineExceededErrors during file imports,
suggesting that the dynamic instances aren't being given enough time to
warm up.

Increasing the idle instances helps. So perhaps the revised question, at
least for our particular situation, is: why, under load, do the dynamic
instances time out during warmup? That seems to compound the problem, as the
dynamic instances aren't able to serve the requests that are backed up,
leading to user-visible 500 errors and more attempts to dynamically load
instances.

Does my theory have any holes? Is relying on dynamic instances to handle
spikes without 500 errors unrealistic? I know the docs state, "A smaller
number of idle Instances means your application costs less to run, but may
encounter more startup latency during load spikes." But thrashing on
DeadlineExceededErrors during warmup seems to indicate that dynamic
instances can't be relied upon for load spikes at all right now.

