Richard,

> But Tom seems to think that "1" is an appropriate number for his app. Why
offer that option if it's automatically wrong?

If his purpose is to reduce the number of user-facing loading requests, and
he still sees many of them, then the current setting is not enough.

Jeff,

> I vaguely expect something like this:
>
>  * All incoming requests go into a pending queue.
>  * Requests in this queue are handed off to warm instances only.
>  * Requests in the pending queue are only sent to warmed up instances.
>  * New instances can be started up based on (adjustable) depth of the
> pending queue.
>  * If there aren't enough instances to serve load, the pending queue
> will back up until more instances come online.
>
> Isn't this fairly close to the way appengine works?  What puzzles me
> is why requests would ever be removed from the pending queue and sent
> to a cold instance.  Even in Pythonland, 5-10s startup times are
> common.  Seems like the request is almost certainly better off waiting
> in the queue.

Reading the following section would probably help in understanding the
scheduler:
https://developers.google.com/appengine/docs/adminconsole/performancesettings#scheduler

A request comes in. If there is an available dynamic instance, it will be
handled by that dynamic instance. Otherwise, if there is an available
resident instance, it will be handled by that resident instance. Otherwise
it goes into the pending queue. While waiting there, it can be sent to any
instance that becomes available (fortunately for it). Finally, according to
the pending latency settings, it will be sent to a new cold instance.
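The routing order above can be sketched as a toy simulation. This is purely
illustrative, not the actual scheduler code; the `Instance` class, the
millisecond bookkeeping, and the `route` function are all made-up
simplifications:

```python
import collections

class Instance:
    """Toy stand-in for an App Engine instance (illustrative only)."""
    def __init__(self, cold=False):
        self.cold = cold   # True if this instance was just spun up
        self.busy = False
    def handle(self, request):
        self.busy = True
        return ('cold' if self.cold else 'warm', request)

def route(request, dynamic, resident, pending, waited_ms, min_pending_ms):
    """Routing order described above: a free dynamic instance first,
    then a free resident instance, then the pending queue; once the
    request has waited past the min pending latency, a cold instance
    is started for it."""
    for inst in dynamic + resident:
        if not inst.busy:
            if request in pending:
                pending.remove(request)
            return inst.handle(request)
    if request not in pending:
        pending.append(request)
    if waited_ms >= min_pending_ms:
        pending.remove(request)  # give up waiting, go cold
        return Instance(cold=True).handle(request)
    return None  # still waiting in the pending queue

# The only dynamic instance serves the first request warm; the second
# request waits in the queue, then exceeds the pending latency and is
# served by a new cold instance.
dynamic, resident = [Instance()], []
pending = collections.deque()
assert route('r1', dynamic, resident, pending, 0, 100)[0] == 'warm'
assert route('r2', dynamic, resident, pending, 50, 100) is None
assert route('r2', dynamic, resident, pending, 150, 100)[0] == 'cold'
```

The point the sketch makes concrete: a queued request is never stuck, it can
be picked up by any instance that frees up, and only the pending latency
setting decides when a cold start becomes preferable to more waiting.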

So, if you prefer the pending queue over a cold instance, you can set a high
minimum pending latency; however, that might not be what you really want,
because it will hurt the performance of subsequent requests.
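For reference, these knobs correspond to the sliders on the Admin Console
performance settings page; where app.yaml supports automatic scaling
settings, the same idea can be written roughly like this (the numbers are
illustrative assumptions, not recommendations):

```yaml
# Illustrative automatic_scaling fragment; values are made up.
automatic_scaling:
  min_idle_instances: 2      # resident instances kept warm for spikes
  min_pending_latency: 5s    # prefer queueing up to 5s before going cold
  max_pending_latency: 10s   # after this, a new instance must be started
```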

Generally speaking, if you just look at the statistics for a spiky event,
you might feel that our scheduler could do better. The difficult part,
however, is that the requests in those statistics were not issued at a flat
rate. In other words, the scheduler starts a new dynamic instance because it
is really needed at that moment.

Well, again, in order to reduce the number of user-facing loading requests,
the most effective thing is to set a sufficient number of min idle
instances. The second thing to consider would be, if you have longer backend
tasks, putting those tasks into another version in order to avoid blocking
other frontend requests. If you use the python27 runtime with concurrent
requests enabled, you are probably better off isolating CPU-bound operations
from the user-facing frontend version in order to avoid slowing down the
frontend.
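One way to express that separation is to run the long or CPU-bound work on a
dedicated backend (or a separate version). The backend name, class, and
instance count below are purely illustrative assumptions:

```yaml
# backends.yaml -- illustrative only.
backends:
- name: worker        # handles long or CPU-bound tasks
  class: B2           # a bigger instance class for heavier work
  instances: 1
  options: dynamic    # spin up on demand
```

Frontend handlers would then enqueue work targeted at this backend instead
of doing it inline, so user-facing latency is unaffected.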

It would probably be great if we offered an API for dynamically configuring
the performance settings, especially in terms of cost efficiency, and I
think it's worth filing a feature request.

-- 
Takashi Matsuo

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.