I have been playing around with the settings for a while now and have come to the conclusion that enabling concurrent requests, i.e. <threadsafe>true</threadsafe>, does not mean that the scheduler sends multiple requests to instances that are already active; it still starts more. I am also confident that Min Pending Latency is not implemented correctly: I cannot see any evidence in the logs that requests were waiting 15 seconds, yet the scheduler chose to start additional instances.
We all need some justification for the scheduler's strange behaviour. Personally, I would like to be able to specify a maximum number of instances. So, for example, if I say 1 instance, all requests go to that 1 instance. Google could specify a maximum rate for an instance, say 100 requests per second; beyond that, requests would be redirected to a page saying the server is overloaded, and an email would be sent to the admin suggesting paying for an additional instance.
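To make the proposal a bit more concrete, here is a rough sketch in plain Python of the kind of per-instance cap I mean. It is only an illustration and not anything App Engine provides: the names RateCap and handle, the /overloaded path, and the notify_admin callback are all made up for the example, and the 100-per-second limit is just the figure suggested above.

```python
import time


class RateCap:
    """Fixed-window cap: admit at most `limit` requests per one-second window."""

    def __init__(self, limit=100, clock=time.monotonic):
        self.limit = limit            # assumed per-instance maximum rate
        self.clock = clock            # injectable so the cap can be tested
        self.window_start = None
        self.count = 0
        self.admin_notified = False   # so the admin is emailed only once

    def admit(self):
        now = self.clock()
        # Start a fresh window once a full second has elapsed.
        if self.window_start is None or now - self.window_start >= 1.0:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


def handle(cap, notify_admin):
    """Return (status, payload) for one incoming request.

    Under the cap the request is served. Over the cap the caller is
    redirected to a hypothetical "/overloaded" page, and the admin is
    notified the first time this happens.
    """
    if cap.admit():
        return 200, "ok"
    if not cap.admin_notified:
        notify_admin("Instance over capacity; consider adding an instance.")
        cap.admin_notified = True
    return 302, "/overloaded"
```

With something like this in front of a single instance, the overflow is turned away at the cap instead of causing the scheduler to start another instance. A fixed window is the simplest choice here; a sliding window or token bucket would smooth out bursts at the window boundary.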