I would much rather specify a hard limit on the number of active instances, so my app scales up to that point and then degrades progressively. Once the limit is reached, latencies would increase with increasing traffic, but if I were smart I would have set the limit so that the higher traffic could not exhaust my budget, because the instance count stops increasing. So if 30 instances' worth of users were hammering my 10 hard-limited instances, their page loads would be unpleasant, but the site wouldn't go completely offline.
The current reality is that I can only limit idle instances. As active traffic increases, the scheduler can keep spinning up instances and actually exhaust my budget. So browser latencies would be nicer for a bit, but then suddenly hit a brick wall. Once my budget is exhausted, I've got 30 instances' worth of users all hammering a single remaining instance. For all intents and purposes, my site is offline. In fact, if my budget is exhausted, I think it will go completely offline, perhaps running 0 instances, but definitely not able to do datastore operations. That's not a red herring. It's a legitimate need for a way to limit performance scaling so costs stay within a manageable budget.
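To put rough numbers on the two scenarios, here is a toy model in Python. It is only a sketch of the arithmetic, not how the App Engine scheduler or billing actually works: the `simulate` function, the flat price per instance-hour, the budget figure, and the assumption that an exhausted budget means zero running instances are all made up for illustration.

```python
def simulate(demand, price, budget, cap=None):
    """Toy model of one day of instance scaling under a fixed budget.

    demand: instances' worth of traffic for each hour (hypothetical numbers)
    price:  cost of one instance-hour (assumed flat rate, arbitrary units)
    budget: total spend allowed for the day (same units as price)
    cap:    hard limit on active instances, or None for no limit

    Returns (timeline, spent), where timeline holds one
    (running_instances, overload_factor) pair per hour.
    """
    spent = 0
    timeline = []
    for wanted in demand:
        # With a cap, the scheduler never goes past the limit.
        target = wanted if cap is None else min(wanted, cap)
        # Assumption: once the budget is gone, nothing more can be started.
        affordable = int((budget - spent) // price)
        running = min(target, affordable)
        spent += running * price
        if wanted == 0:
            overload = 0.0
        elif running == 0:
            overload = float("inf")  # traffic with nothing to serve it
        else:
            overload = wanted / running  # 1.0 means fully served
        timeline.append((running, overload))
    return timeline, spent


# 30 instances' worth of traffic all day, budget of 300 instance-hours.
demand = [30] * 24

capped, capped_spent = simulate(demand, price=1, budget=300, cap=10)
uncapped, uncapped_spent = simulate(demand, price=1, budget=300)

print("capped:   hour 0", capped[0], "hour 23", capped[23], "spent", capped_spent)
print("uncapped: hour 0", uncapped[0], "hour 23", uncapped[23], "spent", uncapped_spent)
```

With these made-up numbers, the capped run stays at 10 instances with 3x overload for all 24 hours and spends 240 of the 300 budget. The uncapped run serves everything for 10 hours, then drops to 0 instances for the remaining 14 hours with the budget fully spent.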