First question: Is this accurate <https://serverfault.com/a/999900>? That is, if an auto-scaled service has min_instances set to nonzero, does that mean that instances in old versions don't get shut down when you deploy a new version? And those instances get billed?
I've been running a service with the following configuration (standard environment, Python 3.7):

    runtime: python37
    instance_class: F4
    automatic_scaling:
      min_instances: 1
      max_instances: 10
    inbound_services:
    - warmup

New versions are deployed frequently because it's our integration environment. Apparently we've ended up with more and more instances running, because when we deploy a new version, the old version continues to exist and have 1 running instance. And apparently we're getting billed for these. I don't think I should have expected this, based on a close reading of the documentation <https://cloud.google.com/appengine/docs/standard/python3/config/appref#scaling_elements>. (I have opened a billing support request, since I think it's Google's error in documentation, if not actually a bug.)

So now I'm trying to fix the configuration to avoid this. Based on the Server Fault article I linked above, I tried setting min_instances to 0 and min_idle_instances to 1. This seems to result in always at least 2 instances running. I think maybe because one instance is getting requests (we have a once-a-minute cron job, among other things), so it's not "idle", so there has to be one more instance to have a minimum of one idle instance.

So I tried setting both min_instances and min_idle_instances to 0, but I *still* seem to always have at least 2 instances. It's really hard to tell though, because the GCP console sometimes takes a bit to update, and maybe sometimes there are actually more instances than the configuration requires (I think maybe they're not always billed?).

So, second question: What is min_idle_instances actually supposed to mean? Is it the minimum number of instances, or is it the minimum number of "idle" instances, for some definition of "idle"? If an instance is serving 1 simple query per minute, does that mean it's not idle, so there will be a second, idle instance?
But then there are 2 instances, and if there are a few simple queries happening per minute, it seems like some might be randomly routed to each instance, so neither instance would be idle, and a third instance would get started.

Another complication: in our production environment, I increased min_instances to 2 because of this issue <https://groups.google.com/g/google-appengine/c/SeVMpquMdnQ/m/tEr9prqLAQAJ>. It's pretty important that we always (>99.9% anyway) have at least 1 running instance, since apparently there's no way to get instance startup time to less than 20-30 seconds, and to guarantee that, apparently we need 2 instances running most of the time, because they can get preempted at any time without warming up a replacement first. So now I'm not sure whether to set min_idle_instances to 1 or 2 or what in production.

Do I need to set max_idle_instances? The documentation says its default value is "automatic", but doesn't say what that actually means.

I'm having a hard time figuring out these issues based on the documentation. All I want is:

- Instances get shut down when I deploy a new version (I would have thought this was always the case no matter what!)
- Each of my environments normally just has 1 instance running, assuming light traffic (1-10 queries per minute). Having 2 instances always running in production is ok, if that's the only way to achieve the next point:
- Never (or almost never) have zero instances running (aka almost never have a query take more than 2 seconds because of warmup time)
- Autoscale up to a reasonable maximum if traffic gets heavier

I didn't think this would be so difficult. I've been using App Engine for a long time, and thought I knew what I was doing, but I guess I've never used these options in the current environment (Python 3, standard).
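For reference, the second attempt described above (min_instances set to 0 and min_idle_instances set to 1) would look roughly like the sketch below. It assumes every other setting in app.yaml stays as in the original configuration. It is a record of what was tried, not a confirmed fix.

```yaml
# Sketch of the second attempt. Only the two min_* values come from the
# description above; the remaining settings are assumed unchanged.
runtime: python37
instance_class: F4
automatic_scaling:
  min_instances: 0        # was 1 in the original configuration
  min_idle_instances: 1   # the third attempt set this to 0 as well
  max_instances: 10
inbound_services:
- warmup
```

With this configuration the console still seemed to show at least 2 running instances, which is what prompts the question about how "idle" is defined.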