First question: Is this Server Fault answer accurate? 
<https://serverfault.com/a/999900> That is, if an auto-scaled service has 
min_instances set to nonzero, does that mean that instances in old versions 
don't get shut down when you deploy a new version? And are those instances 
billed?

I've been running a service with the following configuration (standard 
environment, Python 3.7):

runtime: python37
instance_class: F4
automatic_scaling:
  min_instances: 1
  max_instances: 10
inbound_services:
- warmup

New versions are deployed frequently because it's our integration 
environment. Apparently we've ended up with more and more instances 
running, because when we deploy a new version, the old version continues to 
exist and have 1 running instance. And apparently we're getting billed for 
these. I don't think I should have expected this, based on a close reading 
of the documentation 
<https://cloud.google.com/appengine/docs/standard/python3/config/appref#scaling_elements>.
(I have opened a billing support request, since I think it's Google's error 
in documentation, if not actually a bug.)

So now I'm trying to fix the configuration to avoid this. Based on the 
Server Fault article I linked above, I tried setting min_instances to 0 and 
min_idle_instances to 1. This seems to result in at least 2 instances 
always running. I think that's because one instance is getting requests 
(we have a once-a-minute cron job, among other things), so it doesn't count 
as "idle", and a second instance is started to keep a minimum of one idle 
instance.

So I tried setting both min_instances and min_idle_instances to 0, but I 
*still* seem to always have at least 2 instances.
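
For reference, the two attempts described above correspond to roughly the 
following app.yaml. This is reconstructed from the description; everything 
other than the two min_* settings is assumed to be unchanged from the 
original configuration.

```yaml
runtime: python37
instance_class: F4
automatic_scaling:
  min_instances: 0        # was 1 originally; 0 in both attempts
  min_idle_instances: 1   # first attempt; set to 0 in the second attempt
  max_instances: 10
inbound_services:
- warmup
```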

It's really hard to tell though, because the GCP console sometimes takes a 
bit to update, and maybe sometimes there are actually more instances than 
the configuration requires (I think maybe they're not always billed?).

So, second question: What is min_idle_instances actually supposed to mean? 
Is it the minimum number of instances, or is it the minimum number of 
"idle" instances, for some definition of "idle"? If an instance is serving 
1 simple query per minute, does that mean it's not idle, so there will be a 
second, idle instance? But then there are 2 instances, and if there are a 
few simple queries happening per minute, it seems like some might be 
randomly routed to each instance, so neither instance would be idle, and a 
third instance would get started.

Another complication: in our production environment, I increased 
min_instances to 2 because of this issue 
<https://groups.google.com/g/google-appengine/c/SeVMpquMdnQ/m/tEr9prqLAQAJ>. 
It's pretty important that we always (>99.9% anyway) have at least 1 
running instance, since apparently there's no way to get instance startup 
time below 20-30 seconds. To guarantee that, apparently we need 2 instances 
running most of the time, because an instance can get preempted at any time 
without a replacement being warmed up first. So now I'm not sure whether to 
set min_idle_instances to 1 or 2 or what in production.

Do I need to set max_idle_instances? The documentation says its default 
value is "automatic", but doesn't say what that actually means.
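
To make the question concrete, setting it explicitly would look something 
like the fragment below. The value 1 is only a placeholder chosen for 
illustration, not a recommendation, and how it interacts with 
min_idle_instances is exactly what is unclear.

```yaml
automatic_scaling:
  min_idle_instances: 1
  max_idle_instances: 1   # placeholder value; the documented default is "automatic"
  max_instances: 10
```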

I'm having a hard time figuring out these issues based on the documentation.

All I want is

   - Instances get shut down when I deploy a new version (I would have 
   thought this was always the case no matter what!)
   - Each of my environments normally just has 1 instance running, assuming 
   light traffic (1-10 queries per minute). Having 2 instances always running 
   in production is ok, if that's the only way to achieve the next point:
   - Never (or almost never) have zero instances running (aka almost never 
   have a query take more than 2 seconds because of warmup time)
   - Autoscale up to a reasonable maximum if traffic gets heavier

I didn't think this would be so difficult. I've been using App Engine for a 
long time, and thought I knew what I was doing, but I guess I've never used 
these options in the current environment (Python 3, standard).
