Hello,
As Olu mentioned, several settings, called Resources, can affect scaling decisions, as seen in this documentation [1], which lists CPU, as you have stated, but also memory. Depending on your configuration in the app.yaml file, these settings determine the resources available per instance, because App Engine creates a machine type based on the CPU and memory values you set. That machine is guaranteed to have at least what you specified, and possibly more.

As for the target utilization, it is the average CPU utilization that the autoscaler tries to maintain across your instances [2]. If the average CPU utilization is higher than the target, the autoscaler creates new instances; if it is lower than the target, the autoscaler removes instances to bring the average back toward the value you have set [3].

So, in your case, since you have set the target CPU utilization to 0.5 and your 3 instances are at 0.1 (they are not idle, as they still have a load to work on), the average is lower than the target and the autoscaler would scale down by removing instances. You can prevent that by setting min_num_instances to 3 in the app.yaml, as indicated in [1].

[1] https://cloud.google.com/appengine/docs/flexible/python/reference/app-yaml#resource-settings
[2] https://cloud.google.com/compute/docs/autoscaler/scaling-cpu-load-balancing?hl=en_US&_ga=2.46193063.-1877344601.1575944051#scaling_based_on_cpu_utilization
[3] https://cloud.google.com/compute/docs/autoscaler/scaling-cpu-load-balancing?hl=en_US&_ga=2.46193063.-1877344601.1575944051#scaling_based_on_cpu_utilization
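
To make the arithmetic behind the 3-instances-at-0.1 example concrete, here is a rough Python sketch of a proportional scaling rule. It is only an illustration: the function name recommended_instances and the formula (total load divided by target, rounded up) are assumptions of mine, not the actual algorithm App Engine or the Compute Engine autoscaler uses, which also takes cool-down periods and other signals into account. The parameter names mirror the app.yaml settings, and the default values shown are assumptions as well.

```python
import math


def recommended_instances(current_instances, avg_cpu_utilization,
                          target_utilization=0.5,
                          min_num_instances=2, max_num_instances=20):
    """Estimate an instance count with a simplified proportional model.

    Hypothetical helper for illustration only; it is NOT the real
    autoscaler algorithm. It asks: how many instances would be needed
    so that the same total CPU load averages out to the target?
    """
    if current_instances < 1:
        raise ValueError("current_instances must be at least 1")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")

    # Total load expressed in "fully busy instance" units.
    total_load = current_instances * avg_cpu_utilization

    # Round away floating-point noise before taking the ceiling.
    needed = math.ceil(round(total_load / target_utilization, 9))

    # Clamp to the configured bounds (min_num_instances wins at the low end).
    return max(min_num_instances, min(max_num_instances, needed))


if __name__ == "__main__":
    # 3 instances at 0.1 with a 0.5 target: the load fits on 1 instance.
    print(recommended_instances(3, 0.1, min_num_instances=1))  # 1
    # With min_num_instances set to 3, the count stays at 3.
    print(recommended_instances(3, 0.1, min_num_instances=3))  # 3
    # 3 instances at 0.9 are above the target, so the model scales up.
    print(recommended_instances(3, 0.9, min_num_instances=1))  # 6
```

In this model the total load of 3 x 0.1 = 0.3 fits on a single instance at a 0.5 target, which is why the instance count would drop unless min_num_instances holds it at 3.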