[google-appengine] The region us-west2 does not have enough resources - on deployment

Pip Jones Sat, 12 Mar 2022 11:04:53 -0800

When trying to deploy my App Engine Flex app today, I am getting this
error after the build has completed:

`ERROR: (gcloud.app.deploy) Error Response: [9] An internal error occurred
while processing task
/app-engine-flex/flex_await_healthy/flex_await_healthy>2022-03-12T14:09:32.742Z6575.hg.2:

The region us-west2 does not have enough resources available to fulfill the
request. Please try again later.`

After about 10 attempts over about 2 hours, it killed off my previous
running version's two instances, so now my company's app is down and I
cannot bring it back up.

I get this error pretty regularly when deploying new versions (almost every
time) but it usually succeeds after a couple of attempts, so I've just
lived with it. But now my site is down and it's 4 hours later, and I'd
really like to know if there's something I can do to fix it, or is it just
a case of waiting for the zone to have more capacity?
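
For reference, the "retry until it succeeds" approach I have been doing by hand could be scripted roughly as below. This is only a sketch: deploy_with_backoff and gcloud_deploy are names I made up for illustration, and the attempt count and delays are arbitrary assumptions, not recommended values. It does nothing about the underlying capacity problem.

```python
import subprocess
import time


def deploy_with_backoff(run_deploy, max_attempts=5, base_delay_sec=60, sleep=time.sleep):
    """Call run_deploy() until it returns True or the attempts run out.

    run_deploy is any zero-argument callable that returns True on success.
    The wait doubles after each failure: base, 2*base, 4*base, ...
    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if run_deploy():
            return attempt
        if attempt < max_attempts:
            sleep(base_delay_sec * 2 ** (attempt - 1))
    return None


def gcloud_deploy():
    """Run `gcloud app deploy` non-interactively; True if the exit code is 0."""
    # --quiet skips the confirmation prompt.
    # --no-promote deploys the new version without routing traffic to it.
    result = subprocess.run(["gcloud", "app", "deploy", "--quiet", "--no-promote"])
    return result.returncode == 0
```

Calling deploy_with_backoff(gcloud_deploy) would then retry the deployment with increasing pauses between attempts.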

A newly deployed version shows up in the console (and in the command-line
version list) for a while, but then disappears on its own.

I have checked my quotas under IAM & Admin, and nothing is above 30%
allocation at most. Besides, now that my site is not running, I don't have
many resources in use.

I noticed the previous version had 2 instances which seemed stuck in
"restarting" state from a couple of weeks ago. I killed them off manually
in the console thinking these might have been consuming resources. I wonder
if this has somehow skewed the auto-scaler? I was hoping it would
eventually repair itself, as it seems GCP sometimes takes a while to do
stuff in the background.

I have tried restarting the previous version in the console, but it just
sits at 0 instances. It's autoscaled, but the autoscaling has stopped
working. I tried stopping it, waiting, then restarting it.

I have checked the Stackdriver logs and it's definitely a
ZONE_RESOURCE_POOL_EXHAUSTED error, e.g.:
serviceName: "compute.googleapis.com"
status: {
  code: 8
  details: [
    0: {
      @type: "type.googleapis.com/google.protobuf.Struct"
      value: {
        zoneResourcePoolExhausted: {
          resource: {
            project: {
              canonicalProjectId: "XXX"
            }
            resourceName: "us-west2-b"
            resourceType: "ZONE"
            scope: {
              scopeName: "global"
              scopeType: "GLOBAL"
            }
          }
        }
      }
    }
  ]
  message: "ZONE_RESOURCE_POOL_EXHAUSTED"
}
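
To check many such entries for which zone is affected, something like the following could pull the zone name out of each one. It is a minimal sketch that assumes the entry has been exported as JSON and loaded into a dict with the same nesting as the excerpt above; exhausted_zones and sample_entry are illustrative names, and a real exported entry may wrap these fields in further layers.

```python
def exhausted_zones(entry):
    """Return the zone names reported as exhausted in one log entry dict."""
    zones = []
    status = entry.get("status", {})
    for detail in status.get("details", []):
        exhausted = detail.get("value", {}).get("zoneResourcePoolExhausted")
        if exhausted:
            zones.append(exhausted.get("resource", {}).get("resourceName"))
    return zones


# Hand-built dict mirroring the fields shown in the log excerpt (assumed shape).
sample_entry = {
    "serviceName": "compute.googleapis.com",
    "status": {
        "code": 8,
        "details": [
            {
                "@type": "type.googleapis.com/google.protobuf.Struct",
                "value": {
                    "zoneResourcePoolExhausted": {
                        "resource": {
                            "resourceName": "us-west2-b",
                            "resourceType": "ZONE",
                        }
                    }
                },
            }
        ],
        "message": "ZONE_RESOURCE_POOL_EXHAUSTED",
    },
}

print(exhausted_zones(sample_entry))  # prints ['us-west2-b']
```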

I have tried increasing app_start_timeout_sec under readiness_check, as
well as failure_threshold and the other timeouts, in case startup was only
just timing out, but judging by the logs, the instance doesn't even begin
to boot (because the VM is never allocated).

I tried re-deploying the previous version again.

I tried stopping the current version (which previously was "SERVING" but
with 0 instances) and then deploying, but this doesn't help. So at this
point I'm deploying over nothing running at all in my project, confirming
it cannot be quotas.

In my service logs, though, I noticed seemingly inconsistent reports of the
number of instances. This doesn't make sense because there are NO instances
running either before or after deployment.

2022-03-12 13:09:55.422 GMT  The number of running VMs for version 20220211t141828 changed from 2 to 1
2022-03-12 13:10:19.089 GMT  The number of running VMs for version 20220211t141828 changed from 1 to 3
2022-03-12 13:10:25.919 GMT  The number of running VMs for version 20220211t141828 changed from 3 to 4
2022-03-12 13:10:41.160 GMT  The number of running VMs for version 20220211t141828 changed from 4 to 2
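
A quick way to tally these messages is sketched below. It assumes the log lines are available as plain strings in the wording shown above; vm_count_changes is an illustrative helper, not part of any Google tooling. It shows that the messages agree with each other (each one starts from the count the previous one ended on) and that the last reported count is 2, even though the console shows no instances.

```python
import re

PATTERN = re.compile(
    r"The number of running VMs for version (\S+) changed from (\d+) to (\d+)"
)


def vm_count_changes(lines):
    """Return (version, before, after) tuples for each matching log line."""
    changes = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            version, before, after = match.groups()
            changes.append((version, int(before), int(after)))
    return changes


log_lines = [
    "The number of running VMs for version 20220211t141828 changed from 2 to 1",
    "The number of running VMs for version 20220211t141828 changed from 1 to 3",
    "The number of running VMs for version 20220211t141828 changed from 3 to 4",
    "The number of running VMs for version 20220211t141828 changed from 4 to 2",
]

changes = vm_count_changes(log_lines)
# True when every message starts from the count the previous message ended on.
chained = all(prev[2] == cur[1] for prev, cur in zip(changes, changes[1:]))
print(changes[-1][2], chained)  # prints: 2 True
```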

I tried deploying the app under a different service name (and was going to
change my dispatch to reroute to that), but that service deployment failed
with the same error.

The status pages look OK.

I've tried --verbosity=debug which didn't reveal any extra info.

I've read every post I can find (including my own previous post in this
group, where it was a "prerequisite" error caused by quotas), and the only
option I seem to be left with is migrating my app to a new project in a
more reliable region like us-central. However, this will be a lot of work,
as I'm using GCS, Functions, and networking to other providers, which will
all have to be migrated.

Is there any way to get more detailed information on the resource problem?

thanks
Pip Jones

--
To view this discussion on the web visit
https://groups.google.com/d/msgid/google-appengine/7aa4e573-364e-4555-9d7c-80dd5f7c43c1n%40googlegroups.com.
