It's possible that new instances are not spinning up quickly enough to respond to spikes in load, causing requests to be dropped. I'd recommend lowering <cool-down-period-sec> from 120 to 60 (the minimum) so that the autoscaler checks occur more frequently, and lowering <target-utilization> from 0.7 to 0.5 so that the existing instances have more headroom while new ones spin up.
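Applied to the settings you posted, that change might look roughly like the following. Treat it as a sketch only: the two commented values are the ones I'm suggesting, and everything else is copied unchanged from your message.

```xml
<automatic-scaling>
  <min-num-instances>2</min-num-instances>
  <max-num-instances>10</max-num-instances>
  <!-- Lowered from 120 to 60 (the minimum) so autoscaler checks run more often -->
  <cool-down-period-sec>60</cool-down-period-sec>
  <cpu-utilization>
    <!-- Lowered from 0.7 so scale-up is triggered earlier -->
    <target-utilization>0.5</target-utilization>
  </cpu-utilization>
</automatic-scaling>
```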
Try this first to see whether the incidence of 502 errors improves. If it doesn't, you can adjust further by increasing <min-num-instances> or lowering <target-utilization> again. Java apps in particular can incur long startup times due to classpath scanning and initialization. If you can share some details about the frameworks and libraries you're using, as well as the request times you're seeing for loading requests (loading_request = 1), I may be able to give some further recommendations.

On Wednesday, September 28, 2016 at 10:36:46 AM UTC-4, Vinay Chitlangia wrote:
>
> Hi,
> I am using appengine flexible environment.
>
> I am facing issues when there is a sudden spurt of requests to our system.
> (About 30 requests in a second). The server gets back to normal in 5-10
> seconds (perhaps with the decrease in traffic again).
> Generally the requests come at about 5-8 per second.
>
> I am running with 2 instances.
>
> <automatic-scaling>
>   <min-num-instances>2</min-num-instances>
>   <max-num-instances>10</max-num-instances>
>   <cool-down-period-sec>120</cool-down-period-sec>
>   <cpu-utilization><target-utilization>0.7</target-utilization></cpu-utilization>
> </automatic-scaling>
>
> AFAICT the requests did not hit my servlet. (I added a log, the first line
> in the servlet, and for failed request it does not get printed).
> "loading_request" is 0 for the failed requests as well.
>
> The error code is 502
>
> Is it that the requests are queued while a new instance comes up? Is it
> possible to override that, that is start the process of bringing up the new
> servers, but use the old one (if it is taking a while?). The log shows some
> of the request waiting for 600 seconds, most though fail in 10-15ms.
> The failure happens for between 0.5 to 2% of the requests