Re: [google-appengine] Re: Error while handling "large'ish" number of parallel requests

'Adam (Cloud Platform Support)' via Google App Engine Fri, 07 Oct 2016 14:53:29 -0700

Thanks for the updates. If the issue happens again, it would also be useful 
to know the message detail from the 502 response. In the context of 
App Engine flexible, a 502 usually means there are no instances ready to 
serve the request, but it's good to avoid any ambiguity.
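
For reference, here is roughly how the scaling settings suggested in the 
quoted thread below would look when applied to the originally posted 
configuration. This is only a sketch assembled from values mentioned in this 
thread (a 60-second cool-down period and a target utilization of 0.5, with 
the instance limits left unchanged); the right values for a given app may 
differ.

```xml
<automatic-scaling>
  <!-- Instance limits unchanged from the originally posted configuration -->
  <min-num-instances>2</min-num-instances>
  <max-num-instances>10</max-num-instances>
  <!-- Lowered from 120 to 60 so the autoscaler checks occur more frequently -->
  <cool-down-period-sec>60</cool-down-period-sec>
  <cpu-utilization>
    <!-- Lowered from 0.7 to 0.5 to leave headroom while new instances start -->
    <target-utilization>0.5</target-utilization>
  </cpu-utilization>
</automatic-scaling>
```

If 502 errors persist, raising <min-num-instances> or lowering 
<target-utilization> further are the next adjustments to try.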


On Wednesday, October 5, 2016 at 3:11:53 AM UTC-4, Vinay Chitlangia wrote:
>
>
>
> On Wed, Oct 5, 2016 at 11:16 AM, Vinay Chitlangia wrote:
>
>>
>>
>> On Tue, Oct 4, 2016 at 10:51 PM, 'Adam (Cloud Platform Support)' via 
>> Google App Engine <google-appengine@googlegroups.com> wrote:
>>
>>> It's possible that new instances may not be spinning up quickly enough 
>>> to respond to spikes in load, causing requests to be discarded. I'd 
>>> recommend lowering the <cool-down-period-sec> value to 60 (the minimum) to 
>>> allow the autoscaler checks to occur more frequently, and keep 
>>> <target-utilization> at 0.5 to give new instances more headroom to spin up.
>>>
>> Thanks Adam.
>> I have restarted our servers with target-utilization of 0.4 (have kept 
>> the cool-down-period the same..in the interest of changing one variable at 
>> a time!!) 
>>
> So this is what has happened...I am not sure what to make of it...
> The number of servers is now at 10. The error rate is 0.6% (with the 
> latest deployment), so it's at the lower end; it certainly has not broken 
> the proverbial glass:)
> The total number of requests for the day is well within reason...so there 
> is no X factor, in that we do not have an unusually busy or lax day.
> While I was at it, I have changed the cool down period to what you 
> suggested...I am all in:):)
>
>>
>>> Try this first to see if the incidences of 502 errors improve. You can 
>>> try to adjust further by increasing the number of <min-num-instances> or 
>>> lowering <target-utilization> further.
>>>
>>> Java apps in particular can incur large startup times due to classpath 
>>> scanning and initialization. If you can give some details about the types 
>>> of frameworks and libraries you're using, as well as the request times 
>>> you're seeing for loading requests (loading_request = 1), I may be able to 
>>> give some further recommendations.
>>>
>> This is a backend server with Cloud Bigtable as its major dependency. I 
>> could not correlate the 502s in my server with the Bigtable errors (as 
>> reported by the cluster dashboard).
>> I will try to see if I can get some loading_request=1 requests when the 
>> server is under duress. The very first requests at the time of server 
>> startup are well behaved, that is to say, my server does not have a 
>> very big upfront cost...it does about 2 seconds' worth of initialization, 
>> which is done by a <load-on-startup> servlet.
>>
>>> On Wednesday, September 28, 2016 at 10:36:46 AM UTC-4, Vinay Chitlangia 
>>> wrote:
>>>>
>>>> Hi,
>>>> I am using the App Engine flexible environment.
>>>>
>>>> I am facing issues when there is a sudden spurt of requests to our 
>>>> system (about 30 requests in a second). The server gets back to normal in 
>>>> 5-10 seconds (perhaps as the traffic decreases again).
>>>> Generally the requests come in at about 5-8 per second.
>>>>
>>>> I am running with 2 instances.
>>>>
>>>> <automatic-scaling>
>>>>   <min-num-instances>2</min-num-instances>
>>>>   <max-num-instances>10</max-num-instances>
>>>>   <cool-down-period-sec>120</cool-down-period-sec>
>>>>   <cpu-utilization>
>>>>     <target-utilization>0.7</target-utilization>
>>>>   </cpu-utilization>
>>>> </automatic-scaling>
>>>>
>>>>
>>>> AFAICT the requests did not hit my servlet. (I added a log statement as 
>>>> the first line in the servlet, and for failed requests it does not get 
>>>> printed.)
>>>> "loading_request" is 0 for the failed requests as well.
>>>>
>>>> The error code is 502
>>>>
>>>> Is it that the requests are queued while a new instance comes up? Is it 
>>>> possible to override that, that is, to start the process of bringing up 
>>>> the new servers but keep using the old ones (if it is taking a while)? 
>>>> The log shows some of the requests waiting for 600 seconds; most, though, 
>>>> fail in 10-15 ms.
>>>> The failure happens for between 0.5% and 2% of the requests.
>>>>
>>> -- 
>>> You received this message because you are subscribed to a topic in the 
>>> Google Groups "Google App Engine" group.
>>> To unsubscribe from this topic, visit 
>>> https://groups.google.com/d/topic/google-appengine/TvPHzbMMbhc/unsubscribe
>>> .
>>> To unsubscribe from this group and all its topics, send an email to 
>>> google-appengine+unsubscr...@googlegroups.com.
>>> To post to this group, send email to google-appengine@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/google-appengine.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/google-appengine/f9fc1696-d71f-4684-a490-e612f9d17d23%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/google-appengine/f9fc1696-d71f-4684-a490-e612f9d17d23%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>

