The Google Groups discussion forum is meant for open-ended discussions. Issues 
such as these tend to be project/application specific most of the time, so I 
would recommend posting on the App Engine public issue tracker 
<https://cloud.google.com/support/docs/issue-trackers>. 


On Wednesday, August 23, 2017 at 10:38:49 AM UTC-4, Karikalan Kumaresan 
wrote:
>
> Hi, I am facing the same issue. I am running Spring Boot on GAE Flex. 
> While it runs fine for around 30 concurrent users, when the number of users 
> increases it throws 502 errors and Tomcat gets restarted. I am not sure 
> what causes this, and I am not seeing any useful errors in the logs. Is 
> there any resolution for this? We are blocked on this issue, so any help would be appreciated.
>
> On Tuesday, 15 August 2017 22:35:45 UTC+5:30, Tomas Erlingsson wrote:
>>
>> Did this get resolved?  We have a Flex Java app running in development 
>> with almost no traffic. We are constantly getting 502s telling us to try 
>> again in 30 seconds, and our server app is rebooted many times a day. I run 
>> this locally without any problems and am not seeing any errors in the log.
>>
>> On Friday, 10 February 2017 06:07:26 UTC, Vinay Chitlangia wrote:
>>>
>>>
>>>
>>> On Thu, Feb 9, 2017 at 7:52 PM, 'Nicholas (Google Cloud Support)' via 
>>> Google App Engine <google-a...@googlegroups.com> wrote:
>>>
>>>> I realize that we've already begun investigating this here, but I think 
>>>> this would be most appropriate for the App Engine public issue tracker.  
>>>> The issue is becoming increasingly specific, and I suspect it will 
>>>> require some exchange of code/project to reproduce the behavior you've 
>>>> described.  We monitor that issue tracker closely.
>>>>
>>>> When filing a new issue on the tracker, please link back to this thread 
>>>> for context, and post a link to the issue here so that others in the 
>>>> community can see the whole picture.
>>>>
>>>>    - Be sure to include the latest logs related to the *502*s.  When 
>>>>    viewing the logs in Stackdriver Logging, for instance, include *All 
>>>>    logs* rather than just *request_log*, as the *nginx.error*, *stderr*, 
>>>>    *stdout* and *vm.** logs may reveal clues as to a root cause.
>>>>    - Mention if you are using any middleware, such as servlet filters, 
>>>>    that may receive requests before the actual handler.
>>>>    - Lastly, include what the CPU and/or memory usage looks like on 
>>>>    the instance(s) at the time of the 502s.  Screenshots of the 
>>>>    *Utilization* and *Memory Usage* graphs from the Developers Console 
>>>>    will likely be sufficient.
>>>>    
>>>> I look forward to this issue report.
>>>>
>>> https://code.google.com/p/googleappengine/issues/detail?id=13543
>>> The logs are "All logs" from around the time of the incident, though only 
>>> as a copy/paste from the browser. I couldn't retrieve any logs using gcloud 
>>> beta logging read. This is the command I tried:
>>> gcloud beta logging read 'timestamp >= "2017-02-11T03:00:00Z" AND 
>>> timestamp <= "2017-02-12T03:05:00Z"' 
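>>> (In case the bare timestamp filter is the problem, the next thing I plan 
>>> to try is narrowing the filter to the App Engine resource type and adding 
>>> a limit. The resource type below is my assumption of how the flex logs 
>>> are labelled, not something I have verified yet:
>>> gcloud beta logging read 'resource.type="gae_app" AND 
>>> timestamp >= "2017-02-11T03:00:00Z" AND 
>>> timestamp <= "2017-02-12T03:05:00Z"' --limit 500 --format json
>>> )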
>>>
>>>>
>>>> On Wednesday, February 8, 2017 at 1:24:01 PM UTC-5, Vinay Chitlangia 
>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 8, 2017 at 10:29 PM, 'Nicholas (Google Cloud Support)' via 
>>>>> Google App Engine <google-a...@googlegroups.com> wrote:
>>>>>
>>>>>> Hey Vinay Chitlangia,
>>>>>>
>>>>>> Thanks for some preliminary troubleshooting and linking this 
>>>>>> interesting article.  App Engine runs Nginx processes to handle routes 
>>>>>> to your application's handlers.  Handlers serving static assets, for 
>>>>>> instance, are handled by this Nginx process and the resources are 
>>>>>> served directly, thus bypassing the application altogether to save on 
>>>>>> precious application resources.
>>>>>>
>>>>>> The Nginx process will often serve a *502* if the application raises 
>>>>>> an exception, if an internal API call raises an exception, or if the 
>>>>>> request simply takes too long.  As such, the status code by itself 
>>>>>> does not tell us much.
>>>>>>
>>>>>> Looking at the GAE logs for your application, I found the *502*s you 
>>>>>> mentioned.  One thing I noticed is that they all occur on the */read* 
>>>>>> endpoint.  From the naming, I assume this endpoint is reading some 
>>>>>> data from BigTable.  To investigate further, perhaps you could provide 
>>>>>> some additional information:
>>>>>>
>>>>>>    - What exactly is happening at the */read* endpoint?  A code 
>>>>>>    sample would be ideal if that's not too sensitive.
>>>>>>    
>>>>> As you surmised, we are reading some data from Bigtable in this 
>>>>> endpoint.
>>>>>
>>>>>>
>>>>>>    - What kind of error handling exists in said endpoint if the 
>>>>>>    BigTable API returns non-success responses?
>>>>>>    
>>>>> The entire endpoint is in a try/catch block catching Exception. In 
>>>>> the case of failure, the exception stack trace gets written to the logs.
>>>>> The first line of the endpoint is a log message signalling receipt of 
>>>>> the request (added for this debugging, of course!).
>>>>> For successful requests the introductory log message gets written; for 
>>>>> the 502 ones it never does.
>>>>> For requests that fail because of Bigtable-related errors, the logs 
>>>>> have the stack trace, but not for the 502s.
>>>>> The requests that fail with 502 finish in <10 ms.
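>>>>> To make the structure concrete, the handler is shaped roughly like the 
>>>>> sketch below (the class, logger and message names are illustrative, not 
>>>>> our actual code):
>>>>>
>>>>> import java.io.IOException;
>>>>> import java.util.logging.Level;
>>>>> import java.util.logging.Logger;
>>>>> import javax.servlet.http.HttpServlet;
>>>>> import javax.servlet.http.HttpServletRequest;
>>>>> import javax.servlet.http.HttpServletResponse;
>>>>>
>>>>> public class ReadServlet extends HttpServlet {
>>>>>   private static final Logger log = Logger.getLogger(ReadServlet.class.getName());
>>>>>
>>>>>   @Override
>>>>>   protected void doPost(HttpServletRequest req, HttpServletResponse resp)
>>>>>       throws IOException {
>>>>>     log.info("received /read request");  // never shows up for the 502 requests
>>>>>     try {
>>>>>       // ... read the requested rows from Bigtable and write the response ...
>>>>>     } catch (Exception e) {
>>>>>       // shows up for Bigtable-related failures, but never for the 502s
>>>>>       log.log(Level.SEVERE, "read failed", e);
>>>>>       resp.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
>>>>>     }
>>>>>   }
>>>>> }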
>>>>>
>>>>>>
>>>>>>    - Can you log various steps in the */read* endpoint?  This might 
>>>>>>    help identify the progress the request reaches before the *502* 
>>>>>>    is served.  It would also help in confirming that your application is 
>>>>>>    actually even getting the request as I can't currently confirm that 
>>>>>> from 
>>>>>>    the logs.
>>>>>>    
>>>>> My best guess is that the request does not make it to the servlet. 
>>>>> The reason is that of the hundreds of failed 502 requests I have seen 
>>>>> in the logs, not one has the log message, which is the absolute first 
>>>>> line in the code of the read handler. 
>>>>>
>>>>>>
>>>>>>    - If said endpoint does in fact read from BigTable, what API and 
>>>>>>    Java library are you using?
>>>>>>
>>>>> We are using the Google-provided bigtable-hbase 1.2 jars, version 
>>>>> 0.9.4. 
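>>>>> In case it helps, the read path uses the plain HBase API through that 
>>>>> client, roughly along the lines below (the project, instance, table and 
>>>>> row key names are placeholders, not our real ones):
>>>>>
>>>>> import com.google.cloud.bigtable.hbase.BigtableConfiguration;
>>>>> import java.io.IOException;
>>>>> import org.apache.hadoop.hbase.TableName;
>>>>> import org.apache.hadoop.hbase.client.Connection;
>>>>> import org.apache.hadoop.hbase.client.Get;
>>>>> import org.apache.hadoop.hbase.client.Result;
>>>>> import org.apache.hadoop.hbase.client.Table;
>>>>> import org.apache.hadoop.hbase.util.Bytes;
>>>>>
>>>>> public class BigtableReader {
>>>>>   // The connection is created once and reused across requests.
>>>>>   private static final Connection connection =
>>>>>       BigtableConfiguration.connect("my-project", "my-instance");
>>>>>
>>>>>   static Result readRow(String rowKey) throws IOException {
>>>>>     try (Table table = connection.getTable(TableName.valueOf("my-table"))) {
>>>>>       return table.get(new Get(Bytes.toBytes(rowKey)));
>>>>>     }
>>>>>   }
>>>>> }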
>>>>>
>>>>>> Regarding the article you linked: while the configuration of an HTTPS 
>>>>>> load balancer and nginx.conf can be very important, both the load 
>>>>>> balancing component and nginx.conf are out of the developer's hands on 
>>>>>> App Engine.  The scaling settings, health check settings and handlers 
>>>>>> in your app.yaml are the only settings under your control that affect 
>>>>>> the load balancing and nginx rules.
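>>>>>>
>>>>>> To illustrate, a flexible environment app.yaml touching those areas 
>>>>>> might look roughly like the following; the values are purely 
>>>>>> illustrative and the exact keys available depend on your runtime and 
>>>>>> environment version:
>>>>>>
>>>>>> runtime: java   # or the compat runtime image you deploy with
>>>>>> env: flex
>>>>>> automatic_scaling:
>>>>>>   min_num_instances: 2
>>>>>>   max_num_instances: 10
>>>>>> health_check:
>>>>>>   enable_health_check: True
>>>>>>   check_interval_sec: 5
>>>>>>   timeout_sec: 4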
>>>>>>
>>>>>> On Wednesday, February 8, 2017 at 11:27:43 AM UTC-5, Vinay Chitlangia 
>>>>>> wrote:
>>>>>>>
>>>>>>> Might be related:
>>>>>>>
>>>>>>> https://blog.percy.io/tuning-nginx-behind-google-cloud-platform-http-s-load-balancer-305982ddb340#.6k2laoada
>>>>>>>
>>>>>>> The symptoms mentioned in this blog (a somewhat moderate request load 
>>>>>>> and no logs for the failing requests) match our observations.
>>>>>>>
>>>>>>> I do not see the 
>>>>>>> "backend_connection_closed_before_data_sent_to_client" status in the 
>>>>>>> logs.
>>>>>>>
>>>>>>> The error message for a failed request received by the client is:
>>>>>>> 11:12:44.549  com.yawasa.server.storage.RpcStorageService LogError: 
>>>>>>> <html><head><title>502 Bad Gateway</title></head><body 
>>>>>>> bgcolor="white"><center><h1>502 Bad 
>>>>>>> Gateway</h1></center><hr><center>nginx</center></body></html> 
>>>>>>> (RpcStorageService.java:137 
>>>>>>> <https://console.cloud.google.com/debug/fromlog?appModule=default&appVersion=1&file=RpcStorageService.java&line=137&logInsertId=589569d9000e7bf6825479e4&logNanos=1486186963359794000&nestedLogIndex=0&project=village-test>)
>>>>>>>
>>>>>>> The mention of nginx in the log message appears promising. We are 
>>>>>>> not deliberately using nginx ourselves, so I am assuming this is 
>>>>>>> something happening under the hood.
>>>>>>>
>>>>>>> On Tuesday, February 7, 2017 at 11:08:55 AM UTC+5:30, Vinay 
>>>>>>> Chitlangia wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>> We are seeing intermittent occurrences of 502 Bad Gateway error in 
>>>>>>>> our server.
>>>>>>>> About 0.5% requests fail with this error.
>>>>>>>>
>>>>>>>> Our setup is:
>>>>>>>> Flex running jetty9-compat
>>>>>>>> F1 machine
>>>>>>>> 1 server
>>>>>>>>
>>>>>>>> Our request pattern is bursty, so the server gets ~30 requests in 
>>>>>>>> parallel. 
>>>>>>>> The failures, when they happen, are clustered; over a period of 
>>>>>>>> roughly 10 seconds one would see 3-4 errors.
>>>>>>>>
>>>>>>>> The requests which complete successfully finish in 50-100 ms, so it 
>>>>>>>> does not appear that the server is under major load and unable to 
>>>>>>>> keep up.
>>>>>>>> To rule out this possibility, I started the servers with 5 replicas. 
>>>>>>>> However, the failure percentage did not change.
>>>>>>>>
>>>>>>>> From the looks of it, it appears that there is some throttling or 
>>>>>>>> quota issue at play. I tried tweaking the max-concurrent-requests 
>>>>>>>> param, setting it to 300, but that did not make any difference either.
>>>>>>>>
>>>>>>>> I do not see new instances being created at the time of failure 
>>>>>>>> either.
>>>>>>>>
>>>>>>>>
>>>>>>>> The request log for the failed request:
>>>>>>>> 09:57:30.686  POST /read  502  262 B  4 ms  AppEngine-Google; 
>>>>>>>> (+http://code.google.com/appengine; appid: s~village-test)
>>>>>>>> 107.178.194.3 - - [07/Feb/2017:09:57:30 +0530] "POST /read 
>>>>>>>> HTTP/1.1" 502 262 - "AppEngine-Google; (+
>>>>>>>> http://code.google.com/appengine; ms=4 cpu_ms=0 
>>>>>>>> cpm_usd=2.9279999999999998e-8 loading_request=0 instance=- 
>>>>>>>> app_engine_release=1.9.48 trace_id=-
>>>>>>>> {
>>>>>>>> protoPayload: {…}  
>>>>>>>> insertId: "58994cb30002335cb47fd364"  
>>>>>>>> httpRequest: {…}  
>>>>>>>> resource: {…}  
>>>>>>>> timestamp: "2017-02-07T04:27:30.686052Z"  
>>>>>>>> labels: {…}  
>>>>>>>>
>>>>>>>> operation: {…}  
>>>>>>>> }
>>>>>>>>
>>>>>>>> Looking at other logs from around the time of the failure, I see: 
>>>>>>>> 09:57:30.000[error] 32#32: *35107 recv() failed (104: Connection 
>>>>>>>> reset by peer) while reading response header from upstream, client: 
>>>>>>>> 169.254.160.2, server: , request: "POST /read HTTP/1.1", upstream: "
>>>>>>>> http://172.17.0.4:8080/read";, host: "bigtable-dev.appspot.com"
>>>>>>>> AFAICT this request never made it to our servlet.
>>>>>>>>