Hi,
I found the issue and the solution.

I thought that GCP was killing off the VM/container running the App Engine 
server, as the logging and request handling would just STOP right when this 
larger query was made. I was sure it had something to do with a limit on 
the amount of data that an App Engine service could request from another 
service. However, this is a TCP connection to MongoDB, and while HTTP has 
a 32 MB limit, TCP supposedly has no limit. So for a while we just 
modified the query to return a much smaller result while I continued to 
search for an answer.
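
For anyone wanting to try the same stopgap, here is a rough sketch (not the 
code we actually run) of trimming a query with a projection and a limit. It 
only builds the options object. The name buildFindOptions, the field names 
and the limit of 500 are made up for illustration.

```typescript
// Hypothetical helper, for illustration only: builds options that shrink a
// MongoDB result set by returning selected fields and capping the count.
interface FindOptions {
  projection: Record<string, 1>;
  limit: number;
}

function buildFindOptions(fields: string[], limit: number): FindOptions {
  if (Math.floor(limit) !== limit || limit <= 0) {
    throw new RangeError("limit must be a positive integer");
  }
  const projection: Record<string, 1> = {};
  for (const field of fields) {
    projection[field] = 1; // 1 means "include this field"
  }
  return { projection, limit };
}

// Placeholder field names and limit, assumed for the example.
const options = buildFindOptions(["email", "createdAt"], 500);
```

With the MongoDB Node.js driver, an object of this shape can be passed as 
the second argument to collection.find(filter, options).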

The weirdest part was that this HTTP route/App Engine service would work 
most of the time, handling very large queries without issues. The catch is 
that it needs to run from a cron/Cloud Scheduler job at midnight and work 
consistently, every time. But it was failing some nights for some unknown 
reason, and I couldn't find any exceptions from the code in the logs. It 
turns out I had somehow missed an error message from Google App Engine 
itself, probably due to our overzealous logging, letting me know about 
memory constraints.

So yesterday we had the same issue. 

Searching through the logs, I found that the HTTP request we were making 
to the App Engine service that runs this large query was failing. It was 
scheduled in Cloud Scheduler, so I could see in its logs that it errored out.

Looking through the logs I found this gem of a log message:

"Exceeded soft memory limit of 256 MB with 269 MB after servicing 0 
requests total. Consider setting a larger instance class in app.yaml."

It turns out that the VM/container (whatever is being used to host an 
App Engine service) has a memory limit of 256 MB by default. On some 
nights our site would not get many requests, or a fresh App Engine 
instance would be spun up and handle the expensive query request on its 
own, free to use all 256 MB of memory. Otherwise the query was sharing 
memory with all the other "normal" requests coming through our service.
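
To make the arithmetic concrete, here is a small sketch of the kind of 
check that would have surfaced this earlier. It is only an illustration: 
checkMemory, the 80% warning ratio and the 150 MB baseline are assumptions 
invented for the example. Only the 256 MB limit and the 269 MB figure come 
from the log message above.

```typescript
const BYTES_PER_MB = 1024 * 1024;

type MemoryStatus = "ok" | "warn" | "over";

interface MemoryCheck {
  usedMb: number;
  headroomMb: number; // negative once the limit is exceeded
  status: MemoryStatus;
}

// Hypothetical helper: compares resident memory (in bytes) against an
// instance memory limit (in MB). warnRatio is an arbitrary example value.
function checkMemory(rssBytes: number, limitMb: number, warnRatio = 0.8): MemoryCheck {
  const usedMb = Math.round(rssBytes / BYTES_PER_MB);
  const headroomMb = limitMb - usedMb;
  let status: MemoryStatus = "ok";
  if (usedMb > limitMb) {
    status = "over";
  } else if (usedMb >= limitMb * warnRatio) {
    status = "warn";
  }
  return { usedMb, headroomMb, status };
}

// The figures from the log line: 269 MB used against a 256 MB limit.
const fromLog = checkMemory(269 * BYTES_PER_MB, 256);

// An instance already holding an assumed 150 MB for normal traffic has
// only 106 MB left for the big query.
const shared = checkMemory(150 * BYTES_PER_MB, 256);
```

In a Node.js service the first argument could be process.memoryUsage().rss, 
logged right before and after the expensive query.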

So to solve this I created a Dockerfile (basically the template provided 
by the documentation), built the image, uploaded it to Container Registry 
and created a Cloud Run service. So now we have routing to a service 
dedicated to handling these memory-expensive requests, and we can easily 
configure the memory limit on each of the different Cloud Run services.
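
One detail worth sketching for anyone following the same route: a Cloud Run 
container is expected to listen on the port given in the PORT environment 
variable, which defaults to 8080. The helper below is a hypothetical 
illustration (the name resolvePort is an assumption, not something from our 
service) of reading that value defensively.

```typescript
// Hypothetical helper: picks the port a Cloud Run container should listen on.
// Cloud Run passes it in the PORT environment variable; 8080 is the default.
const DEFAULT_PORT = 8080;

function resolvePort(env: Record<string, string | undefined>): number {
  const raw = env["PORT"];
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_PORT;
  }
  const port = Number(raw);
  // Reject non-numeric, fractional and out-of-range values.
  const valid = Math.floor(port) === port && port > 0 && port < 65536;
  return valid ? port : DEFAULT_PORT;
}
```

In the service this would be called as resolvePort(process.env) before 
starting the HTTP server. The memory limit itself is set at deploy time, 
for example with the --memory flag of gcloud run deploy.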

So the issue, it turns out, was that we ran out of RAM in a shared 
environment, and the solution was to move the service off to its own 
hosting solution, Cloud Run. Now we can route these expensive requests, 
and only these requests, to the new Cloud Run instances dedicated to 
handling the load of the query.

Thank you so much for your time, anyone who tried to help!

Thanks and hope this is helpful for anyone else,
Brad Barrows

On Wednesday, June 9, 2021 at 8:59:38 AM UTC-7 Bradley Barrows wrote:

> I thought that the original post did not actually send. My browser quit 
> right after hitting send so I rewrote it.
>
> I also found no acknowledgement that the original was received so had no 
> way of verifying if I needed to retype it or not.
>
> Sorry about that.
>
> Thanks
> Brad
>
> On Wed, Jun 9, 2021 at 3:01 AM 'Angel (Google Cloud Platform Support)' via 
> Google App Engine <google-a...@googlegroups.com> wrote:
>
>> Hello Brad,
>>
>> I believe this is a similar, if not the same question that has already 
>> been answered here [1]. Please let me know if this is the case or if I 
>> misunderstood your question.
>>
>> Kind regards.
>> __________________________
>> [1] - 
>> https://groups.google.com/g/google-appengine/c/Lj6qaT2lC90/m/TvhGSteRAwAJ
>>
>> On Tuesday, June 8, 2021 at 9:05:27 AM UTC+2 bradeba...@gmail.com wrote:
>>
>>> Hi
>>>
>>> I have a NodeJS App Engine server which is awaiting a MongoDB query 
>>> which, when reaching a certain size, never returns. There is no exception 
>>> thrown from App Engine either. It is as if the process is just killed 
>>> because execution flow and therefore logging from the process making the 
>>> MongoDB query stops.
>>>
>>> If this were a request to an HTTP server I could look at the limits and 
>>> quotas page and see that a maximum request and response size of 32 MB is 
>>> allowed.
>>>
>>> I am unable to find any limits or quotas for TCP connections from an App 
>>> Engine server.
>>>
>>> We temporarily solved the issue by using a limit or projection to 
>>> decrease the amount of data returned by the MongoDB query. When less data 
>>> is sent back to the App Engine server execution continues and the server 
>>> runs as expected.
>>>
>>> I was wondering whether the App Engine process gets killed, without any 
>>> logging, once TCP data usage reaches some limit. Why would the execution 
>>> of the process handling the request that fires off the MongoDB query stop 
>>> with no exception or anything further being logged?
>>>
>>> Thank you
>>> Brad
>>
>> -- 
>> You received this message because you are subscribed to a topic in the 
>> Google Groups "Google App Engine" group.
>> To unsubscribe from this topic, visit 
>> https://groups.google.com/d/topic/google-appengine/qQSy-QQLZjM/unsubscribe
>> .
>> To unsubscribe from this group and all its topics, send an email to 
>> google-appengi...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/google-appengine/7e3c5f5b-cc3b-4460-965f-00a20896feb0n%40googlegroups.com
>> .
>>
>

