Re: [google-appengine] Updates from the Google App Engine team (Fall 2021)

Joshua Smith Fri, 05 Nov 2021 13:48:19 -0700

I figured out from the logs (since it prints a message for each worker it 
starts) that 3.7 is starting 8 workers by default. So adding an entrypoint 
that just spells out the defaults has no effect.
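
For anyone who wants to repeat the experiment: the worker count can be set explicitly in a gunicorn config file, which is plain Python. This is only a sketch. The file name gunicorn.conf.py, the 8080 fallback port, and the main:app module path in the comment are assumptions for illustration, not my actual setup.

```python
# gunicorn.conf.py (hypothetical file name; referenced from the app.yaml
# entrypoint as something like: gunicorn -c gunicorn.conf.py main:app,
# where main:app is an assumed module path)
import os

# Listen on the port the platform provides in the PORT environment
# variable; 8080 is only an assumed fallback for local runs.
bind = ":" + os.environ.get("PORT", "8080")

# Explicit worker count. 8 is the number the logs showed the 3.7 runtime
# starting by default on an F4_1G instance, so this value changes nothing;
# a different number would be needed to see an effect.
workers = 8
```

Since 8 matches the observed default, a config like this is only a starting point for trying other values.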


However, I did some experiments and I think I can characterize the difference 
between 2.7 and 3.7 when it comes to scaling.

When you first land on my app's page, it shows a list of thumbnails. These come 
from the datastore, and each thumbnail makes its own HTTP request (GET 
/image?which=1, GET /image?which=2, etc.). If we:

1. Query for all of these at once so the server gets hammered with simultaneous 
requests:
        - 2.7 starts one extra instance, and while that's happening the images 
are all served by the instance that already existed; the new instance never 
actually serves any of those image requests.
        - 3.7 starts a lot of instances; it dumps a bunch of requests to each 
of them (one per worker thread, I presume), all of which are going to take 5 
seconds because that's how long it takes each instance to start.

The net result is the user of the 2.7 app sees all the images load right away, 
while the 3.7 app gets them over the course of 15 seconds or so, and sometimes 
there are timeout errors.

2. Query for these with a delay that's long enough to avoid overlapping 
requests:
        - Neither starts extra instances.

The user sees all the images load pretty quickly, but not quite as quickly as 
2.7 does with the first approach.
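
The two request patterns can be sketched as follows. This is an illustration in Python with asyncio, not the page's real client code. FakeServer and peak_concurrency are hypothetical helpers that only count overlapping requests, and the second strategy waits for each response instead of using a fixed delay, which has the same effect of preventing overlap.

```python
import asyncio


async def fetch_all_at_once(fetch, count):
    """Approach 1: fire every thumbnail request simultaneously."""
    return await asyncio.gather(*(fetch(i) for i in range(1, count + 1)))


async def fetch_one_at_a_time(fetch, count):
    """Approach 2: send the next request only after the previous one returns."""
    results = []
    for i in range(1, count + 1):
        results.append(await fetch(i))
    return results


class FakeServer:
    """Hypothetical stand-in for the app; tracks how many requests overlap."""

    def __init__(self):
        self.in_flight = 0
        self.peak = 0

    async def fetch(self, which):
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0.01)  # pretend the server is doing some work
        self.in_flight -= 1
        return f"/image?which={which}"


def peak_concurrency(strategy, count=20):
    """Run one strategy against a fresh FakeServer and report peak overlap."""
    server = FakeServer()
    asyncio.run(strategy(server.fetch, count))
    return server.peak
```

Under this sketch the first strategy puts all 20 requests in flight together, while the second never has more than one outstanding.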

My conclusion is that the old scheduler wouldn't send requests to an instance 
until it was running, whereas the new scheduler queues up requests for 
instances that are still starting up. The old scheduler also doesn't start a 
new instance until the one it started most recently is running, whereas the 
new scheduler starts a new instance every time the request queue fills up for 
existing instances.

If you can serve a burst of requests in the time it takes an instance to start, 
the old scheduler will get all of them served. But the new scheduler will choke 
and perform very badly.

I'm not saying that the old scheduler was better. It just handled this 
particular case (cold start with a firehose) a lot better. Once the two systems 
are running at scale, I wouldn't expect there to be much difference.
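
Here is a toy model of that hypothesis. It is a guess at the behavior inferred from these experiments, not how either scheduler is actually implemented. The function names, the 50 ms service time, and the single-burst batching are all assumptions. The 8 slots and 5000 ms cold start come from the observations above.

```python
def old_scheduler_latencies(n_requests, slots, service_ms):
    """Hypothesized 2.7 behavior: the whole burst stays on the one warm
    instance, which works through it `slots` requests at a time."""
    return [(i // slots + 1) * service_ms for i in range(n_requests)]


def new_scheduler_latencies(n_requests, slots, service_ms, cold_start_ms):
    """Hypothesized 3.7 behavior: the warm instance takes the first `slots`
    requests; every later request is handed to an instance that is still
    starting, so it waits out the cold start before being served."""
    latencies = []
    for i in range(n_requests):
        if i < slots:
            latencies.append(service_ms)
        else:
            latencies.append(cold_start_ms + service_ms)
    return latencies


# A burst of 100 requests, 8 slots per instance, an assumed 50 ms per request
# and a 5000 ms cold start.
old = old_scheduler_latencies(100, 8, 50)
new = new_scheduler_latencies(100, 8, 50, 5000)
print(max(old))  # 650
print(max(new))  # 5050
```

In this model the warm instance clears the whole burst in 650 ms, well inside one cold start, which is the case where the old behavior wins. The model does not try to reproduce the waves of new instances or the 15 second timeouts I saw.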

-Joshua


> On Nov 5, 2021, at 3:29 PM, Joshua Smith <mrjoshuaesm...@gmail.com> wrote:
> 
> I think it's telling me that since my P3.7 app has a F4_1G instance, I should 
> be configuring it to run gunicorn with 8 workers. I don't see any information 
> about the default number of workers. I guess I can do an experiment to see 
> what explicitly setting that does...
> 
>> On Nov 5, 2021, at 3:17 PM, Jason Collins <jason.a.coll...@gmail.com> wrote:
>> 
>> It sounds like there might be some differences due to concurrency between 
>> 2.7 and 3.7/8/9. There are some notes on how to tweak the number of worker 
>> processes started for Python 3 here: 
>> https://cloud.google.com/appengine/docs/standard/python3/runtime#entrypoint_best_practices
>> 
>> 
>> On Friday, 5 November 2021 at 11:50:52 UTC-7 George (Cloud Platform Support) 
>> wrote:
>> There is no official description, somewhere, of "how the scaling has 
>> changed" or "how it has been implemented differently", as there is no change 
>> in scaling behavior, and no different implementation. This is an assumption 
>> while we investigate the situation described by Joshua above. I could not 
>> find a similar issue in the Public Issue Tracker 
>> <https://issuetracker.google.com/>, so the difference in scaling between 
>> Python 2 and 3 is likely a one-time occurrence, and caused by idiosyncratic 
>> settings in the project, or application-specific factors. Such issues are 
>> difficult to approach in a publicly accessible thread such as this one, as 
>> investigation needs particulars such as project ID, and access to other 
>> private data. To have this issue properly addressed, it's better to open a 
>> support case 
>> <https://cloud.google.com/support/docs/manage-cases#creating_cases> or an 
>> issue in the Public Issue Tracker <https://issuetracker.google.com/>. 
>> 
>> On Friday, 05 November 2021 at 05:50:42 UTC-4 Nicolas Fonrose (Teevity) 
>> wrote:
>> Hello David,
>> 
>> >There’s not an official way of forcing the scaling to behave like it used 
>> >to be on python 2.7.
>> Is there an official description, somewhere, of "how the scaling has 
>> changed" or "how it has been implemented differently" ?
>> Or is this change just a side effect of technical choices that were made in 
>> the new runtimes?
>> 
>> Thanks,
>> 
>> On Thursday, November 4, 2021 at 9:56:43 PM UTC+1 David (Cloud Platform 
>> Support) wrote:
>> Hello,
>> 
>> Other than changing your own application so that it goes easy on the server 
>> at startup, you can try using warmup requests 
>> <https://cloud.google.com/appengine/docs/standard/python3/configuring-warmup-requests>
>>  along with tweaking the min_idle_instances element 
>> <https://cloud.google.com/appengine/docs/standard/python3/config/appref#scaling_elements>
>>  in your app.yaml, in order to reduce request and response latency during 
>> the time when your app's code is being loaded to a newly created instance.
>> 
>> There’s not an official way of forcing the scaling to behave like it used to 
>> be on python 2.7. However, this type of feedback can be passed to the App 
>> Engine engineering team in the form of a feature request. You can create 
>> such a feature request here 
>> <https://issuetracker.google.com/issues/new?component=187191&template=1162953>.
>>  The App Engine engineering team would then evaluate it and decide whether 
>> it could be implemented or not.
>> On Thursday, November 4, 2021 at 11:38:52 AM UTC-4 Joshua Smith wrote:
>> Thanks for sending out this update. I did, indeed, miss most of this news.
>> 
>> This item, in particular, is awesome:
>> 
>> 
>>> Extending support for App Engine bundled services 
>>> <https://cloud.google.com/blog/products/serverless/support-for-app-engine-services-in-second-generation-runtimes>
>>>  (Sep 2021) 
>> 
>> This sounds like it will make it so much easier to migrate to Python 3.7.
>> 
>> One thing I've noticed is that my older apps on 2.7 seem to handle peak 
>> scaling a lot better than my newer apps on 3.7. For example, if I have a web 
>> page that hits a 2.7 app with 100 REST calls at startup (bad design, but it 
>> happens), the old app serves them all eventually. But if I do the same thing 
>> in a 3.7 app, it's likely to choke and fail a bunch of those requests with 
>> these:
>> 
>> 
>> 
>> The specific pattern is that after the first couple requests, it spins up a 
>> new instance. That takes 5 seconds to serve its first request (simple app, 
>> so I guess that's just GAE overhead). Then it spins up a couple more. Then I 
>> start getting those errors on some of the requests, because 15s have passed. 
>> In this capture the blue ones are new instances spinning up, and the orange 
>> ones are timeout errors (note the 15s time on those).
>> 
>> 
>> 
>> I've been designing around the issue by making sure my web apps go easy on 
>> the server on startup. But it is yet another concern about migrating.
>> 
>> Is there a way to tune the auto-scaling for 3.7 to behave like the 2.7? 
>> Here's a similar set of requests to my 2.7 app, which only spins up one 
>> extra instance, and never throws timeout errors, ever:
>> 
>> 
>> 
>> -Joshua
>> 
>> 
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Google App Engine" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to google-appengine+unsubscr...@googlegroups.com 
>> <mailto:google-appengine+unsubscr...@googlegroups.com>.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/google-appengine/150109bb-18a4-40b1-9efb-8b86bc60eceen%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/google-appengine/150109bb-18a4-40b1-9efb-8b86bc60eceen%40googlegroups.com?utm_medium=email&utm_source=footer>.
> 

