The only theory I have is informed by these log messages on startup:

A [2018-09-14 19:28:38 +0000] [1] [INFO] Starting gunicorn 19.9.0
A [2018-09-14 19:28:38 +0000] [1] [INFO] Listening at: http://0.0.0.0:8080 (1)
A [2018-09-14 19:28:38 +0000] [1] [INFO] Using worker: threads
A [2018-09-14 19:28:38 +0000] [8] [INFO] Booting worker with pid: 8
A [2018-09-14 19:28:38 +0000] [9] [INFO] Booting worker with pid: 9
A [2018-09-14 19:28:38 +0000] [10] [INFO] Booting worker with pid: 10
A [2018-09-14 19:28:38 +0000] [11] [INFO] Booting worker with pid: 11
A [2018-09-14 19:28:38 +0000] [12] [INFO] Booting worker with pid: 12
A [2018-09-14 19:28:38 +0000] [13] [INFO] Booting worker with pid: 13
A [2018-09-14 19:28:39 +0000] [14] [INFO] Booting worker with pid: 14
A [2018-09-14 19:28:39 +0000] [15] [INFO] Booting worker with pid: 15
It looks like multiple workers are started, so maybe the fixed memory overhead is simply duplicated in each worker. This is also supported by the fact that some of my requests load matplotlib. Locally, that makes the first request slow and subsequent ones fast. On a GAE F4 instance the first few requests are slow and then it stays fast, as if multiple processes each have to be warmed up.

If this is true, then going to a larger instance is poor advice: doubling the memory size also doubles the CPU count, which presumably doubles the number of workers and therefore the fixed overhead, and we're back at square one.
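To make the theory concrete, here is a rough Python sketch that counts the workers in the startup log above and shows how a per-worker fixed cost would scale. The 100 MB per-worker figure is a made-up placeholder, not a measurement, and the helper names are hypothetical. It also assumes the worker count doubles along with the CPU count, which is the unverified part of the theory.

```python
import re

# Worker boot lines copied from the startup log quoted above.
STARTUP_LOG = """\
A [2018-09-14 19:28:38 +0000] [8] [INFO] Booting worker with pid: 8
A [2018-09-14 19:28:38 +0000] [9] [INFO] Booting worker with pid: 9
A [2018-09-14 19:28:38 +0000] [10] [INFO] Booting worker with pid: 10
A [2018-09-14 19:28:38 +0000] [11] [INFO] Booting worker with pid: 11
A [2018-09-14 19:28:38 +0000] [12] [INFO] Booting worker with pid: 12
A [2018-09-14 19:28:38 +0000] [13] [INFO] Booting worker with pid: 13
A [2018-09-14 19:28:39 +0000] [14] [INFO] Booting worker with pid: 14
A [2018-09-14 19:28:39 +0000] [15] [INFO] Booting worker with pid: 15
"""

BOOT_RE = re.compile(r"Booting worker with pid: (\d+)")


def worker_pids(log_text):
    """Return the pid of every worker the log reports as booted."""
    return [int(pid) for pid in BOOT_RE.findall(log_text)]


def total_fixed_overhead_mb(n_workers, per_worker_mb):
    """Total fixed memory if every worker pays the same fixed cost."""
    return n_workers * per_worker_mb


# ASSUMPTION: placeholder value for illustration only, not measured.
PER_WORKER_MB = 100

n = len(worker_pids(STARTUP_LOG))
for workers in (n, 2 * n):  # current instance vs. one with twice the CPUs
    print(workers, "workers:", total_fixed_overhead_mb(workers, PER_WORKER_MB), "MB")
```

If the theory holds, the fixed overhead grows in step with the instance size, so the thing to experiment with would be the worker count itself (gunicorn's `--workers` setting) rather than the instance class.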