Re: [google-appengine] Re: Why resident instances in auto scaling are idle?

2016-10-28 Thread 'Jordan (Cloud Platform Support)' via Google App Engine
Hey Vidya.

You are correct that instance start time depends heavily on your code:
each time a new instance is created, it must load and prepare a fresh
copy of your code before it can serve.

As for why you are seeing a single instance handle the bulk of your
requests, this comes down to the App Engine scheduler, as you mentioned.
The scheduler simply asks the first instance whether it can handle a
request. Based on your scaling configuration for pending latency and
concurrent requests, your first instance tells the scheduler that it can
handle an extra request, and so it does, leaving the rest of your
instances waiting to handle any overflow.
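
A rough way to picture this first-fit behaviour is the toy model below. It
is only a sketch under stated assumptions, not the actual App Engine
scheduler: the `Instance` class, `dispatch`, `simulate`, and the single
`max_concurrent` limit are hypothetical names invented for illustration, and
pending latency is ignored entirely.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical stand-in for an App Engine instance."""
    name: str
    max_concurrent: int  # assumed per-instance concurrency limit
    in_flight: int = 0   # requests currently being handled
    served: int = 0      # total requests this instance accepted

    def can_accept(self) -> bool:
        return self.in_flight < self.max_concurrent


def dispatch(instances, concurrent_requests):
    """Offer each simultaneous request to the instances in order.

    The first instance with spare capacity takes the request. Returns the
    number of requests that found no capacity anywhere (the overflow).
    """
    overflow = 0
    for _ in range(concurrent_requests):
        for inst in instances:
            if inst.can_accept():
                inst.in_flight += 1
                inst.served += 1
                break
        else:
            overflow += 1
    return overflow


def simulate(max_concurrent, concurrent_requests, n_instances=3):
    """Return (requests served per instance, overflow) for one burst."""
    instances = [
        Instance(f"instance-{i}", max_concurrent) for i in range(n_instances)
    ]
    overflow = dispatch(instances, concurrent_requests)
    return [inst.served for inst in instances], overflow
```

In this model, `simulate(8, 6)` puts all six simultaneous requests on the
first instance, giving `([6, 0, 0], 0)`, while `simulate(2, 6)` spreads them
as `([2, 2, 2], 0)`.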

If App Engine thinks you may need an extra instance warmed up in case of
overflow, it will create one. This is why you see a single Dynamic
instance at the bottom handling no requests. Again, App Engine sends
requests to Dynamic instances and not to idle Resident instances. If no
Dynamic instance is available, your Resident instance will be treated as a
Dynamic instance and a new Resident instance will be started to meet your
configured minimum idle instances.

To configure your scaling options so that requests are spread more evenly
across available instances, reduce the number of concurrent requests a
single instance is allowed to handle, reduce the minimum pending latency
(how long a request is allowed to wait in an instance's pending queue), and
reduce the max pending latency to force a request to be handled by a new
instance after a period of time. Note that I would not recommend setting
any of these to zero, since that would force each request onto its own
instance. You still want each instance to handle multiple requests, to
balance cost and performance.
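
As a back-of-the-envelope aid for picking a concurrency limit, the sketch
below uses Little's Law (average requests in flight = arrival rate x average
latency) to estimate how many instances a steady load would occupy. The
function name and parameters are assumptions made for illustration; the real
scheduler also weighs pending latency and idle-instance settings, so treat
the result as a rough estimate only.

```python
import math


def instances_needed(requests_per_second, avg_latency_seconds,
                     max_concurrent_requests):
    """Estimate instances occupied by a steady load.

    Little's Law gives the average number of requests in flight; dividing
    by the per-instance concurrency limit and rounding up gives the number
    of instances needed to hold them. At least one instance is assumed.
    """
    if max_concurrent_requests < 1:
        raise ValueError("max_concurrent_requests must be at least 1")
    in_flight = requests_per_second * avg_latency_seconds
    return max(1, math.ceil(in_flight / max_concurrent_requests))
```

For example, 20 requests per second at 0.5 seconds each keeps about 10
requests in flight: a limit of 8 occupies 2 instances, a limit of 2 occupies
5, and a limit of 1 occupies 10, which is the cost side of spreading load
more thinly.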

Continue to use the Stackdriver Trace tool to see the latency breakdown for
your requests, and use it to find the optimal scaling settings for your app,
so that requests do not wait too long in a pending queue for the requests in
front of them to finish. Ideally, optimizing your code to execute requests
quickly in an asynchronous style (for example, using the Task Queue to
perform long image manipulation tasks instead of forcing a user to wait)
will make your application scalable for cloud computing.
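
The asynchronous pattern can be sketched with the standard library alone.
This is an in-process stand-in, not the App Engine Task Queue API: `tasks`,
`resize_image`, `handle_upload`, and `start_worker` are hypothetical names,
and a real app would hand the job to the Task Queue service instead of a
thread in the same process.

```python
import queue
import threading

# In-process stand-in for a task queue (assumption: illustration only).
tasks = queue.Queue()
results = {}


def resize_image(image_id):
    """Hypothetical slow job; a real one would manipulate the image."""
    return f"thumbnail-of-{image_id}"


def worker():
    """Process queued jobs until a None sentinel arrives."""
    while True:
        image_id = tasks.get()
        try:
            if image_id is None:
                return
            results[image_id] = resize_image(image_id)
        finally:
            # Mark the item done so tasks.join() can return.
            tasks.task_done()


def handle_upload(image_id):
    """Request handler: enqueue the slow work and answer right away."""
    tasks.put(image_id)
    return {"status": "accepted", "image_id": image_id}


def start_worker():
    """Start the background worker and return its thread."""
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The point of the design is that `handle_upload` returns as soon as the job
is queued, so the user-facing request never waits on the slow work.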

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to google-appengine+unsubscr...@googlegroups.com.
To post to this group, send email to google-appengine@googlegroups.com.
Visit this group at https://groups.google.com/group/google-appengine.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/google-appengine/231d24e7-ac1c-4eba-bfb3-8fada9677094%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [google-appengine] Re: Why resident instances in auto scaling are idle?

2016-10-28 Thread Vidya
I think we were talking about orthogonal points to an extent here. There
are two separate components:

1. Why does it take so long to start up an instance?

The response here is that we need to work on our app startup times and
that's fair. I am positive we have room for optimization there, although
there will come a point where the tradeoff is really about using external
libraries vs getting down to the low level APIs - and we need to make that
an acceptable 2016 coding solution without needing to go back to circa 2000
:)

2. While the new instance is being brought up, why aren't requests being
served by the resident instance, and why do we always see the resident
instance idle?

This is a question I haven't seen an answer to yet. Really, this is the
bigger question, since even if we brought our app startup time down to,
say, 5 seconds, that would still be an unacceptable user latency (and an
average of 5s implies enough variance that some requests reach 2-3x that).

Now, after this discussion, I dug up a bunch of things and have learned a
thing or two about how the GAE schedulers *may* be working. A particularly
interesting thread I found was this:
https://groups.google.com/forum/#!topic/google-appengine/sA3o-PTAckc%5B26-50%5D

Based on my understanding so far, it looks like new requests will be routed
to the new instance as soon as it is "instantiated", without necessarily
waiting for it to actually be up (or they may be routed based on some
thresholds of the request rates). Obviously, this will put the instance
startup latencies right in the path of the user response times.

If this is indeed true, this is going to be a terrible experience for
anyone that is not operating at, well, essentially Google scale :). I can
imagine that this works very well at massive scale (and that may be the
great benefit of using GAE) - but, an app wouldn't live to see that day if,
in its initial days, it is providing a sucky UX due to long response times.

Our own experience matches this explanation - with min pending latency
values of 30ms (we left it at default), a new instance is getting spun up
the minute there are any requests in the queue whatsoever - which means the
new requests are routed immediately to the new instance, while the resident
instance remains idle practically all the time.

I would imagine that the scheduler would adapt to particular app startup
times + total number of operating instances to determine when to move new
requests over to a new instance.

Given little to no official information about how the scheduler works, I'd
like to understand whether the experimentation and observations that the GAE
community has developed are in fact correct. Otherwise, we're going to be
launching into a mini manual Monte Carlo simulation to tune the knobs that
may work for our case. The risk here, of course, is that as soon as the
operating parameters change for our app, we would need to re-run the
simulation.

I have to say that the GAE docs leave a lot to be desired. Given the number
of people who have suffered through this topic, one would hope by now that
there would be more insight into what's actually going on and how to
maximize the efficiency of App Engine.

Unfortunately, "optimize your app's startup times" is a necessary but
insufficient answer. As I wrote, unless we're talking 100ms app startup
times, in today's expected UX metrics, we aren't going very far with this.

All that said, we're currently starting to experiment with longer min
pending time values to see if that causes the resident instances to kick
in. It looks like some people have had luck with that. This will obviously
cause problems at scale, but, if it solves our problem at the small scales
now, we'd start there and change it as we scale up.

One more hack to our list there - but, if you have better suggestions, I'd
like to know.

Thanks,
Vidya


Re: [google-appengine] Re: Why resident instances in auto scaling are idle?

2016-10-27 Thread Nick
A few years ago I came across a suggestion that you can improve startup time by
including your war class files in a jar, implying that maybe the war is loaded
in an exploded manner. Not sure if this is true or relevant; I've never
bothered. You could test it out.

I've never seen a resident instance work or do anything. I always observe cold
starts on requests I make, even when a resident instance is running. Quite often
you'll also see the full load being borne by one or two instances while others
lie around for a long time serving only one or two requests. It would be great
if someone had time to play and understand practically what matters.

I found it interesting that resident instances are supposed to be converted to 
dynamic though - I never noticed that.


Re: [google-appengine] Re: Why resident instances in auto scaling are idle?

2016-10-27 Thread 'Jordan (Cloud Platform Support)' via Google App Engine
You are correct: executing a URL Fetch request during your code's
initialization will add a large amount of latency, as your instance must
wait for the remote server to respond. As previously mentioned, you can use
the Stackdriver Trace tool on a specific '/_ah/warmup' request that is
seeing high latency to investigate exactly which parts of your code are
taking the most time.

Using this same tool, I took a look at your project. I saw that on
2016-10-22 a single '/urlfetch.Fetch' took 47.5 seconds to return during
one of your '/_ah/warmup' instance startups. I also saw that a single
request to your endpoint '/api/admin/warmup' took 62 seconds, and 56
'/datastore_v3.Put' calls took a combined 2 seconds, all during the same
instance startup. These results were pulled from the 'Summary' tab in the
trace details for one of your calls.

Concerning the latency difference between running your app in production
and running it locally in development: when running locally, your
application and all of its assets live within a local web server on your
computer. This web server is hosted on your localhost IP, meaning any
outgoing requests are served immediately by that same computer. Production
differs in that any URL Fetch request or API call must be served from a
different location by a different server, which brings network latency and
each server's traffic congestion into the mix.

You can easily see from the above how removing URL Fetch requests and 
batching your requests to Google Services would drastically reduce the 
startup time for your instances. 
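
To make the batching point concrete, here is a minimal sketch of grouping
writes so that the 56 individual Put calls seen in the trace become a handful
of round trips. `chunked` and `save_all` are hypothetical helpers, and
`put_batch` is an assumed caller-supplied function standing in for whatever
batch write call your datastore client provides; none of these names come
from the App Engine SDK.

```python
def chunked(items, batch_size):
    """Yield successive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def save_all(entities, put_batch, batch_size):
    """Write entities through put_batch, one call per batch.

    put_batch is an assumed callable that accepts a list of entities.
    Returns the number of calls (round trips) that were made.
    """
    calls = 0
    for batch in chunked(entities, batch_size):
        put_batch(batch)
        calls += 1
    return calls
```

With 56 entities, a batch size of 1 makes 56 calls, a batch size of 25 makes
3, and a batch size of 56 makes a single call.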


Re: [google-appengine] Re: Why resident instances in auto scaling are idle?

2016-10-27 Thread Jeff Schnitzer
The GAE classloader does some security checking that isn’t present in the
dev container. Plus, actual loading of classes from jars seems to be slower
(probably some sort of network filesystem is involved). 5-10s startup time
locally is quite long; a corresponding 30-60s server-side seems realistic,
even with F4s. Things to avoid, if you can: classpath scanning and AOP. Both
seem to slow things down.

Also: I have always found Bad Things happen when trying to use resident
instances. At lower traffic levels, it seems to produce _more_ cold start
requests rather than fewer. I’ve had the best user experience leaving all
automatic scaling settings at their defaults, for both high-traffic and
low-traffic apps.

Jeff


Re: [google-appengine] Re: Why resident instances in auto scaling are idle?

2016-10-26 Thread Vidya
Let me try to understand this correctly. There is a general set of best
practices for being efficient with latencies - including trying to do batch
requests wherever possible and storing data in memcache to avoid a lot of
datastore queries and so on.

And then there is the question of what causes the latency spikes at the
time of an instance starting up. We have observed specific spikes that
occur only when a new instance is started. If I understand what you wrote
correctly, you appear to be saying that if there are any external URL
fetches, those calls will need to wait for the instance to start up before
they can be handled - is that right? I am assuming any Image Service calls
will be considered external server calls as well?

A separate question is why the instance takes 30-60s to come up when our
application startup times on a local machine or Compute Engine instance are
on the order of 5-10s - what could be causing a 6x increase in startup
times? Given we are on F4 instances, this sounds very strange.

Thanks,
Vidya

On Tue, Oct 25, 2016 at 7:54 AM, 'Jordan (Cloud Platform Support)' via
Google App Engine wrote:

> Resident Instances turning into Dynamic Instances to handle requests does
> not affect the time required to start an instance (it is actually designed
> to help it). When a new idle Resident Instance is required, an
> '/_ah/warmup' request will be sent to your application. This will trigger
> the creation of a new instance, and your code will begin to run. Therefore,
> if you are seeing high latency during instance startup, it is likely your
> code that is the cause.
>
> You can use the Stackdriver Trace tool to sort requests to your
> application over a period of time by highest latency. You can then select
> the requests with the highest latency, which will show you the actual
> processes that ran during the request. If your code makes any URL Fetch
> requests to other applications or servers, this forces your app to wait
> for the external service to respond, causing higher latency. Multiple
> individual calls to other Google services such as the Datastore may also
> add latency. It is recommended to perform batch requests to any Google
> service that supports them, in order to reduce the number of calls your
> application makes. By reducing the time needed to execute your code, you
> reduce the latency experienced by an incoming request.