[google-appengine] Re: GAE starting unnecessary instances

Galoch Fri, 22 Jul 2011 11:57:38 -0700

Hi Johan,

Thanks for the explanation. I have a couple of questions on that.


1. "1 Hours ago while all your Always On instance were busy and you
had a burst of incoming requests"
That may be true when my Always On instances were "busy" running
something, but what about when 2 Always On instances show only "1"
request served, which is the warmup request itself? Does this mean
warmup requests are counted as traffic? If so, Always On instances
seem rather useless, since they will never get called in this
scenario.


2. As Tom mentioned, what qualifies as "busy"? When the threadsafe
option was introduced in GAE, the 3 Always On instances were able to do
most of the heavy lifting, with only occasional spinning up of Dynamic
instances. Nothing has changed on our side that should alter this
behavior. With all the changes happening within GAE, I am trying to
figure out what changed and what we can do to contain a burst of
traffic within 3 (or more) Always On instances, with less frequent
spinning up of Dynamic instances.


3. "- 2 Minutes ago all your instances Always On + Dynamic were busy
again and the scheduler spawned a new Dynamic instance that handle 7
incoming requests. "
Again, what constitutes "busy"? I do not see any request being served
by Always On instances 2 and 3 in the last hour. Note that the number
of requests served by Always On 2/3 is unchanged since they were
created ...
Here's my reading of this scenario:
a. It kills Dynamic Instance 1 within 2 minutes of serving a request
b. When traffic comes in, it checks only whether the Dynamic instances
are busy and completely ignores the Always On instances at this point
c. It recreates Dynamic Instance 1

In other words, what rule is applied in this case?

Like Rob and Luca, I also fail to understand rule 4. It completely
undermines having Always On instances in threadsafe mode.

4. I like Rob's suggestion of better load balancing techniques, but
with the caveat that an instance needs to be able to serve multiple
threads before reaching a set capacity (80% or so).

5. Luca's suggestion also makes sense, but with the same caveat: an
instance should be able to process multiple threads before requests
start queuing.

6. I looked at the new sliders in the Admin console, and with those the
situation is even worse. I set Max Idle Instances to 3 (the minimum I
could choose) and Min Pending Latency to 15 secs ... Guess what: our
CPU usage has gone up to 15 in 12 hrs because of the constant creation
and killing of 3 dynamic instances, with bare minimum traffic and a
few lightweight crons.
On the good side, I now see requests coming in on the 3 Always On
instances. Are they serving enough of the load? I don't know yet, but
it is something to observe.
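
For reference, my reading of the four rules Johan listed (quoted
further down in this thread) boils down to one priority order: idle
Dynamic, then idle Always On, then spawn a new Dynamic. A minimal
Python sketch of that reading follows. The `Instance` class and the
`route` function are made up for illustration; this is not the actual
GAE scheduler code.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    kind: str          # "always_on" or "dynamic"
    busy: bool = False


def route(instances):
    """Pick the target for the next request, per rules 1/ to 4/ as I read them."""
    idle = [i for i in instances if not i.busy]
    # Rules 3/ and 4/: an idle Dynamic instance beats any Always On instance.
    for inst in idle:
        if inst.kind == "dynamic":
            return inst.name
    # Rule 1/: an idle Always On instance beats spawning a new Dynamic one.
    for inst in idle:
        if inst.kind == "always_on":
            return inst.name
    # Rule 2/: nothing is idle, so spawn instead of waiting on a busy Always On.
    return "spawn new dynamic"


fleet = [Instance("resident-1", "always_on"), Instance("dynamic-1", "dynamic")]
print(route(fleet))  # dynamic-1, even though resident-1 is idle too
```

If this reading is right, an Always On instance only sees traffic when
no Dynamic instance is idle, which matches what we observe.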


Two things that would be really helpful for us:
A. The key here is knowing the thread handling capacity of an
instance. Better yet if it can be configured, similar to Backends but
dynamic in nature (and of course Backends pricing is outrageous ...
but that's another topic).
B. The ability to add more Always On instances, subject to the
dependency explained in point A.
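
To make point A (and the 80% caveat from points 4 and 5) concrete,
here is a rough Python sketch of the kind of capacity-aware routing I
have in mind. It is only an illustration of the suggestion, not how
GAE works today: `max_threads`, the 0.8 threshold, and all names are
assumptions.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    in_flight: int     # requests currently being served
    max_threads: int   # hypothetical per-instance thread capacity


def pick_instance(always_on, dynamic, threshold=0.8):
    """Return the instance name to use, or None to signal 'spawn a new one'."""
    # Try Always On instances first, then Dynamic ones.
    for pool in (always_on, dynamic):
        # Keep only instances that stay at or below the threshold
        # after taking one more request.
        candidates = [
            i for i in pool
            if (i.in_flight + 1) / i.max_threads <= threshold
        ]
        if candidates:
            # Prefer the least loaded instance in this pool.
            best = min(candidates, key=lambda i: i.in_flight / i.max_threads)
            return best.name
    return None
```

With something like this, a new Dynamic instance would be spawned only
once every existing instance is near its configured capacity.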

Hope it makes sense.

Thanks,
galoch





On Jul 22, 7:57 am, Johan Euphrosine <pro...@google.com> wrote:
> Hi Galoch,
>
> Thanks for the follow-up.
>
> I think you are experiencing a combination of the two following rules
> I was pointing to in my previous email:
> (> reads as has priority for handling the incoming request)
> 2/ Spawning a new Dynamic instance > Busy Always On instance
> 4/ Idle Dynamic instance > Idle Always On instance
>
> Applied to your example, it could mean that:
> Resident Instance 1:   Requests: 49    Age: 1Hr
> Resident Instance 2:   Requests: 6     Age: 1Hr
> Resident Instance 3:   Requests: 2     Age: 1Hr
> Dynamic Instance 1:    Requests: 7     Age: 2min
> Dynamic Instance 2:    Requests: 291   Age: 1Hr
> Dynamic Instance 3:    Requests: 322   Age: 1Hr
>
> - 1 hour ago, all your Always On instances were busy, you had a burst
> of incoming requests, and the scheduler spawned new Dynamic instances
> as per rule 2/ highlighted above.
> - After the burst, with traffic back to normal, the new Dynamic
> instances were handling incoming requests first, as per rule 4/
> highlighted above.
> - 2 minutes ago, all your instances (Always On + Dynamic) were busy
> again and the scheduler spawned a new Dynamic instance that handled 7
> incoming requests.
>
> Hope that makes more sense for you and Francois, but as I said earlier
> we are open to suggestions and I will make sure someone working on the
> scheduler team monitors this thread for your input.
>
> On Fri, Jul 22, 2011 at 9:09 AM, Galoch <galoch...@gmail.com> wrote:
> > @Johan,
> > The issue is not about Always On instances being busy. It's actually
> > the other way around ... the Always On instances are never busy ... at
> > least that is what we observed in the last 3-4 days. Your explanation
> > may be partly true, since this behavior keeps changing.
>
> > For example, I have a snapshot of instances from July 19th and here
> > are the details (for some reason I can't see a link to attach the
> > snapshot images here):
> > Resident Instance 1:   Requests: 49    Age: 1Hr
> > Resident Instance 2:   Requests: 6     Age: 1Hr
> > Resident Instance 3:   Requests: 2     Age: 1Hr
> > Dynamic Instance 1:    Requests: 7     Age: 2min
> > Dynamic Instance 2:    Requests: 291   Age: 1Hr
> > Dynamic Instance 3:    Requests: 322   Age: 1Hr
>
> > This is under "no load" with only very lightweight cron jobs running.
> > It gets much worse during the day under peak load, with requests
> > for dynamic instances reaching 1000+ in a matter of minutes while
> > resident instances have only "1" request served.
>
> > As you see above Resident Instance 2 and 3 are hardly hit so I don't
> > think they are busy at all. On the other hand, Dynamic Instance 2 and
> > 3 get most of the hits.
>
> > Dynamic Instance 1 is what is killing us. It keeps getting killed and
> > reborn within that 5 minute window!!
>
> > We use the Spring framework, and it is really very expensive for us
> > when a new instance starts up.
>
> > Just to give you some background, we went through a real roller
> > coaster ride to make this work on GAE by breaking the loading of the
> > framework into many different chunks. But spinning was still out of
> > control. Then Java threads came to our rescue. We worked through
> > the hack to load JDO to avoid UnsupportedOperationException. We
> > finally got it to work, with most of our requests served by
> > Always On instances and only occasional spinning up of Dynamic
> > instances. It was quite impressive.
>
> > Unfortunately, this was short-lived: we hit this new behavior with
> > GAE. The very last thing we want GAE to do is create a new instance
> > every few minutes, as startup could easily hit the 30-second deadline
> > during the day and throw a critical error.
>
> > I am not sure when the new billing will come into effect but we really
> > need this thing fixed as it literally brings down our app to a
> > grinding halt. So I am open to any suggestions you guys think can help
> > us.
>
> > Another thought about the new scheduler is to have a configurable
> > schedule. For example, our users are mostly business users who work
> > during normal business hours. We want to be able to spin up more
> > Always On instances during those hours and bring the number down
> > during nights and weekends. Dynamic instances won't work for us for
> > the reasons explained above.
>
> > Thanks,
> > galoch
>
> > On Jul 21, 5:56 pm, Johan Euphrosine <pro...@google.com> wrote:
> >> After speaking with Engs, I think I can explain what is going on:
>
> >> Here are the current scheduling rules: (> reads as has priority for
> >> handling the incoming request)
>
> >> 1/ Idle Always On instance > Spawning a new Dynamic instance
> >> 2/ Spawning a new Dynamic instance > Busy Always On instance
> >> 3/ Idle Dynamic instance > Busy Always On instance
> >> 4/ Idle Dynamic instance > Idle Always On instance
>
> >> I will give you an example to illustrate the behavior you all noticed,
> >> that is, a Dynamic instance handling requests while Always On is idle.
>
> >> (Always On instance started)
> >> - Incoming request
> >> - Always On instance handles the request
> >> - another Incoming request
> >> (Always On instance busy)
> >> - A new Dynamic instance is spawned
> >> (Dynamic instance idle, Always On instance busy)
> >> - Dynamic instance handles the request
> >> - another Incoming request
> >> (Dynamic instance idle, Always On instance idle)
> >> - Dynamic instance handles the request
> >> - No request for more than idle-dynamic-instance-timeout
> >> - Dynamic instance shuts down
> >> - another Incoming request
> >> (Always On instance idle)
> >> - Always On instance handles the request
>
> >> Hope it makes things clearer.
>
> >> As part of the new billing model you will have a scheduler knob called
> >> 'max-idle-instances' that you can use if extra idling dynamic
> >> instances are undesired.
>
> >> The good news is that we are open to suggestions. If you think this
> >> behavior is the wrong default, feel free to comment on this thread and
> >> I will pass your suggestions on to the Engineering team.
>
> >> On Wed, Jul 20, 2011 at 12:18 AM, Galoch <galoch...@gmail.com> wrote:
> >> > Same here. Seems like GAE is totally ignoring Always On instances.
> >> > I also noticed that even with no user hitting our app and a single
> >> > cron job that runs every 5 minutes it is still spinning instances
> >> > every 3 minutes and then killing them in 2 minutes.
>
> >> > This has been happening since the upgrade on 14th July. During
> >> > peak load this really gets nasty and brings down the performance.
>
> >> > This is the feedback I got yesterday from one of our customers since
> >> > it takes time to spin an instance (and yes we use Spring):
>
> >> > "1) I found the GUI to be very laggy"
>
> >> > Can someone from Google please respond?
>
> >> > --
> >> > You received this message because you are subscribed to the Google 
> >> > Groups "Google App Engine" group.
> >> > To post to this group, send email to google-appengine@googlegroups.com.
> >> > To unsubscribe from this group, send email to 
> >> > google-appengine+unsubscr...@googlegroups.com.
> >> > For more options, visit this group at
> >> > http://groups.google.com/group/google-appengine?hl=en.
>
> >> --
> >> Johan Euphrosine (proppy)
> >> Developer Programs Engineer
> >> Google Developer Relations
>
> --
> Johan Euphrosine (proppy)
> Developer Programs Engineer
> Google Developer Relations

