On Fri, Jul 22, 2011 at 8:57 PM, Galoch <galoch...@gmail.com> wrote:
> Hi Johan,
>
> Thanks for the explanation. I have a couple of questions on that.
Thanks for showing interest in GAE internals. I'd be happy to answer those questions directly if I can, or forward them to someone who can answer them better.

> 1. "1 Hours ago while all your Always On instance were busy and you
> had a burst of incoming requests"
> While this may be true when my Always On instances were "busy" running
> some stuff, what about when 2 Always On instances show only "1"
> request served, which is the Warmup request itself? Does this mean
> Warmup requests are considered as traffic? If that is the case then
> Always On instances seem rather useless since they will never ever get
> called in this scenario.

In the admin console capture you included in your previous mail, I didn't see Always On instances showing only "1" request served, but rather:

Resident Instance 1: Requests: 49 Age: 1Hr
Resident Instance 2: Requests: 6 Age: 1Hr
Resident Instance 3: Requests: 2 Age: 1Hr

Let me know if I missed something.

> 2. As Tom mentioned, what qualifies as "busy"? When the threadsafe option was
> implemented in GAE, these 3 Always On instances were able to do most of
> the heavy lifting with occasional spinning of dynamic instances.
> Nothing has changed on our side that should alter this behavior. With
> all these changes happening within GAE I am trying to figure out what
> changed and what we can do to contain this burst of traffic within 3
> (or more) Always On instances with less frequent spinning of Dynamic
> instances.

There are two scheduler knobs that could help you affect the way Dynamic instances are spawned: "Minimum Pending Latency" and "Max Idle Instances", as described here:
http://code.google.com/appengine/docs/adminconsole/performancesettings.html

> 3. "- 2 Minutes ago all your instances Always On + Dynamic were busy
> again and the scheduler spawned a new Dynamic instance that handle 7
> incoming requests. "
> Again, what constitutes "busy", as I do not see any request being served
> by Always On instances 2 and 3 in the last hour?
> Note that the number of
> requests served by Always On 2/3 is unchanged since they were
> created ...
> Here's my reading of this scenario:
> a. It kills Dynamic Instance 1 within 2 minutes of serving a request
> b. When traffic comes in it looks only at whether Dynamic Instances
> are busy and completely ignores Always On instances at this point
> c. It recreates Dynamic Instance 1
>
> In other words, what rule is applied in this case?

Sorry, that was mostly speculation on my part; I didn't know that the number of requests served by Always On 2/3 was unchanged, according to the information you provided.
I can investigate the specific behaviour of your application more deeply if you open a Production Issue with your application id.

> Also I fail to understand rule 4, as both Rob and Luca mentioned. That
> completely undermines having Always On instances under threadsafe
> mode.
>
> 4. I like Rob's suggestion of better load balancing techniques but
> again with a caveat that an instance needs to be able to serve
> multiple threads before reaching a set capacity (80% or so)
>
> 5. Luca's suggestion also makes sense but again with the same
> caveat ... it should be able to process multiple threads before
> queuing

Thanks a lot for your feedback, I will make sure to forward those suggestions to the engineering team.

> 6. I looked at the new sliders in the Admin console and with those the
> situation is even worse. I set the Max Idle Instances to 3 (that's the
> minimum I could choose) and Min Pending Latency to 15 secs ... Guess
> what, our CPU usage has gone up to 15 in 12 hrs because of constant
> creation and killing of 3 dynamic instances. Bare minimum traffic and
> a few lightweight crons.
> But the good side is now I see requests coming in on the 3 Always On
> instances. Is that enough load they are serving ... I don't know yet
> but something to observe.
Maybe you can open a feature request for a smaller minimum for 'Max Idle Instances' when Always On is activated, or for having Always On instances count toward Max Idle Instances.

> Two things I suggest would be really helpful for us:
> A. The overall key here is to know the thread handling capacity of an
> instance. Better yet if it can be configured similar to Backends but
> dynamic in nature (and of course Backends pricing is outrageous ...
> but that's another topic)

Are you looking for <max-concurrent-requests> support for Servlets? If so, I would recommend opening a feature request.

> B. Able to add more Always On instances but again with a dependency
> explained in point A.

Again, opening a feature request makes sense to track this separately.

> On Jul 22, 7:57 am, Johan Euphrosine <pro...@google.com> wrote:
>> Hi Galoch,
>>
>> Thanks for the followup,
>>
>> I think you are experiencing a combination of the two following rules
>> I was pointing to in my previous email:
>> (> reads as has priority for handling the incoming request)
>> 2/ Spawning a new Dynamic instance > Busy Always On instance
>> 4/ Idle Dynamic instance > Idle Always On instance
>>
>> Applied to your example it could mean that:
>> Resident Instance 1: Requests: 49 Age: 1Hr
>> Resident Instance 2: Requests: 6 Age: 1Hr
>> Resident Instance 3: Requests: 2 Age: 1Hr
>> Dynamic Instance 1: Requests: 7 Age: 2min
>> Dynamic Instance 2: Requests: 291 Age: 1Hr
>> Dynamic Instance 3: Requests: 322 Age: 1Hr
>>
>> - 1 hour ago, while all your Always On instances were busy, you had
>> a burst of incoming requests and the scheduler spawned new Dynamic
>> instances as per rule 2/ highlighted above.
>> - After the burst and back to normal traffic the new Dynamic Instances
>> were handling incoming requests in priority as per rule 4/ highlighted
>> above.
>> - 2 minutes ago all your instances Always On + Dynamic were busy again
>> and the scheduler spawned a new Dynamic instance that handled 7
>> incoming requests.
>>
>> Hope that makes more sense to you and Francois, but as I said earlier
>> we are open to suggestions and I will make sure someone working on the
>> scheduler team monitors this thread for your input.
>>
>> On Fri, Jul 22, 2011 at 9:09 AM, Galoch <galoch...@gmail.com> wrote:
>> > @Johan,
>> > The issue is not about Always On instance being busy. It's actually the
>> > other way ... the Always On instance is never busy ... at least that
>> > is what we observed in the last 3-4 days. Your explanation may be partly
>> > true since this behavior keeps on changing.
>>
>> > For e.g. I have a snapshot of instances from July 19th and here are the
>> > details (for some reason I can't see a link to attach the snapshot
>> > images here):
>> > Resident Instance 1: Requests: 49 Age: 1Hr
>> > Resident Instance 2: Requests: 6 Age: 1Hr
>> > Resident Instance 3: Requests: 2 Age: 1Hr
>> > Dynamic Instance 1: Requests: 7 Age: 2min
>> > Dynamic Instance 2: Requests: 291 Age: 1Hr
>> > Dynamic Instance 3: Requests: 322 Age: 1Hr
>>
>> > This is under "no load" with only very lightweight cron jobs running.
>> > This gets much much worse during the day under peak load with requests
>> > for dynamic instances reaching 1000+ in a matter of minutes and resident
>> > instances have only "1" request served.
>>
>> > As you see above Resident Instance 2 and 3 are hardly hit so I don't
>> > think they are busy at all. On the other hand, Dynamic Instance 2 and
>> > 3 get most of the hits.
>>
>> > Dynamic Instance 1 is what is killing us. It keeps getting killed and
>> > reborn within that 5 minute window!!
>>
>> > We use Spring framework and it is really very expensive for us when a
>> > new instance starts up.
>>
>> > Just to give you a background, we had gone through a real roller
>> > coaster ride to make this work on GAE by breaking the loading of
>> > framework into many different chunks. But still spinning was out of
>> > control.
>> > Then we found java threads to our rescue. We worked through
>> > the hack to load JDO to avoid UnsupportedOperationException. We
>> > finally got it to work where most of our requests were served by
>> > Always On instances with occasional spinning of Dynamic instances. It
>> > was quite impressive.
>>
>> > Unfortunately, this was short lived when we hit this new behavior with
>> > GAE. The very last thing we want GAE to do is create a new instance
>> > every few minutes as it could easily reach the 30 second deadline during
>> > the day and throw a critical error.
>>
>> > I am not sure when the new billing will come into effect but we really
>> > need this thing fixed as it literally brings our app to a
>> > grinding halt. So I am open to any suggestions you guys think can help
>> > us.
>>
>> > Another thought about the new scheduler is to have a configurable
>> > schedule. For e.g. our users are mostly business users who work during
>> > normal business hours. We want to be able to spin more Always On
>> > instances during those hours and bring the number down during nights
>> > and weekends. Dynamic instances won't work for us due to the reason
>> > explained above.
>>
>> > Thanks,
>> > galoch
>>
>> > On Jul 21, 5:56 pm, Johan Euphrosine <pro...@google.com> wrote:
>> >> After speaking with Engs, I think I can explain what is going on:
>>
>> >> Here are the current scheduling rules: (> reads as has priority for
>> >> handling the incoming request)
>>
>> >> 1/ Idle Always On instance > Spawning a new Dynamic instance
>> >> 2/ Spawning a new Dynamic instance > Busy Always On instance
>> >> 3/ Idle Dynamic instance > Busy Always On instance
>> >> 4/ Idle Dynamic instance > Idle Always On instance
>>
>> >> I will give you an example to illustrate the behavior you all noticed,
>> >> that is, a Dynamic instance handling requests while Always On is idle.
>>
>> >> (Always On instance started)
>> >> - Incoming request
>> >> - Always On instance handles the request
>> >> - another Incoming request
>> >> (Always On instance busy)
>> >> - A new Dynamic instance is spawned
>> >> (Dynamic instance idle, Always On instance busy)
>> >> - Dynamic instance handles the request
>> >> - another Incoming request
>> >> (Dynamic instance idle, Always On instance idle)
>> >> - Dynamic instance handles the request
>> >> - No request for more than idle-dynamic-instance-timeout
>> >> - Dynamic instance shuts down
>> >> - another Incoming request
>> >> (Always On instance idle)
>> >> - Always On instance handles the request
>>
>> >> Hope it makes things clearer.
>>
>> >> As part of the new billing model you will have a scheduler knob called
>> >> 'max-idle-instances' that you can use if extra idling dynamic
>> >> instances are undesired.
>>
>> >> The good news is that we are open to suggestions; if you think this
>> >> behavior is the wrong default, feel free to comment on that thread and
>> >> I will follow up your suggestion to the Engineering team.
>>
>> >> On Wed, Jul 20, 2011 at 12:18 AM, Galoch <galoch...@gmail.com> wrote:
>> >> > Same here. Seems like GAE is totally ignoring Always On instances.
>> >> > I also noticed that even with no user hitting our app and a single
>> >> > cron job that runs every 5 minutes it is still spinning instances
>> >> > every 3 minutes and then killing them in 2 minutes.
>>
>> >> > This has been happening since the upgrade on 14th July. During
>> >> > peak load this really gets nasty and brings down the performance.
>>
>> >> > This is the feedback I got yesterday from one of our customers since
>> >> > it takes time to spin an instance (and yes we use Spring):
>>
>> >> > "1) I found the GUI to be very laggy"
>>
>> >> > Can someone from Google please respond?
>>
>> >> > --
>> >> > You received this message because you are subscribed to the Google
>> >> > Groups "Google App Engine" group.
>> >> > To post to this group, send email to google-appengine@googlegroups.com.
>> >> > To unsubscribe from this group, send email to
>> >> > google-appengine+unsubscr...@googlegroups.com.
>> >> > For more options, visit this group at
>> >> > http://groups.google.com/group/google-appengine?hl=en.
>>
>> >> --
>> >> Johan Euphrosine (proppy)
>> >> Developer Programs Engineer
>> >> Google Developer Relations

--
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations
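
For reference, the four scheduling rules quoted in this thread (1/ to 4/) can be sketched as a small Python function. This is only an illustration of the stated priorities, not the actual App Engine scheduler. The names Instance, pick_instance and replay are hypothetical, a simple "busy" flag stands in for whatever the scheduler really measures, and placing an idle Dynamic instance ahead of spawning a new one is inferred by chaining rules 4/ and 1/.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    kind: str           # "always_on" or "dynamic" (hypothetical labels)
    busy: bool = False  # stand-in for the scheduler's real notion of "busy"


def pick_instance(instances):
    """Pick who handles the next request under the four quoted rules."""
    # Rules 3/ and 4/: an idle Dynamic instance wins over any Always On instance.
    for inst in instances:
        if inst.kind == "dynamic" and not inst.busy:
            return inst
    # Rule 1/: an idle Always On instance wins over spawning a new Dynamic one.
    for inst in instances:
        if inst.kind == "always_on" and not inst.busy:
            return inst
    # Rule 2/: spawning a new Dynamic instance wins over a busy Always On one.
    count = sum(1 for inst in instances if inst.kind == "dynamic") + 1
    spawned = Instance(f"dynamic-{count}", "dynamic")
    instances.append(spawned)
    return spawned


def replay():
    """Replay the example trace from the thread and record who served each request."""
    pool = [Instance("always-on-1", "always_on")]
    served = []

    always_on = pick_instance(pool)   # first request: Always On is idle
    always_on.busy = True             # it is still working when the next one arrives
    served.append(always_on.name)

    dynamic = pick_instance(pool)     # Always On busy: a Dynamic instance is spawned
    served.append(dynamic.name)

    always_on.busy = False            # both instances are idle now
    served.append(pick_instance(pool).name)

    pool.remove(dynamic)              # idle timeout: the Dynamic instance shuts down
    served.append(pick_instance(pool).name)
    return served


if __name__ == "__main__":
    print(replay())
```

Under these assumptions the third request goes to the Dynamic instance even though the Always On instance is idle, which is the behaviour reported in this thread.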