Re: [google-appengine] Re: Outages?

I believe this was related to:
http://code.google.com/p/googleappengine/issues/detail?id=7130

This should now be fixed.

On Tue, Mar 13, 2012 at 9:59 AM, Miroslav Genov mge...@gmail.com wrote:

 I'm encountering the same issue with an HR app. The spike started about
 an hour ago.

 AppID: cmsevobg
 Datastore: HR

 On Tuesday, March 13, 2012 10:36:38 AM UTC+2, Richard Watson wrote:

 In case you're keeping track of issues thinking it's generally cleared up:

 I'm on HR and have noticed higher latencies (couple seconds instead of
 e.g. 300ms) lately and sometimes higher error rates (a few instead of 0-3).
 Yesterday over about 6 hours I got a ton of 60-second requests that threw
 500's with accompanying messages [1], usually on memcache sets hitting a
 deadline exceeded.  Also, over the last couple weeks I've been running 3
 instances permanently despite sometimes shutting them down manually.
  Usually I get by on one just fine with bursts of 2 or 3.  I've noticed
 that one instance serves the majority of traffic with the other two serving
 maybe 50 requests over many hours, so the shutdown isn't aggressive enough.

 This request caused a new process to be started for your application...
 +
 A problem was encountered with the process that handled this request,
 causing it to exit in the same request.

 App-id: 2dumo-hr



  --
 You received this message because you are subscribed to the Google Groups
 Google App Engine group.
 To view this discussion on the web visit
 https://groups.google.com/d/msg/google-appengine/-/YUHiXXkGPAgJ.

 To post to this group, send email to google-appengine@googlegroups.com.
 To unsubscribe from this group, send email to
 google-appengine+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/google-appengine?hl=en.




-- 
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations


Re: [google-appengine] Re: Outages?

Hi,

I believe this was also related to:
http://code.google.com/p/googleappengine/issues/detail?id=7130

And should now be fixed.

On Tue, Mar 13, 2012 at 10:02 AM, Sébastien Tromp sebastien.tr...@gmail.com
 wrote:

 Hello,

 Same thing here, since around an hour ago:
 AppID: fiveorbsgame  fiveorbsgame-test








-- 
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations


Re: [google-appengine] Re: Outages?

As Chris pointed out earlier in this thread, M/S apps are more vulnerable to
this kind of transient infrastructure issue because moving them around
requires a maintenance period.

HRD applications are covered by the SLA, replicated across multiple
datacenters, and better distributed: if we notice an issue impacting one or
many of them, we can easily take action without impacting other
applications.

I strongly suggest you try out the self-migration tool in your
administration console. Depending on the size of your data and your write
QPS, the read-only period needed to migrate your application could be very
small:
https://appengine.google.com/migrating?app_id=your application id

On Tue, Mar 13, 2012 at 2:37 PM, Ronoaldo José de Lana Pereira 
rpere...@beneficiofacil.com.br wrote:

 This is very disturbing ... Our M/S app is getting higher error rates and
 some instances take from 15s to 70s to start. We can't do anything about
 this, or even debug what is happening. If there were an issue with our code,
 instances should always take 70s to start! I really can't think of anything
 in our code that loads the whole world and would take all that
 time... Then the solution is: move to HRD and ... experience the same load
 time on app startup??? So, what can I do!!!

 Sorry for the rude tone, but I'm very concerned. We started the process
 to get operational support and plan our migration to have the SLA, but if
 our app startup continues to take this long I really can't imagine
 what to do...

 Googlers, is the startup problem getting solved for both M/S and HRD?
 Only HRD? Neither? Is there anything we can do to avoid this strange behavior
 of our instances? Instance startup seems to be vital to the application's
 health: if your app takes too much time to start up, then all concurrent
 requests to that instance die with 500's. This is very odd, and the
 warmup requests never seem to work. By our observations and other people's
 observations, even if you set a fixed min_idle instances value so that idle
 instances are always there, they don't serve the traffic and you still get
 errors!

 Hope to see some answers. We really liked GAE in our first year working
 with the platform, and now I feel completely lost...

 Best Regards,

 -Ronoaldo

 On Tuesday, March 13, 2012 07:51:03 UTC-3, Richard Watson
 wrote:

 My graph showing ms/sec (attached) over last 24h.  Average spikes were up
 to 32 seconds, but I think all the errors were 60+-1 sec.






-- 
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations


Re: [google-appengine] Re: Outages?

If you are not using the datastore it should be trivial to move your
application to the new HRD infrastructure.

If you don't need the same appid, just create a new HRD application and
deploy your code on it.

If you need the same appid, use the self migration tool:
https://appengine.google.com/migrating?app_id=your application id

Feel free to open a production ticket if you need any assistance migrating
your application:
http://code.google.com/p/googleappengine/issues/entry?template=Production%20issue

On Tue, Mar 13, 2012 at 12:44 PM, charisl charisl...@gmail.com wrote:

Hello Ikai

our app is *betscoreslive*. In our settings master/slave replication is
activated, but we are NOT using the datastore at all, and we have been
experiencing the DeadlineExceededExceptions and the increased instance count
mentioned by the other people in this discussion. Our app only
uses memcache.

Normal operation
during traffic peaks: instances ~25, ~45 requests/second and QPS 1

Now we observe: 12 instances doing nothing, with ~2 requests/second and QPS 0.2.
We have had the deadline exceptions for a week now, with short periods of
normal operation. During the problematic periods, our app is pretty
unresponsive.

Will the announced maintenance for master/slave apps solve the issue
for our app as well?

thanks in advance

Charis

On Friday, March 9, 2012 9:15:35 PM UTC+2, Ikai Lan wrote:

Hey everyone,

Here are a few things that will help:

1. Application IDs (if you have nothing else, at least provide this)
2. What is your QPS?
3. What % of your requests are errors?
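
As a rough sketch of how the QPS and error-percentage figures asked for
above could be pulled out of exported request logs: the `RequestRecord`
shape and the `summarize` helper below are hypothetical (they are not part
of any App Engine API), and "error" is assumed here to mean any 5xx status.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One request from an exported log (hypothetical shape, not a GAE type)."""
    timestamp: float  # seconds since the epoch
    status: int       # HTTP status code returned to the client


def summarize(records):
    """Return (qps, error_percent) for a list of RequestRecord."""
    if not records:
        return 0.0, 0.0
    start = min(r.timestamp for r in records)
    end = max(r.timestamp for r in records)
    # Guard against a zero-length window when all records share a timestamp.
    window = max(end - start, 1.0)
    # Assumption: only 5xx responses count as errors.
    errors = sum(1 for r in records if r.status >= 500)
    return len(records) / window, 100.0 * errors / len(records)


if __name__ == "__main__":
    # Synthetic sample: one request per second, every tenth one fails.
    sample = [RequestRecord(float(t), 500 if t % 10 == 0 else 200)
              for t in range(101)]
    qps, error_pct = summarize(sample)
    print("QPS: %.2f, errors: %.1f%%" % (qps, error_pct))
```

Averaging over the whole log window hides short spikes, so for an outage
report it is probably more useful to run this over the problematic period
only.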

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com

On Fri, Mar 9, 2012 at 7:24 AM, Ronoaldo José de Lana Pereira
rpere...@beneficiofacil.com.br wrote:

+1 for seeing the same problems on my app.

It started to be worse after maintenance on March 7.

On Friday, March 9, 2012 08:33:36 UTC-3, Nikolai wrote:

+1
We had to move to our backup systems. Everything is full of 500 errors
or extreme latency.
Most of the 500 errors we see aren't even logged, so this seems to be a
Google problem one abstraction layer above the app.

And yes - sometimes we have got the same feeling, that we are the only
ones that use appengine in a production setting. You are not alone ;)

regards,
nikolai

On Tuesday, March 6, 2012 22:17:37 UTC+1, Adam Sherman wrote:

Am I the only one seeing short duration outages? They are being
reflected at:

http://code.google.com/status/appengine

But I don't see anyone else complaining anywhere, so it makes me
worried.


--
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations

Re: [google-appengine] Re: Outages?

What is your application id?

Feel free to open a production issue, if you want to investigate this
offthread:
http://code.google.com/p/googleappengine/issues/entry?template=Production%20issue


On Tue, Mar 13, 2012 at 11:44 AM, Mos mosa...@googlemail.com wrote:

 Same thing in the last few minutes on our app (HRD, Java, low traffic, one
 instance, no new deployment, simple page just hitting MemCache):

 Request was aborted after waiting too long to attempt to service your
 request.   --  User sees a 500

 GAE team, what has been going on these last days?  In my opinion Google App
 Engine is unreliable and looks more like an alpha or beta
 cloud environment.

 Google, please share your analysis with us.

 Cheers
 Mos











-- 
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations


Re: [google-appengine] Re: Outages?

What is your application id? Did you already file a production issue?

On Tue, Mar 13, 2012 at 3:10 PM, Jason jaso...@gmail.com wrote:

 I'm using HRD and am still getting tons of 500s since early this
 morning.

 On Mar 13, 8:52 am, Johan Euphrosine pro...@google.com wrote:
  I believe this was related to:
 http://code.google.com/p/googleappengine/issues/detail?id=7130
 
  This should now be fixed.





-- 
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations


Re: [google-appengine] Re: Outages?

2012-03-13 Thread Mos

 What is your application id?

krisen-talk(www.krisentalk.de)

 Feel free to open a production issue

There is already an issue from someone else (following this thread a lot of
people are affected):

http://code.google.com/p/googleappengine/issues/detail?id=7133

Johan, what's going on with GAE the last days?  It doesn't feel like a PaaS
in production mode.
Perhaps Google should reintroduce the Beta status. ;)


On Tue, Mar 13, 2012 at 3:10 PM, Johan Euphrosine pro...@google.com wrote:

 What is your application id?

 Feel free to open a production issue, if you want to investigate this
 offthread:

 http://code.google.com/p/googleappengine/issues/entry?template=Production%20issue




Re: [google-appengine] Re: Outages?

2012-03-13 Thread Riley Eynon-Lynch

We migrated our app to the HRD last night. With 4 GB of data quota in
around 2M entities it took 20-30 minutes. On the new app, we are seeing
response times at about 1% of the MS app - 100 times faster. Our app was
read only for less than three seconds - enough to affect 10 requests from 2
users.

We decided to do this without very thorough testing because the app was
broken for all of our users on M/S, and will only break for a small
percentage of users on HRD. When it does break, it will be pretty
harmless, and we expect to fix the inconsistency problems by the end of
today. We just could not wait until Monday for a possible fix for the M/S
app.

If you've got a similar setup or similar needs, I recommend a
hastier-than-usual switch. Just watch out for the email limits~

On Tue, Mar 13, 2012 at 9:04 AM, Johan Euphrosine pro...@google.com wrote:

If you are not using the datastore it should be trivial to move your
application to the new HRD infrastructure.

If you don't need the same appid, just create a new HRD application and
deploy your code on it.

If you need the same appid, use the self migration tool:
https://appengine.google.com/migrating?app_id=your application id

Feel free to open a production ticket if you need any assistance migrating
your application:

http://code.google.com/p/googleappengine/issues/entry?template=Production%20issue

On Tue, Mar 13, 2012 at 12:44 PM, charisl charisl...@gmail.com wrote:

Hello Ikai

Normal operation
during traffic peaks: instances ~25, ~45 requests/second and QPS 1

Will the announced maintenance for master/slave apps will solve the issue
for our app as well?

thanks in advance

Charis

On Friday, March 9, 2012 9:15:35 PM UTC+2, Ikai Lan wrote:

Hey everyone,

Here are a few things that will help:

1. Application IDs (--- if you have nothing else, at least provide this)
2. What is your QPS?
3. What % of your requests are errors?

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com

On Fri, Mar 9, 2012 at 7:24 AM, Ronoaldo José de Lana Pereira
rpere...@beneficiofacil.com.**br rpere...@beneficiofacil.com.brwrote:

+1 for seeing the same problems on my app.

It started to be worse after maintenance on March 7.

Em sexta-feira, 9 de março de 2012 08h33min36s UTC-3, Nikolai escreveu:

And yes - sometimes we have got the same feeling, that we are the only
ones that use appengine in a production setting. You are not alone ;)

regards,
nikolai

Am Dienstag, 6. März 2012 22:17:37 UTC+1 schrieb Adam Sherman:

Am I the only one seeing short duration outages? They are being
reflected at:

http://code.google.com/status/appenginehttp://code.google.com/status/appengine

But I don't see anyone else complaining anywhere, so it makes me
worried.

To post to this group, send email to google-appengine@googlegroups.com.
To unsubscribe from this group, send email to
google-appengine+unsubscribe@googlegroups.com.
For more options, visit this group at
http://groups.google.com/group/google-appengine?hl=en.

--
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations


Re: [google-appengine] Re: Outages?

2012-03-13 Thread Mark



I have had outages throughout this morning.

My app is bedbuzzserver.appspot.com, a Java app on HR.

The outages ran from 2am until 7.48am. I have filed Production issue 7138.

My instances keep getting reset (and then take too long to start up), and
my 'Current Load' logs get reset to 0
(e.g. bedbuzzserver.appspot.com/logon goes from 100 calls today to 0).


RE: [google-appengine] Re: Outages?

2012-03-13 Thread Brandon Wirtz

We have noticed that many of the downtimes Pingdom reports are the result of
AppsForDomains.  If you hit your app from another app, or via AOL, or
another provider that has a peering arrangement with AppEngine it will be
up.

I'm calling this AppsForDomains issues because typically during these
outages we get error pages in AppsforDomains admin pages.

In these instances Green Checks will show in the status for Appengine. But
your app will fail to resolve.

 

 

 

From: google-appengine@googlegroups.com
[mailto:google-appengine@googlegroups.com] On Behalf Of Rick Mangi
Sent: Tuesday, March 13, 2012 12:06 PM
To: google-appengine@googlegroups.com
Subject: [google-appengine] Re: Outages?

 

Same here. I use pingdom to monitor my site and it's been down on and off
for the past 24 hours to the tune of around 30 minutes.

 

I opened an enterprise support ticket but haven't heard anything back.


Re: [google-appengine] Re: Outages?

2012-03-13 Thread j

Ikai,

I have not moved to HRD yet, but I am pretty sure I am the only user of my
application. For the past couple of days, not only has it been slow, but I
keep running out of quota even though I turned on billing. I switched
billing off yesterday because it didn't help.

Can I get a refund? I am asking because I hit the app fewer than 20 times
per day and still run out of quota. Enabling billing didn't help. I am
perplexed: with a single user, how is it possible to exhaust 0.05 million
operations? See below for an example:

Datastore Read Operations
100% | 0.05 of 0.05 Million Ops | 0.00 | $0.70 / Million Ops | $0.00
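
As a rough sanity check on the figures quoted above, the sketch below works out how many datastore read operations each request would have to consume, on average, to exhaust the quota. It assumes the 0.05 million ops is a daily allowance and takes "less than 20 times per day" as an upper bound of 20 requests; the function name is illustrative only.

```python
# Back-of-the-envelope check: how many datastore reads per request would
# be needed to exhaust the quota at the traffic level described above?

def reads_per_request(quota_ops, requests_per_day):
    """Average read ops each request must consume to use up the quota."""
    return quota_ops / requests_per_day

FREE_QUOTA_OPS = 50_000   # 0.05 million ops, from the dashboard row
REQUESTS_PER_DAY = 20     # "less than 20 times per day" (assumed upper bound)

average = reads_per_request(FREE_QUOTA_OPS, REQUESTS_PER_DAY)
print(f"{average:.0f} read ops per request")
```

An average in the thousands per request would suggest that something other than the single user's requests is reading from the datastore, although nothing in this thread confirms the actual cause.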

I don't want to publish my appid. If you would like to know it, I can text
it to your Google Voice number, if that helps.

Thanks for any reply.


Re: [google-appengine] Re: Outages?

2012-03-13 Thread Mauricio Aristizabal

Potential fix: set performance sliders to auto.

This is purely anecdotal but it might mean something: after reading a
post this afternoon about the instance settings not really working, I
switched to AUTO idle instances and AUTO pending latency (before, they were
set to 1-1 and 25ms-1.5s respectively).

That was about 5 hours ago, and within an hour or so everything started
working fine.  Before that, problems had been continuous as far as I could
tell for 6 days or so.

Or maybe the AppEngine guys finally got it under control.



On Tue, Mar 13, 2012 at 5:58 PM, stephenp slpe...@gmail.com wrote:

 One more here.

 appid: carglyplatform (HRD)

 It's been flaky off-and-on for a couple weeks, yesterday was better, today
 bad again. Lots of warmup errors, instance restarts, errors in general.

 Stephen



Re: [google-appengine] Re: Outages?

Hi Riley,

That's a legitimate question, and one that we haven't officially answered
yet. It's certainly the direction that things have been moving simply due
to the nature of production management. Given that the SLA applies to HRD
and not master/slave applications, you are definitely going to get a better
quality of service migrating to HRD. In fact, I strongly advise that you do
so.

One challenge that we have when dealing with issues is to decide whether we
should do emergency maintenance that requires downtime. With any production
system, it's not always guaranteed that maintenance will result in issues
being completely resolved, which would be really bad for app developers. At
what threshold do we determine that a downtime with no guarantee of
addressing the issues is worthwhile? Global 0.1% error rate? 1%? The call
is not always clear cut because those errors may not be evenly distributed,
and the impact may be huge, or it may be small. With master/slave
applications, we do what we can to address the short term symptoms as well
as the underlying system issues without impacting serving, which is often
an order of magnitude more difficult (It kind of reminds me of that scene
in Indiana Jones where he takes an artifact, swapping it with a bag of sand
as quickly as possible to try to avoid setting off traps. Pillaging of
historic artifacts is way easier when it's not dangerous, not speaking from
personal experience). When your application runs on High Replication, the
call is easy: there's no downtime required in 99% of cases, so we perform
the maintenance right away because if it doesn't address the issue, there's
no serving downtime for users.

If you're not subscribed to downtime-notify, I recommend that you do so.
Announcements like this will NOT and never will be moving to StackOverflow:

https://groups.google.com/forum/?fromgroups#!forum/google-appengine-downtime-notify

We may be announcing a maintenance in the very near future that will impact
the serving of master/slave applications.

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com



On Mon, Mar 12, 2012 at 9:46 AM, Riley rileyl...@gmail.com wrote:

 Ikai, it sounds like support for HR apps is being prioritized.  Is that
 the case? Should we expect that to be the case in the future? Sorry if
 that's documented somewhere already~

 Riley


 On Monday, March 12, 2012 11:44:59 AM UTC-5, Riley wrote:

 Our appid is activegrade, we use the m/s datastore, and get from 0-10 QPS
 throughout the day.  Normally we have 1-4 instances running, but since this
 seems *mostly* related to startup, we dedicated 10 idle resident instances
 to run all the time.  This covers us a little, but still, when a user
 triggers a new instance, they get the 60+ second wait and then an error.
 Ugh!  Our costs are relatively minor - about $20 a day now that we are
 running these 10 ordinarily unnecessary instances - but this is a big cost
 for us, and embarrassing too.


Re: [google-appengine] Re: Outages?

Regarding that maintenance period:

https://groups.google.com/forum/?fromgroups#!topic/google-appengine-downtime-notify/CO_x02OF9Ak

It's happening next Monday, March 19th at 4pm US/Pacific (19th March, 23:00
GMT).
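
For anyone wanting to double-check the conversion above, here is a minimal sketch using only the Python standard library. It assumes US/Pacific is on daylight time (PDT, UTC-7) on that date, since US daylight saving time began on March 11 in 2012; the helper name is illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: US/Pacific is on daylight time (PDT, UTC-7) on 2012-03-19,
# because US daylight saving time began on 2012-03-11.
PDT = timezone(timedelta(hours=-7), name="PDT")

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def to_utc(local_time):
    """Convert a timezone-aware datetime to UTC."""
    return local_time.astimezone(timezone.utc)

maintenance_start = datetime(2012, 3, 19, 16, 0, tzinfo=PDT)
print(WEEKDAYS[maintenance_start.weekday()])   # day of week in Pacific time
print(to_utc(maintenance_start).isoformat())   # same instant in UTC/GMT
```

A fixed offset is used here instead of a time zone database lookup so the sketch runs anywhere; for other dates the offset would need to be checked.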

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com


Re: [google-appengine] Re: Outages?

2012-03-12 Thread Riley Eynon-Lynch

Thanks a lot. FYI: That post says Wednesday the 19th instead of Monday the
19th.

Riley


Re: [google-appengine] Re: Outages?

GAH, it's like no matter how many times I read these things over I always
make at least one mistake.

And that's why code review is a Good Thing.

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com


Re: [google-appengine] Re: Outages?

2012-03-12 Thread Amit Sangani

Hi Ikai,

We wouldn't mind moving to HRD from M/S, but isn't it 3X more expensive?

Also, what's the best way to minimize the impact on our users when the
datastore is in read-only mode during downtimes? Consider that every action
our users take involves writing to the datastore. Will using memcache help?
Will memcache be available without interruption during datastore downtime?

thanks!
Amit


RE: [google-appengine] Re: Outages?

2012-03-12 Thread Brandon Wirtz

 We wouldn't mind moving to HRD from M/S, but isn't it 3X more expensive?

 

No.


Re: [google-appengine] Re: Outages?

2012-03-12 Thread Chris Ramsdale

Moving to HRD is the safest way to ensure that your users are not impacted
during a downtime. Memcache and other mechanisms can be used, but will
definitely not scale and aren't guaranteed to be resilient in the face of
all downtime scenarios.

For the particular issues reported on this thread, we have a few root
causes that we're looking into. In terms of a fix though, all of the
affected apps are running on M/S and, as a result, our options are much
more constrained -- we're not able to move the apps as freely as we can
with HRD-based applications.

M/S worked really well when it was first rolled out, but given the increase
in number of apps and datastore transactions we needed an even better
solution -- thus HRD. While the pros and cons of HRD have been discussed
and debated within this group, the simple fact is: if you want to minimize
your exposure to downtimes you need to move over to HRD. There's an SLA of
99.95%, which we've consistently beat
(http://googleappengine.blogspot.com/2012/01/happy-birthday-high-replication.html)
month over month.
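
To put the 99.95% figure in concrete terms, the sketch below converts an availability target into a downtime budget. It assumes a 30-day month and that availability is measured over wall-clock time; how the SLA is actually measured is not stated in this thread, and the function name is illustrative.

```python
def allowed_downtime_minutes(sla_percent, days):
    """Minutes of downtime permitted by an availability target over `days`."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)

# Assumption: a 30-day month.
budget = allowed_downtime_minutes(99.95, 30)
print(f"{budget:.1f} minutes per 30-day month")
```

Under those assumptions the budget is roughly 21.6 minutes a month, which gives some scale to the Pingdom reports elsewhere in this thread of around 30 minutes of downtime within 24 hours.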

We're committed to resolving the current issue, but I strongly urge anyone
running on M/S to make the move over to HRD. It's the quickest and most
long-term fix that you can make.

-- Chris Ramsdale

Product Manager, Google App Engine



Re: [google-appengine] Re: Outages?

HRD is not 3x more expensive. We lowered the cost to make it match the
master/slave cost.

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com




Re: [google-appengine] Re: Outages?

Quick update: the time has been pushed back 2 hours to 6PM US/Pacific. See
the latest message here:

https://groups.google.com/forum/?fromgroups#!topic/google-appengine-downtime-notify/CO_x02OF9Ak

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com

On Mon, Mar 12, 2012 at 1:59 PM, Ikai Lan (Google) ika...@google.comwrote:

GAH, it's like no matter how many times I read these things over I always
make at least one mistake.

And that's why code review is a Good Thing.

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com

On Mon, Mar 12, 2012 at 1:54 PM, Riley Eynon-Lynch rileyl...@gmail.comwrote:

Thanks a lot. FYI: That post says Wednesday the 19th instead of Monday
the 19th.

Riley

On Mon, Mar 12, 2012 at 3:52 PM, Ikai Lan (Google) ika...@google.comwrote:

Regarding that maintenance period:

https://groups.google.com/forum/?fromgroups#!topic/google-appengine-downtime-notify/CO_x02OF9Ak

It's happening next Monday, March 19th at 4pm US/Pacific (19th March,
23:00 GMT).

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com

On Mon, Mar 12, 2012 at 11:52 AM, Ikai Lan (Google)
ika...@google.comwrote:

Hi Riley,

That's a legitimate question, and one that we haven't officially
answered yet. It's certainly the direction that things have been moving
simply due to the nature of production management. Given that the SLA
applies to HRD and not master/slave applications, you are definitely going
to get a better quality of service migrating to HRD. In fact, I strongly
advise that you do so.

One challenge that we have when dealing with issues is to decide
whether we should do emergency maintenance that requires downtime. With any
production system, it's not always guaranteed that maintenance will result
in issues being completely resolved, which would be really bad for app
developers. At what threshold do we determine that a downtime with no
guarantee of addressing the issues is worthwhile? Global 0.1% error rate?
1%? The call is not always clear cut because those errors may not be evenly
distributed, and the impact may be huge, or it may be small. With
master/slave applications, we do what we can to address the short term
symptoms as well as the underlying system issues without impacting serving,
which is often an order of magnitude more difficult (It kind of reminds me
of that scene in Indiana Jones where he takes an artifact, swapping it with
a bag of sand as quickly as possible to try to avoid setting off traps.
Pillaging of historic artifacts is way easier when it's not dangerous, not
speaking from personal experience). When your application runs on High
Replication, the call is easy: there's no downtime required in 99% of
cases, so we perform the maintenance right away because if it doesn't
address the issue, there's no serving downtime for users.

If you're not subscribed to downtime-notify, I recommend that you do
so. Announcements like this will NOT and never will be moving to
StackOverflow:

https://groups.google.com/forum/?fromgroups#!forum/google-appengine-downtime-notify

We may be announcing a maintenance in the very near future that will
impact the serving of master/slave applications.

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com

On Mon, Mar 12, 2012 at 9:46 AM, Riley rileyl...@gmail.com wrote:

Ikai, it sounds like support for HR apps is being prioritized. Is
that the case? Should we expect that to be the case in the future? Sorry
if that's documented somewhere already~

Riley

On Monday, March 12, 2012 11:44:59 AM UTC-5, Riley wrote:

Our appid is activegrade, we use the m/s datastore, and get from 0-10
QPS throughout the day. Normally we have 1-4 instances running, but since
this seems *mostly* related to startup, we dedicated 10 idle resident
instances to run all the time. This covers us a little, but still, when a
user triggers a new instance, they get the 60+ second wait and then an
error. Ugh! Our costs are relatively minor - about $20 a day now that we
are running these 10 ordinarily unnecessary instances - but this is a big
cost for us, and embarrassing too.

On Tuesday, March 6, 2012 3:17:37 PM UTC-6, Adam Sherman wrote:

Am I the only one seeing short duration outages? They are being
reflected at:

http://code.google.com/status/appengine

But I don't see anyone else complaining anywhere, so it makes me
worried.


Re: [google-appengine] Re: Outages?

2012-03-12 Thread philburk

Hi Chris,

On Monday, March 12, 2012 3:55:00 PM UTC-7, Chris Ramsdale wrote:

 For the particular issues reported on this thread, we have a few root 
 causes that we're looking into. In terms of a fix though, all of the 
 affected apps are running on M/S


Our app prodicta is using HRD, not M/S. We have been seeing a big 
degradation in performance the last few days. I just ran a test and my 
client code timed out because it took 47 seconds to load a 200KB static JAR 
file. It took 34 seconds to load a small PNG file. Customers are 
complaining.

I starred this issue:
http://code.google.com/p/googleappengine/issues/detail?id=7093 

Is this problem being considered a high priority? The status chart does not 
seem to reflect the problems being reported.

-- 
You received this message because you are subscribed to the Google Groups 
Google App Engine group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/google-appengine/-/tYTjYwmVIZoJ.
To post to this group, send email to google-appengine@googlegroups.com.
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en.

RE: [google-appengine] Re: Outages?

2012-03-12 Thread Brandon Wirtz

You should update the message shown when you start a new app that says it
costs 3x as much.

From: google-appengine@googlegroups.com
[mailto:google-appengine@googlegroups.com] On Behalf Of Ikai Lan (Google)
Sent: Monday, March 12, 2012 4:09 PM
To: google-appengine@googlegroups.com
Subject: Re: [google-appengine] Re: Outages?

HRD is not 3x more expensive. We lowered the cost to make it match the
master/slave cost.

--

Ikai Lan 
Developer Programs Engineer, Google App Engine

plus.ikailan.com

On Mon, Mar 12, 2012 at 2:33 PM, Amit Sangani amit.sang...@gmail.com
wrote:

Hi Ikai,

We wouldn't mind moving to HRD from M/S, but isn't it 3X more expensive?

Also, what's the best way to minimize the impact on our users when the
datastore is in read-only mode during downtimes? Consider that every action
our users take involves writing to the datastore. Will using memcache help?
Will memcache be available without interruption during datastore downtime?

thanks!

Amit


Re: [google-appengine] Re: Outages?

2012-03-11 Thread Alexey Konovalov

Ikai,

Our apps ids: 

rvaserver
rvauser
contentfinancial
contentsports

QPS and error rates differ but they've all been getting a lot of 
DeadlineExceeded exceptions and the number of instances has been higher 
than usual over the last couple of days.

Regards,
Alexey



On Friday, March 9, 2012 2:15:35 PM UTC-5, Ikai Lan wrote:

 Hey everyone,

 Here are a few things that will help:

 1. Application IDs (--- if you have nothing else, at least provide this)
 2. What is your QPS?
 3. What % of your requests are errors?

 --
 Ikai Lan 
 Developer Programs Engineer, Google App Engine
 plus.ikailan.com



 On Fri, Mar 9, 2012 at 7:24 AM, Ronoaldo José de Lana Pereira 
 rpere...@beneficiofacil.com.br wrote:

 +1 for seeing the same problems on my app.

 It started to be worse after maintenance on March 7.

 On Friday, March 9, 2012 08:33:36 UTC-3, Nikolai wrote:

 +1
 We had to move to our backup systems. Everything is full of 500 errors 
 or hardcore latency.
 Most of the 500 errors we see aren't even logged, so this seems to be a 
 Google problem one abstraction layer above the app.

 And yes - sometimes we have got the same feeling, that we are the only 
 ones that use appengine in a production setting. You are not alone ;)

 regards,
 nikolai


Re: [google-appengine] Re: Outages?

2012-03-10 Thread Amit Sangani

appid: textyserver

Still getting lots of exceptions, mainly:

1) com.google.apphosting.runtime.HardDeadlineExceededError exceptions
2) Failed startup of context
   com.google.apphosting.utils.jetty.RuntimeAppEngineWebAppContext
3) javax.jdo.JDOException: Transaction failed to commit
   at org.datanucleus.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:419)
   at org.datanucleus.jdo.JDOPersistenceManager.close(JDOPersistenceManager.java:281)

The status page says everything is normal
(http://code.google.com/status/appengine), which does not seem true.

Please let us know if you need more information. 

thanks!
Amit


Re: [google-appengine] Re: Outages?

2012-03-10 Thread nischalshetty

Best explanation ever.

On Wednesday, March 7, 2012 9:45:11 PM UTC+5:30, Brandon Wirtz wrote:

  So, apparently, we all imagined the problem. The status page no longer
  admits to anything.

 In most systems the Uptime is 100% minus the summation of the downtime of
 all other systems.  The exception to this rule is logging. When Logging
 fails to record the downtime, Uptime goes up.  As a result Google has been
 working hard to build a logging system that goes down just ahead of all
 other systems, and comes up shortly after.





Re: [google-appengine] Re: Outages?

Hey everyone,

Here are a few things that will help:

1. Application IDs (--- if you have nothing else, at least provide this)
2. What is your QPS?
3. What % of your requests are errors?

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com




Re: [google-appengine] Re: Outages?

2012-03-09 Thread Alexander Trakhimenok

Hey Ikai,

Our app id: petaclasses

QPS: 5-20 requests per second

Current instances in dashboard: 110 - 160
Usual instances: 8-15 

It's hard to say what % of requests failed, as we also have requests that 
fail for other reasons (e.g. non-existent pages) and I'm not sure how to 
separate them easily.

By the way, are you considering creating a page where we can 
post/report this data in a structured way and join an issue, so you can 
accumulate reports and understand the scale of an issue easily?

Alex
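
One possible way to separate the failures described above is by HTTP status class, counting 5xx responses apart from 4xx responses such as 404s for non-existent pages. This is a minimal sketch, assuming you can export one status code per request from your logs; the function name `summarize` and the sample data are hypothetical.

```python
from collections import Counter


def summarize(statuses, window_seconds):
    """Summarize HTTP status codes seen over an observation window.

    statuses: iterable of integer HTTP status codes, one per request.
    window_seconds: length of the observation window in seconds.
    """
    counts = Counter()
    for status in statuses:
        if status >= 500:
            counts["server_error"] += 1  # e.g. 500s caused by deadline errors
        elif status >= 400:
            counts["client_error"] += 1  # e.g. 404s for non-existent pages
        else:
            counts["ok"] += 1
    total = sum(counts.values())
    return {
        "qps": total / window_seconds if window_seconds else 0.0,
        "server_error_pct": 100.0 * counts["server_error"] / total if total else 0.0,
        "client_error_pct": 100.0 * counts["client_error"] / total if total else 0.0,
    }


if __name__ == "__main__":
    # Made-up sample: 100 requests observed over 10 seconds.
    sample = [200] * 90 + [404] * 6 + [500] * 4
    print(summarize(sample, 10))
```

Reporting the 5xx share on its own keeps ordinary 404s from inflating the error percentage.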


Re: [google-appengine] Re: Outages?

Alex, to answer that question: yes. We are looking to revamp the production
issues tracker which is far from optimal. When users can join or aggregate
issues, it allows us to quickly separate actual infrastructure hiccups from
user code issues.

Thanks for the info! Is there any other behavior you can report? Does it
sound reasonable that you have 110-160 instances because long startup
times mean more instances are required to serve the same load? Are you on
Python, Java or Go, and do you have concurrent requests enabled?

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com
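
As a rough way to reason about the question above, Little's law (requests in flight = arrival rate x time per request) gives a back-of-the-envelope instance count. This is only a sketch: the function name and the latency values are assumptions for illustration, and it does not describe how the App Engine scheduler actually allocates instances.

```python
import math


def instances_needed(qps, avg_latency_seconds, concurrent_requests=1):
    """Rough estimate of busy instances using Little's law (L = lambda * W).

    qps: average requests per second arriving at the app.
    avg_latency_seconds: average time an instance spends on one request,
        including any startup time the request has to wait for.
    concurrent_requests: requests one instance handles at once
        (1 when concurrent requests are disabled).
    """
    if concurrent_requests < 1:
        raise ValueError("concurrent_requests must be at least 1")
    in_flight = qps * avg_latency_seconds
    return max(1, math.ceil(in_flight / concurrent_requests))


if __name__ == "__main__":
    # Illustrative latencies only; they are not measurements from this thread.
    print(instances_needed(20, 0.5))  # fast requests
    print(instances_needed(20, 8))    # requests slowed by long startups
```

Under this simple model, 20 QPS at 0.5 s per request keeps about 10 instances busy, while the same 20 QPS at 8 s per request would keep about 160 busy, so slow startups alone could plausibly account for a large jump in instance count.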


Re: [google-appengine] Re: Outages?

2012-03-09 Thread Alexander Trakhimenok

We are on Python 2.5 (no concurrent requests).

Yes, it seems the start-up time is just crazy high for at least some, or all,
instances.

I also noticed that there are lots of instances that served just 1 request,
have an average latency of 0 ms and QPS=0, and have an average age of about
8-9 minutes (up to 11 minutes). To me it looks like an instance is created to
serve static content, is not used anymore, and stays there until it dies a
while later.

At the moment we have 264 active instances and it's killing our budget :( -
see the screenshot attached. We had 2 hours of downtime due to an exceeded
budget.

Alex


attachment: Screen Shot 2012-03-09 at 16.12.27.png

Re: [google-appengine] Re: Outages?

2012-03-09 Thread Ronoaldo José de Lana Pereira

Just a follow up:
1. Application Id: oferta-unica
2. QPS: Currently around ~10 dynamic req/sec, overall ~32 req/sec
3. After disabling concurrent requests, ~0.6 errors/sec; before, ~1.5
errors/sec.

Like Alexander said, some of the errors aren't due to this issue, but I
can confirm that we have lots of user-facing 500 errors because our custom
500 error page sends events to Google Analytics:

https://lh5.googleusercontent.com/-UWJyZCb-zjE/T1pmEnDI3ZI/AC8/_m27Og4FPEg/s1600/appengine-error-rate.png

Thanks for your help.


Re: [google-appengine] Re: Outages?

2012-03-09 Thread Nick

appid: i-strive-to
Java, with threadsafe set to true.


Re: [google-appengine] Re: Outages?

2012-03-09 Thread Amit Sangani

Also now getting the exceptions below:

java.lang.ExceptionInInitializerError
at org.datanucleus.jdo.metadata.JDOAnnotationReader.processClassAnnotations(JDOAnnotationReader.java:140)
at org.datanucleus.metadata.annotations.AbstractAnnotationReader.getMetaDataForClass(AbstractAnnotationReader.java:122)
at org.datanucleus.metadata.annotations.AnnotationManagerImpl.getMetaDataForClass(AnnotationManagerImpl.java:136)
at org.datanucleus.metadata.MetaDataManager.loadAnnotationsForClass(MetaDataManager.java:2278)
at org.datanucleus.jdo.metadata.JDOMetaDataManager.getMetaDataForClassInternal(JDOMetaDataManager.java:369)
at org.datanucleus.metadata.MetaDataManager.getMetaDataForClass(MetaDataManager.java:1125)
at org.datanucleus.store.appengine.EntityUtils.idToInternalKey(EntityUtils.java:122)
at org.datanucleus.store.appengine.jdo.DatastoreJDOPersistenceManager.getObjectById(DatastoreJDOPersistenceManager.java:63)

com.google.apphosting.runtime.HardDeadlineExceededError: This request
(a9a7135b6a5f023e) started at 2012/03/09 21:37:22.971 UTC and was still
executing at 2012/03/09 21:38:22.911 UTC.
at org.datanucleus.store.appengine.DatastoreManager.getDatastoreClass(DatastoreManager.java:765)
at org.datanucleus.store.appengine.DatastoreManager.getDatastoreClass(DatastoreManager.java:88)
at org.datanucleus.store.appengine.EntityUtils.determineKind(EntityUtils.java:98)
at org.datanucleus.store.appengine.EntityUtils.determineKind(EntityUtils.java:94)
at org.datanucleus.store.appengine.EntityUtils.idToInternalKey(EntityUtils.java:124)

com.google.apphosting.api.DeadlineExceededException: This request
(34c3e1d5cfc6d211) started at 2012/03/09 21:36:34.867 UTC and was still
executing at 2012/03/09 21:37:34.367 UTC.
at com.google.appengine.runtime.Request.process-34c3e1d5cfc6d211(Request.java)
at java.util.zip.ZipFile.read(Native Method)
at java.util.zip.ZipFile.access$1200(ZipFile.java:57)
at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:476)
at java.util.zip.ZipFile$1.fill(ZipFile.java:259)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
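
Both deadline messages above carry a start timestamp and a "still executing" timestamp, so the elapsed time can be checked directly. The following is a small sketch using only the Python standard library; the function name `elapsed_seconds` is made up for this example, and the regular expression assumes the exact timestamp format shown in these messages.

```python
import re
from datetime import datetime

# Matches timestamps in the format used by the messages above,
# e.g. "2012/03/09 21:37:22.971 UTC".
TIMESTAMP = re.compile(r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) UTC")


def elapsed_seconds(message):
    """Return the seconds between the first two timestamps in message."""
    stamps = TIMESTAMP.findall(message)
    if len(stamps) < 2:
        raise ValueError("expected a start and an end timestamp")
    start, end = (
        datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S.%f") for stamp in stamps[:2]
    )
    return (end - start).total_seconds()


if __name__ == "__main__":
    hard_deadline = (
        "This request (a9a7135b6a5f023e) started at 2012/03/09 21:37:22.971 UTC "
        "and was still executing at 2012/03/09 21:38:22.911 UTC."
    )
    print(round(elapsed_seconds(hard_deadline), 3))
```

Applied to the two messages above, this gives roughly 59.94 and 59.5 seconds, consistent with requests being cut off at about the 60-second mark.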


Re: [google-appengine] Re: Outages?

I forgot to ask if these were master/slave or high replication apps. I can
always check by going to the admin console, but I'm hoping to separate them
out.

We're looking into the HR apps first (once I figure out which is which).

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com




Re: [google-appengine] Re: Outages?

2012-03-09 Thread Amit Sangani

textyserver is on master/slave.

On Fri, Mar 9, 2012 at 2:07 PM, Ikai Lan (Google) ika...@google.com wrote:

 I forgot to ask if these were master/slave or high replication apps. I can
 always check by going to the admin console, but I'm hoping to separate them
 out.

 We're looking into the HR apps first (one I figure out which is which).

 --
 Ikai Lan
 Developer Programs Engineer, Google App Engine
 plus.ikailan.com



 On Fri, Mar 9, 2012 at 1:53 PM, Amit Sangani amit.sang...@gmail.comwrote:

 Also now getting below exceptions -

 java.lang.ExceptionInInitializerError
     at org.datanucleus.jdo.metadata.JDOAnnotationReader.processClassAnnotations(JDOAnnotationReader.java:140)
     at org.datanucleus.metadata.annotations.AbstractAnnotationReader.getMetaDataForClass(AbstractAnnotationReader.java:122)
     at org.datanucleus.metadata.annotations.AnnotationManagerImpl.getMetaDataForClass(AnnotationManagerImpl.java:136)
     at org.datanucleus.metadata.MetaDataManager.loadAnnotationsForClass(MetaDataManager.java:2278)
     at org.datanucleus.jdo.metadata.JDOMetaDataManager.getMetaDataForClassInternal(JDOMetaDataManager.java:369)
     at org.datanucleus.metadata.MetaDataManager.getMetaDataForClass(MetaDataManager.java:1125)
     at org.datanucleus.store.appengine.EntityUtils.idToInternalKey(EntityUtils.java:122)
     at org.datanucleus.store.appengine.jdo.DatastoreJDOPersistenceManager.getObjectById(DatastoreJDOPersistenceManager.java:63)

 com.google.apphosting.runtime.HardDeadlineExceededError: This request
 (a9a7135b6a5f023e) started at 2012/03/09 21:37:22.971 UTC and was still
 executing at 2012/03/09 21:38:22.911 UTC.
     at org.datanucleus.store.appengine.DatastoreManager.getDatastoreClass(DatastoreManager.java:765)
     at org.datanucleus.store.appengine.DatastoreManager.getDatastoreClass(DatastoreManager.java:88)
     at org.datanucleus.store.appengine.EntityUtils.determineKind(EntityUtils.java:98)
     at org.datanucleus.store.appengine.EntityUtils.determineKind(EntityUtils.java:94)
     at org.datanucleus.store.appengine.EntityUtils.idToInternalKey(EntityUtils.java:124)

 com.google.apphosting.api.DeadlineExceededException: This request
 (34c3e1d5cfc6d211) started at 2012/03/09 21:36:34.867 UTC and was still
 executing at 2012/03/09 21:37:34.367 UTC.
     at com.google.appengine.runtime.Request.process-34c3e1d5cfc6d211(Request.java)
     at java.util.zip.ZipFile.read(Native Method)
     at java.util.zip.ZipFile.access$1200(ZipFile.java:57)
     at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:476)
     at java.util.zip.ZipFile$1.fill(ZipFile.java:259)
     at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
     at java.io.FilterInputStream.read(FilterInputStream.java:133)
     at java.io.FilterInputStream.read(FilterInputStream.java:107)


Re: [google-appengine] Re: Outages?

Yep, I figured it out: when you look at an app in the admin console, an s~
prefix on the app ID means it runs on High Replication. I was just
pointing it out for people who hadn't yet reported their application IDs.
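
For anyone sorting a list of reported app IDs the same way, the prefix check
could be sketched roughly like this in Java. This is only an illustration: the
class and method names are made up, it assumes you have the full ID string as
the admin console shows it, and "s~2dumo-hr" is just an assumed prefixed form
of an ID reported in this thread.

```java
public final class AppIdPrefix {
    // Assumption: the "s~" prefix mentioned above is the only marker checked.
    private static final String HIGH_REPLICATION_PREFIX = "s~";

    private AppIdPrefix() {}

    /** Returns true if the given full app ID starts with the "s~" prefix. */
    public static boolean isHighReplication(String fullAppId) {
        return fullAppId != null && fullAppId.startsWith(HIGH_REPLICATION_PREFIX);
    }

    public static void main(String[] args) {
        // Hypothetical inputs: one prefixed ID, one unprefixed ID.
        System.out.println(isHighReplication("s~2dumo-hr"));  // true
        System.out.println(isHighReplication("textyserver")); // false
    }
}
```

An ID without the prefix is simply reported as not High Replication here; the
sketch does not try to tell master/slave apart from an ID that was pasted
without its prefix.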

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com


Re: [google-appengine] Re: Outages?