[google-appengine] Re: Request was aborted after waiting too long followed by random DeadlineExceededError on import.
I was under the impression that something happened internally at Google to adjust the way that apps were balanced around machines and/or other internal tuning.

Additionally, we run a ping every 10 seconds to keep an instance hot. While I understand that this doesn't have much effect in a distributed environment (though practically speaking, in this case it does seem to have a positive effect), and while I also understand that this abuses a shared resource, I'm currently afraid to turn it off.

j

On Jan 19, 8:10 pm, Wesley Chun (Google) wesc+...@google.com wrote:
> dave, jason, just wanted to do a follow-up to see where things stand with
> your apps now. i'm coming across a similar user issue and was wondering
> whether it's the same problem or not. can you post your complete error
> stack traces if you're still running into this issue? [...]

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-appeng...@googlegroups.com.
To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
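[Editor's sketch] Jason's 10-second keep-alive is presumably an external script hitting the app on a timer — App Engine's own cron service could not run more often than once a minute, so a pinger like this has to live outside the app. The URL, endpoint, and function names below are illustrative assumptions, not details from the thread:

```python
import time
import urllib.request

APP_URL = "http://your-app-id.appspot.com/ping"  # hypothetical warm-up endpoint
INTERVAL_SECONDS = 10

def ping(url, timeout=5):
    """Hit `url` once; return the HTTP status code, or None on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except OSError:  # URLError, timeouts, connection refused, ...
        return None

def keep_warm(url=APP_URL, interval=INTERVAL_SECONDS):
    """Ping forever so at least one instance stays loaded."""
    while True:
        ping(url)
        time.sleep(interval)
```

As Jason himself notes, this abuses a shared resource, and it is no guarantee against internal rebalancing — it only keeps (at most) one instance warm.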
[google-appengine] Re: Request was aborted after waiting too long followed by random DeadlineExceededError on import.
dave, jason,

just wanted to do a follow-up to see where things stand with your apps now. i'm coming across a similar user issue and was wondering whether it's the same problem or not. can you post your complete error stack traces if you're still running into this issue? here's the issue filed by the other user FYI, whose app seems to have few requests, but each one has high latency:

http://code.google.com/p/googleappengine/issues/detail?id=2621

if your respective apps don't suffer from this problem any more, what did you do to resolve it, or did it magically go away?

thanks,
-- wesley
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Core Python Programming", Prentice Hall, (c)2007,2001
"Python Fundamentals", Prentice Hall, (c)2009
http://corepython.com

wesley.j.chun :: wesc+...@google.com
developer relations :: google app engine
[google-appengine] Re: Request was aborted after waiting too long followed by random DeadlineExceededError on import.
We (steprep) still saw a set of them on Dec 16, starting at 3:54 am and running through 6:57 am (log time).

j

On Dec 15, 1:56 pm, Ikai L (Google) ika...@google.com wrote:
> I made the change right before I sent the email. Let me know how it works
> for you.
>
> Jason, I also made the change to your application. Please report back
> after tomorrow if you continue to experience issues. [...]
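[Editor's sketch] Dave's theory in this thread is that a broad `except:` somewhere in the app inadvertently swallows DeadlineExceededError, so the runtime never cleans up and the instance is left in a busted state. A minimal sketch of that failure mode and the usual fix (re-raise the deadline error before any catch-all). The `DeadlineExceededError` class here is a local stand-in for `google.appengine.runtime.DeadlineExceededError`, and `handler` is a hypothetical request handler, not code from either app:

```python
class DeadlineExceededError(Exception):
    """Stand-in for google.appengine.runtime.DeadlineExceededError."""

def handler(do_work):
    """Hypothetical request handler with a catch-all error page."""
    try:
        return do_work()
    except DeadlineExceededError:
        # Re-raise so the runtime sees the deadline and can reset the
        # instance; swallowing it here is the suspected failure mode.
        raise
    except Exception:
        # Only ordinary application errors get the friendly fallback.
        return "500 fallback page"
```

Putting the `except DeadlineExceededError: raise` clause ahead of the catch-all is the standard way to avoid trapping it by accident.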
[google-appengine] Re: Request was aborted after waiting too long followed by random DeadlineExceededError on import.
Ikai,

We see daily DeadlineExceededErrors on app id 'steprep' from 6:30 am to 7:30 am (log time). Can you look into that as well?

Thanks,
j

On Dec 14, 3:32 pm, Ikai L (Google) ika...@google.com wrote:
> Do you see that it's consistent at the same times? What's your application
> ID? I'll look into it. [...]
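[Editor's sketch] Dave's alternative theory elsewhere in the thread — a page with ~100 images firing that many image requests at once — can be sanity-checked with a back-of-the-envelope concurrency estimate. The numbers below are illustrative assumptions, not measurements from either app:

```python
def peak_in_flight(burst_size, request_latency_s, burst_window_s):
    """Rough peak concurrency when `burst_size` requests arrive evenly
    over `burst_window_s` seconds and each takes `request_latency_s`."""
    if request_latency_s >= burst_window_s:
        # Every request is still running when the last one arrives.
        return burst_size
    return int(burst_size * request_latency_s / burst_window_s)

# Illustrative: ~100 images requested over ~1 s as the page loads,
# each taking ~1.5 s to serve cold -- essentially all 100 in flight
# at once, which could plausibly trip the "simultaneous dynamic
# request limit" named in the warning message.
peak = peak_in_flight(burst_size=100, request_latency_s=1.5, burst_window_s=1.0)
```

If the images were fast (say 0.5 s), the same burst would hold only about half as many requests in flight, which is why the cold-cache latency matters so much here.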
[google-appengine] Re: Request was aborted after waiting too long followed by random DeadlineExceededError on import.
Hi Ikai,

Any further details on your end? I get the feeling we're not the only ones, and we've experienced very serious downtime in the last ~48 hours. This is a critical issue for us to resolve, but at the same time we lack key pieces of data that would help us solve it on our own...

Thanks,
Dave

On Dec 15, 9:14 am, Jason C jason.a.coll...@gmail.com wrote:
> Ikai, We see daily DeadlineExceededErrors on app id 'steprep' from 6.30am
> to 7.30am (log time). Can you look into that as well? [...]
Re: [google-appengine] Re: Request was aborted after waiting too long followed by random DeadlineExceededError on import.
Dave,

You're correct that this is likely affecting other applications, but it's not a global issue. There are hotspots in the cloud that we notice are being especially impacted during certain times of the day. We're actively working on addressing these issues, but in the meantime, there are manual steps we can try to prevent your applications from becoming resource-starved. We do these on a one-off basis and reserve them only for applications that seem to exhibit the behavior of seeing DeadlineExceeded on simple actions (not initial JVM startup), and at fairly predictable intervals during the day.

I've taken these steps to try to remedy your application. Can you let us know if they seem to help? If not, it may indicate that something is going on with your application code, though that does not seem like the case here.

On Tue, Dec 15, 2009 at 10:54 AM, Dave Peck davep...@gmail.com wrote:
> Hi Ikai, Any further details on your end? I get the feeling we're not the
> only ones, and we've experienced very serious downtime in the last ~48
> hours. [...]

--
Ikai Lan
Developer Programs Engineer, Google App Engine
[google-appengine] Re: Request was aborted after waiting too long followed by random DeadlineExceededError on import.
Ikai,

We'll keep an eye on our app for the next ~24 hours and report back. At what time did you make the changes to our instance? We had substantial downtime earlier today, alas. Can you provide any details about what sort of change was made?

Thanks,
Dave

On Dec 15, 11:26 am, Ikai L (Google) ika...@google.com wrote:
> Dave, You're correct that this is likely affecting other applications, but
> it's not a global issue. There are hotspots in the cloud that we notice are
> being especially impacted during certain times of the day. [...]
Re: [google-appengine] Re: Request was aborted after waiting too long followed by random DeadlineExceededError on import.
I made the change right before I sent the email. Let me know how it works for you.

Jason, I also made the change to your application. Please report back after tomorrow if you continue to experience issues.

On Tue, Dec 15, 2009 at 11:39 AM, Dave Peck davep...@gmail.com wrote:
> Ikai, We'll keep an eye on our app for the next ~24 hours and report back.
> At what time did you make the changes to our instance? We had substantial
> downtime earlier today, alas. [...]
[google-appengine] Re: Request was aborted after waiting too long followed by random DeadlineExceededError on import.
Hi Ikai,

The app id is citygoround. We had a number of stretches of badness this morning. An example stretch:

6:07AM 33.867 (Request was aborted...)
6:07AM 49.672 through 7:12AM 24.470 (DeadlineExceededError and/or ImproperlyConfiguredError -- looks like it depends on which imports fail.)

And another:

8:17AM 37.620 (Request was aborted...)
8:17AM 54.348 through 8:46AM 51.478 (DeadlineExceededError and/or ImproperlyConfiguredError)

One last thing: the app is open source. If it helps, you can find the exact code that we're running in production at http://github.com/davepeck/CityGoRound/ -- the screenshot handler in question is in ./citygoround/views/app.py, line 115.

Cheers,
Dave

On Dec 14, 1:32 pm, Ikai L (Google) ika...@google.com wrote:
> Do you see that it's consistent at the same times? What's your application
> ID? I'll look into it.
>
> On Mon, Dec 14, 2009 at 11:28 AM, Dave Peck davep...@gmail.com wrote:
> > Hello,
> >
> > I have an app (citygoround.org) that, especially in the morning, often
> > has 10-15 minutes of outright downtime due to server errors. Looking
> > into it, I see that right before the downtime starts, a few requests log
> > the following warning message: "Request was aborted after waiting too
> > long to attempt to service your request. Most likely, this indicates
> > that you have reached your simultaneous dynamic request limit."
> >
> > I'm certainly not over my limit, but I can believe that the request in
> > question could take a while. (I'll get to the details of that request in
> > a moment.) Immediately after these warnings, my app has a large stretch
> > of time (10+ minutes) where *all requests* -- no matter how
> > unthreatening -- raise a DeadlineExceededError. Usually this is raised
> > during the import of an innocuous module like re or time, or perhaps a
> > Django 1.1 module. (We use use_library.)
> >
> > My best theory at the moment is that:
> > 1. It's a cold start, so nothing is cached.
> > 2. App Engine encounters the high-latency request and bails.
> > 3. We probably inadvertently catch the DeadlineExceededError, so the
> >    runtime doesn't clean up properly.
> > 4. Future requests are left in a busted state.
> >
> > Does this sound at all reasonable? I see a few related issues (2396,
> > 2266, and 1409) but no firm, completely clear discussion of what's
> > happening in any of them.
> >
> > Thanks,
> > Dave
> >
> > PS: The specifics of our high-latency request are *not* strictly
> > relevant to the larger problem I'm having, but I will include them
> > because I have a second side question to ask about it. The high-latency
> > request is serving an image. Our app lets users upload images, and we
> > store them in the datastore. When serving an image, our handler:
> > 1. Checks to see if the bytes for the image are in memcache, and if so
> >    returns them immediately.
> > 2. Otherwise grabs the image from the datastore, and if it is smaller
> >    than 64K, adds the bytes to memcache.
> > 3. Returns the result.
> >
> > I'm wondering if using memcache in this way is a smart idea -- it may
> > very well be the cause of our latency issues. It's hard to tell.
> > Alternatively, the issue could be: we have a page that shows a large
> > number (~100) of such images. If someone requests this page, we may have
> > a lot of simultaneous image-producing requests happening at the same
> > time. Perhaps _this_ is the root cause of the original "Request was
> > aborted" issue? Just not sure here...
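[Editor's sketch] Dave's three-step image handler (memcache check, datastore fallback, cache-if-under-64K) can be sketched as follows. This is a schematic, not the CityGoRound code: the cache and datastore are injected as a plain mapping and callable so the control flow is testable, whereas the real app would use App Engine's `memcache.get`/`memcache.set` and a datastore fetch.

```python
MAX_CACHE_BYTES = 64 * 1024  # only cache images smaller than 64K

def serve_image(key, cache, fetch_from_datastore):
    """Return image bytes for `key`: memcache first, datastore fallback."""
    data = cache.get(key)
    if data is not None:
        return data                    # 1. cache hit: return immediately
    data = fetch_from_datastore(key)   # 2. cache miss: hit the datastore
    if data is not None and len(data) < MAX_CACHE_BYTES:
        cache[key] = data              #    cache it if it is small enough
    return data                        # 3. return the result
```

One subtlety: the 64K cutoff keeps cache items small, but it also means the largest, slowest-to-serve images never benefit from the cache — so cold-cache latency stays high exactly where it hurts most.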