Re: 503 error when decreasing dynos
On May 12, 10:15 am, Peter van Hardenberg wrote:
> We normally send a SIGTERM, then wait five (ish?) seconds to let the last
> request serve, and then send SIGKILL if the process still hasn't gone away.

I'm going to go ahead and ignore the fact that I know nothing about the infrastructure's implementation and its constraints -- but from this side of things it looks like this problem stems from relying heavily on the application and/or Thin to gracefully finish a request after receiving a SIGTERM.

If I may ask, why not first wait for the request to finish, and only if it doesn't finish within some interval *then* send a SIGTERM?

-- 
You received this message because you are subscribed to the Google Groups "Heroku" group.
To post to this group, send email to heroku@googlegroups.com.
To unsubscribe from this group, send email to heroku+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/heroku?hl=en.
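The reordering proposed above (drain first, SIGTERM second, SIGKILL last) can be sketched in a few lines of Ruby. This is a hypothetical supervisor for illustration only, not Heroku's actual implementation; `busy_check` and the grace period are made-up stand-ins.

```ruby
require 'timeout'

# Hypothetical sketch of the proposed shutdown order. `busy_check` is an
# assumed hook that reports whether a request is still in flight; the
# 5-second grace period mirrors the "five (ish?)" mentioned in the thread.
GRACE = 5

def drain_then_kill(pid, busy_check, grace = GRACE)
  deadline = Time.now + grace
  # 1. First wait (up to the grace period) for the in-flight request to finish.
  sleep 0.05 while busy_check.call && Time.now < deadline
  # 2. Only then ask the process to exit.
  Process.kill('TERM', pid)
  begin
    # 3. Give it another grace period to comply...
    Timeout.timeout(grace) { Process.wait(pid) }
  rescue Timeout::Error
    # 4. ...and force the issue if it doesn't.
    Process.kill('KILL', pid)
    Process.wait(pid)
  end
end
```

A process that exits promptly on TERM never reaches step 4; the point of the proposal is that step 1 happens before any signal is sent at all.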
Re: 503 error when decreasing dynos
On May 12, 3:08 pm, Peter van Hardenberg wrote:
> Sinatra only started handling SIGTERM correctly in 1.1, so you could look at
> the source for that and monkeypatch accordingly.

On May 12, 1:26 pm, Oren Teich wrote:
> If you are on bamboo, you can put thin in your gemfile on the latest version
> (1.2.11) and you'll get the correct behavior responding to signals.

I've correctly bundled the Sinatra 1.2.6 and Thin 1.2.11 gems and their dependencies, and unfortunately the error still shows up. Here's my test-case code (replace myapp with your app name). Scroll to the bottom for step-by-step instructions to reproduce the error.

config.ru:

    require 'rubygems'
    require 'bundler'
    Bundler.require
    require 'myapp'

    run Sinatra::Application

myapp.rb:

    require 'sinatra'

    get '/' do
      delay = params[:delay].to_f
      sleep(delay)
      queue_depth = env['HTTP_X_HEROKU_QUEUE_DEPTH']
      queue_wait = env['HTTP_X_HEROKU_QUEUE_WAIT_TIME']
      "queue_depth: #{queue_depth} queue_wait: #{queue_wait} delay: #{delay}"
    end

heroku_stress_test.rb:

    require 'rubygems'
    require 'typhoeus' # using typhoeus version 0.1.29

    status_code_success = 200
    num_threads = 1
    num_requests_per_thread = 100
    hydra_concurrency = 10
    delay = 0.0 # seconds between each thread
    url = 'http://myapp.heroku.com/?delay=0.5'
    user_agent = 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_3; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.99 Safari/533.4'

    $total_successes = 0

    threads = []
    num_threads.times do |i|
      sleep(delay)
      threads << Thread.new {
        hydra = Typhoeus::Hydra.new(:max_concurrency => hydra_concurrency)
        hydra.disable_memoization
        num_requests_per_thread.times do |j|
          request = Typhoeus::Request.new(url)
          request.user_agent = user_agent
          hydra.queue request
          debug = ""
          request.on_complete do |response|
            debug += "\nthread: #{i}"
            debug += " | response.code:#{response.code}"
            debug += " | response.time:#{response.time.to_s}"
            if response.code != status_code_success
              debug += 'error -'
              debug += response.body
            else
              $total_successes += 1
              debug += "\n" + response.body
            end
            puts "\n"
            puts debug
            p $total_successes
          end
        end
        puts 'hydra.run'
        hydra.run
      }
    end

    # make sure the program has ended
    threads.each do |thread|
      thread.join
    end

How to reproduce the error:

1. Upload myapp to Heroku and make sure it works.
2. Open the Heroku resources page for the app.
3. Place the slider at 10 and "Save and Apply" the changes.
4. Place the slider at 5 but *don't* "Save and Apply" the changes yet.
5. Run heroku_stress_test.rb in your local terminal.
6. When responses start flowing in, "Save and Apply" the changes in the resources page.
7. You should see a number of errors (the count varies).
8. Repeat the process multiple times.

In my experience I see somewhere between 0 and 5 request failures (503 errors) each time I run the test with these dyno numbers. These dyno numbers are just an example that I've observed to have a high chance of showing errors. I've also tested subtracting only one dyno, and there's roughly a 50% chance (maybe less) of one request failing. Worth noting: the number of errors in this test never seems to be larger than the number of dynos subtracted. If I subtract one dyno, no more than 1 error appears; if I subtract 5, I can observe up to 5 errors.
Re: 503 error when decreasing dynos
Ah, I see Oren has a more useful response! Thanks, Oren!

On Thu, May 12, 2011 at 3:08 PM, Peter van Hardenberg wrote:
> Sorry -- you're going to have to figure out the Sinatra stuff yourself.
> Based on my "here let me google that for you" investigation, it sounds like
> Sinatra only started handling SIGTERM correctly in 1.1, so you could look at
> the source for that and monkeypatch accordingly.
>
> Regards,
> Peter
Re: 503 error when decreasing dynos
Sorry -- you're going to have to figure out the Sinatra stuff yourself. Based on my "here let me google that for you" investigation, it sounds like Sinatra only started handling SIGTERM correctly in 1.1, so you could look at the source for that and monkeypatch accordingly.

Regards,
Peter

On Thu, May 12, 2011 at 1:11 PM, midwaltz wrote:
> Interesting, thanks Peter. Yes, that's probably worth documenting.
>
> Two questions:
>
> - Are requests in the queue somehow pre-assigned to a specific dyno?
> I'm asking because I'm testing this behavior on a bare Sinatra app
> with a get handler that sleeps for 0.5 seconds and then returns 'OK'.
> In other words the request takes half a second, so according to you it
> should be quick enough to finish before being sent a SIGKILL. But I
> keep seeing those errors, so I'm guessing either Sinatra panics and
> kills itself on SIGTERM, or maybe it's a queue thing? I'll be happy to
> share the code I'm using to run the tests if you'd like. (I'm using
> the default Sinatra gem (not specifying a version) on the
> bamboo-ree-1.8.7 stack.)
>
> - How do I catch a SIGTERM during a Sinatra app request?
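For anyone who lands on this thread: the shape of the fix being discussed (in Sinatra 1.1, and in the Thin versions mentioned below) is roughly "note the signal, finish the current request, then exit". Here is a hand-rolled sketch of that idea; the helper names are made up for illustration and this is not Sinatra's actual source.

```ruby
# Sketch of graceful SIGTERM handling: record the signal instead of dying,
# and exit only after the in-flight response is written. This illustrates
# the idea; it is not Sinatra 1.1's actual patch.
$shutting_down = false

Signal.trap('TERM') do
  $shutting_down = true  # plain assignment is safe inside a signal handler
end

def finish_response(body)
  # hypothetical stand-in for writing the response back to the client
end

def serve_request
  body = 'OK'                # stand-in for real request handling
  finish_response(body)      # the response goes out first...
  exit(0) if $shutting_down  # ...and only then do we honor the TERM
  body
end
```

The broken behavior this thread describes is the opposite order: the default TERM action kills the process before `finish_response` runs.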
Re: 503 error when decreasing dynos
I believe what's missing here is that Thin is the process receiving the SIGTERM, and it doesn't respond correctly. Instead of treating SIGTERM as a notice to quit when it can, it treats SIGTERM as a kill -9 equivalent. This is a bug in older versions of Thin. We've worked with the Thin maintainers to get this fixed, and the newest versions now handle SIGTERM correctly.

If you are on bamboo, you can put thin in your Gemfile at the latest version (1.2.11) and you'll get the correct behavior in response to signals. However, there are dependencies between older versions of Ruby and older versions of Thin, so if you aren't running the latest Rails, you may not be able to use the latest Thin. This is why we haven't deployed it by default to all apps.

Yes, we need to document this better. Sorry about the confusion.

Oren

On Thu, May 12, 2011 at 1:11 PM, midwaltz wrote:
> Interesting, thanks Peter. Yes, that's probably worth documenting.
>
> Two questions:
>
> - Are requests in the queue somehow pre-assigned to a specific dyno? [...]
>
> - How do I catch a SIGTERM during a Sinatra app request?
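Concretely, "put thin in your gemfile" means pinning it in the app's Gemfile and redeploying. A minimal sketch (keep whatever other gems your app already lists):

```ruby
# Gemfile -- pin thin to the version cited above as handling SIGTERM
# correctly, i.e. treating it as "finish the request, then quit".
source 'http://rubygems.org'

gem 'sinatra'
gem 'thin', '1.2.11'
```

After editing the Gemfile, run bundle install locally, commit the Gemfile and Gemfile.lock, and push to Heroku.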
Re: 503 error when decreasing dynos
Interesting, thanks Peter. Yes, that's probably worth documenting.

Two questions:

- Are requests in the queue somehow pre-assigned to a specific dyno? I'm asking because I'm testing this behavior on a bare Sinatra app with a get handler that sleeps for 0.5 seconds and then returns 'OK'. In other words the request takes half a second, so according to you it should be quick enough to finish before being sent a SIGKILL. But I keep seeing those errors, so I'm guessing either Sinatra panics and kills itself on SIGTERM, or maybe it's a queue thing? I'll be happy to share the code I'm using to run the tests if you'd like. (I'm using the default Sinatra gem (not specifying a version) on the bamboo-ree-1.8.7 stack.)

- How do I catch a SIGTERM during a Sinatra app request?

On May 12, 10:15 am, Peter van Hardenberg wrote:
> We normally send a SIGTERM, then wait five (ish?) seconds to let the last
> request serve, and then send SIGKILL if the process still hasn't gone away.
>
> You can confirm this behaviour here by catching and logging the SIGTERM in
> your app and then reproducing the situation you describe. If you can provide
> a test-case that shows you're not seeing expected behaviour (I use it
> extensively in one of my test apps) I'll make sure a ticket gets filed
> against the Runtime. Otherwise, maybe there's somewhere we can improve our
> documentation here.
>
> Regards,
>
> Peter
> Heroku
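On the second question: at the Ruby level, catching SIGTERM is done with `Signal.trap`; whether your handler actually fires depends on what handler Thin or Sinatra installed last, which is the crux of this thread. A minimal, framework-free sketch:

```ruby
# Catch SIGTERM with plain Ruby. The handler just records the signal, so
# an in-flight "request" (simulated here by a thread) can still finish.
$got_term = false

Signal.trap('TERM') do
  $got_term = true  # don't abort; just remember that the signal arrived
end

# Simulate a request that is mid-flight when TERM arrives:
worker = Thread.new { sleep 0.3; 'OK' }
Process.kill('TERM', Process.pid)
response = worker.value  # the request still completes after the signal
```

In a real Sinatra app the trap would be installed once at boot (e.g. at the top of myapp.rb), before the server starts.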
Re: 503 error when decreasing dynos
We normally send a SIGTERM, then wait five (ish?) seconds to let the last request serve, and then send SIGKILL if the process still hasn't gone away.

You can confirm this behaviour here by catching and logging the SIGTERM in your app and then reproducing the situation you describe. If you can provide a test-case that shows you're not seeing expected behaviour (I use it extensively in one of my test apps) I'll make sure a ticket gets filed against the Runtime. Otherwise, maybe there's somewhere we can improve our documentation here.

Regards,

Peter
Heroku

On Wed, May 11, 2011 at 8:49 PM, midwaltz wrote:
>
> When decreasing Dynos while they are busy, some of them return a 503
> status error with Heroku error code H13 (Connection closed without
> response
> http://devcenter.heroku.com/articles/error-codes#h13__connection_closed_without_response
> ).
>
> I can only speculate, but to me it looks like instead of waiting for
> the Dyno to finish sending its request, the Dyno is killed right away.
>
> After a quick Googling I found out this bug might have been known for
> a while:
> http://www.continuousthinking.com/2010/11/3/heroku-autoscaling-bug
> (except the error status codes look different, but it might have been
> a misstep?).
>
> What's the status on fixing this bug?
>
> Cheers,
> Steph
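The "catch and log the SIGTERM" confirmation suggested above needs only a few lines. A sketch, assuming the app's stdout/stderr reach your Heroku log stream:

```ruby
# Log SIGTERM arrivals (rather than dying on them) so the app's logs show
# whether the signal landed before the H13 error appeared. Sketch only.
$term_events = []

Signal.trap('TERM') do
  msg = "SIGTERM received at #{Time.now.utc}"
  $term_events << msg  # keep an in-process record of each arrival
  $stderr.puts msg     # and surface it in the logs
end
```

With this installed at boot, reproducing the dyno-decrease scenario should show the log line for each dyno that was asked to shut down, followed (or not) by the request it was serving.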
503 error when decreasing dynos
When decreasing Dynos while they are busy, some of them return a 503 status error with Heroku error code H13 (Connection closed without response, http://devcenter.heroku.com/articles/error-codes#h13__connection_closed_without_response).

I can only speculate, but to me it looks like instead of waiting for the Dyno to finish serving its request, the Dyno is killed right away.

After some quick Googling I found that this bug might have been known for a while: http://www.continuousthinking.com/2010/11/3/heroku-autoscaling-bug (except the error status codes look different -- but that might just be a mistake in that post?).

What's the status on fixing this bug?

Cheers,
Steph