Re: 503 error when decreasing dynos

2011-05-12 Thread midwaltz
On May 12, 10:15 am, Peter van Hardenberg  wrote:
> We normally send a SIGTERM, then wait five (ish?) seconds to let the last
> request finish serving, and then send SIGKILL if the process still hasn't
> gone away.

I'm going to set aside the fact that I know nothing about the
infrastructure's implementation and its constraints, but from this
side of things it looks like this problem stems from the fact that
you rely heavily on the application and/or thin to gracefully finish
a request after receiving a SIGTERM. If I may ask, why don't you
first wait for the request to finish, and only if it doesn't finish
within some interval, *then* send a SIGTERM?

-- 
You received this message because you are subscribed to the Google Groups 
"Heroku" group.
To post to this group, send email to heroku@googlegroups.com.
To unsubscribe from this group, send email to 
heroku+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/heroku?hl=en.



Re: 503 error when decreasing dynos

2011-05-12 Thread midwaltz

On May 12, 3:08 pm, Peter van Hardenberg  wrote:
> Sinatra only started handling SIGTERM correctly in 1.1, so you could look at 
> the source for that and monkeypatch accordingly.

On May 12, 1:26 pm, Oren Teich  wrote:
> If you are on bamboo, you can put thin in your gemfile on the latest version
> (1.2.11) and you'll get the correct behavior responding to signals.

I've correctly bundled Sinatra 1.2.6 and Thin 1.2.11 gems and
dependencies and unfortunately the error is still showing up.


Here's my test case code (replace myapp with your app name). Scroll
to the bottom for the step-by-step instructions to reproduce the
error.

 config.ru

require 'rubygems'
require 'bundler'
Bundler.require

# Explicit relative path: Ruby 1.9.2+ no longer has '.' on the load path.
require './myapp'

run Sinatra::Application


 myapp.rb

require 'sinatra'

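get '/' do
  # Simulate a slow request: sleep for ?delay=<seconds> (0 when absent).
  delay = params[:delay].to_f
  sleep(delay)

  # Queue headers set on the incoming request by Heroku's routing layer.
  queue_depth = env['HTTP_X_HEROKU_QUEUE_DEPTH']
  queue_wait = env['HTTP_X_HEROKU_QUEUE_WAIT_TIME']
  "queue_depth: #{queue_depth} queue_wait: #{queue_wait} delay: #{delay}"
end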


 heroku_stress_test.rb

require 'rubygems'
require 'typhoeus'
# using typhoeus version 0.1.29

status_code_success = 200

num_threads = 1
num_requests_per_thread = 100
hydra_concurrency = 10

delay = 0.0 # seconds between each thread
url = 'http://myapp.heroku.com/?delay=0.5'

# Built from two adjacent literals so that no line break ends up inside the
# User-Agent header.
user_agent = 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_3; en-US) ' \
             'AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.99 Safari/533.4'

$total_successes = 0

threads = []

num_threads.times do |i|
  sleep(delay)
  threads << Thread.new {
    hydra = Typhoeus::Hydra.new(:max_concurrency => hydra_concurrency)
    hydra.disable_memoization
    num_requests_per_thread.times do |j|
      request = Typhoeus::Request.new(url)
      request.user_agent = user_agent
      hydra.queue request
      debug = ""
      # Runs once per request, when its response arrives during hydra.run.
      request.on_complete do |response|
        debug += "\nthread: #{i}"
        debug += " | response.code:#{response.code}"
        debug += " | response.time:#{response.time.to_s}"

        if response.code != status_code_success
          debug += ' | error - '
          debug += response.body
        else
          $total_successes += 1
          debug += "\n"+response.body
        end

        puts "\n"
        puts debug
        p $total_successes

      end
    end
    puts 'hydra.run'
    hydra.run
  }

end

# make sure the program has ended
threads.each do |thread|
  thread.join
end


 how to reproduce the error:

1. Upload myapp to Heroku and make sure it works.
2. Open the Heroku resources page for the app.
3. Place the slider at 10 and "Save and Apply" the changes.
4. Place the slider at 5 but *don't* "Save and Apply" the changes yet.
5. Run heroku_stress_test.rb in your local terminal.
6. When responses start flowing in, "Save and Apply" the changes on the
resources page.
7. You should see a number of errors (the count varies).
8. Repeat the process multiple times. In my experience, between 0 and 5
requests fail (503 errors) each time I run the test with these dyno
numbers.

The dyno numbers are just an example that, in my observation, has a
high chance of showing errors. I've also tested removing only one
dyno, and there's a roughly 50% chance of one request failing, maybe
less.

It's worth noting that the number of errors in this test never seems
to exceed the number of dynos removed. If I remove one dyno, no more
than 1 error appears. If I remove 5, I can observe up to 5 errors.




Re: 503 error when decreasing dynos

2011-05-12 Thread Peter van Hardenberg
Ah, I see Oren has a more useful response! Thanks, Oren!

On Thu, May 12, 2011 at 3:08 PM, Peter van Hardenberg wrote:

> Sorry -- you're going to have to figure out the Sinatra stuff yourself.
> Based on my "here let me google that for you" investigation, it sounds like
> Sinatra only started handling SIGTERM correctly in 1.1, so you could look at
> the source for that and monkeypatch accordingly.
>
> Regards,
> Peter




Re: 503 error when decreasing dynos

2011-05-12 Thread Peter van Hardenberg
Sorry -- you're going to have to figure out the Sinatra stuff yourself.
Based on my "here let me google that for you" investigation, it sounds like
Sinatra only started handling SIGTERM correctly in 1.1, so you could look at
the source for that and monkeypatch accordingly.

Regards,
Peter

On Thu, May 12, 2011 at 1:11 PM, midwaltz  wrote:

> Interesting, thanks Peter. Yes, that's probably worth documenting.
>
> Two questions:
>
> - Are requests in the queue somehow pre-assigned to a specific dyno?
> I'm asking because I'm testing this behavior on a bare sinatra app
> with a get handler that sleeps for 0.5 seconds and then returns 'OK'.
> In other words the request takes half a second, so according to you it
> should be quick enough to finish before being sent a SIGKILL. But I
> keep seeing those errors, so I'm guessing either Sinatra panics and
> kills itself on SIGTERM, or maybe it's a queue thing? I'll be happy to
> share the code I'm using to make the tests if you'd like. (I'm using
> the default Sinatra gem (not specifying a version) on the
> bamboo-ree-1.8.7 stack).
>
> - How do I catch a SIGTERM during a Sinatra app request?




Re: 503 error when decreasing dynos

2011-05-12 Thread Oren Teich
I believe what's missing here is that thin is the process receiving the
SIGTERM, and it doesn't respond correctly. Instead of treating SIGTERM as a
notice to quit when it can, it treats it as the equivalent of a kill -9
(SIGKILL). This is a bug in older versions of thin. We've worked with the
thin maintainers to get this fixed, and the newest versions now handle
SIGTERM correctly.
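
To illustrate the difference between "quit when you can" and an immediate
kill, here is a toy Ruby sketch of a request loop that stops taking new work
on SIGTERM but lets the request in progress finish. It is not thin's code;
the class name GracefulWorker, its methods, and the simulated requests are
all hypothetical.

```ruby
# Toy request loop (hypothetical, not thin's implementation).
# On SIGTERM it stops picking up new requests, but the request that is
# already in progress is allowed to complete.
class GracefulWorker
  attr_reader :completed

  def initialize(requests)
    @requests = requests.dup
    @stopping = false
    @completed = []
  end

  # Called from the signal handler: only flips a flag.
  def stop
    @stopping = true
  end

  # Handles requests one at a time until told to stop or out of work.
  def run
    until @stopping || @requests.empty?
      request = @requests.shift
      yield request if block_given?   # the "work" for this request
      @completed << request           # reached even if stop arrived mid-request
    end
    @completed
  end
end

worker = GracefulWorker.new(%w[req1 req2 req3])
Signal.trap('TERM') { worker.stop }

finished = worker.run do |request|
  # Simulate a scale-down arriving while the first request is being served.
  Process.kill('TERM', Process.pid) if request == 'req1'
  sleep 0.1
end
puts "finished: #{finished.inspect}"
```

A server that instead exits inside the signal handler would drop the request
in flight, which is consistent with the H13 errors reported in this thread.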

If you are on bamboo, you can put the latest version of thin (1.2.11) in
your Gemfile and you'll get the correct behavior in response to signals.
However, there are dependencies between older versions of Ruby and older
versions of thin, so if you aren't running the latest Rails, you may not be
able to use the latest thin. This is why we haven't deployed it by default
to all apps.

Yes, we need to document this better.  Sorry about the confusion.

Oren

On Thu, May 12, 2011 at 1:11 PM, midwaltz  wrote:

> Interesting, thanks Peter. Yes, that's probably worth documenting.
>
> Two questions:
>
> - Are requests in the queue somehow pre-assigned to a specific dyno?
> I'm asking because I'm testing this behavior on a bare sinatra app
> with a get handler that sleeps for 0.5 seconds and then returns 'OK'.
> In other words the request takes half a second, so according to you it
> should be quick enough to finish before being sent a SIGKILL. But I
> keep seeing those errors, so I'm guessing either Sinatra panics and
> kills itself on SIGTERM, or maybe it's a queue thing? I'll be happy to
> share the code I'm using to make the tests if you'd like. (I'm using
> the default Sinatra gem (not specifying a version) on the
> bamboo-ree-1.8.7 stack).
>
> - How do I catch a SIGTERM during a Sinatra app request?




Re: 503 error when decreasing dynos

2011-05-12 Thread midwaltz
Interesting, thanks Peter. Yes, that's probably worth documenting.

Two questions:

- Are requests in the queue somehow pre-assigned to a specific dyno?
I'm asking because I'm testing this behavior on a bare sinatra app
with a get handler that sleeps for 0.5 seconds and then returns 'OK'.
In other words the request takes half a second, so according to you it
should be quick enough to finish before being sent a SIGKILL. But I
keep seeing those errors, so I'm guessing either Sinatra panics and
kills itself on SIGTERM, or maybe it's a queue thing? I'll be happy to
share the code I'm using to make the tests if you'd like. (I'm using
the default Sinatra gem (not specifying a version) on the
bamboo-ree-1.8.7 stack).

- How do I catch a SIGTERM during a Sinatra app request?






On May 12, 10:15 am, Peter van Hardenberg  wrote:
> We normally send a SIGTERM, then wait five (ish?) seconds to let the last
> request finish serving, and then send SIGKILL if the process still hasn't
> gone away.
>
> You can confirm this behaviour here by catching and logging the SIGTERM in
> your app and then reproducing the situation you describe. If you can provide
> a test-case that shows you're not seeing expected behaviour (I use it
> extensively in one of my test apps) I'll make sure a ticket gets filed against
> the Runtime. Otherwise, maybe there's somewhere we can improve our
> documentation here.
>
> Regards,
>
> Peter
> Heroku




Re: 503 error when decreasing dynos

2011-05-12 Thread Peter van Hardenberg
We normally send a SIGTERM, then wait five (ish?) seconds to let the last
request finish serving, and then send SIGKILL if the process still hasn't
gone away.
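
For readers who want to see that sequence concretely, here is a minimal Ruby
sketch of "SIGTERM, wait, then SIGKILL". It is not Heroku's implementation:
the helper name stop_process, the polling interval, and the 5-second default
(taken from the "five (ish?)" figure above) are assumptions for illustration.
It relies on POSIX signals, so it will not work on Windows.

```ruby
require 'rbconfig'

# Sends SIGTERM to the child +pid+, waits up to +grace+ seconds for it to
# exit, and falls back to SIGKILL. Returns :terminated or :killed.
# Hypothetical helper; the default grace period is an assumption.
def stop_process(pid, grace = 5.0)
  Process.kill('TERM', pid)
  deadline = Time.now + grace
  while Time.now < deadline
    # With WNOHANG, waitpid returns nil while the child is still running.
    return :terminated if Process.waitpid(pid, Process::WNOHANG)
    sleep 0.05
  end
  Process.kill('KILL', pid)
  Process.waitpid(pid) # reap the child so it does not linger as a zombie
  :killed
end

# Stand-in child that ignores SIGTERM, so the SIGKILL fallback is exercised.
reader, writer = IO.pipe
child = Process.spawn(
  RbConfig.ruby, '-e',
  'trap("TERM") {}; $stdout.sync = true; puts "ready"; loop { sleep 1 }',
  out: writer
)
writer.close
reader.gets # block until the child has installed its handler
puts stop_process(child, 0.5)
```

A process that finishes its current request and exits within the grace period
never sees the SIGKILL; one that exits the moment SIGTERM arrives never gets
to use the grace period at all.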

You can confirm this behaviour here by catching and logging the SIGTERM in
your app and then reproducing the situation you describe. If you can provide
a test-case that shows you're not seeing expected behaviour (I use it
extensively in one of my test apps) I'll make sure a ticket gets filed against
the Runtime. Otherwise, maybe there's somewhere we can improve our
documentation here.
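
A minimal sketch of what catching and recording SIGTERM could look like in
plain Ruby follows. The class name TermRecorder is hypothetical. Whether a
handler installed from application code stays in place depends on whether the
web server later installs its own handler, which this sketch does not
address. The handler only records a timestamp; the actual logging happens
outside it, since heavier work is safer outside a signal handler.

```ruby
# Records when SIGTERM arrived (hypothetical helper for diagnostics).
class TermRecorder
  attr_reader :received_at

  # Installs the handler and remembers whatever was registered before.
  def install
    @previous = Signal.trap('TERM') do |signo|
      @received_at = Time.now
      # Hand off to the earlier handler if it was a callable (for example
      # one set by the server); string values such as "DEFAULT" are skipped.
      @previous.call(signo) if @previous.respond_to?(:call)
    end
    self
  end

  def received?
    !@received_at.nil?
  end
end

recorder = TermRecorder.new.install
Process.kill('TERM', Process.pid) # stand-in for the platform's SIGTERM
sleep 0.1                         # give the interpreter a moment to run the handler
warn "SIGTERM received at #{recorder.received_at}" if recorder.received?
```

Note that once this handler is installed, SIGTERM no longer terminates the
process by default, so it is meant for diagnosing the behaviour rather than
as a drop-in shutdown handler.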

Regards,

Peter
Heroku


On Wed, May 11, 2011 at 8:49 PM, midwaltz  wrote:

>
> When decreasing Dynos while they are busy, some of them return a 503
> status error with Heroku error code H13 (Connection closed without
> response
> http://devcenter.heroku.com/articles/error-codes#h13__connection_closed_without_response
> ).
>
> I can only speculate, but to me it looks like instead of waiting for
> the Dyno to finish sending its request, the Dyno is killed right away.
>
> After a quick Googling I found out this bug might have been known for
> a while:
> http://www.continuousthinking.com/2010/11/3/heroku-autoscaling-bug
> (except the error status codes look different, but it might have been
> a misstep?).
>
> What's the status on fixing this bug?
>
> Cheers,
> Steph



503 error when decreasing dynos

2011-05-11 Thread midwaltz

When decreasing Dynos while they are busy, some of them return a 503
status error with Heroku error code H13 (Connection closed without
response 
http://devcenter.heroku.com/articles/error-codes#h13__connection_closed_without_response).

I can only speculate, but to me it looks like instead of waiting for
the Dyno to finish sending its request, the Dyno is killed right away.

After a quick Googling I found out this bug might have been known for
a while: http://www.continuousthinking.com/2010/11/3/heroku-autoscaling-bug
(except the error status codes look different, but it might have been
a misstep?).

What's the status on fixing this bug?

Cheers,
Steph
