Re: Diagnosing outage

2012-06-08 Thread John McCaffrey
Also, it's good to follow @herokustatus on Twitter.

The New Relic add-on has a ping feature, which is useful.

The StillAlive add-on goes a bit further with a full functional test (but
doesn't run as frequently).
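
For a do-it-yourself version of the ping idea, here is a minimal Python sketch. The function name `is_up`, the injectable `fetch` parameter, and the 10-second timeout are assumptions made for illustration; this is not how New Relic or StillAlive implement their checks. Because the request comes from outside the platform, a check like this would notice a routing-layer problem even when the application's own logs show nothing.

```python
import urllib.error
import urllib.request


def is_up(url, fetch=urllib.request.urlopen, timeout=10):
    """Return True if a GET to `url` answers with a 2xx or 3xx status.

    `fetch` is injectable so the check can be exercised without a network.
    """
    try:
        with fetch(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, or an HTTP error status.
        return False
```

Calling `is_up("http://your-app.example/")` (a placeholder URL) from a cron job or a small loop gives a rough uptime signal. A single failed check is weak evidence, so it is usually worth retrying before raising an alert.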

-- 
Thanks,
-John

-- 
You received this message because you are subscribed to the Google
Groups "Heroku" group.

To unsubscribe from this group, send email to
heroku+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/heroku?hl=en_US?hl=en


Re: Diagnosing outage

2012-06-08 Thread Neil Middleton
You'll have trouble diagnosing this outage because the platform was at fault,
not your application.  The routing layer was having issues for a few minutes,
meaning that HTTP requests would not have been able to reach your application
(and because this happens in a layer above your application, those requests
would not show up in your logs).
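
One way to confirm a blank spot like this is to scan the log timestamps for unusually long silences. The Python sketch below is only an illustration. It assumes each log line begins with an ISO 8601 timestamp followed by a space. The name `find_gaps`, the 10-minute threshold, and the placeholder sample lines are all invented for this example.

```python
from datetime import datetime, timedelta


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Return (start, end) pairs where consecutive log lines are further
    apart than `threshold`. Lines without a parseable timestamp are skipped."""
    gaps = []
    previous = None
    for line in lines:
        stamp = line.split(" ", 1)[0]
        try:
            current = datetime.fromisoformat(stamp)
        except ValueError:
            continue
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps


# Placeholder lines, not real Heroku output.
sample = [
    "2012-06-07T22:00:00+00:00 placeholder line A",
    "2012-06-07T22:01:30+00:00 placeholder line B",
    "2012-06-07T22:32:00+00:00 placeholder line C",
]
for start, end in find_gaps(sample):
    print(f"no log lines between {start.isoformat()} and {end.isoformat()}")
```

A gap on its own does not say who was at fault. It only narrows down the window to compare against the status site and your external monitoring.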

Aside from this, there are a number of logging options.  The default
Heroku-provided logging add-ons allow you to view up to 24 hours, whereas
add-ons such as Papertrail allow you to go back pretty much as far as you want
(I use Papertrail a fair amount on my apps and find it very useful).

More info on Logging / Syslog (how Papertrail works) can be found here:  
https://devcenter.heroku.com/articles/logging

In the meantime, it's a good idea to bookmark the Heroku status site 
(http://status.heroku.com) as, generally speaking, any issues that are outside 
of your control will appear here with more information. 



Re: Diagnosing outage

2012-06-07 Thread Jeff Schmitz
Mark,

The Logentries add-on goes back 24 hours.

Jeff


Diagnosing outage

2012-06-07 Thread puzzler
I've been testing out Heroku, with a paid account, to determine
reliability. I got a message from my Pingdom account today saying that
my server was down for a half-hour.

Looking through the logs for the time in question, I see nothing.
Specifically, I don't see the router process even reporting that it
received the pings for the half-hour in question.

One possibly suspicious thing I see is that a half-hour before that, I
see a web dyno process exiting with code 143 and then restarting, and
I can't find any documentation about what that means.
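
On the exit code itself: by the usual Unix shell convention, an exit status above 128 means the process was terminated by signal number (status - 128). So 143 corresponds to signal 15, SIGTERM, the ordinary "please shut down" signal. That would be consistent with a routine restart rather than a crash, although nothing in this thread confirms what triggered it. Below is a minimal Python sketch of the lookup; `describe_exit_code` is a name made up for this example.

```python
import signal


def describe_exit_code(code):
    """Describe a process exit status using the shell's 128 + N convention."""
    if code == 0:
        return "success"
    if code > 128:
        try:
            return "terminated by " + signal.Signals(code - 128).name
        except ValueError:
            return "unknown signal " + str(code - 128)
    return "application-defined error " + str(code)


print(describe_exit_code(143))  # prints "terminated by SIGTERM"
```

Codes from 1 to 128 are left to the application or its runtime, so for those the place to look is the documentation of whatever process the dyno was running.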

Another possibly suspicious thing is that in the two requests
immediately following the half-hour blank spot in the log, the
requests show a "wait time" of about 15ms.  The amount seems
insignificant, but this web server sees so little traffic that the
wait time is always 0ms.  I find it suspicious that the two requests
immediately following the outage would be the only requests I've ever
seen to show a wait time.

The main thing I'm realizing is that I really have no idea how to
troubleshoot outages.  The logs that show up when you type "heroku
logs" really don't go back that far.  Is there a way to get them to go
back further?  How do I look up what specific exit codes mean?  Any
idea what could keep web requests from reaching my app, or from
showing up in the logs, for a half-hour?

Thanks,

mark
