Re: App Timeouts

mattsly Wed, 06 Oct 2010 19:38:02 -0700

In just manual testing of my app, I've seen a fair number of timeouts
(maybe a dozen) but have not received any communication about them.  I am
pretty sure I'd have no idea they occurred had I not personally witnessed
the error page.  I find this a borderline "ship blocker": I'm considering
migrating a ~500K monthly page view app to Heroku, and I get very anxious
thinking about lots of users seeing a funky error page while I have no way
of being alerted or of knowing how prevalent the issue is.


WRT the timeouts, it's maybe 1% of requests that time out...and I
still can't pin down why they're happening.  I'm on a single dyno,
with Koi, and < 5 alpha testers on it "concurrently" (and timeout
errors are related to response time...not concurrency...) and these are
extremely simple paging requests that, according to New Relic, return
in ~100 ms on average...and then all of a sudden...bam! - a request
timeout.  And we're talking about essentially the exact same code
path, except for a different :offset in the ActiveRecord find call.  The
complexity is nothing along the lines of the suggested timeout causes
here: http://docs.heroku.com/performance#request-timeout
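
One way to get at least some visibility from inside the app is to time
every request and log the slow ones.  Below is a rough sketch of that
idea as a Rack middleware, written for a current Ruby.  The class name
SlowRequestLogger, the 5-second default threshold and the log format are
all made up for illustration; none of this is a Heroku or New Relic
feature.  Note the limitation: the timeout itself is enforced in front of
the dyno, so this can only record that a request overran once the app
finally finishes it.  It will not see requests that never reach the dyno.

```ruby
# Sketch of a Rack middleware that logs any request slower than a threshold.
# SlowRequestLogger, the default threshold and the log format are
# illustrative assumptions, not part of Rack, Heroku or New Relic.
class SlowRequestLogger
  def initialize(app, threshold = 5.0, logger = $stderr)
    @app = app
    @threshold = threshold  # seconds; hypothetical default, tune as needed
    @logger = logger        # any object that responds to #puts
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    @app.call(env)          # the normal Rack response is returned unchanged
  ensure
    # Runs whether the app returned or raised, so failures are timed too.
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    if elapsed >= @threshold
      @logger.puts(format("SLOW %s %s took %.2fs",
                          env["REQUEST_METHOD"], env["PATH_INFO"], elapsed))
    end
  end
end
```

In a config.ru this would be enabled with something like
"use SlowRequestLogger, 5.0" ahead of the app.  Counting the SLOW lines
against total requests gives a crude answer to "how prevalent is it?".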

Strangely, I just tried turning off all Varnish-level caching (which I
hope to rely on heavily) to try to isolate the issue, and now perf
seems *more* consistent and faster (haven't seen a timeout yet). Could
it be that the timeouts are being caused during lookup at the Varnish
layer? My understanding is that this wouldn't be a possible explanation, as
I think the dyno doesn't even see a request if a Varnish cache
hit is found.  So maybe Varnish caching is a red herring...but it does
seem curious.
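
To help tell cached responses from ones the dyno actually served, one
option is to look at the standard HTTP Age header, which caches add to
responses they serve from storage.  The helper below is only a heuristic
sketch: served_from_cache? is a made-up name, and whether Heroku's
Varnish layer passes an Age header through to the client is an
assumption I have not verified.

```ruby
# Heuristic sketch: treat a response as cache-served when it carries a
# positive Age header. A missing header or an Age of 0 is inconclusive
# (a hit within the first second also reports 0), so this returns false.
# served_from_cache? is a hypothetical helper name.
def served_from_cache?(headers)
  # Header names are case-insensitive, so compare on the downcased name.
  pair = headers.find { |name, _value| name.to_s.downcase == "age" }
  return false if pair.nil?
  pair[1].to_s.to_i > 0
end
```

Fed the response headers of the requests that timed out versus the ones
that didn't, this would at least show whether the two groups differ in
how they were served.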

Matt



On Sep 24, 7:56 pm, John Norman <j...@7fff.com> wrote:
> Well, you should get an e-mail if your app is generating backlogs.
>
> I have one app that did generate 2 in a whole week, and I received at least
> two e-mails from Heroku suggesting that I up the number of dynos.
>
>
>
> On Fri, Sep 24, 2010 at 11:42 AM, mattsly <matt...@gmail.com> wrote:
> > How are you finding the timeouts? Just manually?  I was having timeout
> > issues (that I now think I've solved - see below) but am concerned
> > that, once I flip my site public, that:
>
> > a) There's no apparent native reporting/alerting for timeouts or
> > backlog too deep errors if they do occur
> > b) No ability to render a custom (static) error page in that case
>
> > Re: reporting. When timeouts occur, am I mistaken in not seeing them
> > reported anywhere?  They don't seem to throw Exceptional or New Relic
> > exceptions with the free version?  It's unclear to me that they would
> > be with the (expensive - $.05/hr = $36/month for alerting?) "Silver"
> > - can anyone confirm that they in fact do?
>
> > It seems like timeout/backlog too deep reporting/alerting should
> > really be a built-in feature of Heroku, since they are core elements
> > in the architecture, and such alerting (especially backlog) helps you
> > make a quick call about cranking dyno count up/down and/or restarting
> > an app to minimize adverse user effects...i.e. really what this cloud
> > and hosting-as-a-service thing is all about.
>
> > I'm about to (I think) migrate a high traffic site to Heroku. I *love*
> > the idea of being able to focus on development and not sysadmin...but
> > have to say that I am getting a little anxious about quirks like this
> > and what it might mean for my users.
>
> > Matt
>
> > (On a slightly related note - I've learned the hard way that
> > Table.count is a great way to cause a timeout - looks like MySQL and
> > PostgreSQL handle counts *way* differently...something to keep in mind
> > if you're migrating from MySQL:
> > http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL#COUNT.28.2A.29)
>
> > On Sep 10, 3:45 am, daniel hoey <danielho...@gmail.com> wrote:
> > > We go through short periods where we get frequent app timeouts. The
> > > pages that time out are often very simple and do not rely on
> > > external services or perform any demanding database queries. We
> > > don't get any information in our New Relic transaction traces for
> > > these queries (we have for other timeouts in the past). Basically we
> > > can't get any information about what is going on, and only know about
> > > the problem if our users tell us. Has anyone else experienced similar
> > > problems or have anything to suggest in terms of investigating the
> > > root cause?
>
> > > The last time that we are aware of this happening was between 06:30
> > > and 07:00 GMT on Sept 10.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Heroku" group.
To post to this group, send email to her...@googlegroups.com.
To unsubscribe from this group, send email to 
heroku+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/heroku?hl=en.
