Has anyone found a reasonable solution to this problem yet? In our app as well, we see seemingly random timeout errors that couldn't possibly be caused by a db lookup -- sometimes requests time out on pages that look up a row by primary key in a table with 15 records. Favicon.ico timed out as well. The timeouts seem arbitrary, and *always* get fixed by a server restart (heroku restart). This has happened to us a few times over the last week. And yes, as several of you have noted, no exceptions are raised (neither Exceptional nor New Relic reports anything).
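Since nothing shows up in the exception trackers, one way to narrow this down might be to time requests inside the app itself. Here is a minimal sketch of a Rack middleware, assuming a Rack-based app on a current Ruby. The class name SlowRequestLogger, the 1-second default threshold and the log format are all made up for illustration; none of this is something Heroku provides. It only measures time spent inside the dyno, so if users see timeouts while this logs nothing slow, that would point at a delay before the request ever reaches the app.

```ruby
# Hypothetical Rack middleware (illustrative names, not a Heroku API):
# writes a line for every request whose in-app processing time reaches
# the given threshold in seconds.
class SlowRequestLogger
  def initialize(app, threshold: 1.0, out: $stderr)
    @app = app
    @threshold = threshold
    @out = out
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    # The downstream response is the return value; the ensure block
    # below runs afterwards without changing it.
    @app.call(env)
  ensure
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    if elapsed >= @threshold
      @out.puts(format("slow request: %s %s took %.3fs",
                       env["REQUEST_METHOD"], env["PATH_INFO"], elapsed))
    end
  end
end
```

In a Rack app this could be enabled with `use SlowRequestLogger` in config.ru; the threshold would need tuning per app.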
Given that we saw timeouts on favicon.ico and on an about page with a single db lookup, and New Relic doesn't see them at all, I suspect something higher up the Heroku stack is timing out. It almost smells like a memory leak somewhere, which would explain why an app restart fixes the problem. Now, the question is whether the memory leak is in our app or somewhere else (plugins, gems, interaction with the Heroku stack). I will debug this, but wanted to see if someone else has found a reasonable solution.

Subbu

On Oct 6, 9:37 pm, mattsly <matt...@gmail.com> wrote:
> In just manual testing my app, I've seen a fair number of timeouts
> (maybe a dozen) but have not received any communication. I am pretty
> sure I'd have no idea they occurred had I not personally witnessed the
> error page. I find this a borderline "ship blocker" for a migration
> to Heroku as I consider migrating a ~500K monthly page view app to
> Heroku, and get very anxious thinking about lots of users seeing a funky
> error page and having no way of being alerted or knowing how prevalent
> the issue is.
>
> WRT the timeouts, it's maybe 1% of requests that time out...and I
> still can't pin down why they're happening. I'm on a single dyno,
> with Koi, and < 5 alpha testers on it "concurrently" (and timeout
> errors are related to response...not concurrency...) and these are
> extremely simple paging requests, that according to New Relic, return
> in ~100ms on average...and then all of a sudden...bam! - a request timeout.
> And we're talking about essentially the exact same code
> path, except a different :offset in the ActiveRecord find call. The
> complexity is nothing along the lines of the suggested timeout causes
> here: http://docs.heroku.com/performance#request-timeout
>
> Strangely, I just tried turning off all varnish level caching (which I
> hope to rely on heavily) to try and isolate the issue and now perf
> seems *more* consistent and faster (haven't seen a timeout yet).
> Could it be that the timeouts are being caused during lookup at the Varnish
> layer? My understanding is this wouldn't be a possible explanation, as
> I think the dyno doesn't even catch a request if a Varnish cache
> hit is found. So maybe Varnish caching is a red herring...but it does
> seem curious.
>
> Matt
>
> On Sep 24, 7:56 pm, John Norman <j...@7fff.com> wrote:
> > Well, you should get an e-mail if your app is generating backlogs.
> >
> > I have one app that did generate 2 in a whole week, and I received at least
> > two e-mails from Heroku suggesting that I up the number of dynos.
> >
> > On Fri, Sep 24, 2010 at 11:42 AM, mattsly <matt...@gmail.com> wrote:
> > > How are you finding the timeouts? Just manually? I was having timeout
> > > issues (that I now think I've solved - see below) but am concerned
> > > that, once I flip my site public, that:
> > >
> > > a) There's no apparent native reporting/alerting for timeouts or
> > > backlog too deep errors if they do occur
> > > b) No ability to render a custom (static) error page in that case
> > >
> > > Re: reporting. When timeouts occur, am I mistaken in not seeing them
> > > reported anywhere? They don't seem to throw Exceptional or New Relic
> > > exceptions with the free version? It's unclear to me that they would
> > > be with the (expensive - $.05/hr = $36/month for alerting?) "Silver"
> > > - can anyone confirm that they in fact do?
> > >
> > > It seems like timeout/backlog too deep reporting/alerting should
> > > really be a built-in feature of Heroku, since they are core elements
> > > in the architecture, and such alerting (especially backlog) helps you
> > > make a quick call about cranking dyno count up/down and/or restarting
> > > an app to minimize adverse user effects...i.e. really what this cloud
> > > and hosting-as-a-service thing is all about.
> > >
> > > I'm about to (I think) migrate a high traffic site to Heroku.
> > > I *love* the idea of being able to focus on development and not sysadmin...but
> > > have to say that I am getting a little anxious about quirks like this
> > > and what it might mean for my users.
> > >
> > > Matt
> > >
> > > (On a slightly related note - I've learned the hard way that
> > > Table.count is a great way to cause a timeout - looks like MySQL and
> > > PostgreSQL handle counts *way* differently...something to keep in mind
> > > if you're migrating from mysql:
> > > http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL#COUNT.28.2A.29)
> > >
> > > On Sep 10, 3:45 am, daniel hoey <danielho...@gmail.com> wrote:
> > > > We go through short periods where we get frequent app timeouts. The
> > > > pages that time out are often very simple and do not rely on
> > > > external services or perform any demanding database queries. We
> > > > don't get any information in our New Relic transaction traces for
> > > > these queries (we have for other timeouts in the past). Basically we
> > > > can't get any information about what is going on, and only know about
> > > > the problem if our users tell us. Has anyone else experienced similar
> > > > problems or have anything to suggest in terms of investigating the
> > > > root cause?
> > > >
> > > > The last time that we are aware of this happening was between 06:30
> > > > and 07:00 GMT on Sept 10.
> > >
> > > --
> > > You received this message because you are subscribed to the Google Groups
> > > "Heroku" group.
> > > To post to this group, send email to her...@googlegroups.com.
> > > To unsubscribe from this group, send email to
> > > heroku+unsubscr...@googlegroups.com.
> > > For more options, visit this group at
> > > http://groups.google.com/group/heroku?hl=en.