Re: [nodejs] Re: State of the art for request isolation in http servers?

Forrest L Norvell Tue, 14 Jan 2014 22:15:38 -0800

As I see it, there are a few paths open to you that don't require you to do
a total rewrite using a different framework:

1. This is pretty much the exact problem that domains were designed to
solve. It's up to you to decide whether you want to recover from errors or
to shut down the process (gracefully, in a way that lets it finish handling
all the requests you have in flight), but they will at least let you put
the error-handling closer to where the errors are coming from and allow you
to deal with them and not just crash. Domains are still a part of the
platform, they're proven at this point, and while they're not perfect,
they're a fairly easy incremental step for you.

2. If you're using Express or Restify, it's pretty easy to write a
middleware that will make your middleware chain domains-aware:

var domain = require('domain');
function domainsify(request, response, next) {
  var d = domain.create();
  // ourHandler could stop the HTTP server, preventing any new requests from
  // being handled while still allowing in-flight requests to finish
  d.on('error', ourHandler);

  // make sure that any error events on these streams are handled by the
  // domain
  d.add(request);
  d.add(response);

  // make sure that asynchronous calls within the scope of the middleware
  // chain are pulled into the domain
  d.run(next);
}

3. If you want something more strongly biased towards keeping your node
processes up and running, Adam Crabtree's trycatch (
https://github.com/CrabDude/trycatch) takes a similar approach to domains
while even more tightly binding code that can fail to an error handler. At
this point, domains and trycatch to me feel like similar flavors of the
same strategy, but the trycatch API is dead simple and that might appeal to
you more than domains, which do have some complexity to them. Adam also has
a very different philosophy than the Node core team, where he wants to keep
Node processes running and actually recover from errors more often than
not, and trycatch reflects that philosophy.

4. In addition to the control flow alternatives others have mentioned, some
people like the way that promises compose for error-handling. I personally
find using streams with promises a little awkward, but if your web service
has any sort of pipeline (pull some data -> do something with the data ->
cache it in e.g. redis -> render a template -> shove it out the response),
promises might be a way to DRY up your error-handling in a way that allows
you to confine the consequences of exceptions. Also, if you use Bluebird,
you probably won't even pay that much of a performance penalty.
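
To make the pipeline idea in (4) a bit more concrete, here's a rough sketch
using plain native promises (not Bluebird, though the shape would be the
same). fetchData, transform, cacheResult, render and handleRequest are all
made-up stand-ins for whatever your service actually does; the only point is
that a single .catch() at the end covers every stage:

```javascript
// Hypothetical pipeline stages; each returns a value or a promise.
function fetchData(id) {
  if (typeof id !== 'number') throw new TypeError('id must be a number');
  return Promise.resolve({ id: id, name: 'item ' + id });
}

function transform(record) {
  return { id: record.id, title: record.name.toUpperCase() };
}

function cacheResult(view) {
  // pretend this writes to redis; resolve with the value so the chain
  // can keep going
  return Promise.resolve(view);
}

function render(view) {
  return '<h1>' + view.title + '</h1>';
}

// One chain per request: a throw or rejection in any stage lands in the
// single .catch(), so a bad request becomes a 500 instead of a crash.
function handleRequest(id) {
  return Promise.resolve(id)
    .then(fetchData)
    .then(transform)
    .then(cacheResult)
    .then(render)
    .then(function (body) { return { status: 200, body: body }; })
    .catch(function (err) { return { status: 500, body: err.message }; });
}

handleRequest(7).then(function (res) { console.log(res.status, res.body); });
handleRequest('oops').then(function (res) { console.log(res.status, res.body); });
```

Keep in mind this only confines errors raised inside the chain. A throw from
a bare callback that a stage kicks off on its own (a timer, an event
listener) still escapes, which is the gap domains are meant to cover.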

Just trying to be extra-careful is probably not going to ever feel very
satisfying. There's a lot that can go wrong, and as much as the core tries
to be consistent about *either* throwing synchronously *or* emitting
'error' events / passing Error objects to callbacks, there are a lot of
gotchas that only time, experience, and production crashes will make clear.
Domains are core's general solution to this problem, along with the
philosophy (that's been articulated here and elsewhere many times) that the
sane thing to do when your Node process encounters an error is to shut down
and restart the process.
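
If you go the shut-down-and-restart route, the ourHandler from the middleware
above might look something like the following. This is only a sketch under
some assumptions: makeShutdownHandler and its timeout / exit options are
names I'm making up here, and it assumes something else (cluster, a process
supervisor) will start a fresh process once this one exits.

```javascript
// Builds an 'error' handler for a domain. `server` is your http.Server;
// `options.timeout` and `options.exit` are hypothetical knobs for this sketch.
function makeShutdownHandler(server, options) {
  options = options || {};
  var timeout = options.timeout || 30000;
  var exit = options.exit || function (code) { process.exit(code); };
  var shuttingDown = false;

  return function ourHandler(error) {
    console.error('request failed:', (error && error.stack) || error);
    if (shuttingDown) return; // a second error while draining changes nothing
    shuttingDown = true;

    // stop accepting new connections; in-flight requests get to finish,
    // and the callback runs once the server has fully closed
    server.close(function () { exit(1); });

    // don't wait forever for stragglers (keep-alive connections can hold
    // the server open); unref() keeps this timer from holding the process
    // open by itself
    var timer = setTimeout(function () { exit(1); }, timeout);
    if (timer.unref) timer.unref();
  };
}
```

You'd wire it up with d.on('error', makeShutdownHandler(server)), which means
the middleware needs a reference to the server you called listen() on.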

Another piece of this is to partition your services such that you don't
have the problem of 10,000 clients being at risk if one request crashes.
Think about scaling your app horizontally (using cluster or something
similar) to keep each process dealing with a smaller number of clients if
you can. PHP handles this better (in terms of your problem -- there are
always tradeoffs!) because each PHP script is being run in its own context
(which for all intents and purposes is a process -- if one PHP handler
crashes, it has no effect on the others), which is just a fundamentally
different model from Node.
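
For the cluster piece, a minimal sketch of the idea is a master that forks a
handful of workers and replaces any that die, so one crash only takes out the
requests on that worker. superviseWorkers and the injectable clusterImpl
argument are my own invention for illustration (the second argument exists
only so the function can be exercised without really forking), and
'./server' is a hypothetical module:

```javascript
var cluster = require('cluster');

// Fork `count` workers and replace any that exit. `clusterImpl` defaults to
// the real cluster module; passing a stand-in is only useful for testing.
function superviseWorkers(count, clusterImpl) {
  var impl = clusterImpl || cluster;
  for (var i = 0; i < count; i++) impl.fork();

  impl.on('exit', function (worker, code, signal) {
    console.error('worker %s exited (%s), starting a replacement',
                  worker.id, signal || code);
    impl.fork();
  });
}

// Typical wiring, commented out so the sketch has no side effects:
// if (cluster.isMaster) {
//   superviseWorkers(require('os').cpus().length);
// } else {
//   require('./server'); // hypothetical module that calls server.listen()
// }
```

A real supervisor would also want some kind of backoff so that a worker that
crashes on startup doesn't turn into a fork loop.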

On Tue, Jan 14, 2014 at 9:43 PM, Gregg Caines <cai...@gmail.com> wrote:

> Well even though all the responses so far would require some pretty
> non-standard solutions (and therefore major changes to our current app), I
> really do appreciate them.  We have logging, metrics and alerts on server
> restarts, so we know about and fix restarts as fast as possible I believe,
> but losing 10,000+ user requests at once (per server!  and we have dozens
> of servers running!) due to one bad api endpoint is just not worth the risk
> of running like this anymore.  I'm definitely forced to consider the
> weirder solutions if there isn't a standard one.
>
> There have got to be others working on a standard yet somewhat large
> deployment that have similar concerns though.  How is everyone else
> managing this?   (And if your answer is "Be more careful", I'm going to
> assume you're not in the same situation.  Also: we've got a staging
> environment we test in first and nearly 100% test coverage  )
>
> G
>
>
> On Tuesday, January 14, 2014 7:40:51 PM UTC-8, tjholowaychuk wrote:
>>
>> check out Koa http://koajs.com/ you won't get separate stacks like you
>> do with node-fibers but similar otherwise (built with generators)
>>
>> On Tuesday, 14 January 2014 12:28:52 UTC-8, Gregg Caines wrote:
>>>
>>> Hey all... I'm wondering if anyone can point me to the current
>>> best-practice for isolating requests in a web app.  In general I'm trying
>>> to solve the problem of keeping the server running despite bad code in a
>>> particular request.  Are domains my only shot?  Do they completely solve
>>> it?  Does anyone have existing code?
>>>
>>> I'm on a somewhat large team, working on a somewhat large codebase, and
>>> until now I've been just logging restarts and combing logs for these types
>>> of errors, then fixing them (which I'll always do), but I'm starting to
>>> feel a bit silly with PHP having solved this 10 years ago.  ;)  When a bug
>>> does get through, it would be nice to not lose the whole server and the
>>> possible 10,000+ customer requests attached to it, while I scramble to fix
>>> it.
>>>
>>> Thanks for any ideas or pointers!
>>>
>>> G
>>>
>>  --
> --
> Job Board: http://jobs.nodejs.org/
> Posting guidelines:
> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
> You received this message because you are subscribed to the Google
> Groups "nodejs" group.
> To post to this group, send email to nodejs@googlegroups.com
> To unsubscribe from this group, send email to
> nodejs+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/nodejs?hl=en?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "nodejs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to nodejs+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>

