On Fri, Jun 1, 2012 at 1:09 PM, Marco Rogers <[email protected]> wrote:
> Being a "core guy" means you have a tremendous amount of
> influence on the way things go. If Mikeal did not agree with this change, we
> would likely not be talking about it. Not saying you wouldn't want to fix
> the core problem, but if Mikeal wanted to protect nextTick for whatever
> reason, then changing it wouldn't be on the table.

That's false.  Mikeal's been in the dissenting minority plenty of
times, has had to deal with changes he didn't like, and has had pet
features of his pulled out because they caused problems.

> The reason this is even a
> problem is at least partially due to getting this wrong in the first place.

No disagreement there.

> This is where your hubris is showing. Making this kind of presumption
> requires you to dictate the constraints of another person's system. Maybe
> your dataset doesn't stream. Maybe you can't use redis or memcache for some
> reason.

So, you're saying that there are, today, in the real world,
applications where you must iterate over a work queue that is:

1. Impossible to serialize in a child process.
2. Arbitrarily large.
3. Small enough to fit in memory.
4. Well-partitioned enough to split up into multiple passes.
5. Important enough to risk affecting IO performance by shoving a very
high priority uv listener into the queue.
6. Not so important as to starve IO while all the passes complete.

Show me.  That's a pretty narrow use case.  I'm skeptical, but you
know how much I love being surprised by evidence :)

In any contest between "maybe someone will need this, how do you know
they don't?" and "this is a known problem for current HTTP servers and
clients", our choice is obvious.

We try very hard in node to deal with reality over hypotheticals.  If
there are real world cases that are currently best served by the
current nextTick (as opposed to, say, a proper idle listener, a child
process, or an end-of-tick handler), then we'll address it.


> And you are completely ignoring
> the fact that it only became not recommended very recently, when in fact we
> have been actively pushing people to next tick to defer execution for a long
> time now. Now you've decided that it's not only "not recommended" but a "bad
> idea". This is the height of presumption.

Not "recently".  Qv. every "node sucks" article that points at CPU
bottlenecks.  Put CPU intensive operations into a child process.
That's node's concurrency story.  It's not new.
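
For example, here's a minimal sketch of that pattern using
child_process.fork; the worker.js filename and the shape of the
messages are made up for illustration:

var fork = require('child_process').fork

// worker.js is a hypothetical script that does the CPU-heavy work
// and reports back with process.send()
var worker = fork(__dirname + '/worker.js')

worker.on('message', function (result) {
  console.log('worker finished:', result)
})

// hand the expensive job to the child, so this process stays free
// to service IO
worker.send({ job: 'crunch', data: [1, 2, 3] })

// --- worker.js (hypothetical) ---
// process.on('message', function (msg) {
//   // ...do the CPU-intensive work on msg.data...
//   process.send({ done: true })
// })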


> And to be clear, nextTick is not being used
> as a "background fork". It's being used as an efficient way to kill the
> current stack to yield ongoing execution and let the event loop continue.

In some of your comments here, it sounds like you might be confusing
the run-to-completion behavior with the IO event loop.  RTC isn't
going to change, and nextTick will still be the approved way to
schedule behavior that happens after anything else in the current v8
invocation.  It'll just happen *immediately* after, instead of in a
high-priority uv listener.

Just to be 100% clear, the behavior of this code won't change:

var EventEmitter = require('events').EventEmitter

var foo = new EventEmitter()
process.nextTick(function () {
  foo.emit('bar', 'baz')
})
foo.on('bar', console.log)

Both with the change, and without it, this program will log 'baz'.

However, this program will (maybe) be affected:

process.nextTick(function f () {
  process.nextTick(f)  // reschedules itself forever
})
setTimeout(function () {
  console.log('bar')  // today this fires; with the change it might not
}, 100)

Also, we can probably figure out a way to make it work such that it
runs the first nextTick at the end of the current RTC, but subsequent
ones at some point get deferred.  As I said earlier in the thread, we
haven't investigated it thoroughly.

> Consider functions that check an in-memory cache and return without
> doing i/o. You don't want the callback to be synchronous, so you throw
> it into a nextTick call.

That's still going to be the case.  In fact, that *doesn't* work very
reliably right now, and is nondeterministically slow.  It *will* work
reliably with this change, and with lower latency.  That's what this
is all about.
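
In code, that pattern looks something like this (get, doIO, and the
cache here are all hypothetical stand-ins):

var cache = {}

// doIO is a stand-in for a real async lookup (db, fs, network, etc.)
function doIO (key, cb) {
  setTimeout(function () {
    cb(null, 'value for ' + key)
  }, 10)
}

function get (key, cb) {
  if (cache.hasOwnProperty(key)) {
    // cache hit: defer the callback so it's never called synchronously
    return process.nextTick(function () {
      cb(null, cache[key])
    })
  }
  // cache miss: do the actual IO, then populate the cache
  doIO(key, function (er, value) {
    if (!er) cache[key] = value
    cb(er, value)
  })
}

With the change, the cache-hit callback fires immediately after the
current stack unwinds, rather than waiting for a trip through the uv
listener.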

> That was literally a best practice until 2 days
> ago. Now it's a terrible idea. Because if your system ends up doing this
> often, you are negatively affecting your i/o throughput because you're not
> actually yielding to pending i/o.

That's only the case if you have a nextTick that adds another
nextTick, forever.  That would have the same performance
characteristics as an infinite loop.

We could add a nextTickDepth guard or something, and make it so that
process.nextTick(function f(){process.nextTick(f)}) will do a
setTimeout(f, 0) after 100,000 recursions or something.  That'll still
be on the order of a few ms, and it'll *still* be a few orders of
magnitude faster for any use case presented.  That's all still TBD.
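
As a rough userland sketch of that idea (the threshold and the reset
behavior here are arbitrary; the real thing, if it happens, would live
inside nextTick itself):

var ticks = 0
var MAX_DEPTH = 100000  // arbitrary cutoff, for illustration only

function guardedNextTick (fn) {
  if (++ticks > MAX_DEPTH) {
    // too many back-to-back schedulings: punt to the timer queue so
    // pending IO and timers get a chance to run, and reset the counter
    ticks = 0
    return setTimeout(fn, 0)
  }
  process.nextTick(fn)
}

// the pathological loop from above no longer starves timers forever:
guardedNextTick(function f () {
  guardedNextTick(f)
})
setTimeout(function () {
  console.log('bar')
}, 100)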


> As for how broken nextTick is, I think you're still missing the point of why
> people feel disenfranchised. nextTick is behaving badly for some people. And
> it's the people you are closest to.

In my capacity as the person in charge of the node.js project, I am
closest to those who are pushing node's limits and using it in the
real world under high production loads.

So, yes, I agree with you here.

> But as I said before, to say nextTick is "broken" is to
> ignore everyone else who's telling you that it's not and that we are using
> it just fine.

If you make a lot of HTTP requests under extremely high load, you'll see
sporadic failures in the agent code.

It's not "just fine".  It's "just fine for small n".  Everything is
fine for small n.  Node is for high traffic applications.
High-traffic problems are our problems.

There's currently no good way to assign a handler to the end of the current RTC.
