* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> > My tests show that with 4k connections per second (8k concurrency) 
> > more than 20k connections of 80k total block in tcp_sendmsg() over 
> > gigabit lan between quite fast machines.
> 
> Why do people *keep* taking this up as an issue?
> 
> Use select/poll/epoll/kevent/whatever for event mechanisms. STOP 
> CLAIMING that you'd use threadlets/syslets/aio for that. It's been 
> pointed out over and over and over again, and yet you continue to make 
> the same mistake, Evgeniy.
> 
> So please read that sentence ten times, and then don't continue to 
> make that same mistake. PLEASE.
> 
> Event mechanisms are *superior* for events. But they *suck* for things 
> that aren't events, but are actual code execution with random places 
> that can block. THE TWO THINGS ARE TOTALLY AND UTTERLY INDEPENDENT!

Note that even for something tasks are supposed to suck at, and even if 
used in extremely stupid ways, they perform reasonably well in practice 
;-)

And I fully agree: specialization based on knowledge about frequency of 
blocking will always be useful - if not /forced/ on the workflow 
architecture and if not overdone. On the other hand, fully event-driven 
servers based on 'nonblocking' calls, which Evgeniy is advocating and 
which the kevent model is forcing upon userspace, are pure madness.

We very much can and should use things like epoll for events that we 
expect to happen asynchronously 100% of the time - it just makes no 
sense for those events to take up 4-5K of RAM apiece, when they could 
also be only using up the 32 bytes that, say, a pending timer takes. I've 
posted the code for that, how to do an 'outer' epoll loop around an 
internal threadlet iterator. But those will always be very narrow event 
sources, and likely won't (and shouldn't) cover 'request-internal' 
processing.

But otherwise, there is no real difference between a task that is 
scheduled and a request that is queued, other than the size of the 
request (the task takes 4-5K of RAM), and the register context (64-128 
bytes on most CPUs, the loading of which is optimized to death).

That difference can still be significant for certain workloads, so we 
certainly don't want to prohibit specialized event interfaces and force 
generic threads on everything. But for anything that isn't a raw and 
natural external event source (time, network, disk, user-generated) 
there shouldn't be much of an event queueing abstraction, I believe (other 
than what we get 'for free' within epoll, from having poll()-able files) 
- and even for those event sources threadlets offer a pretty good run 
for the money.

One can always find the point and workload where, say, 40,000 threads 
start thrashing the L2 cache, but where 40,000 queued special requests 
are still fully in cache, and produce spectacular numbers.
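
The arithmetic behind that crossover point can be sketched as below. 
The 4K-per-task figure is the low end of the range given above; the 32 
bytes per queued request is borrowed from the pending-timer figure and 
is only an assumption about what a 'special request' might cost, and the 
2 MB L2 size is a made-up example value, not a measurement. The helper 
names are hypothetical.

```c
#include <stddef.h>

/* Total memory taken by 'count' objects of 'bytes_each' bytes. */
static size_t footprint_bytes(size_t count, size_t bytes_each)
{
	return count * bytes_each;
}

/* Nonzero if the whole working set fits into a cache of 'cache_bytes'. */
static int fits_in_cache(size_t footprint, size_t cache_bytes)
{
	return footprint <= cache_bytes;
}

/* Assumed example values, see the text above. */
enum {
	NR_CONTEXTS   = 40000,
	TASK_BYTES    = 4096,            /* low end of "4-5K of RAM" */
	REQUEST_BYTES = 32,              /* assumption: same as a timer */
	L2_BYTES      = 2 * 1024 * 1024  /* hypothetical 2 MB L2 cache */
};
```

With these numbers, footprint_bytes(NR_CONTEXTS, TASK_BYTES) is 
163,840,000 bytes (roughly 156 MB), far beyond the assumed L2, while 
footprint_bytes(NR_CONTEXTS, REQUEST_BYTES) is 1,280,000 bytes (about 
1.2 MB) and still fits.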

        Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/