> Does that answer your questions?

Thanks Ben, for taking the time to write. I did find your response (and Matt's too) both interesting and enlightening.
> Node.js uses non-blocking I/O when it can and only falls back to a
> thread pool when it must.

So, like file-system I/O, what would be other examples of Node employing
the worker thread pool? It seems that knowing this would be one of the
essentials of writing a scalable Node app.

> > 3. If the RDBMS instance (say, MySQL) is co-located on the Node
> > server box, then would it be correct
>
> Communication with the database normally takes place over a TCP or
> UNIX socket,

I see that you assume (or recommend) a specific setup of web apps, e.g.
one in which the DBMS server is on a dedicated box of its own. This,
incidentally, is typically the case with << C10K apps also. So, Node's
evented nature won't magically obviate the need for a dedicated DBMS
server.

(Note that I've never really developed or deployed even a modest-scale
web app, let alone a C10K-scalable app, hence all this newbie
ignorance.)

On Thursday, March 24, 2016 at 7:52:05 AM UTC+5:30, Ben Noordhuis wrote:
> Hello Harry, replies inline.
>
> On Wed, Mar 23, 2016 at 5:25 PM, Harry Simons wrote:
> > Hello,
> >
> > I have not been able to see the following points addressed in all
> > the online material I have read to date on Node, and so hope to be
> > enlightened by some very smart and knowledgeable folks here that I
> > presume would be reading this.
> >
> > 1. Since I/O happens asynchronously in worker threads, it is
> > possible for a single Node process to quickly/efficiently accept
> > 1000s of incoming requests compared to something like Apache. But,
> > surely, the outgoing responses for each of those requests will take
> > their own time, won't they? For example, if an isolated and
> > primarily I/O-bound request takes, say, 3 seconds to get serviced
> > (with no other load on the system), then if concurrently hit with
> > 5000 such requests, won't Node take a lot of time to service them
> > all, fully?
> > If this 3-second task happens to involve exclusive access to the
> > disk, then it would take 5000 x 3 sec = 15000 seconds, or over 4
> > hours of wait to see the response for the last request coming out
> > of the Node app. In such scenarios, would it be correct to claim
> > that a single-process Node configuration can 'handle' 1000s of
> > requests per second (granted, a threaded server like Apache would do
> > a lot worse with 5000 threads) when all that Node may be doing is
> > simply putting the requests 'on hold' till they get fully serviced
> > instead of rejecting them outright on their initial arrival? I'm
> > asking this because as I'm reading up on Node, I often hear how
> > Node can address the C10K problem without any co-mention of any
> > specific application setups or any specific application types that
> > Node can or cannot handle... other than the broad, CPU- vs I/O-bound
> > type of application classification.
>
> Node.js uses non-blocking I/O when it can and only falls back to a
> thread pool when it must. Sockets, pipes, etc. are handled in
> asynchronous, non-blocking fashion using native system APIs but e.g.
> file I/O is offloaded to a thread pool.
>
> > 2. What about the context-switching overhead of the workers in the
> > worker-thread pool? If C10K requests hit a Node-based application,
> > won't the workers in the worker-thread pool end up context-switching
> > just as much as the user threads in the thread pool of a regular,
> > threaded server (like Apache)? Because all that would have happened
> > in Node's event thread would be a quick request-parsing and
> > request-routing, with the remainder (or, the bulk) of the processing
> > still happening in the worker thread?
> > That is, does it really matter (as far as minimization of thread
> > context-switching is concerned) whether a request/response is
> > handled from start to finish in a single thread (in the manner of a
> > threaded server like Apache), or whether it happens transparently in
> > a Node-managed worker thread with only minimal work (of request
> > parsing and routing) subtracted from it? Ignore here the simpler,
> > single-threaded user model of coding that comes with an evented
> > server like Node.
>
> See above. Depending on the application, you may not hit the thread
> pool much or at all.
>
> > 3. If the RDBMS instance (say, MySQL) is co-located on the Node
> > server box, then would it be correct to classify a database CRUD
> > operation as a pure I/O task? My understanding is, a CRUD operation
> > on a large, relational database will typically involve heavy-duty
> > CPU- and I/O-processing, and not just I/O-processing. However, the
> > online material that I've been reading seems to label a 'database
> > call' as merely an 'I/O call', which supposedly makes your
> > application an I/O-bound application if that is the only thing your
> > application is (mostly) doing.
>
> Communication with the database normally takes place over a TCP or
> UNIX socket, so as far as node.js is concerned, it's not much
> different from any other network connection. The heavy-duty number
> crunching takes place in a different process, the RDBMS.
>
> > 4. A final question (related to the above themes) that may require
> > knowledge of modern hardware and OS which I am not fully up-to-date
> > on. Can I/O (on a given I/O device) be done in parallel, or even
> > concurrently if not in parallel, and THUS, scale proportionally with
> > user-count? Example: Suppose I have written a file-serving Node app
> > that serves files from the local hard-disk, making it strongly an
> > I/O-bound app.
> > If hit with N (== ~C10K) concurrent file-serving requests, what I
> > think would happen is this:
> >
> > The event-loop would spawn N async file-read requests, and go idle.
> > (Alternatively, Node would have pre-spawned all its workers in the
> > pool on process startup.)
> >
> > The N async file-read requests would each get assigned to N worker
> > threads in the worker thread pool. If N > pool size, then the
> > balance will be made to wait in some sort of an internal queue to
> > get assigned to a worker.
> >
> > Each file-read request would run concurrently at best, or
> > sequentially at worst, but definitely NOT N-way in parallel. That
> > is, even a RAID configuration would be able to service merely a
> > handful of file-read requests in parallel, and certainly not all N
> > in parallel.
> >
> > This would result in a large, total wait-time for the last
> > file-serving request to be served fully.
>
> That's by and large correct.
>
> > So, if all these 4 points are true, how could we really say that a
> > single Node-process based application is good for I/O-bound
> > applications (because it scales well)? Can the mere ability to
> > receive a large number of incoming requests and keep them all on
> > hold indefinitely while their I/O fully completes (versus rejecting
> > them outright on their initial arrival) be called 'servicing the
> > requests'? Can such an application be seen as scaling well with
> > respect to user-count?
>
> For many applications the answer is 'yes' because they can break up
> the request into smaller parts that they can service independently.
>
> Say your application has to a) read a file from disk, b) query a
> database, and c) consult a web service before it can send a reply. In
> the traditional web server model, it takes a+b+c time, whereas with
> the asynchronous model it's max(a,b,c).
>
> max(a,b,c) <= a+b+c so worst case, it performs the same, but common
> case, it's much faster.
>
> Does that answer your questions?
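
For readers following the thread, the a+b+c versus max(a,b,c) point can be illustrated with a small sketch. It is not from the original messages: `fakeTask`, `readFile`, `queryDb`, `callService` and the delays are hypothetical stand-ins, with timers simulating the three I/O steps rather than touching a real disk, database or web service.

```typescript
// Hypothetical stand-in for an I/O step: resolves after `ms` milliseconds.
// The delays used below are made-up numbers, not measurements.
function fakeTask(name: string, ms: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve(name), ms));
}

const readFile = () => fakeTask("file", 60);       // a
const queryDb = () => fakeTask("db", 90);          // b
const callService = () => fakeTask("service", 40); // c

// One step after another: elapsed time is roughly a + b + c.
async function sequential(): Promise<number> {
  const start = Date.now();
  await readFile();
  await queryDb();
  await callService();
  return Date.now() - start;
}

// All three steps started at once: elapsed time is roughly max(a, b, c).
async function concurrent(): Promise<number> {
  const start = Date.now();
  await Promise.all([readFile(), queryDb(), callService()]);
  return Date.now() - start;
}

async function main(): Promise<void> {
  console.log("sequential ms:", await sequential());
  console.log("concurrent ms:", await concurrent());
}

main();
```

The sketch only holds when the three steps are independent of each other. If, say, the database query needed the file's contents, those two steps would have to run one after the other, which is the worst case mentioned above.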
