Hello,
I have not seen the following points addressed in any of the online material I have read to date on Node, so I hope to be enlightened by the smart and knowledgeable folks I presume are reading this.

1. Since I/O happens asynchronously in worker threads, a single Node process can accept thousands of incoming requests quickly and efficiently compared to something like Apache. But surely the outgoing responses for each of those requests will take their own time, won't they? For example, if an isolated, primarily I/O-bound request takes, say, 3 seconds to be serviced (with no other load on the system), then when hit concurrently with 5000 such requests, won't Node take *a lot* of time to service them all, *fully*? If this 3-second task happens to involve exclusive access to the disk, it would take 5000 x 3 sec = 15000 seconds, i.e. over 4 hours of waiting before the response to the last request comes out of the Node app. In such scenarios, would it be correct to claim that a single-process Node configuration can 'handle' thousands of requests per second (granted, a threaded server like Apache would do a lot worse with 5000 threads), when all Node may be doing is putting the requests 'on hold' until they get *fully* serviced, instead of rejecting them outright on arrival? I ask because, as I read up on Node, I keep hearing how it addresses the C10K problem, with no mention of the specific application setups or application types that Node can or cannot handle... other than the broad CPU-bound vs. I/O-bound classification.

2. What about the context-switching overhead of the workers in the worker-thread pool? If C10K requests hit a Node-based application, won't the workers in the pool end up context-switching just as much as the user threads in the thread pool of a regular threaded server (like Apache)?
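(A minimal sketch of the scenario in point 1, assuming the 3-second task is a non-exclusive async wait, e.g. an upstream network call, rather than exclusive disk access. setTimeout stands in for the I/O, scaled down to 300 ms; the point is that overlapping waits complete in roughly one wait, not N waits.)

```javascript
// Hypothetical simulation: N concurrent I/O-bound "requests" whose
// waits can overlap. setTimeout stands in for the actual I/O.
const N = 1000;
const IO_MS = 300; // scaled-down stand-in for the 3 s I/O wait

function fakeIoRequest() {
  // One simulated I/O-bound request: resolves after IO_MS of "I/O".
  return new Promise((resolve) => setTimeout(resolve, IO_MS));
}

const start = Date.now();
Promise.all(Array.from({ length: N }, fakeIoRequest)).then(() => {
  // Because the waits overlap, elapsed time is ~IO_MS, not N * IO_MS.
  console.log(`${N} requests fully serviced in ${Date.now() - start} ms`);
});
```

If the device truly serializes the work (exclusive disk access), the waits cannot overlap and the N x 3 sec arithmetic above applies instead.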
My reasoning: all that would have happened on Node's event thread is quick request parsing and routing, with the remainder (i.e. the bulk) of the processing still happening in a worker thread. Does it really matter, as far as minimizing thread context switches is concerned, whether a request/response is handled from start to finish in a single thread (in the manner of a threaded server like Apache), or whether it happens transparently in a Node-managed worker thread with only the minimal work of request parsing and routing subtracted from it? (Ignore here the simpler, single-threaded programming model that comes with an evented server like Node.)

3. If the RDBMS instance (say, MySQL) is co-located on the Node server box, would it be correct to classify a database CRUD operation as a pure I/O task? My understanding is that a CRUD operation on a large relational database typically involves heavy-duty CPU processing as well as I/O, not just I/O. However, the online material I've been reading seems to label a 'database call' as merely an 'I/O call', which supposedly makes your application I/O-bound if that is (mostly) the only thing it does.

4. A final question, related to the above themes, that may require knowledge of modern hardware and operating systems that I am not fully up to date on: can I/O on a given I/O device be done in parallel, or at least concurrently if not in parallel, and THUS scale proportionally with user count? Example: suppose I have written a file-serving Node app that serves files from the local hard disk, making it strongly I/O-bound. If hit with N (~= C10K) concurrent file-serving requests, here is what I think would happen:

- The event loop would spawn N async file-read requests and go idle. (Alternatively, Node would have pre-spawned all its workers in the pool on process startup.)
- The N async file-read requests would each get assigned to one of the worker threads in the pool. If N > pool size, the balance would wait in some sort of internal queue to be assigned a worker.
- Each file read would run concurrently at best, or sequentially at worst - but definitely NOT N-parallel. Even a RAID configuration can service only a handful of file reads in parallel, certainly not all N.
- This would result in a large total wait time before the last file-serving request is *fully* served.

So, if all four points are true, how can we really say that a single-process Node application is good for I/O-bound applications because it scales well? Can the mere ability to accept a large number of incoming requests and keep them all on hold indefinitely while their I/O *fully* completes (versus rejecting them outright on arrival) be called 'servicing the requests'? Can such an application be said to scale well with user count?

Many thanks in advance.

Regards,
/HS

--
You received this message because you are subscribed to the Google Groups "nodejs" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/nodejs/212aa412-25a3-4fd5-b7db-46d4a215995b%40googlegroups.com.